diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl index 0e5bc4c..8b809b1 100644 --- a/.beads/issues.jsonl +++ b/.beads/issues.jsonl @@ -3,7 +3,7 @@ {"id":"bd-12ae","title":"OBSERV: Add structured tracing fields to rate-limit/retry handling","description":"## Background\nRate limit and retry events are currently logged at WARN with minimal context (src/gitlab/client.rs:~157). This enriches them with structured fields so MetricsLayer can count them and -v mode shows actionable retry information.\n\n## Approach\n### src/gitlab/client.rs - request() method (line ~119-171)\n\nCurrent 429 handling (~line 155-158):\n```rust\nif response.status() == StatusCode::TOO_MANY_REQUESTS && attempt < Self::MAX_RETRIES {\n let retry_after = Self::parse_retry_after(&response);\n tracing::warn!(retry_after_secs = retry_after, attempt, path, \"Rate limited by GitLab, retrying\");\n sleep(Duration::from_secs(retry_after)).await;\n continue;\n}\n```\n\nReplace with INFO-level structured log:\n```rust\nif response.status() == StatusCode::TOO_MANY_REQUESTS && attempt < Self::MAX_RETRIES {\n let retry_after = Self::parse_retry_after(&response);\n tracing::info!(\n path = %path,\n attempt = attempt,\n retry_after_secs = retry_after,\n status_code = 429u16,\n \"Rate limited, retrying\"\n );\n sleep(Duration::from_secs(retry_after)).await;\n continue;\n}\n```\n\nFor transient errors (network errors, 5xx responses), add similar structured logging:\n```rust\ntracing::info!(\n path = %path,\n attempt = attempt,\n error = %e,\n \"Retrying after transient error\"\n);\n```\n\nKey changes:\n- Level: WARN -> INFO (visible in -v mode, not alarming in default mode)\n- Added: status_code field for 429\n- Added: structured path, attempt fields for all retry events\n- These structured fields enable MetricsLayer (bd-3vqk) to count rate_limit_hits and retries\n\n## Acceptance Criteria\n- [ ] 429 responses log at INFO with fields: path, attempt, retry_after_secs, status_code=429\n- [ ] Transient error 
retries log at INFO with fields: path, attempt, error\n- [ ] lore -v sync shows retry activity on stderr (INFO is visible in -v mode)\n- [ ] Default mode (no -v) does NOT show retry lines on stderr (INFO filtered out)\n- [ ] File layer captures all retry events (always at DEBUG+)\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/gitlab/client.rs (modify request() method, lines ~119-171)\n\n## TDD Loop\nRED:\n - test_rate_limit_log_fields: mock 429 response, capture log output, parse JSON, assert fields\n - test_retry_log_fields: mock network error + retry, assert structured fields\nGREEN: Change log level and add structured fields\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- parse_retry_after returns 0 or very large values: the existing logic handles this\n- All retries exhausted: the final attempt returns the error normally. No special logging needed (the error propagates).\n- path may contain sensitive data (project IDs): project IDs are not sensitive in this context","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:55:02.448070Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:21:42.304259Z","closed_at":"2026-02-04T17:21:42.304213Z","close_reason":"Changed 429 rate-limit logging from WARN to INFO with structured fields: path, attempt, retry_after_secs, status_code=429 in both request() and request_with_headers()","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-12ae","depends_on_id":"bd-3pk","type":"parent-child","created_at":"2026-02-04T15:55:02.450343Z","created_by":"tayloreernisse"}]} {"id":"bd-13b","title":"[CP0] CLI entry point with Commander.js","description":"## Background\n\nCommander.js provides the CLI framework. The main entry point sets up the program with all subcommands. 
Uses ESM with proper shebang for npx/global installation.\n\nReference: docs/prd/checkpoint-0.md section \"CLI Commands\"\n\n## Approach\n\n**src/cli/index.ts:**\n```typescript\n#!/usr/bin/env node\n\nimport { Command } from 'commander';\nimport { version } from '../../package.json' with { type: 'json' };\nimport { initCommand } from './commands/init';\nimport { authTestCommand } from './commands/auth-test';\nimport { doctorCommand } from './commands/doctor';\nimport { versionCommand } from './commands/version';\nimport { backupCommand } from './commands/backup';\nimport { resetCommand } from './commands/reset';\nimport { syncStatusCommand } from './commands/sync-status';\n\nconst program = new Command();\n\nprogram\n .name('gi')\n .description('GitLab Inbox - Unified notification management')\n .version(version);\n\n// Global --config flag available to all commands\nprogram.option('-c, --config <path>', 'Path to config file');\n\n// Register subcommands\nprogram.addCommand(initCommand);\nprogram.addCommand(authTestCommand);\nprogram.addCommand(doctorCommand);\nprogram.addCommand(versionCommand);\nprogram.addCommand(backupCommand);\nprogram.addCommand(resetCommand);\nprogram.addCommand(syncStatusCommand);\n\nprogram.parse();\n```\n\nEach command file exports a Command instance:\n```typescript\n// src/cli/commands/version.ts\nimport { Command } from 'commander';\nimport { version } from '../../../package.json' with { type: 'json' };\n\nexport const versionCommand = new Command('version')\n .description('Show version information')\n .action(() => {\n console.log(`gi version ${version}`);\n });\n```\n\n## Acceptance Criteria\n\n- [ ] `gi --help` shows all commands and global options\n- [ ] `gi --version` shows version from package.json\n- [ ] `gi <command> --help` shows command-specific help\n- [ ] `gi --config ./path` passes config path to commands\n- [ ] Unknown command shows error and suggests --help\n- [ ] Exit code 0 on success, non-zero on error\n- [ ] Shebang line works for npx execution\n\n## Files\n\nCREATE:\n- src/cli/index.ts (main entry 
point)\n- src/cli/commands/version.ts (simple command as template)\n\nMODIFY (later beads):\n- package.json (add \"bin\" field pointing to dist/cli/index.js)\n\n## TDD Loop\n\nN/A for CLI entry point - verify with manual testing:\n\n```bash\nnpm run build\nnode dist/cli/index.js --help\nnode dist/cli/index.js version\nnode dist/cli/index.js unknown-command # should error\n```\n\n## Edge Cases\n\n- package.json import requires Node 20+ with { type: 'json' } assertion\n- Alternative: read version from package.json with readFileSync\n- Command registration order affects help display - alphabetical preferred\n- Global options must be defined before subcommands","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:50.499023Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:10:49.224627Z","closed_at":"2026-01-25T03:10:49.224499Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-13b","depends_on_id":"bd-gg1","type":"blocks","created_at":"2026-01-24T16:13:09.370408Z","created_by":"tayloreernisse"}]} {"id":"bd-140","title":"[CP1] Database migration 002_issues.sql","description":"Create migration file with tables for issues, labels, issue_labels, discussions, and notes.\n\nTables to create:\n- issues: gitlab_id, project_id, iid, title, description, state, author_username, timestamps, web_url, raw_payload_id\n- labels: gitlab_id, project_id, name, color, description (unique on project_id+name)\n- issue_labels: junction table\n- discussions: gitlab_discussion_id, project_id, issue_id, noteable_type, individual_note, timestamps, resolvable/resolved\n- notes: gitlab_id, discussion_id, project_id, type, is_system, author_username, body, timestamps, position, resolution fields, DiffNote position fields\n\nInclude appropriate indexes:\n- idx_issues_project_updated, idx_issues_author, uq_issues_project_iid\n- uq_labels_project_name, idx_labels_name\n- idx_issue_labels_label\n- uq_discussions_project_discussion_id, 
idx_discussions_issue/mr/last_note\n- idx_notes_discussion/author/system\n\nFiles: migrations/002_issues.sql\nDone when: Migration applies cleanly on top of 001_initial.sql","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:18:53.954039Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.154936Z","deleted_at":"2026-01-25T15:21:35.154934Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} -{"id":"bd-14q","title":"Epic: Gate 4 - File Decision History (lore file-history)","description":"Implement 'lore file-history ' command showing MRs that touched a file, with rename chain resolution, discussion context, and linked issues. Includes migration 012 (renumbered from spec's 011) for mr_file_changes and commit SHA fields.\n\nChildren: bd-1oo (migration 012), bd-jec (config), bd-2yo (MR diffs API), bd-1yx (rename chains), bd-z94 (file-history command)","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.094024Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:50:51.656482Z","compaction_level":0,"original_size":0,"labels":["epic","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-14q","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:34:16.913465Z","created_by":"tayloreernisse"},{"issue_id":"bd-14q","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:34:16.870058Z","created_by":"tayloreernisse"}]} +{"id":"bd-14q","title":"Epic: Gate 4 - File Decision History (lore file-history)","description":"## Background\nGate 4 adds file-level decision history — \"which MRs touched this file, and why?\" This bridges the gap between code and project management by linking file paths to MRs, MRs to issues, and issues to discussions. 
The key innovation is rename chain resolution: when a file was renamed from src/auth/handler.rs to src/auth/oauth.rs, querying either path finds all historical MRs.\n\nGate 4 also captures merge_commit_sha and squash_commit_sha on merge_requests, which Gate 5 uses for code tracing and which Phase C will use for git blame integration.\n\n## Architecture\n- **New table:** mr_file_changes (migration 012) — file paths + change types per MR\n- **New columns:** merge_requests.merge_commit_sha, merge_requests.squash_commit_sha (migration 012)\n- **Opt-out config:** sync.fetchMrFileChanges (default true). --no-file-changes CLI flag.\n- **Data source:** GET /projects/:id/merge_requests/:iid/diffs — extract file metadata only, discard diff content.\n- **Queue integration:** Uses dependent fetch queue with job_type=mr_diffs\n- **Rename chain resolution:** BFS over mr_file_changes WHERE change_type=renamed, bounded at 10 hops with cycle detection.\n\n## Children (Execution Order)\n1. **bd-1oo** [OPEN] — Migration 012: mr_file_changes + merge_commit_sha + squash_commit_sha\n2. **bd-jec** [OPEN] — fetchMrFileChanges config flag + --no-file-changes CLI flag\n3. **bd-2yo** [OPEN] — Fetch MR diffs API, populate mr_file_changes, capture commit SHAs\n4. **bd-1yx** [OPEN] — Rename chain resolution algorithm (src/core/file_history.rs)\n5. 
**bd-z94** [OPEN] — lore file-history command with human + robot output\n\n## Gate Completion Criteria\n- [ ] mr_file_changes populated from GitLab diffs API for synced MRs\n- [ ] merge_commit_sha and squash_commit_sha captured in merge_requests\n- [ ] `lore file-history <path>` returns MRs ordered by merge/creation date\n- [ ] Output includes MR title, state, author, change type, discussion count\n- [ ] --discussions shows DiffNote snippets on the queried file\n- [ ] Rename chains resolved with bounded hop count (default 10) + cycle detection\n- [ ] --no-follow-renames disables chain resolution\n- [ ] Robot JSON includes rename_chain when renames detected\n- [ ] -p required when path exists in multiple projects (Ambiguous error, exit 18)\n- [ ] Graceful empty state: \"No MR data found. Run lore sync with fetchMrFileChanges: true\"\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) for dependent fetch queue, Gate 2 (bd-1se) for entity_references (MR→issue linking)\n- Downstream: Gate 5 (bd-1ht) depends on mr_file_changes and commit SHAs","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.094024Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:33:06.778936Z","compaction_level":0,"original_size":0,"labels":["epic","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-14q","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:34:16.913465Z","created_by":"tayloreernisse"},{"issue_id":"bd-14q","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:34:16.870058Z","created_by":"tayloreernisse"}]} {"id":"bd-157","title":"[CP1] Issue transformer with label extraction","description":"Transform GitLab issue payloads to normalized database schema.\n\n## Module\nsrc/gitlab/transformers/issue.rs\n\n## Structs\n\n### NormalizedIssue\n- gitlab_id: i64\n- project_id: i64 (local DB project ID)\n- iid: i64\n- title: String\n- description: Option<String>\n- state: String\n- author_username: String\n- created_at, updated_at, 
last_seen_at: i64 (ms epoch)\n- web_url: String\n\n### NormalizedLabel (CP1: name-only)\n- project_id: i64\n- name: String\n\n## Functions\n\n### transform_issue(gitlab_issue: &GitLabIssue, local_project_id: i64) -> NormalizedIssue\n- Convert ISO timestamps to ms epoch using iso_to_ms()\n- Set last_seen_at to now_ms()\n- Clone string fields\n\n### extract_labels(gitlab_issue: &GitLabIssue, local_project_id: i64) -> Vec<NormalizedLabel>\n- Map labels vec to NormalizedLabel structs\n\nFiles: \n- src/gitlab/transformers/mod.rs\n- src/gitlab/transformers/issue.rs\nTests: tests/issue_transformer_tests.rs\nDone when: Unit tests pass for payload transformation and label extraction","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:47.719562Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.736142Z","deleted_at":"2026-01-25T17:02:01.736129Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-16m8","title":"OBSERV: Record item counts as span fields in sync stages","description":"## Background\nMetricsLayer (bd-34ek) captures span fields, but the stage functions must actually record item counts INTO their spans. 
This is the bridge between \"work happened\" and \"MetricsLayer knows about it.\"\n\n## Approach\nIn each stage function, after the work loop completes, record counts into the current span:\n\n### src/ingestion/orchestrator.rs - ingest_project_issues_with_progress() (~line 110)\nAfter issues are fetched and discussions synced:\n```rust\ntracing::Span::current().record(\"items_processed\", result.issues_upserted);\ntracing::Span::current().record(\"items_skipped\", result.issues_skipped);\ntracing::Span::current().record(\"errors\", result.errors);\n```\n\n### src/ingestion/orchestrator.rs - drain_resource_events() (~line 566)\nAfter the drain loop:\n```rust\ntracing::Span::current().record(\"items_processed\", result.fetched);\ntracing::Span::current().record(\"errors\", result.failed);\n```\n\n### src/documents/regenerator.rs - regenerate_dirty_documents() (~line 24)\nAfter the regeneration loop:\n```rust\ntracing::Span::current().record(\"items_processed\", result.regenerated);\ntracing::Span::current().record(\"items_skipped\", result.unchanged);\ntracing::Span::current().record(\"errors\", result.errored);\n```\n\n### src/embedding/pipeline.rs - embed_documents() (~line 36)\nAfter embedding completes:\n```rust\ntracing::Span::current().record(\"items_processed\", result.embedded);\ntracing::Span::current().record(\"items_skipped\", result.skipped);\ntracing::Span::current().record(\"errors\", result.failed);\n```\n\nIMPORTANT: These fields must be declared as tracing::field::Empty in the #[instrument] attribute (done in bd-24j1). You can only record() a field that was declared at span creation. 
Attempting to record an undeclared field silently does nothing.\n\n## Acceptance Criteria\n- [ ] MetricsLayer captures items_processed for each stage\n- [ ] MetricsLayer captures items_skipped and errors when non-zero\n- [ ] Fields match the span declarations from bd-24j1\n- [ ] extract_timings() returns correct counts in StageTiming\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/ingestion/orchestrator.rs (record counts in ingest + drain functions)\n- src/documents/regenerator.rs (record counts in regenerate)\n- src/embedding/pipeline.rs (record counts in embed)\n\n## TDD Loop\nRED: test_stage_fields_recorded (integration: run pipeline, extract timings, verify counts > 0)\nGREEN: Add Span::current().record() calls at end of each stage\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Span::current() returns a disabled span if no subscriber is registered (e.g., in tests without subscriber setup). record() on disabled span is a no-op. Tests need a subscriber.\n- Field names must exactly match the declaration: \"items_processed\" not \"itemsProcessed\"\n- Recording must happen BEFORE the span closes (before function returns). 
Place at end of function but before Ok(result).","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:32.011236Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:27:38.620645Z","closed_at":"2026-02-04T17:27:38.620601Z","close_reason":"Added tracing::field::Empty declarations and Span::current().record() calls in 5 functions: ingest_project_issues, ingest_project_merge_requests, drain_resource_events, regenerate_dirty_documents, embed_documents","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-16m8","depends_on_id":"bd-24j1","type":"blocks","created_at":"2026-02-04T15:55:19.962261Z","created_by":"tayloreernisse"},{"issue_id":"bd-16m8","depends_on_id":"bd-34ek","type":"blocks","created_at":"2026-02-04T15:55:20.009988Z","created_by":"tayloreernisse"},{"issue_id":"bd-16m8","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:32.012091Z","created_by":"tayloreernisse"}]} {"id":"bd-17n","title":"OBSERV: Add LoggingConfig to Config struct","description":"## Background\nLoggingConfig centralizes log file settings so users can customize retention and disable file logging. It follows the same #[serde(default)] pattern as SyncConfig (src/core/config.rs:32-78) so existing config.json files continue working with zero changes.\n\n## Approach\nAdd to src/core/config.rs, after the EmbeddingConfig struct (around line 120):\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\n#[serde(default)]\npub struct LoggingConfig {\n /// Directory for log files. Default: None (= XDG data dir + /logs/)\n pub log_dir: Option,\n\n /// Days to retain log files. Default: 30. Set to 0 to disable file logging.\n pub retention_days: u32,\n\n /// Enable JSON log files. 
Default: true.\n pub file_logging: bool,\n}\n\nimpl Default for LoggingConfig {\n fn default() -> Self {\n Self {\n log_dir: None,\n retention_days: 30,\n file_logging: true,\n }\n }\n}\n```\n\nAdd to the Config struct (src/core/config.rs:123-137), after the embedding field:\n\n```rust\n#[serde(default)]\npub logging: LoggingConfig,\n```\n\nNote: Using impl Default rather than default helper functions (default_retention_days, default_true) because #[serde(default)] on the struct applies Default::default() to the entire struct when the key is missing. This is the same pattern used by SyncConfig.\n\n## Acceptance Criteria\n- [ ] Deserializing {} as LoggingConfig yields retention_days=30, file_logging=true, log_dir=None\n- [ ] Deserializing {\"retention_days\": 7} preserves file_logging=true default\n- [ ] Existing config.json files (no \"logging\" key) deserialize without error\n- [ ] Config struct has .logging field accessible\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/config.rs (add LoggingConfig struct + Default impl, add field to Config)\n\n## TDD Loop\nRED: tests/config_tests.rs (or inline #[cfg(test)] mod):\n - test_logging_config_defaults\n - test_logging_config_partial\nGREEN: Add LoggingConfig struct, Default impl, field on Config\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- retention_days=0 means disable file logging entirely (not \"delete all files\") -- document this in the struct doc comment\n- log_dir with a relative path: should be resolved relative to CWD or treated as absolute? 
Decision: treat as absolute, document it\n- Missing \"logging\" key in JSON: #[serde(default)] handles this -- the entire LoggingConfig gets Default::default()","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.471193Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:10:22.751969Z","closed_at":"2026-02-04T17:10:22.751921Z","close_reason":"Added LoggingConfig struct with log_dir, retention_days, file_logging fields and serde defaults","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-17n","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.471849Z","created_by":"tayloreernisse"}]} @@ -16,7 +16,7 @@ {"id":"bd-1fn","title":"[CP1] Integration tests for discussion watermark","description":"Integration tests verifying discussion sync watermark behavior.\n\n## Tests (tests/discussion_watermark_tests.rs)\n\n- skips_discussion_fetch_when_updated_at_unchanged\n- fetches_discussions_when_updated_at_advanced\n- updates_watermark_after_successful_discussion_sync\n- does_not_update_watermark_on_discussion_sync_failure\n\n## Test Scenario\n1. Ingest issue with updated_at = T1\n2. Verify discussions_synced_for_updated_at = T1\n3. Re-run ingest with same issue (updated_at = T1)\n4. Verify NO discussion API calls made (watermark prevents)\n5. Simulate issue update (updated_at = T2)\n6. Re-run ingest\n7. Verify discussion API calls made for T2\n8. Verify watermark updated to T2\n\n## Why This Matters\nDiscussion API is expensive (1 call per issue). 
Watermark ensures\nwe only refetch when issue actually changed, even with cursor rewind.\n\nFiles: tests/discussion_watermark_tests.rs\nDone when: Watermark correctly prevents redundant discussion refetch","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:11.362495Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.086158Z","deleted_at":"2026-01-25T17:02:02.086154Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-1gu","title":"[CP0] gi auth-test command","description":"## Background\n\nauth-test is a quick diagnostic command to verify GitLab connectivity. Used for troubleshooting and CI pipelines. Simpler than doctor because it only checks auth, not full system health.\n\nReference: docs/prd/checkpoint-0.md section \"gi auth-test\"\n\n## Approach\n\n**src/cli/commands/auth-test.ts:**\n```typescript\nimport { Command } from 'commander';\nimport { loadConfig } from '../../core/config';\nimport { GitLabClient } from '../../gitlab/client';\nimport { TokenNotSetError } from '../../core/errors';\n\nexport const authTestCommand = new Command('auth-test')\n .description('Verify GitLab authentication')\n .action(async (options, command) => {\n const globalOpts = command.optsWithGlobals();\n \n // 1. Load config\n const config = loadConfig(globalOpts.config);\n \n // 2. Get token from environment\n const token = process.env[config.gitlab.tokenEnvVar];\n if (!token) {\n throw new TokenNotSetError(config.gitlab.tokenEnvVar);\n }\n \n // 3. Create client and test auth\n const client = new GitLabClient({\n baseUrl: config.gitlab.baseUrl,\n token,\n });\n \n // 4. Get current user\n const user = await client.getCurrentUser();\n \n // 5. 
Output success\n console.log(`Authenticated as @${user.username} (${user.name})`);\n console.log(`GitLab: ${config.gitlab.baseUrl}`);\n });\n```\n\n**Output format:**\n```\nAuthenticated as @johndoe (John Doe)\nGitLab: https://gitlab.example.com\n```\n\n## Acceptance Criteria\n\n- [ ] Loads config from default or --config path\n- [ ] Gets token from configured env var (default GITLAB_TOKEN)\n- [ ] Throws TokenNotSetError if env var not set\n- [ ] Calls GET /api/v4/user to verify auth\n- [ ] Prints username and display name on success\n- [ ] Exit 0 on success\n- [ ] Exit 1 on auth failure (GitLabAuthError)\n- [ ] Exit 1 if config not found (ConfigNotFoundError)\n\n## Files\n\nCREATE:\n- src/cli/commands/auth-test.ts\n\n## TDD Loop\n\nN/A - simple command, verify manually and with integration test in init.test.ts\n\n```bash\n# Manual verification\nexport GITLAB_TOKEN=\"valid-token\"\ngi auth-test\n\n# With invalid token\nexport GITLAB_TOKEN=\"invalid\"\ngi auth-test # should exit 1\n```\n\n## Edge Cases\n\n- Config exists but token env var not set - clear error message\n- Token exists but wrong scopes - GitLabAuthError (401)\n- Network unreachable - GitLabNetworkError\n- Token with extra whitespace - should trim","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:51.135580Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:28:16.369542Z","closed_at":"2026-01-25T03:28:16.369481Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1gu","depends_on_id":"bd-13b","type":"blocks","created_at":"2026-01-24T16:13:10.058655Z","created_by":"tayloreernisse"},{"issue_id":"bd-1gu","depends_on_id":"bd-1l1","type":"blocks","created_at":"2026-01-24T16:13:10.077581Z","created_by":"tayloreernisse"}]} {"id":"bd-1hj","title":"[CP1] Ingestion orchestrator","description":"Coordinate issue + dependent discussion sync with bounded concurrency.\n\n## Module\nsrc/ingestion/orchestrator.rs\n\n## Canonical Pattern 
(CP1)\n\nWhen gi ingest --type=issues runs:\n\n1. **Ingest issues** - cursor-based with incremental cursor updates per page\n2. **Collect touched issues** - record IssueForDiscussionSync for each issue passing cursor filter\n3. **Filter for discussion sync** - enqueue issues where:\n issue.updated_at > issues.discussions_synced_for_updated_at\n4. **Execute discussion sync** - with bounded concurrency (dependent_concurrency from config)\n5. **Update watermark** - after each issue's discussions successfully ingested\n\n## Concurrency Notes\n\nRuntime decision: Use single-threaded Tokio runtime (flavor = \"current_thread\")\n- rusqlite::Connection is !Send, conflicts with multi-threaded runtimes\n- Single-threaded avoids Send bounds entirely\n- Use tokio::task::spawn_local + LocalSet for concurrent discussion fetches\n- Keeps code simple; can upgrade to channel-based DB writer in CP2 if needed\n\n## Configuration Used\n- config.sync.dependent_concurrency - limits parallel discussion requests\n- config.sync.cursor_rewind_seconds - safety margin for cursor\n\n## Progress Reporting\n- Show total issues fetched\n- Show issues needing discussion sync\n- Show discussion/note counts per project\n\nFiles: src/ingestion/orchestrator.rs\nTests: Integration tests with mocked GitLab\nDone when: Full issue + discussion ingestion orchestrated correctly","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:57.325679Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.851047Z","deleted_at":"2026-01-25T17:02:01.851043Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} -{"id":"bd-1ht","title":"Epic: Gate 5 - Code Trace (lore trace)","description":"Implement 'lore trace ' command (Tier 1 API-only) that traces file -> MR -> issue -> discussion chains. 
Builds on file-history (Gate 4) and cross-reference data (Gate 2) to answer 'Why was this code introduced?' Tier 2 (git blame integration) is deferred to Phase C.\n\nChildren: bd-2n4 (trace query), bd-9dd (trace command with human/robot output)","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.141053Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:50:54.487684Z","compaction_level":0,"original_size":0,"labels":["epic","gate-5","phase-b"],"dependencies":[{"issue_id":"bd-1ht","depends_on_id":"bd-14q","type":"blocks","created_at":"2026-02-02T21:34:38.033428Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ht","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:34:37.987232Z","created_by":"tayloreernisse"}]} +{"id":"bd-1ht","title":"Epic: Gate 5 - Code Trace (lore trace)","description":"## Background\nGate 5 implements \"lore trace\" — the command that answers \"Why was this code introduced?\" by tracing from a file path through the MR that modified it, to the issue that motivated the MR, to the discussions that contain the decision rationale. This is the capstone of Phase B, combining data from all previous gates.\n\nGate 5 ships Tier 1 only (API-only, no local git). Tier 2 (git blame integration via git2-rs for line-level precision) is deferred to Phase C.\n\n## Architecture\n- **No new tables.** Trace queries combine mr_file_changes (Gate 4), entity_references (Gate 2), and discussions/notes (existing).\n- **Query flow:** file → mr_file_changes → MRs → entity_references (closes/related) → issues → discussions with DiffNote context\n- **Tier 1 limitation:** File-level granularity only. Cannot trace a specific line to its introducing commit.\n- **Path parsing:** Supports \"src/foo.rs:45\" syntax — line number parsed but deferred with warning about Tier 2.\n- **Rename aware:** Reuses file_history::resolve_rename_chain for multi-path matching.\n\n## Children (Execution Order)\n1. 
**bd-2n4** [OPEN] — Trace query logic: file → MR → issue → discussion chain (src/core/trace.rs)\n2. **bd-9dd** [OPEN] — CLI command with human + robot output (src/cli/commands/trace.rs)\n\n## Gate Completion Criteria\n- [ ] `lore trace <path>` shows MRs that touched the file with linked issues + discussion context\n- [ ] Output includes MR → issue → discussion chain\n- [ ] DiffNote snippets show content positioned on the traced file\n- [ ] Cross-references from entity_references used for MR→issue linking\n- [ ] Robot JSON output with trace_chains array and meta.tier = \"api_only\"\n- [ ] :line suffix parsed with Tier 2 warning message\n- [ ] -p flag for project scoping\n- [ ] --no-follow-renames disables rename chain\n- [ ] Graceful empty state: \"No MR data found. Run lore sync with fetchMrFileChanges: true\"\n\n## Dependencies\n- Depends on: Gate 2 (bd-1se) for entity_references, Gate 4 (bd-14q) for mr_file_changes + commit SHAs\n- Downstream: None — Gate 5 is the terminal gate of Phase B","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.141053Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:33:19.836653Z","compaction_level":0,"original_size":0,"labels":["epic","gate-5","phase-b"],"dependencies":[{"issue_id":"bd-1ht","depends_on_id":"bd-14q","type":"blocks","created_at":"2026-02-02T21:34:38.033428Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ht","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:34:37.987232Z","created_by":"tayloreernisse"}]} {"id":"bd-1i2","title":"Integrate mark_dirty_tx into ingestion modules","description":"## Background\nThis bead integrates dirty source tracking into the existing ingestion pipelines. Every entity upserted during ingestion must be marked dirty so the document regenerator knows to update the corresponding search document. 
The critical constraint: mark_dirty_tx() must be called INSIDE the same transaction that upserts the entity — not after commit.\n\n**Key PRD clarification:** Mark ALL upserted entities dirty (not just changed ones). The regenerator's hash comparison handles \"unchanged\" detection cheaply — this avoids needing change detection in ingestion.\n\n## Approach\nModify 4 existing ingestion files to add mark_dirty_tx() calls inside existing transaction blocks per PRD Section 6.1.\n\n**1. src/ingestion/issues.rs:**\nInside the issue upsert loop, after each successful INSERT/UPDATE:\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::Issue, issue_row.id)?;\n```\n\n**2. src/ingestion/merge_requests.rs:**\nInside the MR upsert loop:\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::MergeRequest, mr_row.id)?;\n```\n\n**3. src/ingestion/discussions.rs:**\nInside discussion insert (issue discussions, full-refresh transaction):\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::Discussion, discussion_row.id)?;\n```\n\n**4. src/ingestion/mr_discussions.rs:**\nInside discussion upsert (write phase):\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::Discussion, discussion_row.id)?;\n```\n\n**Discussion Sweep Cleanup (PRD Section 6.1 — CRITICAL):**\nWhen the MR discussion sweep deletes stale discussions (`last_seen_at < run_start_time`), **delete the corresponding document rows directly** — do NOT use the dirty queue for cleanup. The `ON DELETE CASCADE` on `document_labels`/`document_paths` and the `documents_embeddings_ad` trigger handle all downstream cleanup.\n\n**PRD-exact CTE pattern:**\n```sql\n-- In src/ingestion/mr_discussions.rs, during sweep phase.\n-- Uses a CTE to capture stale IDs atomically before cascading deletes.\n-- This is more defensive than two separate statements because the CTE\n-- guarantees the ID set is captured before any row is deleted.\nWITH stale AS (\n SELECT id FROM discussions\n WHERE merge_request_id = ? 
AND last_seen_at < ?\n)\n-- Step 1: delete orphaned documents (must happen while source_id still resolves)\nDELETE FROM documents\n WHERE source_type = 'discussion' AND source_id IN (SELECT id FROM stale);\n-- Step 2: delete the stale discussions themselves\nDELETE FROM discussions\n WHERE id IN (SELECT id FROM stale);\n```\n\n**NOTE:** If SQLite version doesn't support CTE-based multi-statement, execute as two sequential statements capturing IDs in Rust first:\n```rust\nlet stale_ids: Vec = conn.prepare(\n \"SELECT id FROM discussions WHERE merge_request_id = ? AND last_seen_at < ?\"\n)?.query_map(params![mr_id, run_start], |r| r.get(0))?\n .collect::, _>>()?;\n\nif !stale_ids.is_empty() {\n // Delete documents FIRST (while source_id still resolves)\n conn.execute(\n \"DELETE FROM documents WHERE source_type = 'discussion' AND source_id IN (...)\",\n ...\n )?;\n // Then delete the discussions\n conn.execute(\n \"DELETE FROM discussions WHERE id IN (...)\",\n ...\n )?;\n}\n```\n\n**IMPORTANT difference from dirty queue pattern:** The sweep deletes documents DIRECTLY (not via dirty_sources queue). This is because the source entity is being deleted — there's nothing for the regenerator to regenerate from. 
The cascade handles FTS, labels, paths, and embeddings cleanup.\n\n## Acceptance Criteria\n- [ ] Every upserted issue is marked dirty inside the same transaction\n- [ ] Every upserted MR is marked dirty inside the same transaction\n- [ ] Every upserted discussion (issue + MR) is marked dirty inside the same transaction\n- [ ] ALL upserted entities marked dirty (not just changed ones) — regenerator handles skip\n- [ ] mark_dirty_tx called with &Transaction (not &Connection)\n- [ ] mark_dirty_tx uses upsert with ON CONFLICT to reset backoff state (not INSERT OR IGNORE)\n- [ ] Discussion sweep deletes documents DIRECTLY (not via dirty queue)\n- [ ] Discussion sweep uses CTE (or Rust-side ID capture) to capture stale IDs before cascading deletes\n- [ ] Documents deleted BEFORE discussions (while source_id still resolves)\n- [ ] ON DELETE CASCADE handles document_labels, document_paths cleanup\n- [ ] documents_embeddings_ad trigger handles embedding cleanup\n- [ ] `cargo build` succeeds\n- [ ] Existing ingestion tests still pass\n\n## Files\n- `src/ingestion/issues.rs` — add mark_dirty_tx calls in upsert loop\n- `src/ingestion/merge_requests.rs` — add mark_dirty_tx calls in upsert loop\n- `src/ingestion/discussions.rs` — add mark_dirty_tx calls in insert loop\n- `src/ingestion/mr_discussions.rs` — add mark_dirty_tx calls + direct document deletion in sweep\n\n## TDD Loop\nRED: Existing tests should still pass (regression); new tests:\n- `test_issue_upsert_marks_dirty` — after issue ingest, dirty_sources has entry\n- `test_mr_upsert_marks_dirty` — after MR ingest, dirty_sources has entry\n- `test_discussion_upsert_marks_dirty` — after discussion ingest, dirty_sources has entry\n- `test_discussion_sweep_deletes_documents` — stale discussion documents deleted directly\n- `test_sweep_cascade_cleans_labels_paths` — ON DELETE CASCADE works\nGREEN: Add mark_dirty_tx calls in all 4 files, implement sweep with CTE\nVERIFY: `cargo test ingestion && cargo build`\n\n## Edge 
Cases\n- Upsert that doesn't change data: still marks dirty (regenerator hash check handles skip)\n- Transaction rollback: dirty mark also rolled back (atomic, inside same txn)\n- Discussion sweep with zero stale IDs: CTE returns empty, no DELETE executed\n- Large batch of upserts: each mark_dirty_tx is O(1) INSERT with ON CONFLICT\n- Sweep deletes document before discussion: order matters for source_id resolution","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:27:09.540279Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:39:17.241433Z","closed_at":"2026-01-30T17:39:17.241390Z","close_reason":"Added mark_dirty_tx calls in issues.rs, merge_requests.rs, discussions.rs, mr_discussions.rs (2 paths)","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1i2","depends_on_id":"bd-38q","type":"blocks","created_at":"2026-01-30T15:29:35.105551Z","created_by":"tayloreernisse"}]} {"id":"bd-1j1","title":"Integration test: full Phase B sync pipeline","description":"## Background\nAfter all Gate 1-2 components are built, we need an integration test proving the full pipeline works end-to-end: sync → enqueue dependent fetches → drain queue → extract refs from state events → parse system notes for refs. Without this, individual unit tests pass but the pipeline may not wire together correctly.\n\n## Approach\nCreate tests/phase_b_integration.rs with a comprehensive test suite:\n\n```rust\n#[tokio::test]\nasync fn test_phase_b_sync_pipeline_integration() {\n // 1. Create test DB with migrations 001-012\n // 2. Seed: project, issues, MRs, discussions with system notes\n // 3. Seed: resource_state_events with source_merge_request_id\n // 4. Seed: dependent_fetch_queue entries (state_events, label_events)\n // 5. Run drain_dependent_queue (mocked HTTP → fixture JSON)\n // 6. Run extract_refs_from_state_events\n // 7. Run extract_refs_from_system_notes\n // 8. 
Assert: entity_references populated with correct source/target/type/method\n // 9. Assert: no duplicate refs (INSERT OR IGNORE worked)\n // 10. Assert: unresolved cross-project refs stored correctly\n}\n```\n\nUse wiremock or a trait-based HTTP mock for GitLab API responses. Fixture files in tests/fixtures/phase_b/.\n\n## Acceptance Criteria\n- [ ] Test creates DB, runs all migrations through 012\n- [ ] Test seeds realistic data (issues, MRs, state events, system notes)\n- [ ] Test runs the full pipeline in correct order\n- [ ] Test verifies entity_references from all 3 sources: closes_issues API, state events, system notes\n- [ ] Test verifies deduplication across sources\n- [ ] Test verifies unresolved cross-project references\n- [ ] Test passes with `cargo test phase_b_integration -- --nocapture`\n\n## Files\n- tests/phase_b_integration.rs (new)\n- tests/fixtures/phase_b/state_events.json (new)\n- tests/fixtures/phase_b/label_events.json (new)\n- tests/fixtures/phase_b/system_notes.json (new)\n\n## TDD Loop\nRED: tests/phase_b_integration.rs:\n- `test_full_pipeline_produces_entity_references` - seeds all data, runs full pipeline, asserts entity_references populated from state events + system notes + closes_issues API\n- `test_pipeline_deduplication_across_sources` - same ref discovered by API and system note → single row in entity_references\n- `test_pipeline_unresolved_cross_project_refs` - system note mentioning external project → entity_references row with target_entity_id=NULL and target_iid populated\n- `test_pipeline_empty_queue_succeeds` - no queue entries → pipeline completes with 0 refs, no error\n- `test_pipeline_migrations_001_through_012` - verify all migrations apply cleanly in sequence on fresh DB\n\nSetup: create_test_db helper applying all migrations, seed_phase_b_fixtures() populating issues, MRs, discussions, notes (including system notes with \"closed by !123\" patterns), resource_state_events with source_merge_request fields.\n\nGREEN: Wire 
pipeline calls in correct order, create fixture JSON files\n\nVERIFY: `cargo test phase_b_integration -- --nocapture`\n\n## Edge Cases\n- Empty queue: pipeline completes successfully with 0 refs\n- All refs duplicate: INSERT OR IGNORE produces 0 new inserts\n- Mixed sources: same ref discovered by API + system note → single entry\n- Migration failure: test should fail clearly if migrations don't apply cleanly","status":"open","priority":3,"issue_type":"task","created_at":"2026-02-02T22:42:26.355071Z","created_by":"tayloreernisse","updated_at":"2026-02-03T13:42:58.964288Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1j1","depends_on_id":"bd-1ji","type":"blocks","created_at":"2026-02-02T22:43:27.941002Z","created_by":"tayloreernisse"},{"issue_id":"bd-1j1","depends_on_id":"bd-1se","type":"parent-child","created_at":"2026-02-02T22:43:40.577709Z","created_by":"tayloreernisse"},{"issue_id":"bd-1j1","depends_on_id":"bd-3ia","type":"blocks","created_at":"2026-02-02T22:43:28.048311Z","created_by":"tayloreernisse"},{"issue_id":"bd-1j1","depends_on_id":"bd-8t4","type":"blocks","created_at":"2026-02-02T22:43:27.996061Z","created_by":"tayloreernisse"}]} {"id":"bd-1je","title":"Implement pending discussion queue","description":"## Background\nThe pending discussion queue tracks discussions that need to be fetched from GitLab. When an issue or MR is updated, its discussions may need re-fetching. This queue is separate from dirty_sources (which tracks entities needing document regeneration) — it tracks entities needing API calls to GitLab. 
The queue uses the same backoff pattern as dirty_sources for consistency.\n\n## Approach\nCreate `src/ingestion/discussion_queue.rs`:\n\n```rust\nuse crate::core::backoff::compute_next_attempt_at;\n\n/// Noteable type for discussion queue.\n#[derive(Debug, Clone, Copy)]\npub enum NoteableType {\n Issue,\n MergeRequest,\n}\n\nimpl NoteableType {\n pub fn as_str(&self) -> &'static str {\n match self {\n Self::Issue => \"Issue\",\n Self::MergeRequest => \"MergeRequest\",\n }\n }\n}\n\npub struct PendingFetch {\n pub project_id: i64,\n pub noteable_type: NoteableType,\n pub noteable_iid: i64,\n pub attempt_count: i32,\n}\n\n/// Queue a discussion fetch. ON CONFLICT DO UPDATE resets backoff (consistent with dirty_sources).\npub fn queue_discussion_fetch(\n conn: &Connection,\n project_id: i64,\n noteable_type: NoteableType,\n noteable_iid: i64,\n) -> Result<()>;\n\n/// Get next batch of pending fetches (WHERE next_attempt_at IS NULL OR <= now).\npub fn get_pending_fetches(conn: &Connection, limit: usize) -> Result>;\n\n/// Mark fetch complete (remove from queue).\npub fn complete_fetch(\n conn: &Connection,\n project_id: i64,\n noteable_type: NoteableType,\n noteable_iid: i64,\n) -> Result<()>;\n\n/// Record fetch error with backoff.\npub fn record_fetch_error(\n conn: &Connection,\n project_id: i64,\n noteable_type: NoteableType,\n noteable_iid: i64,\n error: &str,\n) -> Result<()>;\n```\n\n## Acceptance Criteria\n- [ ] queue_discussion_fetch uses ON CONFLICT DO UPDATE (consistent with dirty_sources pattern)\n- [ ] Re-queuing resets: attempt_count=0, next_attempt_at=NULL, last_error=NULL\n- [ ] get_pending_fetches respects next_attempt_at backoff\n- [ ] get_pending_fetches returns entries ordered by queued_at ASC\n- [ ] complete_fetch removes entry from queue\n- [ ] record_fetch_error increments attempt_count, computes next_attempt_at via shared backoff\n- [ ] NoteableType.as_str() returns \"Issue\" or \"MergeRequest\" (matches DB CHECK constraint)\n- [ ] `cargo test 
discussion_queue` passes\n\n## Files\n- `src/ingestion/discussion_queue.rs` — new file\n- `src/ingestion/mod.rs` — add `pub mod discussion_queue;`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_queue_and_get` — queue entry, get returns it\n- `test_requeue_resets_backoff` — queue, error, re-queue -> attempt_count=0\n- `test_backoff_respected` — entry with future next_attempt_at not returned\n- `test_complete_removes` — complete_fetch removes entry\n- `test_error_increments_attempts` — error -> attempt_count=1, next_attempt_at set\nGREEN: Implement all functions\nVERIFY: `cargo test discussion_queue`\n\n## Edge Cases\n- Queue same (project_id, noteable_type, noteable_iid) twice: ON CONFLICT resets state\n- NoteableType must match DB CHECK constraint exactly (\"Issue\", \"MergeRequest\" — capitalized)\n- Empty queue: get_pending_fetches returns empty Vec","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:27:09.505548Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:31:35.496454Z","closed_at":"2026-01-30T17:31:35.496405Z","close_reason":"Implemented discussion_queue with queue/get/complete/record_error + 6 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1je","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:35.034753Z","created_by":"tayloreernisse"},{"issue_id":"bd-1je","depends_on_id":"bd-mem","type":"blocks","created_at":"2026-01-30T15:29:35.071573Z","created_by":"tayloreernisse"}]} @@ -37,7 +37,7 @@ {"id":"bd-1qz","title":"[CP1] Database migration 002_issues.sql","description":"Create migration file with tables for issues, labels, issue_labels, discussions, and notes.\n\n## Tables\n\n### issues\n- id INTEGER PRIMARY KEY\n- gitlab_id INTEGER UNIQUE NOT NULL\n- project_id INTEGER NOT NULL REFERENCES projects(id)\n- iid INTEGER NOT NULL\n- title TEXT, description TEXT, state TEXT\n- author_username TEXT\n- created_at, updated_at, last_seen_at INTEGER (ms epoch 
UTC)\n- discussions_synced_for_updated_at INTEGER (watermark for dependent sync)\n- web_url TEXT\n- raw_payload_id INTEGER REFERENCES raw_payloads(id)\n\n### labels (name-only for CP1)\n- id INTEGER PRIMARY KEY\n- gitlab_id INTEGER (optional, for future Labels API)\n- project_id INTEGER NOT NULL REFERENCES projects(id)\n- name TEXT NOT NULL\n- color TEXT, description TEXT (nullable, deferred)\n- UNIQUE(project_id, name)\n\n### issue_labels (junction)\n- issue_id, label_id with CASCADE DELETE\n- Clear existing links before INSERT to handle removed labels\n\n### discussions\n- gitlab_discussion_id TEXT (string ID from API)\n- project_id, issue_id/merge_request_id FKs\n- noteable_type TEXT ('Issue' | 'MergeRequest')\n- individual_note INTEGER, first_note_at, last_note_at, last_seen_at\n- resolvable, resolved flags\n- CHECK constraint for Issue vs MR exclusivity\n\n### notes\n- gitlab_id INTEGER UNIQUE NOT NULL\n- discussion_id, project_id FKs\n- note_type, is_system, author_username, body\n- timestamps, position (array order)\n- resolution fields, DiffNote position fields\n\n## Indexes\n- idx_issues_project_updated, idx_issues_author, idx_issues_discussions_sync\n- uq_issues_project_iid, uq_labels_project_name\n- idx_issue_labels_label\n- uq_discussions_project_discussion_id, idx_discussions_issue/mr/last_note\n- idx_notes_discussion/author/system\n\nFiles: migrations/002_issues.sql\nDone when: Migration applies cleanly on top of 001_initial.sql, schema_version = 2","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:31.464544Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.685262Z","deleted_at":"2026-01-25T17:02:01.685258Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-1re","title":"[CP1] gi show issue command","description":"Show issue details with discussions.\n\nFlags:\n- --project=PATH (required if iid is 
ambiguous across projects)\n\nOutput:\n- Title, project, state, author, dates, labels, URL\n- Description text\n- All discussions with notes (formatted thread view)\n\nHandle ambiguity: If multiple projects have same iid, prompt for --project or show error.\n\nFiles: src/cli/commands/show.ts\nDone when: Issue detail view displays all fields including threaded discussions","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T15:20:29.826786Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.153211Z","deleted_at":"2026-01-25T15:21:35.153208Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-1s1","title":"[CP1] Integration tests for issue ingestion","description":"Full integration tests for issue ingestion module.\n\n## Tests (tests/issue_ingestion_tests.rs)\n\n- inserts_issues_into_database\n- creates_labels_from_issue_payloads\n- links_issues_to_labels_via_junction_table\n- removes_stale_label_links_on_resync\n- stores_raw_payload_for_each_issue\n- stores_raw_payload_for_each_discussion\n- updates_cursor_incrementally_per_page\n- resumes_from_cursor_on_subsequent_runs\n- handles_issues_with_no_labels\n- upserts_existing_issues_on_refetch\n- skips_discussion_refetch_for_unchanged_issues\n\n## Test Setup\n- tempfile::TempDir for isolated database\n- wiremock::MockServer for GitLab API\n- Mock handlers returning fixture data\n\nFiles: tests/issue_ingestion_tests.rs\nDone when: All integration tests pass with mocked GitLab","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:12.158586Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.109109Z","deleted_at":"2026-01-25T17:02:02.109105Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} -{"id":"bd-1se","title":"Epic: Gate 2 - Cross-Reference 
Extraction","description":"Build entity relationship graph from structured APIs (closes_issues endpoint, state event source_merge_request_id) and best-effort system note parsing. The entity_references table is included in migration 011 (same migration as Gate 1 event tables).\n\nChildren: bd-czk (entity_references schema, folded into bd-hu3's migration), bd-8t4 (state event extraction), bd-3ia (closes_issues API), bd-1ji (system note parsing)","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:00.981132Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:50:44.441799Z","compaction_level":0,"original_size":0,"labels":["epic","gate-2","phase-b"],"dependencies":[{"issue_id":"bd-1se","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:32:43.028033Z","created_by":"tayloreernisse"}]} +{"id":"bd-1se","title":"Epic: Gate 2 - Cross-Reference Extraction","description":"## Background\nGate 2 builds the entity relationship graph that connects issues, MRs, and discussions. Without cross-references, temporal queries can only show events for individually-matched entities. With them, \"lore timeline auth migration\" can discover that MR !567 closed issue #234, which spawned follow-up issue #299 — even if #299 does not contain the words \"auth migration.\"\n\nThree data sources feed entity_references:\n1. **Structured API (reliable):** GET /projects/:id/merge_requests/:iid/closes_issues\n2. **State events (reliable):** resource_state_events.source_merge_request_id\n3. **System note parsing (best-effort):** \"mentioned in !456\", \"closed by !789\" patterns\n\n## Architecture\n- **entity_references table:** Already created in migration 011 (bd-hu3/bd-czk). Stores source→target relationships with reference_type (closes/mentioned/related) and source_method provenance.\n- **Directionality convention:** source = entity where reference was observed, target = entity being referenced. 
Consistent across all source_methods.\n- **Unresolved references:** Cross-project refs stored with target_entity_id=NULL, target_project_path populated. Still valuable for timeline narratives.\n- **closes_issues fetch:** Uses generic dependent fetch queue (job_type = mr_closes_issues). One API call per MR.\n- **System note parsing:** Local post-processing after all dependent fetches complete. No API calls. English-only, best-effort.\n\n## Children (Execution Order)\n1. **bd-czk** [CLOSED] — entity_references schema (folded into migration 011)\n2. **bd-8t4** [OPEN] — Extract cross-references from resource_state_events (source_merge_request_id)\n3. **bd-3ia** [OPEN] — Fetch closes_issues API and populate entity_references\n4. **bd-1ji** [OPEN] — Parse system notes for cross-reference patterns\n\n## Gate Completion Criteria\n- [ ] entity_references populated from closes_issues API for all synced MRs\n- [ ] entity_references populated from state events where source_merge_request_id present\n- [ ] System notes parsed for cross-reference patterns (English instances)\n- [ ] Cross-project references stored as unresolved (target_entity_id=NULL)\n- [ ] source_method tracks provenance of each reference\n- [ ] Deduplication: same relationship from multiple sources stored once (UNIQUE constraint)\n- [ ] Timeline JSON includes expansion provenance (via) for expanded entities\n- [ ] Integration test: sync with all three extraction methods, verify entity_references populated\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) — event tables and dependent fetch queue\n- Downstream: Gate 3 (bd-ike) depends on entity_references for BFS 
expansion","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:00.981132Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:32:29.146809Z","compaction_level":0,"original_size":0,"labels":["epic","gate-2","phase-b"],"dependencies":[{"issue_id":"bd-1se","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:32:43.028033Z","created_by":"tayloreernisse"}]} {"id":"bd-1t4","title":"Epic: CP2 Gate C - Dependent Discussion Sync","description":"## Background\nGate C validates the dependent discussion sync with DiffNote position capture. This is critical for code review context preservation - without DiffNote positions, we lose the file/line context for review comments.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] Discussions fetched for MRs with updated_at > discussions_synced_for_updated_at\n- [ ] `SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL` > 0\n- [ ] DiffNotes have `position_new_path` populated (file path)\n- [ ] DiffNotes have `position_new_line` populated (line number)\n- [ ] DiffNotes have `position_type` populated (text/image/file)\n- [ ] DiffNotes have SHA triplet: `position_base_sha`, `position_start_sha`, `position_head_sha`\n- [ ] Multi-line DiffNotes have `position_line_range_start` and `position_line_range_end`\n- [ ] Unchanged MRs skip discussion refetch (watermark comparison works)\n- [ ] Watermark NOT advanced on HTTP error mid-pagination\n- [ ] Watermark NOT advanced on note timestamp parse failure\n- [ ] `gi show mr ` displays DiffNote with file context `[path:line]`\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate C: Dependent Discussion Sync ===\"\n\n# 1. 
Check discussion count for MRs\necho \"Step 1: Check MR discussion count...\"\nMR_DISC_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL;\")\necho \" MR discussions: $MR_DISC_COUNT\"\n[ \"$MR_DISC_COUNT\" -gt 0 ] || { echo \"FAIL: No MR discussions found\"; exit 1; }\n\n# 2. Check note count\necho \"Step 2: Check note count...\"\nNOTE_COUNT=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id IS NOT NULL;\n\")\necho \" MR notes: $NOTE_COUNT\"\n\n# 3. Check DiffNote position data\necho \"Step 3: Check DiffNote positions...\"\nDIFFNOTE_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM notes WHERE position_new_path IS NOT NULL;\")\necho \" DiffNotes with position: $DIFFNOTE_COUNT\"\n\n# 4. Sample DiffNote data\necho \"Step 4: Sample DiffNote data...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT \n n.gitlab_id,\n n.position_new_path,\n n.position_new_line,\n n.position_type,\n SUBSTR(n.position_head_sha, 1, 7) as head_sha\n FROM notes n\n WHERE n.position_new_path IS NOT NULL\n LIMIT 5;\n\"\n\n# 5. Check multi-line DiffNotes\necho \"Step 5: Check multi-line DiffNotes...\"\nMULTILINE_COUNT=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes \n WHERE position_line_range_start IS NOT NULL \n AND position_line_range_end IS NOT NULL\n AND position_line_range_start != position_line_range_end;\n\")\necho \" Multi-line DiffNotes: $MULTILINE_COUNT\"\n\n# 6. Check watermarks set\necho \"Step 6: Check watermarks...\"\nWATERMARKED=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NOT NULL;\n\")\necho \" MRs with watermark set: $WATERMARKED\"\n\n# 7. 
Check last_seen_at for sweep pattern\necho \"Step 7: Check last_seen_at (sweep pattern)...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT \n MIN(last_seen_at) as oldest,\n MAX(last_seen_at) as newest\n FROM discussions \n WHERE merge_request_id IS NOT NULL;\n\"\n\n# 8. Test show command with DiffNote\necho \"Step 8: Find MR with DiffNotes for show test...\"\nMR_IID=$(sqlite3 \"$DB_PATH\" \"\n SELECT DISTINCT m.iid\n FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n JOIN notes n ON n.discussion_id = d.id\n WHERE n.position_new_path IS NOT NULL\n LIMIT 1;\n\")\nif [ -n \"$MR_IID\" ]; then\n echo \" Testing: gi show mr $MR_IID\"\n gi show mr \"$MR_IID\" | head -50\nfi\n\n# 9. Re-run and verify skip count\necho \"Step 9: Re-run ingest (should skip unchanged MRs)...\"\ngi ingest --type=merge_requests\n# Should report \"Skipped discussion sync for N unchanged MRs\"\n\necho \"\"\necho \"=== Gate C: PASSED ===\"\n```\n\n## Atomicity Test (Manual - Kill Test)\n```bash\n# This tests that partial failure preserves data\n\n# 1. Get an MR with discussions\nMR_ID=$(sqlite3 \"$DB_PATH\" \"\n SELECT m.id FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n LIMIT 1;\n\")\n\n# 2. Note current note count\nBEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id = $MR_ID;\n\")\necho \"Notes before: $BEFORE\"\n\n# 3. Note watermark\nWATERMARK_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark before: $WATERMARK_BEFORE\"\n\n# 4. Force full sync and kill mid-run\ngi ingest --type=merge_requests --full &\nPID=$!\nsleep 3 && kill -9 $PID 2>/dev/null || true\nwait $PID 2>/dev/null || true\n\n# 5. 
Verify notes preserved (should be same or more, never less)\nAFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id = $MR_ID;\n\")\necho \"Notes after kill: $AFTER\"\n[ \"$AFTER\" -ge \"$BEFORE\" ] || echo \"WARNING: Notes decreased - atomicity may be broken\"\n\n# 6. Note watermark should NOT have advanced if killed mid-pagination\nWATERMARK_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark after: $WATERMARK_AFTER\"\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check DiffNote data:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n (SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL) as mr_discussions,\n (SELECT COUNT(*) FROM notes WHERE position_new_path IS NOT NULL) as diffnotes,\n (SELECT COUNT(*) FROM merge_requests WHERE discussions_synced_for_updated_at IS NOT NULL) as watermarked;\n\"\n\n# Find MR with DiffNotes and show it:\ngi show mr $(sqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT DISTINCT m.iid FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n JOIN notes n ON n.discussion_id = d.id\n WHERE n.position_new_path IS NOT NULL LIMIT 1;\n\")\n```\n\n## Dependencies\nThis gate requires:\n- bd-3j6 (Discussion transformer with DiffNote position extraction)\n- bd-20h (MR discussion ingestion with atomicity guarantees)\n- bd-iba (Client pagination for MR discussions)\n- Gates A and B must pass first\n\n## Edge Cases\n- MRs without discussions: should sync successfully, just with 0 discussions\n- Discussions without DiffNotes: regular comments have NULL position fields\n- Deleted discussions in GitLab: sweep pattern should remove them locally\n- Invalid note timestamps: should NOT advance watermark, should log 
warning","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:01.769694Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.060017Z","closed_at":"2026-01-27T00:48:21.059974Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1t4","depends_on_id":"bd-20h","type":"blocks","created_at":"2026-01-26T22:08:55.778989Z","created_by":"tayloreernisse"}]} {"id":"bd-1ta","title":"[CP1] Integration tests for pagination","description":"Integration tests for GitLab pagination with wiremock.\n\n## Tests (tests/pagination_tests.rs)\n\n### Page Navigation\n- fetches_all_pages_when_multiple_exist\n- respects_per_page_parameter\n- follows_x_next_page_header_until_empty\n- falls_back_to_empty_page_stop_if_headers_missing\n\n### Cursor Behavior\n- applies_cursor_rewind_for_tuple_semantics\n- clamps_negative_rewind_to_zero\n\n## Test Setup\n- Use wiremock::MockServer\n- Set up handlers for /api/v4/projects/:id/issues\n- Return x-next-page headers\n- Verify request params (updated_after, per_page)\n\nFiles: tests/pagination_tests.rs\nDone when: All pagination tests pass with mocked server","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:07.806593Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.038945Z","deleted_at":"2026-01-25T17:02:02.038939Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-1u1","title":"Implement document regenerator","description":"## Background\nThe document regenerator drains the dirty_sources queue, regenerating documents for each entry. It uses per-item transactions for crash safety, a triple-hash fast path to skip unchanged documents entirely (no writes at all), and a bounded batch loop that drains completely. 
Error recording includes backoff computation.\n\n## Approach\nCreate `src/documents/regenerator.rs` per PRD Section 6.3.\n\n**Core function:**\n```rust\npub fn regenerate_dirty_documents(conn: &Connection) -> Result\n```\n\n**RegenerateResult:** { regenerated, unchanged, errored }\n\n**Algorithm (per PRD):**\n1. Loop: get_dirty_sources(conn) -> Vec<(SourceType, i64)>\n2. If empty, break (queue fully drained)\n3. For each (source_type, source_id):\n a. Begin transaction\n b. Call regenerate_one_tx(&tx, source_type, source_id) -> Result\n c. If Ok(changed): clear_dirty_tx, commit, count regenerated or unchanged\n d. If Err: record_dirty_error_tx (with backoff), commit, count errored\n\n**regenerate_one_tx (per PRD):**\n1. Extract document via extract_{type}_document(conn, source_id)\n2. If None (deleted): delete_document, return Ok(true)\n3. If Some(doc): call get_existing_hash() to check current state\n4. **If ALL THREE hashes match: return Ok(false) — skip ALL writes** (fast path)\n5. Otherwise: upsert_document with conditional label/path relinking\n6. Return Ok(content changed)\n\n**Helper functions (PRD-exact):**\n\n`get_existing_hash` — uses `optional()` to distinguish missing rows from DB errors:\n```rust\nfn get_existing_hash(\n conn: &Connection,\n source_type: SourceType,\n source_id: i64,\n) -> Result> {\n use rusqlite::OptionalExtension;\n let hash: Option = stmt\n .query_row(params, |row| row.get(0))\n .optional()?; // IMPORTANT: Not .ok() — .ok() would hide real DB errors\n Ok(hash)\n}\n```\n\n`get_document_id` — resolve document ID after upsert:\n```rust\nfn get_document_id(conn: &Connection, source_type: SourceType, source_id: i64) -> Result\n```\n\n`upsert_document` — checks existing triple hash before writing:\n```rust\nfn upsert_document(conn: &Connection, doc: &DocumentData) -> Result<()> {\n // 1. Query existing (id, content_hash, labels_hash, paths_hash) via OptionalExtension\n // 2. Triple-hash fast path: all match -> return Ok(())\n // 3. 
Upsert document row (ON CONFLICT DO UPDATE)\n // 4. Get doc_id (from existing or query after insert)\n // 5. Only delete+reinsert labels if labels_hash changed\n // 6. Only delete+reinsert paths if paths_hash changed\n}\n```\n\n**Key PRD detail — triple-hash fast path:**\n```rust\nif old_content_hash == &doc.content_hash\n && old_labels_hash == &doc.labels_hash\n && old_paths_hash == &doc.paths_hash\n{ return Ok(()); } // Skip ALL writes — prevents WAL churn\n```\n\n**Error recording with backoff:**\nrecord_dirty_error_tx reads current attempt_count from DB, computes next_attempt_at via shared backoff utility:\n```rust\nlet next_attempt_at = crate::core::backoff::compute_next_attempt_at(now, attempt_count + 1);\n```\n\n**All internal functions use _tx suffix** (take &Transaction) for atomicity.\n\n## Acceptance Criteria\n- [ ] Queue fully drained (bounded batch loop until empty)\n- [ ] Per-item transactions (crash loses at most 1 doc)\n- [ ] Triple-hash fast path: ALL THREE hashes match -> skip ALL writes (return Ok(false))\n- [ ] Content change: upsert document, update labels/paths\n- [ ] Labels-only change: relabels but skips path writes (paths_hash unchanged)\n- [ ] Deleted entity: delete document (cascade handles FTS/labels/paths/embeddings)\n- [ ] get_existing_hash uses `.optional()` (not `.ok()`) to preserve DB errors\n- [ ] get_document_id resolves document ID after upsert\n- [ ] Error recording: increment attempt_count, compute next_attempt_at via backoff\n- [ ] FTS triggers fire on insert/update/delete (verified by trigger, not regenerator)\n- [ ] RegenerateResult counts accurate (regenerated, unchanged, errored)\n- [ ] Errors do not abort batch (log, increment, continue)\n- [ ] `cargo test regenerator` passes\n\n## Files\n- `src/documents/regenerator.rs` — new file\n- `src/documents/mod.rs` — add `pub use regenerator::regenerate_dirty_documents;`\n\n## TDD Loop\nRED: Tests requiring DB:\n- `test_creates_new_document` — dirty source -> document created\n- 
`test_skips_unchanged_triple_hash` — all 3 hashes match -> unchanged count incremented, no DB writes\n- `test_updates_changed_content` — content_hash mismatch -> updated\n- `test_updates_changed_labels_only` — content same but labels_hash different -> updated\n- `test_updates_changed_paths_only` — content same but paths_hash different -> updated\n- `test_deletes_missing_source` — source deleted -> document deleted\n- `test_drains_queue` — queue empty after regeneration\n- `test_error_records_backoff` — error -> attempt_count incremented, next_attempt_at set\n- `test_get_existing_hash_not_found` — returns Ok(None) for missing document\nGREEN: Implement regenerate_dirty_documents + all helpers\nVERIFY: `cargo test regenerator`\n\n## Edge Cases\n- Empty queue: return immediately with all-zero counts\n- Extractor error for one item: record_dirty_error_tx, commit, continue\n- Triple-hash prevents WAL churn on incremental syncs (most entities unchanged)\n- Labels change but content does not: labels_hash mismatch triggers upsert with label relinking\n- get_existing_hash on missing document: returns Ok(None) via .optional() (not DB error)\n- get_existing_hash on corrupt DB: propagates real DB error (not masked by .ok())","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:55.178825Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:41:29.942386Z","closed_at":"2026-01-30T17:41:29.942324Z","close_reason":"Implemented document regenerator with triple-hash fast path, queue draining, fail-soft error handling + 5 
tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1u1","depends_on_id":"bd-1yz","type":"blocks","created_at":"2026-01-30T15:29:16.020686Z","created_by":"tayloreernisse"},{"issue_id":"bd-1u1","depends_on_id":"bd-247","type":"blocks","created_at":"2026-01-30T15:29:15.982772Z","created_by":"tayloreernisse"},{"issue_id":"bd-1u1","depends_on_id":"bd-2fp","type":"blocks","created_at":"2026-01-30T15:29:16.055043Z","created_by":"tayloreernisse"}]} @@ -91,7 +91,7 @@ {"id":"bd-2yo","title":"Fetch MR diffs API and populate mr_file_changes","description":"## Background\nGET /projects/:id/merge_requests/:iid/diffs returns file change metadata. We extract file paths and change types but NOT diff content. Uses the generic dependent fetch queue (job_type = 'mr_diffs').\n\n## Approach\n\n**1. Add API endpoint (src/gitlab/client.rs):**\n```rust\npub async fn fetch_mr_diffs(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabDiffFile>>\n```\n\nNew type in src/gitlab/types.rs:\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabDiffFile {\n pub old_path: String,\n pub new_path: String,\n pub new_file: bool,\n pub renamed_file: bool,\n pub deleted_file: bool,\n // diff content fields exist but we ignore them\n}\n```\n\nURL: `GET /api/v4/projects/{project_id}/merge_requests/{iid}/diffs?per_page=100`\n\n**2. Enqueue jobs during MR ingestion:**\nIn orchestrator.rs, after MR upsert (when fetch_mr_file_changes is true):\n```rust\nif config.sync.fetch_mr_file_changes {\n enqueue_job(conn, project_id, \"merge_request\", iid, local_id, \"mr_diffs\", None)?;\n}\n```\n\n**3.
Process jobs in drain step:**\nHandle \"mr_diffs\" job_type:\n```rust\nlet diffs = client.fetch_mr_diffs(gitlab_project_id, job.entity_iid).await?;\n// DELETE existing rows for this MR (diffs can change on rebase)\nconn.execute(\"DELETE FROM mr_file_changes WHERE merge_request_id = ?\", [job.entity_local_id])?;\n// Insert new rows\nfor diff in &diffs {\n let change_type = if diff.new_file { \"added\" }\n else if diff.renamed_file { \"renamed\" }\n else if diff.deleted_file { \"deleted\" }\n else { \"modified\" };\n conn.execute(\n \"INSERT INTO mr_file_changes (merge_request_id, project_id, old_path, new_path, change_type) VALUES (?, ?, ?, ?, ?)\",\n params![job.entity_local_id, project_id, \n if diff.renamed_file { Some(&diff.old_path) } else { None },\n &diff.new_path, change_type],\n )?;\n}\n```\n\n**4. Also capture commit SHAs:**\nDuring MR ingestion (orchestrator.rs), update merge_requests with merge_commit_sha and squash_commit_sha from the GitLab API response. These fields need to be added to GitLabMergeRequest type and transformer.\n\nAdd to src/gitlab/types.rs GitLabMergeRequest:\n```rust\npub merge_commit_sha: Option<String>,\npub squash_commit_sha: Option<String>,\n```\n\nUpdate MR transformer to pass these through, and UPDATE merge_requests SET merge_commit_sha = ?, squash_commit_sha = ?
during upsert.\n\n## Acceptance Criteria\n- [ ] fetch_mr_diffs returns file metadata (no diff content)\n- [ ] Change types correctly derived: new_file→added, renamed_file→renamed, deleted_file→deleted, else→modified\n- [ ] Re-sync DELETEs + re-inserts (handles rebased MRs)\n- [ ] old_path only populated for renamed files\n- [ ] merge_commit_sha and squash_commit_sha captured in merge_requests table\n- [ ] Jobs only enqueued when fetch_mr_file_changes is true\n\n## Files\n- src/gitlab/client.rs (add fetch_mr_diffs)\n- src/gitlab/types.rs (add GitLabDiffFile, add fields to GitLabMergeRequest)\n- src/gitlab/transformers/merge_request.rs (pass through commit SHAs)\n- src/ingestion/orchestrator.rs (enqueue mr_diffs jobs, update commit SHAs)\n- src/core/drain.rs or sync.rs (handle mr_diffs in drain dispatcher)\n\n## TDD Loop\nRED: tests/file_changes_tests.rs:\n- `test_derive_change_type_added` - new_file=true → \"added\"\n- `test_derive_change_type_renamed` - renamed_file=true → \"renamed\", old_path populated\n- `test_derive_change_type_deleted` - deleted_file=true → \"deleted\"\n- `test_derive_change_type_modified` - all false → \"modified\"\n- `test_resync_deletes_and_reinserts` - insert, then re-process with different files, verify old rows gone\n\ntests/gitlab_types_tests.rs:\n- `test_deserialize_diff_file` - verify GitLabDiffFile deserialization\n- `test_deserialize_mr_with_commit_shas` - verify new fields on GitLabMergeRequest\n\nGREEN: Implement API endpoint, change type derivation, drain handler, commit SHA capture\n\nVERIFY: `cargo test file_changes -- --nocapture && cargo test gitlab_types -- --nocapture`\n\n## Edge Cases\n- MR with 1000+ files (monorepo): pagination essential on diffs endpoint\n- old_path for non-renames: GitLab still returns old_path (same as new_path) — only store when renamed_file=true\n- Draft MRs: diffs may change frequently — DELETE+INSERT handles this\n- MR with no diffs (empty MR): returns empty array, no rows inserted, job still 
completed\n- merge_commit_sha is NULL until MR is merged — don't error on NULL","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.939514Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:48:37.319521Z","compaction_level":0,"original_size":0,"labels":["api","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-2yo","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.941359Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-1oo","type":"blocks","created_at":"2026-02-02T21:34:16.555239Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-jec","type":"blocks","created_at":"2026-02-02T21:34:16.656402Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-tir","type":"blocks","created_at":"2026-02-02T21:34:16.605198Z","created_by":"tayloreernisse"}]} {"id":"bd-2yq","title":"[CP1] Issue transformer with label extraction","description":"Transform GitLab issue payloads to normalized database schema.\n\nFunctions to implement:\n- transformIssue(gitlabIssue, localProjectId) → NormalizedIssue\n- extractLabels(gitlabIssue, localProjectId) → Label[]\n\nTransformation rules:\n- Convert ISO timestamps to ms epoch using isoToMs()\n- Set last_seen_at to nowMs()\n- Handle labels vs labels_details (prefer details when available)\n- Handle missing optional fields gracefully\n\nFiles: src/gitlab/transformers/issue.ts\nTests: tests/unit/issue-transformer.test.ts\nDone when: Unit tests pass for payload transformation and label extraction","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:09.660448Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.152259Z","deleted_at":"2026-01-25T15:21:35.152254Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-2ys","title":"[CP1] Cargo.toml updates - async-stream and futures","description":"## 
Background\n\nThe GitLab client pagination methods require async streaming capabilities. The `async-stream` crate provides the `stream!` macro for creating async iterators, and `futures` provides `StreamExt` for consuming them with `.next()` and other combinators.\n\n## Approach\n\nAdd these dependencies to Cargo.toml:\n\n```toml\n[dependencies]\nasync-stream = \"0.3\"\nfutures = { version = \"0.3\", default-features = false, features = [\"alloc\"] }\n```\n\nUse minimal features on `futures` to avoid pulling unnecessary code.\n\n## Acceptance Criteria\n\n- [ ] `async-stream = \"0.3\"` is in Cargo.toml [dependencies]\n- [ ] `futures` with `alloc` feature is in Cargo.toml [dependencies]\n- [ ] `cargo check` succeeds after adding dependencies\n\n## Files\n\n- Cargo.toml (edit)\n\n## TDD Loop\n\nRED: Not applicable (dependency addition)\nGREEN: Add lines to Cargo.toml\nVERIFY: `cargo check`\n\n## Edge Cases\n\n- If `futures` is already present, merge features rather than duplicate\n- Use exact version pins for reproducibility","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.104664Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:25:10.274787Z","closed_at":"2026-01-25T22:25:10.274727Z","close_reason":"Added async-stream 0.3 and futures 0.3 (alloc feature) to Cargo.toml, cargo check passes","compaction_level":0,"original_size":0} -{"id":"bd-2zl","title":"Epic: Gate 1 - Resource Events Ingestion","description":"Ingest structured event data from GitLab Resource Events APIs (state, label, milestone) into local event tables. 
Includes migration 011 (renumbered from spec's 010 since 010_chunk_config.sql already exists), config extension, API client endpoints, generic dependent fetch queue, and ingestion pipeline integration.\n\nChildren: bd-hu3 (migration 011), bd-2e8 (config), bd-2fm (types), bd-sqw (API), bd-1uc (DB upserts), bd-tir (queue), bd-1ep (pipeline wiring), bd-3sh (count events), bd-1m8 (stats check)","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:30:49.136036Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:50:40.551205Z","compaction_level":0,"original_size":0,"labels":["epic","gate-1","phase-b"]} +{"id":"bd-2zl","title":"Epic: Gate 1 - Resource Events Ingestion","description":"## Background\nGate 1 transforms gitlore from a snapshot engine into a temporal data store by ingesting structured event data from GitLab Resource Events APIs (state, label, milestone changes). This is the foundation — Gates 2-5 all depend on the event tables and dependent fetch queue that Gate 1 establishes.\n\nCurrently, when an issue is closed or a label changes, gitlore overwrites the current state. The transition is lost. Gate 1 captures these transitions as discrete events with timestamps, actors, and provenance, enabling temporal queries like \"when did this issue become critical?\" and \"who closed this MR?\"\n\n## Architecture\n- **Three new tables:** resource_state_events, resource_label_events, resource_milestone_events (migration 011, already shipped as bd-hu3)\n- **Generic dependent fetch queue:** pending_dependent_fetches table replaces per-type queue tables. Supports job_types: resource_events, mr_closes_issues, mr_diffs. Used by Gates 1, 2, and 4.\n- **Opt-in via config:** sync.fetchResourceEvents (default true). --no-events CLI flag to skip.\n- **Incremental:** Only changed entities enqueued. --full re-enqueues all.\n- **Crash recovery:** locked_at column with 5-minute stale lock reclaim.\n\n## Children (Execution Order)\n1. 
**bd-hu3** [CLOSED] — Migration 011: event tables + entity_references + dependent fetch queue\n2. **bd-2e8** [CLOSED] — fetchResourceEvents config flag\n3. **bd-2fm** [CLOSED] — GitLab Resource Event serde types\n4. **bd-sqw** [CLOSED] — Resource Events API endpoints in GitLab client\n5. **bd-1uc** [CLOSED] — DB upsert functions for resource events\n6. **bd-tir** [CLOSED] — Generic dependent fetch queue (enqueue + drain)\n7. **bd-1ep** [CLOSED] — Wire resource event fetching into sync pipeline\n8. **bd-3sh** [CLOSED] — lore count events command\n9. **bd-1m8** [CLOSED] — lore stats --check for event integrity + queue health\n\n## Gate Completion Criteria\n- [ ] All 9 children closed\n- [ ] `lore sync` fetches resource events for changed entities\n- [ ] `lore sync --no-events` skips event fetching\n- [ ] Event fetch failures queued for retry with exponential backoff\n- [ ] Stale locks auto-reclaimed on next sync run\n- [ ] `lore count events` shows counts by type (state/label/milestone)\n- [ ] `lore stats --check` validates referential integrity + queue health\n- [ ] Robot mode JSON for all new commands\n- [ ] Integration test: full sync cycle with events enabled\n\n## Dependencies\n- None (Gate 1 is the foundation)\n- Downstream: Gate 2 (bd-1se) depends on event tables and dependent fetch queue","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:30:49.136036Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:32:13.737741Z","compaction_level":0,"original_size":0,"labels":["epic","gate-1","phase-b"]} {"id":"bd-2zr","title":"[CP1] GitLab client pagination methods","description":"Add async stream methods for paginated GitLab API calls.\n\n## Methods to Add to GitLabClient\n\n### paginate_issues(gitlab_project_id, updated_after, cursor_rewind_seconds) -> Stream\n- Use async_stream::try_stream! 
macro\n- Query params: scope=all, state=all, order_by=updated_at, sort=asc, per_page=100\n- If updated_after provided, apply cursor_rewind_seconds (subtract from timestamp)\n- Clamp to 0 to avoid underflow: (ts - rewind_ms).max(0)\n- Follow x-next-page header until empty/absent\n- Fall back to empty-page detection if headers missing\n\n### paginate_issue_discussions(gitlab_project_id, issue_iid) -> Stream\n- Paginate through discussions for single issue\n- per_page=100\n- Follow x-next-page header\n\n### request_with_headers(path, params) -> Result<(T, HeaderMap)>\n- Acquire rate limiter\n- Make request with PRIVATE-TOKEN header\n- Return both deserialized data and response headers\n\n## Dependencies\n- async-stream = \"0.3\" (for try_stream! macro)\n- futures = \"0.3\" (for Stream trait and StreamExt)\n\nFiles: src/gitlab/client.rs\nTests: tests/pagination_tests.rs\nDone when: Pagination handles multiple pages and respects cursors, tests pass","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:13.045971Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.784887Z","deleted_at":"2026-01-25T17:02:01.784883Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-31b","title":"[CP1] Discussion ingestion module","description":"Fetch and store discussions/notes for each issue.\n\nImplement ingestIssueDiscussions(options) → { discussionsFetched, discussionsUpserted, notesUpserted, systemNotesCount }\n\nLogic:\n1. Paginate through all discussions for given issue\n2. 
For each discussion:\n - Store raw payload (compressed)\n - Upsert discussion record with correct issue FK\n - Transform and upsert all notes\n - Store raw payload per note\n - Track system notes count\n\nFiles: src/ingestion/discussions.ts\nTests: tests/integration/issue-discussion-ingestion.test.ts\nDone when: Discussions and notes populated with correct FKs and is_system flags","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:57.131442Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.156574Z","deleted_at":"2026-01-25T15:21:35.156571Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-31i","title":"Epic: CP2 Gate B - Labels + Assignees + Reviewers","description":"## Background\nGate B validates junction tables for labels, assignees, and reviewers. Ensures relationships are tracked correctly and stale links are removed on resync. This is critical for filtering (`--reviewer=alice`) and display.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] `mr_labels` table has rows for MRs with labels\n- [ ] Label count per MR matches GitLab UI (spot check 3 MRs)\n- [ ] `mr_assignees` table has rows for MRs with assignees\n- [ ] Assignee usernames match GitLab UI (spot check 3 MRs)\n- [ ] `mr_reviewers` table has rows for MRs with reviewers\n- [ ] Reviewer usernames match GitLab UI (spot check 3 MRs)\n- [ ] Remove label in GitLab -> resync -> link removed from mr_labels\n- [ ] Add reviewer in GitLab -> resync -> link added to mr_reviewers\n- [ ] `gi list mrs --label=bugfix` filters correctly\n- [ ] `gi list mrs --reviewer=alice` filters correctly\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate B: Labels + Assignees + Reviewers ===\"\n\n# 1. 
Check label linkage exists\necho \"Step 1: Check label linkage...\"\nLABEL_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_labels;\")\necho \" Total label links: $LABEL_LINKS\"\n\n# 2. Show sample label linkage\necho \"Step 2: Sample label linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(l.name, ', ') as labels\n FROM merge_requests m\n JOIN mr_labels ml ON ml.merge_request_id = m.id\n JOIN labels l ON l.id = ml.label_id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 3. Check assignee linkage\necho \"Step 3: Check assignee linkage...\"\nASSIGNEE_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_assignees;\")\necho \" Total assignee links: $ASSIGNEE_LINKS\"\n\n# 4. Show sample assignee linkage\necho \"Step 4: Sample assignee linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(a.username, ', ') as assignees\n FROM merge_requests m\n JOIN mr_assignees a ON a.merge_request_id = m.id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 5. Check reviewer linkage\necho \"Step 5: Check reviewer linkage...\"\nREVIEWER_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_reviewers;\")\necho \" Total reviewer links: $REVIEWER_LINKS\"\n\n# 6. Show sample reviewer linkage\necho \"Step 6: Sample reviewer linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(r.username, ', ') as reviewers\n FROM merge_requests m\n JOIN mr_reviewers r ON r.merge_request_id = m.id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 7. 
Test filter commands\necho \"Step 7: Test filter commands...\"\n# Get a label that exists\nLABEL=$(sqlite3 \"$DB_PATH\" \"SELECT name FROM labels LIMIT 1;\")\nif [ -n \"$LABEL\" ]; then\n echo \" Testing --label=$LABEL\"\n gi list mrs --label=\"$LABEL\" --limit=3\nfi\n\n# Get a reviewer that exists\nREVIEWER=$(sqlite3 \"$DB_PATH\" \"SELECT username FROM mr_reviewers LIMIT 1;\")\nif [ -n \"$REVIEWER\" ]; then\n echo \" Testing --reviewer=$REVIEWER\"\n gi list mrs --reviewer=\"$REVIEWER\" --limit=3\nfi\n\necho \"\"\necho \"=== Gate B: PASSED ===\"\n```\n\n## Stale Link Removal Test (Manual)\n```bash\n# 1. Pick an MR with labels in GitLab UI\nMR_IID=123\n\n# 2. Note current label count\nsqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM mr_labels ml\n JOIN merge_requests m ON m.id = ml.merge_request_id\n WHERE m.iid = $MR_IID;\n\"\n# Example: 3 labels\n\n# 3. Remove a label in GitLab UI (manually)\n\n# 4. Resync\ngi ingest --type=merge_requests\n\n# 5. Verify label removed\nsqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM mr_labels ml\n JOIN merge_requests m ON m.id = ml.merge_request_id\n WHERE m.iid = $MR_IID;\n\"\n# Should be: 2 labels (one less)\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check counts:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n (SELECT COUNT(*) FROM mr_labels) as label_links,\n (SELECT COUNT(*) FROM mr_assignees) as assignee_links,\n (SELECT COUNT(*) FROM mr_reviewers) as reviewer_links;\n\"\n\n# Test filtering:\ngi list mrs --label=enhancement --limit=5\ngi list mrs --reviewer=alice --limit=5\ngi list mrs --assignee=bob --limit=5\n```\n\n## Dependencies\nThis gate requires:\n- bd-ser (MR ingestion with label/assignee/reviewer linking via clear-and-relink pattern)\n- Gate A must pass first\n\n## Edge Cases\n- MRs with no labels/assignees/reviewers: junction tables should have no rows for that MR\n- Labels shared across issues and MRs: labels table is shared, only junction differs\n- Usernames are case-sensitive: `Alice` 
!= `alice`","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:01.292318Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.059422Z","closed_at":"2026-01-27T00:48:21.059378Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-31i","depends_on_id":"bd-ser","type":"blocks","created_at":"2026-01-26T22:08:55.684769Z","created_by":"tayloreernisse"}]} @@ -156,7 +156,7 @@ {"id":"bd-hrs","title":"Create migration 007_documents.sql","description":"## Background\nMigration 007 creates the document storage layer that Gate A's entire search pipeline depends on. It introduces 5 tables: `documents` (the searchable unit), `document_labels` and `document_paths` (for filtered search), and two queue tables (`dirty_sources`, `pending_discussion_fetches`) that drive incremental document regeneration and discussion fetching in Gate C. This is the most-depended-on bead in the project (6 downstream beads block on it).\n\n## Approach\nCreate `migrations/007_documents.sql` with the exact SQL from PRD Section 1.1. 
The schema is fully specified in the PRD — no design decisions remain.\n\nKey implementation details:\n- `documents` table has `UNIQUE(source_type, source_id)` constraint for upsert support\n- `document_labels` and `document_paths` use `WITHOUT ROWID` for compact storage\n- `dirty_sources` uses composite PK `(source_type, source_id)` with `ON CONFLICT` upsert semantics\n- `pending_discussion_fetches` uses composite PK `(project_id, noteable_type, noteable_iid)`\n- Both queue tables have `next_attempt_at` indexed for efficient backoff queries\n- `labels_hash` and `paths_hash` on documents enable write optimization (skip unchanged labels/paths)\n\nRegister the migration in `src/core/db.rs` by adding entry 7 to the `MIGRATIONS` array.\n\n## Acceptance Criteria\n- [ ] `migrations/007_documents.sql` file exists with all 5 CREATE TABLE statements\n- [ ] Migration applies cleanly on fresh DB (`cargo test migration_tests`)\n- [ ] Migration applies cleanly after CP2 schema (migrations 001-006 already applied)\n- [ ] All foreign keys enforced: `documents.project_id -> projects(id)`, `document_labels.document_id -> documents(id) ON DELETE CASCADE`, `document_paths.document_id -> documents(id) ON DELETE CASCADE`, `pending_discussion_fetches.project_id -> projects(id)`\n- [ ] All indexes created: `idx_documents_project_updated`, `idx_documents_author`, `idx_documents_source`, `idx_documents_hash`, `idx_document_labels_label`, `idx_document_paths_path`, `idx_dirty_sources_next_attempt`, `idx_pending_discussions_next_attempt`\n- [ ] `labels_hash TEXT NOT NULL DEFAULT ''` and `paths_hash TEXT NOT NULL DEFAULT ''` columns present on `documents`\n- [ ] Schema version 7 recorded in `schema_version` table\n- [ ] `cargo build` succeeds after registering migration in db.rs\n\n## Files\n- `migrations/007_documents.sql` — new file (copy exact SQL from PRD Section 1.1)\n- `src/core/db.rs` — add migration 7 to `MIGRATIONS` array\n\n## TDD Loop\nRED: Add migration to db.rs, run `cargo test 
migration_tests` — fails because SQL file missing\nGREEN: Create `migrations/007_documents.sql` with full schema\nVERIFY: `cargo test migration_tests && cargo build`\n\n## Edge Cases\n- Migration must be idempotent-safe if applied twice (INSERT into schema_version will fail on second run — this is expected and handled by the migration runner's version check)\n- `WITHOUT ROWID` tables (document_labels, document_paths) require explicit PK — already defined\n- `CHECK` constraint on `documents.source_type` must match exactly: `'issue','merge_request','discussion'`\n- `CHECK` constraint on `documents.truncated_reason` allows NULL or one of 4 specific values","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:25.734380Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:54:12.854351Z","closed_at":"2026-01-30T16:54:12.854149Z","close_reason":"Completed: migration 007_documents.sql with 5 tables (documents, document_labels, document_paths, dirty_sources, pending_discussion_fetches), 8 indexes, registered in db.rs, cargo build + migration tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-hrs","depends_on_id":"bd-3lc","type":"blocks","created_at":"2026-01-30T15:29:15.536304Z","created_by":"tayloreernisse"}]} {"id":"bd-hu3","title":"Write migration 011: resource event tables, entity_references, and dependent fetch queue","description":"## Background\nPhase B needs three new event tables and a generic dependent fetch queue to power temporal queries (timeline, file-history, trace). 
These tables store structured event data from GitLab Resource Events APIs, replacing fragile system note parsing for state/label/milestone changes.\n\nMigration 010_chunk_config.sql already exists, so Phase B starts at migration 011.\n\n## Approach\nCreate migrations/011_resource_events.sql with the exact schema from the Phase B spec (§1.2 + §2.2):\n\n**Event tables:**\n- resource_state_events: state changes (opened/closed/reopened/merged/locked) with source_merge_request_id for \"closed by MR\" linking\n- resource_label_events: label add/remove with label_name\n- resource_milestone_events: milestone add/remove with milestone_title + milestone_id\n\n**Cross-reference table (Gate 2):**\n- entity_references: source/target entity pairs with reference_type (closes/mentioned/related), source_method provenance, and unresolved reference support (target_entity_id NULL with target_project_path + target_entity_iid)\n\n**Dependent fetch queue:**\n- pending_dependent_fetches: generic job queue with job_type IN ('resource_events', 'mr_closes_issues', 'mr_diffs'), locked_at crash recovery, exponential backoff via attempts + next_retry_at\n\n**All tables must have:**\n- CHECK constraints for entity exclusivity (issue XOR merge_request) on event tables\n- UNIQUE constraints (gitlab_id + project_id for events, composite for queue, multi-column for references)\n- Partial indexes (WHERE issue_id IS NOT NULL, WHERE target_entity_id IS NULL, etc.)\n- CASCADE deletes on project_id and entity FKs\n\nRegister in src/core/db.rs MIGRATIONS array:\n```rust\n(\"011\", include_str!(\"../../migrations/011_resource_events.sql\")),\n```\n\nEnd migration with:\n```sql\nINSERT INTO schema_version (version, applied_at, description)\nVALUES (11, strftime('%s', 'now') * 1000, 'Resource events, entity references, and dependent fetch queue');\n```\n\n## Acceptance Criteria\n- [ ] migrations/011_resource_events.sql exists with all 4 tables + indexes + constraints\n- [ ] src/core/db.rs MIGRATIONS array 
includes (\"011\", include_str!(...))\n- [ ] `cargo build` succeeds (migration SQL compiles into binary)\n- [ ] `cargo test migration` passes (migration applies cleanly on fresh DB)\n- [ ] All CHECK constraints enforced (issue XOR merge_request on event tables)\n- [ ] All UNIQUE constraints present (prevents duplicate events/refs/jobs)\n- [ ] entity_references UNIQUE handles NULL coalescing correctly\n- [ ] pending_dependent_fetches job_type CHECK includes all three types\n\n## Files\n- migrations/011_resource_events.sql (new)\n- src/core/db.rs (add to MIGRATIONS array, line ~46)\n\n## TDD Loop\nRED: Add test in tests/migration_tests.rs:\n- `test_migration_011_creates_event_tables` - verify all 4 tables exist after migration\n- `test_migration_011_entity_exclusivity_constraint` - verify CHECK rejects both NULL and both non-NULL for issue_id/merge_request_id\n- `test_migration_011_event_dedup` - verify UNIQUE(gitlab_id, project_id) rejects duplicate events\n- `test_migration_011_entity_references_dedup` - verify UNIQUE constraint with NULL coalescing\n- `test_migration_011_queue_dedup` - verify UNIQUE(project_id, entity_type, entity_iid, job_type)\n\nGREEN: Write the migration SQL + register in db.rs\n\nVERIFY: `cargo test migration_tests -- --nocapture`\n\n## Edge Cases\n- entity_references UNIQUE uses COALESCE for NULLable columns — test with both resolved and unresolved refs\n- pending_dependent_fetches job_type CHECK — ensure 'mr_diffs' is included (Gate 4 needs it)\n- SQLite doesn't enforce CHECK on INSERT OR REPLACE — verify constraint behavior\n- The entity exclusivity CHECK must allow exactly one of issue_id/merge_request_id to be non-NULL\n- Verify CASCADE deletes work (delete project → all events/refs/jobs deleted)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:23.933894Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:06:28.918228Z","closed_at":"2026-02-03T16:06:28.917906Z","close_reason":"Already completed 
in prior session, re-closing after accidental reopen","compaction_level":0,"original_size":0,"labels":["gate-1","phase-b","schema"],"dependencies":[{"issue_id":"bd-hu3","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:23.937375Z","created_by":"tayloreernisse"}]} {"id":"bd-iba","title":"Add GitLab client MR pagination methods","description":"## Background\nGitLab client pagination for merge requests and discussions. Must support robust pagination with fallback chain because some GitLab instances/proxies strip headers.\n\n## Approach\nAdd to existing `src/gitlab/client.rs`:\n1. `MergeRequestPage` struct - Items + pagination metadata\n2. `parse_link_header_next()` - RFC 8288 Link header parsing\n3. `fetch_merge_requests_page()` - Single page fetch with metadata\n4. `paginate_merge_requests()` - Async stream for all MRs\n5. `paginate_mr_discussions()` - Async stream for MR discussions\n\n## Files\n- `src/gitlab/client.rs` - Add pagination methods\n\n## Acceptance Criteria\n- [ ] `MergeRequestPage` struct exists with `items`, `next_page`, `is_last_page`\n- [ ] `parse_link_header_next()` extracts `rel=\"next\"` URL from Link header\n- [ ] Pagination fallback chain: Link header > x-next-page > full-page heuristic\n- [ ] `paginate_merge_requests()` returns `Pin>>>`\n- [ ] `paginate_mr_discussions()` returns `Pin>>>`\n- [ ] MR endpoint uses `scope=all&state=all` to include all MRs\n- [ ] `cargo test client` passes\n\n## TDD Loop\nRED: `cargo test fetch_merge_requests` -> method not found\nGREEN: Add pagination methods\nVERIFY: `cargo test client`\n\n## Struct Definitions\n```rust\n#[derive(Debug)]\npub struct MergeRequestPage {\n pub items: Vec,\n pub next_page: Option,\n pub is_last_page: bool,\n}\n```\n\n## Link Header Parsing (RFC 8288)\n```rust\n/// Parse Link header to extract rel=\"next\" URL.\nfn parse_link_header_next(headers: &reqwest::header::HeaderMap) -> Option {\n headers\n .get(\"link\")\n .and_then(|v| v.to_str().ok())\n 
.and_then(|link_str| {\n // Format: ; rel=\"next\", ; rel=\"last\"\n for part in link_str.split(',') {\n let part = part.trim();\n if part.contains(\"rel=\\\"next\\\"\") || part.contains(\"rel=next\") {\n if let Some(start) = part.find('<') {\n if let Some(end) = part.find('>') {\n return Some(part[start + 1..end].to_string());\n }\n }\n }\n }\n None\n })\n}\n```\n\n## Pagination Fallback Chain\n```rust\nlet next_page = match (link_next, x_next_page, items.len() as u32 == per_page) {\n (Some(_), _, _) => Some(page + 1), // Link header present: continue\n (None, Some(np), _) => Some(np), // x-next-page present: use it\n (None, None, true) => Some(page + 1), // Full page, no headers: try next\n (None, None, false) => None, // Partial page: we're done\n};\n```\n\n## Fetch Single Page\n```rust\npub async fn fetch_merge_requests_page(\n &self,\n gitlab_project_id: i64,\n updated_after: Option,\n cursor_rewind_seconds: u32,\n page: u32,\n per_page: u32,\n) -> Result {\n let mut params = vec![\n (\"scope\", \"all\".to_string()),\n (\"state\", \"all\".to_string()),\n (\"order_by\", \"updated_at\".to_string()),\n (\"sort\", \"asc\".to_string()),\n (\"per_page\", per_page.to_string()),\n (\"page\", page.to_string()),\n ];\n // Apply cursor rewind for safety\n // ...\n}\n```\n\n## Async Stream Pattern\n```rust\npub fn paginate_merge_requests(\n &self,\n gitlab_project_id: i64,\n updated_after: Option,\n cursor_rewind_seconds: u32,\n) -> Pin> + Send + '_>> {\n Box::pin(async_stream::try_stream! 
{\n let mut page = 1u32;\n let per_page = 100u32;\n loop {\n let page_result = self.fetch_merge_requests_page(...).await?;\n for mr in page_result.items {\n yield mr;\n }\n if page_result.is_last_page {\n break;\n }\n match page_result.next_page {\n Some(np) => page = np,\n None => break,\n }\n }\n })\n}\n```\n\n## Edge Cases\n- `scope=all` required to include all MRs (not just authored by current user)\n- `state=all` required to include merged/closed (GitLab defaults may exclude)\n- `locked` state cannot be filtered server-side (use local SQL filtering)\n- Cursor rewind should clamp to 0 to avoid negative timestamps","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:41.633065Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:13:05.613625Z","closed_at":"2026-01-27T00:13:05.613440Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-iba","depends_on_id":"bd-5ta","type":"blocks","created_at":"2026-01-26T22:08:54.364647Z","created_by":"tayloreernisse"}]} -{"id":"bd-ike","title":"Epic: Gate 3 - Decision Timeline (lore timeline)","description":"Implement 'lore timeline ' command that produces keyword-driven chronological narratives across issues, MRs, and discussions. Combines FTS5 search (seed), BFS cross-reference expansion, event collection from resource event tables, and evidence note surfacing. 
Output in both human-readable and robot JSON formats.\n\nChildren: bd-20e (types), bd-32q (seed phase), bd-ypa (expand phase), bd-3as (collect events), bd-2f2 (human output), bd-dty (robot JSON), bd-1nf (CLI wiring)","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.036474Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:50:48.982831Z","compaction_level":0,"original_size":0,"labels":["epic","gate-3","phase-b"],"dependencies":[{"issue_id":"bd-ike","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:33:37.875622Z","created_by":"tayloreernisse"},{"issue_id":"bd-ike","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:33:37.831914Z","created_by":"tayloreernisse"}]} +{"id":"bd-ike","title":"Epic: Gate 3 - Decision Timeline (lore timeline)","description":"## Background\nGate 3 is the flagship feature of Phase B — the \"lore timeline\" command that answers \"What happened with X?\" by producing a keyword-driven chronological narrative across issues, MRs, and discussions. This is the forcing function for the entire phase: if timeline works, the architecture is validated.\n\nThe query pipeline has 5 stages: SEED (FTS5 search) → HYDRATE (map docs to entities) → EXPAND (BFS cross-reference traversal) → COLLECT EVENTS → INTERLEAVE (chronological sort). Evidence-bearing notes from FTS5 are included as first-class timeline events to surface decision rationale, not just activity counts.\n\n## Architecture\n- **No new tables.** Timeline reads across existing tables at query time and produces a virtual event stream.\n- **TimelineEvent model:** Unified struct covering created/state_changed/label/milestone/merged/note_evidence/cross_referenced events.\n- **Expansion:** BFS over entity_references, depth-limited (default 1). 
Default follows closes+related edges; --expand-mentions adds mentioned edges (high fan-out).\n- **Evidence notes:** Top 10 FTS5-matched notes included as NoteEvidence events with ~200 char snippets.\n- **Dual output:** Human-readable colored format + robot JSON with expansion provenance.\n\n## Children (Execution Order)\n1. **bd-20e** [OPEN] — TimelineEvent model and TimelineEventType enum (src/core/timeline.rs types)\n2. **bd-32q** [OPEN] — Seed phase: FTS5 keyword search → entity IDs + evidence note candidates\n3. **bd-ypa** [OPEN] — Expand phase: BFS cross-reference expansion over entity_references\n4. **bd-3as** [OPEN] — Collect events: gather state/label/milestone/creation/merge events + interleave\n5. **bd-2f2** [OPEN] — Human output renderer (colored, spec §3.4 format)\n6. **bd-dty** [OPEN] — Robot mode JSON output (spec §3.5 format)\n7. **bd-1nf** [OPEN] — CLI wiring: register command with all flags (--depth, --since, --expand-mentions, -p, -n)\n\n## Gate Completion Criteria\n- [ ] `lore timeline ` returns chronologically ordered events\n- [ ] Seed entities found via FTS5 (issues, MRs, and notes)\n- [ ] State, label, milestone events interleaved from resource event tables\n- [ ] Entity creation and merge events included\n- [ ] Evidence notes included as note_evidence events (top 10 FTS5 matches)\n- [ ] Cross-reference expansion follows entity_references to configurable depth\n- [ ] Default: closes+related edges; --expand-mentions adds mentioned edges\n- [ ] --depth 0 disables expansion\n- [ ] --since filters by event timestamp\n- [ ] -p scopes to project\n- [ ] Human output colored and readable\n- [ ] Robot JSON with expansion provenance (via) and unresolved references\n- [ ] Query latency < 200ms for < 50 seed entities\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) for event tables, Gate 2 (bd-1se) for entity_references\n- Downstream: Gates 4 and 5 can proceed in parallel (no dependency on Gate 
3)","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.036474Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:32:48.598603Z","compaction_level":0,"original_size":0,"labels":["epic","gate-3","phase-b"],"dependencies":[{"issue_id":"bd-ike","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:33:37.875622Z","created_by":"tayloreernisse"},{"issue_id":"bd-ike","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:33:37.831914Z","created_by":"tayloreernisse"}]} {"id":"bd-jec","title":"Add fetchMrFileChanges config flag","description":"## Background\nMR file change fetching should be opt-in (default true) to avoid extra API calls. Follows the same pattern as fetchResourceEvents (bd-2e8).\n\n## Approach\nAdd to SyncConfig in src/core/config.rs:\n```rust\n#[serde(rename = \"fetchMrFileChanges\", default = \"default_true\")]\npub fetch_mr_file_changes: bool,\n```\n\nUpdate Default impl for SyncConfig to include `fetch_mr_file_changes: true`.\n\nAdd --no-file-changes CLI flag to sync command in src/cli/mod.rs (SyncArgs):\n```rust\n/// Skip MR file change fetching (overrides config)\n#[arg(long = \"no-file-changes\", help_heading = \"Sync Options\")]\npub no_file_changes: bool,\n```\n\nIn sync handler, override config:\n```rust\nif args.no_file_changes {\n config.sync.fetch_mr_file_changes = false;\n}\n```\n\n## Acceptance Criteria\n- [ ] SyncConfig deserializes fetchMrFileChanges from JSON config\n- [ ] Defaults to true when field absent\n- [ ] --no-file-changes flag parses correctly\n- [ ] --no-file-changes overrides config to false\n- [ ] `cargo test` passes\n\n## Files\n- src/core/config.rs (add field + Default update)\n- src/cli/mod.rs (add --no-file-changes to SyncArgs)\n- src/cli/commands/sync.rs (override logic)\n\n## TDD Loop\nRED: tests/config_tests.rs:\n- `test_sync_config_fetch_mr_file_changes_default_true`\n- `test_sync_config_fetch_mr_file_changes_explicit_false`\n\nGREEN: Add field, 
default, flag, override\n\nVERIFY: `cargo test config -- --nocapture && cargo build`\n\n## Edge Cases\n- Same as bd-2e8: reuse default_true fn, check for naming conflicts with existing flags","status":"open","priority":3,"issue_type":"task","created_at":"2026-02-02T21:34:08.892666Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:48:11.012962Z","compaction_level":0,"original_size":0,"labels":["config","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-jec","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.895167Z","created_by":"tayloreernisse"}]} {"id":"bd-jov","title":"[CP1] Discussion and note transformers","description":"Transform GitLab discussion/note payloads to normalized database schema.\n\n## Module\nsrc/gitlab/transformers/discussion.rs\n\n## Structs\n\n### NormalizedDiscussion\n- gitlab_discussion_id: String\n- project_id: i64\n- issue_id: i64\n- noteable_type: String (\"Issue\")\n- individual_note: bool\n- first_note_at, last_note_at: Option\n- last_seen_at: i64\n- resolvable, resolved: bool\n\n### NormalizedNote\n- gitlab_id: i64\n- project_id: i64\n- note_type: Option\n- is_system: bool\n- author_username: String\n- body: String\n- created_at, updated_at, last_seen_at: i64\n- position: i32 (array index in notes[])\n- resolvable, resolved: bool\n- resolved_by: Option\n- resolved_at: Option\n\n## Functions\n\n### transform_discussion(gitlab_discussion, local_project_id, local_issue_id) -> NormalizedDiscussion\n- Compute first_note_at/last_note_at from notes array min/max created_at\n- Compute resolvable (any note resolvable)\n- Compute resolved (resolvable AND all resolvable notes resolved)\n\n### transform_notes(gitlab_discussion, local_project_id) -> Vec\n- Enumerate notes to get position (array index)\n- Set is_system from note.system\n- Convert timestamps to ms epoch\n\nFiles: src/gitlab/transformers/discussion.rs\nTests: tests/discussion_transformer_tests.rs\nDone when: Unit tests pass for discussion/note 
transformation with system note flagging","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:43:04.481361Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.759691Z","deleted_at":"2026-01-25T17:02:01.759684Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-k7b","title":"[CP1] gi show issue command","description":"Show issue details with discussions.\n\n## Module\nsrc/cli/commands/show.rs\n\n## Clap Definition\nShow {\n #[arg(value_parser = [\"issue\", \"mr\"])]\n entity: String,\n \n iid: i64,\n \n #[arg(long)]\n project: Option,\n}\n\n## Output Format\nIssue #1234: Authentication redesign\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n\nProject: group/project-one\nState: opened\nAuthor: @johndoe\nCreated: 2024-01-15\nUpdated: 2024-03-20\nLabels: enhancement, auth\nURL: https://gitlab.example.com/group/project-one/-/issues/1234\n\nDescription:\n We need to redesign the authentication flow to support...\n\nDiscussions (5):\n\n @janedoe (2024-01-16):\n I agree we should move to JWT-based auth...\n\n @johndoe (2024-01-16):\n What about refresh token strategy?\n\n @bobsmith (2024-01-17):\n Have we considered OAuth2?\n\n## Ambiguity Handling\nIf multiple projects have same iid, either:\n- Prompt for --project flag\n- Show error listing which projects have that iid\n\nFiles: src/cli/commands/show.rs\nDone when: Issue detail view displays all fields including threaded discussions","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:58:26.904813Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.944183Z","deleted_at":"2026-01-25T17:02:01.944179Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} diff --git a/.beads/last-touched b/.beads/last-touched index 
c804944..72b3adf 100644 --- a/.beads/last-touched +++ b/.beads/last-touched @@ -1 +1 @@ -bd-3pz +bd-1ht