gitlore/.beads/issues.jsonl
Taylor Eernisse 873d2c0ab8 fix(beads): Align bead descriptions with Phase B spec
Reconciled 9 beads (bd-20e, bd-dty, bd-2f2, bd-3as, bd-ypa, bd-32q,
bd-1nf, bd-2ez, bd-343o) against docs/phase-b-temporal-intelligence.md.

Key fixes:
- bd-20e: Add url field, align StateChanged to {state} per spec 3.3,
  fix NoteEvidence fields (note_id, snippet, discussion_id), simplify
  Merged to unit variant, align CrossReferenced to {target}
- bd-dty: Restructure expanded_entities JSON to use nested "via" object
  per spec 3.5, add url/details fields to events, use "project" key
- bd-3as: Align event collection with updated TimelineEventType variants
- bd-ypa: Add via_from/via_reference_type/via_source_method provenance
- bd-32q, bd-1nf, bd-2f2: Add spec section references throughout
- bd-2ez: Document source_method value discrepancy (spec vs codebase)
- bd-343o: Add spec context for how it extends Gate 2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 14:13:34 -05:00

{"id":"bd-10f","title":"Update orchestrator for MR ingestion","description":"## Background\nOrchestrator coordinates MR ingestion followed by dependent discussion sync. Discussion sync targets are queried from DB (not collected in-memory) to handle large projects without memory growth. This is critical for projects with 10k+ MRs where collecting sync targets in memory during ingestion would cause unbounded growth.\n\n## Approach\nUpdate `src/ingestion/orchestrator.rs` to:\n1. Support `merge_requests` resource type in `run_ingestion()` match arm\n2. Query DB for MRs needing discussion sync after MR ingestion completes\n3. Execute discussion sync with bounded concurrency using `futures::stream::buffer_unordered`\n\n## Files\n- `src/ingestion/orchestrator.rs` - Update existing orchestrator\n\n## Acceptance Criteria\n- [ ] `run_ingestion()` handles `resource_type == \"merge_requests\"`\n- [ ] After MR ingestion, queries DB for MRs where `updated_at > discussions_synced_for_updated_at`\n- [ ] Discussion sync uses `dependent_concurrency` from config (default 5)\n- [ ] Each MR's discussion sync is independent (partial failures don't block others)\n- [ ] Results aggregated from MR ingestion + all discussion ingestion results\n- [ ] `cargo test orchestrator` passes\n\n## TDD Loop\nRED: `cargo test orchestrator_mr` -> merge_requests not handled\nGREEN: Add MR branch to orchestrator\nVERIFY: `cargo test orchestrator`\n\n## Struct Definition\n```rust\n/// Lightweight struct for DB query results - only fields needed for discussion sync\nstruct MrForDiscussionSync {\n local_mr_id: i64,\n iid: i64,\n updated_at: i64,\n}\n```\n\n## DB Query for Discussion Sync Targets\n```sql\nSELECT id, iid, updated_at\nFROM merge_requests\nWHERE project_id = ?\n AND (discussions_synced_for_updated_at IS NULL\n OR updated_at > discussions_synced_for_updated_at)\nORDER BY updated_at ASC;\n```\n\n## Orchestrator Flow\n```rust\npub async fn run_ingestion(\n &self,\n resource_type: &str,\n full_sync: 
bool,\n) -> Result<IngestResult> {\n match resource_type {\n \"issues\" => self.run_issue_ingestion(full_sync).await,\n \"merge_requests\" => self.run_mr_ingestion(full_sync).await,\n _ => Err(GiError::InvalidArgument {\n name: \"type\".to_string(),\n value: resource_type.to_string(),\n expected: \"issues or merge_requests\".to_string(),\n }),\n }\n}\n\nasync fn run_mr_ingestion(&self, full_sync: bool) -> Result<IngestResult> {\n // 1. Ingest MRs (handles cursor reset if full_sync)\n let mr_result = ingest_merge_requests(\n &self.conn, &self.client, &self.config,\n self.project_id, self.gitlab_project_id, full_sync,\n ).await?;\n \n // 2. Query DB for MRs needing discussion sync\n // CRITICAL: Do this AFTER ingestion, not during, to avoid memory growth\n let mrs_needing_sync: Vec<MrForDiscussionSync> = {\n let mut stmt = self.conn.prepare(\n \"SELECT id, iid, updated_at FROM merge_requests\n WHERE project_id = ? AND (discussions_synced_for_updated_at IS NULL\n OR updated_at > discussions_synced_for_updated_at)\n ORDER BY updated_at ASC\"\n )?;\n stmt.query_map([self.project_id], |row| {\n Ok(MrForDiscussionSync {\n local_mr_id: row.get(0)?,\n iid: row.get(1)?,\n updated_at: row.get(2)?,\n })\n })?.collect::<Result<Vec<_>, _>>()?\n };\n \n let total_needing_sync = mrs_needing_sync.len();\n info!(\"Discussion sync needed for {} MRs\", total_needing_sync);\n \n // 3. 
Execute discussion sync with bounded concurrency\n let concurrency = self.config.sync.dependent_concurrency.unwrap_or(5);\n \n let discussion_results: Vec<Result<IngestMrDiscussionsResult>> = \n futures::stream::iter(mrs_needing_sync)\n .map(|mr| {\n let conn = &self.conn;\n let client = &self.client;\n let config = &self.config;\n let project_id = self.project_id;\n let gitlab_project_id = self.gitlab_project_id;\n async move {\n ingest_mr_discussions(\n conn, client, config,\n project_id, gitlab_project_id,\n mr.iid, mr.local_mr_id, mr.updated_at,\n ).await\n }\n })\n .buffer_unordered(concurrency)\n .collect()\n .await;\n \n // 4. Aggregate results\n let mut total_discussions = 0;\n let mut total_notes = 0;\n let mut total_diffnotes = 0;\n let mut failed_syncs = 0;\n \n for result in discussion_results {\n match result {\n Ok(r) => {\n total_discussions += r.discussions_upserted;\n total_notes += r.notes_upserted;\n total_diffnotes += r.diffnotes_count;\n }\n Err(e) => {\n warn!(\"Discussion sync failed: {}\", e);\n failed_syncs += 1;\n }\n }\n }\n \n Ok(IngestResult {\n mrs_fetched: mr_result.fetched,\n mrs_upserted: mr_result.upserted,\n labels_created: mr_result.labels_created,\n assignees_linked: mr_result.assignees_linked,\n reviewers_linked: mr_result.reviewers_linked,\n discussions_synced: total_discussions,\n notes_synced: total_notes,\n diffnotes_count: total_diffnotes,\n mrs_skipped_discussion_sync: (mr_result.fetched as usize).saturating_sub(total_needing_sync),\n failed_discussion_syncs: failed_syncs,\n })\n}\n```\n\n## Required Imports\n```rust\nuse futures::stream::StreamExt;\nuse crate::ingestion::merge_requests::ingest_merge_requests;\nuse crate::ingestion::mr_discussions::{ingest_mr_discussions, IngestMrDiscussionsResult};\n```\n\n## Config Reference\n```rust\n// In config.rs or similar\npub struct SyncConfig {\n pub dependent_concurrency: Option<usize>, // Default 5\n // ... 
other fields\n}\n```\n\n## Edge Cases\n- Large projects: 10k+ MRs may need discussion sync - DB-driven query avoids memory growth\n- Partial failures: Each MR's discussion sync is independent; failures logged but don't stop others\n- Concurrency: Too high (>10) may hit GitLab rate limits; default 5 balances throughput with safety\n- Empty result: If no MRs need sync, discussion phase completes immediately with zero counts","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:42.731140Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:25:13.472341Z","closed_at":"2026-01-27T00:25:13.472281Z","close_reason":"Updated orchestrator for MR ingestion:\n- Added IngestMrProjectResult struct with all MR-specific metrics\n- Added ingest_project_merge_requests() and ingest_project_merge_requests_with_progress()\n- Queries DB for MRs needing discussion sync AFTER ingestion (memory-safe for large projects)\n- Added MR-specific progress events (MrsFetchStarted, MrFetched, etc.)\n- Sequential discussion sync using dependent_concurrency config\n- All 164 tests passing","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-10f","depends_on_id":"bd-20h","type":"blocks","created_at":"2026-01-26T22:08:54.915469Z","created_by":"tayloreernisse"},{"issue_id":"bd-10f","depends_on_id":"bd-ser","type":"blocks","created_at":"2026-01-26T22:08:54.805860Z","created_by":"tayloreernisse"}]}
{"id":"bd-10i","title":"Epic: CP2 Gate D - Resumability Proof","description":"## Background\nGate D validates resumability and crash recovery. Proves that cursor and watermark mechanics prevent massive refetch after interruption. This is critical for large projects where a full refetch would take hours.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] Kill mid-run, rerun -> bounded redo (not full refetch from beginning)\n- [ ] Cursor saved at page boundary (not item boundary)\n- [ ] No redundant discussion refetch after crash recovery\n- [ ] No watermark advancement on partial pagination failure\n- [ ] Single-flight lock prevents concurrent ingest runs\n- [ ] `--full` flag resets MR cursor to NULL\n- [ ] `--full` flag resets ALL `discussions_synced_for_updated_at` to NULL\n- [ ] `--force` bypasses single-flight lock\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate D: Resumability Proof ===\"\n\n# 1. Test single-flight lock\necho \"Step 1: Test single-flight lock...\"\ngi ingest --type=merge_requests &\nFIRST_PID=$!\nsleep 1\n\n# Try second ingest - should fail with lock error\nif gi ingest --type=merge_requests 2>&1 | grep -q \"lock\\|already running\"; then\n echo \" PASS: Second ingest blocked by lock\"\nelse\n echo \" FAIL: Lock not working\"\nfi\nwait $FIRST_PID 2>/dev/null || true\n\n# 2. Test --force bypasses lock\necho \"Step 2: Test --force flag...\"\ngi ingest --type=merge_requests &\nFIRST_PID=$!\nsleep 1\nif gi ingest --type=merge_requests --force 2>&1; then\n echo \" PASS: --force bypassed lock\"\nelse\n echo \" Note: --force test inconclusive\"\nfi\nwait $FIRST_PID 2>/dev/null || true\n\n# 3. Check cursor state\necho \"Step 3: Check cursor state...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT resource_type, updated_at, gitlab_id\n FROM sync_cursors \n WHERE resource_type = 'merge_requests';\n\"\n\n# 4. 
Test crash recovery\necho \"Step 4: Test crash recovery...\"\n\n# Record current cursor\nCURSOR_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor before: $CURSOR_BEFORE\"\n\n# Force full sync and kill\necho \" Starting full sync then killing...\"\ngi ingest --type=merge_requests --full &\nPID=$!\nsleep 5 && kill -9 $PID 2>/dev/null || true\nwait $PID 2>/dev/null || true\n\n# Check cursor was saved (should be non-null if any page completed)\nCURSOR_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor after kill: $CURSOR_AFTER\"\n\n# Re-run and verify bounded redo\necho \" Re-running (should resume from cursor)...\"\ntime gi ingest --type=merge_requests\n# Should be faster than first full sync\n\n# 5. Test --full reset\necho \"Step 5: Test --full resets watermarks...\"\n\n# Check watermarks before\nWATERMARKS_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NOT NULL;\n\")\necho \" Watermarks set before --full: $WATERMARKS_BEFORE\"\n\n# Record cursor before\nCURSOR_BEFORE_FULL=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at, gitlab_id FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor before --full: $CURSOR_BEFORE_FULL\"\n\n# Run --full\ngi ingest --type=merge_requests --full\n\n# Check cursor was reset then rebuilt\nCURSOR_AFTER_FULL=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at, gitlab_id FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor after --full: $CURSOR_AFTER_FULL\"\n\n# Watermarks should be set again (sync completed)\nWATERMARKS_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NOT NULL;\n\")\necho \" Watermarks set after --full: $WATERMARKS_AFTER\"\n\necho \"\"\necho \"=== Gate D: PASSED ===\"\n```\n\n## Watermark 
Safety Test (Simulated Network Failure)\n```bash\n# This tests that watermark doesn't advance on partial failure\n# Requires ability to simulate network issues\n\n# 1. Get an MR that needs discussion sync\nMR_ID=$(sqlite3 \"$DB_PATH\" \"\n SELECT id FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NULL \n OR updated_at > discussions_synced_for_updated_at\n LIMIT 1;\n\")\n\n# 2. Note current watermark\nWATERMARK_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark before: $WATERMARK_BEFORE\"\n\n# 3. Simulate network failure (requires network manipulation)\n# Option A: Block GitLab API temporarily\n# Option B: Run in a container with network limits\n# Option C: Use the automated test instead:\ncargo test does_not_advance_discussion_watermark_on_partial_failure\n\n# 4. Verify watermark unchanged after failure\nWATERMARK_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark after failure: $WATERMARK_AFTER\"\n[ \"$WATERMARK_BEFORE\" = \"$WATERMARK_AFTER\" ] && echo \"PASS: Watermark preserved\"\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check cursor state:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT * FROM sync_cursors WHERE resource_type = 'merge_requests';\n\"\n\n# Check watermark distribution:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n SUM(CASE WHEN discussions_synced_for_updated_at IS NULL THEN 1 ELSE 0 END) as needs_sync,\n SUM(CASE WHEN discussions_synced_for_updated_at IS NOT NULL THEN 1 ELSE 0 END) as synced\n FROM merge_requests;\n\"\n\n# Test --full resets (check before/after):\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"SELECT COUNT(*) FROM merge_requests WHERE discussions_synced_for_updated_at IS NOT NULL;\"\ngi ingest --type=merge_requests --full\n# During full sync, watermarks should be NULL, then 
repopulated\n```\n\n## Critical Automated Tests\nThese tests MUST pass for Gate D:\n```bash\ncargo test does_not_advance_discussion_watermark_on_partial_failure\ncargo test full_sync_resets_discussion_watermarks\ncargo test cursor_saved_at_page_boundary\n```\n\n## Dependencies\nThis gate requires:\n- bd-mk3 (ingest command with --full and --force support)\n- bd-ser (MR ingestion with cursor mechanics)\n- bd-20h (MR discussion ingestion with watermark safety)\n- Gates A, B, C must pass first\n\n## Edge Cases\n- Very fast sync: May complete before kill signal reaches; retest with larger project\n- Lock file stale: If previous run crashed, lock file may exist; --force handles this\n- Clock skew: Cursor timestamps should use server time, not local time","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:02.124186Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.060596Z","closed_at":"2026-01-27T00:48:21.060555Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-10i","depends_on_id":"bd-mk3","type":"blocks","created_at":"2026-01-26T22:08:55.875790Z","created_by":"tayloreernisse"}]}
{"id":"bd-12ae","title":"OBSERV: Add structured tracing fields to rate-limit/retry handling","description":"## Background\nRate limit and retry events are currently logged at WARN with minimal context (src/gitlab/client.rs:~157). This enriches them with structured fields so MetricsLayer can count them and -v mode shows actionable retry information.\n\n## Approach\n### src/gitlab/client.rs - request() method (line ~119-171)\n\nCurrent 429 handling (~line 155-158):\n```rust\nif response.status() == StatusCode::TOO_MANY_REQUESTS && attempt < Self::MAX_RETRIES {\n let retry_after = Self::parse_retry_after(&response);\n tracing::warn!(retry_after_secs = retry_after, attempt, path, \"Rate limited by GitLab, retrying\");\n sleep(Duration::from_secs(retry_after)).await;\n continue;\n}\n```\n\nReplace with INFO-level structured log:\n```rust\nif response.status() == StatusCode::TOO_MANY_REQUESTS && attempt < Self::MAX_RETRIES {\n let retry_after = Self::parse_retry_after(&response);\n tracing::info!(\n path = %path,\n attempt = attempt,\n retry_after_secs = retry_after,\n status_code = 429u16,\n \"Rate limited, retrying\"\n );\n sleep(Duration::from_secs(retry_after)).await;\n continue;\n}\n```\n\nFor transient errors (network errors, 5xx responses), add similar structured logging:\n```rust\ntracing::info!(\n path = %path,\n attempt = attempt,\n error = %e,\n \"Retrying after transient error\"\n);\n```\n\nKey changes:\n- Level: WARN -> INFO (visible in -v mode, not alarming in default mode)\n- Added: status_code field for 429\n- Added: structured path, attempt fields for all retry events\n- These structured fields enable MetricsLayer (bd-3vqk) to count rate_limit_hits and retries\n\n## Acceptance Criteria\n- [ ] 429 responses log at INFO with fields: path, attempt, retry_after_secs, status_code=429\n- [ ] Transient error retries log at INFO with fields: path, attempt, error\n- [ ] lore -v sync shows retry activity on stderr (INFO is visible in -v mode)\n- [ ] Default mode 
(no -v) does NOT show retry lines on stderr (INFO filtered out)\n- [ ] File layer captures all retry events (always at DEBUG+)\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/gitlab/client.rs (modify request() method, lines ~119-171)\n\n## TDD Loop\nRED:\n - test_rate_limit_log_fields: mock 429 response, capture log output, parse JSON, assert fields\n - test_retry_log_fields: mock network error + retry, assert structured fields\nGREEN: Change log level and add structured fields\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- parse_retry_after returns 0 or very large values: the existing logic handles this\n- All retries exhausted: the final attempt returns the error normally. No special logging needed (the error propagates).\n- path may contain sensitive data (project IDs): project IDs are not sensitive in this context","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:55:02.448070Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:21:42.304259Z","closed_at":"2026-02-04T17:21:42.304213Z","close_reason":"Changed 429 rate-limit logging from WARN to INFO with structured fields: path, attempt, retry_after_secs, status_code=429 in both request() and request_with_headers()","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-12ae","depends_on_id":"bd-3pk","type":"parent-child","created_at":"2026-02-04T15:55:02.450343Z","created_by":"tayloreernisse"}]}
{"id":"bd-13b","title":"[CP0] CLI entry point with Commander.js","description":"## Background\n\nCommander.js provides the CLI framework. The main entry point sets up the program with all subcommands. Uses ESM with proper shebang for npx/global installation.\n\nReference: docs/prd/checkpoint-0.md section \"CLI Commands\"\n\n## Approach\n\n**src/cli/index.ts:**\n```typescript\n#!/usr/bin/env node\n\nimport { Command } from 'commander';\nimport { version } from '../../package.json' with { type: 'json' };\nimport { initCommand } from './commands/init';\nimport { authTestCommand } from './commands/auth-test';\nimport { doctorCommand } from './commands/doctor';\nimport { versionCommand } from './commands/version';\nimport { backupCommand } from './commands/backup';\nimport { resetCommand } from './commands/reset';\nimport { syncStatusCommand } from './commands/sync-status';\n\nconst program = new Command();\n\nprogram\n .name('gi')\n .description('GitLab Inbox - Unified notification management')\n .version(version);\n\n// Global --config flag available to all commands\nprogram.option('-c, --config <path>', 'Path to config file');\n\n// Register subcommands\nprogram.addCommand(initCommand);\nprogram.addCommand(authTestCommand);\nprogram.addCommand(doctorCommand);\nprogram.addCommand(versionCommand);\nprogram.addCommand(backupCommand);\nprogram.addCommand(resetCommand);\nprogram.addCommand(syncStatusCommand);\n\nprogram.parse();\n```\n\nEach command file exports a Command instance:\n```typescript\n// src/cli/commands/version.ts\nimport { Command } from 'commander';\n\nexport const versionCommand = new Command('version')\n .description('Show version information')\n .action(() => {\n console.log(`gi version ${version}`);\n });\n```\n\n## Acceptance Criteria\n\n- [ ] `gi --help` shows all commands and global options\n- [ ] `gi --version` shows version from package.json\n- [ ] `gi <command> --help` shows command-specific help\n- [ ] `gi --config ./path` passes config path to 
commands\n- [ ] Unknown command shows error and suggests --help\n- [ ] Exit code 0 on success, non-zero on error\n- [ ] Shebang line works for npx execution\n\n## Files\n\nCREATE:\n- src/cli/index.ts (main entry point)\n- src/cli/commands/version.ts (simple command as template)\n\nMODIFY (later beads):\n- package.json (add \"bin\" field pointing to dist/cli/index.js)\n\n## TDD Loop\n\nN/A for CLI entry point - verify with manual testing:\n\n```bash\nnpm run build\nnode dist/cli/index.js --help\nnode dist/cli/index.js version\nnode dist/cli/index.js unknown-command # should error\n```\n\n## Edge Cases\n\n- package.json import requires Node 20+ with { type: 'json' } assertion\n- Alternative: read version from package.json with readFileSync\n- Command registration order affects help display - alphabetical preferred\n- Global options must be defined before subcommands","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:50.499023Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:10:49.224627Z","closed_at":"2026-01-25T03:10:49.224499Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-13b","depends_on_id":"bd-gg1","type":"blocks","created_at":"2026-01-24T16:13:09.370408Z","created_by":"tayloreernisse"}]}
{"id":"bd-13pt","title":"Display closing MRs in lore issues <id> output","description":"## Background\nThe `entity_references` table stores MR->Issue 'closes' relationships (from the closes_issues API), but this data is never displayed when viewing an issue. This is the 'Development' section in GitLab UI showing which MRs will close an issue when merged.\n\n**System fit**: Data already flows through `fetch_mr_closes_issues()` -> `store_closes_issues_refs()` -> `entity_references` table. We just need to query and display it.\n\n## Approach\n\nAll changes in `src/cli/commands/show.rs`:\n\n### 1. Add ClosingMrRef struct (after DiffNotePosition ~line 57)\n```rust\n#[derive(Debug, Clone, Serialize)]\npub struct ClosingMrRef {\n pub iid: i64,\n pub title: String,\n pub state: String,\n pub web_url: Option<String>,\n}\n```\n\n### 2. Update IssueDetail struct (line ~59)\n```rust\npub struct IssueDetail {\n // ... existing fields ...\n pub closing_merge_requests: Vec<ClosingMrRef>, // NEW - add after discussions\n}\n```\n\n### 3. Add ClosingMrRefJson struct (after NoteDetailJson ~line 797)\n```rust\n#[derive(Serialize)]\npub struct ClosingMrRefJson {\n pub iid: i64,\n pub title: String,\n pub state: String,\n pub web_url: Option<String>,\n}\n```\n\n### 4. Update IssueDetailJson struct (line ~770)\n```rust\npub struct IssueDetailJson {\n // ... existing fields ...\n pub closing_merge_requests: Vec<ClosingMrRefJson>, // NEW\n}\n```\n\n### 5. 
Add get_closing_mrs() function (after get_issue_discussions ~line 245)\n```rust\nfn get_closing_mrs(conn: &Connection, issue_id: i64) -> Result<Vec<ClosingMrRef>> {\n let mut stmt = conn.prepare(\n \"SELECT mr.iid, mr.title, mr.state, mr.web_url\n FROM entity_references er\n JOIN merge_requests mr ON mr.id = er.source_entity_id\n WHERE er.target_entity_type = 'issue'\n AND er.target_entity_id = ?\n AND er.source_entity_type = 'merge_request'\n AND er.reference_type = 'closes'\n ORDER BY mr.iid\"\n )?;\n \n let mrs = stmt\n .query_map([issue_id], |row| {\n Ok(ClosingMrRef {\n iid: row.get(0)?,\n title: row.get(1)?,\n state: row.get(2)?,\n web_url: row.get(3)?,\n })\n })?\n .collect::<std::result::Result<Vec<_>, _>>()?;\n \n Ok(mrs)\n}\n```\n\n### 6. Update run_show_issue() (line ~89)\n```rust\nlet closing_mrs = get_closing_mrs(&conn, issue.id)?;\n// In return struct:\nclosing_merge_requests: closing_mrs,\n```\n\n### 7. Update print_show_issue() (after Labels section ~line 556)\n```rust\nif !issue.closing_merge_requests.is_empty() {\n println!(\"Development:\");\n for mr in &issue.closing_merge_requests {\n let state_indicator = match mr.state.as_str() {\n \"merged\" => style(\"merged\").green(),\n \"opened\" => style(\"opened\").cyan(),\n \"closed\" => style(\"closed\").red(),\n _ => style(&mr.state).dim(),\n };\n println!(\" !{} {} ({})\", mr.iid, mr.title, state_indicator);\n }\n}\n```\n\n### 8. 
Update From<&IssueDetail> for IssueDetailJson (line ~799)\n```rust\nclosing_merge_requests: issue.closing_merge_requests.iter().map(|mr| ClosingMrRefJson {\n iid: mr.iid,\n title: mr.title.clone(),\n state: mr.state.clone(),\n web_url: mr.web_url.clone(),\n}).collect(),\n```\n\n## Acceptance Criteria\n- [ ] `cargo test test_get_closing_mrs` passes (4 tests)\n- [ ] `lore issues <id>` shows Development section when closing MRs exist\n- [ ] Development section shows MR iid, title, and state\n- [ ] State is color-coded (green=merged, cyan=opened, red=closed)\n- [ ] `lore -J issues <id>` includes closing_merge_requests array\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n- `src/cli/commands/show.rs` - ALL changes\n\n## TDD Loop\n\n**RED** - Add tests to `src/cli/commands/show.rs` `#[cfg(test)] mod tests`:\n\n```rust\nfn seed_issue_with_closing_mr(conn: &Connection) -> (i64, i64) {\n conn.execute(\n \"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)\n VALUES (1, 100, 'group/repo', 'https://gitlab.example.com', 1000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO issues (id, gitlab_id, iid, project_id, title, state, author_username,\n created_at, updated_at, last_seen_at) VALUES (1, 200, 10, 1, 'Bug fix', 'opened', 'dev', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO merge_requests (id, gitlab_id, iid, project_id, title, state, author_username,\n source_branch, target_branch, created_at, updated_at, last_seen_at)\n VALUES (1, 300, 5, 1, 'Fix the bug', 'merged', 'dev', 'fix', 'main', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO entity_references (project_id, source_entity_type, source_entity_id,\n target_entity_type, target_entity_id, reference_type, source_method, created_at)\n VALUES (1, 'merge_request', 1, 'issue', 1, 'closes', 'api', 3000)\", []\n ).unwrap();\n (1, 1) // (issue_id, mr_id)\n}\n\n#[test]\nfn test_get_closing_mrs_empty() 
{\n let conn = setup_test_db();\n // seed project + issue with no closing MRs\n conn.execute(\"INSERT INTO projects ...\", []).unwrap();\n conn.execute(\"INSERT INTO issues ...\", []).unwrap();\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert!(result.is_empty());\n}\n\n#[test]\nfn test_get_closing_mrs_single() {\n let conn = setup_test_db();\n seed_issue_with_closing_mr(&conn);\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert_eq!(result.len(), 1);\n assert_eq!(result[0].iid, 5);\n assert_eq!(result[0].title, \"Fix the bug\");\n assert_eq!(result[0].state, \"merged\");\n}\n\n#[test]\nfn test_get_closing_mrs_ignores_mentioned() {\n let conn = setup_test_db();\n seed_issue_with_closing_mr(&conn);\n // Add a 'mentioned' reference that should be ignored\n conn.execute(\n \"INSERT INTO merge_requests (id, gitlab_id, iid, project_id, title, state, author_username,\n source_branch, target_branch, created_at, updated_at, last_seen_at)\n VALUES (2, 301, 6, 1, 'Other MR', 'opened', 'dev', 'other', 'main', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO entity_references (project_id, source_entity_type, source_entity_id,\n target_entity_type, target_entity_id, reference_type, source_method, created_at)\n VALUES (1, 'merge_request', 2, 'issue', 1, 'mentioned', 'note_parse', 3000)\", []\n ).unwrap();\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert_eq!(result.len(), 1); // Only the 'closes' ref\n}\n\n#[test]\nfn test_get_closing_mrs_multiple_sorted() {\n let conn = setup_test_db();\n seed_issue_with_closing_mr(&conn);\n // Add second closing MR with higher iid\n conn.execute(\n \"INSERT INTO merge_requests (id, gitlab_id, iid, project_id, title, state, author_username,\n source_branch, target_branch, created_at, updated_at, last_seen_at)\n VALUES (2, 301, 8, 1, 'Another fix', 'opened', 'dev', 'fix2', 'main', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO entity_references (project_id, 
source_entity_type, source_entity_id,\n target_entity_type, target_entity_id, reference_type, source_method, created_at)\n VALUES (1, 'merge_request', 2, 'issue', 1, 'closes', 'api', 3000)\", []\n ).unwrap();\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert_eq!(result.len(), 2);\n assert_eq!(result[0].iid, 5); // Lower iid first\n assert_eq!(result[1].iid, 8);\n}\n```\n\n**GREEN** - Implement get_closing_mrs() and struct updates\n\n**VERIFY**: `cargo test test_get_closing_mrs && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n- Empty closing MRs -> don't print Development section\n- MR in different states -> color-coded appropriately \n- Cross-project closes (target_entity_id IS NULL) -> not displayed (unresolved refs)\n- Multiple MRs closing same issue -> all shown, ordered by iid","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-05T15:15:37.598249Z","created_by":"tayloreernisse","updated_at":"2026-02-05T15:26:09.522557Z","closed_at":"2026-02-05T15:26:09.522506Z","close_reason":"Implemented: closing MRs (Development section) now display in lore issues <id>. All 4 new tests pass.","compaction_level":0,"original_size":0,"labels":["ISSUE"]}
{"id":"bd-140","title":"[CP1] Database migration 002_issues.sql","description":"Create migration file with tables for issues, labels, issue_labels, discussions, and notes.\n\nTables to create:\n- issues: gitlab_id, project_id, iid, title, description, state, author_username, timestamps, web_url, raw_payload_id\n- labels: gitlab_id, project_id, name, color, description (unique on project_id+name)\n- issue_labels: junction table\n- discussions: gitlab_discussion_id, project_id, issue_id, noteable_type, individual_note, timestamps, resolvable/resolved\n- notes: gitlab_id, discussion_id, project_id, type, is_system, author_username, body, timestamps, position, resolution fields, DiffNote position fields\n\nInclude appropriate indexes:\n- idx_issues_project_updated, idx_issues_author, uq_issues_project_iid\n- uq_labels_project_name, idx_labels_name\n- idx_issue_labels_label\n- uq_discussions_project_discussion_id, idx_discussions_issue/mr/last_note\n- idx_notes_discussion/author/system\n\nFiles: migrations/002_issues.sql\nDone when: Migration applies cleanly on top of 001_initial.sql","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:18:53.954039Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.154936Z","deleted_at":"2026-01-25T15:21:35.154934Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-14q","title":"Epic: Gate 4 - File Decision History (lore file-history)","description":"## Background\nGate 4 adds file-level decision history — \"which MRs touched this file, and why?\" This bridges the gap between code and project management by linking file paths to MRs, MRs to issues, and issues to discussions. The key innovation is rename chain resolution: when a file was renamed from src/auth/handler.rs to src/auth/oauth.rs, querying either path finds all historical MRs.\n\nGate 4 also captures merge_commit_sha and squash_commit_sha on merge_requests, which Gate 5 uses for code tracing and which Phase C will use for git blame integration.\n\n## Architecture\n- **New table:** mr_file_changes (migration 012) — file paths + change types per MR\n- **New columns:** merge_requests.merge_commit_sha, merge_requests.squash_commit_sha (migration 012)\n- **Opt-in config:** sync.fetchMrFileChanges (default true). --no-file-changes CLI flag.\n- **Data source:** GET /projects/:id/merge_requests/:iid/diffs — extract file metadata only, discard diff content.\n- **Queue integration:** Uses dependent fetch queue with job_type=mr_diffs\n- **Rename chain resolution:** BFS over mr_file_changes WHERE change_type=renamed, bounded at 10 hops with cycle detection.\n\n## Children (Execution Order)\n1. **bd-1oo** [OPEN] — Migration 012: mr_file_changes + merge_commit_sha + squash_commit_sha\n2. **bd-jec** [OPEN] — fetchMrFileChanges config flag + --no-file-changes CLI flag\n3. **bd-2yo** [OPEN] — Fetch MR diffs API, populate mr_file_changes, capture commit SHAs\n4. **bd-1yx** [OPEN] — Rename chain resolution algorithm (src/core/file_history.rs)\n5. 
**bd-z94** [OPEN] — lore file-history command with human + robot output\n\n## Gate Completion Criteria\n- [ ] mr_file_changes populated from GitLab diffs API for synced MRs\n- [ ] merge_commit_sha and squash_commit_sha captured in merge_requests\n- [ ] `lore file-history <path>` returns MRs ordered by merge/creation date\n- [ ] Output includes MR title, state, author, change type, discussion count\n- [ ] --discussions shows DiffNote snippets on the queried file\n- [ ] Rename chains resolved with bounded hop count (default 10) + cycle detection\n- [ ] --no-follow-renames disables chain resolution\n- [ ] Robot JSON includes rename_chain when renames detected\n- [ ] -p required when path exists in multiple projects (Ambiguous error, exit 18)\n- [ ] Graceful empty state: \"No MR data found. Run lore sync with fetchMrFileChanges: true\"\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) for dependent fetch queue, Gate 2 (bd-1se) for entity_references (MR→issue linking)\n- Downstream: Gate 5 (bd-1ht) depends on mr_file_changes and commit SHAs","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.094024Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:33:06.778936Z","compaction_level":0,"original_size":0,"labels":["epic","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-14q","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:34:16.913465Z","created_by":"tayloreernisse"},{"issue_id":"bd-14q","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:34:16.870058Z","created_by":"tayloreernisse"}]}
{"id":"bd-157","title":"[CP1] Issue transformer with label extraction","description":"Transform GitLab issue payloads to normalized database schema.\n\n## Module\nsrc/gitlab/transformers/issue.rs\n\n## Structs\n\n### NormalizedIssue\n- gitlab_id: i64\n- project_id: i64 (local DB project ID)\n- iid: i64\n- title: String\n- description: Option<String>\n- state: String\n- author_username: String\n- created_at, updated_at, last_seen_at: i64 (ms epoch)\n- web_url: String\n\n### NormalizedLabel (CP1: name-only)\n- project_id: i64\n- name: String\n\n## Functions\n\n### transform_issue(gitlab_issue: &GitLabIssue, local_project_id: i64) -> NormalizedIssue\n- Convert ISO timestamps to ms epoch using iso_to_ms()\n- Set last_seen_at to now_ms()\n- Clone string fields\n\n### extract_labels(gitlab_issue: &GitLabIssue, local_project_id: i64) -> Vec<NormalizedLabel>\n- Map labels vec to NormalizedLabel structs\n\nFiles: \n- src/gitlab/transformers/mod.rs\n- src/gitlab/transformers/issue.rs\nTests: tests/issue_transformer_tests.rs\nDone when: Unit tests pass for payload transformation and label extraction","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:47.719562Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.736142Z","deleted_at":"2026-01-25T17:02:01.736129Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-16m8","title":"OBSERV: Record item counts as span fields in sync stages","description":"## Background\nMetricsLayer (bd-34ek) captures span fields, but the stage functions must actually record item counts INTO their spans. This is the bridge between \"work happened\" and \"MetricsLayer knows about it.\"\n\n## Approach\nIn each stage function, after the work loop completes, record counts into the current span:\n\n### src/ingestion/orchestrator.rs - ingest_project_issues_with_progress() (~line 110)\nAfter issues are fetched and discussions synced:\n```rust\ntracing::Span::current().record(\"items_processed\", result.issues_upserted);\ntracing::Span::current().record(\"items_skipped\", result.issues_skipped);\ntracing::Span::current().record(\"errors\", result.errors);\n```\n\n### src/ingestion/orchestrator.rs - drain_resource_events() (~line 566)\nAfter the drain loop:\n```rust\ntracing::Span::current().record(\"items_processed\", result.fetched);\ntracing::Span::current().record(\"errors\", result.failed);\n```\n\n### src/documents/regenerator.rs - regenerate_dirty_documents() (~line 24)\nAfter the regeneration loop:\n```rust\ntracing::Span::current().record(\"items_processed\", result.regenerated);\ntracing::Span::current().record(\"items_skipped\", result.unchanged);\ntracing::Span::current().record(\"errors\", result.errored);\n```\n\n### src/embedding/pipeline.rs - embed_documents() (~line 36)\nAfter embedding completes:\n```rust\ntracing::Span::current().record(\"items_processed\", result.embedded);\ntracing::Span::current().record(\"items_skipped\", result.skipped);\ntracing::Span::current().record(\"errors\", result.failed);\n```\n\nIMPORTANT: These fields must be declared as tracing::field::Empty in the #[instrument] attribute (done in bd-24j1). You can only record() a field that was declared at span creation. 
Attempting to record an undeclared field silently does nothing.\n\n## Acceptance Criteria\n- [ ] MetricsLayer captures items_processed for each stage\n- [ ] MetricsLayer captures items_skipped and errors when non-zero\n- [ ] Fields match the span declarations from bd-24j1\n- [ ] extract_timings() returns correct counts in StageTiming\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/ingestion/orchestrator.rs (record counts in ingest + drain functions)\n- src/documents/regenerator.rs (record counts in regenerate)\n- src/embedding/pipeline.rs (record counts in embed)\n\n## TDD Loop\nRED: test_stage_fields_recorded (integration: run pipeline, extract timings, verify counts > 0)\nGREEN: Add Span::current().record() calls at end of each stage\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Span::current() returns a disabled span if no subscriber is registered (e.g., in tests without subscriber setup). record() on disabled span is a no-op. Tests need a subscriber.\n- Field names must exactly match the declaration: \"items_processed\" not \"itemsProcessed\"\n- Recording must happen BEFORE the span closes (before function returns). 
Place at end of function but before Ok(result).","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:32.011236Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:27:38.620645Z","closed_at":"2026-02-04T17:27:38.620601Z","close_reason":"Added tracing::field::Empty declarations and Span::current().record() calls in 4 functions: ingest_project_issues, ingest_project_merge_requests, drain_resource_events, regenerate_dirty_documents, embed_documents","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-16m8","depends_on_id":"bd-24j1","type":"blocks","created_at":"2026-02-04T15:55:19.962261Z","created_by":"tayloreernisse"},{"issue_id":"bd-16m8","depends_on_id":"bd-34ek","type":"blocks","created_at":"2026-02-04T15:55:20.009988Z","created_by":"tayloreernisse"},{"issue_id":"bd-16m8","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:32.012091Z","created_by":"tayloreernisse"}]}
{"id":"bd-17n","title":"OBSERV: Add LoggingConfig to Config struct","description":"## Background\nLoggingConfig centralizes log file settings so users can customize retention and disable file logging. It follows the same #[serde(default)] pattern as SyncConfig (src/core/config.rs:32-78) so existing config.json files continue working with zero changes.\n\n## Approach\nAdd to src/core/config.rs, after the EmbeddingConfig struct (around line 120):\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\n#[serde(default)]\npub struct LoggingConfig {\n /// Directory for log files. Default: None (= XDG data dir + /logs/)\n pub log_dir: Option<String>,\n\n /// Days to retain log files. Default: 30. Set to 0 to disable file logging.\n pub retention_days: u32,\n\n /// Enable JSON log files. Default: true.\n pub file_logging: bool,\n}\n\nimpl Default for LoggingConfig {\n fn default() -> Self {\n Self {\n log_dir: None,\n retention_days: 30,\n file_logging: true,\n }\n }\n}\n```\n\nAdd to the Config struct (src/core/config.rs:123-137), after the embedding field:\n\n```rust\n#[serde(default)]\npub logging: LoggingConfig,\n```\n\nNote: Using impl Default rather than default helper functions (default_retention_days, default_true) because #[serde(default)] on the struct applies Default::default() to the entire struct when the key is missing. 
This is the same pattern used by SyncConfig.\n\n## Acceptance Criteria\n- [ ] Deserializing {} as LoggingConfig yields retention_days=30, file_logging=true, log_dir=None\n- [ ] Deserializing {\"retention_days\": 7} preserves file_logging=true default\n- [ ] Existing config.json files (no \"logging\" key) deserialize without error\n- [ ] Config struct has .logging field accessible\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/config.rs (add LoggingConfig struct + Default impl, add field to Config)\n\n## TDD Loop\nRED: tests/config_tests.rs (or inline #[cfg(test)] mod):\n - test_logging_config_defaults\n - test_logging_config_partial\nGREEN: Add LoggingConfig struct, Default impl, field on Config\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- retention_days=0 means disable file logging entirely (not \"delete all files\") -- document this in the struct doc comment\n- log_dir with a relative path: should be resolved relative to CWD or treated as absolute? Decision: treat as absolute, document it\n- Missing \"logging\" key in JSON: #[serde(default)] handles this -- the entire LoggingConfig gets Default::default()","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.471193Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:10:22.751969Z","closed_at":"2026-02-04T17:10:22.751921Z","close_reason":"Added LoggingConfig struct with log_dir, retention_days, file_logging fields and serde defaults","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-17n","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.471849Z","created_by":"tayloreernisse"}]}
{"id":"bd-17v","title":"[CP1] gi sync-status enhancement","description":"## Background\n\nThe `gi sync-status` command shows synchronization state: last successful sync time, cursor positions per project/resource, and overall health. This helps users understand when data was last refreshed and diagnose sync issues.\n\n## Approach\n\n### Module: src/cli/commands/sync_status.rs (enhance existing or create)\n\n### Handler Function\n\n```rust\npub async fn handle_sync_status(conn: &Connection) -> Result<()>\n```\n\n### Data to Display\n\n1. **Last sync run**: From `sync_runs` table\n - Started at, completed at, status\n - Issues fetched, discussions fetched\n\n2. **Cursor positions**: From `sync_cursors` table\n - Per (project, resource_type) pair\n - Show updated_at_cursor as human-readable date\n - Show tie_breaker_id (GitLab ID of last processed item)\n\n3. **Overall counts**: Quick summary\n - Total issues, discussions, notes in DB\n\n### Output Format\n\n```\nLast Sync\n─────────\nStatus: completed\nStarted: 2024-01-25 10:30:00\nCompleted: 2024-01-25 10:35:00\nDuration: 5m 23s\n\nCursor Positions\n────────────────\ngroup/project-one (issues):\n Last updated_at: 2024-01-25 10:30:00\n Last GitLab ID: 12345\n\nData Summary\n────────────\nIssues: 1,234\nDiscussions: 5,678\nNotes: 12,345 (excluding 2,000 system)\n```\n\n### Queries\n\n```sql\n-- Last sync run\nSELECT * FROM sync_runs ORDER BY started_at DESC LIMIT 1\n\n-- Cursor positions\nSELECT p.path, sc.resource_type, sc.updated_at_cursor, sc.tie_breaker_id\nFROM sync_cursors sc\nJOIN projects p ON sc.project_id = p.id\n\n-- Data summary\nSELECT COUNT(*) FROM issues\nSELECT COUNT(*) FROM discussions\nSELECT COUNT(*), SUM(is_system) FROM notes\n```\n\n## Acceptance Criteria\n\n- [ ] Shows last sync run with status and timing\n- [ ] Shows cursor position per project/resource\n- [ ] Shows total counts for issues, discussions, notes\n- [ ] Handles case where no sync has run yet\n- [ ] Formats timestamps as 
human-readable local time\n\n## Files\n\n- src/cli/commands/sync_status.rs (create or enhance)\n- src/cli/mod.rs (add SyncStatus variant if new)\n\n## TDD Loop\n\nRED:\n```rust\n#[tokio::test] async fn sync_status_shows_last_run()\n#[tokio::test] async fn sync_status_shows_cursor_positions()\n#[tokio::test] async fn sync_status_handles_no_sync_yet()\n```\n\nGREEN: Implement handler with queries and formatting\n\nVERIFY: `cargo test sync_status`\n\n## Edge Cases\n\n- No sync has ever run - show \"No sync runs recorded\"\n- Sync in progress - show \"Status: running\" with started_at\n- Cursor at epoch 0 - means fresh start, show \"Not started\"\n- Multiple projects - show cursor for each","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-25T17:02:38.409353Z","created_by":"tayloreernisse","updated_at":"2026-01-25T23:03:21.851557Z","closed_at":"2026-01-25T23:03:21.851496Z","close_reason":"Implemented gi sync-status showing last run, cursor positions, and data summary","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-17v","depends_on_id":"bd-208","type":"blocks","created_at":"2026-01-25T17:04:05.749433Z","created_by":"tayloreernisse"}]}
{"id":"bd-18t","title":"Implement discussion truncation logic","description":"## Background\nDiscussion threads can contain dozens of notes spanning thousands of characters. The truncation module ensures discussion documents stay within a 32k character limit (suitable for embedding chunking) by dropping middle notes while preserving first and last notes for context. A separate hard safety cap of 2MB applies to ALL document types for pathological content (pasted logs, base64 blobs). Issue/MR documents are NOT truncated by the discussion logic — only the hard cap applies.\n\n## Approach\nCreate `src/documents/truncation.rs` per PRD Section 2.3:\n\n```rust\npub const MAX_DISCUSSION_CHARS: usize = 32_000;\npub const MAX_DOCUMENT_CHARS_HARD: usize = 2_000_000;\n\npub struct NoteContent {\n pub author: String,\n pub date: String,\n pub body: String,\n}\n\npub struct TruncationResult {\n pub content: String,\n pub is_truncated: bool,\n pub reason: Option<TruncationReason>,\n}\n\npub enum TruncationReason {\n TokenLimitMiddleDrop,\n SingleNoteOversized,\n FirstLastOversized,\n HardCapOversized,\n}\n```\n\n**Core functions:**\n- `truncate_discussion(notes: &[NoteContent], max_chars: usize) -> TruncationResult`\n- `truncate_utf8(s: &str, max_bytes: usize) -> &str` (shared with fts.rs)\n- `truncate_hard_cap(content: &str) -> TruncationResult` (for any doc type)\n\n**Algorithm for truncate_discussion:**\n1. Format all notes as `@author (date):\\nbody\\n\\n`\n2. If total <= max_chars: return as-is\n3. If single note: truncate at UTF-8 boundary, append `[truncated]`, reason = SingleNoteOversized\n4. Binary search: find max N where first N notes + last 1 note + marker fit within max_chars\n5. If first + last > max_chars: keep only first (truncated), reason = FirstLastOversized\n6. Otherwise: first N + marker + last M, reason = TokenLimitMiddleDrop\n\n**Marker format:** `\\n\\n[... 
N notes omitted for length ...]\\n\\n`\n\n## Acceptance Criteria\n- [ ] Discussion with total < 32k chars returns untruncated\n- [ ] Discussion > 32k chars: middle notes dropped, first + last preserved\n- [ ] Truncation marker shows correct count of omitted notes\n- [ ] Single note > 32k chars: truncated at UTF-8-safe boundary with `[truncated]` appended\n- [ ] First + last note > 32k: only first note kept (truncated if needed)\n- [ ] Hard cap (2MB) truncates any document type at UTF-8-safe boundary\n- [ ] `truncate_utf8` never panics on multi-byte codepoints (emoji, CJK, accented chars)\n- [ ] `TruncationReason::as_str()` returns DB-compatible strings matching CHECK constraint\n\n## Files\n- `src/documents/truncation.rs` — new file\n- `src/documents/mod.rs` — add `pub use truncation::{truncate_discussion, truncate_hard_cap, TruncationResult, NoteContent};`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_no_truncation_under_limit` — 3 short notes, all fit\n- `test_middle_notes_dropped` — 10 notes totaling > 32k, first+last preserved\n- `test_single_note_oversized` — one note of 50k chars, truncated safely\n- `test_first_last_oversized` — first=20k, last=20k, only first kept\n- `test_one_note_total` — single note under limit: no truncation\n- `test_utf8_boundary_safety` — content with emoji/CJK at truncation point\n- `test_hard_cap` — 3MB content truncated to 2MB\n- `test_marker_count_correct` — marker says \"[... 
5 notes omitted ...]\" when 5 dropped\nGREEN: Implement truncation logic\nVERIFY: `cargo test truncation`\n\n## Edge Cases\n- Empty notes list: return empty content, not truncated\n- All notes are empty strings: total = 0, no truncation\n- Note body contains only multi-byte characters: truncate_utf8 walks backward to find safe boundary\n- Note body with trailing newlines: formatted output should not have excessive blank lines","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:25:45.597167Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:21:32.256569Z","closed_at":"2026-01-30T17:21:32.256507Z","close_reason":"Completed: truncate_discussion, truncate_hard_cap, truncate_utf8, TruncationReason with as_str(), 12 tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-18t","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:15.947679Z","created_by":"tayloreernisse"}]}
{"id":"bd-1b0n","title":"OBSERV: Print human-readable timing summary after interactive sync","description":"## Background\nInteractive users want a quick timing summary after sync completes. This is the human-readable equivalent of meta.stages in robot JSON. Gated behind IngestDisplay::show_text so it doesn't appear in -q, robot, or progress_only modes.\n\n## Approach\nAdd a function to format and print the timing summary, called from run_sync() after the pipeline completes:\n\n```rust\nfn print_timing_summary(stages: &[StageTiming], total_elapsed: Duration) {\n eprintln!();\n eprintln!(\"Sync complete in {:.1}s\", total_elapsed.as_secs_f64());\n for stage in stages {\n let dots = \".\".repeat(20_usize.saturating_sub(stage.name.len()));\n eprintln!(\n \" {} {} {:.1}s ({} items{})\",\n stage.name,\n dots,\n stage.elapsed_ms as f64 / 1000.0,\n stage.items_processed,\n if stage.errors > 0 { format!(\", {} errors\", stage.errors) } else { String::new() },\n );\n }\n}\n```\n\nCall in run_sync() (src/cli/commands/sync.rs), after pipeline and before return:\n```rust\nif display.show_text {\n let stages = metrics_handle.extract_timings();\n print_timing_summary(&stages, start.elapsed());\n}\n```\n\nOutput format per PRD Section 4.6.4:\n```\nSync complete in 45.2s\n Ingest issues .... 12.3s (150 items, 42 discussions)\n Ingest MRs ....... 18.9s (85 items, 1 error)\n Generate docs .... 8.5s (235 documents)\n Embed ............ 
5.5s (1024 chunks)\n```\n\n## Acceptance Criteria\n- [ ] Interactive lore sync prints timing summary to stderr after completion\n- [ ] Summary shows total time and per-stage breakdown\n- [ ] lore -q sync does NOT print timing summary\n- [ ] Robot mode does NOT print timing summary (only JSON)\n- [ ] Error counts shown when non-zero\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/sync.rs (add print_timing_summary function, call after pipeline)\n\n## TDD Loop\nRED: test_timing_summary_format (capture stderr, verify format matches PRD example pattern)\nGREEN: Implement print_timing_summary, gate behind display.show_text\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Empty stages (e.g., sync with no projects configured): print \"Sync complete in 0.0s\" with no stage lines\n- Very fast stages (<1ms): show \"0.0s\" not scientific notation\n- Stage names with varying lengths: dot padding keeps alignment readable","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:32.109882Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:32:52.558314Z","closed_at":"2026-02-04T17:32:52.558264Z","close_reason":"Added print_timing_summary with per-stage breakdown (name, elapsed, items, errors, rate limits), nested sub-stage support, gated behind metrics Option","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1b0n","depends_on_id":"bd-1zj6","type":"blocks","created_at":"2026-02-04T15:55:20.162069Z","created_by":"tayloreernisse"},{"issue_id":"bd-1b0n","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:32.110706Z","created_by":"tayloreernisse"}]}
{"id":"bd-1cb","title":"[CP0] gi doctor command - health checks","description":"## Background\n\ndoctor is the primary diagnostic command. It checks all system components and reports their status. Supports JSON output for scripting and CI integration. Must degrade gracefully - warn about optional components (Ollama) without failing.\n\nReference: docs/prd/checkpoint-0.md section \"gi doctor\"\n\n## Approach\n\n**src/cli/commands/doctor.ts:**\n\nPerforms 5 checks:\n1. **Config**: Load and validate config file\n2. **Database**: Open DB, verify pragmas, check schema version\n3. **GitLab**: Auth with token, verify connectivity\n4. **Projects**: Count configured vs resolved in DB\n5. **Ollama**: Ping embedding endpoint (optional - warn if unavailable)\n\n**DoctorResult interface:**\n```typescript\ninterface DoctorResult {\n success: boolean; // All required checks passed\n checks: {\n config: { status: 'ok' | 'error'; path?: string; error?: string };\n database: { status: 'ok' | 'error'; path?: string; schemaVersion?: number; error?: string };\n gitlab: { status: 'ok' | 'error'; url?: string; username?: string; error?: string };\n projects: { status: 'ok' | 'error'; configured?: number; resolved?: number; error?: string };\n ollama: { status: 'ok' | 'warning' | 'error'; url?: string; model?: string; error?: string };\n };\n}\n```\n\n**Human-readable output (default):**\n```\ngi doctor\n\n Config ✓ Loaded from ~/.config/gi/config.json\n Database ✓ ~/.local/share/gi/data.db (schema v1)\n GitLab ✓ https://gitlab.example.com (authenticated as @johndoe)\n Projects ✓ 2 configured, 2 resolved\n Ollama ⚠ Not running (semantic search unavailable)\n\nStatus: Ready (lexical search available, semantic search requires Ollama)\n```\n\n**JSON output (--json flag):**\nOutputs DoctorResult as JSON to stdout\n\n## Acceptance Criteria\n\n- [ ] Config check: shows path and validation status\n- [ ] Database check: shows path, schema version, pragma verification\n- [ ] GitLab check: shows 
URL and authenticated username\n- [ ] Projects check: shows configured count and resolved count\n- [ ] Ollama check: warns if not running, doesn't fail overall\n- [ ] success=true only if config, database, gitlab, projects all ok\n- [ ] --json outputs valid JSON matching DoctorResult interface\n- [ ] Exit 0 if success=true, exit 1 if any required check fails\n- [ ] Colors and symbols in human output (✓, ⚠, ✗)\n\n## Files\n\nCREATE:\n- src/cli/commands/doctor.ts\n- src/types/doctor.ts (DoctorResult interface)\n\n## TDD Loop\n\nN/A - diagnostic command, verify with manual testing:\n\n```bash\n# All good\ngi doctor\n\n# JSON output\ngi doctor --json | jq .\n\n# With missing Ollama\n# (just don't run Ollama - should show warning)\n\n# With bad config\nmv ~/.config/gi/config.json ~/.config/gi/config.json.bak\ngi doctor # should show config error\n```\n\n## Edge Cases\n\n- Ollama timeout should be short (2s) - don't block on slow network\n- Ollama 404 (wrong model) vs connection refused (not running)\n- Database file exists but wrong schema version\n- Projects in config but not in database (init not run)\n- Token valid for user but project access revoked","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:51.435540Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:30:24.921206Z","closed_at":"2026-01-25T03:30:24.921041Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1cb","depends_on_id":"bd-13b","type":"blocks","created_at":"2026-01-24T16:13:10.427307Z","created_by":"tayloreernisse"},{"issue_id":"bd-1cb","depends_on_id":"bd-1l1","type":"blocks","created_at":"2026-01-24T16:13:10.478469Z","created_by":"tayloreernisse"},{"issue_id":"bd-1cb","depends_on_id":"bd-3ng","type":"blocks","created_at":"2026-01-24T16:13:10.461940Z","created_by":"tayloreernisse"},{"issue_id":"bd-1cb","depends_on_id":"bd-epj","type":"blocks","created_at":"2026-01-24T16:13:10.443612Z","created_by":"tayloreernisse"}]}
{"id":"bd-1d5","title":"[CP1] GitLab client pagination methods","description":"Add async generator methods for paginated GitLab API calls.\n\nMethods to add to src/gitlab/client.ts:\n- paginateIssues(gitlabProjectId, updatedAfter?) → AsyncGenerator<GitLabIssue>\n- paginateIssueDiscussions(gitlabProjectId, issueIid) → AsyncGenerator<GitLabDiscussion>\n- requestWithHeaders<T>(path) → { data: T, headers: Headers }\n\nImplementation:\n- Use scope=all, state=all for issues\n- Order by updated_at ASC\n- Follow X-Next-Page header until empty/absent\n- Apply cursor rewind (subtract cursorRewindSeconds) for tuple semantics\n- Fall back to empty-page detection if headers missing\n\nFiles: src/gitlab/client.ts\nTests: tests/unit/pagination.test.ts\nDone when: Pagination handles multiple pages and respects cursors","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:43.069869Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.156881Z","deleted_at":"2026-01-25T15:21:35.156877Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1ep","title":"Wire resource event fetching into sync pipeline","description":"## Background\nAfter issue/MR primary ingestion and discussion fetch, changed entities need resource_events jobs enqueued and drained. This is the integration point that connects the queue (bd-tir), API client (bd-sqw), DB upserts (bd-1uc), and config flag (bd-2e8).\n\n## Approach\nModify the sync pipeline to add two new phases after discussion sync:\n\n**Phase 1 — Enqueue during ingestion:**\nIn src/ingestion/orchestrator.rs, after each entity upsert (issue or MR), call:\n```rust\nif config.sync.fetch_resource_events {\n enqueue_job(conn, project_id, \"issue\", iid, local_id, \"resource_events\", None)?;\n}\n// For MRs, also enqueue mr_closes_issues (always) and mr_diffs (when fetchMrFileChanges)\n```\n\nThe \"changed entity\" detection uses the existing dirty tracker: if an entity was inserted or updated during this sync run, it gets enqueued. On --full sync, all entities are enqueued.\n\n**Phase 2 — Drain dependent queue:**\nAdd a new drain step in src/cli/commands/sync.rs (or new src/core/drain.rs), called after discussion sync:\n```rust\npub async fn drain_dependent_queue(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n progress: Option<ProgressCallback>,\n) -> Result<DrainResult>\n```\n\nFlow:\n1. reclaim_stale_locks(conn, config.sync.stale_lock_minutes)\n2. Loop: claim_jobs(conn, \"resource_events\", batch_size=10)\n3. For each job:\n a. Fetch 3 event types via client (fetch_issue_state_events etc.)\n b. Store via upsert functions (upsert_state_events etc.)\n c. complete_job(conn, job.id) on success\n d. fail_job(conn, job.id, error_msg) on failure\n4. Report progress: \"Fetching resource events... [N/M]\"\n5. 
Repeat until no more claimable jobs\n\n**Progress reporting:**\nAdd new ProgressEvent variants:\n```rust\nResourceEventsFetchStart { total: usize },\nResourceEventsFetchProgress { completed: usize, total: usize },\nResourceEventsFetchComplete { fetched: usize, failed: usize },\n```\n\n## Acceptance Criteria\n- [ ] Full sync enqueues resource_events jobs for all issues and MRs\n- [ ] Incremental sync only enqueues for entities changed since last sync\n- [ ] --no-events prevents enqueueing resource_events jobs\n- [ ] Drain step fetches all 3 event types per entity\n- [ ] Successful fetches stored and job completed\n- [ ] Failed fetches recorded with error, job retried on next sync\n- [ ] Stale locks reclaimed at drain start\n- [ ] Progress displayed: \"Fetching resource events... [N/M]\"\n- [ ] Robot mode progress suppressed (quiet mode)\n\n## Files\n- src/ingestion/orchestrator.rs (add enqueue calls during upsert)\n- src/cli/commands/sync.rs (add drain step after discussions)\n- src/core/drain.rs (new, optional — or inline in sync.rs)\n\n## TDD Loop\nRED: tests/sync_pipeline_tests.rs (or extend existing):\n- `test_sync_enqueues_resource_events_for_changed_entities` - mock sync, verify jobs enqueued\n- `test_sync_no_events_flag_skips_enqueue` - verify no jobs when flag false\n- `test_drain_completes_jobs_on_success` - mock API responses, verify jobs deleted\n- `test_drain_fails_jobs_on_error` - mock API failure, verify job attempts incremented\n\nNote: Full pipeline integration tests may need mock HTTP server. 
Start with unit tests on enqueue/drain logic using the real DB with mock API responses.\n\nGREEN: Implement enqueue hooks + drain step\n\nVERIFY: `cargo test sync -- --nocapture && cargo build`\n\n## Edge Cases\n- Entity deleted between enqueue and drain: API returns 404, fail_job with \"entity not found\" (retry won't help but backoff caps it)\n- Rate limiting during drain: GitLabRateLimited error should fail_job with retry (transient)\n- Network error during drain: GitLabNetworkError should fail_job with retry\n- Multiple sync runs competing: locked_at prevents double-processing; stale lock reclaim handles crashes\n- Drain should have a max iterations guard to prevent infinite loop if jobs keep failing and being retried within the same run","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:57.334527Z","created_by":"tayloreernisse","updated_at":"2026-02-03T17:46:51.336138Z","closed_at":"2026-02-03T17:46:51.336077Z","close_reason":"Implemented: enqueue + drain resource events in orchestrator, wired counts through ingest→sync pipeline, added progress events, 4 new tests, all 209 tests pass","compaction_level":0,"original_size":0,"labels":["gate-1","phase-b","pipeline"],"dependencies":[{"issue_id":"bd-1ep","depends_on_id":"bd-1uc","type":"blocks","created_at":"2026-02-02T21:32:06.225837Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ep","depends_on_id":"bd-2e8","type":"blocks","created_at":"2026-02-02T21:32:06.142442Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ep","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:57.335847Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ep","depends_on_id":"bd-sqw","type":"blocks","created_at":"2026-02-02T21:32:06.183287Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ep","depends_on_id":"bd-tir","type":"blocks","created_at":"2026-02-02T21:32:06.267800Z","created_by":"tayloreernisse"}]}
{"id":"bd-1fn","title":"[CP1] Integration tests for discussion watermark","description":"Integration tests verifying discussion sync watermark behavior.\n\n## Tests (tests/discussion_watermark_tests.rs)\n\n- skips_discussion_fetch_when_updated_at_unchanged\n- fetches_discussions_when_updated_at_advanced\n- updates_watermark_after_successful_discussion_sync\n- does_not_update_watermark_on_discussion_sync_failure\n\n## Test Scenario\n1. Ingest issue with updated_at = T1\n2. Verify discussions_synced_for_updated_at = T1\n3. Re-run ingest with same issue (updated_at = T1)\n4. Verify NO discussion API calls made (watermark prevents)\n5. Simulate issue update (updated_at = T2)\n6. Re-run ingest\n7. Verify discussion API calls made for T2\n8. Verify watermark updated to T2\n\n## Why This Matters\nDiscussion API is expensive (1 call per issue). Watermark ensures\nwe only refetch when issue actually changed, even with cursor rewind.\n\nFiles: tests/discussion_watermark_tests.rs\nDone when: Watermark correctly prevents redundant discussion refetch","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:11.362495Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.086158Z","deleted_at":"2026-01-25T17:02:02.086154Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1gu","title":"[CP0] gi auth-test command","description":"## Background\n\nauth-test is a quick diagnostic command to verify GitLab connectivity. Used for troubleshooting and CI pipelines. Simpler than doctor because it only checks auth, not full system health.\n\nReference: docs/prd/checkpoint-0.md section \"gi auth-test\"\n\n## Approach\n\n**src/cli/commands/auth-test.ts:**\n```typescript\nimport { Command } from 'commander';\nimport { loadConfig } from '../../core/config';\nimport { GitLabClient } from '../../gitlab/client';\nimport { TokenNotSetError } from '../../core/errors';\n\nexport const authTestCommand = new Command('auth-test')\n .description('Verify GitLab authentication')\n .action(async (options, command) => {\n const globalOpts = command.optsWithGlobals();\n \n // 1. Load config\n const config = loadConfig(globalOpts.config);\n \n // 2. Get token from environment\n const token = process.env[config.gitlab.tokenEnvVar];\n if (!token) {\n throw new TokenNotSetError(config.gitlab.tokenEnvVar);\n }\n \n // 3. Create client and test auth\n const client = new GitLabClient({\n baseUrl: config.gitlab.baseUrl,\n token,\n });\n \n // 4. Get current user\n const user = await client.getCurrentUser();\n \n // 5. 
Output success\n console.log(`Authenticated as @${user.username} (${user.name})`);\n console.log(`GitLab: ${config.gitlab.baseUrl}`);\n });\n```\n\n**Output format:**\n```\nAuthenticated as @johndoe (John Doe)\nGitLab: https://gitlab.example.com\n```\n\n## Acceptance Criteria\n\n- [ ] Loads config from default or --config path\n- [ ] Gets token from configured env var (default GITLAB_TOKEN)\n- [ ] Throws TokenNotSetError if env var not set\n- [ ] Calls GET /api/v4/user to verify auth\n- [ ] Prints username and display name on success\n- [ ] Exit 0 on success\n- [ ] Exit 1 on auth failure (GitLabAuthError)\n- [ ] Exit 1 if config not found (ConfigNotFoundError)\n\n## Files\n\nCREATE:\n- src/cli/commands/auth-test.ts\n\n## TDD Loop\n\nN/A - simple command, verify manually and with integration test in init.test.ts\n\n```bash\n# Manual verification\nexport GITLAB_TOKEN=\"valid-token\"\ngi auth-test\n\n# With invalid token\nexport GITLAB_TOKEN=\"invalid\"\ngi auth-test # should exit 1\n```\n\n## Edge Cases\n\n- Config exists but token env var not set - clear error message\n- Token exists but wrong scopes - GitLabAuthError (401)\n- Network unreachable - GitLabNetworkError\n- Token with extra whitespace - should trim","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:51.135580Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:28:16.369542Z","closed_at":"2026-01-25T03:28:16.369481Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1gu","depends_on_id":"bd-13b","type":"blocks","created_at":"2026-01-24T16:13:10.058655Z","created_by":"tayloreernisse"},{"issue_id":"bd-1gu","depends_on_id":"bd-1l1","type":"blocks","created_at":"2026-01-24T16:13:10.077581Z","created_by":"tayloreernisse"}]}
{"id":"bd-1hj","title":"[CP1] Ingestion orchestrator","description":"Coordinate issue + dependent discussion sync with bounded concurrency.\n\n## Module\nsrc/ingestion/orchestrator.rs\n\n## Canonical Pattern (CP1)\n\nWhen gi ingest --type=issues runs:\n\n1. **Ingest issues** - cursor-based with incremental cursor updates per page\n2. **Collect touched issues** - record IssueForDiscussionSync for each issue passing cursor filter\n3. **Filter for discussion sync** - enqueue issues where:\n issue.updated_at > issues.discussions_synced_for_updated_at\n4. **Execute discussion sync** - with bounded concurrency (dependent_concurrency from config)\n5. **Update watermark** - after each issue's discussions successfully ingested\n\n## Concurrency Notes\n\nRuntime decision: Use single-threaded Tokio runtime (flavor = \"current_thread\")\n- rusqlite::Connection is !Send, conflicts with multi-threaded runtimes\n- Single-threaded avoids Send bounds entirely\n- Use tokio::task::spawn_local + LocalSet for concurrent discussion fetches\n- Keeps code simple; can upgrade to channel-based DB writer in CP2 if needed\n\n## Configuration Used\n- config.sync.dependent_concurrency - limits parallel discussion requests\n- config.sync.cursor_rewind_seconds - safety margin for cursor\n\n## Progress Reporting\n- Show total issues fetched\n- Show issues needing discussion sync\n- Show discussion/note counts per project\n\nFiles: src/ingestion/orchestrator.rs\nTests: Integration tests with mocked GitLab\nDone when: Full issue + discussion ingestion orchestrated correctly","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:57.325679Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.851047Z","deleted_at":"2026-01-25T17:02:01.851043Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1ht","title":"Epic: Gate 5 - Code Trace (lore trace)","description":"## Background\nGate 5 implements \"lore trace\" — the command that answers \"Why was this code introduced?\" by tracing from a file path through the MR that modified it, to the issue that motivated the MR, to the discussions that contain the decision rationale. This is the capstone of Phase B, combining data from all previous gates.\n\nGate 5 ships Tier 1 only (API-only, no local git). Tier 2 (git blame integration via git2-rs for line-level precision) is deferred to Phase C.\n\n## Architecture\n- **No new tables.** Trace queries combine mr_file_changes (Gate 4), entity_references (Gate 2), and discussions/notes (existing).\n- **Query flow:** file → mr_file_changes → MRs → entity_references (closes/related) → issues → discussions with DiffNote context\n- **Tier 1 limitation:** File-level granularity only. Cannot trace a specific line to its introducing commit.\n- **Path parsing:** Supports \"src/foo.rs:45\" syntax — line number parsed but deferred with warning about Tier 2.\n- **Rename aware:** Reuses file_history::resolve_rename_chain for multi-path matching.\n\n## Children (Execution Order)\n1. **bd-2n4** [OPEN] — Trace query logic: file → MR → issue → discussion chain (src/core/trace.rs)\n2. **bd-9dd** [OPEN] — CLI command with human + robot output (src/cli/commands/trace.rs)\n\n## Gate Completion Criteria\n- [ ] `lore trace <file>` shows MRs that touched the file with linked issues + discussion context\n- [ ] Output includes MR → issue → discussion chain\n- [ ] DiffNote snippets show content positioned on the traced file\n- [ ] Cross-references from entity_references used for MR→issue linking\n- [ ] Robot JSON output with trace_chains array and meta.tier = \"api_only\"\n- [ ] :line suffix parsed with Tier 2 warning message\n- [ ] -p flag for project scoping\n- [ ] --no-follow-renames disables rename chain\n- [ ] Graceful empty state: \"No MR data found. 
Run lore sync with fetchMrFileChanges: true\"\n\n## Dependencies\n- Depends on: Gate 2 (bd-1se) for entity_references, Gate 4 (bd-14q) for mr_file_changes + commit SHAs\n- Downstream: None — Gate 5 is the terminal gate of Phase B","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.141053Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:33:19.836653Z","compaction_level":0,"original_size":0,"labels":["epic","gate-5","phase-b"],"dependencies":[{"issue_id":"bd-1ht","depends_on_id":"bd-14q","type":"blocks","created_at":"2026-02-02T21:34:38.033428Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ht","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:34:37.987232Z","created_by":"tayloreernisse"}]}
{"id":"bd-1i2","title":"Integrate mark_dirty_tx into ingestion modules","description":"## Background\nThis bead integrates dirty source tracking into the existing ingestion pipelines. Every entity upserted during ingestion must be marked dirty so the document regenerator knows to update the corresponding search document. The critical constraint: mark_dirty_tx() must be called INSIDE the same transaction that upserts the entity — not after commit.\n\n**Key PRD clarification:** Mark ALL upserted entities dirty (not just changed ones). The regenerator's hash comparison handles \"unchanged\" detection cheaply — this avoids needing change detection in ingestion.\n\n## Approach\nModify 4 existing ingestion files to add mark_dirty_tx() calls inside existing transaction blocks per PRD Section 6.1.\n\n**1. src/ingestion/issues.rs:**\nInside the issue upsert loop, after each successful INSERT/UPDATE:\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::Issue, issue_row.id)?;\n```\n\n**2. src/ingestion/merge_requests.rs:**\nInside the MR upsert loop:\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::MergeRequest, mr_row.id)?;\n```\n\n**3. src/ingestion/discussions.rs:**\nInside discussion insert (issue discussions, full-refresh transaction):\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::Discussion, discussion_row.id)?;\n```\n\n**4. src/ingestion/mr_discussions.rs:**\nInside discussion upsert (write phase):\n```rust\ndirty_tracker::mark_dirty_tx(&tx, SourceType::Discussion, discussion_row.id)?;\n```\n\n**Discussion Sweep Cleanup (PRD Section 6.1 — CRITICAL):**\nWhen the MR discussion sweep deletes stale discussions (`last_seen_at < run_start_time`), **delete the corresponding document rows directly** — do NOT use the dirty queue for cleanup. 
The `ON DELETE CASCADE` on `document_labels`/`document_paths` and the `documents_embeddings_ad` trigger handle all downstream cleanup.\n\n**PRD-exact CTE pattern:**\n```sql\n-- In src/ingestion/mr_discussions.rs, during sweep phase.\n-- A WITH clause scopes to a single statement in SQLite, so the CTE is\n-- repeated on each DELETE. Run both statements inside one transaction so\n-- the stale ID set cannot change between them.\n-- Step 1: delete orphaned documents (must happen while source_id still resolves)\nWITH stale AS (\n  SELECT id FROM discussions\n  WHERE merge_request_id = ? AND last_seen_at < ?\n)\nDELETE FROM documents\n WHERE source_type = 'discussion' AND source_id IN (SELECT id FROM stale);\n-- Step 2: delete the stale discussions themselves\nWITH stale AS (\n  SELECT id FROM discussions\n  WHERE merge_request_id = ? AND last_seen_at < ?\n)\nDELETE FROM discussions\n WHERE id IN (SELECT id FROM stale);\n```\n\n**NOTE:** Alternatively, capture the stale IDs in Rust first and execute two sequential statements:\n```rust\nlet stale_ids: Vec<i64> = conn.prepare(\n    \"SELECT id FROM discussions WHERE merge_request_id = ? AND last_seen_at < ?\"\n)?.query_map(params![mr_id, run_start], |r| r.get(0))?\n    .collect::<Result<Vec<_>, _>>()?;\n\nif !stale_ids.is_empty() {\n    // Delete documents FIRST (while source_id still resolves)\n    conn.execute(\n        \"DELETE FROM documents WHERE source_type = 'discussion' AND source_id IN (...)\",\n        ...\n    )?;\n    // Then delete the discussions\n    conn.execute(\n        \"DELETE FROM discussions WHERE id IN (...)\",\n        ...\n    )?;\n}\n```\n\n**IMPORTANT difference from dirty queue pattern:** The sweep deletes documents DIRECTLY (not via dirty_sources queue). This is because the source entity is being deleted — there's nothing for the regenerator to regenerate from. 
The cascade handles FTS, labels, paths, and embeddings cleanup.\n\n## Acceptance Criteria\n- [ ] Every upserted issue is marked dirty inside the same transaction\n- [ ] Every upserted MR is marked dirty inside the same transaction\n- [ ] Every upserted discussion (issue + MR) is marked dirty inside the same transaction\n- [ ] ALL upserted entities marked dirty (not just changed ones) — regenerator handles skip\n- [ ] mark_dirty_tx called with &Transaction (not &Connection)\n- [ ] mark_dirty_tx uses upsert with ON CONFLICT to reset backoff state (not INSERT OR IGNORE)\n- [ ] Discussion sweep deletes documents DIRECTLY (not via dirty queue)\n- [ ] Discussion sweep uses CTE (or Rust-side ID capture) to capture stale IDs before cascading deletes\n- [ ] Documents deleted BEFORE discussions (while source_id still resolves)\n- [ ] ON DELETE CASCADE handles document_labels, document_paths cleanup\n- [ ] documents_embeddings_ad trigger handles embedding cleanup\n- [ ] `cargo build` succeeds\n- [ ] Existing ingestion tests still pass\n\n## Files\n- `src/ingestion/issues.rs` — add mark_dirty_tx calls in upsert loop\n- `src/ingestion/merge_requests.rs` — add mark_dirty_tx calls in upsert loop\n- `src/ingestion/discussions.rs` — add mark_dirty_tx calls in insert loop\n- `src/ingestion/mr_discussions.rs` — add mark_dirty_tx calls + direct document deletion in sweep\n\n## TDD Loop\nRED: Existing tests should still pass (regression); new tests:\n- `test_issue_upsert_marks_dirty` — after issue ingest, dirty_sources has entry\n- `test_mr_upsert_marks_dirty` — after MR ingest, dirty_sources has entry\n- `test_discussion_upsert_marks_dirty` — after discussion ingest, dirty_sources has entry\n- `test_discussion_sweep_deletes_documents` — stale discussion documents deleted directly\n- `test_sweep_cascade_cleans_labels_paths` — ON DELETE CASCADE works\nGREEN: Add mark_dirty_tx calls in all 4 files, implement sweep with CTE\nVERIFY: `cargo test ingestion && cargo build`\n\n## Edge 
Cases\n- Upsert that doesn't change data: still marks dirty (regenerator hash check handles skip)\n- Transaction rollback: dirty mark also rolled back (atomic, inside same txn)\n- Discussion sweep with zero stale IDs: CTE returns empty, no DELETE executed\n- Large batch of upserts: each mark_dirty_tx is O(1) INSERT with ON CONFLICT\n- Sweep deletes document before discussion: order matters for source_id resolution","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:27:09.540279Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:39:17.241433Z","closed_at":"2026-01-30T17:39:17.241390Z","close_reason":"Added mark_dirty_tx calls in issues.rs, merge_requests.rs, discussions.rs, mr_discussions.rs (2 paths)","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1i2","depends_on_id":"bd-38q","type":"blocks","created_at":"2026-01-30T15:29:35.105551Z","created_by":"tayloreernisse"}]}
{"id":"bd-1j1","title":"Integration test: full Phase B sync pipeline","description":"## Background\n\nThis integration test proves the full Phase B sync pipeline works end-to-end: sync issues/MRs from a mock GitLab, enqueue dependent fetches, drain the queue, extract cross-references from state events, and parse system notes for references. It validates that all the separate Gate 1-2 components work together.\n\n## Approach\n\nCreate `tests/phase_b_integration.rs` (or add to existing integration test file).\n\n### Test Setup\n\n1. Create in-memory SQLite DB with all migrations applied (001-015)\n2. Set up HTTP mock server (use `wiremock` crate) with:\n - `/api/v4/projects/:id/issues` returning 2 test issues\n - `/api/v4/projects/:id/merge_requests` returning 1 test MR\n - `/api/v4/projects/:id/issues/:iid/resource_state_events` returning state events\n - `/api/v4/projects/:id/issues/:iid/resource_label_events` returning label events\n - `/api/v4/projects/:id/merge_requests/:iid/resource_state_events` returning merge event with source_merge_request_iid\n - `/api/v4/projects/:id/issues/:iid/discussions` returning a discussion with a system note containing \"mentioned in merge request !1\"\n3. Build a Config pointing to the mock server\n\n### Test Flow\n\n```rust\n#[tokio::test]\nasync fn test_full_phase_b_pipeline() {\n // 1. Set up mock server + DB\n // 2. Run ingest issues\n // 3. Run ingest MRs\n // 4. Verify pending_dependent_fetches were enqueued\n // 5. Drain dependent fetch queue (resource_events)\n // 6. Verify resource_state_events populated\n // 7. Verify entity_references extracted from state events\n // 8. Verify entity_references parsed from system notes\n // 9. 
Query entity_references to confirm the MR->issue \"closes\" link exists\n}\n```\n\n### Assertions\n\n- `SELECT COUNT(*) FROM resource_state_events` > 0\n- `SELECT COUNT(*) FROM resource_label_events` > 0\n- `SELECT COUNT(*) FROM entity_references WHERE reference_type = 'closes'` >= 1\n- `SELECT COUNT(*) FROM entity_references WHERE source_method = 'system_note_parse'` >= 1\n- `SELECT COUNT(*) FROM pending_dependent_fetches WHERE locked_at IS NOT NULL` = 0 (all drained)\n\n### Mock Data\n\nUse realistic but minimal GitLab API responses. The key fixture is a state event with `source_merge_request` field that creates a closes reference, and a system note body like \"mentioned in merge request !1\" that the note parser should extract.\n\n## Acceptance Criteria\n\n- [ ] Test creates DB, mocks, and runs full pipeline without errors\n- [ ] resource_state_events table populated from mock API\n- [ ] resource_label_events table populated from mock API\n- [ ] entity_references table has at least one \"closes\" reference from API\n- [ ] entity_references table has at least one \"mentioned\" reference from note parsing\n- [ ] pending_dependent_fetches fully drained (no stuck jobs)\n- [ ] Test runs in < 10 seconds\n- [ ] `cargo test --test phase_b_integration` passes\n\n## Files\n\n- `tests/phase_b_integration.rs` (NEW)\n- Possibly `tests/fixtures/` for mock JSON responses\n\n## TDD Loop\n\nRED: Write the test with all assertions. 
It will fail because the pipeline steps may not be wired end-to-end yet.\n\nGREEN: Fix any pipeline wiring issues until the test passes.\n\nVERIFY: `cargo test --test phase_b_integration -- --nocapture`\n\n## Edge Cases\n\n- Mock server returning paginated responses: include Link header with rel=next\n- Mock server returning empty pages: verify pipeline handles gracefully\n- Concurrent queue draining: test with dependent_concurrency=1 to avoid timing issues\n- Rate limiting: don't test rate limiting in integration test (mock responds immediately)\n\n## Dependencies\n\nRequires `wiremock` crate in `[dev-dependencies]`:\n```toml\nwiremock = \"0.6\"\n```","status":"open","priority":3,"issue_type":"task","created_at":"2026-02-02T22:42:26.355071Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:53:22.367647Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1j1","depends_on_id":"bd-1ji","type":"blocks","created_at":"2026-02-02T22:43:27.941002Z","created_by":"tayloreernisse"},{"issue_id":"bd-1j1","depends_on_id":"bd-1se","type":"parent-child","created_at":"2026-02-02T22:43:40.577709Z","created_by":"tayloreernisse"},{"issue_id":"bd-1j1","depends_on_id":"bd-3ia","type":"blocks","created_at":"2026-02-02T22:43:28.048311Z","created_by":"tayloreernisse"},{"issue_id":"bd-1j1","depends_on_id":"bd-8t4","type":"blocks","created_at":"2026-02-02T22:43:27.996061Z","created_by":"tayloreernisse"}]}
{"id":"bd-1je","title":"Implement pending discussion queue","description":"## Background\nThe pending discussion queue tracks discussions that need to be fetched from GitLab. When an issue or MR is updated, its discussions may need re-fetching. This queue is separate from dirty_sources (which tracks entities needing document regeneration) — it tracks entities needing API calls to GitLab. The queue uses the same backoff pattern as dirty_sources for consistency.\n\n## Approach\nCreate `src/ingestion/discussion_queue.rs`:\n\n```rust\nuse crate::core::backoff::compute_next_attempt_at;\n\n/// Noteable type for discussion queue.\n#[derive(Debug, Clone, Copy)]\npub enum NoteableType {\n Issue,\n MergeRequest,\n}\n\nimpl NoteableType {\n pub fn as_str(&self) -> &'static str {\n match self {\n Self::Issue => \"Issue\",\n Self::MergeRequest => \"MergeRequest\",\n }\n }\n}\n\npub struct PendingFetch {\n pub project_id: i64,\n pub noteable_type: NoteableType,\n pub noteable_iid: i64,\n pub attempt_count: i32,\n}\n\n/// Queue a discussion fetch. 
ON CONFLICT DO UPDATE resets backoff (consistent with dirty_sources).\npub fn queue_discussion_fetch(\n conn: &Connection,\n project_id: i64,\n noteable_type: NoteableType,\n noteable_iid: i64,\n) -> Result<()>;\n\n/// Get next batch of pending fetches (WHERE next_attempt_at IS NULL OR <= now).\npub fn get_pending_fetches(conn: &Connection, limit: usize) -> Result<Vec<PendingFetch>>;\n\n/// Mark fetch complete (remove from queue).\npub fn complete_fetch(\n conn: &Connection,\n project_id: i64,\n noteable_type: NoteableType,\n noteable_iid: i64,\n) -> Result<()>;\n\n/// Record fetch error with backoff.\npub fn record_fetch_error(\n conn: &Connection,\n project_id: i64,\n noteable_type: NoteableType,\n noteable_iid: i64,\n error: &str,\n) -> Result<()>;\n```\n\n## Acceptance Criteria\n- [ ] queue_discussion_fetch uses ON CONFLICT DO UPDATE (consistent with dirty_sources pattern)\n- [ ] Re-queuing resets: attempt_count=0, next_attempt_at=NULL, last_error=NULL\n- [ ] get_pending_fetches respects next_attempt_at backoff\n- [ ] get_pending_fetches returns entries ordered by queued_at ASC\n- [ ] complete_fetch removes entry from queue\n- [ ] record_fetch_error increments attempt_count, computes next_attempt_at via shared backoff\n- [ ] NoteableType.as_str() returns \"Issue\" or \"MergeRequest\" (matches DB CHECK constraint)\n- [ ] `cargo test discussion_queue` passes\n\n## Files\n- `src/ingestion/discussion_queue.rs` — new file\n- `src/ingestion/mod.rs` — add `pub mod discussion_queue;`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_queue_and_get` — queue entry, get returns it\n- `test_requeue_resets_backoff` — queue, error, re-queue -> attempt_count=0\n- `test_backoff_respected` — entry with future next_attempt_at not returned\n- `test_complete_removes` — complete_fetch removes entry\n- `test_error_increments_attempts` — error -> attempt_count=1, next_attempt_at set\nGREEN: Implement all functions\nVERIFY: `cargo test discussion_queue`\n\n## Edge 
Cases\n- Queue same (project_id, noteable_type, noteable_iid) twice: ON CONFLICT resets state\n- NoteableType must match DB CHECK constraint exactly (\"Issue\", \"MergeRequest\" — capitalized)\n- Empty queue: get_pending_fetches returns empty Vec","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:27:09.505548Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:31:35.496454Z","closed_at":"2026-01-30T17:31:35.496405Z","close_reason":"Implemented discussion_queue with queue/get/complete/record_error + 6 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1je","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:35.034753Z","created_by":"tayloreernisse"},{"issue_id":"bd-1je","depends_on_id":"bd-mem","type":"blocks","created_at":"2026-01-30T15:29:35.071573Z","created_by":"tayloreernisse"}]}
{"id":"bd-1ji","title":"Parse system notes for cross-reference patterns","description":"## Background\nSystem notes contain cross-reference patterns like 'mentioned in !{iid}', 'closed by !{iid}', etc. This is best-effort, English-only extraction that supplements the structured API data from bd-3ia and bd-8t4. Runs as a local post-processing step (no API calls).\n\n## Approach\nCreate src/core/note_parser.rs:\n\n```rust\nuse regex::Regex;\nuse lazy_static::lazy_static;\n\n/// A parsed cross-reference from a system note.\npub struct ParsedCrossRef {\n pub reference_type: String, // \"mentioned\" | \"closes\"\n pub target_entity_type: String, // \"issue\" | \"merge_request\" \n pub target_iid: i64,\n pub target_project_path: Option<String>, // None = same project\n}\n\nlazy_static! {\n static ref MENTIONED_RE: Regex = Regex::new(\n r\"mentioned in (?:(?P<project>[\\w\\-]+/[\\w\\-]+))?(?P<sigil>[#!])(?P<iid>\\d+)\"\n ).unwrap();\n static ref CLOSED_BY_RE: Regex = Regex::new(\n r\"closed by (?:(?P<project>[\\w\\-]+/[\\w\\-]+))?(?P<sigil>[#!])(?P<iid>\\d+)\"\n ).unwrap();\n}\n\n/// Parse a system note body for cross-references.\npub fn parse_cross_refs(body: &str) -> Vec<ParsedCrossRef>\n\n/// Extract cross-references from all system notes and insert into entity_references.\n/// Queries notes WHERE is_system = 1, parses body text, resolves to entity_references.\npub fn extract_refs_from_system_notes(\n conn: &Connection,\n project_id: i64,\n) -> Result<ExtractResult>\n\npub struct ExtractResult {\n pub inserted: usize,\n pub skipped_unresolvable: usize,\n pub parse_failures: usize, // logged at debug level\n}\n```\n\nSigil mapping: `#` = issue, `!` = merge_request\n\nResolution logic:\n1. If target_project_path is None (same project): look up entity by iid in local DB → set target_entity_id\n2. 
If target_project_path is Some: check if project is synced locally\n - If yes: resolve to local entity id\n - If no: store as unresolved (target_entity_id=NULL, target_project_path=path, target_entity_iid=iid)\n\nInsert with source_method='system_note_parse', INSERT OR IGNORE for dedup.\n\nCall after drain_dependent_queue and extract_refs_from_state_events in the sync pipeline.\n\n## Acceptance Criteria\n- [ ] 'mentioned in !123' → mentioned ref, target=MR iid 123\n- [ ] 'mentioned in #456' → mentioned ref, target=issue iid 456\n- [ ] 'mentioned in group/project!789' → cross-project mentioned ref\n- [ ] 'closed by !123' → closes ref\n- [ ] Cross-project refs stored as unresolved when target project not synced\n- [ ] source_method = 'system_note_parse'\n- [ ] Parse failures logged at debug level (not errors)\n- [ ] Idempotent (INSERT OR IGNORE)\n- [ ] Only processes is_system=1 notes\n\n## Files\n- src/core/note_parser.rs (new)\n- src/core/mod.rs (add `pub mod note_parser;`)\n- src/cli/commands/sync.rs (call after other ref extraction steps)\n\n## TDD Loop\nRED: tests/note_parser_tests.rs:\n- `test_parse_mentioned_in_mr` - \"mentioned in !567\" → ParsedCrossRef { mentioned, merge_request, 567 }\n- `test_parse_mentioned_in_issue` - \"mentioned in #234\" → ParsedCrossRef { mentioned, issue, 234 }\n- `test_parse_mentioned_cross_project` - \"mentioned in group/repo!789\" → with project path\n- `test_parse_closed_by_mr` - \"closed by !567\" → ParsedCrossRef { closes, merge_request, 567 }\n- `test_parse_multiple_refs` - note with two mentions → two refs\n- `test_parse_no_refs` - \"Updated the description\" → empty vec\n- `test_extract_refs_from_system_notes_integration` - seed DB with system notes, verify entity_references created\n\nGREEN: Implement regex patterns and extraction logic\n\nVERIFY: `cargo test note_parser -- --nocapture`\n\n## Edge Cases\n- Non-English GitLab instances: \"ajouté l'étiquette ~bug\" won't match — this is accepted limitation, logged at 
debug\n- Multi-level group paths: \"mentioned in top/sub/project#123\" — regex needs to handle arbitrary depth ([\\w\\-]+(?:/[\\w\\-]+)+)\n- Note body may contain markdown links that look like refs: \"[#123](url)\" — the regex should handle this correctly since the prefix \"mentioned in\" is required\n- Same ref mentioned multiple times in same note — dedup via INSERT OR IGNORE\n- Note may reference itself (e.g., system note on issue #123 says \"mentioned in #123\") — technically valid, store it","status":"closed","priority":3,"issue_type":"task","created_at":"2026-02-02T21:32:33.663304Z","created_by":"tayloreernisse","updated_at":"2026-02-04T20:13:33.398960Z","closed_at":"2026-02-04T20:13:33.398868Z","close_reason":"Completed: parse_cross_refs regex parser, extract_refs_from_system_notes DB function, wired into orchestrator. 17 tests passing.","compaction_level":0,"original_size":0,"labels":["gate-2","parsing","phase-b"],"dependencies":[{"issue_id":"bd-1ji","depends_on_id":"bd-1se","type":"parent-child","created_at":"2026-02-02T21:32:33.665218Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ji","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T22:41:50.672947Z","created_by":"tayloreernisse"}]}
{"id":"bd-1k1","title":"Implement FTS5 search function and query sanitization","description":"## Background\nFTS5 search is the core lexical retrieval engine. It wraps SQLite's FTS5 with safe query parsing that prevents user input from causing SQL syntax errors, while preserving useful features like prefix search for type-ahead. The search function returns ranked results with BM25 scores and contextual snippets. This module is the Gate A search backbone and also provides fallback search when Ollama is unavailable in Gate B.\n\n## Approach\nCreate `src/search/` module with `mod.rs` and `fts.rs` per PRD Section 3.1-3.2.\n\n**src/search/mod.rs:**\n```rust\nmod fts;\nmod filters;\n// Later beads add: mod vector; mod hybrid; mod rrf;\npub use fts::{search_fts, to_fts_query, FtsResult, FtsQueryMode, generate_fallback_snippet, get_result_snippet};\n```\n\n**src/search/fts.rs — key functions:**\n\n1. `to_fts_query(raw: &str, mode: FtsQueryMode) -> String`\n - Safe mode: wrap each token in quotes, escape internal quotes, preserve trailing * on alphanumeric tokens\n - Raw mode: pass through unchanged\n\n2. `search_fts(conn: &Connection, query: &str, limit: usize, mode: FtsQueryMode) -> Result<Vec<FtsResult>>`\n - Uses `bm25(documents_fts)` for ranking\n - Uses `snippet(documents_fts, 1, '<mark>', '</mark>', '...', 64)` for context\n - Column index 1 = content_text (0=title)\n\n3. `generate_fallback_snippet(content_text: &str, max_chars: usize) -> String`\n - For semantic-only results without FTS snippets\n - Uses `truncate_utf8()` for safe byte boundaries\n\n4. `truncate_utf8(s: &str, max_bytes: usize) -> &str`\n - Walks backward from max_bytes to find nearest char boundary\n\n5. 
`get_result_snippet(fts_snippet: Option<&str>, content_text: &str) -> String`\n - Prefers FTS snippet, falls back to truncated content\n\nUpdate `src/lib.rs`: add `pub mod search;`\n\n## Acceptance Criteria\n- [ ] Porter stemming works: search \"searching\" matches document containing \"search\"\n- [ ] Prefix search works: `auth*` matches \"authentication\"\n- [ ] Empty query returns empty Vec (no error)\n- [ ] Special characters don't cause FTS5 errors: `-`, `\"`, `:`, `*`\n- [ ] Query `\"-DWITH_SSL\"` returns results (dash not treated as NOT operator)\n- [ ] Query `C++` returns results (special chars preserved in quotes)\n- [ ] Safe mode preserves trailing `*` on alphanumeric tokens: `auth*` -> `\"auth\"*`\n- [ ] Raw mode passes query unchanged\n- [ ] BM25 scores returned (lower = better match)\n- [ ] Snippets contain `<mark>` tags around matches\n- [ ] `generate_fallback_snippet` truncates at word boundary, appends \"...\"\n- [ ] `truncate_utf8` never panics on multi-byte codepoints\n- [ ] `cargo test fts` passes\n\n## Files\n- `src/search/mod.rs` — new file (module root)\n- `src/search/fts.rs` — new file (FTS5 search + query sanitization)\n- `src/lib.rs` — add `pub mod search;`\n\n## TDD Loop\nRED: Tests in `fts.rs` `#[cfg(test)] mod tests`:\n- `test_safe_query_basic` — \"auth error\" -> `\"auth\" \"error\"`\n- `test_safe_query_prefix` — \"auth*\" -> `\"auth\"*`\n- `test_safe_query_special_chars` — \"C++\" -> `\"C++\"`\n- `test_safe_query_dash` — \"-DWITH_SSL\" -> `\"-DWITH_SSL\"`\n- `test_safe_query_quotes` — `he said \"hello\"` -> escaped\n- `test_raw_mode_passthrough` — raw query unchanged\n- `test_empty_query` — returns empty vec\n- `test_truncate_utf8_emoji` — truncate mid-emoji walks back\n- `test_fallback_snippet_word_boundary` — truncates at space\nGREEN: Implement to_fts_query, search_fts, helpers\nVERIFY: `cargo test fts`\n\n## Edge Cases\n- Query with only whitespace: treated as empty, returns empty\n- Query with only special characters: quoted, may 
return no results (not an error)\n- Very long query (1000+ chars): works but may be slow (no explicit limit)\n- FTS5 snippet returns empty string: fallback to truncated content_text\n- Non-alphanumeric prefix: `C++*` — NOT treated as prefix (special chars present)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:13.005179Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:23:35.204290Z","closed_at":"2026-01-30T17:23:35.204106Z","close_reason":"Completed: to_fts_query (safe/raw modes), search_fts with BM25+snippets, generate_fallback_snippet, get_result_snippet, truncate_utf8 reuse, 13 tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1k1","depends_on_id":"bd-221","type":"blocks","created_at":"2026-01-30T15:29:24.374108Z","created_by":"tayloreernisse"}]}
{"id":"bd-1k4","title":"OBSERV: Add get_log_dir() helper to paths module","description":"## Background\nA centralized helper for the log directory path ensures consistent XDG compliance and directory creation. The existing get_data_dir() (src/core/paths.rs:40-43) returns ~/.local/share/lore/. We add a sibling that appends /logs/.\n\n## Approach\nAdd to src/core/paths.rs, after get_db_path() (around line 53):\n\n```rust\n/// Get the log directory path. Creates the directory if it doesn't exist.\npub fn get_log_dir(config_override: Option<&str>) -> PathBuf {\n let dir = if let Some(path) = config_override {\n PathBuf::from(path)\n } else {\n get_data_dir().join(\"logs\")\n };\n std::fs::create_dir_all(&dir).ok();\n dir\n}\n```\n\nThe config_override comes from LoggingConfig.log_dir (bd-17n). When None, uses XDG default.\n\nExisting pattern to follow (src/core/paths.rs:40-53):\n- get_data_dir() -> PathBuf (returns ~/.local/share/lore/)\n- get_db_path(config_override: Option<&str>) -> PathBuf\n\n## Acceptance Criteria\n- [ ] get_log_dir(None) returns ~/.local/share/lore/logs/\n- [ ] get_log_dir(Some(\"/tmp/custom\")) returns /tmp/custom\n- [ ] Directory is created if it doesn't exist\n- [ ] Function is pub and accessible from other modules\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/paths.rs (add get_log_dir function after line ~53)\n\n## TDD Loop\nRED: test_get_log_dir_default, test_get_log_dir_override (use tempdir)\nGREEN: Add get_log_dir() function\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- create_dir_all failure (e.g., permissions): .ok() swallows error silently. This matches get_db_path() which also doesn't create dirs. Consider: should we propagate the error? 
The subscriber init will fail anyway if the dir doesn't exist, providing a clear error.\n- Trailing slash: PathBuf handles this correctly","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.525165Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:10:22.907812Z","closed_at":"2026-02-04T17:10:22.907763Z","close_reason":"Added get_log_dir() helper mirroring get_db_path/get_backup_dir pattern","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1k4","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.526345Z","created_by":"tayloreernisse"}]}
{"id":"bd-1kh","title":"[CP0] Raw payload handling - compression and deduplication","description":"## Background\n\nRaw payload storage allows replaying API responses for debugging and audit. Compression reduces storage for large payloads. SHA-256 deduplication prevents storing identical payloads multiple times (important for frequently polled resources that haven't changed).\n\nReference: docs/prd/checkpoint-0.md section \"Raw Payload Handling\"\n\n## Approach\n\n**src/core/payloads.ts:**\n```typescript\nimport { createHash } from 'node:crypto';\nimport { gzipSync, gunzipSync } from 'node:zlib';\nimport Database from 'better-sqlite3';\nimport { nowMs } from './time';\n\ninterface StorePayloadOptions {\n projectId: number | null;\n resourceType: string; // 'project' | 'issue' | 'mr' | 'note' | 'discussion'\n gitlabId: string; // TEXT because discussion IDs are strings\n payload: unknown; // JSON-serializable object\n compress: boolean; // from config.storage.compressRawPayloads\n}\n\nexport function storePayload(db: Database.Database, options: StorePayloadOptions): number | null {\n // 1. JSON.stringify the payload\n // 2. SHA-256 hash the JSON bytes\n // 3. Check for duplicate by (project_id, resource_type, gitlab_id, payload_hash)\n // 4. If duplicate, return existing ID\n // 5. If compress=true, gzip the JSON bytes\n // 6. INSERT with content_encoding='gzip' or 'identity'\n // 7. Return lastInsertRowid\n}\n\nexport function readPayload(db: Database.Database, id: number): unknown {\n // 1. SELECT content_encoding, payload FROM raw_payloads WHERE id = ?\n // 2. If gzip, decompress\n // 3. 
JSON.parse and return\n}\n```\n\n## Acceptance Criteria\n\n- [ ] storePayload() with compress=true stores gzip-encoded payload\n- [ ] storePayload() with compress=false stores identity-encoded payload\n- [ ] Duplicate payload (same hash) returns existing row ID, not new row\n- [ ] readPayload() correctly decompresses gzip payloads\n- [ ] readPayload() returns null for non-existent ID\n- [ ] SHA-256 hash computed from pre-compression JSON bytes\n- [ ] Large payloads (100KB+) compress to ~10-20% of original size\n\n## Files\n\nCREATE:\n- src/core/payloads.ts\n- tests/unit/payloads.test.ts\n\n## TDD Loop\n\nRED:\n```typescript\n// tests/unit/payloads.test.ts\ndescribe('Payload Storage', () => {\n describe('storePayload', () => {\n it('stores uncompressed payload with identity encoding')\n it('stores compressed payload with gzip encoding')\n it('deduplicates identical payloads by hash')\n it('stores different payloads for same gitlab_id')\n })\n\n describe('readPayload', () => {\n it('reads uncompressed payload')\n it('reads and decompresses gzip payload')\n it('returns null for non-existent id')\n })\n})\n```\n\nGREEN: Implement storePayload() and readPayload()\n\nVERIFY: `npm run test -- tests/unit/payloads.test.ts`\n\n## Edge Cases\n\n- gitlabId is TEXT not INTEGER - discussion IDs are UUIDs\n- Compression ratio varies - some JSON compresses better than others\n- null projectId valid for global resources (like user profile)\n- Hash collision extremely unlikely with SHA-256 but unique index enforces","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:50.189494Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:19:12.854771Z","closed_at":"2026-01-25T03:19:12.854372Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1kh","depends_on_id":"bd-3ng","type":"blocks","created_at":"2026-01-24T16:13:09.055338Z","created_by":"tayloreernisse"}]}
{"id":"bd-1l1","title":"[CP0] GitLab API client with rate limiting","description":"## Background\n\nThe GitLab client handles all API communication with rate limiting to avoid 429 errors. Uses native fetch (Node 18+). Rate limiter adds jitter to prevent thundering herd. All errors are typed for clean error handling in CLI commands.\n\nReference: docs/prd/checkpoint-0.md section \"GitLab Client\"\n\n## Approach\n\n**src/gitlab/client.ts:**\n```typescript\nexport class GitLabClient {\n private baseUrl: string;\n private token: string;\n private rateLimiter: RateLimiter;\n\n constructor(options: { baseUrl: string; token: string; requestsPerSecond?: number }) {\n this.baseUrl = options.baseUrl.replace(/\\/$/, '');\n this.token = options.token;\n this.rateLimiter = new RateLimiter(options.requestsPerSecond ?? 10);\n }\n\n async getCurrentUser(): Promise<GitLabUser>\n async getProject(pathWithNamespace: string): Promise<GitLabProject>\n private async request<T>(path: string, options?: RequestInit): Promise<T>\n}\n\nclass RateLimiter {\n private lastRequest = 0;\n private minInterval: number;\n\n constructor(requestsPerSecond: number) {\n this.minInterval = 1000 / requestsPerSecond;\n }\n\n async acquire(): Promise<void> {\n // Wait if too soon since last request\n // Add 0-50ms jitter\n }\n}\n```\n\n**src/gitlab/types.ts:**\n```typescript\nexport interface GitLabUser {\n id: number;\n username: string;\n name: string;\n}\n\nexport interface GitLabProject {\n id: number;\n path_with_namespace: string;\n default_branch: string;\n web_url: string;\n created_at: string;\n updated_at: string;\n}\n```\n\n**Integration tests with MSW (Mock Service Worker):**\nSet up MSW handlers that mock GitLab API responses for /api/v4/user and /api/v4/projects/:path\n\n## Acceptance Criteria\n\n- [ ] getCurrentUser() returns GitLabUser with id, username, name\n- [ ] getProject(\"group/project\") URL-encodes path correctly\n- [ ] 401 response throws GitLabAuthError\n- [ ] 404 response throws 
GitLabNotFoundError\n- [ ] 429 response throws GitLabRateLimitError with retryAfter from header\n- [ ] Network failure throws GitLabNetworkError\n- [ ] Rate limiter enforces minimum interval between requests\n- [ ] Rate limiter adds random jitter (0-50ms)\n- [ ] tests/integration/gitlab-client.test.ts passes (6 tests)\n\n## Files\n\nCREATE:\n- src/gitlab/client.ts\n- src/gitlab/types.ts\n- tests/integration/gitlab-client.test.ts\n- tests/fixtures/mock-responses/gitlab-user.json\n- tests/fixtures/mock-responses/gitlab-project.json\n\n## TDD Loop\n\nRED:\n```typescript\n// tests/integration/gitlab-client.test.ts\ndescribe('GitLab Client', () => {\n it('authenticates with valid PAT')\n it('returns 401 for invalid PAT')\n it('fetches project by path')\n it('handles rate limiting (429) with Retry-After')\n it('respects rate limit (requests per second)')\n it('adds jitter to rate limiting')\n})\n```\n\nGREEN: Implement client.ts and types.ts\n\nVERIFY: `npm run test -- tests/integration/gitlab-client.test.ts`\n\n## Edge Cases\n\n- Path with special characters (spaces, slashes) must be URL-encoded\n- Retry-After header may be missing - default to 60s\n- Network timeout should be handled (use AbortController)\n- Rate limiter jitter prevents multiple clients syncing in lockstep\n- baseUrl trailing slash should be stripped","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:49.842981Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:06:39.520300Z","closed_at":"2026-01-25T03:06:39.520131Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1l1","depends_on_id":"bd-gg1","type":"blocks","created_at":"2026-01-24T16:13:08.713272Z","created_by":"tayloreernisse"}]}
{"id":"bd-1m8","title":"Extend 'lore stats --check' for event table integrity and queue health","description":"## Background\nThe existing stats --check command validates data integrity. Need to extend it for event tables (referential integrity) and dependent job queue health (stuck locks, retryable jobs). This provides operators and agents a way to detect data quality issues after sync.\n\n## Approach\nExtend src/cli/commands/stats.rs check mode:\n\n**New checks:**\n\n1. Event FK integrity:\n```sql\n-- Orphaned state events (issue_id points to non-existent issue)\nSELECT COUNT(*) FROM resource_state_events rse\nWHERE rse.issue_id IS NOT NULL\n AND NOT EXISTS (SELECT 1 FROM issues i WHERE i.id = rse.issue_id);\n-- (repeat for merge_request_id, and for label + milestone event tables)\n```\n\n2. Queue health:\n```sql\n-- Pending jobs by type\nSELECT job_type, COUNT(*) FROM pending_dependent_fetches GROUP BY job_type;\n-- Stuck locks (locked_at older than 5 minutes)\nSELECT COUNT(*) FROM pending_dependent_fetches WHERE locked_at IS NOT NULL AND locked_at < ?;\n-- Retryable jobs (attempts > 0, not locked)\nSELECT COUNT(*) FROM pending_dependent_fetches WHERE attempts > 0 AND locked_at IS NULL;\n-- Max attempts (jobs that may be permanently failing)\nSELECT job_type, MAX(attempts) FROM pending_dependent_fetches GROUP BY job_type;\n```\n\n3. Human output per check: PASS / WARN / FAIL with counts\n```\nEvent FK integrity: PASS (0 orphaned events)\nQueue health: WARN (3 stuck locks, 12 retryable jobs)\n```\n\n4. 
Robot JSON: structured health report\n```json\n{\n \"event_integrity\": {\n \"status\": \"pass\",\n \"orphaned_state_events\": 0,\n \"orphaned_label_events\": 0,\n \"orphaned_milestone_events\": 0\n },\n \"queue_health\": {\n \"status\": \"warn\",\n \"pending_by_type\": {\"resource_events\": 5, \"mr_closes_issues\": 2},\n \"stuck_locks\": 3,\n \"retryable_jobs\": 12,\n \"max_attempts_by_type\": {\"resource_events\": 5}\n }\n}\n```\n\n## Acceptance Criteria\n- [ ] Detects orphaned events (FK target missing)\n- [ ] Detects stuck locks (locked_at older than threshold)\n- [ ] Reports retryable job count and max attempts\n- [ ] Human output shows PASS/WARN/FAIL per check\n- [ ] Robot JSON matches structured schema\n- [ ] Graceful when event/queue tables don't exist\n\n## Files\n- src/cli/commands/stats.rs (extend check mode)\n\n## TDD Loop\nRED: tests/stats_check_tests.rs:\n- `test_stats_check_events_pass` - clean data, verify PASS\n- `test_stats_check_events_orphaned` - delete an issue with events remaining, verify FAIL count\n- `test_stats_check_queue_stuck_locks` - set old locked_at, verify WARN\n- `test_stats_check_queue_retryable` - fail some jobs, verify retryable count\n\nGREEN: Add the check queries and formatting\n\nVERIFY: `cargo test stats_check -- --nocapture`\n\n## Edge Cases\n- FK with CASCADE should prevent orphaned events in normal operation — but manual DB edits or bugs could cause them\n- Tables may not exist if migration 011 not applied — check table existence before querying\n- Empty queue is PASS (not WARN for \"no jobs found\")\n- Distinguish between \"0 stuck locks\" (good) and \"queue table doesn't exist\" (skip check)","status":"closed","priority":3,"issue_type":"task","created_at":"2026-02-02T21:31:57.422916Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:23:13.409909Z","closed_at":"2026-02-03T16:23:13.409717Z","close_reason":"Extended IntegrityResult with orphan_state/label/milestone_events and queue_stuck_locks/queue_max_attempts. 
Added FK integrity queries for all 3 event tables and queue health checks. Updated human output with PASS/WARN/FAIL indicators and robot JSON.","compaction_level":0,"original_size":0,"labels":["cli","gate-1","phase-b"],"dependencies":[{"issue_id":"bd-1m8","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:57.424103Z","created_by":"tayloreernisse"},{"issue_id":"bd-1m8","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T21:32:06.350605Z","created_by":"tayloreernisse"},{"issue_id":"bd-1m8","depends_on_id":"bd-tir","type":"blocks","created_at":"2026-02-02T21:32:06.391042Z","created_by":"tayloreernisse"}]}
{"id":"bd-1mf","title":"[CP1] gi sync-status enhancement","description":"Enhance sync-status from CP0 stub to show issue cursors.\n\nOutput:\n- Last run timestamp and duration\n- Cursor positions per project (issues resource_type)\n- Entity counts (issues, discussions, notes)\n\nFiles: src/cli/commands/sync-status.ts (update existing)\nDone when: Shows cursor positions and counts after ingestion","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T15:20:36.449088Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.157235Z","deleted_at":"2026-01-25T15:21:35.157232Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1n5","title":"[CP1] gi ingest --type=issues command","description":"CLI command to orchestrate issue ingestion.\n\nImplementation:\n1. Acquire app lock with heartbeat\n2. Create sync_run record (status='running')\n3. For each configured project:\n - Call ingestIssues()\n - For each ingested issue, call ingestIssueDiscussions()\n - Show progress (spinner or progress bar)\n4. Update sync_run (status='succeeded', metrics_json)\n5. Release lock\n\nFlags:\n- --type=issues (required)\n- --project=PATH (optional, filter to single project)\n- --force (override stale lock)\n\nOutput: Progress bar, then summary with counts\n\nFiles: src/cli/commands/ingest.ts\nTests: tests/integration/sync-runs.test.ts\nDone when: Full issue + discussion ingestion works end-to-end","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:20:05.114751Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.153598Z","deleted_at":"2026-01-25T15:21:35.153595Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1nf","title":"Register 'lore timeline' command with all flags","description":"## Background\n\nThis is the CLI wiring bead for the timeline command. It registers the command with clap, parses all flags, calls the pipeline functions, and delegates to human/robot output renderers. This is the integration point that ties all Gate 3 components together.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 3.1 (Command Design).\n\n## Approach\n\n### 1. Add Timeline subcommand in `src/cli/mod.rs`\n\nAdd to the `Commands` enum:\n```rust\n/// Show a chronological timeline of events matching a keyword query\nTimeline(TimelineArgs),\n```\n\nAdd the args struct (flags from spec Section 3.1):\n```rust\n#[derive(Parser)]\npub struct TimelineArgs {\n /// Keyword search query\n pub query: String,\n\n /// Filter by project path\n #[arg(short = 'p', long, help_heading = \"Filters\")]\n pub project: Option<String>,\n\n /// Filter by time (7d, 2w, 1m, 6m, or YYYY-MM-DD) [spec: --since]\n #[arg(long, help_heading = \"Filters\")]\n pub since: Option<String>,\n\n /// Cross-reference expansion depth (0=seed only, 1=default, 2=deep) [spec: --depth]\n #[arg(long, default_value = \"1\", help_heading = \"Output\")]\n pub depth: u32,\n\n /// Include 'mentioned' edges in expansion (off by default, noisier) [spec: --expand-mentions]\n #[arg(long, help_heading = \"Output\")]\n pub expand_mentions: bool,\n\n /// Maximum events to return [spec: -n]\n #[arg(short = 'n', long = \"limit\", default_value = \"100\", help_heading = \"Output\")]\n pub limit: usize,\n}\n```\n\n### 2. Add handler in `src/main.rs`\n\n```rust\nSome(Commands::Timeline(args)) => handle_timeline(cli.config.as_deref(), args, robot_mode),\n```\n\nThe handler:\n1. Loads config, resolves project if -p given\n2. Parses --since using `core::time::parse_since()`\n3. Cap --depth at 5 silently (spec doesn't specify max, but prevent runaway)\n4. 
Calls pipeline: `seed_timeline()` -> `expand_timeline()` -> `collect_events()` -> sort + limit\n5. Delegates to `print_timeline()` or `print_timeline_json()`\n\n### 3. Add to VALID_COMMANDS in fuzzy matching\n\nIn `suggest_similar_command()`, add \"timeline\" to `VALID_COMMANDS`.\n\n### 4. Add to robot-docs manifest\n\nIn `handle_robot_docs()`, add the timeline command entry with all flags.\n\n## Acceptance Criteria\n\n- [ ] `lore timeline \"authentication\"` works with human output\n- [ ] `lore --robot timeline \"authentication\"` works with JSON output\n- [ ] `-p group/repo` filters to single project (spec Section 3.1)\n- [ ] `--since 6m` filters to recent events (spec Section 3.1: supports 6m, YYYY-MM-DD)\n- [ ] `--depth 0` disables expansion (spec Section 3.1)\n- [ ] `--depth 1` is the default (spec Section 3.1)\n- [ ] `--depth 2` expands two hops (spec Section 3.1)\n- [ ] `--expand-mentions` includes mentioned edges (spec Section 3.1)\n- [ ] `-n 50` limits output to 50 events (spec Section 3.1)\n- [ ] `lore --robot timeline` with missing query prints MISSING_REQUIRED error\n- [ ] Timeline command appears in `lore robot-docs` output\n- [ ] Timeline command in VALID_COMMANDS for fuzzy matching\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/cli/mod.rs` (add TimelineArgs struct + Commands::Timeline variant)\n- `src/main.rs` (add handle_timeline function + match arm + VALID_COMMANDS + robot-docs entry)\n\n## TDD Loop\n\nRED: No unit tests for CLI wiring -- this is integration-level. 
Verify with:\n- `cargo check --all-targets` (compiles)\n- Manual: `lore timeline \"test\"` with a synced database\n\nGREEN: Wire up the clap struct, add the match arm, call through to pipeline.\n\nVERIFY: `cargo check --all-targets && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n\n- Missing query argument: clap handles this with MISSING_REQUIRED error (exit 2)\n- Ambiguous project: resolve_project returns exit 18 (Ambiguous match)\n- Invalid --since format: `parse_since()` returns descriptive LoreError\n- --depth > 5: cap silently at 5 to prevent explosion\n- Empty results: print \"No events found matching 'query'\" (human) or empty events array (robot)","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:28.422082Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:12:17.937610Z","compaction_level":0,"original_size":0,"labels":["cli","gate-3","phase-b"],"dependencies":[{"issue_id":"bd-1nf","depends_on_id":"bd-2f2","type":"blocks","created_at":"2026-02-02T21:33:37.746192Z","created_by":"tayloreernisse"},{"issue_id":"bd-1nf","depends_on_id":"bd-dty","type":"blocks","created_at":"2026-02-02T21:33:37.788079Z","created_by":"tayloreernisse"},{"issue_id":"bd-1nf","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:28.423399Z","created_by":"tayloreernisse"}]}
{"id":"bd-1np","title":"[CP1] GitLab types for issues, discussions, notes","description":"## Background\n\nGitLab types define the Rust structs for deserializing GitLab API responses. These types are the foundation for all ingestion work - issues, discussions, and notes must be correctly typed for serde to parse them.\n\n## Approach\n\nAdd types to `src/gitlab/types.rs` with serde derives:\n\n### GitLabIssue\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabIssue {\n pub id: i64, // GitLab global ID\n pub iid: i64, // Project-scoped issue number\n pub project_id: i64,\n pub title: String,\n pub description: Option<String>,\n pub state: String, // \"opened\" | \"closed\"\n pub created_at: String, // ISO 8601\n pub updated_at: String, // ISO 8601\n pub closed_at: Option<String>,\n pub author: GitLabAuthor,\n pub labels: Vec<String>, // Array of label names (CP1 canonical)\n pub web_url: String,\n}\n```\n\nNOTE: `labels_details` intentionally NOT modeled - varies across GitLab versions.\n\n### GitLabAuthor\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabAuthor {\n pub id: i64,\n pub username: String,\n pub name: String,\n}\n```\n\n### GitLabDiscussion\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabDiscussion {\n pub id: String, // String ID like \"6a9c1750b37d...\"\n pub individual_note: bool, // true = standalone comment\n pub notes: Vec<GitLabNote>,\n}\n```\n\n### GitLabNote\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabNote {\n pub id: i64,\n #[serde(rename = \"type\")]\n pub note_type: Option<String>, // \"DiscussionNote\" | \"DiffNote\" | null\n pub body: String,\n pub author: GitLabAuthor,\n pub created_at: String, // ISO 8601\n pub updated_at: String, // ISO 8601\n pub system: bool, // true for system-generated notes\n #[serde(default)]\n pub resolvable: bool,\n #[serde(default)]\n pub resolved: bool,\n pub resolved_by: Option<GitLabAuthor>,\n pub resolved_at: Option<String>,\n pub 
position: Option<GitLabNotePosition>,\n}\n```\n\n### GitLabNotePosition\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabNotePosition {\n pub old_path: Option<String>,\n pub new_path: Option<String>,\n pub old_line: Option<i32>,\n pub new_line: Option<i32>,\n}\n```\n\n## Acceptance Criteria\n\n- [ ] GitLabIssue deserializes from API response JSON\n- [ ] GitLabAuthor embedded correctly in issue and note\n- [ ] GitLabDiscussion with notes array deserializes\n- [ ] GitLabNote handles null note_type (use Option<String>)\n- [ ] GitLabNote uses #[serde(rename = \"type\")] for reserved keyword\n- [ ] resolvable/resolved default to false via #[serde(default)]\n- [ ] All timestamp fields are String (ISO 8601 parsed elsewhere)\n\n## Files\n\n- src/gitlab/types.rs (edit - add types)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/gitlab_types_tests.rs\n#[test] fn deserializes_gitlab_issue_from_json()\n#[test] fn deserializes_gitlab_discussion_from_json()\n#[test] fn handles_null_note_type()\n#[test] fn handles_missing_resolvable_field()\n#[test] fn deserializes_labels_as_string_array()\n```\n\nGREEN: Add type definitions with serde attributes\n\nVERIFY: `cargo test gitlab_types`\n\n## Edge Cases\n\n- note_type can be null, \"DiscussionNote\", or \"DiffNote\"\n- labels array can be empty\n- description can be null\n- resolved_by/resolved_at can be null\n- position is only present for DiffNotes","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.150472Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:17:08.842965Z","closed_at":"2026-01-25T22:17:08.842895Z","close_reason":"Implemented GitLabAuthor, GitLabIssue, GitLabDiscussion, GitLabNote, GitLabNotePosition types with 10 passing tests","compaction_level":0,"original_size":0}
{"id":"bd-1o1","title":"OBSERV: Add -v/--verbose and --log-format CLI flags","description":"## Background\nUsers and agents need CLI-controlled verbosity without knowing RUST_LOG syntax. The -v flag convention (cargo, curl, ssh) is universally understood. --log-format json enables lore sync 2>&1 | jq workflows without reading log files.\n\n## Approach\nAdd two new global flags to the Cli struct in src/cli/mod.rs (insert after the quiet field at line ~37):\n\n```rust\n/// Increase log verbosity (-v, -vv, -vvv)\n#[arg(short = 'v', long = \"verbose\", action = clap::ArgAction::Count, global = true)]\npub verbose: u8,\n\n/// Log format for stderr output: text (default) or json\n#[arg(long = \"log-format\", global = true, value_parser = [\"text\", \"json\"], default_value = \"text\")]\npub log_format: String,\n```\n\nThe existing Cli struct (src/cli/mod.rs:13-42) has these global flags: config, robot, json, color, quiet. The new flags follow the same pattern.\n\nNote: clap::ArgAction::Count allows -v, -vv, -vvv as a single flag with increasing count (0, 1, 2, 3).\n\n## Acceptance Criteria\n- [ ] lore -v sync parses without error (verbose=1)\n- [ ] lore -vv sync parses (verbose=2)\n- [ ] lore -vvv sync parses (verbose=3)\n- [ ] lore --log-format json sync parses (log_format=\"json\")\n- [ ] lore --log-format text sync parses (default)\n- [ ] lore --log-format xml sync errors (invalid value)\n- [ ] Existing commands unaffected (verbose defaults to 0, log_format to \"text\")\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/mod.rs (modify Cli struct, lines 13-42)\n\n## TDD Loop\nRED: Write test that parses Cli with -v flag and asserts verbose=1\nGREEN: Add the two fields to Cli struct\nVERIFY: cargo test -p lore && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- -v and -q together: both parse fine; conflict resolution happens in subscriber setup (bd-2rr), not here\n- -v flag must be global=true so it works before and after 
subcommands: lore -v sync AND lore sync -v\n- --log-format is a string, not enum, to keep Cli struct simple","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.421339Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:10:22.585947Z","closed_at":"2026-02-04T17:10:22.585905Z","close_reason":"Added -v/--verbose (count) and --log-format (text|json) global CLI flags","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1o1","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.422103Z","created_by":"tayloreernisse"}]}
{"id":"bd-1o4h","title":"OBSERV: Define StageTiming struct in src/core/metrics.rs","description":"## Background\nStageTiming is the materialized view of span timing data. It's the data structure that flows through robot JSON output, sync_runs.metrics_json, and the human-readable timing summary. Defined in a new file because it's genuinely new functionality that doesn't fit existing modules.\n\n## Approach\nCreate src/core/metrics.rs:\n\n```rust\nuse serde::Serialize;\n\nfn is_zero(v: &usize) -> bool { *v == 0 }\n\n#[derive(Debug, Clone, Serialize)]\npub struct StageTiming {\n pub name: String,\n #[serde(skip_serializing_if = \"Option::is_none\")]\n pub project: Option<String>,\n pub elapsed_ms: u64,\n pub items_processed: usize,\n #[serde(skip_serializing_if = \"is_zero\")]\n pub items_skipped: usize,\n #[serde(skip_serializing_if = \"is_zero\")]\n pub errors: usize,\n #[serde(skip_serializing_if = \"Vec::is_empty\")]\n pub sub_stages: Vec<StageTiming>,\n}\n```\n\nRegister module in src/core/mod.rs (line ~11, add):\n```rust\npub mod metrics;\n```\n\nThe is_zero helper is a private function used by serde's skip_serializing_if. 
It must take &usize (reference) and return bool.\n\n## Acceptance Criteria\n- [ ] StageTiming serializes to JSON matching PRD Section 4.6.2 example\n- [ ] items_skipped omitted when 0\n- [ ] errors omitted when 0\n- [ ] sub_stages omitted when empty vec\n- [ ] project omitted when None\n- [ ] name, elapsed_ms, items_processed always present\n- [ ] Struct is Debug + Clone + Serialize\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/metrics.rs (new file)\n- src/core/mod.rs (register module, add line after existing pub mod declarations)\n\n## TDD Loop\nRED:\n - test_stage_timing_serialization: create StageTiming with sub_stages, serialize, assert JSON structure\n - test_stage_timing_zero_fields_omitted: errors=0, items_skipped=0, assert no \"errors\" or \"items_skipped\" keys\n - test_stage_timing_empty_sub_stages: sub_stages=vec![], assert no \"sub_stages\" key\nGREEN: Create metrics.rs with StageTiming struct and is_zero helper\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- is_zero must be a function, not a closure (serde skip_serializing_if requires a function path)\n- Vec::is_empty is a method on Vec, and serde accepts \"Vec::is_empty\" as a path for skip_serializing_if\n- Recursive StageTiming (sub_stages contains StageTiming): serde handles this naturally, no special handling needed","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:31.907234Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:21:40.915842Z","closed_at":"2026-02-04T17:21:40.915794Z","close_reason":"Created src/core/metrics.rs with StageTiming struct, serde skip_serializing_if for zero/empty fields, 5 tests","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1o4h","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:31.910015Z","created_by":"tayloreernisse"}]}
{"id":"bd-1oo","title":"Write migration 015: mr_file_changes + commit SHA columns","description":"## Background\n\nMigration 015 creates the mr_file_changes table that stores which files each MR touched, enabling Gate 4 (file-history) and Gate 5 (trace). It also adds merge_commit_sha and squash_commit_sha columns to merge_requests for future Tier 2 (git blame integration).\n\nNote: Despite the bead title saying \"015\", the actual migration number depends on what's been merged. Check LATEST_SCHEMA_VERSION in src/core/db.rs and use the next sequential number.\n\n## Approach\n\nCreate `migrations/015_mr_file_changes.sql`:\n\n```sql\n-- Migration 015: MR file changes and commit SHA columns\n-- Powers file-history and trace commands (Gates 4-5)\n\nCREATE TABLE mr_file_changes (\n id INTEGER PRIMARY KEY,\n merge_request_id INTEGER NOT NULL REFERENCES merge_requests(id) ON DELETE CASCADE,\n project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,\n old_path TEXT, -- NULL for added files\n new_path TEXT NOT NULL, -- current/final path\n change_type TEXT NOT NULL CHECK (change_type IN ('added', 'modified', 'renamed', 'deleted')),\n lines_added INTEGER,\n lines_removed INTEGER,\n UNIQUE(merge_request_id, new_path)\n);\n\nCREATE INDEX idx_mfc_project_path ON mr_file_changes(project_id, new_path);\nCREATE INDEX idx_mfc_project_old_path ON mr_file_changes(project_id, old_path) WHERE old_path IS NOT NULL;\nCREATE INDEX idx_mfc_mr ON mr_file_changes(merge_request_id);\nCREATE INDEX idx_mfc_renamed ON mr_file_changes(project_id, change_type) WHERE change_type = 'renamed';\n\n-- Add commit SHA columns to merge_requests for Tier 2 git blame\nALTER TABLE merge_requests ADD COLUMN merge_commit_sha TEXT;\nALTER TABLE merge_requests ADD COLUMN squash_commit_sha TEXT;\n\n-- Update schema version\nINSERT INTO schema_version (version, applied_at, description)\nVALUES (15, strftime('%s', 'now') * 1000, 'MR file changes and commit SHA columns');\n```\n\nThen update 
`src/core/db.rs`:\n1. Add migration to `MIGRATIONS` array\n2. Bump `LATEST_SCHEMA_VERSION` to 15\n\n## Acceptance Criteria\n\n- [ ] Migration file exists at `migrations/015_mr_file_changes.sql`\n- [ ] `mr_file_changes` table has all 7 columns per schema above\n- [ ] UNIQUE constraint on (merge_request_id, new_path)\n- [ ] CHECK constraint on change_type allows only 4 values\n- [ ] Indexes created: project+new_path, project+old_path (partial), mr_id, project+renamed (partial)\n- [ ] `merge_requests` table gains merge_commit_sha and squash_commit_sha columns\n- [ ] `LATEST_SCHEMA_VERSION` bumped to 15 in src/core/db.rs\n- [ ] Migration added to MIGRATIONS array in src/core/db.rs\n- [ ] `lore migrate` applies the migration successfully\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `migrations/015_mr_file_changes.sql` (NEW)\n- `src/core/db.rs` (update MIGRATIONS array + LATEST_SCHEMA_VERSION)\n\n## TDD Loop\n\nRED: Run `lore migrate` on a v14 database -- should say \"already up to date\".\n\nGREEN: Add migration file and update db.rs. Run `lore migrate` again -- should apply v15.\n\nVERIFY:\n```bash\nlore --robot migrate\n# Verify: {\"ok\":true,\"data\":{\"before_version\":14,\"after_version\":15,\"migrated\":true}}\nsqlite3 ~/.local/share/lore/lore.db \".schema mr_file_changes\"\nsqlite3 ~/.local/share/lore/lore.db \"PRAGMA table_info(merge_requests)\" | grep sha\n```\n\n## Edge Cases\n\n- Migration must be idempotent-safe: schema_version check prevents double-apply\n- ALTER TABLE on merge_requests: SQLite allows adding columns but not removing them. These columns are nullable (no DEFAULT needed).\n- old_path is NULL for added files, NOT NULL for renamed/deleted files. 
The CHECK constraint doesn't enforce this -- document it as a convention.\n- Partial indexes (WHERE change_type = 'renamed') only index relevant rows for rename chain BFS.","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.837816Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:54:13.710189Z","compaction_level":0,"original_size":0,"labels":["gate-4","phase-b","schema"],"dependencies":[{"issue_id":"bd-1oo","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.843541Z","created_by":"tayloreernisse"},{"issue_id":"bd-1oo","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T21:34:16.505965Z","created_by":"tayloreernisse"}]}
{"id":"bd-1qf","title":"[CP1] Discussion and note transformers","description":"## Background\n\nDiscussion and note transformers convert GitLab API discussion responses into our normalized schema. They compute derived fields like `first_note_at`, `last_note_at`, resolvable/resolved status, and note positions. These are pure functions with no I/O.\n\n## Approach\n\nCreate transformer module with:\n\n### Structs\n\n```rust\n// src/gitlab/transformers/discussion.rs\n\npub struct NormalizedDiscussion {\n pub gitlab_discussion_id: String,\n pub project_id: i64,\n pub issue_id: i64,\n pub noteable_type: String, // \"Issue\"\n pub individual_note: bool,\n pub first_note_at: Option<i64>, // min(note.created_at) in ms epoch\n pub last_note_at: Option<i64>, // max(note.created_at) in ms epoch\n pub last_seen_at: i64,\n pub resolvable: bool, // any note is resolvable\n pub resolved: bool, // all resolvable notes are resolved\n}\n\npub struct NormalizedNote {\n pub gitlab_id: i64,\n pub project_id: i64,\n pub note_type: Option<String>, // \"DiscussionNote\" | \"DiffNote\" | null\n pub is_system: bool, // from note.system\n pub author_username: String,\n pub body: String,\n pub created_at: i64, // ms epoch\n pub updated_at: i64, // ms epoch\n pub last_seen_at: i64,\n pub position: i32, // 0-indexed array position\n pub resolvable: bool,\n pub resolved: bool,\n pub resolved_by: Option<String>,\n pub resolved_at: Option<i64>,\n}\n```\n\n### Functions\n\n```rust\npub fn transform_discussion(\n gitlab_discussion: &GitLabDiscussion,\n local_project_id: i64,\n local_issue_id: i64,\n) -> NormalizedDiscussion\n\npub fn transform_notes(\n gitlab_discussion: &GitLabDiscussion,\n local_project_id: i64,\n) -> Vec<NormalizedNote>\n```\n\n## Acceptance Criteria\n\n- [ ] `NormalizedDiscussion` struct with all fields\n- [ ] `NormalizedNote` struct with all fields\n- [ ] `transform_discussion` computes first_note_at/last_note_at from notes array\n- [ ] `transform_discussion` computes resolvable 
(any note is resolvable)\n- [ ] `transform_discussion` computes resolved (all resolvable notes resolved)\n- [ ] `transform_notes` preserves array order via position field (0-indexed)\n- [ ] `transform_notes` maps system flag to is_system\n- [ ] Unit tests cover all computed fields\n\n## Files\n\n- src/gitlab/transformers/mod.rs (add `pub mod discussion;`)\n- src/gitlab/transformers/discussion.rs (create)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/discussion_transformer_tests.rs\n#[test] fn transforms_discussion_payload_to_normalized_schema()\n#[test] fn extracts_notes_array_from_discussion()\n#[test] fn sets_individual_note_flag_correctly()\n#[test] fn flags_system_notes_with_is_system_true()\n#[test] fn preserves_note_order_via_position_field()\n#[test] fn computes_first_note_at_and_last_note_at_correctly()\n#[test] fn computes_resolvable_and_resolved_status()\n```\n\nGREEN: Implement transform_discussion and transform_notes\n\nVERIFY: `cargo test discussion_transformer`\n\n## Edge Cases\n\n- Discussion with single note - first_note_at == last_note_at\n- All notes are system notes - still compute timestamps\n- No notes resolvable - resolvable=false, resolved=false\n- Mix of resolved/unresolved notes - resolved=false until all done","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.196079Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:27:11.485112Z","closed_at":"2026-01-25T22:27:11.485058Z","close_reason":"Implemented NormalizedDiscussion, NormalizedNote, transform_discussion, transform_notes with 9 passing unit tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1qf","depends_on_id":"bd-1np","type":"blocks","created_at":"2026-01-25T17:04:05.347218Z","created_by":"tayloreernisse"}]}
{"id":"bd-1qz","title":"[CP1] Database migration 002_issues.sql","description":"Create migration file with tables for issues, labels, issue_labels, discussions, and notes.\n\n## Tables\n\n### issues\n- id INTEGER PRIMARY KEY\n- gitlab_id INTEGER UNIQUE NOT NULL\n- project_id INTEGER NOT NULL REFERENCES projects(id)\n- iid INTEGER NOT NULL\n- title TEXT, description TEXT, state TEXT\n- author_username TEXT\n- created_at, updated_at, last_seen_at INTEGER (ms epoch UTC)\n- discussions_synced_for_updated_at INTEGER (watermark for dependent sync)\n- web_url TEXT\n- raw_payload_id INTEGER REFERENCES raw_payloads(id)\n\n### labels (name-only for CP1)\n- id INTEGER PRIMARY KEY\n- gitlab_id INTEGER (optional, for future Labels API)\n- project_id INTEGER NOT NULL REFERENCES projects(id)\n- name TEXT NOT NULL\n- color TEXT, description TEXT (nullable, deferred)\n- UNIQUE(project_id, name)\n\n### issue_labels (junction)\n- issue_id, label_id with CASCADE DELETE\n- Clear existing links before INSERT to handle removed labels\n\n### discussions\n- gitlab_discussion_id TEXT (string ID from API)\n- project_id, issue_id/merge_request_id FKs\n- noteable_type TEXT ('Issue' | 'MergeRequest')\n- individual_note INTEGER, first_note_at, last_note_at, last_seen_at\n- resolvable, resolved flags\n- CHECK constraint for Issue vs MR exclusivity\n\n### notes\n- gitlab_id INTEGER UNIQUE NOT NULL\n- discussion_id, project_id FKs\n- note_type, is_system, author_username, body\n- timestamps, position (array order)\n- resolution fields, DiffNote position fields\n\n## Indexes\n- idx_issues_project_updated, idx_issues_author, idx_issues_discussions_sync\n- uq_issues_project_iid, uq_labels_project_name\n- idx_issue_labels_label\n- uq_discussions_project_discussion_id, idx_discussions_issue/mr/last_note\n- idx_notes_discussion/author/system\n\nFiles: migrations/002_issues.sql\nDone when: Migration applies cleanly on top of 001_initial.sql, schema_version = 
2","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:31.464544Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.685262Z","deleted_at":"2026-01-25T17:02:01.685258Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1re","title":"[CP1] gi show issue command","description":"Show issue details with discussions.\n\nFlags:\n- --project=PATH (required if iid is ambiguous across projects)\n\nOutput:\n- Title, project, state, author, dates, labels, URL\n- Description text\n- All discussions with notes (formatted thread view)\n\nHandle ambiguity: If multiple projects have same iid, prompt for --project or show error.\n\nFiles: src/cli/commands/show.ts\nDone when: Issue detail view displays all fields including threaded discussions","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T15:20:29.826786Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.153211Z","deleted_at":"2026-01-25T15:21:35.153208Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1s1","title":"[CP1] Integration tests for issue ingestion","description":"Full integration tests for issue ingestion module.\n\n## Tests (tests/issue_ingestion_tests.rs)\n\n- inserts_issues_into_database\n- creates_labels_from_issue_payloads\n- links_issues_to_labels_via_junction_table\n- removes_stale_label_links_on_resync\n- stores_raw_payload_for_each_issue\n- stores_raw_payload_for_each_discussion\n- updates_cursor_incrementally_per_page\n- resumes_from_cursor_on_subsequent_runs\n- handles_issues_with_no_labels\n- upserts_existing_issues_on_refetch\n- skips_discussion_refetch_for_unchanged_issues\n\n## Test Setup\n- tempfile::TempDir for isolated database\n- wiremock::MockServer for GitLab API\n- Mock handlers returning fixture data\n\nFiles: tests/issue_ingestion_tests.rs\nDone when: All integration tests pass with mocked GitLab","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:12.158586Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.109109Z","deleted_at":"2026-01-25T17:02:02.109105Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1se","title":"Epic: Gate 2 - Cross-Reference Extraction","description":"## Background\nGate 2 builds the entity relationship graph that connects issues, MRs, and discussions. Without cross-references, temporal queries can only show events for individually-matched entities. With them, \"lore timeline auth migration\" can discover that MR !567 closed issue #234, which spawned follow-up issue #299 — even if #299 does not contain the words \"auth migration.\"\n\nThree data sources feed entity_references:\n1. **Structured API (reliable):** GET /projects/:id/merge_requests/:iid/closes_issues\n2. **State events (reliable):** resource_state_events.source_merge_request_id\n3. **System note parsing (best-effort):** \"mentioned in !456\", \"closed by !789\" patterns\n\n## Architecture\n- **entity_references table:** Already created in migration 011 (bd-hu3/bd-czk). Stores source→target relationships with reference_type (closes/mentioned/related) and source_method provenance.\n- **Directionality convention:** source = entity where reference was observed, target = entity being referenced. Consistent across all source_methods.\n- **Unresolved references:** Cross-project refs stored with target_entity_id=NULL, target_project_path populated. Still valuable for timeline narratives.\n- **closes_issues fetch:** Uses generic dependent fetch queue (job_type = mr_closes_issues). One API call per MR.\n- **System note parsing:** Local post-processing after all dependent fetches complete. No API calls. English-only, best-effort.\n\n## Children (Execution Order)\n1. **bd-czk** [CLOSED] — entity_references schema (folded into migration 011)\n2. **bd-8t4** [OPEN] — Extract cross-references from resource_state_events (source_merge_request_id)\n3. **bd-3ia** [OPEN] — Fetch closes_issues API and populate entity_references\n4. 
**bd-1ji** [OPEN] — Parse system notes for cross-reference patterns\n\n## Gate Completion Criteria\n- [ ] entity_references populated from closes_issues API for all synced MRs\n- [ ] entity_references populated from state events where source_merge_request_id present\n- [ ] System notes parsed for cross-reference patterns (English instances)\n- [ ] Cross-project references stored as unresolved (target_entity_id=NULL)\n- [ ] source_method tracks provenance of each reference\n- [ ] Deduplication: same relationship from multiple sources stored once (UNIQUE constraint)\n- [ ] Timeline JSON includes expansion provenance (via) for expanded entities\n- [ ] Integration test: sync with all three extraction methods, verify entity_references populated\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) — event tables and dependent fetch queue\n- Downstream: Gate 3 (bd-ike) depends on entity_references for BFS expansion","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:00.981132Z","created_by":"tayloreernisse","updated_at":"2026-02-05T16:08:26.965177Z","closed_at":"2026-02-05T16:08:26.964997Z","close_reason":"All child beads completed: bd-8t4 (state event extraction), bd-3ia (closes_issues API), bd-1ji (system note parsing)","compaction_level":0,"original_size":0,"labels":["epic","gate-2","phase-b"],"dependencies":[{"issue_id":"bd-1se","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:32:43.028033Z","created_by":"tayloreernisse"}]}
{"id":"bd-1t4","title":"Epic: CP2 Gate C - Dependent Discussion Sync","description":"## Background\nGate C validates the dependent discussion sync with DiffNote position capture. This is critical for code review context preservation - without DiffNote positions, we lose the file/line context for review comments.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] Discussions fetched for MRs with updated_at > discussions_synced_for_updated_at\n- [ ] `SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL` > 0\n- [ ] DiffNotes have `position_new_path` populated (file path)\n- [ ] DiffNotes have `position_new_line` populated (line number)\n- [ ] DiffNotes have `position_type` populated (text/image/file)\n- [ ] DiffNotes have SHA triplet: `position_base_sha`, `position_start_sha`, `position_head_sha`\n- [ ] Multi-line DiffNotes have `position_line_range_start` and `position_line_range_end`\n- [ ] Unchanged MRs skip discussion refetch (watermark comparison works)\n- [ ] Watermark NOT advanced on HTTP error mid-pagination\n- [ ] Watermark NOT advanced on note timestamp parse failure\n- [ ] `gi show mr <iid>` displays DiffNote with file context `[path:line]`\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate C: Dependent Discussion Sync ===\"\n\n# 1. Check discussion count for MRs\necho \"Step 1: Check MR discussion count...\"\nMR_DISC_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL;\")\necho \" MR discussions: $MR_DISC_COUNT\"\n[ \"$MR_DISC_COUNT\" -gt 0 ] || { echo \"FAIL: No MR discussions found\"; exit 1; }\n\n# 2. Check note count\necho \"Step 2: Check note count...\"\nNOTE_COUNT=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id IS NOT NULL;\n\")\necho \" MR notes: $NOTE_COUNT\"\n\n# 3. 
Check DiffNote position data\necho \"Step 3: Check DiffNote positions...\"\nDIFFNOTE_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM notes WHERE position_new_path IS NOT NULL;\")\necho \" DiffNotes with position: $DIFFNOTE_COUNT\"\n\n# 4. Sample DiffNote data\necho \"Step 4: Sample DiffNote data...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT \n n.gitlab_id,\n n.position_new_path,\n n.position_new_line,\n n.position_type,\n SUBSTR(n.position_head_sha, 1, 7) as head_sha\n FROM notes n\n WHERE n.position_new_path IS NOT NULL\n LIMIT 5;\n\"\n\n# 5. Check multi-line DiffNotes\necho \"Step 5: Check multi-line DiffNotes...\"\nMULTILINE_COUNT=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes \n WHERE position_line_range_start IS NOT NULL \n AND position_line_range_end IS NOT NULL\n AND position_line_range_start != position_line_range_end;\n\")\necho \" Multi-line DiffNotes: $MULTILINE_COUNT\"\n\n# 6. Check watermarks set\necho \"Step 6: Check watermarks...\"\nWATERMARKED=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NOT NULL;\n\")\necho \" MRs with watermark set: $WATERMARKED\"\n\n# 7. Check last_seen_at for sweep pattern\necho \"Step 7: Check last_seen_at (sweep pattern)...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT \n MIN(last_seen_at) as oldest,\n MAX(last_seen_at) as newest\n FROM discussions \n WHERE merge_request_id IS NOT NULL;\n\"\n\n# 8. Test show command with DiffNote\necho \"Step 8: Find MR with DiffNotes for show test...\"\nMR_IID=$(sqlite3 \"$DB_PATH\" \"\n SELECT DISTINCT m.iid\n FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n JOIN notes n ON n.discussion_id = d.id\n WHERE n.position_new_path IS NOT NULL\n LIMIT 1;\n\")\nif [ -n \"$MR_IID\" ]; then\n echo \" Testing: gi show mr $MR_IID\"\n gi show mr \"$MR_IID\" | head -50\nfi\n\n# 9. 
Re-run and verify skip count\necho \"Step 9: Re-run ingest (should skip unchanged MRs)...\"\ngi ingest --type=merge_requests\n# Should report \"Skipped discussion sync for N unchanged MRs\"\n\necho \"\"\necho \"=== Gate C: PASSED ===\"\n```\n\n## Atomicity Test (Manual - Kill Test)\n```bash\n# This tests that partial failure preserves data\n\n# 1. Get an MR with discussions\nMR_ID=$(sqlite3 \"$DB_PATH\" \"\n SELECT m.id FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n LIMIT 1;\n\")\n\n# 2. Note current note count\nBEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id = $MR_ID;\n\")\necho \"Notes before: $BEFORE\"\n\n# 3. Note watermark\nWATERMARK_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark before: $WATERMARK_BEFORE\"\n\n# 4. Force full sync and kill mid-run\ngi ingest --type=merge_requests --full &\nPID=$!\nsleep 3 && kill -9 $PID 2>/dev/null || true\nwait $PID 2>/dev/null || true\n\n# 5. Verify notes preserved (should be same or more, never less)\nAFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id = $MR_ID;\n\")\necho \"Notes after kill: $AFTER\"\n[ \"$AFTER\" -ge \"$BEFORE\" ] || echo \"WARNING: Notes decreased - atomicity may be broken\"\n\n# 6. 
Note watermark should NOT have advanced if killed mid-pagination\nWATERMARK_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark after: $WATERMARK_AFTER\"\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check DiffNote data:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n (SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL) as mr_discussions,\n (SELECT COUNT(*) FROM notes WHERE position_new_path IS NOT NULL) as diffnotes,\n (SELECT COUNT(*) FROM merge_requests WHERE discussions_synced_for_updated_at IS NOT NULL) as watermarked;\n\"\n\n# Find MR with DiffNotes and show it:\ngi show mr $(sqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT DISTINCT m.iid FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n JOIN notes n ON n.discussion_id = d.id\n WHERE n.position_new_path IS NOT NULL LIMIT 1;\n\")\n```\n\n## Dependencies\nThis gate requires:\n- bd-3j6 (Discussion transformer with DiffNote position extraction)\n- bd-20h (MR discussion ingestion with atomicity guarantees)\n- bd-iba (Client pagination for MR discussions)\n- Gates A and B must pass first\n\n## Edge Cases\n- MRs without discussions: should sync successfully, just with 0 discussions\n- Discussions without DiffNotes: regular comments have NULL position fields\n- Deleted discussions in GitLab: sweep pattern should remove them locally\n- Invalid note timestamps: should NOT advance watermark, should log warning","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:01.769694Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.060017Z","closed_at":"2026-01-27T00:48:21.059974Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1t4","depends_on_id":"bd-20h","type":"blocks","created_at":"2026-01-26T22:08:55.778989Z","created_by":"tayloreernisse"}]}
{"id":"bd-1ta","title":"[CP1] Integration tests for pagination","description":"Integration tests for GitLab pagination with wiremock.\n\n## Tests (tests/pagination_tests.rs)\n\n### Page Navigation\n- fetches_all_pages_when_multiple_exist\n- respects_per_page_parameter\n- follows_x_next_page_header_until_empty\n- falls_back_to_empty_page_stop_if_headers_missing\n\n### Cursor Behavior\n- applies_cursor_rewind_for_tuple_semantics\n- clamps_negative_rewind_to_zero\n\n## Test Setup\n- Use wiremock::MockServer\n- Set up handlers for /api/v4/projects/:id/issues\n- Return x-next-page headers\n- Verify request params (updated_after, per_page)\n\nFiles: tests/pagination_tests.rs\nDone when: All pagination tests pass with mocked server","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:07.806593Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.038945Z","deleted_at":"2026-01-25T17:02:02.038939Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1u1","title":"Implement document regenerator","description":"## Background\nThe document regenerator drains the dirty_sources queue, regenerating documents for each entry. It uses per-item transactions for crash safety, a triple-hash fast path to skip unchanged documents entirely (no writes at all), and a bounded batch loop that drains completely. Error recording includes backoff computation.\n\n## Approach\nCreate `src/documents/regenerator.rs` per PRD Section 6.3.\n\n**Core function:**\n```rust\npub fn regenerate_dirty_documents(conn: &Connection) -> Result<RegenerateResult>\n```\n\n**RegenerateResult:** { regenerated, unchanged, errored }\n\n**Algorithm (per PRD):**\n1. Loop: get_dirty_sources(conn) -> Vec<(SourceType, i64)>\n2. If empty, break (queue fully drained)\n3. For each (source_type, source_id):\n a. Begin transaction\n b. Call regenerate_one_tx(&tx, source_type, source_id) -> Result<bool>\n c. If Ok(changed): clear_dirty_tx, commit, count regenerated or unchanged\n d. If Err: record_dirty_error_tx (with backoff), commit, count errored\n\n**regenerate_one_tx (per PRD):**\n1. Extract document via extract_{type}_document(conn, source_id)\n2. If None (deleted): delete_document, return Ok(true)\n3. If Some(doc): call get_existing_hash() to check current state\n4. **If ALL THREE hashes match: return Ok(false) — skip ALL writes** (fast path)\n5. Otherwise: upsert_document with conditional label/path relinking\n6. 
Return Ok(content changed)\n\n**Helper functions (PRD-exact):**\n\n`get_existing_hash` — uses `optional()` to distinguish missing rows from DB errors:\n```rust\nfn get_existing_hash(\n conn: &Connection,\n source_type: SourceType,\n source_id: i64,\n) -> Result<Option<String>> {\n use rusqlite::OptionalExtension;\n let hash: Option<String> = stmt\n .query_row(params, |row| row.get(0))\n .optional()?; // IMPORTANT: Not .ok() — .ok() would hide real DB errors\n Ok(hash)\n}\n```\n\n`get_document_id` — resolve document ID after upsert:\n```rust\nfn get_document_id(conn: &Connection, source_type: SourceType, source_id: i64) -> Result<i64>\n```\n\n`upsert_document` — checks existing triple hash before writing:\n```rust\nfn upsert_document(conn: &Connection, doc: &DocumentData) -> Result<()> {\n // 1. Query existing (id, content_hash, labels_hash, paths_hash) via OptionalExtension\n // 2. Triple-hash fast path: all match -> return Ok(())\n // 3. Upsert document row (ON CONFLICT DO UPDATE)\n // 4. Get doc_id (from existing or query after insert)\n // 5. Only delete+reinsert labels if labels_hash changed\n // 6. 
Only delete+reinsert paths if paths_hash changed\n}\n```\n\n**Key PRD detail — triple-hash fast path:**\n```rust\nif old_content_hash == &doc.content_hash\n && old_labels_hash == &doc.labels_hash\n && old_paths_hash == &doc.paths_hash\n{ return Ok(()); } // Skip ALL writes — prevents WAL churn\n```\n\n**Error recording with backoff:**\nrecord_dirty_error_tx reads current attempt_count from DB, computes next_attempt_at via shared backoff utility:\n```rust\nlet next_attempt_at = crate::core::backoff::compute_next_attempt_at(now, attempt_count + 1);\n```\n\n**All internal functions use _tx suffix** (take &Transaction) for atomicity.\n\n## Acceptance Criteria\n- [ ] Queue fully drained (bounded batch loop until empty)\n- [ ] Per-item transactions (crash loses at most 1 doc)\n- [ ] Triple-hash fast path: ALL THREE hashes match -> skip ALL writes (return Ok(false))\n- [ ] Content change: upsert document, update labels/paths\n- [ ] Labels-only change: relabels but skips path writes (paths_hash unchanged)\n- [ ] Deleted entity: delete document (cascade handles FTS/labels/paths/embeddings)\n- [ ] get_existing_hash uses `.optional()` (not `.ok()`) to preserve DB errors\n- [ ] get_document_id resolves document ID after upsert\n- [ ] Error recording: increment attempt_count, compute next_attempt_at via backoff\n- [ ] FTS triggers fire on insert/update/delete (verified by trigger, not regenerator)\n- [ ] RegenerateResult counts accurate (regenerated, unchanged, errored)\n- [ ] Errors do not abort batch (log, increment, continue)\n- [ ] `cargo test regenerator` passes\n\n## Files\n- `src/documents/regenerator.rs` — new file\n- `src/documents/mod.rs` — add `pub use regenerator::regenerate_dirty_documents;`\n\n## TDD Loop\nRED: Tests requiring DB:\n- `test_creates_new_document` — dirty source -> document created\n- `test_skips_unchanged_triple_hash` — all 3 hashes match -> unchanged count incremented, no DB writes\n- `test_updates_changed_content` — content_hash mismatch -> 
updated\n- `test_updates_changed_labels_only` — content same but labels_hash different -> updated\n- `test_updates_changed_paths_only` — content same but paths_hash different -> updated\n- `test_deletes_missing_source` — source deleted -> document deleted\n- `test_drains_queue` — queue empty after regeneration\n- `test_error_records_backoff` — error -> attempt_count incremented, next_attempt_at set\n- `test_get_existing_hash_not_found` — returns Ok(None) for missing document\nGREEN: Implement regenerate_dirty_documents + all helpers\nVERIFY: `cargo test regenerator`\n\n## Edge Cases\n- Empty queue: return immediately with all-zero counts\n- Extractor error for one item: record_dirty_error_tx, commit, continue\n- Triple-hash prevents WAL churn on incremental syncs (most entities unchanged)\n- Labels change but content does not: labels_hash mismatch triggers upsert with label relinking\n- get_existing_hash on missing document: returns Ok(None) via .optional() (not DB error)\n- get_existing_hash on corrupt DB: propagates real DB error (not masked by .ok())","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:55.178825Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:41:29.942386Z","closed_at":"2026-01-30T17:41:29.942324Z","close_reason":"Implemented document regenerator with triple-hash fast path, queue draining, fail-soft error handling + 5 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1u1","depends_on_id":"bd-1yz","type":"blocks","created_at":"2026-01-30T15:29:16.020686Z","created_by":"tayloreernisse"},{"issue_id":"bd-1u1","depends_on_id":"bd-247","type":"blocks","created_at":"2026-01-30T15:29:15.982772Z","created_by":"tayloreernisse"},{"issue_id":"bd-1u1","depends_on_id":"bd-2fp","type":"blocks","created_at":"2026-01-30T15:29:16.055043Z","created_by":"tayloreernisse"}]}
{"id":"bd-1uc","title":"Implement DB upsert functions for resource events","description":"## Background\nNeed to store fetched resource events into the three event tables created by migration 011. The existing DB pattern uses rusqlite prepared statements with named parameters. Timestamps from GitLab are ISO 8601 strings that need conversion to ms epoch UTC (matching the existing time.rs parse_datetime_to_ms function).\n\n## Approach\nCreate src/core/events_db.rs (new module) with three upsert functions:\n\n```rust\nuse rusqlite::Connection;\nuse super::error::Result;\n\n/// Upsert state events for an entity.\n/// Uses INSERT OR REPLACE keyed on UNIQUE(gitlab_id, project_id).\npub fn upsert_state_events(\n conn: &Connection,\n project_id: i64, // local DB project id\n entity_type: &str, // \"issue\" | \"merge_request\"\n entity_local_id: i64, // local DB id of the issue/MR\n events: &[GitLabStateEvent],\n) -> Result<usize>\n\n/// Upsert label events for an entity.\npub fn upsert_label_events(\n conn: &Connection,\n project_id: i64,\n entity_type: &str,\n entity_local_id: i64,\n events: &[GitLabLabelEvent],\n) -> Result<usize>\n\n/// Upsert milestone events for an entity.\npub fn upsert_milestone_events(\n conn: &Connection,\n project_id: i64,\n entity_type: &str,\n entity_local_id: i64,\n events: &[GitLabMilestoneEvent],\n) -> Result<usize>\n```\n\nEach function:\n1. Prepares INSERT OR REPLACE statement\n2. For each event, maps GitLab types to DB columns:\n - `actor_gitlab_id` = event.user.map(|u| u.id)\n - `actor_username` = event.user.map(|u| u.username.clone())\n - `created_at` = parse_datetime_to_ms(&event.created_at)?\n - Set issue_id or merge_request_id based on entity_type\n3. Returns count of upserted rows\n4. 
Wraps in a savepoint for atomicity per entity\n\nRegister module in src/core/mod.rs:\n```rust\npub mod events_db;\n```\n\n## Acceptance Criteria\n- [ ] All three upsert functions compile and handle all event fields\n- [ ] Upserts are idempotent (re-inserting same event doesn't duplicate)\n- [ ] Timestamps converted to ms epoch UTC via parse_datetime_to_ms\n- [ ] actor_gitlab_id and actor_username populated from event.user (handles None)\n- [ ] entity_type correctly maps to issue_id/merge_request_id (other is NULL)\n- [ ] source_merge_request_id populated for state events (iid from source_merge_request)\n- [ ] source_commit populated for state events\n- [ ] label_name populated for label events\n- [ ] milestone_title and milestone_id populated for milestone events\n- [ ] Returns upserted count\n\n## Files\n- src/core/events_db.rs (new)\n- src/core/mod.rs (add `pub mod events_db;`)\n\n## TDD Loop\nRED: tests/events_db_tests.rs (new):\n- `test_upsert_state_events_basic` - insert 3 events, verify count and data\n- `test_upsert_state_events_idempotent` - insert same events twice, verify no duplicates\n- `test_upsert_label_events_with_actor` - verify actor fields populated\n- `test_upsert_milestone_events_null_user` - verify user: null doesn't crash\n- `test_upsert_state_events_entity_exclusivity` - verify only one of issue_id/merge_request_id set\n\nSetup: create_test_db() helper that applies migrations 001-011, inserts a test project + issue + MR.\n\nGREEN: Implement the three functions\n\nVERIFY: `cargo test events_db -- --nocapture`\n\n## Edge Cases\n- parse_datetime_to_ms must handle GitLab's format: \"2024-03-15T10:30:00.000Z\" and \"2024-03-15T10:30:00.000+00:00\"\n- INSERT OR REPLACE will fire CASCADE deletes if there are FK references to these rows — currently no other table references event rows, so this is safe\n- entity_type must be validated (\"issue\" or \"merge_request\") — panic or error on invalid\n- source_merge_request field contains an MR ref object, 
not an ID — extract .iid for DB column","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:57.242549Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:19:14.169437Z","closed_at":"2026-02-03T16:19:14.169233Z","close_reason":"Implemented upsert_state_events, upsert_label_events, upsert_milestone_events, count_events in src/core/events_db.rs. Uses savepoints for atomicity, LoreError::Database via ? operator for clean error handling.","compaction_level":0,"original_size":0,"labels":["db","gate-1","phase-b"],"dependencies":[{"issue_id":"bd-1uc","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:57.246078Z","created_by":"tayloreernisse"},{"issue_id":"bd-1uc","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T21:31:57.247258Z","created_by":"tayloreernisse"}]}
{"id":"bd-1ut","title":"[CP0] Final validation - tests, lint, typecheck","description":"## Background\n\nFinal validation ensures everything works together before marking CP0 complete. This is the integration gate - all unit tests, integration tests, lint, and type checking must pass. Manual smoke tests verify the full user experience.\n\nReference: docs/prd/checkpoint-0.md sections \"Definition of Done\", \"Manual Smoke Tests\"\n\n## Approach\n\n**Automated checks:**\n```bash\n# All tests pass\nnpm run test\n\n# TypeScript strict mode\nnpm run build # or: npx tsc --noEmit\n\n# ESLint with no errors\nnpm run lint\n```\n\n**Manual smoke tests (from PRD table):**\n\n| Command | Expected | Pass Criteria |\n|---------|----------|---------------|\n| `gi --help` | Command list | Shows all commands |\n| `gi version` | Version number | Shows installed version |\n| `gi init` | Interactive prompts | Creates valid config |\n| `gi init` (config exists) | Confirmation prompt | Warns before overwriting |\n| `gi init --force` | No prompt | Overwrites without asking |\n| `gi auth-test` | `Authenticated as @username` | Shows GitLab username |\n| `GITLAB_TOKEN=invalid gi auth-test` | Error message | Non-zero exit, clear error |\n| `gi doctor` | Status table | All required checks pass |\n| `gi doctor --json` | JSON object | Valid JSON, `success: true` |\n| `gi backup` | Backup path | Creates timestamped backup |\n| `gi sync-status` | No runs message | Stub output works |\n\n**Definition of Done gate items:**\n- [ ] `gi init` writes config to XDG path and validates projects against GitLab\n- [ ] `gi auth-test` succeeds with real PAT\n- [ ] `gi doctor` reports DB ok + GitLab ok\n- [ ] DB migrations apply; WAL + FK enabled; busy_timeout + synchronous set\n- [ ] App lock mechanism works (concurrent runs blocked)\n- [ ] All unit tests pass\n- [ ] All integration tests pass (mocked)\n- [ ] ESLint passes with no errors\n- [ ] TypeScript compiles with strict mode\n\n## Acceptance 
Criteria\n\n- [ ] `npm run test` exits 0 (all tests pass)\n- [ ] `npm run build` exits 0 (TypeScript compiles)\n- [ ] `npm run lint` exits 0 (no ESLint errors)\n- [ ] All 11 manual smoke tests pass\n- [ ] All 9 Definition of Done gate items verified\n\n## Files\n\nNo new files created. This bead verifies existing work.\n\n## TDD Loop\n\nThis IS the final verification step:\n\n```bash\n# Automated\nnpm run test\nnpm run build\nnpm run lint\n\n# Manual (requires GITLAB_TOKEN set with valid token)\ngi --help\ngi version\ngi init # go through setup\ngi auth-test\ngi doctor\ngi doctor --json | jq .success # should output true\ngi backup\ngi sync-status\ngi reset --confirm\ngi init # re-setup\n```\n\n## Edge Cases\n\n- Test coverage should be reasonable (aim for 80%+ on core modules)\n- Integration tests may flake on CI - check MSW setup\n- Manual tests require real GitLab token - document in README\n- ESLint may warn vs error - only errors block\n- TypeScript noImplicitAny catches missed types","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:52.078907Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:37:51.858558Z","closed_at":"2026-01-25T03:37:51.858474Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1ut","depends_on_id":"bd-1cb","type":"blocks","created_at":"2026-01-24T16:13:11.184261Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ut","depends_on_id":"bd-1gu","type":"blocks","created_at":"2026-01-24T16:13:11.168637Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ut","depends_on_id":"bd-1kh","type":"blocks","created_at":"2026-01-24T16:13:11.219042Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ut","depends_on_id":"bd-38e","type":"blocks","created_at":"2026-01-24T16:13:11.150286Z","created_by":"tayloreernisse"},{"issue_id":"bd-1ut","depends_on_id":"bd-3kj","type":"blocks","created_at":"2026-01-24T16:13:11.200998Z","created_by":"tayloreernisse"}]}
{"id":"bd-1v8","title":"Update robot-docs manifest with Phase B commands","description":"## Background\n\nThe robot-docs manifest is the agent self-discovery mechanism. It must include all Phase B commands so AI agents can discover and use timeline, file-history, trace, and count references.\n\n## Approach\n\nIn `src/main.rs` `handle_robot_docs()`, add entries to the `commands` JSON object:\n\n```rust\n\"timeline\": {\n \"description\": \"Show chronological timeline of events matching a keyword query\",\n \"flags\": [\"<QUERY>\", \"-p/--project\", \"--since\", \"--depth\", \"--expand-mentions\", \"--no-expand-mentions\", \"-n/--limit\"],\n \"example\": \"lore --robot timeline 'authentication' --since 30d --depth 2\"\n},\n\"file-history\": {\n \"description\": \"Show which MRs touched a file path, with rename chain resolution\",\n \"flags\": [\"<PATH>\", \"-p/--project\", \"--discussions\", \"--no-discussions\", \"--no-follow-renames\", \"--merged\", \"-n/--limit\"],\n \"example\": \"lore --robot file-history src/auth/oauth.rs --discussions\"\n},\n\"trace\": {\n \"description\": \"Trace decision chain: file -> MR -> issue -> discussions\",\n \"flags\": [\"<PATH>\", \"-p/--project\", \"--discussions\", \"--no-discussions\", \"-n/--limit\"],\n \"example\": \"lore --robot trace src/auth/oauth.rs --discussions\"\n},\n\"count references\": {\n \"description\": \"Count entity cross-references (closes, mentioned, related)\",\n \"flags\": [\"references (as entity arg)\"],\n \"example\": \"lore --robot count references\"\n}\n```\n\nAlso add a \"temporal_intelligence\" workflow:\n```rust\n\"temporal_intelligence\": [\n \"lore --robot timeline 'keyword' --since 30d\",\n \"lore --robot file-history src/path.rs --discussions\",\n \"lore --robot trace src/path.rs --discussions\"\n]\n```\n\n## Acceptance Criteria\n\n- [ ] `lore robot-docs` output includes timeline, file-history, trace commands\n- [ ] `lore robot-docs` output includes count references variant\n- [ ] 
temporal_intelligence workflow is present in workflows section\n- [ ] All flag descriptions match actual CLI flag names\n- [ ] Examples are valid, runnable commands\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/main.rs` (update handle_robot_docs commands JSON + workflows JSON)\n\n## TDD Loop\n\nRED: No unit test needed -- this is a JSON manifest.\n\nGREEN: Add the JSON entries.\n\nVERIFY: `cargo check --all-targets && lore robot-docs | jq '.data.commands.timeline'`\n\n## Edge Cases\n\n- Keep the JSON object alphabetically sorted for readability\n- Ensure the 'references' count command is documented as a value of the entity argument, not a separate command","status":"open","priority":3,"issue_type":"task","created_at":"2026-02-02T22:43:07.859092Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:52:59.297040Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1v8","depends_on_id":"bd-1ht","type":"parent-child","created_at":"2026-02-02T22:43:40.760196Z","created_by":"tayloreernisse"},{"issue_id":"bd-1v8","depends_on_id":"bd-2ez","type":"blocks","created_at":"2026-02-02T22:43:33.990140Z","created_by":"tayloreernisse"},{"issue_id":"bd-1v8","depends_on_id":"bd-2n4","type":"blocks","created_at":"2026-02-02T22:43:33.937157Z","created_by":"tayloreernisse"}]}
{"id":"bd-1x6","title":"Implement lore sync CLI command","description":"## Background\nThe sync command is the unified orchestrator for the full pipeline: ingest -> generate-docs -> embed. It replaces the need to run three separate commands. It acquires a lock, runs each stage sequentially, and reports combined results. Individual stages can be skipped via flags (--no-embed, --no-docs). The command is designed for cron/scheduled execution. Individual commands (`lore generate-docs`, `lore embed`) still exist for manual recovery and debugging.\n\n## Approach\nCreate `src/cli/commands/sync.rs` per PRD Section 6.4.\n\n**IMPORTANT: run_sync is async** (embed_documents and search_hybrid are async).\n\n**Key types (PRD-exact):**\n```rust\n#[derive(Debug, Serialize)]\npub struct SyncResult {\n pub issues_updated: usize,\n pub mrs_updated: usize,\n pub discussions_fetched: usize,\n pub documents_regenerated: usize,\n pub documents_embedded: usize,\n}\n\n#[derive(Debug, Default)]\npub struct SyncOptions {\n pub full: bool, // Reset cursors, fetch everything\n pub force: bool, // Override stale lock\n pub no_embed: bool, // Skip embedding step\n pub no_docs: bool, // Skip document regeneration\n}\n```\n\n**Core function (async, PRD-exact):**\n```rust\npub async fn run_sync(config: &Config, options: SyncOptions) -> Result<SyncResult>\n```\n\n**Pipeline (sequential steps per PRD):**\n1. Acquire app lock with heartbeat (via existing `src/core/lock.rs`)\n2. Ingest delta: fetch issues + MRs via cursor-based sync (calls existing ingestion orchestrator)\n - Each upserted entity marked dirty via `mark_dirty_tx(&tx)` inside ingestion transaction\n3. Process `pending_discussion_fetches` queue (bounded)\n - Discussion sweep uses CTE to capture stale IDs, then cascading deletes\n4. Regenerate documents from `dirty_sources` queue (unless --no-docs)\n5. Embed documents with changed content_hash (unless --no-embed; skipped gracefully if Ollama unavailable)\n6. 
Release lock, record sync_run\n\n**NOTE (PRD):** Rolling backfill window removed — the existing cursor + watermark design handles old issues with resumed activity. GitLab updates `updated_at` when new comments are added, so the cursor naturally picks up old issues that receive new activity.\n\n**CLI args (PRD-exact):**\n```rust\n#[derive(Args)]\npub struct SyncArgs {\n /// Reset cursors, fetch everything\n #[arg(long)]\n full: bool,\n /// Override stale lock\n #[arg(long)]\n force: bool,\n /// Skip embedding step\n #[arg(long)]\n no_embed: bool,\n /// Skip document regeneration\n #[arg(long)]\n no_docs: bool,\n}\n```\n\n**Human output:**\n```\nSync complete:\n Issues updated: 42\n MRs updated: 18\n Discussions fetched: 56\n Documents regenerated: 38\n Documents embedded: 38\n Elapsed: 2m 15s\n```\n\n**JSON output:**\n```json\n{\"ok\": true, \"data\": {...}, \"meta\": {\"elapsed_ms\": 135000}}\n```\n\n## Acceptance Criteria\n- [ ] Function is `async fn run_sync`\n- [ ] Takes `SyncOptions` struct (not separate params)\n- [ ] Returns `SyncResult` with flat fields (not nested sub-structs)\n- [ ] Full pipeline orchestrated: ingest -> discussion queue -> docs -> embed\n- [ ] --full resets cursors (passes through to ingest)\n- [ ] --force overrides stale sync lock\n- [ ] --no-embed skips embedding stage (Ollama not needed)\n- [ ] --no-docs skips document regeneration stage\n- [ ] Discussion queue processing bounded per run\n- [ ] Dirty sources marked inside ingestion transactions (via mark_dirty_tx)\n- [ ] Progress reporting: stage names + elapsed time\n- [ ] Lock acquired with heartbeat at start, released at end (even on error)\n- [ ] Embedding skipped gracefully if Ollama unavailable (warning, not error)\n- [ ] JSON summary in robot mode\n- [ ] Human-readable summary with elapsed time\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/cli/commands/sync.rs` — new file\n- `src/cli/commands/mod.rs` — add `pub mod sync;`\n- `src/cli/mod.rs` — add SyncArgs, wire up sync 
subcommand\n- `src/main.rs` — add sync command handler (async dispatch)\n\n## TDD Loop\nRED: Integration test requiring full pipeline\nGREEN: Implement run_sync orchestration (async)\nVERIFY: `cargo build && cargo test sync`\n\n## Edge Cases\n- Ollama unavailable + --no-embed not set: sync should NOT fail — embed stage logs warning, returns 0 embedded\n- Lock already held: error unless --force (and lock is stale)\n- No dirty sources after ingest: regeneration stage returns 0 (not error)\n- --full with large dataset: keyset pagination prevents OFFSET degradation","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:27:09.577782Z","created_by":"tayloreernisse","updated_at":"2026-01-30T18:05:34.676100Z","closed_at":"2026-01-30T18:05:34.676035Z","close_reason":"Sync CLI: async run_sync orchestrator with 4-stage pipeline (ingest issues, ingest MRs, generate-docs, embed), SyncOptions/SyncResult, --full/--force/--no-embed/--no-docs flags, graceful Ollama degradation, human+JSON output, clean build, all tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1x6","depends_on_id":"bd-1i2","type":"blocks","created_at":"2026-01-30T15:29:35.287132Z","created_by":"tayloreernisse"},{"issue_id":"bd-1x6","depends_on_id":"bd-1je","type":"blocks","created_at":"2026-01-30T15:29:35.250622Z","created_by":"tayloreernisse"},{"issue_id":"bd-1x6","depends_on_id":"bd-2sx","type":"blocks","created_at":"2026-01-30T15:29:35.179059Z","created_by":"tayloreernisse"},{"issue_id":"bd-1x6","depends_on_id":"bd-38q","type":"blocks","created_at":"2026-01-30T15:29:35.213566Z","created_by":"tayloreernisse"},{"issue_id":"bd-1x6","depends_on_id":"bd-3qs","type":"blocks","created_at":"2026-01-30T15:29:35.144296Z","created_by":"tayloreernisse"}]}
{"id":"bd-1y8","title":"Implement chunk ID encoding module","description":"## Background\nsqlite-vec uses a single integer rowid for embeddings. To store multiple chunks per document, we encode (document_id, chunk_index) into a single rowid using a multiplier. This module is shared between the embedding pipeline (encode on write) and vector search (decode on read). The encoding scheme supports up to 1000 chunks per document.\n\n## Approach\nCreate `src/embedding/chunk_ids.rs`:\n\n```rust\n/// Multiplier for encoding (document_id, chunk_index) into a single rowid.\n/// Supports up to 1000 chunks per document (32M chars at 32k/chunk).\npub const CHUNK_ROWID_MULTIPLIER: i64 = 1000;\n\n/// Encode (document_id, chunk_index) into a sqlite-vec rowid.\n///\n/// rowid = document_id * CHUNK_ROWID_MULTIPLIER + chunk_index\npub fn encode_rowid(document_id: i64, chunk_index: i64) -> i64 {\n document_id * CHUNK_ROWID_MULTIPLIER + chunk_index\n}\n\n/// Decode a sqlite-vec rowid back into (document_id, chunk_index).\npub fn decode_rowid(rowid: i64) -> (i64, i64) {\n let document_id = rowid / CHUNK_ROWID_MULTIPLIER;\n let chunk_index = rowid % CHUNK_ROWID_MULTIPLIER;\n (document_id, chunk_index)\n}\n```\n\nAlso create the parent module `src/embedding/mod.rs`:\n```rust\npub mod chunk_ids;\n// Later beads add: pub mod ollama; pub mod pipeline;\n```\n\nUpdate `src/lib.rs`: add `pub mod embedding;`\n\n## Acceptance Criteria\n- [ ] `encode_rowid(42, 0)` == 42000\n- [ ] `encode_rowid(42, 5)` == 42005\n- [ ] `decode_rowid(42005)` == (42, 5)\n- [ ] Roundtrip: decode(encode(doc_id, chunk_idx)) == (doc_id, chunk_idx) for all valid inputs\n- [ ] CHUNK_ROWID_MULTIPLIER is 1000\n- [ ] `cargo test chunk_ids` passes\n\n## Files\n- `src/embedding/chunk_ids.rs` — new file\n- `src/embedding/mod.rs` — new file (module root)\n- `src/lib.rs` — add `pub mod embedding;`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_encode_single_chunk` — encode(1, 0) == 1000\n- 
`test_encode_multi_chunk` — encode(1, 5) == 1005\n- `test_decode_roundtrip` — property test over range of doc_ids and chunk_indices\n- `test_decode_zero_chunk` — decode(42000) == (42, 0)\n- `test_multiplier_value` — assert CHUNK_ROWID_MULTIPLIER == 1000\nGREEN: Implement encode_rowid, decode_rowid\nVERIFY: `cargo test chunk_ids`\n\n## Edge Cases\n- chunk_index >= 1000: not expected (documents that large would be pathological), but no runtime panic — just incorrect decode. The embedding pipeline caps chunks well below this.\n- document_id = 0: valid (encode returns chunk_index directly)","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:26:34.060769Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:51:59.048910Z","closed_at":"2026-01-30T16:51:59.048843Z","close_reason":"Completed: chunk_ids module with encode_rowid/decode_rowid, CHUNK_ROWID_MULTIPLIER=1000, 6 tests pass","compaction_level":0,"original_size":0}
{"id":"bd-1yu","title":"[CP1] GitLab types for issues, discussions, notes","description":"Add TypeScript interfaces for GitLab API responses.\n\nTypes to add to src/gitlab/types.ts:\n- GitLabIssue: id, iid, project_id, title, description, state, timestamps, author, labels[], labels_details?, web_url\n- GitLabDiscussion: id (string), individual_note, notes[]\n- GitLabNote: id, type, body, author, timestamps, system, resolvable, resolved, resolved_by, resolved_at, position?\n\nFiles: src/gitlab/types.ts\nDone when: Types compile and match GitLab API documentation","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:00.558718Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.153996Z","deleted_at":"2026-01-25T15:21:35.153993Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-1yx","title":"Implement rename chain resolution for file-history","description":"## Background\n\nRename chain resolution is the core algorithm for Gate 4 (file-history). When a user queries \"show me the history of src/auth.rs\", they expect to see MRs that touched the file even before it was renamed -- e.g., when it was previously called src/authentication.rs. The rename chain finds all historical names of a file by following rename records in mr_file_changes.\n\n## Approach\n\nCreate `src/core/file_history.rs` with:\n\n```rust\nuse rusqlite::Connection;\nuse std::collections::HashSet;\n\n/// All file paths that are historically equivalent to the given path,\n/// including the path itself. Returned in chronological order (oldest first).\npub fn resolve_rename_chain(\n conn: &Connection,\n project_id: i64,\n path: &str,\n max_hops: usize, // default 10\n) -> Result<Vec<RenameStep>> {\n // BFS over mr_file_changes WHERE change_type = 'renamed'\n // Follow both directions:\n // - old_path -> new_path (file was renamed TO current name)\n // - new_path -> old_path (file was renamed FROM current name)\n // Bounded at max_hops with cycle detection via HashSet<String>\n}\n\n#[derive(Debug, Clone)]\npub struct RenameStep {\n pub path: String,\n pub renamed_from: Option<String>,\n pub merge_request_iid: i64,\n pub merge_date: i64, // ms epoch UTC\n}\n```\n\n### Algorithm\n\n```\nfn resolve_rename_chain(conn, project_id, path, max_hops):\n visited: HashSet<String> = {path}\n queue: VecDeque<String> = [path]\n steps: Vec<RenameStep> = []\n\n while let Some(current) = queue.pop_front() && steps.len() < max_hops:\n // Forward: find MRs where old_path = current (renamed FROM current)\n SELECT mfc.new_path, mfc.merge_request_id, mr.iid, mr.updated_at\n FROM mr_file_changes mfc\n JOIN merge_requests mr ON mr.id = mfc.merge_request_id\n WHERE mfc.project_id = ?1\n AND mfc.old_path = ?2\n AND mfc.change_type = 'renamed'\n\n for each row: if new_path not in visited, add to 
queue + visited + steps\n\n // Backward: find MRs where new_path = current (renamed TO current)\n SELECT mfc.old_path, mfc.merge_request_id, mr.iid, mr.updated_at\n FROM mr_file_changes mfc\n JOIN merge_requests mr ON mr.id = mfc.merge_request_id\n WHERE mfc.project_id = ?1\n AND mfc.new_path = ?2\n AND mfc.change_type = 'renamed'\n\n for each row: if old_path not in visited, add to queue + visited + steps\n\n // Sort steps chronologically\n steps.sort_by_key(|s| s.merge_date)\n return steps\n```\n\n### Usage\n\nThe rename chain produces a set of all paths. File-history then queries MRs touching ANY of those paths:\n\n```sql\nSELECT DISTINCT mr.* FROM mr_file_changes mfc\nJOIN merge_requests mr ON mr.id = mfc.merge_request_id\nWHERE mfc.project_id = ?1\n AND mfc.new_path IN (?2, ?3, ?4, ...)\nORDER BY mr.updated_at DESC\n```\n\nRegister in `src/core/mod.rs`: `pub mod file_history;`\n\n## Acceptance Criteria\n\n- [ ] `resolve_rename_chain()` follows renames in both directions\n- [ ] Cycles detected and avoided (same path never visited twice)\n- [ ] Bounded at max_hops (default 10) to prevent runaway\n- [ ] Returns paths in chronological order (oldest rename first)\n- [ ] Empty result (no renames found) returns vec with just the original path\n- [ ] Module registered in `src/core/mod.rs`\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/core/file_history.rs` (NEW)\n- `src/core/mod.rs` (add `pub mod file_history;`)\n\n## TDD Loop\n\nRED: Create `src/core/file_history.rs` with `#[cfg(test)] mod tests`:\n- `test_rename_chain_no_renames` - file with no renames returns just itself\n- `test_rename_chain_forward` - a.rs -> b.rs -> c.rs finds all three\n- `test_rename_chain_backward` - starting from c.rs also finds a.rs and b.rs\n- `test_rename_chain_cycle_detection` - a.rs -> b.rs -> a.rs terminates without infinite loop\n- `test_rename_chain_max_hops` - chain longer than max_hops is bounded\n\nThese 
tests need an in-memory SQLite DB with mr_file_changes table. Create a test helper that runs migration 015 (which creates mr_file_changes).\n\nGREEN: Implement BFS with visited set.\n\nVERIFY: `cargo test --lib -- file_history`\n\n## Edge Cases\n\n- File never renamed: return single-element vec containing original path\n- Circular rename (a->b->a): visited set prevents infinite loop\n- Cross-project renames: not supported (project_id scoped); document this limitation\n- Case sensitivity: file paths are case-sensitive (Linux default)\n- max_hops=0: return just the original path with no queries","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.985345Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:51:40.984081Z","compaction_level":0,"original_size":0,"labels":["gate-4","phase-b","query"],"dependencies":[{"issue_id":"bd-1yx","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.986730Z","created_by":"tayloreernisse"},{"issue_id":"bd-1yx","depends_on_id":"bd-1oo","type":"blocks","created_at":"2026-02-02T21:34:16.698782Z","created_by":"tayloreernisse"}]}
{"id":"bd-1yz","title":"Implement MR document extraction","description":"## Background\nMR documents are similar to issue documents but include source/target branch information in the header. The extractor queries merge_requests and mr_labels tables. Like issue extraction, it produces a DocumentData struct for the regeneration pipeline.\n\n## Approach\nImplement `extract_mr_document()` in `src/documents/extractor.rs`:\n\n```rust\n/// Extract a searchable document from a merge request.\n/// Returns None if the MR has been deleted from the DB.\npub fn extract_mr_document(conn: &Connection, mr_id: i64) -> Result<Option<DocumentData>>\n```\n\n**SQL queries (from PRD Section 2.2):**\n```sql\n-- Main entity\nSELECT m.id, m.iid, m.title, m.description, m.state, m.author_username,\n m.source_branch, m.target_branch,\n m.created_at, m.updated_at, m.web_url,\n p.path_with_namespace, p.id AS project_id\nFROM merge_requests m\nJOIN projects p ON p.id = m.project_id\nWHERE m.id = ?\n\n-- Labels\nSELECT l.name FROM mr_labels ml\nJOIN labels l ON l.id = ml.label_id\nWHERE ml.merge_request_id = ?\nORDER BY l.name\n```\n\n**Document format:**\n```\n[[MergeRequest]] !456: Implement JWT authentication\nProject: group/project-one\nURL: https://gitlab.example.com/group/project-one/-/merge_requests/456\nLabels: [\"feature\", \"auth\"]\nState: opened\nAuthor: @johndoe\nSource: feature/jwt-auth -> main\n\n--- Description ---\n\nThis MR implements JWT-based authentication...\n```\n\n**Key difference from issues:** The `Source:` line with `source_branch -> target_branch`.\n\n## Acceptance Criteria\n- [ ] Deleted MR returns Ok(None)\n- [ ] MR document has `[[MergeRequest]]` prefix with `!` before iid (not `#`)\n- [ ] Source line shows `source_branch -> target_branch`\n- [ ] Labels sorted alphabetically in JSON array\n- [ ] content_hash computed from full content_text\n- [ ] labels_hash computed from sorted labels\n- [ ] paths is empty (MR-level docs don't have DiffNote paths; those are on 
discussion docs)\n- [ ] `cargo test extract_mr` passes\n\n## Files\n- `src/documents/extractor.rs` — implement `extract_mr_document()`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_mr_document_format` — verify header matches PRD template with Source line\n- `test_mr_not_found` — returns Ok(None)\n- `test_mr_no_description` — header only\n- `test_mr_branch_info` — Source line correct\nGREEN: Implement extract_mr_document with SQL queries\nVERIFY: `cargo test extract_mr`\n\n## Edge Cases\n- MR with NULL description: skip \"--- Description ---\" section\n- MR with NULL source_branch or target_branch: omit Source line (shouldn't happen in practice)\n- Draft MRs: state field captures this, no special handling needed","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:25:45.521703Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:30:04.308781Z","closed_at":"2026-01-30T17:30:04.308598Z","close_reason":"Implemented extract_mr_document() with Source line, PRD format, and 5 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1yz","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:15.749264Z","created_by":"tayloreernisse"},{"issue_id":"bd-1yz","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:15.814729Z","created_by":"tayloreernisse"}]}
{"id":"bd-1zj6","title":"OBSERV: Enrich robot JSON meta with run_id and stages","description":"## Background\nRobot JSON currently has a flat meta.elapsed_ms. This enriches it with run_id and a stages array, making every lore --robot sync output a complete performance profile.\n\n## Approach\nThe robot JSON output is built in src/cli/commands/sync.rs. The current SyncResult (line 15-25) is serialized into the data field. The meta field is built alongside it.\n\n1. Find or create the SyncMeta struct (likely near SyncResult). Add fields:\n```rust\n#[derive(Debug, Serialize)]\nstruct SyncMeta {\n run_id: String,\n elapsed_ms: u64,\n stages: Vec<StageTiming>,\n}\n```\n\n2. After run_sync() completes, extract timings from MetricsLayer:\n```rust\nlet stages = metrics_handle.extract_timings();\nlet meta = SyncMeta {\n run_id: run_id.to_string(),\n elapsed_ms: start.elapsed().as_millis() as u64,\n stages,\n};\n```\n\n3. Build the JSON envelope:\n```rust\nlet output = serde_json::json!({\n \"ok\": true,\n \"data\": result,\n \"meta\": meta,\n});\n```\n\nThe metrics_handle (Arc<MetricsLayer>) must be passed from main.rs to the command handler. This requires adding a parameter to handle_sync_cmd() and run_sync(), or using a global. 
Prefer parameter passing.\n\nSame pattern for standalone ingest: add stages to IngestMeta.\n\n## Acceptance Criteria\n- [ ] lore --robot sync output includes meta.run_id (string, 8 hex chars)\n- [ ] lore --robot sync output includes meta.stages (array of StageTiming)\n- [ ] meta.elapsed_ms still present (total wall clock time)\n- [ ] Each stage has name, elapsed_ms, items_processed at minimum\n- [ ] Top-level stages have sub_stages when applicable\n- [ ] lore --robot ingest also includes run_id and stages\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/sync.rs (add SyncMeta struct, wire extract_timings)\n- src/cli/commands/ingest.rs (same for standalone ingest)\n- src/main.rs (pass metrics_handle to command handlers)\n\n## TDD Loop\nRED: test_sync_meta_includes_stages (run robot-mode sync, parse JSON, assert meta.stages is array)\nGREEN: Add SyncMeta, extract timings, include in JSON output\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Empty stages: if sync runs with --no-docs --no-embed, some stages won't exist. stages array is shorter, not padded.\n- extract_timings() called before root span closes: returns incomplete tree. 
Must call AFTER run_sync returns (span is dropped on function exit).\n- metrics_handle clone: MetricsLayer uses Arc internally, clone is cheap (reference count increment).","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:32.062410Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:31:11.073580Z","closed_at":"2026-02-04T17:31:11.073534Z","close_reason":"Wired MetricsLayer into subscriber stack (all 4 branches), added run_id to SyncResult, enriched SyncMeta with run_id + stages Vec<StageTiming>, updated print_sync_json to accept MetricsLayer and extract timings","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1zj6","depends_on_id":"bd-34ek","type":"blocks","created_at":"2026-02-04T15:55:20.085372Z","created_by":"tayloreernisse"},{"issue_id":"bd-1zj6","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:32.063354Z","created_by":"tayloreernisse"}]}
{"id":"bd-1zwv","title":"Display assignees, due_date, and milestone in lore issues <id> output","description":"## Background\nThe `lore issues <id>` command displays issue details but omits key metadata that exists in the database: assignees, due dates, and milestones. Users need this information to understand issue context without opening GitLab.\n\n**System fit**: This data is already ingested during issue sync (migration 005) but the show command never queries it.\n\n## Approach\n\nAll changes in `src/cli/commands/show.rs`:\n\n### 1. Update IssueRow struct (line ~119)\nAdd fields to internal row struct:\n```rust\nstruct IssueRow {\n // ... existing 10 fields ...\n due_date: Option<String>, // NEW\n milestone_title: Option<String>, // NEW\n}\n```\n\n### 2. Update find_issue() SQL (line ~137)\nExtend SELECT:\n```sql\nSELECT i.id, i.iid, i.title, i.description, i.state, i.author_username,\n i.created_at, i.updated_at, i.web_url, p.path_with_namespace,\n i.due_date, i.milestone_title -- ADD THESE\nFROM issues i ...\n```\n\nUpdate row mapping to extract columns 10 and 11.\n\n### 3. Add get_issue_assignees() (after get_issue_labels ~line 189)\n```rust\nfn get_issue_assignees(conn: &Connection, issue_id: i64) -> Result<Vec<String>> {\n let mut stmt = conn.prepare(\n \"SELECT username FROM issue_assignees WHERE issue_id = ? ORDER BY username\"\n )?;\n let assignees = stmt\n .query_map([issue_id], |row| row.get(0))?\n .collect::<std::result::Result<Vec<_>, _>>()?;\n Ok(assignees)\n}\n```\n\n### 4. Update IssueDetail struct (line ~59)\n```rust\npub struct IssueDetail {\n // ... existing 12 fields ...\n pub assignees: Vec<String>, // NEW\n pub due_date: Option<String>, // NEW\n pub milestone: Option<String>, // NEW\n}\n```\n\n### 5. Update IssueDetailJson struct (line ~770)\nAdd same 3 fields with identical types.\n\n### 6. 
Update run_show_issue() (line ~89)\n```rust\nlet assignees = get_issue_assignees(&conn, issue.id)?;\n// In return struct:\nassignees,\ndue_date: issue.due_date,\nmilestone: issue.milestone_title,\n```\n\n### 7. Update print_show_issue() (line ~533, after Author line ~548)\n```rust\nif !issue.assignees.is_empty() {\n println!(\"Assignee{}: {}\",\n if issue.assignees.len() > 1 { \"s\" } else { \"\" },\n issue.assignees.iter().map(|a| format!(\"@{}\", a)).collect::<Vec<_>>().join(\", \"));\n}\nif let Some(due) = &issue.due_date {\n println!(\"Due: {}\", due);\n}\nif let Some(ms) = &issue.milestone {\n println!(\"Milestone: {}\", ms);\n}\n```\n\n### 8. Update From<&IssueDetail> for IssueDetailJson (line ~799)\n```rust\nassignees: issue.assignees.clone(),\ndue_date: issue.due_date.clone(),\nmilestone: issue.milestone.clone(),\n```\n\n## Acceptance Criteria\n- [ ] `cargo test test_get_issue_assignees` passes (3 tests)\n- [ ] `lore issues <id>` shows Assignees line when assignees exist\n- [ ] `lore issues <id>` shows Due line when due_date set\n- [ ] `lore issues <id>` shows Milestone line when milestone set\n- [ ] `lore -J issues <id>` includes assignees/due_date/milestone in JSON\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n- `src/cli/commands/show.rs` - ALL changes\n\n## TDD Loop\n\n**RED** - Add tests to `src/cli/commands/show.rs` `#[cfg(test)] mod tests`:\n\n```rust\nuse crate::core::db::{create_connection, run_migrations};\nuse std::path::Path;\n\nfn setup_test_db() -> Connection {\n let conn = create_connection(Path::new(\":memory:\")).unwrap();\n run_migrations(&conn).unwrap();\n conn\n}\n\n#[test]\nfn test_get_issue_assignees_empty() {\n let conn = setup_test_db();\n // seed project + issue with no assignees\n let result = get_issue_assignees(&conn, 1).unwrap();\n assert!(result.is_empty());\n}\n\n#[test]\nfn test_get_issue_assignees_multiple_sorted() {\n let conn = setup_test_db();\n // seed with alice, bob\n let result = 
get_issue_assignees(&conn, 1).unwrap();\n assert_eq!(result, vec![\"alice\", \"bob\"]); // alphabetical\n}\n\n#[test]\nfn test_get_issue_assignees_single() {\n let conn = setup_test_db();\n // seed with charlie only\n let result = get_issue_assignees(&conn, 1).unwrap();\n assert_eq!(result, vec![\"charlie\"]);\n}\n```\n\n**GREEN** - Implement get_issue_assignees() and struct updates\n\n**VERIFY**: `cargo test test_get_issue_assignees && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n- Empty assignees list -> don't print Assignees line\n- NULL due_date -> don't print Due line \n- NULL milestone_title -> don't print Milestone line\n- Single vs multiple assignees -> \"Assignee\" vs \"Assignees\" grammar","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-05T15:16:00.105830Z","created_by":"tayloreernisse","updated_at":"2026-02-05T15:26:08.147202Z","closed_at":"2026-02-05T15:26:08.147154Z","close_reason":"Implemented: assignees, due_date, milestone now display in lore issues <id>. All 7 new tests pass.","compaction_level":0,"original_size":0,"labels":["ISSUE"]}
{"id":"bd-208","title":"[CP1] Issue ingestion module","description":"## Background\n\nThe issue ingestion module fetches and stores issues with cursor-based incremental sync. It is the primary data ingestion component, establishing the pattern reused for MR ingestion in CP2. The module handles tuple-cursor semantics, raw payload storage, label extraction, and tracking which issues need discussion sync.\n\n## Approach\n\n### Module: src/ingestion/issues.rs\n\n### Key Structs\n\n```rust\n#[derive(Debug, Default)]\npub struct IngestIssuesResult {\n pub fetched: usize,\n pub upserted: usize,\n pub labels_created: usize,\n pub issues_needing_discussion_sync: Vec<IssueForDiscussionSync>,\n}\n\n#[derive(Debug, Clone)]\npub struct IssueForDiscussionSync {\n pub local_issue_id: i64,\n pub iid: i64,\n pub updated_at: i64, // ms epoch\n}\n```\n\n### Main Function\n\n```rust\npub async fn ingest_issues(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n project_id: i64, // Local DB project ID\n gitlab_project_id: i64,\n) -> Result<IngestIssuesResult>\n```\n\n### Logic (Step by Step)\n\n1. **Get current cursor** from sync_cursors table:\n```sql\nSELECT updated_at_cursor, tie_breaker_id\nFROM sync_cursors\nWHERE project_id = ? AND resource_type = 'issues'\n```\n\n2. **Call pagination method** with cursor rewind:\n```rust\nlet issues_stream = client.paginate_issues(\n gitlab_project_id,\n cursor.updated_at_cursor,\n config.sync.cursor_rewind_seconds,\n);\n```\n\n3. **Apply local filtering** for tuple cursor semantics:\n```rust\n// Skip if issue.updated_at < cursor_updated_at\n// Skip if issue.updated_at == cursor_updated_at AND issue.gitlab_id <= cursor_gitlab_id\nfn passes_cursor_filter(issue: &GitLabIssue, cursor: &SyncCursor) -> bool {\n if issue.updated_at < cursor.updated_at_cursor {\n return false;\n }\n if issue.updated_at == cursor.updated_at_cursor \n && issue.gitlab_id <= cursor.tie_breaker_id {\n return false;\n }\n true\n}\n```\n\n4. 
**For each issue passing filter**:\n```rust\n// Begin transaction (unchecked_transaction for rusqlite)\nlet tx = conn.unchecked_transaction()?;\n\n// Store raw payload (compressed based on config)\nlet payload_id = store_raw_payload(&tx, &issue_json, config.storage.compress_raw_payloads)?;\n\n// Transform and upsert issue\nlet issue_row = transform_issue(&issue)?;\nupsert_issue(&tx, &issue_row, project_id, payload_id)?;\nlet local_issue_id = get_local_issue_id(&tx, project_id, issue.iid)?;\n\n// Clear existing label links (stale removal!)\ntx.execute(\"DELETE FROM issue_labels WHERE issue_id = ?\", [local_issue_id])?;\n\n// Extract and upsert labels\nfor label_name in &issue_row.label_names {\n let label_id = upsert_label(&tx, project_id, label_name)?;\n link_issue_label(&tx, local_issue_id, label_id)?;\n}\n\ntx.commit()?;\n```\n\n5. **Incremental cursor update** every 100 issues:\n```rust\nif batch_count % 100 == 0 {\n update_sync_cursor(conn, project_id, \"issues\", last_updated_at, last_gitlab_id)?;\n}\n```\n\n6. **Final cursor update** after all issues processed\n\n7. **Determine issues needing discussion sync**:\n```sql\nSELECT id, iid, updated_at\nFROM issues\nWHERE project_id = ?\n AND updated_at > COALESCE(discussions_synced_for_updated_at, 0)\n```\n\n### Helper Functions\n\n```rust\nfn store_raw_payload(conn, json: &Value, compress: bool) -> Result<i64>\nfn upsert_issue(conn, issue: &IssueRow, project_id: i64, payload_id: i64) -> Result<()>\nfn get_local_issue_id(conn, project_id: i64, iid: i64) -> Result<i64>\nfn upsert_label(conn, project_id: i64, name: &str) -> Result<i64>\nfn link_issue_label(conn, issue_id: i64, label_id: i64) -> Result<()>\nfn update_sync_cursor(conn, project_id: i64, resource: &str, updated_at: i64, gitlab_id: i64) -> Result<()>\n```\n\n### Critical Invariant\n\nStale label links MUST be removed on resync. The \"DELETE then INSERT\" pattern ensures GitLab reality is reflected locally. 
If an issue had labels [A, B] and now has [A, C], the B link must be removed.\n\n## Acceptance Criteria\n\n- [ ] `ingest_issues` returns IngestIssuesResult with all counts\n- [ ] Cursor fetched from sync_cursors at start\n- [ ] Cursor rewind applied before API call\n- [ ] Local filtering skips already-processed issues\n- [ ] Each issue wrapped in transaction for atomicity\n- [ ] Raw payload stored with correct compression\n- [ ] Issue upserted (INSERT OR REPLACE pattern)\n- [ ] Existing label links deleted before new links inserted\n- [ ] Labels upserted (INSERT OR IGNORE by project+name)\n- [ ] Cursor updated every 100 issues (crash recovery)\n- [ ] Final cursor update after all issues\n- [ ] issues_needing_discussion_sync populated correctly\n\n## Files\n\n- src/ingestion/mod.rs (add `pub mod issues;`)\n- src/ingestion/issues.rs (create)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/issue_ingestion_tests.rs\n#[tokio::test] async fn ingests_issues_from_stream()\n#[tokio::test] async fn applies_cursor_filter_correctly()\n#[tokio::test] async fn updates_cursor_every_100_issues()\n#[tokio::test] async fn stores_raw_payload_for_each_issue()\n#[tokio::test] async fn upserts_issues_correctly()\n\n// tests/label_linkage_tests.rs\n#[tokio::test] async fn extracts_and_stores_labels()\n#[tokio::test] async fn removes_stale_label_links_on_resync()\n#[tokio::test] async fn handles_empty_labels_array()\n\n// tests/discussion_eligibility_tests.rs\n#[tokio::test] async fn identifies_issues_needing_discussion_sync()\n#[tokio::test] async fn skips_issues_with_current_watermark()\n```\n\nGREEN: Implement ingest_issues with all helper functions\n\nVERIFY: `cargo test issue_ingestion && cargo test label_linkage && cargo test discussion_eligibility`\n\n## Edge Cases\n\n- Empty issues stream - return result with all zeros\n- Cursor at epoch 0 - fetch all issues (no filtering)\n- Issue with no labels - empty Vec, no label links created\n- Issue with 50+ labels - all should be linked\n- 
Crash mid-batch - cursor at last 100-boundary, some issues re-fetched\n- Label already exists - upsert via INSERT OR IGNORE\n- Same issue fetched twice (due to rewind) - upsert handles it","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.245404Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:52:38.003964Z","closed_at":"2026-01-25T22:52:38.003868Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-208","depends_on_id":"bd-2iq","type":"blocks","created_at":"2026-01-25T17:04:05.425224Z","created_by":"tayloreernisse"},{"issue_id":"bd-208","depends_on_id":"bd-3nd","type":"blocks","created_at":"2026-01-25T17:04:05.450341Z","created_by":"tayloreernisse"},{"issue_id":"bd-208","depends_on_id":"bd-xhz","type":"blocks","created_at":"2026-01-25T17:04:05.473203Z","created_by":"tayloreernisse"}]}
{"id":"bd-20e","title":"Define TimelineEvent model and TimelineEventType enum","description":"## Background\n\nThe TimelineEvent model is the foundational data type for Gate 3's timeline feature. All pipeline stages (seed, expand, collect, interleave) produce or consume TimelineEvents. This must be defined first because every downstream bead (bd-32q, bd-ypa, bd-3as, bd-dty, bd-2f2) depends on these types.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 3.3 (Event Model).\n\n## Approach\n\nCreate `src/core/timeline.rs` with the following types:\n\n```rust\n/// The core timeline event. All pipeline stages produce or consume these.\n/// Spec ref: Section 3.3 \"Event Model\"\n#[derive(Debug, Clone, Serialize)]\npub struct TimelineEvent {\n pub timestamp: i64, // ms epoch UTC\n pub entity_type: &'static str, // \"issue\" | \"merge_request\"\n pub entity_id: i64, // local DB id (internal, not in JSON output)\n pub entity_iid: i64,\n pub project_path: String,\n pub event_type: TimelineEventType,\n pub summary: String, // human-readable one-liner\n pub actor: Option<String>, // username or None for system\n pub url: Option<String>, // web URL for the event source (spec requirement)\n pub is_seed: bool, // true if from seed phase, false if expanded\n}\n\n/// Per spec Section 3.3. Serde tagged enum for JSON output.\n/// Note: internal struct uses &'static str for entity_type (perf optimization),\n/// but JSON output serializes as String. 
entity_id is internal-only (not in JSON).\n#[derive(Debug, Clone, Serialize)]\n#[serde(tag = \"kind\", rename_all = \"snake_case\")]\npub enum TimelineEventType {\n Created,\n StateChanged { state: String }, // spec: just the target state\n LabelAdded { label: String },\n LabelRemoved { label: String },\n MilestoneSet { milestone: String },\n MilestoneRemoved { milestone: String },\n Merged, // spec: unit variant\n NoteEvidence {\n note_id: i64, // spec: required\n snippet: String, // first ~200 chars of matching note body\n discussion_id: Option<i64>, // spec: optional (some notes lack discussions)\n },\n CrossReferenced { target: String }, // spec: compact target ref like \"\\!567\" or \"#234\"\n}\n\n/// Internal entity reference used across pipeline stages.\n#[derive(Debug, Clone)]\npub struct EntityRef {\n pub entity_type: &'static str,\n pub entity_id: i64,\n pub entity_iid: i64,\n pub project_path: String,\n}\n\n/// An entity discovered via BFS expansion.\n/// Spec ref: Section 3.5 \"expanded_entities\" JSON structure.\n#[derive(Debug, Clone, Serialize)]\npub struct ExpandedEntityRef {\n pub entity_ref: EntityRef,\n pub depth: u32,\n /// Provenance: which seed entity + edge type led to discovery.\n /// JSON output uses nested \"via\" object per spec Section 3.5:\n /// { \"from\": { \"type\", \"iid\", \"project\" }, \"reference_type\", \"source_method\" }\n pub via_from: EntityRef, // the entity that referenced this one\n pub via_reference_type: String, // \"closes\", \"mentioned\", \"related\"\n pub via_source_method: String, // \"api\", \"note_parse\", etc.\n}\n\n/// Reference that points to an unsynced external entity.\n/// Spec ref: Section 3.5 \"unresolved_references\" JSON structure.\n#[derive(Debug, Clone, Serialize)]\npub struct UnresolvedRef {\n pub source: EntityRef, // entity containing the reference\n pub target_project: Option<String>, // e.g. 
\"group/other-repo\"\n pub target_type: String, // \"issue\" | \"merge_request\"\n pub target_iid: i64,\n pub reference_type: String, // \"closes\", \"mentioned\", \"related\"\n}\n\n/// Complete result from the timeline pipeline.\n#[derive(Debug, Clone, Serialize)]\npub struct TimelineResult {\n pub query: String,\n pub events: Vec<TimelineEvent>,\n pub seed_entities: Vec<EntityRef>,\n pub expanded_entities: Vec<ExpandedEntityRef>,\n pub unresolved_references: Vec<UnresolvedRef>,\n}\n```\n\nImplement `Ord` on `TimelineEvent` for chronological sort: primary key `timestamp`, tiebreak by `entity_id` then `event_type` discriminant (use a `sort_key()` method returning `(i64, i64, u8)`).\n\nRegister in `src/core/mod.rs`: `pub mod timeline;`\n\n## Acceptance Criteria\n\n- [ ] `src/core/timeline.rs` compiles with no warnings\n- [ ] `TimelineEvent` has `url: Option<String>` field (spec Section 3.3)\n- [ ] `TimelineEventType` has exactly 9 variants matching spec Section 3.3\n- [ ] `StateChanged` uses `{ state: String }` (not from/to -- spec Section 3.3)\n- [ ] `Merged` is a unit variant (no inner fields -- spec Section 3.3)\n- [ ] `NoteEvidence` has `note_id: i64`, `snippet: String`, `discussion_id: Option<i64>` (spec Section 3.3)\n- [ ] `CrossReferenced` has `{ target: String }` (compact ref like \"\\!567\" -- spec Section 3.3)\n- [ ] `ExpandedEntityRef` has `via_from`, `via_reference_type`, `via_source_method` fields (spec Section 3.5)\n- [ ] `UnresolvedRef` has `source` entity, `target_project`, `target_type`, `target_iid`, `reference_type` (spec Section 3.5)\n- [ ] `TimelineEvent` derives `Serialize` for downstream JSON output\n- [ ] `EntityRef` is `Clone + Debug` (needed in BFS expand phase)\n- [ ] `TimelineResult` contains all 5 fields (query, events, seed_entities, expanded_entities, unresolved_references)\n- [ ] `Ord` impl on `TimelineEvent` sorts by (timestamp, entity_id, event_type discriminant)\n- [ ] Module registered in `src/core/mod.rs`\n- [ ] `cargo check 
--all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/core/timeline.rs` (NEW)\n- `src/core/mod.rs` (add `pub mod timeline;`)\n\n## TDD Loop\n\nRED: Create `src/core/timeline.rs` with `#[cfg(test)] mod tests` containing:\n- `test_timeline_event_sort_by_timestamp` - events sort chronologically\n- `test_timeline_event_sort_tiebreak` - same-timestamp events sort stably by entity_id then event_type\n- `test_timeline_event_type_serializes_tagged` - serde JSON output uses `kind` tag\n- `test_state_changed_serializes_state_only` - `{\"kind\":\"state_changed\",\"state\":\"closed\"}`\n- `test_note_evidence_has_note_id` - note_id present in serialized output\n\nGREEN: Implement the types and Ord trait.\n\nVERIFY: `cargo test --lib -- timeline`\n\n## Edge Cases\n\n- Ensure Ord is consistent: a.cmp(b) must never panic for any valid TimelineEvent\n- NoteEvidence snippet should be truncated to 200 chars at construction, not in the type itself\n- EntityRef's entity_type uses &'static str not String to avoid allocations in hot BFS loop\n- url field: constructed from project_path + entity_type + iid pattern; None for entities without web_url","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:08.569126Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:09:03.146239Z","compaction_level":0,"original_size":0,"labels":["gate-3","phase-b","types"],"dependencies":[{"issue_id":"bd-20e","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:08.573079Z","created_by":"tayloreernisse"}]}
{"id":"bd-20h","title":"Implement MR discussion ingestion module","description":"## Background\nMR discussion ingestion with critical atomicity guarantees. Parse notes BEFORE destructive DB operations to prevent data loss. Watermark ONLY advanced on full success.\n\n## Approach\nCreate `src/ingestion/mr_discussions.rs` with:\n1. `IngestMrDiscussionsResult` - Per-MR stats\n2. `ingest_mr_discussions()` - Main function with atomicity guarantees\n3. Upsert + sweep pattern for notes (not delete-all-then-insert)\n4. Sync health telemetry for debugging failures\n\n## Files\n- `src/ingestion/mr_discussions.rs` - New module\n- `tests/mr_discussion_ingestion_tests.rs` - Integration tests\n\n## Acceptance Criteria\n- [ ] `IngestMrDiscussionsResult` has: discussions_fetched, discussions_upserted, notes_upserted, notes_skipped_bad_timestamp, diffnotes_count, pagination_succeeded\n- [ ] `ingest_mr_discussions()` returns `Result<IngestMrDiscussionsResult>`\n- [ ] CRITICAL: Notes parsed BEFORE any DELETE operations\n- [ ] CRITICAL: Watermark NOT advanced if `pagination_succeeded == false`\n- [ ] CRITICAL: Watermark NOT advanced if any note parse fails\n- [ ] Upsert + sweep pattern using `last_seen_at`\n- [ ] Stale discussions/notes removed only on full success\n- [ ] Selective raw payload storage (skip system notes without position)\n- [ ] Sync health telemetry recorded on failure\n- [ ] `does_not_advance_discussion_watermark_on_partial_failure` test passes\n- [ ] `atomic_note_replacement_preserves_data_on_parse_failure` test passes\n\n## TDD Loop\nRED: `cargo test does_not_advance_watermark` -> test fails\nGREEN: Add ingestion with atomicity guarantees\nVERIFY: `cargo test mr_discussion_ingestion`\n\n## Main Function\n```rust\npub async fn ingest_mr_discussions(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n project_id: i64,\n gitlab_project_id: i64,\n mr_iid: i64,\n local_mr_id: i64,\n mr_updated_at: i64,\n) -> Result<IngestMrDiscussionsResult>\n```\n\n## 
CRITICAL: Atomic Note Replacement\n```rust\n// Record sync start time for sweep\nlet run_seen_at = now_ms();\n\nwhile let Some(discussion_result) = stream.next().await {\n let discussion = match discussion_result {\n Ok(d) => d,\n Err(e) => {\n result.pagination_succeeded = false;\n break; // Stop but don't advance watermark\n }\n };\n \n // CRITICAL: Parse BEFORE destructive operations\n let notes = match transform_notes_with_diff_position(&discussion, project_id) {\n Ok(notes) => notes,\n Err(e) => {\n warn!(\"Note transform failed; preserving existing notes\");\n result.notes_skipped_bad_timestamp += discussion.notes.len();\n result.pagination_succeeded = false;\n continue; // Skip this discussion, don't delete existing\n }\n };\n \n // Only NOW start transaction (after parse succeeded)\n let tx = conn.unchecked_transaction()?;\n \n // Upsert discussion with run_seen_at\n // Upsert notes with run_seen_at (not delete-all)\n \n tx.commit()?;\n}\n```\n\n## Stale Data Sweep (only on success)\n```rust\nif result.pagination_succeeded {\n // Sweep stale discussions\n conn.execute(\n \"DELETE FROM discussions\n WHERE project_id = ? AND merge_request_id = ?\n AND last_seen_at < ?\",\n params![project_id, local_mr_id, run_seen_at],\n )?;\n \n // Sweep stale notes\n conn.execute(\n \"DELETE FROM notes\n WHERE discussion_id IN (\n SELECT id FROM discussions\n WHERE project_id = ? 
AND merge_request_id = ?\n )\n AND last_seen_at < ?\",\n params![project_id, local_mr_id, run_seen_at],\n )?;\n}\n```\n\n## Watermark Update (ONLY on success)\n```rust\nif result.pagination_succeeded {\n mark_discussions_synced(conn, local_mr_id, mr_updated_at)?;\n clear_sync_health_error(conn, local_mr_id)?;\n} else {\n record_sync_health_error(conn, local_mr_id, \"Pagination incomplete or parse failure\")?;\n warn!(\"Watermark NOT advanced; will retry on next sync\");\n}\n```\n\n## Selective Payload Storage\n```rust\n// Only store payload for DiffNotes and non-system notes\nlet should_store_note_payload =\n !note.is_system() ||\n note.position_new_path().is_some() ||\n note.position_old_path().is_some();\n```\n\n## Integration Tests (CRITICAL)\n```rust\n#[tokio::test]\nasync fn does_not_advance_discussion_watermark_on_partial_failure() {\n // Setup: MR with updated_at > discussions_synced_for_updated_at\n // Mock: Page 1 returns OK, Page 2 returns 500\n // Assert: discussions_synced_for_updated_at unchanged\n}\n\n#[tokio::test]\nasync fn does_not_advance_discussion_watermark_on_note_parse_failure() {\n // Setup: Existing notes in DB\n // Mock: Discussion with note having invalid created_at\n // Assert: Original notes preserved, watermark unchanged\n}\n\n#[tokio::test]\nasync fn atomic_note_replacement_preserves_data_on_parse_failure() {\n // Setup: Discussion with 3 valid notes\n // Mock: Updated discussion where note 2 has bad timestamp\n // Assert: All 3 original notes still in DB\n}\n```\n\n## Edge Cases\n- HTTP error mid-pagination: preserve existing data, log error, no watermark advance\n- Invalid note timestamp: skip discussion, preserve existing notes\n- System notes without position: don't store raw payload (saves space)\n- Empty discussion: still upsert discussion record, no 
notes","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:42.335714Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:22:43.207057Z","closed_at":"2026-01-27T00:22:43.206996Z","close_reason":"Implemented MR discussion ingestion module with full atomicity guarantees:\n- IngestMrDiscussionsResult with all required fields\n- parse-before-destructive pattern (transform notes before DB ops)\n- Upsert + sweep pattern with last_seen_at timestamps\n- Watermark advanced ONLY on full pagination success\n- Selective payload storage (skip system notes without position)\n- Sync health telemetry for failure debugging\n- All 163 tests passing","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-20h","depends_on_id":"bd-3ir","type":"blocks","created_at":"2026-01-26T22:08:54.649094Z","created_by":"tayloreernisse"},{"issue_id":"bd-20h","depends_on_id":"bd-3j6","type":"blocks","created_at":"2026-01-26T22:08:54.686066Z","created_by":"tayloreernisse"},{"issue_id":"bd-20h","depends_on_id":"bd-iba","type":"blocks","created_at":"2026-01-26T22:08:54.722746Z","created_by":"tayloreernisse"}]}
{"id":"bd-221","title":"Create migration 008_fts5.sql","description":"## Background\nFTS5 (Full-Text Search 5) provides the lexical search backbone for Gate A. The virtual table + triggers keep the FTS index in sync with the documents table automatically. This migration must be applied AFTER migration 007 (documents table exists). The trigger design handles NULL titles via COALESCE and only rebuilds the FTS entry when searchable text actually changes (not metadata-only updates).\n\n## Approach\nCreate `migrations/008_fts5.sql` with the exact SQL from PRD Section 1.2:\n\n1. **Virtual table:** `documents_fts` using FTS5 with porter stemmer, prefix indexes (2,3,4), external content backed by `documents` table\n2. **Insert trigger:** `documents_ai` — inserts into FTS on document insert, uses COALESCE(title, '') for NULL safety\n3. **Delete trigger:** `documents_ad` — removes from FTS on document delete using the FTS5 delete command syntax\n4. **Update trigger:** `documents_au` — only fires when `title` or `content_text` changes (WHEN clause), performs delete-then-insert to update FTS\n\nRegister migration 8 in `src/core/db.rs` MIGRATIONS array.\n\n**Critical detail:** The COALESCE is required because FTS5 external-content tables require exact value matching for delete operations. 
If NULL was inserted, the delete trigger couldn't match it (NULL != NULL in SQL).\n\n## Acceptance Criteria\n- [ ] `migrations/008_fts5.sql` file exists\n- [ ] `documents_fts` virtual table created with `tokenize='porter unicode61'` and `prefix='2 3 4'`\n- [ ] `content='documents'` and `content_rowid='id'` set (external content mode)\n- [ ] Insert trigger `documents_ai` fires on document insert with COALESCE(title, '')\n- [ ] Delete trigger `documents_ad` fires on document delete using FTS5 delete command\n- [ ] Update trigger `documents_au` only fires when `old.title IS NOT new.title OR old.content_text != new.content_text`\n- [ ] Prefix search works: query `auth*` matches \"authentication\"\n- [ ] After bulk insert of N documents, `SELECT count(*) FROM documents_fts` returns N\n- [ ] Schema version 8 recorded in schema_version table\n- [ ] `cargo test migration_tests` passes\n\n## Files\n- `migrations/008_fts5.sql` — new file (copy exact SQL from PRD Section 1.2)\n- `src/core/db.rs` — add migration 8 to MIGRATIONS array\n\n## TDD Loop\nRED: Register migration in db.rs, `cargo test migration_tests` fails (SQL file missing)\nGREEN: Create `008_fts5.sql` with all triggers\nVERIFY: `cargo test migration_tests && cargo build`\n\n## Edge Cases\n- Metadata-only updates (e.g., changing `updated_at` or `labels_hash`) must NOT trigger FTS rebuild — the WHEN clause prevents this\n- NULL titles must use COALESCE to empty string in both insert and delete triggers\n- The update trigger performs the FTS5 'delete' command followed by a fresh insert (two statements within one trigger, not a single atomic FTS operation); this is the correct FTS5 pattern for content changes","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:25.763146Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:56:13.131830Z","closed_at":"2026-01-30T16:56:13.131771Z","close_reason":"Completed: migration 008_fts5.sql with FTS5 virtual table, 3 sync triggers (insert/delete/update with COALESCE NULL safety), prefix search, registered 
in db.rs, cargo build + tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-221","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:15.574576Z","created_by":"tayloreernisse"}]}
{"id":"bd-227","title":"[CP1] gi count issues/discussions/notes commands","description":"Count entities in the database.\n\n## Module\nsrc/cli/commands/count.rs\n\n## Clap Definition\nCount {\n #[arg(value_parser = [\"issues\", \"mrs\", \"discussions\", \"notes\"])]\n entity: String,\n \n #[arg(long, value_parser = [\"issue\", \"mr\"])]\n r#type: Option<String>,\n}\n\n## Commands\n- gi count issues → 'Issues: N'\n- gi count discussions → 'Discussions: N'\n- gi count discussions --type=issue → 'Issue Discussions: N'\n- gi count notes → 'Notes: N (excluding M system)'\n- gi count notes --type=issue → 'Issue Notes: N (excluding M system)'\n\n## Implementation\n- Simple COUNT(*) queries\n- For notes, also count WHERE is_system = 1 for system note count\n- Filter by noteable_type when --type specified\n\nFiles: src/cli/commands/count.rs\nDone when: Counts match expected values from GitLab","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:58:25.648805Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.920135Z","deleted_at":"2026-01-25T17:02:01.920129Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-22li","title":"OBSERV: Implement SyncRunRecorder lifecycle helper","description":"## Background\nThe sync_runs table exists (migration 001) but NOTHING writes to it. SyncRunRecorder encapsulates the INSERT-on-start, UPDATE-on-finish lifecycle, fixing this bug and enabling sync history tracking.\n\n## Approach\nCreate src/core/sync_run.rs:\n\n```rust\nuse crate::core::metrics::StageTiming;\nuse crate::core::error::Result;\nuse rusqlite::Connection;\n\npub struct SyncRunRecorder {\n row_id: i64,\n}\n\nimpl SyncRunRecorder {\n /// Insert a new sync_runs row with status='running'.\n pub fn start(conn: &Connection, command: &str, run_id: &str) -> Result<Self> {\n let now_ms = crate::core::time::now_ms();\n conn.execute(\n \"INSERT INTO sync_runs (started_at, heartbeat_at, status, command, run_id)\n VALUES (?1, ?2, 'running', ?3, ?4)\",\n rusqlite::params![now_ms, now_ms, command, run_id],\n )?;\n let row_id = conn.last_insert_rowid();\n Ok(Self { row_id })\n }\n\n /// Mark run as succeeded with full metrics.\n pub fn succeed(\n self,\n conn: &Connection,\n metrics: &[StageTiming],\n total_items: usize,\n total_errors: usize,\n ) -> Result<()> {\n let now_ms = crate::core::time::now_ms();\n let metrics_json = serde_json::to_string(metrics)\n .unwrap_or_else(|_| \"[]\".to_string());\n conn.execute(\n \"UPDATE sync_runs\n SET finished_at = ?1, status = 'succeeded',\n metrics_json = ?2, total_items_processed = ?3, total_errors = ?4\n WHERE id = ?5\",\n rusqlite::params![now_ms, metrics_json, total_items, total_errors, self.row_id],\n )?;\n Ok(())\n }\n\n /// Mark run as failed with error message and optional partial metrics.\n pub fn fail(\n self,\n conn: &Connection,\n error: &str,\n metrics: Option<&[StageTiming]>,\n ) -> Result<()> {\n let now_ms = crate::core::time::now_ms();\n let metrics_json = metrics\n .map(|m| serde_json::to_string(m).unwrap_or_else(|_| \"[]\".to_string()));\n conn.execute(\n \"UPDATE sync_runs\n SET finished_at = ?1, status = 'failed', 
error = ?2,\n metrics_json = ?3\n WHERE id = ?4\",\n rusqlite::params![now_ms, error, metrics_json, self.row_id],\n )?;\n Ok(())\n }\n}\n```\n\nRegister in src/core/mod.rs:\n```rust\npub mod sync_run;\n```\n\nNote: SyncRunRecorder takes self (not &self) in succeed/fail to enforce single-use lifecycle. You start a run, then either succeed or fail it -- never both.\n\nThe existing time::now_ms() helper (src/core/time.rs) returns milliseconds since epoch as i64. Used by the existing sync_runs schema (started_at, finished_at are INTEGER ms).\n\n## Acceptance Criteria\n- [ ] SyncRunRecorder::start() inserts row with status='running', started_at set\n- [ ] SyncRunRecorder::succeed() updates status='succeeded', finished_at set, metrics_json populated\n- [ ] SyncRunRecorder::fail() updates status='failed', error set, finished_at set\n- [ ] fail() with Some(metrics) stores partial metrics in metrics_json\n- [ ] fail() with None leaves metrics_json as NULL\n- [ ] succeed/fail consume self (single-use enforcement)\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/sync_run.rs (new file)\n- src/core/mod.rs (register module)\n\n## TDD Loop\nRED:\n - test_sync_run_recorder_start: in-memory DB, start(), query sync_runs, assert status='running'\n - test_sync_run_recorder_succeed: start() then succeed(), assert status='succeeded', metrics_json parseable\n - test_sync_run_recorder_fail: start() then fail(), assert status='failed', error set\n - test_sync_run_recorder_fail_with_partial_metrics: fail with Some(metrics), assert metrics_json has data\nGREEN: Implement SyncRunRecorder\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Connection lifetime: SyncRunRecorder stores row_id, not Connection. The caller must ensure the same Connection is used for start/succeed/fail.\n- Panic during sync: if the program panics between start() and succeed()/fail(), the row stays as 'running'. 
The existing stale lock detection (stale_lock_minutes) handles this.\n- metrics_json encoding: serde_json::to_string on Vec<StageTiming> produces a JSON array string.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:51.364617Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:38:04.903657Z","closed_at":"2026-02-04T17:38:04.903610Z","close_reason":"Implemented SyncRunRecorder with start/succeed/fail lifecycle, 4 passing tests","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-22li","depends_on_id":"bd-1o4h","type":"blocks","created_at":"2026-02-04T15:55:20.287655Z","created_by":"tayloreernisse"},{"issue_id":"bd-22li","depends_on_id":"bd-3pz","type":"parent-child","created_at":"2026-02-04T15:54:51.365721Z","created_by":"tayloreernisse"},{"issue_id":"bd-22li","depends_on_id":"bd-apmo","type":"blocks","created_at":"2026-02-04T15:55:20.236008Z","created_by":"tayloreernisse"}]}
{"id":"bd-23a4","title":"OBSERV: Wire SyncRunRecorder into sync and ingest commands","description":"## Background\nWith SyncRunRecorder implemented and MetricsLayer available, we wire them into the actual sync and ingest command handlers. This makes every sync/ingest invocation create a database record with full metrics.\n\n## Approach\n### src/cli/commands/sync.rs - run_sync() (line ~54)\n\nBefore the pipeline:\n```rust\nlet recorder = SyncRunRecorder::start(&conn, \"sync\", &run_id)?;\n```\n\nAfter pipeline succeeds:\n```rust\nlet stages = metrics_handle.extract_timings();\nlet total_items = stages.iter().map(|s| s.items_processed).sum::<usize>();\nlet total_errors = stages.iter().map(|s| s.errors).sum::<usize>();\nrecorder.succeed(&conn, &stages, total_items, total_errors)?;\n```\n\nOn pipeline failure (wrap pipeline in match or use a helper):\n```rust\nmatch pipeline_result {\n Ok(result) => {\n let stages = metrics_handle.extract_timings();\n recorder.succeed(&conn, &stages, total_items, total_errors)?;\n Ok(result)\n }\n Err(e) => {\n let stages = metrics_handle.extract_timings();\n recorder.fail(&conn, &e.to_string(), Some(&stages))?;\n Err(e)\n }\n}\n```\n\n### src/cli/commands/ingest.rs - run_ingest() (line ~107)\n\nSame pattern: start before pipeline, succeed/fail after.\n\nNote: run_sync() (line 54-178) calls run_ingest() internally for issues and MRs, so both commands create sync_runs records and a single sync produces 3 rows (1 sync + 2 ingest). This is intentional -- standalone ingest should also be tracked, each record gets its own run_id, and sync-status can distinguish them by the \"command\" column (command='sync' vs command='ingest:issues').\n\n
### Connection sharing\nrun_sync and run_ingest already have access to a Connection. SyncRunRecorder::start takes &Connection.\n\n### MetricsLayer handle\nmetrics_handle must be passed from main.rs through handle_sync_cmd/handle_ingest to run_sync/run_ingest. This requires adding a parameter. Alternative: use a thread-local or global. Prefer parameter passing for testability.\n\n## Acceptance Criteria\n- [ ] Every lore sync creates a sync_runs row with status transitioning running -> succeeded/failed\n- [ ] Every lore ingest creates a sync_runs row\n- [ ] metrics_json contains serialized Vec<StageTiming> on success\n- [ ] Failed syncs record error message and partial metrics\n- [ ] sync_runs.run_id matches run_id in log files and robot JSON\n- [ ] total_items_processed and total_errors are populated\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/sync.rs (wire SyncRunRecorder + extract_timings in run_sync)\n- src/cli/commands/ingest.rs (wire SyncRunRecorder in run_ingest)\n- src/main.rs (pass metrics_handle to command handlers)\n\n## TDD Loop\nRED: test_sync_creates_run_record (integration: run sync, query sync_runs, assert row exists with metrics)\nGREEN: Wire SyncRunRecorder into run_sync and run_ingest\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Database locked: SyncRunRecorder operations happen on the main connection. If a concurrent process holds the lock, the INSERT/UPDATE will wait (WAL mode) or error. Use existing lock handling.\n- Partial failure: if ingest issues succeeds but ingest MRs fails, the sync recorder should fail() with partial metrics (stages from issues but not MRs).\n- metrics_handle lifetime: must outlive the root span. 
Since it's an Arc clone, this is guaranteed.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:51.414504Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:41:04.963794Z","closed_at":"2026-02-04T17:41:04.963749Z","close_reason":"Wired SyncRunRecorder into handle_sync_cmd and handle_ingest in main.rs","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-23a4","depends_on_id":"bd-22li","type":"blocks","created_at":"2026-02-04T15:55:20.346104Z","created_by":"tayloreernisse"},{"issue_id":"bd-23a4","depends_on_id":"bd-34ek","type":"blocks","created_at":"2026-02-04T15:55:20.401842Z","created_by":"tayloreernisse"},{"issue_id":"bd-23a4","depends_on_id":"bd-3pz","type":"parent-child","created_at":"2026-02-04T15:54:51.415435Z","created_by":"tayloreernisse"}]}
{"id":"bd-247","title":"Implement issue document extraction","description":"## Background\nIssue documents are the simplest document type — a structured header + description text. The extractor queries the existing issues and issue_labels tables (populated by ingestion) and assembles a DocumentData struct. This is one of three entity-specific extractors (issue, MR, discussion) that feed the document regeneration pipeline.\n\n## Approach\nImplement `extract_issue_document()` in `src/documents/extractor.rs`:\n\n```rust\n/// Extract a searchable document from an issue.\n/// Returns None if the issue has been deleted from the DB.\npub fn extract_issue_document(conn: &Connection, issue_id: i64) -> Result<Option<DocumentData>>\n```\n\n**SQL queries (from PRD Section 2.2):**\n```sql\n-- Main entity\nSELECT i.id, i.iid, i.title, i.description, i.state, i.author_username,\n i.created_at, i.updated_at, i.web_url,\n p.path_with_namespace, p.id AS project_id\nFROM issues i\nJOIN projects p ON p.id = i.project_id\nWHERE i.id = ?\n\n-- Labels\nSELECT l.name FROM issue_labels il\nJOIN labels l ON l.id = il.label_id\nWHERE il.issue_id = ?\nORDER BY l.name\n```\n\n**Document format:**\n```\n[[Issue]] #234: Authentication redesign\nProject: group/project-one\nURL: https://gitlab.example.com/group/project-one/-/issues/234\nLabels: [\"bug\", \"auth\"]\nState: opened\nAuthor: @johndoe\n\n--- Description ---\n\nWe need to modernize our authentication system...\n```\n\n**Implementation steps:**\n1. Query issue row — if not found, return Ok(None)\n2. Query labels via junction table\n3. Format header with [[Issue]] prefix\n4. Compute content_hash via compute_content_hash()\n5. Compute labels_hash via compute_list_hash()\n6. paths is always empty for issues (paths are only for DiffNote discussions)\n7. 
Return DocumentData with all fields populated\n\n## Acceptance Criteria\n- [ ] Deleted issue (not in DB) returns Ok(None)\n- [ ] Issue with no description: content_text has header only (no \"--- Description ---\" section)\n- [ ] Issue with no labels: Labels line shows \"[]\"\n- [ ] Issue with labels: Labels line shows sorted JSON array\n- [ ] content_hash is SHA-256 of the full content_text\n- [ ] labels_hash is SHA-256 of sorted label names joined by newline\n- [ ] paths_hash is empty string hash (issues have no paths)\n- [ ] project_id comes from the JOIN with projects table\n- [ ] `cargo test extract_issue` passes\n\n## Files\n- `src/documents/extractor.rs` — implement `extract_issue_document()`\n\n## TDD Loop\nRED: Test in `#[cfg(test)] mod tests`:\n- `test_issue_document_format` — verify header format matches PRD template\n- `test_issue_not_found` — returns Ok(None) for nonexistent issue_id\n- `test_issue_no_description` — no description section when description is NULL\n- `test_issue_labels_sorted` — labels appear in alphabetical order\n- `test_issue_hash_deterministic` — same issue produces same content_hash\nGREEN: Implement extract_issue_document with SQL queries\nVERIFY: `cargo test extract_issue`\n\n## Edge Cases\n- Issue with NULL description: skip \"--- Description ---\" section entirely\n- Issue with empty string description: include section but with empty body\n- Issue with very long description: no truncation here (hard cap applied by caller)\n- Labels with special characters (quotes, commas): JSON array handles escaping","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:25:45.490145Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:28:13.974948Z","closed_at":"2026-01-30T17:28:13.974891Z","close_reason":"Implemented extract_issue_document() with SQL queries, PRD-compliant format, and 7 
tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-247","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:15.677223Z","created_by":"tayloreernisse"},{"issue_id":"bd-247","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:15.712739Z","created_by":"tayloreernisse"}]}
{"id":"bd-24j1","title":"OBSERV: Add #[instrument] spans to ingestion stages","description":"## Background\nTracing spans on each sync stage create the hierarchy that (1) makes log lines filterable by stage, (2) Phase 3's MetricsLayer reads to build StageTiming trees, and (3) gives meaningful context in -vv stderr output.\n\n## Approach\nAdd #[instrument] attributes or manual spans to these functions:\n\n### src/ingestion/orchestrator.rs\n1. ingest_project_issues_with_progress() (line ~110):\n```rust\n#[instrument(skip_all, fields(stage = \"ingest_issues\", project = %project_path))]\npub async fn ingest_project_issues_with_progress(...) -> Result<IngestProjectResult> {\n```\n\n2. The MR equivalent (ingest_project_mrs_with_progress or similar):\n```rust\n#[instrument(skip_all, fields(stage = \"ingest_mrs\", project = %project_path))]\n```\n\n3. Inside the issue ingest function, add child spans for sub-stages:\n```rust\nlet _fetch_span = tracing::info_span!(\"fetch_pages\", project = %project_path).entered();\n// ... fetch logic\ndrop(_fetch_span);\n\nlet _disc_span = tracing::info_span!(\"sync_discussions\", project = %project_path).entered();\n// ... discussion sync logic\ndrop(_disc_span);\n```\n\n4. drain_resource_events() (line ~566):\n```rust\nlet _span = tracing::info_span!(\"fetch_resource_events\", project = %project_path).entered();\n```\n\n### src/documents/regenerator.rs\n5. regenerate_dirty_documents() (line ~24):\n```rust\n#[instrument(skip_all, fields(stage = \"generate_docs\"))]\npub fn regenerate_dirty_documents(conn: &Connection) -> Result<RegenerateResult> {\n```\n\n### src/embedding/pipeline.rs\n6. embed_documents() (line ~36):\n```rust\n#[instrument(skip_all, fields(stage = \"embed\"))]\npub async fn embed_documents(...) 
-> Result<EmbedResult> {\n```\n\n### Important: field declarations for Phase 3\nThe #[instrument] fields should include empty recording fields that Phase 3 (bd-16m8) will populate:\n```rust\n#[instrument(skip_all, fields(\n stage = \"ingest_issues\",\n project = %project_path,\n items_processed = tracing::field::Empty,\n items_skipped = tracing::field::Empty,\n errors = tracing::field::Empty,\n))]\n```\n\nThis declares the fields on the span so MetricsLayer can capture them when span.record() is called later.\n\n## Acceptance Criteria\n- [ ] JSON log lines show nested span context: sync > ingest_issues > fetch_pages\n- [ ] Each stage span has a \"stage\" field with the stage name\n- [ ] Per-project spans include \"project\" field\n- [ ] Spans are visible in -vv stderr output as bracketed context\n- [ ] Empty recording fields declared for items_processed, items_skipped, errors\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/ingestion/orchestrator.rs (spans on ingest functions and sub-stages)\n- src/documents/regenerator.rs (span on regenerate_dirty_documents)\n- src/embedding/pipeline.rs (span on embed_documents)\n\n## TDD Loop\nRED:\n - test_span_context_in_json_logs: mock sync, capture JSON, verify span chain\n - test_nested_span_chain: verify parent-child: sync > ingest_issues > fetch_pages\n - test_span_elapsed_on_close: create span, sleep 10ms, verify elapsed >= 10\nGREEN: Add #[instrument] and manual spans to all stage functions\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- #[instrument] on async fn: uses tracing::Instrument trait automatically. Works with tokio.\n- skip_all is essential: without it, #[instrument] tries to Debug-format all parameters, which may not implement Debug or may be expensive.\n- Manual span drop: for sub-stages within a single function, use explicit drop(_span) to end the span before the next sub-stage starts. 
Otherwise spans overlap.\n- tracing::field::Empty: declares a field that can be recorded later. If never recorded, it appears as empty/missing in output (not zero).","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:54:07.821068Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:19:34.307672Z","closed_at":"2026-02-04T17:19:34.307624Z","close_reason":"Added #[instrument] spans to ingest_project_issues_with_progress, ingest_project_merge_requests_with_progress, drain_resource_events, regenerate_dirty_documents, embed_documents","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-24j1","depends_on_id":"bd-2ni","type":"parent-child","created_at":"2026-02-04T15:54:07.821916Z","created_by":"tayloreernisse"},{"issue_id":"bd-24j1","depends_on_id":"bd-2rr","type":"blocks","created_at":"2026-02-04T15:55:19.798133Z","created_by":"tayloreernisse"}]}
{"id":"bd-25s","title":"robot-docs: Add Ollama dependency discovery to manifest","description":"## Background\n\nThe robot-docs manifest currently lists all commands but doesn't mention that some features (embedding, semantic search) require Ollama to be installed. Agents waste time trying embed commands without Ollama. Adding a dependencies section to robot-docs makes this discoverable.\n\n## Approach\n\nIn `src/main.rs` `handle_robot_docs()`, add a `dependencies` field to `RobotDocsData`:\n\n```rust\n#[derive(Serialize)]\nstruct RobotDocsData {\n // ... existing fields ...\n dependencies: serde_json::Value,\n}\n```\n\nValue:\n```json\n{\n \"ollama\": {\n \"required_by\": [\"embed\", \"search --mode=semantic\", \"search --mode=hybrid\", \"sync (embed step)\"],\n \"not_required_by\": [\"issues\", \"mrs\", \"search --mode=lexical\", \"count\", \"ingest\", \"timeline\", \"file-history\", \"trace\", \"stats\", \"status\", \"health\", \"doctor\"],\n \"install\": {\n \"macos\": \"brew install ollama\",\n \"linux\": \"curl -fsSL https://ollama.ai/install.sh | sh\",\n \"check\": \"ollama --version\"\n },\n \"setup\": \"ollama pull nomic-embed-text\",\n \"note\": \"Lexical search, temporal queries (timeline/file-history/trace), and all non-embedding features work without Ollama.\"\n }\n}\n```\n\nAlso update `RobotDocsOutput` struct to include the field, and update serde derives.\n\n## Acceptance Criteria\n\n- [ ] `lore robot-docs | jq '.data.dependencies.ollama'` returns the dependency info\n- [ ] `required_by` lists all commands needing Ollama\n- [ ] `not_required_by` lists all commands that work without Ollama\n- [ ] Install instructions for macOS and Linux are present\n- [ ] Setup command (model pull) is present\n- [ ] Note clarifies that most features work without Ollama\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/main.rs` (update RobotDocsData struct + handle_robot_docs JSON)\n\n## TDD 
Loop\n\nRED: No unit test -- verify with `lore robot-docs | jq '.data.dependencies'`\n\nGREEN: Add the struct field and JSON value.\n\nVERIFY: `cargo check --all-targets && lore robot-docs | jq '.data.dependencies'`\n\n## Edge Cases\n\n- Keep the dependencies object extensible for future dependencies (e.g., git2 for Tier 2)\n- Don't break existing robot-docs consumers: dependencies is a new field, not replacing anything","status":"open","priority":4,"issue_type":"feature","created_at":"2026-01-30T20:26:43.169688Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:53:45.110847Z","compaction_level":0,"original_size":0,"labels":["enhancement","robot-mode"]}
{"id":"bd-2ac","title":"Create migration 009_embeddings.sql","description":"## Background\nMigration 009 creates the embedding storage layer for Gate B. It introduces a sqlite-vec vec0 virtual table for vector search and an embedding_metadata table for tracking provenance per chunk. Unlike migrations 007-008, this migration REQUIRES sqlite-vec to be loaded before it can be applied. The migration runner in db.rs must load the sqlite-vec extension first.\n\n## Approach\nCreate `migrations/009_embeddings.sql` per PRD Section 1.3.\n\n**Tables:**\n1. `embeddings` — vec0 virtual table with `embedding float[768]`\n2. `embedding_metadata` — tracks per-chunk provenance with composite PK (document_id, chunk_index)\n3. Orphan cleanup trigger: `documents_embeddings_ad` — deletes ALL chunk embeddings when a document is deleted using range deletion `[doc_id * 1000, (doc_id + 1) * 1000)`\n\n**Critical: sqlite-vec loading:**\nThe migration runner in `src/core/db.rs` must load sqlite-vec BEFORE applying any migrations. This means adding extension loading to the `create_connection()` or `run_migrations()` function. 
sqlite-vec is loaded via:\n```rust\nconn.load_extension_enable()?;\nconn.load_extension(\"vec0\", None)?; // or platform-specific path\nconn.load_extension_disable()?;\n```\n\nRegister migration 9 in `src/core/db.rs` MIGRATIONS array.\n\n## Acceptance Criteria\n- [ ] `migrations/009_embeddings.sql` file exists\n- [ ] `embeddings` vec0 virtual table created with `embedding float[768]`\n- [ ] `embedding_metadata` table has composite PK (document_id, chunk_index)\n- [ ] `embedding_metadata.document_id` has FK to documents(id) ON DELETE CASCADE\n- [ ] Error tracking fields: last_error, attempt_count, last_attempt_at\n- [ ] Orphan cleanup trigger: deletes embeddings WHERE rowid in [doc_id*1000, (doc_id+1)*1000)\n- [ ] Index on embedding_metadata(last_error) WHERE last_error IS NOT NULL\n- [ ] Index on embedding_metadata(document_id)\n- [ ] Schema version 9 recorded\n- [ ] Migration runner loads sqlite-vec before applying migrations\n- [ ] `cargo build` succeeds\n\n## Files\n- `migrations/009_embeddings.sql` — new file (copy exact SQL from PRD Section 1.3)\n- `src/core/db.rs` — add migration 9 to MIGRATIONS array; add sqlite-vec extension loading\n\n## TDD Loop\nRED: Register migration in db.rs, `cargo test migration_tests` fails\nGREEN: Create SQL file + add extension loading\nVERIFY: `cargo test migration_tests && cargo build`\n\n## Edge Cases\n- sqlite-vec not installed: migration fails with clear error (not a silent skip)\n- Migration applied without sqlite-vec loaded: `CREATE VIRTUAL TABLE` fails with \"no such module: vec0\"\n- Documents deleted before embeddings: trigger fires but vec0 DELETE on empty range is safe\n- vec0 doesn't support FK cascades: that's why we need the explicit trigger","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:33.958178Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:22:26.478290Z","closed_at":"2026-01-30T17:22:26.478229Z","close_reason":"Completed: migration 009_embeddings.sql with vec0 
table, embedding_metadata with composite PK, orphan cleanup trigger, registered in db.rs","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2ac","depends_on_id":"bd-221","type":"blocks","created_at":"2026-01-30T15:29:24.594861Z","created_by":"tayloreernisse"}]}
{"id":"bd-2am8","title":"OBSERV: Enhance sync-status to show recent runs with metrics","description":"## Background\nsync_status currently queries sync_runs but always gets zero rows (nothing writes to the table). After bd-23a4 wires up SyncRunRecorder, rows will exist. This bead enhances the display to show recent runs with metrics.\n\n## Approach\n### src/cli/commands/sync_status.rs\n\n1. Change get_last_sync_run() (line ~66) to get_recent_sync_runs() returning last N:\n```rust\nfn get_recent_sync_runs(conn: &Connection, limit: usize) -> Result<Vec<SyncRunInfo>> {\n let mut stmt = conn.prepare(\n \"SELECT id, started_at, finished_at, status, command, error,\n run_id, total_items_processed, total_errors, metrics_json\n FROM sync_runs\n ORDER BY started_at DESC\n LIMIT ?1\",\n )?;\n // ... map rows to SyncRunInfo\n}\n```\n\n2. Extend SyncRunInfo to include new fields:\n```rust\npub struct SyncRunInfo {\n pub id: i64,\n pub started_at: i64,\n pub finished_at: Option<i64>,\n pub status: String,\n pub command: String,\n pub error: Option<String>,\n pub run_id: Option<String>, // NEW\n pub total_items_processed: i64, // NEW\n pub total_errors: i64, // NEW\n pub stages: Option<Vec<StageTiming>>, // NEW: parsed from metrics_json\n}\n```\n\n3. Parse metrics_json into Vec<StageTiming>:\n```rust\nlet stages: Option<Vec<StageTiming>> = row.get::<_, Option<String>>(9)?\n .and_then(|json| serde_json::from_str(&json).ok());\n```\n\n4. Interactive output (new format):\n```\nRecent sync runs:\n Run a1b2c3 | 2026-02-04 14:32 | 45.2s | 235 items | 1 error\n Run d4e5f6 | 2026-02-03 14:30 | 38.1s | 220 items | 0 errors\n Run g7h8i9 | 2026-02-02 14:29 | 42.7s | 228 items | 0 errors\n```\n\n5. Robot JSON output: runs array with stages parsed from metrics_json:\n```json\n{\n \"ok\": true,\n \"data\": {\n \"runs\": [{ \"run_id\": \"...\", \"stages\": [...] }],\n \"cursors\": [...],\n \"summary\": {...}\n }\n}\n```\n\n6. 
Add --run <run_id> flag to sync-status subcommand for single-run detail view (shows full stage breakdown).\n\n## Acceptance Criteria\n- [ ] lore sync-status shows last 10 runs (not just 1) with run_id, duration, items, errors\n- [ ] lore --robot sync-status JSON includes runs array with stages parsed from metrics_json\n- [ ] lore sync-status --run a1b2c3 shows single run detail with full stage breakdown\n- [ ] When no runs exist, shows appropriate \"No sync runs recorded\" message\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/sync_status.rs (rewrite query, extend structs, update display)\n\n## TDD Loop\nRED:\n - test_sync_status_shows_runs: insert 3 sync_runs rows, call print function, assert all 3 shown\n - test_sync_status_json_includes_stages: insert row with metrics_json, verify robot JSON has stages\n - test_sync_status_empty: no rows, verify graceful message\nGREEN: Rewrite get_last_sync_run -> get_recent_sync_runs, extend SyncRunInfo, update output\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- metrics_json is NULL (old rows or failed runs): stages field is null/empty in output\n- metrics_json is malformed: serde_json::from_str fails silently (.ok()), stages is None\n- Duration calculation: finished_at - started_at in ms. 
If finished_at is NULL (running), show \"in progress\"","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:51.467705Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:43:07.306504Z","closed_at":"2026-02-04T17:43:07.306425Z","close_reason":"Enhanced sync-status: shows last 10 runs with run_id, duration, items, errors, parsed stages; JSON includes full stages array","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-2am8","depends_on_id":"bd-23a4","type":"blocks","created_at":"2026-02-04T15:55:20.449881Z","created_by":"tayloreernisse"},{"issue_id":"bd-2am8","depends_on_id":"bd-3pz","type":"parent-child","created_at":"2026-02-04T15:54:51.468728Z","created_by":"tayloreernisse"}]}
{"id":"bd-2as","title":"[CP1] Epic: Issue Ingestion","description":"Ingest all issues, labels, and issue discussions from configured GitLab repositories with resumable cursor-based incremental sync. This establishes the core data ingestion pattern reused for MRs in CP2.\n\nSuccess Criteria:\n- gi ingest --type=issues fetches all issues (count matches GitLab UI)\n- Labels extracted from issue payloads\n- Issue discussions fetched per-issue\n- Cursor-based sync is resumable\n- Sync tracking records all runs\n- Single-flight lock prevents concurrent runs\n\nReference: docs/prd/checkpoint-1.md","status":"tombstone","priority":1,"issue_type":"task","created_at":"2026-01-25T15:18:44.062057Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.155746Z","deleted_at":"2026-01-25T15:21:35.155744Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2bu","title":"[CP1] GitLab types for issues, discussions, notes","description":"Add Rust types to src/gitlab/types.rs for GitLab API responses.\n\n## Types to Add\n\n### GitLabIssue\n- id: i64 (GitLab global ID)\n- iid: i64 (project-scoped issue number)\n- project_id: i64\n- title: String\n- description: Option<String>\n- state: String (\"opened\" | \"closed\")\n- created_at, updated_at: String (ISO 8601)\n- closed_at: Option<String>\n- author: GitLabAuthor\n- labels: Vec<String> (array of label names - CP1 canonical)\n- web_url: String\nNOTE: labels_details intentionally NOT modeled - varies across GitLab versions\n\n### GitLabAuthor\n- id: i64\n- username: String\n- name: String\n\n### GitLabDiscussion\n- id: String (like \"6a9c1750b37d...\")\n- individual_note: bool\n- notes: Vec<GitLabNote>\n\n### GitLabNote\n- id: i64\n- note_type: Option<String> (\"DiscussionNote\" | \"DiffNote\" | null)\n- body: String\n- author: GitLabAuthor\n- created_at, updated_at: String (ISO 8601)\n- system: bool\n- resolvable: bool (default false)\n- resolved: bool (default false)\n- resolved_by: Option<GitLabAuthor>\n- resolved_at: Option<String>\n- position: Option<GitLabNotePosition>\n\n### GitLabNotePosition\n- old_path, new_path: Option<String>\n- old_line, new_line: Option<i32>\n\nFiles: src/gitlab/types.rs\nTests: Test deserialization with fixtures\nDone when: Types compile and deserialize sample API responses correctly","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:46.922805Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.710057Z","deleted_at":"2026-01-25T17:02:01.710051Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2cu","title":"[CP1] Discussion ingestion module","description":"Fetch and store discussions/notes for each issue.\n\n## Module\nsrc/ingestion/discussions.rs\n\n## Key Structs\n\n### IngestDiscussionsResult\n- discussions_fetched: usize\n- discussions_upserted: usize\n- notes_upserted: usize\n- system_notes_count: usize\n\n## Main Function\npub async fn ingest_issue_discussions(\n conn, client, config,\n project_id, gitlab_project_id,\n issue_iid, local_issue_id, issue_updated_at\n) -> Result<IngestDiscussionsResult>\n\n## Logic\n1. Paginate through all discussions for given issue\n2. For each discussion:\n - Begin transaction\n - Store raw payload (compressed)\n - Transform and upsert discussion record with correct issue FK\n - Get local discussion ID\n - Transform notes from discussion\n - For each note:\n - Store raw payload\n - Upsert note with discussion_id FK\n - Count system notes\n - Commit transaction\n3. After all discussions synced: mark_discussions_synced(conn, local_issue_id, issue_updated_at)\n - UPDATE issues SET discussions_synced_for_updated_at = ? WHERE id = ?\n\n## Invariant\nA rerun MUST NOT refetch discussions for issues whose updated_at has not advanced, even with cursor rewind. 
The discussions_synced_for_updated_at watermark ensures this.\n\n## Helper Functions\n- upsert_discussion(conn, discussion, payload_id)\n- get_local_discussion_id(conn, project_id, gitlab_id) -> i64\n- upsert_note(conn, discussion_id, note, payload_id)\n- mark_discussions_synced(conn, issue_id, issue_updated_at)\n\nFiles: src/ingestion/discussions.rs\nTests: tests/discussion_watermark_tests.rs\nDone when: Discussions and notes populated with correct FKs, watermark prevents redundant refetch","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:36.703237Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.827880Z","deleted_at":"2026-01-25T17:02:01.827876Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2dk","title":"Implement project resolution for --project filter","description":"## Background\nThe --project filter on search (and other commands) accepts a string that must be resolved to a project_id. Users may type the full path, a partial path, or just the project name. The resolution logic provides cascading match with helpful error messages when ambiguous. This improves ergonomics for multi-project installations.\n\n## Approach\nImplement project resolution function (location TBD — likely `src/core/project.rs` or inline in search filters):\n\n```rust\n/// Resolve a project string to a project_id using cascading match:\n/// 1. Exact match on path_with_namespace\n/// 2. Case-insensitive exact match\n/// 3. Suffix match (only if unambiguous)\n/// 4. Error with available projects list\npub fn resolve_project(conn: &Connection, project_str: &str) -> Result<i64>\n```\n\n**SQL queries:**\n```sql\n-- Step 1: exact match\nSELECT id FROM projects WHERE path_with_namespace = ?\n\n-- Step 2: case-insensitive\nSELECT id FROM projects WHERE LOWER(path_with_namespace) = LOWER(?)\n\n-- Step 3: suffix match\nSELECT id, path_with_namespace FROM projects\nWHERE path_with_namespace LIKE '%/' || ?\n OR path_with_namespace = ?\n\n-- Step 4: list all for error message\nSELECT path_with_namespace FROM projects ORDER BY path_with_namespace\n```\n\n**Error format:**\n```\nError: Project 'auth-service' not found.\n\nAvailable projects:\n backend/auth-service\n frontend/auth-service-ui\n infra/auth-proxy\n\nHint: Use the full path, e.g., --project=backend/auth-service\n```\n\nUses `LoreError::Ambiguous` variant for multiple suffix matches.\n\n## Acceptance Criteria\n- [ ] Exact match: \"group/project\" resolves correctly\n- [ ] Case-insensitive: \"Group/Project\" resolves to \"group/project\"\n- [ ] Suffix match: \"project-name\" resolves when only one \"*/project-name\" exists\n- [ ] Ambiguous suffix: error lists matching projects with hint\n- [ ] No match: error lists all 
available projects with hint\n- [ ] Empty projects table: clear error message\n- [ ] `cargo test project_resolution` passes\n\n## Files\n- `src/core/project.rs` — new file (or add to existing module)\n- `src/core/mod.rs` — add `pub mod project;`\n\n## TDD Loop\nRED: Tests:\n- `test_exact_match` — full path resolves\n- `test_case_insensitive` — mixed case resolves\n- `test_suffix_unambiguous` — short name resolves when unique\n- `test_suffix_ambiguous` — error with list when multiple match\n- `test_no_match` — error with available projects\nGREEN: Implement resolve_project\nVERIFY: `cargo test project_resolution`\n\n## Edge Cases\n- Project path containing special LIKE characters (%, _): unlikely but escape for safety\n- Single project in DB: suffix always unambiguous\n- Project path with multiple slashes: \"org/group/project\" — suffix match on \"project\" works","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:26:13.076571Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:39:17.197735Z","closed_at":"2026-01-30T17:39:17.197552Z","close_reason":"Implemented resolve_project() with cascading match (exact, CI, suffix) + 6 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2dk","depends_on_id":"bd-3q2","type":"blocks","created_at":"2026-01-30T15:29:24.446650Z","created_by":"tayloreernisse"}]}
{"id":"bd-2e8","title":"Add fetchResourceEvents config flag to SyncConfig","description":"## Background\nEvent fetching should be opt-in (default true) so users who don't need temporal queries skip 3 extra API calls per entity. This follows the existing SyncConfig pattern with serde defaults and camelCase JSON aliases.\n\n## Approach\nAdd to SyncConfig in src/core/config.rs:\n```rust\n#[serde(rename = \"fetchResourceEvents\", default = \"default_true\")]\npub fetch_resource_events: bool,\n```\n\nAdd default function (if not already present):\n```rust\nfn default_true() -> bool { true }\n```\n\nUpdate Default impl for SyncConfig to include `fetch_resource_events: true`.\n\nAdd --no-events flag to sync command in src/cli/mod.rs (SyncArgs):\n```rust\n/// Skip resource event fetching (overrides config)\n#[arg(long = \"no-events\", help_heading = \"Sync Options\")]\npub no_events: bool,\n```\n\nIn the sync command handler (src/cli/commands/sync.rs), override config when flag is set:\n```rust\nif args.no_events {\n config.sync.fetch_resource_events = false;\n}\n```\n\n## Acceptance Criteria\n- [ ] SyncConfig deserializes `fetchResourceEvents: false` from JSON config\n- [ ] SyncConfig defaults to `fetch_resource_events: true` when field absent\n- [ ] `--no-events` flag parses correctly in CLI\n- [ ] `--no-events` overrides config to false\n- [ ] `cargo test` passes with no regressions\n\n## Files\n- src/core/config.rs (add field to SyncConfig + default fn + Default impl)\n- src/cli/mod.rs (add --no-events to SyncArgs)\n- src/cli/commands/sync.rs (override config when flag set)\n\n## TDD Loop\nRED: tests/config_tests.rs (or inline in config.rs):\n- `test_sync_config_fetch_resource_events_default_true` - omit field from JSON, verify default\n- `test_sync_config_fetch_resource_events_explicit_false` - set field false, verify parsed\n- `test_sync_config_no_events_flag` - verify CLI arg parsing\n\nGREEN: Add the field, default fn, Default impl update, CLI flag, and override 
logic\n\nVERIFY: `cargo test config -- --nocapture && cargo build`\n\n## Edge Cases\n- Ensure serde rename matches camelCase convention used by all other SyncConfig fields\n- The default_true fn may already exist for other fields — check before adding duplicate\n- The --no-events flag must NOT be confused with --no-X negation flags already in CLI (check mod.rs for conflicts)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:24.006037Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:10:20.311986Z","closed_at":"2026-02-03T16:10:20.311939Z","close_reason":"Completed: Added fetch_resource_events bool to SyncConfig with serde rename, default_true, --no-events CLI flag, and config override in sync handler","compaction_level":0,"original_size":0,"labels":["config","gate-1","phase-b"],"dependencies":[{"issue_id":"bd-2e8","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:24.010608Z","created_by":"tayloreernisse"}]}
{"id":"bd-2ez","title":"Add 'lore count references' command","description":"## Background\n\nThe count command currently supports issues, mrs, discussions, notes, and events. This bead adds 'references' as a new entity type, showing total cross-references and breakdowns by reference_type and source_method. This helps users understand the density and quality of their cross-reference data.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 2.5 AC: \"source_method column tracks provenance of each reference.\"\n\n**IMPORTANT: source_method value discrepancy.** The spec (Section 2.2) defines source_method values as `'api_closes_issues' | 'api_state_event' | 'system_note_parse'`. However, the implemented migration 011 uses simpler values: `'api' | 'note_parse' | 'description_parse'` (see `migrations/011_resource_events.sql` line 86). The CHECK constraint is already applied. **Use the codebase values**, not the spec values.\n\n## Approach\n\n### 1. Add to CountArgs value_parser in `src/cli/mod.rs`\n\n```rust\n#[arg(value_parser = [\"issues\", \"mrs\", \"discussions\", \"notes\", \"events\", \"references\"])]\npub entity: String,\n```\n\n### 2. Add count_references function in `src/cli/commands/count.rs`\n\n```rust\npub struct ReferenceCountResult {\n pub total: i64,\n pub by_type: ReferenceTypeBreakdown,\n pub by_method: ReferenceMethodBreakdown,\n pub unresolved: i64,\n}\n\npub struct ReferenceTypeBreakdown {\n pub closes: i64,\n pub mentioned: i64,\n pub related: i64,\n}\n\npub struct ReferenceMethodBreakdown {\n pub api: i64,\n pub note_parse: i64,\n pub description_parse: i64,\n}\n```\n\n### 3. 
SQL Query\n\n```sql\nSELECT\n COUNT(*) as total,\n COALESCE(SUM(CASE WHEN reference_type = 'closes' THEN 1 ELSE 0 END), 0) as closes,\n COALESCE(SUM(CASE WHEN reference_type = 'mentioned' THEN 1 ELSE 0 END), 0) as mentioned,\n COALESCE(SUM(CASE WHEN reference_type = 'related' THEN 1 ELSE 0 END), 0) as related,\n COALESCE(SUM(CASE WHEN source_method = 'api' THEN 1 ELSE 0 END), 0) as api,\n COALESCE(SUM(CASE WHEN source_method = 'note_parse' THEN 1 ELSE 0 END), 0) as note_parse,\n COALESCE(SUM(CASE WHEN source_method = 'description_parse' THEN 1 ELSE 0 END), 0) as desc_parse,\n COALESCE(SUM(CASE WHEN target_entity_id IS NULL THEN 1 ELSE 0 END), 0) as unresolved\nFROM entity_references\n```\n\n### 4. Human Output\n\n```\nReferences: 1,234\n By type:\n closes: 456\n mentioned: 678\n related: 100\n By source:\n api: 234\n note_parse: 890\n description_parse: 110\n Unresolved: 45 (3.6%)\n```\n\n### 5. Robot JSON\n\n```json\n{\n \"ok\": true,\n \"data\": {\n \"entity\": \"references\",\n \"total\": 1234,\n \"by_type\": { \"closes\": 456, \"mentioned\": 678, \"related\": 100 },\n \"by_method\": { \"api\": 234, \"note_parse\": 890, \"description_parse\": 110 },\n \"unresolved\": 45\n }\n}\n```\n\n### 6. 
Wire in run_count and handle_count\n\nAdd a new branch in `run_count()` match:\n```rust\n\"references\" => count_references(&conn),\n```\n\nSince ReferenceCountResult has a different shape than CountResult, either:\n- Use an enum `CountOutput { Standard(CountResult), References(ReferenceCountResult) }`\n- Or make count_references return its own type and handle it separately in handle_count\n\n## Acceptance Criteria\n\n- [ ] `lore count references` works with human output\n- [ ] `lore --robot count references` works with JSON output\n- [ ] by_type breakdown matches reference_type column values: closes, mentioned, related\n- [ ] by_method breakdown matches source_method column values: api, note_parse, description_parse (codebase values, NOT spec values)\n- [ ] by_type breakdown sums to total\n- [ ] by_method breakdown sums to total\n- [ ] Unresolved count matches `WHERE target_entity_id IS NULL`\n- [ ] Zero references: all counts are 0 (not error)\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/cli/mod.rs` (add \"references\" to CountArgs value_parser)\n- `src/cli/commands/count.rs` (add count_references + output functions + ReferenceCountResult)\n- `src/main.rs` (handle the references branch in handle_count)\n\n## TDD Loop\n\nRED: Add test in count.rs:\n- `test_count_references_query` - verify SQL returns correct structure (needs in-memory DB with migration 011)\n\nGREEN: Implement the query, result type, and output functions.\n\nVERIFY: `cargo test --lib -- count && cargo check --all-targets`\n\n## Edge Cases\n\n- entity_references table doesn't exist (old schema): handle with migration check or graceful SQL error\n- All references unresolved: unresolved count = total\n- New source_method values added in future: consider \"other\" bucket or explicit 
error","status":"open","priority":3,"issue_type":"task","created_at":"2026-02-02T22:42:43.780303Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:12:46.312791Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2ez","depends_on_id":"bd-1se","type":"parent-child","created_at":"2026-02-02T22:43:40.652558Z","created_by":"tayloreernisse"},{"issue_id":"bd-2ez","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T22:43:33.877742Z","created_by":"tayloreernisse"}]}
{"id":"bd-2f0","title":"[CP1] gi count issues/discussions/notes commands","description":"## Background\n\nThe `gi count` command provides quick counts of entities in the local database. It supports counting issues, MRs, discussions, and notes, with optional filtering by noteable type. This enables quick validation that sync is working correctly.\n\n## Approach\n\n### Module: src/cli/commands/count.rs\n\n### Clap Definition\n\n```rust\n#[derive(Args)]\npub struct CountArgs {\n /// Entity type to count\n #[arg(value_parser = [\"issues\", \"mrs\", \"discussions\", \"notes\"])]\n pub entity: String,\n\n /// Filter by noteable type (for discussions/notes)\n #[arg(long, value_parser = [\"issue\", \"mr\"])]\n pub r#type: Option<String>,\n}\n```\n\n### Handler Function\n\n```rust\npub async fn handle_count(args: CountArgs, conn: &Connection) -> Result<()>\n```\n\n### Queries by Entity\n\n**issues:**\n```sql\nSELECT COUNT(*) FROM issues\n```\nOutput: `Issues: 3,801`\n\n**discussions:**\n```sql\n-- Without type filter\nSELECT COUNT(*) FROM discussions\n\n-- With --type=issue\nSELECT COUNT(*) FROM discussions WHERE noteable_type = 'Issue'\n```\nOutput: `Issue Discussions: 1,234`\n\n**notes:**\n```sql\n-- Total and system count\nSELECT COUNT(*), SUM(is_system) FROM notes\n\n-- With --type=issue (join through discussions)\nSELECT COUNT(*), SUM(n.is_system)\nFROM notes n\nJOIN discussions d ON n.discussion_id = d.id\nWHERE d.noteable_type = 'Issue'\n```\nOutput: `Issue Notes: 5,678 (excluding 1,234 system)`\n\n### Output Format\n\n```\nIssues: 3,801\n```\n\n```\nIssue Discussions: 1,234\n```\n\n```\nIssue Notes: 5,678 (excluding 1,234 system)\n```\n\n## Acceptance Criteria\n\n- [ ] `gi count issues` shows total issue count\n- [ ] `gi count discussions` shows total discussion count\n- [ ] `gi count discussions --type=issue` filters to issue discussions\n- [ ] `gi count notes` shows total note count with system note exclusion\n- [ ] `gi count notes --type=issue` filters to issue 
notes\n- [ ] Numbers formatted with thousands separators (1,234)\n\n## Files\n\n- src/cli/commands/mod.rs (add `pub mod count;`)\n- src/cli/commands/count.rs (create)\n- src/cli/mod.rs (add Count variant to Commands enum)\n\n## TDD Loop\n\nRED:\n```rust\n#[tokio::test] async fn count_issues_returns_total()\n#[tokio::test] async fn count_discussions_with_type_filter()\n#[tokio::test] async fn count_notes_excludes_system_notes()\n```\n\nGREEN: Implement handler with queries\n\nVERIFY: `cargo test count`\n\n## Edge Cases\n\n- Zero entities - show \"Issues: 0\"\n- --type flag invalid for issues/mrs - ignore or error\n- All notes are system notes - show \"Notes: 0 (excluding 1,234 system)\"","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-25T17:02:38.360495Z","created_by":"tayloreernisse","updated_at":"2026-01-25T23:01:37.084627Z","closed_at":"2026-01-25T23:01:37.084568Z","close_reason":"Implemented gi count command with issues/discussions/notes support, format_number helper, and system note exclusion","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2f0","depends_on_id":"bd-208","type":"blocks","created_at":"2026-01-25T17:04:05.677181Z","created_by":"tayloreernisse"}]}
{"id":"bd-2f2","title":"Implement timeline human output renderer","description":"## Background\n\nThe human output renderer for timeline produces a vertically-oriented, colored timeline in the terminal. This follows the existing pattern from show.rs where human output uses console::style() for coloring.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 3.4 (Human Output Format).\n\n## Approach\n\nCreate `print_timeline()` function in `src/cli/commands/timeline.rs`.\n\n### Output Format (from spec Section 3.4)\n\n```\nlore timeline \"auth migration\"\n\nTimeline: \"auth migration\" (12 events across 4 entities)\n───────────────────────────────────────────────────────\n\n2024-03-15 CREATED #234 Migrate to OAuth2 @alice\n Labels: ~auth, ~breaking-change\n2024-03-18 CREATED !567 feat: add OAuth2 provider @bob\n References: #234\n2024-03-20 NOTE #234 \"Should we support SAML too? I think @charlie\n we should stick with OAuth2 for now...\"\n2024-03-22 LABEL !567 added ~security-review @alice\n2024-03-24 NOTE !567 [src/auth/oauth.rs:45] @dave\n \"Consider refresh token rotation to\n prevent session fixation attacks\"\n2024-03-25 MERGED !567 feat: add OAuth2 provider @alice\n2024-03-26 CLOSED #234 closed by !567 @alice\n2024-03-28 CREATED #299 OAuth2 login fails for SSO users @dave [expanded]\n (via !567, closes)\n\n───────────────────────────────────────────────────────\nSeed entities: #234, !567 | Expanded: #299 (depth 1, via !567)\n```\n\nKey formatting rules from spec:\n- Entity refs use compact notation: `#234` for issues, `!567` for MRs (not `issue #234`)\n- Expanded entities show `[expanded]` inline with provenance `(via !567, closes)` below\n- Evidence notes show quoted text with author\n- Footer shows seed vs expanded entity summary\n\n### Color Scheme\n\n| Event Type | Color | Tag |\n|------------|-------|-----|\n| Created | green | CREATED |\n| StateChanged (close) | yellow | CLOSED |\n| StateChanged (reopen) | green | REOPENED |\n| 
StateChanged (lock) | dim | LOCKED |\n| LabelAdded | cyan | LABEL |\n| LabelRemoved | dim cyan | LABEL |\n| MilestoneSet | magenta | MILESTONE |\n| MilestoneRemoved | dim magenta | MILESTONE |\n| Merged | bright green bold | MERGED |\n| NoteEvidence | white | NOTE |\n| CrossReferenced | blue | XREF |\n\n### Evidence Notes\n\nEvidence notes (NoteEvidence) show the snippet as quoted text per spec Section 3.4:\n```\n2024-03-20 NOTE #234 \"Should we support SAML too?...\" @charlie\n```\n\n### Implementation Pattern\n\nFollow the existing `print_show_issue()` pattern from `src/cli/commands/show.rs`:\n- Use `console::style()` for colors\n- Use fixed-width columns for alignment\n- Truncate long titles to terminal width\n- Use `core::time::ms_to_iso()` for date formatting (date only, not time, for readability)\n\n## Acceptance Criteria\n\n- [ ] Events display in chronological order with date, tag, entity ref, summary, actor\n- [ ] Entity refs use compact notation: `#iid` for issues, `!iid` for MRs (spec Section 3.4)\n- [ ] Each event type has the correct color per the color scheme\n- [ ] Evidence note snippets are displayed as quoted text\n- [ ] Expanded entity events show `[expanded]` marker with `(via !iid, edge_type)` provenance (spec)\n- [ ] Header line shows event count and entity count\n- [ ] Footer shows seed vs expanded entity summary (spec Section 3.4)\n- [ ] Long titles truncate to fit terminal width (use `console::Term::size()`)\n- [ ] Empty results print \"No events found matching '<query>'\"\n- [ ] Function signature: `pub fn print_timeline(result: &TimelineResult)`\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/cli/commands/timeline.rs` (NEW)\n- `src/cli/commands/mod.rs` (re-export print_timeline)\n\n## TDD Loop\n\nRED: No unit tests for display formatting -- this is visual output. 
Verify by:\n- Running `lore timeline \"test\"` against a synced database\n- Checking that colors render correctly in a TTY\n- Checking that output is readable in a non-TTY (no ANSI codes when piped)\n\nGREEN: Implement the column formatting and color logic.\n\nVERIFY: `cargo check --all-targets && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n\n- Terminal width < 80: truncate title column more aggressively\n- Events with no actor: show empty space (not \"None\" or \"unknown\")\n- Very long evidence snippets: already truncated to 200 chars at TimelineEvent level\n- Unicode in titles/actors: use `console::measure_text_width()` for correct column alignment\n- StateChanged event: use state value as tag (CLOSED, REOPENED, MERGED, LOCKED)","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:28.326026Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:10:11.521068Z","compaction_level":0,"original_size":0,"labels":["cli","gate-3","phase-b"],"dependencies":[{"issue_id":"bd-2f2","depends_on_id":"bd-3as","type":"blocks","created_at":"2026-02-02T21:33:37.659719Z","created_by":"tayloreernisse"},{"issue_id":"bd-2f2","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:28.329132Z","created_by":"tayloreernisse"}]}
{"id":"bd-2fc","title":"Update AGENTS.md and CLAUDE.md with Phase B commands","description":"## Background\n\nAfter all Phase B commands are implemented, AGENTS.md and the global CLAUDE.md need to document the new temporal intelligence commands so agents can discover and use them without reading robot-docs.\n\n## Approach\n\n### 1. Update AGENTS.md\n\nAdd a new section after the existing Robot Mode Commands section:\n\n```markdown\n### Temporal Intelligence Commands\n\n\\`\\`\\`bash\n# Timeline: chronological event narrative for a keyword\nlore --robot timeline \"authentication\" --since 30d\nlore --robot timeline \"deployment\" --depth 2 --expand-mentions\n\n# File History: which MRs touched a file, with rename tracking\nlore --robot file-history src/auth/oauth.rs --discussions\nlore --robot file-history src/old_name.rs # follows renames automatically\n\n# Trace: decision chain from file -> MR -> issue -> discussions\nlore --robot trace src/auth/oauth.rs --discussions\nlore --robot trace src/auth/oauth.rs:45 # line hint (Tier 2 warning)\n\n# Count cross-references\nlore --robot count references\n\\`\\`\\`\n```\n\n### 2. Update CLAUDE.md (global)\n\nAdd the same commands to the Gitlore section in ~/.claude/CLAUDE.md, under the existing Commands section.\n\n## Acceptance Criteria\n\n- [ ] AGENTS.md has \"Temporal Intelligence Commands\" section\n- [ ] CLAUDE.md has matching section in the Gitlore block\n- [ ] All command examples are valid and runnable\n- [ ] No stale/outdated references to old command names\n- [ ] Examples cover all flags: --since, --depth, --expand-mentions, --discussions, -n, -p\n\n## Files\n\n- `AGENTS.md` (add section)\n- `~/.claude/CLAUDE.md` (add section in Gitlore block)\n\n## TDD Loop\n\nN/A - documentation only. 
Verify by reading the files after update.\n\n## Edge Cases\n\n- Don't duplicate the full robot-docs manifest; keep it concise\n- Reference robot-docs for the authoritative flag list\n- Mention that timeline requires synced resource events (--no-events disables them)","status":"open","priority":4,"issue_type":"task","created_at":"2026-02-02T22:43:22.090741Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:53:32.480490Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2fc","depends_on_id":"bd-1ht","type":"parent-child","created_at":"2026-02-02T22:43:40.829848Z","created_by":"tayloreernisse"},{"issue_id":"bd-2fc","depends_on_id":"bd-1v8","type":"blocks","created_at":"2026-02-02T22:43:34.047898Z","created_by":"tayloreernisse"}]}
{"id":"bd-2fm","title":"Add GitLab Resource Event serde types","description":"## Background\nNeed Rust types for deserializing GitLab Resource Events API responses. These map directly to the API JSON shape from three endpoints: resource_state_events, resource_label_events, resource_milestone_events.\n\nExisting pattern: types.rs uses #[derive(Debug, Clone, Deserialize)] with Option<T> for nullable fields. GitLabAuthor is already defined (id, username, name). Tests in tests/gitlab_types_tests.rs use serde_json::from_str with sample payloads.\n\n## Approach\nAdd to src/gitlab/types.rs (after existing types):\n\n```rust\n/// Reference to an MR in state event's source_merge_request field\n#[derive(Debug, Clone, Deserialize, Serialize)]\npub struct GitLabMergeRequestRef {\n pub iid: i64,\n pub title: Option<String>,\n pub web_url: Option<String>,\n}\n\n/// Reference to a label in label event's label field\n#[derive(Debug, Clone, Deserialize, Serialize)]\npub struct GitLabLabelRef {\n pub id: i64,\n pub name: String,\n pub color: Option<String>,\n pub description: Option<String>,\n}\n\n/// Reference to a milestone in milestone event's milestone field\n#[derive(Debug, Clone, Deserialize, Serialize)]\npub struct GitLabMilestoneRef {\n pub id: i64,\n pub iid: i64,\n pub title: String,\n}\n\n#[derive(Debug, Clone, Deserialize, Serialize)]\npub struct GitLabStateEvent {\n pub id: i64,\n pub user: Option<GitLabAuthor>,\n pub created_at: String,\n pub resource_type: String, // \"Issue\" | \"MergeRequest\"\n pub resource_id: i64,\n pub state: String, // \"opened\" | \"closed\" | \"reopened\" | \"merged\" | \"locked\"\n pub source_commit: Option<String>,\n pub source_merge_request: Option<GitLabMergeRequestRef>,\n}\n\n#[derive(Debug, Clone, Deserialize, Serialize)]\npub struct GitLabLabelEvent {\n pub id: i64,\n pub user: Option<GitLabAuthor>,\n pub created_at: String,\n pub resource_type: String,\n pub resource_id: i64,\n pub label: GitLabLabelRef,\n pub action: String, // 
\"add\" | \"remove\"\n}\n\n#[derive(Debug, Clone, Deserialize, Serialize)]\npub struct GitLabMilestoneEvent {\n pub id: i64,\n pub user: Option<GitLabAuthor>,\n pub created_at: String,\n pub resource_type: String,\n pub resource_id: i64,\n pub milestone: GitLabMilestoneRef,\n pub action: String, // \"add\" | \"remove\"\n}\n```\n\nAlso export from src/gitlab/mod.rs if needed.\n\n## Acceptance Criteria\n- [ ] All 6 types (3 events + 3 refs) compile\n- [ ] GitLabStateEvent deserializes from real GitLab API JSON (with and without source_merge_request)\n- [ ] GitLabLabelEvent deserializes with nested label object\n- [ ] GitLabMilestoneEvent deserializes with nested milestone object\n- [ ] All Optional fields handle null/missing correctly\n- [ ] Types exported from lore::gitlab::types\n\n## Files\n- src/gitlab/types.rs (add 6 new types)\n- tests/gitlab_types_tests.rs (add deserialization tests)\n\n## TDD Loop\nRED: Add to tests/gitlab_types_tests.rs:\n- `test_deserialize_state_event_closed_by_mr` - JSON with source_merge_request present\n- `test_deserialize_state_event_simple` - JSON with source_merge_request null, user null\n- `test_deserialize_label_event_add` - label add with full label object\n- `test_deserialize_label_event_remove` - label remove\n- `test_deserialize_milestone_event` - milestone add with nested milestone\nImport new types: `use lore::gitlab::types::{GitLabStateEvent, GitLabLabelEvent, GitLabMilestoneEvent, GitLabMergeRequestRef, GitLabLabelRef, GitLabMilestoneRef};`\n\nGREEN: Add the type definitions to types.rs\n\nVERIFY: `cargo test gitlab_types_tests -- --nocapture`\n\n## Edge Cases\n- GitLab sometimes returns user: null for system-generated events (e.g., auto-close on merge) — user must be Option\n- source_merge_request can be null even when state is \"closed\" (manually closed, not by MR)\n- label.color may be null for labels created via API without color\n- The resource_type field uses PascalCase (\"MergeRequest\" not \"merge_request\") — 
don't confuse with DB entity_type","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:24.081234Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:10:20.253407Z","closed_at":"2026-02-03T16:10:20.253344Z","close_reason":"Completed: Added 6 new types (GitLabMergeRequestRef, GitLabLabelRef, GitLabMilestoneRef, GitLabStateEvent, GitLabLabelEvent, GitLabMilestoneEvent) to types.rs with exports and 8 passing tests","compaction_level":0,"original_size":0,"labels":["gate-1","phase-b","types"],"dependencies":[{"issue_id":"bd-2fm","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:24.085809Z","created_by":"tayloreernisse"}]}
{"id":"bd-2fp","title":"Implement discussion document extraction","description":"## Background\nDiscussion documents are the most complex extraction — they involve querying discussions + notes + parent entity (issue or MR) + parent labels + DiffNote file paths. The output includes a threaded conversation format with author/date prefixes per note. System notes (bot-generated) are excluded. DiffNote paths are extracted for the --path search filter.\n\n## Approach\nImplement `extract_discussion_document()` in `src/documents/extractor.rs`:\n\n```rust\n/// Extract a searchable document from a discussion thread.\n/// Returns None if the discussion or its parent has been deleted.\npub fn extract_discussion_document(conn: &Connection, discussion_id: i64) -> Result<Option<DocumentData>>\n```\n\n**SQL queries (from PRD Section 2.2):**\n```sql\n-- Discussion metadata\nSELECT d.id, d.noteable_type, d.issue_id, d.merge_request_id,\n p.path_with_namespace, p.id AS project_id\nFROM discussions d\nJOIN projects p ON p.id = d.project_id\nWHERE d.id = ?\n\n-- Parent entity (conditional on noteable_type)\n-- If Issue: SELECT i.iid, i.title, i.web_url FROM issues i WHERE i.id = ?\n-- If MR: SELECT m.iid, m.title, m.web_url FROM merge_requests m WHERE m.id = ?\n\n-- Parent labels (via issue_labels or mr_labels junction)\n\n-- Non-system notes in thread order\nSELECT n.author_username, n.body, n.created_at, n.gitlab_id,\n n.note_type, n.position_old_path, n.position_new_path\nFROM notes n\nWHERE n.discussion_id = ? AND n.is_system = 0\nORDER BY n.created_at ASC, n.id ASC\n```\n\n**Document format:**\n```\n[[Discussion]] Issue #234: Authentication redesign\nProject: group/project-one\nURL: https://gitlab.example.com/group/project-one/-/issues/234#note_12345\nLabels: [\"bug\", \"auth\"]\nFiles: [\"src/auth/login.ts\"]\n\n--- Thread ---\n\n@johndoe (2024-03-15):\nI think we should move to JWT-based auth...\n\n@janedoe (2024-03-15):\nAgreed. 
What about refresh token strategy?\n```\n\n**Implementation steps:**\n1. Query discussion row — if not found, return Ok(None)\n2. Determine parent type (Issue or MR) from noteable_type\n3. Query parent entity for iid, title, web_url — if not found, return Ok(None)\n4. Query parent labels via appropriate junction table\n5. Query non-system notes ordered by created_at ASC, id ASC\n6. Extract DiffNote paths: collect position_old_path and position_new_path, dedup\n7. Construct URL: `{parent_web_url}#note_{first_note_gitlab_id}`\n8. Format header with [[Discussion]] prefix\n9. Format thread body: `@author (YYYY-MM-DD):\\nbody\\n\\n` per note\n10. Apply discussion truncation via `truncate_discussion()` if needed\n11. Author = first non-system note's author_username\n12. Compute hashes, return DocumentData\n\n## Acceptance Criteria\n- [ ] System notes (is_system=1) excluded from content\n- [ ] DiffNote paths extracted from position_old_path and position_new_path\n- [ ] Paths deduplicated and sorted\n- [ ] URL constructed as `parent_web_url#note_GITLAB_ID`\n- [ ] Header uses parent entity type: \"Issue #N\" or \"MR !N\"\n- [ ] Parent title included in header\n- [ ] Labels come from PARENT entity (not the discussion itself)\n- [ ] First non-system note author used as document author\n- [ ] Thread formatted with `@author (date):` per note\n- [ ] Truncation applied for long threads via truncate_discussion()\n- [ ] `cargo test extract_discussion` passes\n\n## Files\n- `src/documents/extractor.rs` — implement `extract_discussion_document()`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_discussion_document_format` — verify header + thread format\n- `test_discussion_not_found` — returns Ok(None)\n- `test_discussion_parent_deleted` — returns Ok(None) when parent issue/MR missing\n- `test_discussion_system_notes_excluded` — system notes not in content\n- `test_discussion_diffnote_paths` — old_path + new_path extracted and deduped\n- 
`test_discussion_url_construction` — URL has #note_GITLAB_ID anchor\n- `test_discussion_uses_parent_labels` — labels from parent entity, not discussion\nGREEN: Implement extract_discussion_document\nVERIFY: `cargo test extract_discussion`\n\n## Edge Cases\n- Discussion with all system notes: no non-system notes -> return empty thread (or skip document entirely?)\n- Discussion with NULL parent (orphaned): return Ok(None)\n- DiffNote with same old_path and new_path: dedup produces single entry\n- Notes with NULL body: skip or use empty string\n- Discussion on MR: header shows \"MR !N\" (not \"MergeRequest !N\")","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:25:45.549099Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:34:43.597398Z","closed_at":"2026-01-30T17:34:43.597339Z","close_reason":"Implemented extract_discussion_document() with parent entity lookup, DiffNote paths, system note exclusion, URL construction + 9 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2fp","depends_on_id":"bd-18t","type":"blocks","created_at":"2026-01-30T15:29:15.914098Z","created_by":"tayloreernisse"},{"issue_id":"bd-2fp","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:15.847680Z","created_by":"tayloreernisse"},{"issue_id":"bd-2fp","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:15.880008Z","created_by":"tayloreernisse"}]}
{"id":"bd-2h0","title":"[CP1] gi list issues command","description":"List issues from the database.\n\n## Module\nsrc/cli/commands/list.rs\n\n## Clap Definition\nList {\n #[arg(value_parser = [\"issues\", \"mrs\"])]\n entity: String,\n \n #[arg(long, default_value = \"20\")]\n limit: usize,\n \n #[arg(long)]\n project: Option<String>,\n \n #[arg(long, value_parser = [\"opened\", \"closed\", \"all\"])]\n state: Option<String>,\n}\n\n## Output Format\nIssues (showing 20 of 3,801)\n\n #1234 Authentication redesign opened @johndoe 3 days ago\n #1233 Fix memory leak in cache closed @janedoe 5 days ago\n #1232 Add dark mode support opened @bobsmith 1 week ago\n ...\n\n## Implementation\n- Query issues table with filters\n- Join with projects table for display\n- Format updated_at as relative time (\"3 days ago\")\n- Truncate title if too long\n\nFiles: src/cli/commands/list.rs\nDone when: List displays issues with proper filtering and formatting","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:58:23.809829Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.898106Z","deleted_at":"2026-01-25T17:02:01.898102Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2i10","title":"OBSERV: Add log file diagnostics to lore doctor","description":"## Background\nlore doctor is the diagnostic entry point. Adding log file info lets users verify logging is working and check disk usage. The existing DoctorChecks struct (src/cli/commands/doctor.rs:43-51) has checks for config, database, gitlab, projects, ollama.\n\n## Approach\nAdd a new LoggingCheck struct and field to DoctorChecks:\n\n```rust\n#[derive(Debug, Serialize)]\npub struct LoggingCheck {\n pub result: CheckResult,\n pub log_dir: String,\n pub file_count: usize,\n pub total_bytes: u64,\n #[serde(skip_serializing_if = \"Option::is_none\")]\n pub oldest_file: Option<String>,\n}\n```\n\nAdd to DoctorChecks (src/cli/commands/doctor.rs:43-51):\n```rust\npub logging: LoggingCheck,\n```\n\nImplement check_logging() function:\n```rust\nfn check_logging() -> LoggingCheck {\n let log_dir = get_log_dir(None); // TODO: accept config override\n let mut file_count = 0;\n let mut total_bytes = 0u64;\n let mut oldest: Option<String> = None;\n\n if let Ok(entries) = std::fs::read_dir(&log_dir) {\n for entry in entries.flatten() {\n let name = entry.file_name().to_string_lossy().to_string();\n if name.starts_with(\"lore.\") && name.ends_with(\".log\") {\n file_count += 1;\n if let Ok(meta) = entry.metadata() {\n total_bytes += meta.len();\n }\n if oldest.as_ref().map_or(true, |o| name < *o) {\n oldest = Some(name);\n }\n }\n }\n }\n\n LoggingCheck {\n result: CheckResult { status: CheckStatus::Ok, message: None },\n log_dir: log_dir.display().to_string(),\n file_count,\n total_bytes,\n oldest_file: oldest,\n }\n}\n```\n\nCall from run_doctor() (src/cli/commands/doctor.rs:91-126) and add to DoctorChecks construction.\n\nFor interactive output in print_doctor_results(), add a section:\n```\nLogging\n Log directory: ~/.local/share/lore/logs/\n Log files: 7 (2.3 MB)\n Oldest: lore.2026-01-28.log\n```\n\n## Acceptance Criteria\n- [ ] lore doctor shows log directory path, file count, 
total size\n- [ ] lore --robot doctor JSON includes logging field with log_dir, file_count, total_bytes, oldest_file\n- [ ] When no log files exist: file_count=0, total_bytes=0, oldest_file=null\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/doctor.rs (add LoggingCheck struct, check_logging fn, wire into DoctorChecks)\n\n## TDD Loop\nRED: test_check_logging_with_files, test_check_logging_empty_dir\nGREEN: Implement LoggingCheck struct and check_logging function\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Log directory doesn't exist yet (first run before any sync): report file_count=0, status Ok\n- Permission errors on read_dir: report status Warning with message","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.682986Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:15:04.520915Z","closed_at":"2026-02-04T17:15:04.520868Z","close_reason":"Added LoggingCheck to DoctorChecks with log_dir, file_count, total_bytes; shows in both interactive and robot output","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-2i10","depends_on_id":"bd-1k4","type":"blocks","created_at":"2026-02-04T15:55:19.686771Z","created_by":"tayloreernisse"},{"issue_id":"bd-2i10","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.683866Z","created_by":"tayloreernisse"}]}
{"id":"bd-2iq","title":"[CP1] Database migration 002_issues.sql","description":"## Background\n\nThe 002_issues.sql migration creates tables for issues, labels, issue_labels, discussions, and notes. This is the data foundation for Checkpoint 1, enabling issue ingestion with cursor-based sync, label tracking, and discussion storage.\n\n## Approach\n\nCreate `migrations/002_issues.sql` with complete SQL statements.\n\n### Full Migration SQL\n\n```sql\n-- Migration 002: Issue Ingestion Tables\n-- Applies on top of 001_initial.sql\n\n-- Issues table\nCREATE TABLE issues (\n id INTEGER PRIMARY KEY,\n gitlab_id INTEGER UNIQUE NOT NULL,\n project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,\n iid INTEGER NOT NULL,\n title TEXT,\n description TEXT,\n state TEXT NOT NULL CHECK (state IN ('opened', 'closed')),\n author_username TEXT,\n created_at INTEGER NOT NULL, -- ms epoch UTC\n updated_at INTEGER NOT NULL, -- ms epoch UTC\n last_seen_at INTEGER NOT NULL, -- updated on every upsert\n discussions_synced_for_updated_at INTEGER, -- watermark for dependent sync\n web_url TEXT,\n raw_payload_id INTEGER REFERENCES raw_payloads(id)\n);\n\nCREATE INDEX idx_issues_project_updated ON issues(project_id, updated_at);\nCREATE INDEX idx_issues_author ON issues(author_username);\nCREATE UNIQUE INDEX uq_issues_project_iid ON issues(project_id, iid);\n\n-- Labels table (name-only for CP1)\nCREATE TABLE labels (\n id INTEGER PRIMARY KEY,\n gitlab_id INTEGER, -- optional, for future Labels API\n project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,\n name TEXT NOT NULL,\n color TEXT,\n description TEXT\n);\n\nCREATE UNIQUE INDEX uq_labels_project_name ON labels(project_id, name);\nCREATE INDEX idx_labels_name ON labels(name);\n\n-- Issue-label junction (DELETE before INSERT for stale removal)\nCREATE TABLE issue_labels (\n issue_id INTEGER NOT NULL REFERENCES issues(id) ON DELETE CASCADE,\n label_id INTEGER NOT NULL REFERENCES labels(id) ON DELETE 
CASCADE,\n PRIMARY KEY(issue_id, label_id)\n);\n\nCREATE INDEX idx_issue_labels_label ON issue_labels(label_id);\n\n-- Discussion threads for issues (MR discussions added in CP2)\nCREATE TABLE discussions (\n id INTEGER PRIMARY KEY,\n gitlab_discussion_id TEXT NOT NULL, -- GitLab string ID (e.g., \"6a9c1750b37d...\")\n project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,\n issue_id INTEGER REFERENCES issues(id) ON DELETE CASCADE,\n merge_request_id INTEGER, -- FK added in CP2 via ALTER TABLE\n noteable_type TEXT NOT NULL CHECK (noteable_type IN ('Issue', 'MergeRequest')),\n individual_note INTEGER NOT NULL DEFAULT 0, -- 0=threaded, 1=standalone\n first_note_at INTEGER, -- min(note.created_at) for ordering\n last_note_at INTEGER, -- max(note.created_at) for \"recently active\"\n last_seen_at INTEGER NOT NULL, -- updated on every upsert\n resolvable INTEGER NOT NULL DEFAULT 0, -- MR discussions can be resolved\n resolved INTEGER NOT NULL DEFAULT 0,\n CHECK (\n (noteable_type = 'Issue' AND issue_id IS NOT NULL AND merge_request_id IS NULL) OR\n (noteable_type = 'MergeRequest' AND merge_request_id IS NOT NULL AND issue_id IS NULL)\n )\n);\n\nCREATE UNIQUE INDEX uq_discussions_project_discussion_id ON discussions(project_id, gitlab_discussion_id);\nCREATE INDEX idx_discussions_issue ON discussions(issue_id);\nCREATE INDEX idx_discussions_mr ON discussions(merge_request_id);\nCREATE INDEX idx_discussions_last_note ON discussions(last_note_at);\n\n-- Notes belong to discussions\nCREATE TABLE notes (\n id INTEGER PRIMARY KEY,\n gitlab_id INTEGER UNIQUE NOT NULL,\n discussion_id INTEGER NOT NULL REFERENCES discussions(id) ON DELETE CASCADE,\n project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,\n note_type TEXT, -- 'DiscussionNote' | 'DiffNote' | null\n is_system INTEGER NOT NULL DEFAULT 0, -- 1 for system-generated notes\n author_username TEXT,\n body TEXT,\n created_at INTEGER NOT NULL, -- ms epoch\n updated_at INTEGER NOT NULL, -- ms 
epoch\n last_seen_at INTEGER NOT NULL, -- updated on every upsert\n position INTEGER, -- 0-indexed array order from API\n resolvable INTEGER NOT NULL DEFAULT 0,\n resolved INTEGER NOT NULL DEFAULT 0,\n resolved_by TEXT,\n resolved_at INTEGER,\n -- DiffNote position metadata (populated for MR DiffNotes in CP2)\n position_old_path TEXT,\n position_new_path TEXT,\n position_old_line INTEGER,\n position_new_line INTEGER,\n raw_payload_id INTEGER REFERENCES raw_payloads(id)\n);\n\nCREATE INDEX idx_notes_discussion ON notes(discussion_id);\nCREATE INDEX idx_notes_author ON notes(author_username);\nCREATE INDEX idx_notes_system ON notes(is_system);\n\n-- Update schema version\nINSERT INTO schema_version (version, applied_at, description)\nVALUES (2, strftime('%s', 'now') * 1000, 'Issue ingestion tables');\n```\n\n## Acceptance Criteria\n\n- [ ] Migration file exists at `migrations/002_issues.sql`\n- [ ] All tables created: issues, labels, issue_labels, discussions, notes\n- [ ] All indexes created as specified\n- [ ] CHECK constraints on state and noteable_type work correctly\n- [ ] CASCADE deletes work (project deletion cascades)\n- [ ] Migration applies cleanly on fresh DB after 001_initial.sql\n- [ ] schema_version updated to 2 after migration\n- [ ] `gi doctor` shows schema_version = 2\n\n## Files\n\n- migrations/002_issues.sql (create)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/migration_tests.rs\n#[test] fn migration_002_creates_issues_table()\n#[test] fn migration_002_creates_labels_table()\n#[test] fn migration_002_creates_discussions_table()\n#[test] fn migration_002_creates_notes_table()\n#[test] fn migration_002_enforces_state_check()\n#[test] fn migration_002_enforces_noteable_type_check()\n#[test] fn migration_002_cascades_on_project_delete()\n```\n\nGREEN: Create migration file with all SQL\n\nVERIFY:\n```bash\n# Apply migrations to a fresh test DB (both must target the same file,\n# since each sqlite3 invocation on :memory: is a separate throwaway DB)\nrm -f test.db\nsqlite3 test.db < migrations/001_initial.sql\nsqlite3 test.db < migrations/002_issues.sql\n\n# Verify 
schema_version\nsqlite3 test.db \"SELECT version FROM schema_version ORDER BY version DESC LIMIT 1\"\n# Expected: 2\n\ncargo test migration_002\n```\n\n## Edge Cases\n\n- Applying twice - should fail on UNIQUE constraint (idempotency via version check)\n- Missing 001 - foreign key to projects fails\n- Long label names - TEXT handles any length\n- NULL description - allowed by schema\n- Empty discussions_synced_for_updated_at - NULL means never synced","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.128594Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:25:10.309900Z","closed_at":"2026-01-25T22:25:10.309852Z","close_reason":"Created 002_issues.sql with issues/labels/issue_labels/discussions/notes tables, 8 passing tests verify schema, constraints, and cascades","compaction_level":0,"original_size":0}
{"id":"bd-2ms","title":"[CP1] Unit tests for transformers","description":"Comprehensive unit tests for issue and discussion transformers.\n\n## Issue Transformer Tests (tests/issue_transformer_tests.rs)\n\n- transforms_gitlab_issue_to_normalized_schema\n- extracts_labels_from_issue_payload\n- handles_missing_optional_fields_gracefully\n- converts_iso_timestamps_to_ms_epoch\n- sets_last_seen_at_to_current_time\n\n## Discussion Transformer Tests (tests/discussion_transformer_tests.rs)\n\n- transforms_discussion_payload_to_normalized_schema\n- extracts_notes_array_from_discussion\n- sets_individual_note_flag_correctly\n- flags_system_notes_with_is_system_true\n- preserves_note_order_via_position_field\n- computes_first_note_at_and_last_note_at_correctly\n- computes_resolvable_and_resolved_status\n\n## Test Setup\n- Load from test fixtures\n- Use serde_json for deserialization\n- Compare against expected NormalizedX structs\n\nFiles: tests/issue_transformer_tests.rs, tests/discussion_transformer_tests.rs\nDone when: All transformer unit tests pass","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:04.165187Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.015847Z","deleted_at":"2026-01-25T17:02:02.015841Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2mz","title":"Epic: Gate A - Lexical MVP","description":"## Background\nGate A delivers the lexical search MVP — the foundation that works without sqlite-vec or Ollama. It introduces the document layer (documents, document_labels, document_paths), FTS5 indexing, search filters, and the search + stats + generate-docs CLI commands. Gate A is independently shippable — users get working search with FTS5 only.\n\n## Gate A Deliverables\n1. Document generation from issues/MRs/discussions with FTS5 indexing\n2. Lexical search + filters + snippets + lore stats\n\n## Bead Dependencies (execution order)\n1. **bd-3lc** — Rename GiError to LoreError (no deps, enables all subsequent work)\n2. **bd-hrs** — Migration 007 (blocked by bd-3lc)\n3. **bd-221** — Migration 008 FTS5 (blocked by bd-hrs)\n4. **bd-36p** — Document types + extractor module (blocked by bd-3lc)\n5. **bd-18t** — Truncation logic (blocked by bd-36p)\n6. **bd-247** — Issue extraction (blocked by bd-36p, bd-hrs)\n7. **bd-1yz** — MR extraction (blocked by bd-36p, bd-hrs)\n8. **bd-2fp** — Discussion extraction (blocked by bd-36p, bd-hrs, bd-18t)\n9. **bd-1u1** — Document regenerator (blocked by bd-36p, bd-38q, bd-hrs)\n10. **bd-1k1** — FTS5 search (blocked by bd-221)\n11. **bd-3q2** — Search filters (blocked by bd-36p)\n12. **bd-3lu** — Search CLI (blocked by bd-1k1, bd-3q2, bd-36p)\n13. **bd-3qs** — Generate-docs CLI (blocked by bd-1u1, bd-3lu)\n14. **bd-pr1** — Stats CLI (blocked by bd-hrs)\n15. 
**bd-2dk** — Project resolution (blocked by bd-3lc)\n\n## Acceptance Criteria\n- [ ] `lore search \"query\"` returns FTS5 results with snippets\n- [ ] `lore search --type issue --label bug \"query\"` filters correctly\n- [ ] `lore generate-docs` creates documents from all entities\n- [ ] `lore generate-docs --full` regenerates everything\n- [ ] `lore stats` shows document/FTS/queue counts\n- [ ] `lore stats --check` verifies FTS consistency\n- [ ] No sqlite-vec dependency in Gate A","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-30T15:25:09.721108Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:54:44.243610Z","closed_at":"2026-01-30T17:54:44.243562Z","close_reason":"All Gate A sub-beads complete. Lexical MVP delivered: document extraction (issue/MR/discussion), FTS5 indexing, search with filters/snippets/RRF, generate-docs CLI, stats CLI with integrity check/repair.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2mz","depends_on_id":"bd-3lu","type":"blocks","created_at":"2026-01-30T15:29:35.679499Z","created_by":"tayloreernisse"},{"issue_id":"bd-2mz","depends_on_id":"bd-3qs","type":"blocks","created_at":"2026-01-30T15:29:35.713718Z","created_by":"tayloreernisse"},{"issue_id":"bd-2mz","depends_on_id":"bd-pr1","type":"blocks","created_at":"2026-01-30T15:29:35.747904Z","created_by":"tayloreernisse"}]}
{"id":"bd-2n4","title":"Implement trace query: file -> MR -> issue -> discussion chain","description":"## Background\n\nThe trace query is the core logic for Gate 5's trace command. It builds a chain from file path -> MRs -> issues -> discussions, combining data from mr_file_changes (Gate 4), entity_references (Gate 2), and the existing discussions/notes tables. This is the backend that the trace CLI command calls.\n\n## Approach\n\nCreate `src/core/trace.rs` with:\n\n```rust\nuse rusqlite::Connection;\nuse crate::core::file_history::resolve_rename_chain;\n\n#[derive(Debug, Clone, Serialize)]\npub struct TraceChain {\n pub merge_request: TraceMr,\n pub issues: Vec<TraceIssue>,\n pub discussions: Vec<TraceDiscussion>,\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct TraceMr {\n pub iid: i64,\n pub title: String,\n pub state: String,\n pub author_username: String,\n pub web_url: Option<String>,\n pub merged_at: Option<i64>, // ms epoch UTC\n pub file_change_type: String, // \"added\", \"modified\", \"renamed\", \"deleted\"\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct TraceIssue {\n pub iid: i64,\n pub title: String,\n pub state: String,\n pub web_url: Option<String>,\n pub reference_type: String, // \"closes\", \"mentioned\", \"related\"\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct TraceDiscussion {\n pub author_username: String,\n pub body_snippet: String, // truncated to 500 chars\n pub created_at: i64,\n pub is_diff_note: bool, // true if DiffNote on the traced file\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct TraceResult {\n pub path: String,\n pub resolved_paths: Vec<String>, // all rename chain paths\n pub chains: Vec<TraceChain>,\n}\n\npub fn run_trace(\n conn: &Connection,\n project_id: i64,\n path: &str,\n include_discussions: bool,\n limit: usize,\n) -> Result<TraceResult> {\n // 1. Resolve rename chain via file_history::resolve_rename_chain()\n // 2. Find MRs that touched any of the resolved paths\n // 3. 
For each MR, find linked issues via entity_references\n // 4. If include_discussions, fetch DiffNote discussions on the traced file\n // 5. Order chains by MR merge date (newest first)\n // 6. Apply limit\n}\n```\n\n### SQL for Step 2 (find MRs):\n\n```sql\nSELECT DISTINCT mr.id, mr.iid, mr.title, mr.state, mr.author_username,\n mr.web_url, mr.updated_at, mfc.change_type\nFROM mr_file_changes mfc\nJOIN merge_requests mr ON mr.id = mfc.merge_request_id\nWHERE mfc.project_id = ?1\n AND mfc.new_path IN (rarray(?2))\nORDER BY mr.updated_at DESC\nLIMIT ?3\n```\n\nNote: Use rusqlite's `vtab::array` for IN-clause with dynamic list, or generate placeholders.\n\n### SQL for Step 3 (find linked issues):\n\n```sql\nSELECT i.iid, i.title, i.state, i.web_url, er.reference_type\nFROM entity_references er\nJOIN issues i ON i.id = er.target_entity_id\nWHERE er.source_entity_type = 'merge_request'\n AND er.source_entity_id = ?1\n AND er.target_entity_type = 'issue'\n```\n\n### SQL for Step 4 (DiffNote discussions):\n\n```sql\nSELECT n.author_username, n.body, n.created_at, n.is_system,\n n.position_new_path\nFROM notes n\nJOIN discussions d ON d.id = n.discussion_id\nWHERE d.merge_request_id = ?1\n AND n.position_new_path IN (rarray(?2))\n AND n.is_system = 0\nORDER BY n.created_at ASC\n```\n\nRegister in `src/core/mod.rs`: `pub mod trace;`\n\n## Acceptance Criteria\n\n- [ ] `run_trace()` returns chains ordered by MR merge date (newest first)\n- [ ] Rename-aware: uses all paths from resolve_rename_chain\n- [ ] Finds issues linked via entity_references (closes, mentioned, related)\n- [ ] DiffNote discussions correctly filtered to the traced file's paths\n- [ ] Discussion body snippets truncated to 500 chars\n- [ ] Empty result (file not found in any MR) returns TraceResult with empty chains\n- [ ] Limit applies after full chain construction\n- [ ] Module registered in `src/core/mod.rs`\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` 
passes\n\n## Files\n\n- `src/core/trace.rs` (NEW)\n- `src/core/mod.rs` (add `pub mod trace;`)\n\n## TDD Loop\n\nRED: Create `src/core/trace.rs` with `#[cfg(test)] mod tests`:\n- `test_trace_empty_file` - unknown file returns empty chains\n- `test_trace_finds_mr` - file in mr_file_changes returns chain with MR\n- `test_trace_follows_renames` - renamed file still finds historical MRs\n- `test_trace_links_issues` - MR with entity_references shows linked issues\n- `test_trace_limits_chains` - limit=1 returns at most 1 chain\n\nTests need in-memory DB with migrations 001-015 applied, plus test fixtures.\n\nGREEN: Implement the SQL queries and chain assembly.\n\nVERIFY: `cargo test --lib -- trace`\n\n## Edge Cases\n\n- MR with no linked issues: chain has empty issues vec (not an error)\n- MR with multiple issues: all issues included in the chain\n- Same issue linked from multiple MRs: appears in each MR's chain independently\n- DiffNote on old_path (before rename): captured because we query all resolved paths\n- include_discussions=false: skip the DiffNote query entirely for performance","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:32.738743Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:52:34.118325Z","compaction_level":0,"original_size":0,"labels":["gate-5","phase-b","query"],"dependencies":[{"issue_id":"bd-2n4","depends_on_id":"bd-1ht","type":"parent-child","created_at":"2026-02-02T21:34:32.743943Z","created_by":"tayloreernisse"},{"issue_id":"bd-2n4","depends_on_id":"bd-3ia","type":"blocks","created_at":"2026-02-02T21:34:37.899870Z","created_by":"tayloreernisse"},{"issue_id":"bd-2n4","depends_on_id":"bd-z94","type":"blocks","created_at":"2026-02-02T21:34:37.854791Z","created_by":"tayloreernisse"}]}
{"id":"bd-2nb","title":"[CP1] Issue ingestion module","description":"Fetch and store issues with cursor-based incremental sync.\n\nImplement ingestIssues(options) → { fetched, upserted, labelsCreated }\n\nLogic:\n1. Get current cursor from sync_cursors\n2. Paginate through issues updated after cursor\n3. Apply local filtering for tuple cursor semantics\n4. For each issue:\n - Store raw payload (compressed)\n - Upsert issue record\n - Extract and upsert labels\n - Link issue to labels via junction\n5. Update cursor after each page commit\n\nFiles: src/ingestion/issues.ts\nTests: tests/integration/issue-ingestion.test.ts\nDone when: Issues, labels, issue_labels populated correctly with resumable cursor","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:50.701180Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.154318Z","deleted_at":"2026-01-25T15:21:35.154316Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2ni","title":"OBSERV Epic: Phase 2 - Spans + Correlation IDs","description":"Add tracing spans to all sync stages and generate UUID-based run_id for correlation. Every log line within a sync run includes run_id in JSON span context. Nested spans produce correct parent-child chains.\n\nDepends on: Phase 1 (subscriber must support span recording)\nUnblocks: Phase 3 (metrics), Phase 5 (rate limit logging)\n\nFiles: src/cli/commands/sync.rs, src/cli/commands/ingest.rs, src/ingestion/orchestrator.rs, src/documents/regenerator.rs, src/embedding/pipeline.rs, src/main.rs\n\nAcceptance criteria (PRD Section 6.2):\n- Every log line includes run_id in JSON span context\n- Nested spans produce chain: fetch_pages includes parent ingest_issues span\n- run_id is 8-char hex (truncated UUIDv4)\n- Spans visible in -vv stderr output","status":"closed","priority":1,"issue_type":"epic","created_at":"2026-02-04T15:53:08.935218Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:19:38.721297Z","closed_at":"2026-02-04T17:19:38.721241Z","close_reason":"Phase 2 complete: run_id correlation IDs generated at sync/ingest entry, root spans with .instrument() for async, #[instrument] on 5 key pipeline functions","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-2ni","depends_on_id":"bd-2nx","type":"blocks","created_at":"2026-02-04T15:55:19.044453Z","created_by":"tayloreernisse"}]}
{"id":"bd-2no","title":"Write integration tests","description":"## Background\nIntegration tests verify that modules work together with a real SQLite database. They test FTS search (stemming, empty results), embedding storage (sqlite-vec ops), hybrid search (combined retrieval), and sync orchestration (full pipeline). Each test creates a fresh in-memory DB with migrations applied.\n\n## Approach\nCreate integration test files in `tests/`:\n\n**1. tests/fts_search.rs:**\n- Create DB, apply migrations 001-008\n- Insert test documents via SQL\n- Verify FTS5 triggers fired (documents_fts has matching count)\n- Search with various queries: stemming, prefix, empty, special chars\n- Verify result ranking (BM25 ordering)\n- Verify snippet generation\n\n**2. tests/embedding.rs:**\n- Create DB, apply migrations 001-009 (requires sqlite-vec)\n- Insert test documents + embeddings with known vectors\n- Verify KNN search returns nearest neighbors\n- Verify chunk deduplication\n- Verify orphan cleanup trigger (delete document -> embeddings gone)\n\n**3. tests/hybrid_search.rs:**\n- Create DB, apply all migrations\n- Insert documents + embeddings\n- Test all three modes: lexical, semantic, hybrid\n- Verify RRF ranking produces expected order\n- Test graceful degradation (no embeddings -> FTS fallback)\n- Test adaptive recall with filters\n\n**4. tests/sync.rs:**\n- Test sync orchestration with mock/stub GitLab responses\n- Verify pipeline stages execute in order\n- Verify lock acquisition/release\n- Verify --no-embed and --no-docs flags\n\n**Test fixtures:**\n- Deterministic embedding vectors (no Ollama required): e.g., [1.0, 0.0, 0.0, ...] for doc1, [0.0, 1.0, 0.0, ...] 
for doc2\n- Known documents with predictable search results\n- Fixed timestamps for reproducibility\n\n## Acceptance Criteria\n- [ ] FTS search tests pass (stemming, prefix, empty, special chars)\n- [ ] Embedding tests pass (KNN, dedup, orphan cleanup)\n- [ ] Hybrid search tests pass (all 3 modes, graceful degradation)\n- [ ] Sync tests pass (pipeline orchestration)\n- [ ] All tests use in-memory DB (no file I/O)\n- [ ] No external dependencies (no Ollama, no GitLab) — use fixtures/stubs\n- [ ] `cargo test --test fts_search --test embedding --test hybrid_search --test sync` passes\n\n## Files\n- `tests/fts_search.rs` — new file\n- `tests/embedding.rs` — new file\n- `tests/hybrid_search.rs` — new file\n- `tests/sync.rs` — new file\n- `tests/fixtures/` — optional: test helper functions (shared DB setup)\n\n## TDD Loop\nThese ARE integration tests — they verify the combined behavior of multiple beads.\nVERIFY: `cargo test --test fts_search && cargo test --test embedding && cargo test --test hybrid_search && cargo test --test sync`\n\n## Edge Cases\n- sqlite-vec not available: embedding tests should skip gracefully (or require feature flag)\n- In-memory DB with WAL mode: may behave differently than file DB — test both if critical\n- Concurrent test execution: each test creates its own DB (no shared state)","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:27:21.751019Z","created_by":"tayloreernisse","updated_at":"2026-01-30T18:11:12.432092Z","closed_at":"2026-01-30T18:11:12.432036Z","close_reason":"Integration tests: 10 FTS search tests (stemming, empty, special chars, ordering, triggers, null title), 5 embedding tests (KNN, limit, dedup, orphan trigger, empty DB), 6 hybrid search tests (lexical mode, FTS-only, graceful degradation, RRF ranking, filters, mode variants). 
310 total tests pass.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2no","depends_on_id":"bd-1x6","type":"blocks","created_at":"2026-01-30T15:29:35.607603Z","created_by":"tayloreernisse"},{"issue_id":"bd-2no","depends_on_id":"bd-3eu","type":"blocks","created_at":"2026-01-30T15:29:35.572825Z","created_by":"tayloreernisse"},{"issue_id":"bd-2no","depends_on_id":"bd-3lu","type":"blocks","created_at":"2026-01-30T15:29:35.499831Z","created_by":"tayloreernisse"},{"issue_id":"bd-2no","depends_on_id":"bd-am7","type":"blocks","created_at":"2026-01-30T15:29:35.535320Z","created_by":"tayloreernisse"}]}
{"id":"bd-2nx","title":"OBSERV Epic: Phase 1 - Verbosity Flags + Structured File Logging","description":"Foundation layer for observability. Add -v/-vv/-vvv CLI flags, dual-layer tracing subscriber (stderr + file), daily log rotation via tracing-appender, log retention cleanup, --log-format json flag, and LoggingConfig.\n\nDepends on: nothing (first phase)\nUnblocks: Phase 2, and transitively all other phases\n\nFiles: Cargo.toml, src/cli/mod.rs, src/main.rs, src/core/config.rs, src/core/paths.rs, src/cli/commands/doctor.rs\n\nAcceptance criteria (PRD Section 6.1):\n- JSON log files written to ~/.local/share/lore/logs/ with zero config\n- -v/-vv/-vvv control stderr verbosity per table in PRD 4.3\n- RUST_LOG overrides -v for both layers\n- --log-format json emits JSON on stderr\n- Daily rotation, retention cleanup on startup\n- --quiet suppresses stderr, does NOT affect file layer\n- lore doctor reports log directory info","status":"closed","priority":1,"issue_type":"epic","created_at":"2026-02-04T15:53:00.987774Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:15:09.465732Z","closed_at":"2026-02-04T17:15:09.465684Z","close_reason":"Phase 1 complete: dual-layer subscriber, -v/--verbose flags, --log-format json, LoggingConfig, get_log_dir(), log retention, doctor diagnostics","compaction_level":0,"original_size":0,"labels":["observability"]}
{"id":"bd-2px","title":"[CP1] Epic: Issue Ingestion","description":"Ingest all issues, labels, and issue discussions from configured GitLab repositories with resumable cursor-based incremental sync. This establishes the core data ingestion pattern reused for MRs in CP2.\n\n## Success Criteria\n- gi ingest --type=issues fetches all issues (count matches GitLab UI)\n- Labels extracted from issue payloads (name-only)\n- Label linkage reflects current GitLab state (removed labels unlinked on re-sync)\n- Issue discussions fetched per-issue (dependent sync)\n- Cursor-based sync is resumable (re-running fetches 0 new items)\n- Discussion sync skips unchanged issues (per-issue watermark)\n- Sync tracking records all runs\n- Single-flight lock prevents concurrent runs\n\n## Internal Gates\n- Gate A: Issues only (cursor + upsert + raw payloads + list/count/show)\n- Gate B: Labels correct (stale-link removal verified)\n- Gate C: Dependent discussion sync (watermark prevents redundant refetch)\n- Gate D: Resumability proof (kill mid-run, rerun; bounded redo)\n\nReference: docs/prd/checkpoint-1.md","status":"tombstone","priority":1,"issue_type":"epic","created_at":"2026-01-25T15:42:13.167698Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.638609Z","deleted_at":"2026-01-25T17:02:01.638606Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"epic","compaction_level":0,"original_size":0}
{"id":"bd-2rr","title":"OBSERV: Replace subscriber init with dual-layer setup","description":"## Background\nThis is the core infrastructure bead for Phase 1. It replaces the single-layer subscriber (src/main.rs:44-58) with a dual-layer registry that separates stderr and file concerns. The file layer provides always-on post-mortem data; the stderr layer respects -v flags.\n\n## Approach\nReplace src/main.rs lines 44-58 with a function (e.g., init_tracing()) that:\n\n1. Build stderr filter from -v count (or RUST_LOG override):\n```rust\nfn build_stderr_filter(verbose: u8, quiet: bool) -> EnvFilter {\n if let Ok(rust_log) = std::env::var(\"RUST_LOG\") {\n return EnvFilter::new(rust_log);\n }\n if quiet {\n return EnvFilter::new(\"lore=warn,error\");\n }\n match verbose {\n 0 => EnvFilter::new(\"lore=info,warn\"),\n 1 => EnvFilter::new(\"lore=debug,warn\"),\n 2 => EnvFilter::new(\"lore=debug,info\"),\n _ => EnvFilter::new(\"trace,debug\"),\n }\n}\n```\n\n2. Build file filter (always lore=debug,warn unless RUST_LOG set):\n```rust\nfn build_file_filter() -> EnvFilter {\n if let Ok(rust_log) = std::env::var(\"RUST_LOG\") {\n return EnvFilter::new(rust_log);\n }\n EnvFilter::new(\"lore=debug,warn\")\n}\n```\n\n3. Assemble the registry:\n```rust\nlet stderr_layer = fmt::layer()\n .with_target(false)\n .with_writer(SuspendingWriter);\n// Conditionally add .json() based on log_format\n\nlet file_appender = tracing_appender::rolling::daily(log_dir, \"lore\");\nlet (non_blocking, _guard) = tracing_appender::non_blocking(file_appender);\nlet file_layer = fmt::layer()\n .json()\n .with_writer(non_blocking);\n\ntracing_subscriber::registry()\n .with(stderr_layer.with_filter(build_stderr_filter(cli.verbose, cli.quiet)))\n .with(file_layer.with_filter(build_file_filter()))\n .init();\n```\n\nCRITICAL: The non_blocking _guard must be held for the program's lifetime. Store it in main() scope, NOT in the init function. 
If the guard drops, the file writer thread stops and buffered logs are lost.\n\nCRITICAL: Per-layer filtering requires each .with_filter() to produce a Filtered<L, F, S> type. The two layers will have different concrete types (one with json, one without). This is fine -- the registry accepts heterogeneous layers via .with().\n\nWhen --log-format json: wrap stderr_layer with .json() too. This requires conditional construction. Two approaches:\n A) Use Box<dyn Layer<S>> for dynamic dispatch (simpler, tiny perf hit)\n B) Use an enum wrapper (zero cost but more code)\nRecommend approach A for simplicity. The overhead is one vtable indirection per log event, dwarfed by I/O.\n\nWhen file_logging is false (LoggingConfig.file_logging == false): skip adding the file layer entirely.\n\n## Acceptance Criteria\n- [ ] lore sync writes JSON log lines to ~/.local/share/lore/logs/lore.YYYY-MM-DD.log\n- [ ] lore -v sync shows DEBUG lore::* on stderr, deps at WARN\n- [ ] lore -vv sync shows DEBUG lore::* + INFO deps on stderr\n- [ ] lore -vvv sync shows TRACE everything on stderr\n- [ ] RUST_LOG=lore::gitlab=trace overrides -v for both layers\n- [ ] lore --log-format json sync emits JSON on stderr\n- [ ] -q + -v: -q wins (stderr at WARN+)\n- [ ] -q does NOT affect file layer (still DEBUG+)\n- [ ] File layer does NOT use SuspendingWriter\n- [ ] Non-blocking guard kept alive for program duration\n- [ ] Existing behavior unchanged when no new flags passed\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/main.rs (replace lines 44-58, add init_tracing function or inline)\n\n## TDD Loop\nRED:\n - test_verbosity_filter_construction: assert filter directives for verbose=0,1,2,3\n - test_rust_log_overrides_verbose: set env, assert TRACE not DEBUG\n - test_quiet_overrides_verbose: -q + -v => WARN+\n - test_json_log_output_format: capture file output, parse as JSON\n - test_suspending_writer_dual_layer: no garbled stderr with progress bars\nGREEN: Implement 
build_stderr_filter, build_file_filter, assemble registry\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- _guard lifetime: if guard is dropped early, buffered log lines are lost. MUST hold in main() scope.\n- Type erasure: stderr layer with/without .json() produces different types. Use Box<dyn Layer<_>> or separate init paths.\n- Empty RUST_LOG string: env::var returns Ok(\"\"), which EnvFilter::new(\"\") defaults to TRACE. May want to check is_empty().\n- File I/O error on log dir: tracing-appender handles this gracefully (no panic), but logs will be silently lost. The doctor command (bd-2i10) can diagnose this.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.577025Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:15:04.384114Z","closed_at":"2026-02-04T17:15:04.384062Z","close_reason":"Replaced single-layer subscriber with dual-layer setup: stderr (human/json, -v controlled) + file (always-on JSON, daily rotation via tracing-appender)","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-2rr","depends_on_id":"bd-17n","type":"blocks","created_at":"2026-02-04T15:55:19.397949Z","created_by":"tayloreernisse"},{"issue_id":"bd-2rr","depends_on_id":"bd-1k4","type":"blocks","created_at":"2026-02-04T15:55:19.461728Z","created_by":"tayloreernisse"},{"issue_id":"bd-2rr","depends_on_id":"bd-1o1","type":"blocks","created_at":"2026-02-04T15:55:19.327157Z","created_by":"tayloreernisse"},{"issue_id":"bd-2rr","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.577882Z","created_by":"tayloreernisse"},{"issue_id":"bd-2rr","depends_on_id":"bd-gba","type":"blocks","created_at":"2026-02-04T15:55:19.262870Z","created_by":"tayloreernisse"}]}
{"id":"bd-2sx","title":"Implement lore embed CLI command","description":"## Background\nThe embed CLI command is the user-facing wrapper for the embedding pipeline. It runs Ollama health checks, selects documents to embed (pending or failed), shows progress, and reports results. This is the standalone command for building embeddings outside of the sync orchestrator.\n\n## Approach\nCreate `src/cli/commands/embed.rs` per PRD Section 4.4.\n\n**IMPORTANT: The embed command is async.** The underlying `embed_documents()` function is `async fn` (uses `FuturesUnordered` for concurrent HTTP to Ollama). The CLI runner must use tokio runtime.\n\n**Core function (async):**\n```rust\npub async fn run_embed(\n config: &Config,\n retry_failed: bool,\n) -> Result<EmbedResult>\n```\n\n**Pipeline:**\n1. Create OllamaClient from config.embedding (base_url, model, timeout_secs)\n2. Run `client.health_check().await` — fail early with clear error if Ollama unavailable or model missing\n3. Determine selection: `EmbedSelection::RetryFailed` if --retry-failed, else `EmbedSelection::Pending`\n4. Call `embed_documents(conn, &client, selection, concurrency, progress_callback).await`\n - `concurrency` param controls max in-flight HTTP requests to Ollama\n - `progress_callback` drives indicatif progress bar\n5. Show progress bar (indicatif) during embedding\n6. Return EmbedResult with counts\n\n**CLI args:**\n```rust\n#[derive(Args)]\npub struct EmbedArgs {\n #[arg(long)]\n retry_failed: bool,\n}\n```\n\n**Output:**\n- Human: \"Embedded 42 documents (15 chunks), 2 errors, 5 skipped (unchanged)\"\n- JSON: `{\"ok\": true, \"data\": {\"embedded\": 42, \"chunks\": 15, \"errors\": 2, \"skipped\": 5}}`\n\n**Tokio integration note:**\nThe embed command runs async code. 
Either:\n- Use `#[tokio::main]` on main and propagate async through CLI dispatch\n- Or use `tokio::runtime::Runtime::new()` in the embed command handler\n\n## Acceptance Criteria\n- [ ] Command is async (embed_documents is async, health_check is async)\n- [ ] OllamaClient created from config.embedding settings\n- [ ] Health check runs first — clear error if Ollama down (exit code 14)\n- [ ] Clear error if model not found: \"Pull the model: ollama pull nomic-embed-text\" (exit code 15)\n- [ ] Embeds pending documents (no existing embeddings or stale content_hash)\n- [ ] --retry-failed re-attempts documents with last_error\n- [ ] Progress bar shows during embedding (indicatif)\n- [ ] embed_documents called with concurrency parameter\n- [ ] embed_documents called with progress_callback for progress bar\n- [ ] Human + JSON output\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/cli/commands/embed.rs` — new file\n- `src/cli/commands/mod.rs` — add `pub mod embed;`\n- `src/cli/mod.rs` — add EmbedArgs, wire up embed subcommand\n- `src/main.rs` — add embed command handler (async dispatch)\n\n## TDD Loop\nRED: Integration test needing Ollama\nGREEN: Implement run_embed (async)\nVERIFY: `cargo build && cargo test embed`\n\n## Edge Cases\n- No documents in DB: \"No documents to embed\" (not error)\n- All documents already embedded and unchanged: \"0 documents to embed (all up to date)\"\n- Ollama goes down mid-embedding: pipeline records errors for remaining docs, returns partial result\n- --retry-failed with no failed docs: \"No failed documents to retry\"","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:34.126482Z","created_by":"tayloreernisse","updated_at":"2026-01-30T18:02:38.633115Z","closed_at":"2026-01-30T18:02:38.633055Z","close_reason":"Embed CLI command fully wired: EmbedArgs, Commands::Embed variant, handle_embed handler, clean build, all tests 
pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-2sx","depends_on_id":"bd-am7","type":"blocks","created_at":"2026-01-30T15:29:24.766104Z","created_by":"tayloreernisse"}]}
{"id":"bd-2ug","title":"[CP1] gi ingest --type=issues command","description":"CLI command to orchestrate issue ingestion.\n\n## Module\nsrc/cli/commands/ingest.rs\n\n## Clap Definition\n#[derive(Subcommand)]\npub enum Commands {\n Ingest {\n #[arg(long, value_parser = [\"issues\", \"merge_requests\"])]\n r#type: String,\n \n #[arg(long)]\n project: Option<String>,\n \n #[arg(long)]\n force: bool,\n },\n}\n\n## Implementation\n1. Acquire app lock with heartbeat (respect --force for stale lock)\n2. Create sync_run record (status='running')\n3. For each configured project (or filtered --project):\n - Call orchestrator to ingest issues and discussions\n - Show progress (spinner or progress bar)\n4. Update sync_run (status='succeeded', metrics_json with counts)\n5. Release lock\n\n## Output Format\nIngesting issues...\n\n group/project-one: 1,234 issues fetched, 45 new labels\n\nFetching discussions (312 issues with updates)...\n\n group/project-one: 312 issues → 1,234 discussions, 5,678 notes\n\nTotal: 1,234 issues, 1,234 discussions, 5,678 notes (excluding 1,234 system notes)\nSkipped discussion sync for 922 unchanged issues.\n\n## Error Handling\n- Lock acquisition failure: exit with DatabaseLockError message\n- Network errors: show GitLabNetworkError, exit non-zero\n- Rate limiting: respect backoff, show progress\n\nFiles: src/cli/commands/ingest.rs, src/cli/commands/mod.rs\nTests: tests/integration/sync_runs_tests.rs\nDone when: Full issue + discussion ingestion works end-to-end","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:58.552504Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.875613Z","deleted_at":"2026-01-25T17:02:01.875607Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2um","title":"[CP1] Epic: Issue Ingestion","description":"Ingest all issues, labels, and issue discussions from configured GitLab repositories with resumable cursor-based incremental sync. This checkpoint establishes the core data ingestion pattern that will be reused for MRs in Checkpoint 2.\n\n## Success Criteria\n- gi ingest --type=issues fetches all issues (count matches GitLab UI)\n- Labels extracted from issue payloads (name-only)\n- Label linkage reflects current GitLab state (removed labels unlinked on re-sync)\n- Issue discussions fetched per-issue (dependent sync)\n- Cursor-based sync is resumable (re-running fetches 0 new items)\n- Discussion sync skips unchanged issues (per-issue watermark)\n- Sync tracking records all runs (sync_runs table)\n- Single-flight lock prevents concurrent runs\n\n## Internal Gates\n- **Gate A**: Issues only - cursor + upsert + raw payloads + list/count/show working\n- **Gate B**: Labels correct - stale-link removal verified; label count matches GitLab\n- **Gate C**: Dependent discussion sync - watermark prevents redundant refetch; concurrency bounded\n- **Gate D**: Resumability proof - kill mid-run, rerun; bounded redo and no redundant discussion refetch\n\n## Reference\ndocs/prd/checkpoint-1.md","status":"closed","priority":1,"issue_type":"epic","created_at":"2026-01-25T17:02:38.075224Z","created_by":"tayloreernisse","updated_at":"2026-01-25T23:27:15.347364Z","closed_at":"2026-01-25T23:27:15.347317Z","close_reason":"CP1 Issue Ingestion complete: all sub-tasks done, 71 tests pass, CLI commands working","compaction_level":0,"original_size":0}
{"id":"bd-2y79","title":"Add work item status via GraphQL enrichment","description":"## Background\n\nGitLab 18.2+ added native work item status (To do, In progress, Done, Won't do, Duplicate) but it's only available via GraphQL, not the REST API. This enriches synced issues with status information by making a supplementary GraphQL call after the REST ingestion.\n\n## Approach\n\n### Phase 1: GraphQL Client (`src/gitlab/graphql.rs` NEW)\n\nMinimal GraphQL client -- single function, not a full framework:\n```rust\npub async fn graphql_query(\n base_url: &str,\n token: &str,\n query: &str,\n variables: serde_json::Value,\n) -> Result<serde_json::Value> {\n // POST to {base_url}/api/graphql\n // Content-Type: application/json\n // Headers: PRIVATE-TOKEN: {token}\n // Body: {\"query\": \"...\", \"variables\": {...}}\n // Parse response, check for errors array\n}\n```\n\n### Phase 2: Status Types\n\n```rust\n#[derive(Debug, Clone, Serialize, Deserialize)]\npub struct WorkItemStatus {\n pub name: String, // \"To do\", \"In progress\", \"Done\", etc.\n pub category: String, // \"todo\", \"in_progress\", \"done\"\n pub color: Option<String>, // hex color\n pub icon_name: Option<String>,\n}\n```\n\n### Phase 3: Batch Fetch Query\n\n```graphql\nquery IssueStatuses($projectPath: ID!, $iids: [String!]) {\n project(fullPath: $projectPath) {\n issues(iids: $iids) {\n nodes {\n iid\n state\n workItemType {\n name\n }\n widgets {\n ... 
on WorkItemWidgetStatus {\n status {\n name\n category\n color\n iconName\n }\n }\n }\n }\n }\n}\n```\n\nBatch in groups of 50 IIDs to avoid query complexity limits.\n\n### Phase 4: Migration 016\n\n```sql\nALTER TABLE issues ADD COLUMN status_name TEXT;\nALTER TABLE issues ADD COLUMN status_category TEXT;\nALTER TABLE issues ADD COLUMN status_color TEXT;\nALTER TABLE issues ADD COLUMN status_icon_name TEXT;\n```\n\n### Phase 5: Enrichment Step\n\nAfter REST issue ingestion, call GraphQL to fetch statuses for all synced issues:\n```rust\npub async fn enrich_issue_statuses(\n config: &Config,\n conn: &Connection,\n project_id: i64,\n) -> Result<usize>\n```\n\n### Phase 6: Display\n\nIn `print_show_issue()`, add status line:\n```\nStatus: In progress (todo) [colored by category]\n```\n\n### Phase 7: Graceful Degradation\n\n- If GraphQL endpoint returns 404 or 403: skip silently (older GitLab)\n- If work item status widget not present: skip (not enabled)\n- Never fail the sync pipeline due to GraphQL errors\n\n## Acceptance Criteria\n\n- [ ] GraphQL client can POST queries and handle errors\n- [ ] Status fetched in batches of 50 IIDs\n- [ ] Migration adds 4 nullable columns to issues table\n- [ ] `lore issues 123` shows status in human output (when available)\n- [ ] `lore --robot issues 123` includes status in JSON\n- [ ] Graceful degradation: older GitLab versions don't cause errors\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/gitlab/graphql.rs` (NEW -- minimal GraphQL client)\n- `src/gitlab/mod.rs` (add pub mod graphql)\n- `src/gitlab/types.rs` (add WorkItemStatus struct)\n- `migrations/016_issue_status.sql` (NEW)\n- `src/core/db.rs` (add migration, bump version)\n- `src/ingestion/orchestrator.rs` (call enrich_issue_statuses after issue sync)\n- `src/cli/commands/show.rs` (display status in issue output)\n- `src/cli/commands/list.rs` 
(optionally show status in list)\n\n## TDD Loop\n\nRED: Create tests:\n- `test_graphql_query_success` - mock server returns valid GraphQL response\n- `test_graphql_query_error` - mock server returns errors array -> Result::Err\n- `test_work_item_status_deserialize` - parse GraphQL response into WorkItemStatus\n- `test_enrichment_graceful_degradation` - 403 response -> Ok(0) not Err\n\nGREEN: Implement GraphQL client, enrichment step, migration.\n\nVERIFY: `cargo test --lib -- graphql`\n\n## Edge Cases\n\n- GitLab < 18.2: GraphQL endpoint exists but work item status widget missing -> skip\n- GraphQL rate limiting: respect Retry-After header\n- Issue with no status widget: status_name = NULL in DB\n- Multiple GraphQL pages: not needed (batch by IID list, not cursor pagination)\n- Token with only read_api scope (not api): GraphQL may require different scopes","status":"open","priority":2,"issue_type":"feature","created_at":"2026-02-05T18:32:39.287957Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:57:00.869297Z","compaction_level":0,"original_size":0,"labels":["api","phase-b"]}
{"id":"bd-2yo","title":"Fetch MR diffs API and populate mr_file_changes","description":"## Background\n\nThis bead fetches MR diff metadata from the GitLab API and populates the mr_file_changes table created by migration 015. It extracts only file-level metadata (paths, change type, line counts) and discards the actual diff content to keep storage minimal.\n\n## Approach\n\n### 1. API Client\n\nAdd to `src/gitlab/client.rs`:\n\n```rust\npub async fn fetch_mr_diffs(\n &self,\n project_id: i64,\n mr_iid: i64,\n) -> Result<Vec<GitLabMrDiff>> {\n // GET /projects/:id/merge_requests/:iid/diffs\n // Paginated (default 30 per page, max 100)\n // Returns array of diff objects\n}\n```\n\n### 2. Types\n\nAdd to `src/gitlab/types.rs`:\n\n```rust\n#[derive(Debug, Deserialize)]\npub struct GitLabMrDiff {\n pub old_path: String,\n pub new_path: String,\n pub new_file: bool,\n pub renamed_file: bool,\n pub deleted_file: bool,\n // Ignore: diff, a_mode, b_mode, generated_file\n}\n```\n\n### 3. Change Type Derivation\n\n```rust\nfn derive_change_type(diff: &GitLabMrDiff) -> &'static str {\n if diff.new_file { \"added\" }\n else if diff.renamed_file { \"renamed\" }\n else if diff.deleted_file { \"deleted\" }\n else { \"modified\" }\n}\n```\n\n### 4. DB Insert\n\nIn `src/ingestion/` (new file or extend orchestrator):\n\n```rust\npub fn upsert_mr_file_changes(\n conn: &Connection,\n mr_local_id: i64,\n project_id: i64,\n diffs: &[GitLabMrDiff],\n) -> Result<usize> {\n // DELETE FROM mr_file_changes WHERE merge_request_id = ?\n // Then INSERT each diff row\n // Re-sync approach: DELETE+INSERT is simpler than UPSERT for arrays\n}\n```\n\n### 5. Also capture merge_commit_sha and squash_commit_sha\n\nWhen fetching MR details (already in ingestion/merge_requests.rs), extract these fields:\n```rust\n// Already fetched in MR list/detail API\nUPDATE merge_requests SET\n merge_commit_sha = ?1,\n squash_commit_sha = ?2\nWHERE id = ?3\n```\n\n### 6. 
Queue Integration\n\nIn orchestrator, when MRs are ingested, enqueue mr_diffs jobs:\n```rust\n// In the MR ingestion loop, after upsert:\nenqueue_dependent_fetch(conn, project_id, \"merge_request\", mr_iid, mr_local_id, \"mr_diffs\")?;\n```\n\nCheck `config.sync.fetch_mr_file_changes` before enqueuing.\n\n## Acceptance Criteria\n\n- [ ] `fetch_mr_diffs()` calls correct GitLab API endpoint with pagination\n- [ ] Change type correctly derived: new_file->added, renamed_file->renamed, deleted_file->deleted, else->modified\n- [ ] mr_file_changes rows inserted with correct old_path, new_path, change_type\n- [ ] Old rows deleted before insert (clean re-sync per MR)\n- [ ] merge_commit_sha and squash_commit_sha captured from MR API response\n- [ ] Jobs only enqueued when config.sync.fetch_mr_file_changes is true\n- [ ] Handles MRs with no diffs (empty array) gracefully\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/gitlab/client.rs` (add fetch_mr_diffs method)\n- `src/gitlab/types.rs` (add GitLabMrDiff struct)\n- `src/ingestion/mr_diffs.rs` (NEW -- upsert + queue drain logic)\n- `src/ingestion/mod.rs` (add pub mod mr_diffs)\n- `src/ingestion/orchestrator.rs` (enqueue mr_diffs jobs + guard with config flag)\n- `src/ingestion/merge_requests.rs` (capture commit SHAs)\n\n## TDD Loop\n\nRED: Create `src/ingestion/mr_diffs.rs` with tests:\n- `test_derive_change_type_added` - new_file=true -> \"added\"\n- `test_derive_change_type_renamed` - renamed_file=true -> \"renamed\"\n- `test_derive_change_type_deleted` - deleted_file=true -> \"deleted\"\n- `test_derive_change_type_modified` - all false -> \"modified\"\n- `test_upsert_replaces_existing` - second upsert with different diffs replaces first\n\nGREEN: Implement the API client, type derivation, and DB operations.\n\nVERIFY: `cargo test --lib -- mr_diffs`\n\n## Edge Cases\n\n- MR with 500+ changed files: paginate properly (GitLab returns max 100 per 
page)\n- Binary files: may have lines_added/removed = 0, handle as modified\n- File renamed AND modified in same MR: GitLab returns renamed_file=true, which takes precedence\n- MR in draft state: still fetch diffs (they exist in the API)\n- Deleted MR: API returns 404 -- handle with retry/skip logic in queue drainer","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.939514Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:54:37.194715Z","compaction_level":0,"original_size":0,"labels":["api","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-2yo","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.941359Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-1oo","type":"blocks","created_at":"2026-02-02T21:34:16.555239Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-jec","type":"blocks","created_at":"2026-02-02T21:34:16.656402Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-tir","type":"blocks","created_at":"2026-02-02T21:34:16.605198Z","created_by":"tayloreernisse"}]}
{"id":"bd-2yq","title":"[CP1] Issue transformer with label extraction","description":"Transform GitLab issue payloads to normalized database schema.\n\nFunctions to implement:\n- transformIssue(gitlabIssue, localProjectId) → NormalizedIssue\n- extractLabels(gitlabIssue, localProjectId) → Label[]\n\nTransformation rules:\n- Convert ISO timestamps to ms epoch using isoToMs()\n- Set last_seen_at to nowMs()\n- Handle labels vs labels_details (prefer details when available)\n- Handle missing optional fields gracefully\n\nFiles: src/gitlab/transformers/issue.ts\nTests: tests/unit/issue-transformer.test.ts\nDone when: Unit tests pass for payload transformation and label extraction","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:09.660448Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.152259Z","deleted_at":"2026-01-25T15:21:35.152254Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-2ys","title":"[CP1] Cargo.toml updates - async-stream and futures","description":"## Background\n\nThe GitLab client pagination methods require async streaming capabilities. The `async-stream` crate provides the `stream!` macro for creating async iterators, and `futures` provides `StreamExt` for consuming them with `.next()` and other combinators.\n\n## Approach\n\nAdd these dependencies to Cargo.toml:\n\n```toml\n[dependencies]\nasync-stream = \"0.3\"\nfutures = { version = \"0.3\", default-features = false, features = [\"alloc\"] }\n```\n\nUse minimal features on `futures` to avoid pulling unnecessary code.\n\n## Acceptance Criteria\n\n- [ ] `async-stream = \"0.3\"` is in Cargo.toml [dependencies]\n- [ ] `futures` with `alloc` feature is in Cargo.toml [dependencies]\n- [ ] `cargo check` succeeds after adding dependencies\n\n## Files\n\n- Cargo.toml (edit)\n\n## TDD Loop\n\nRED: Not applicable (dependency addition)\nGREEN: Add lines to Cargo.toml\nVERIFY: `cargo check`\n\n## Edge Cases\n\n- If `futures` is already present, merge features rather than duplicate\n- Use exact version pins for reproducibility","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.104664Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:25:10.274787Z","closed_at":"2026-01-25T22:25:10.274727Z","close_reason":"Added async-stream 0.3 and futures 0.3 (alloc feature) to Cargo.toml, cargo check passes","compaction_level":0,"original_size":0}
{"id":"bd-2zl","title":"Epic: Gate 1 - Resource Events Ingestion","description":"## Background\nGate 1 transforms gitlore from a snapshot engine into a temporal data store by ingesting structured event data from GitLab Resource Events APIs (state, label, milestone changes). This is the foundation — Gates 2-5 all depend on the event tables and dependent fetch queue that Gate 1 establishes.\n\nCurrently, when an issue is closed or a label changes, gitlore overwrites the current state. The transition is lost. Gate 1 captures these transitions as discrete events with timestamps, actors, and provenance, enabling temporal queries like \"when did this issue become critical?\" and \"who closed this MR?\"\n\n## Architecture\n- **Three new tables:** resource_state_events, resource_label_events, resource_milestone_events (migration 011, already shipped as bd-hu3)\n- **Generic dependent fetch queue:** pending_dependent_fetches table replaces per-type queue tables. Supports job_types: resource_events, mr_closes_issues, mr_diffs. Used by Gates 1, 2, and 4.\n- **Opt-in via config:** sync.fetchResourceEvents (default true). --no-events CLI flag to skip.\n- **Incremental:** Only changed entities enqueued. --full re-enqueues all.\n- **Crash recovery:** locked_at column with 5-minute stale lock reclaim.\n\n## Children (Execution Order)\n1. **bd-hu3** [CLOSED] — Migration 011: event tables + entity_references + dependent fetch queue\n2. **bd-2e8** [CLOSED] — fetchResourceEvents config flag\n3. **bd-2fm** [CLOSED] — GitLab Resource Event serde types\n4. **bd-sqw** [CLOSED] — Resource Events API endpoints in GitLab client\n5. **bd-1uc** [CLOSED] — DB upsert functions for resource events\n6. **bd-tir** [CLOSED] — Generic dependent fetch queue (enqueue + drain)\n7. **bd-1ep** [CLOSED] — Wire resource event fetching into sync pipeline\n8. **bd-3sh** [CLOSED] — lore count events command\n9. 
**bd-1m8** [CLOSED] — lore stats --check for event integrity + queue health\n\n## Gate Completion Criteria\n- [ ] All 9 children closed\n- [ ] `lore sync` fetches resource events for changed entities\n- [ ] `lore sync --no-events` skips event fetching\n- [ ] Event fetch failures queued for retry with exponential backoff\n- [ ] Stale locks auto-reclaimed on next sync run\n- [ ] `lore count events` shows counts by type (state/label/milestone)\n- [ ] `lore stats --check` validates referential integrity + queue health\n- [ ] Robot mode JSON for all new commands\n- [ ] Integration test: full sync cycle with events enabled\n\n## Dependencies\n- None (Gate 1 is the foundation)\n- Downstream: Gate 2 (bd-1se) depends on event tables and dependent fetch queue","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:30:49.136036Z","created_by":"tayloreernisse","updated_at":"2026-02-05T16:06:52.080788Z","closed_at":"2026-02-05T16:06:52.080725Z","close_reason":"Already implemented: migration 011 exists, events_db.rs has upsert functions, client.rs has fetch_*_state_events, orchestrator.rs has drain_resource_events. Full Gate 1 functionality is live.","compaction_level":0,"original_size":0,"labels":["epic","gate-1","phase-b"]}
{"id":"bd-2zr","title":"[CP1] GitLab client pagination methods","description":"Add async stream methods for paginated GitLab API calls.\n\n## Methods to Add to GitLabClient\n\n### paginate_issues(gitlab_project_id, updated_after, cursor_rewind_seconds) -> Stream<GitLabIssue>\n- Use async_stream::try_stream! macro\n- Query params: scope=all, state=all, order_by=updated_at, sort=asc, per_page=100\n- If updated_after provided, apply cursor_rewind_seconds (subtract from timestamp)\n- Clamp to 0 to avoid underflow: (ts - rewind_ms).max(0)\n- Follow x-next-page header until empty/absent\n- Fall back to empty-page detection if headers missing\n\n### paginate_issue_discussions(gitlab_project_id, issue_iid) -> Stream<GitLabDiscussion>\n- Paginate through discussions for single issue\n- per_page=100\n- Follow x-next-page header\n\n### request_with_headers<T>(path, params) -> Result<(T, HeaderMap)>\n- Acquire rate limiter\n- Make request with PRIVATE-TOKEN header\n- Return both deserialized data and response headers\n\n## Dependencies\n- async-stream = \"0.3\" (for try_stream! macro)\n- futures = \"0.3\" (for Stream trait and StreamExt)\n\nFiles: src/gitlab/client.rs\nTests: tests/pagination_tests.rs\nDone when: Pagination handles multiple pages and respects cursors, tests pass","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:13.045971Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.784887Z","deleted_at":"2026-01-25T17:02:01.784883Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-31b","title":"[CP1] Discussion ingestion module","description":"Fetch and store discussions/notes for each issue.\n\nImplement ingestIssueDiscussions(options) → { discussionsFetched, discussionsUpserted, notesUpserted, systemNotesCount }\n\nLogic:\n1. Paginate through all discussions for given issue\n2. For each discussion:\n - Store raw payload (compressed)\n - Upsert discussion record with correct issue FK\n - Transform and upsert all notes\n - Store raw payload per note\n - Track system notes count\n\nFiles: src/ingestion/discussions.ts\nTests: tests/integration/issue-discussion-ingestion.test.ts\nDone when: Discussions and notes populated with correct FKs and is_system flags","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:57.131442Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.156574Z","deleted_at":"2026-01-25T15:21:35.156571Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-31i","title":"Epic: CP2 Gate B - Labels + Assignees + Reviewers","description":"## Background\nGate B validates junction tables for labels, assignees, and reviewers. Ensures relationships are tracked correctly and stale links are removed on resync. This is critical for filtering (`--reviewer=alice`) and display.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] `mr_labels` table has rows for MRs with labels\n- [ ] Label count per MR matches GitLab UI (spot check 3 MRs)\n- [ ] `mr_assignees` table has rows for MRs with assignees\n- [ ] Assignee usernames match GitLab UI (spot check 3 MRs)\n- [ ] `mr_reviewers` table has rows for MRs with reviewers\n- [ ] Reviewer usernames match GitLab UI (spot check 3 MRs)\n- [ ] Remove label in GitLab -> resync -> link removed from mr_labels\n- [ ] Add reviewer in GitLab -> resync -> link added to mr_reviewers\n- [ ] `gi list mrs --label=bugfix` filters correctly\n- [ ] `gi list mrs --reviewer=alice` filters correctly\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate B: Labels + Assignees + Reviewers ===\"\n\n# 1. Check label linkage exists\necho \"Step 1: Check label linkage...\"\nLABEL_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_labels;\")\necho \" Total label links: $LABEL_LINKS\"\n\n# 2. Show sample label linkage\necho \"Step 2: Sample label linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(l.name, ', ') as labels\n FROM merge_requests m\n JOIN mr_labels ml ON ml.merge_request_id = m.id\n JOIN labels l ON l.id = ml.label_id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 3. Check assignee linkage\necho \"Step 3: Check assignee linkage...\"\nASSIGNEE_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_assignees;\")\necho \" Total assignee links: $ASSIGNEE_LINKS\"\n\n# 4. 
Show sample assignee linkage\necho \"Step 4: Sample assignee linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(a.username, ', ') as assignees\n FROM merge_requests m\n JOIN mr_assignees a ON a.merge_request_id = m.id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 5. Check reviewer linkage\necho \"Step 5: Check reviewer linkage...\"\nREVIEWER_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_reviewers;\")\necho \" Total reviewer links: $REVIEWER_LINKS\"\n\n# 6. Show sample reviewer linkage\necho \"Step 6: Sample reviewer linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(r.username, ', ') as reviewers\n FROM merge_requests m\n JOIN mr_reviewers r ON r.merge_request_id = m.id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 7. Test filter commands\necho \"Step 7: Test filter commands...\"\n# Get a label that exists\nLABEL=$(sqlite3 \"$DB_PATH\" \"SELECT name FROM labels LIMIT 1;\")\nif [ -n \"$LABEL\" ]; then\n echo \" Testing --label=$LABEL\"\n gi list mrs --label=\"$LABEL\" --limit=3\nfi\n\n# Get a reviewer that exists\nREVIEWER=$(sqlite3 \"$DB_PATH\" \"SELECT username FROM mr_reviewers LIMIT 1;\")\nif [ -n \"$REVIEWER\" ]; then\n echo \" Testing --reviewer=$REVIEWER\"\n gi list mrs --reviewer=\"$REVIEWER\" --limit=3\nfi\n\necho \"\"\necho \"=== Gate B: PASSED ===\"\n```\n\n## Stale Link Removal Test (Manual)\n```bash\n# 1. Pick an MR with labels in GitLab UI\nMR_IID=123\n\n# 2. Note current label count\nsqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM mr_labels ml\n JOIN merge_requests m ON m.id = ml.merge_request_id\n WHERE m.iid = $MR_IID;\n\"\n# Example: 3 labels\n\n# 3. Remove a label in GitLab UI (manually)\n\n# 4. Resync\ngi ingest --type=merge_requests\n\n# 5. 
Verify label removed\nsqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM mr_labels ml\n JOIN merge_requests m ON m.id = ml.merge_request_id\n WHERE m.iid = $MR_IID;\n\"\n# Should be: 2 labels (one less)\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check counts:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n (SELECT COUNT(*) FROM mr_labels) as label_links,\n (SELECT COUNT(*) FROM mr_assignees) as assignee_links,\n (SELECT COUNT(*) FROM mr_reviewers) as reviewer_links;\n\"\n\n# Test filtering:\ngi list mrs --label=enhancement --limit=5\ngi list mrs --reviewer=alice --limit=5\ngi list mrs --assignee=bob --limit=5\n```\n\n## Dependencies\nThis gate requires:\n- bd-ser (MR ingestion with label/assignee/reviewer linking via clear-and-relink pattern)\n- Gate A must pass first\n\n## Edge Cases\n- MRs with no labels/assignees/reviewers: junction tables should have no rows for that MR\n- Labels shared across issues and MRs: labels table is shared, only junction differs\n- Usernames are case-sensitive: `Alice` != `alice`","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:01.292318Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.059422Z","closed_at":"2026-01-27T00:48:21.059378Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-31i","depends_on_id":"bd-ser","type":"blocks","created_at":"2026-01-26T22:08:55.684769Z","created_by":"tayloreernisse"}]}
{"id":"bd-31m","title":"[CP1] Test fixtures for mocked GitLab responses","description":"Create mock response files for integration tests.\n\nFixtures to create:\n- gitlab-issue.json (single issue with labels)\n- gitlab-issues-page.json (paginated list)\n- gitlab-discussion.json (single discussion with notes)\n- gitlab-discussions-page.json (paginated list)\n\nInclude edge cases:\n- Issue with labels_details\n- Issue with no labels\n- Discussion with individual_note=true\n- System notes with system=true\n\nFiles: tests/fixtures/mock-responses/gitlab-issue*.json, gitlab-discussion*.json\nDone when: MSW handlers can use fixtures for deterministic tests","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T15:20:43.781288Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.155480Z","deleted_at":"2026-01-25T15:21:35.155478Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-327","title":"[CP0] Project scaffold","description":"## Background\n\nThis is the foundational scaffold for the GitLab Inbox CLI tool. Every subsequent bead depends on having the correct project structure, TypeScript configuration, and tooling in place. The configuration choices here (ESM modules, strict TypeScript, Vitest for testing) set constraints for all future code.\n\n## Approach\n\nCreate a Node.js 20+ ESM project with TypeScript strict mode. Use flat ESLint config (v9+) with TypeScript plugin. Configure Vitest with coverage. Create the directory structure matching the PRD exactly.\n\n**package.json essentials:**\n- `\"type\": \"module\"` for ESM\n- `\"bin\": { \"gi\": \"./dist/cli/index.js\" }` for CLI entry point\n- Runtime deps: better-sqlite3, sqlite-vec, commander, zod, pino, pino-pretty, ora, chalk, cli-table3, inquirer\n- Dev deps: typescript, @types/better-sqlite3, @types/node, vitest, msw, eslint, @typescript-eslint/*\n\n**tsconfig.json:**\n- `target: ES2022`, `module: Node16`, `moduleResolution: Node16`\n- `strict: true`, `noImplicitAny: true`, `strictNullChecks: true`\n- `outDir: ./dist`, `rootDir: ./src`\n\n**vitest.config.ts:**\n- Exclude `tests/live/**` unless `GITLAB_LIVE_TESTS=1`\n- Coverage with v8 provider\n\n## Acceptance Criteria\n\n- [ ] `npm install` completes without errors\n- [ ] `npm run build` compiles TypeScript to dist/\n- [ ] `npm run test` runs vitest (0 tests is fine at this stage)\n- [ ] `npm run lint` runs ESLint with no config errors\n- [ ] All directories exist: src/cli/commands/, src/core/, src/gitlab/, src/types/, tests/unit/, tests/integration/, tests/live/, tests/fixtures/mock-responses/, migrations/\n\n## Files\n\nCREATE:\n- package.json\n- tsconfig.json\n- vitest.config.ts\n- eslint.config.js\n- .gitignore\n- src/cli/index.ts (empty placeholder with shebang)\n- src/cli/commands/.gitkeep\n- src/core/.gitkeep\n- src/gitlab/.gitkeep\n- src/types/index.ts (empty)\n- tests/unit/.gitkeep\n- 
tests/integration/.gitkeep\n- tests/live/.gitkeep\n- tests/fixtures/mock-responses/.gitkeep\n- migrations/.gitkeep\n\n## TDD Loop\n\nN/A - scaffold only. Verify with:\n\n```bash\nnpm install\nnpm run build\nnpm run lint\nnpm run test\n```\n\n## Edge Cases\n\n- Node.js version < 20 will fail on ESM features - add `engines` field\n- better-sqlite3 requires native compilation - may need python/build-essential\n- sqlite-vec installation can fail on some platforms - document fallback","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:47.955044Z","created_by":"tayloreernisse","updated_at":"2026-01-25T02:51:25.347932Z","closed_at":"2026-01-25T02:51:25.347799Z","compaction_level":0,"original_size":0}
{"id":"bd-32mc","title":"OBSERV: Implement log retention cleanup at startup","description":"## Background\nLog files accumulate at ~1-10 MB/day. Without cleanup, they grow unbounded. Retention runs BEFORE subscriber init so deleted file handles aren't held open by the appender.\n\n## Approach\nAdd a cleanup function, called from main.rs before the subscriber is initialized (before current line 44):\n\n```rust\n/// Delete log files older than retention_days.\n/// Matches files named lore.YYYY-MM-DD.log in the log directory.\npub fn cleanup_old_logs(log_dir: &Path, retention_days: u32) -> std::io::Result<usize> {\n if retention_days == 0 {\n return Ok(0); // 0 means file logging disabled, don't delete\n }\n let cutoff = SystemTime::now() - Duration::from_secs(u64::from(retention_days) * 86400);\n let mut deleted = 0;\n\n for entry in std::fs::read_dir(log_dir)? {\n let entry = entry?;\n let name = entry.file_name();\n let name_str = name.to_string_lossy();\n\n // Only match lore.YYYY-MM-DD.log pattern\n if !name_str.starts_with(\"lore.\") || !name_str.ends_with(\".log\") {\n continue;\n }\n\n if let Ok(metadata) = entry.metadata() {\n if let Ok(modified) = metadata.modified() {\n if modified < cutoff {\n std::fs::remove_file(entry.path())?;\n deleted += 1;\n }\n }\n }\n }\n Ok(deleted)\n}\n```\n\nPlace this function in src/core/paths.rs (next to get_log_dir) or a new src/core/log_retention.rs. Prefer paths.rs since it's small and related.\n\nCall from main.rs:\n```rust\nlet log_dir = get_log_dir(config.logging.log_dir.as_deref());\nlet _ = cleanup_old_logs(&log_dir, config.logging.retention_days);\n// THEN init subscriber\n```\n\nNote: Config must be loaded before cleanup runs. Current main.rs parses Cli at line 60, but config loading happens inside command handlers. 
This means we need to either:\n A) Load config early in main() before subscriber init (preferred)\n B) Defer cleanup to after config load\n\nSince the subscriber must also know log_dir, approach A is natural: load config -> cleanup -> init subscriber -> dispatch command.\n\n## Acceptance Criteria\n- [ ] Files matching lore.*.log older than retention_days are deleted\n- [ ] Files matching lore.*.log within retention_days are preserved\n- [ ] Non-matching files (e.g., other.txt) are never deleted\n- [ ] retention_days=0 skips cleanup entirely (no files deleted)\n- [ ] Errors on individual files don't prevent cleanup of remaining files\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/paths.rs (add cleanup_old_logs function)\n- src/main.rs (call cleanup before subscriber init)\n\n## TDD Loop\nRED:\n - test_log_retention_cleanup: create tempdir with lore.2026-01-01.log through lore.2026-02-04.log, run with retention_days=7, assert old deleted, recent preserved\n - test_log_retention_ignores_non_log_files: create other.txt alongside old log files, assert other.txt untouched\n - test_log_retention_zero_days: retention_days=0, assert nothing deleted\nGREEN: Implement cleanup_old_logs\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- SystemTime::now() precision varies by OS; use file modified time, not name parsing (simpler and more reliable)\n- read_dir on non-existent directory: get_log_dir creates it first, so this shouldn't happen. But handle gracefully.\n- Permissions error on individual file: log a warning, continue with remaining files (don't propagate)\n- Race condition: another process creates a file during cleanup. 
Not a concern -- we only delete old files.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.627901Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:15:04.452086Z","closed_at":"2026-02-04T17:15:04.452039Z","close_reason":"Implemented cleanup_old_logs() with date-pattern matching and retention_days config, runs at startup before subscriber init","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-32mc","depends_on_id":"bd-17n","type":"blocks","created_at":"2026-02-04T15:55:19.523048Z","created_by":"tayloreernisse"},{"issue_id":"bd-32mc","depends_on_id":"bd-1k4","type":"blocks","created_at":"2026-02-04T15:55:19.583155Z","created_by":"tayloreernisse"},{"issue_id":"bd-32mc","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.628795Z","created_by":"tayloreernisse"}]}
{"id":"bd-32q","title":"Implement timeline seed phase: FTS5 keyword search to entity IDs","description":"## Background\n\nThe seed phase is steps 1-2 of the timeline pipeline (spec Section 3.2): SEED + HYDRATE. It converts a user's keyword query into a set of entity IDs (issues and MRs) via FTS5 search, and collects evidence note candidates. It reuses the existing FTS5 index (documents_fts table from migration 008) and the document-to-entity mapping in the documents table.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 3.2 steps 1-2.\n\nNote: The spec describes SEED and HYDRATE as separate steps. This bead combines them into a single `seed_timeline()` function since they're tightly coupled (HYDRATE just maps document IDs to entities).\n\n## Approach\n\nCreate `src/core/timeline_seed.rs` with:\n\n```rust\nuse crate::core::timeline::{EntityRef, TimelineEvent, TimelineEventType};\nuse rusqlite::Connection;\n\npub struct SeedResult {\n pub seed_entities: Vec<EntityRef>,\n pub evidence_notes: Vec<TimelineEvent>, // NoteEvidence events (spec: \"evidence candidates\")\n}\n\npub fn seed_timeline(\n conn: &Connection,\n query: &str,\n project_id: Option<i64>,\n since_ms: Option<i64>,\n limit: usize, // max seed entities (default 50)\n) -> Result<SeedResult> {\n // Spec step 1 (SEED): FTS5 keyword search -> matched document IDs\n // Spec step 2 (HYDRATE):\n // - Map document IDs -> source entities (issues, MRs)\n // - Collect top matched notes as evidence candidates (bounded, default top 10)\n // \"These are the actual decision-bearing comments that answer why\"\n}\n```\n\n### SQL for SEED + HYDRATE (steps 1-2):\n```sql\nSELECT DISTINCT d.source_type, d.source_id, d.project_id,\n CASE d.source_type\n WHEN 'issue' THEN (SELECT iid FROM issues WHERE id = d.source_id)\n WHEN 'merge_request' THEN (SELECT iid FROM merge_requests WHERE id = d.source_id)\n END AS iid,\n CASE d.source_type\n WHEN 'issue' THEN (SELECT p.path_with_namespace FROM issues i JOIN 
projects p ON i.project_id = p.id WHERE i.id = d.source_id)\n WHEN 'merge_request' THEN (SELECT p.path_with_namespace FROM merge_requests m JOIN projects p ON m.project_id = p.id WHERE m.id = d.source_id)\n END AS project_path\nFROM documents_fts fts\nJOIN documents d ON d.id = fts.rowid\nWHERE documents_fts MATCH ?1\n AND (?2 IS NULL OR d.project_id = ?2)\n AND (?3 IS NULL OR d.created_at >= ?3)\nORDER BY rank\nLIMIT ?4\n```\n\n### SQL for evidence notes (spec: \"top matched notes as evidence candidates\"):\n```sql\nSELECT d.source_id, n.id as note_id, n.body, n.created_at, n.author_username,\n disc.id as discussion_id,\n CASE\n WHEN disc.issue_id IS NOT NULL THEN 'issue'\n WHEN disc.merge_request_id IS NOT NULL THEN 'merge_request'\n END AS parent_entity_type,\n COALESCE(disc.issue_id, disc.merge_request_id) AS parent_entity_id\nFROM documents_fts fts\nJOIN documents d ON d.id = fts.rowid\nJOIN discussions disc ON disc.id = d.source_id AND d.source_type = 'discussion'\nJOIN notes n ON n.discussion_id = disc.id AND n.is_system = 0\nWHERE documents_fts MATCH ?1\nORDER BY rank\nLIMIT 10\n```\n\nEvidence notes become `TimelineEvent` with:\n- `event_type: TimelineEventType::NoteEvidence { note_id, snippet, discussion_id }`\n- `snippet`: first 200 chars of note body (per spec Section 3.3)\n- `note_id`: from notes.id (per spec Section 3.3)\n- `discussion_id`: from discussions.id (per spec Section 3.3, Optional)\n\nRegister in `src/core/mod.rs`: `pub mod timeline_seed;`\n\n## Acceptance Criteria\n\n- [ ] `seed_timeline()` returns entities from FTS5 search\n- [ ] Entities are deduplicated (same entity from multiple document hits appears once)\n- [ ] Evidence notes capped at 10 (spec: \"bounded, default top 10\")\n- [ ] Evidence note body snippets truncated to 200 chars (spec Section 3.3)\n- [ ] Evidence notes include note_id field (spec Section 3.3)\n- [ ] `--since` filter works (only entities created after since_ms)\n- [ ] `-p` filter works (only entities from specified 
project)\n- [ ] Returns empty result (not error) for zero-match queries\n- [ ] Uses safe FTS query builder from `search::fts::build_safe_fts_query()`\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/core/timeline_seed.rs` (NEW)\n- `src/core/mod.rs` (add `pub mod timeline_seed;`)\n\n## TDD Loop\n\nRED: Create `src/core/timeline_seed.rs` with `#[cfg(test)] mod tests`:\n- `test_seed_deduplicates_entities` - same issue from two docs -> one EntityRef\n- `test_seed_empty_query_returns_empty` - no panic on zero matches\n- `test_seed_evidence_capped_at_10` - never more than 10 evidence notes\n- `test_seed_evidence_has_note_id` - evidence NoteEvidence events include note_id\n- `test_seed_evidence_snippet_truncated` - snippets max 200 chars\n- `test_seed_respects_since_filter` - old docs excluded\n\nGREEN: Implement the FTS5 queries and deduplication logic.\n\nVERIFY: `cargo test --lib -- timeline_seed`\n\n## Edge Cases\n\n- FTS5 MATCH can fail with invalid syntax. Use the safe query builder from `search::fts::build_safe_fts_query()` to sanitize input.\n- Discussion documents may link to notes that have been deleted (orphan). 
Handle with LEFT JOIN and skip NULL note bodies.\n- Some documents have source_type='discussion' but the discussion may be for an issue or MR -- resolve the parent entity via disc.issue_id or disc.merge_request_id.\n- Evidence note with very long body: truncate to 200 chars, ensure no mid-codepoint truncation for UTF-8.","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:08.615908Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:11:53.278838Z","compaction_level":0,"original_size":0,"labels":["gate-3","phase-b","query"],"dependencies":[{"issue_id":"bd-32q","depends_on_id":"bd-20e","type":"blocks","created_at":"2026-02-02T21:33:37.368005Z","created_by":"tayloreernisse"},{"issue_id":"bd-32q","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:08.617483Z","created_by":"tayloreernisse"}]}
{"id":"bd-335","title":"Implement Ollama API client","description":"## Background\nThe Ollama API client provides the HTTP interface to the local Ollama embedding server. It handles health checks (is Ollama running? does the model exist?), batch embedding requests (up to 32 texts per call), and error translation to LoreError variants. This is the lowest-level embedding component — the pipeline (bd-am7) builds on top of it.\n\n## Approach\nCreate `src/embedding/ollama.rs` per PRD Section 4.2. **Uses async reqwest (not blocking).**\n\n```rust\nuse reqwest::Client; // NOTE: async Client, not reqwest::blocking\nuse serde::{Deserialize, Serialize};\nuse crate::core::error::{LoreError, Result};\n\npub struct OllamaConfig {\n pub base_url: String, // default \"http://localhost:11434\"\n pub model: String, // default \"nomic-embed-text\"\n pub timeout_secs: u64, // default 60\n}\n\nimpl Default for OllamaConfig { /* PRD defaults */ }\n\npub struct OllamaClient {\n client: Client, // async reqwest::Client\n config: OllamaConfig,\n}\n\n#[derive(Serialize)]\nstruct EmbedRequest { model: String, input: Vec<String> }\n\n#[derive(Deserialize)]\nstruct EmbedResponse { model: String, embeddings: Vec<Vec<f32>> }\n\n#[derive(Deserialize)]\nstruct TagsResponse { models: Vec<ModelInfo> }\n\n#[derive(Deserialize)]\nstruct ModelInfo { name: String }\n\nimpl OllamaClient {\n pub fn new(config: OllamaConfig) -> Self;\n\n /// Async health check: GET /api/tags\n /// Model matched via starts_with (\"nomic-embed-text\" matches \"nomic-embed-text:latest\")\n pub async fn health_check(&self) -> Result<()>;\n\n /// Async batch embedding: POST /api/embed\n /// Input: Vec<String> of texts, Response: Vec<Vec<f32>> of 768-dim embeddings\n pub async fn embed_batch(&self, texts: Vec<String>) -> Result<Vec<Vec<f32>>>;\n}\n\n/// Quick health check without full client (async).\npub async fn check_ollama_health(base_url: &str) -> bool;\n```\n\n**Error mapping (per PRD):**\n- Connection 
refused/timeout -> LoreError::OllamaUnavailable { base_url, source: Some(e) }\n- Model not in /api/tags -> LoreError::OllamaModelNotFound { model }\n- Non-200 from /api/embed -> LoreError::EmbeddingFailed { document_id: 0, reason: format!(\"HTTP {}: {}\", status, body) }\n\n**Key PRD detail:** Model matching uses `starts_with` (not exact match) so \"nomic-embed-text\" matches \"nomic-embed-text:latest\".\n\n## Acceptance Criteria\n- [ ] Uses async reqwest::Client (not blocking)\n- [ ] health_check() is async, detects server availability and model presence\n- [ ] Model matched via starts_with (handles \":latest\" suffix)\n- [ ] embed_batch() is async, sends POST /api/embed\n- [ ] Batch size up to 32 texts\n- [ ] Returns Vec<Vec<f32>> with 768 dimensions each\n- [ ] OllamaUnavailable error includes base_url and source error\n- [ ] OllamaModelNotFound error includes model name\n- [ ] Non-200 response mapped to EmbeddingFailed with status + body\n- [ ] Timeout: 60 seconds default (configurable via OllamaConfig)\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/embedding/ollama.rs` — new file\n- `src/embedding/mod.rs` — add `pub mod ollama;` and re-exports\n\n## TDD Loop\nRED: Tests (unit tests with mock, integration needs Ollama):\n- `test_config_defaults` — verify default base_url, model, timeout\n- `test_health_check_model_starts_with` — \"nomic-embed-text\" matches \"nomic-embed-text:latest\"\n- `test_embed_batch_parse` — mock response parsed correctly\n- `test_connection_error_maps_to_ollama_unavailable`\nGREEN: Implement OllamaClient\nVERIFY: `cargo test ollama`\n\n## Edge Cases\n- Ollama returns model name with version tag (\"nomic-embed-text:latest\"): starts_with handles this\n- Empty texts array: send empty batch, Ollama returns empty embeddings\n- Ollama returns wrong number of embeddings (2 texts, 1 embedding): caller (pipeline) validates\n- Non-JSON response: reqwest deserialization error 
-> wrap appropriately","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:34.025099Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:58:17.546852Z","closed_at":"2026-01-30T16:58:17.546794Z","close_reason":"Completed: OllamaClient with async health_check (starts_with model matching), embed_batch, error mapping to LoreError variants, check_ollama_health helper, 4 tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-335","depends_on_id":"bd-ljf","type":"blocks","created_at":"2026-01-30T15:29:24.627951Z","created_by":"tayloreernisse"}]}
{"id":"bd-343o","title":"Fetch and store GitLab linked issues (Related to)","description":"## Background\n\nGitLab's 'Linked items' (Related issues) section provides bidirectional issue linking that's distinct from 'closes' references and 'mentioned' references. This data is only available via the issue links API endpoint and must be fetched separately.\n\n**Spec context:** `docs/phase-b-temporal-intelligence.md` Section 2.3 mentions the `entity_references` table can store 'related' references. This bead adds a new ingestion source (issue links API) that wasn't explicitly in the spec but naturally extends Gate 2's cross-reference extraction.\n\n**source_method note:** The codebase uses `source_method = 'api'` (see migration 011, line 86 CHECK constraint). The spec defines more granular values (`'api_closes_issues'`), but the code has already shipped with the simpler taxonomy.\n\n## Approach\n\n### Phase 1: API Client\n\nAdd to `src/gitlab/client.rs`:\n```rust\npub async fn fetch_issue_links(\n &self,\n project_id: i64,\n issue_iid: i64,\n) -> Result<Vec<GitLabIssueLink>> {\n // GET /projects/:id/issues/:iid/links\n // Returns array of linked issues with link_type\n}\n```\n\n### Phase 2: Types\n\nAdd to `src/gitlab/types.rs`:\n```rust\n#[derive(Debug, Deserialize)]\npub struct GitLabIssueLink {\n pub id: i64, // GitLab issue ID (not IID)\n pub iid: i64,\n pub title: String,\n pub state: String,\n pub web_url: String,\n pub link_type: String, // \"relates_to\", \"blocks\", \"is_blocked_by\"\n pub link_created_at: Option<String>,\n pub link_updated_at: Option<String>,\n}\n```\n\n### Phase 3: Ingestion\n\nCreate `src/ingestion/issue_links.rs`:\n```rust\npub async fn fetch_and_store_issue_links(\n config: &Config,\n conn: &Connection,\n project_id: i64,\n issue_local_id: i64,\n issue_iid: i64,\n) -> Result<usize> {\n // 1. Fetch links from API\n // 2. For each link, resolve target issue to local DB id\n // 3. 
Insert into entity_references with:\n // reference_type = 'related'\n // source_method = 'api' (matches codebase CHECK constraint)\n // 4. link_type 'blocks'/'is_blocked_by' -> still reference_type = 'related'\n // (blocking semantics not modeled in entity_references yet)\n}\n```\n\n### Phase 4: Queue Integration\n\nAdd 'issue_links' job_type to pending_dependent_fetches:\n- Enqueue after issue ingestion\n- Drain during dependent fetch phase\n\nNote: This requires updating the CHECK constraint on pending_dependent_fetches.job_type to include 'issue_links'. SQLite can't ALTER CHECK constraints, so this needs a migration that recreates the table.\n\n### Phase 5: Display\n\nIn `lore show issue 123` output, add a \"Related Issues\" section:\n```\nRelated Issues:\n #456 \"Fix login timeout\" (opened) - relates_to\n #789 \"Auth redesign\" (closed) - blocks\n```\n\n## Acceptance Criteria\n\n- [ ] API client fetches issue links with pagination\n- [ ] Each link stored as entity_reference with reference_type='related', source_method='api'\n- [ ] Bidirectional: if issue A links to B, both A->B and B->A references created\n- [ ] Link type preserved (relates_to, blocks, is_blocked_by) in a metadata column or secondary table\n- [ ] `lore show issue 123` shows related issues section\n- [ ] `lore --robot show issue 123` includes related_issues in JSON\n- [ ] Cross-project links stored as unresolved (target_entity_id NULL, target_project_path populated)\n- [ ] Graceful handling: issues with no links -> empty section (not error)\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/gitlab/client.rs` (add fetch_issue_links method)\n- `src/gitlab/types.rs` (add GitLabIssueLink struct)\n- `src/ingestion/issue_links.rs` (NEW)\n- `src/ingestion/mod.rs` (add pub mod issue_links)\n- `src/ingestion/orchestrator.rs` (enqueue issue_links jobs)\n- `migrations/???_issue_links_job_type.sql` (recreate pending_dependent_fetches 
with expanded CHECK)\n- `src/cli/commands/show.rs` (display related issues)\n\n## TDD Loop\n\nRED: Create tests:\n- `test_issue_link_deserialization` - parse GitLab API response\n- `test_store_issue_links_creates_references` - inserts into entity_references\n- `test_bidirectional_links` - A->B also creates B->A reference\n\nGREEN: Implement API client, ingestion, and display.\n\nVERIFY: `cargo test --lib -- issue_links`\n\n## Edge Cases\n\n- Cross-project links: target issue may not be in local DB. Store as unresolved reference (target_entity_id = NULL, target_project_path + target_entity_iid populated).\n- Self-links: issue linked to itself. Skip.\n- Duplicate links: UNIQUE constraint on entity_references prevents duplicates.\n- CHECK constraint on pending_dependent_fetches.job_type needs migration to add 'issue_links'. SQLite limitation: can't ALTER CHECK constraints. Create a new migration that recreates the table with the expanded constraint. This is safe because pending_dependent_fetches is a transient queue.","status":"open","priority":2,"issue_type":"feature","created_at":"2026-02-05T15:14:25.202900Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:13:13.910828Z","compaction_level":0,"original_size":0,"labels":["ISSUE"]}
{"id":"bd-34ek","title":"OBSERV: Implement MetricsLayer custom tracing subscriber layer","description":"## Background\nMetricsLayer is a custom tracing subscriber layer that records span timing and structured fields, then materializes them into Vec<StageTiming>. This avoids threading a mutable collector through every function signature -- spans are the single source of truth.\n\n## Approach\nAdd to src/core/metrics.rs (same file as StageTiming):\n\n```rust\nuse std::collections::HashMap;\nuse std::sync::{Arc, Mutex};\nuse std::time::Instant;\nuse tracing::span::{Attributes, Id, Record};\nuse tracing::Subscriber;\nuse tracing_subscriber::layer::{Context, Layer};\nuse tracing_subscriber::registry::LookupSpan;\n\n#[derive(Debug)]\nstruct SpanData {\n name: String,\n parent_id: Option<Id>,\n start: Instant,\n fields: HashMap<String, serde_json::Value>,\n}\n\n#[derive(Debug, Clone)]\npub struct MetricsLayer {\n spans: Arc<Mutex<HashMap<u64, SpanData>>>,\n completed: Arc<Mutex<Vec<(u64, StageTiming)>>>,\n}\n\nimpl MetricsLayer {\n pub fn new() -> Self {\n Self {\n spans: Arc::new(Mutex::new(HashMap::new())),\n completed: Arc::new(Mutex::new(Vec::new())),\n }\n }\n\n /// Extract timing tree for a completed run.\n /// Call this after the root span closes.\n pub fn extract_timings(&self) -> Vec<StageTiming> {\n let completed = self.completed.lock().unwrap();\n // Build tree: find root entries (no parent), attach children\n // ... 
tree construction logic\n }\n}\n\nimpl<S> Layer<S> for MetricsLayer\nwhere\n S: Subscriber + for<'a> LookupSpan<'a>,\n{\n fn on_new_span(&self, attrs: &Attributes<'_>, id: &Id, ctx: Context<'_, S>) {\n let parent_id = ctx.span(id).and_then(|s| s.parent().map(|p| p.id()));\n let mut fields = HashMap::new();\n // Visit attrs to capture initial field values\n let mut visitor = FieldVisitor(&mut fields);\n attrs.record(&mut visitor);\n\n self.spans.lock().unwrap().insert(id.into_u64(), SpanData {\n name: attrs.metadata().name().to_string(),\n parent_id,\n start: Instant::now(),\n fields,\n });\n }\n\n fn on_record(&self, id: &Id, values: &Record<'_>, _ctx: Context<'_, S>) {\n // Capture recorded fields (items_processed, items_skipped, errors)\n if let Some(data) = self.spans.lock().unwrap().get_mut(&id.into_u64()) {\n let mut visitor = FieldVisitor(&mut data.fields);\n values.record(&mut visitor);\n }\n }\n\n fn on_close(&self, id: Id, _ctx: Context<'_, S>) {\n if let Some(data) = self.spans.lock().unwrap().remove(&id.into_u64()) {\n let elapsed = data.start.elapsed();\n let timing = StageTiming {\n name: data.name,\n project: data.fields.get(\"project\").and_then(|v| v.as_str()).map(String::from),\n elapsed_ms: elapsed.as_millis() as u64,\n items_processed: data.fields.get(\"items_processed\").and_then(|v| v.as_u64()).unwrap_or(0) as usize,\n items_skipped: data.fields.get(\"items_skipped\").and_then(|v| v.as_u64()).unwrap_or(0) as usize,\n errors: data.fields.get(\"errors\").and_then(|v| v.as_u64()).unwrap_or(0) as usize,\n sub_stages: vec![], // Will be populated during extract_timings tree construction\n };\n self.completed.lock().unwrap().push((id.into_u64(), timing));\n }\n }\n}\n```\n\nNeed a FieldVisitor struct implementing tracing::field::Visit to capture field values.\n\nRegister in subscriber stack (src/main.rs), alongside stderr and file layers:\n```rust\nlet metrics_layer = MetricsLayer::new();\nlet metrics_handle = metrics_layer.clone(); // Clone Arc for 
later extraction\n\nregistry()\n .with(stderr_layer.with_filter(stderr_filter))\n .with(file_layer.with_filter(file_filter))\n .with(metrics_layer) // No filter -- captures all spans\n .init();\n```\n\nPass metrics_handle to command handlers so they can call extract_timings() after the pipeline completes.\n\n## Acceptance Criteria\n- [ ] MetricsLayer captures span enter/close timing\n- [ ] on_record captures items_processed, items_skipped, errors fields\n- [ ] extract_timings() returns correctly nested Vec<StageTiming> tree\n- [ ] Parallel spans (multiple projects) both appear as sub_stages of parent\n- [ ] Thread-safe: Arc<Mutex<>> allows concurrent span operations\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/metrics.rs (add MetricsLayer, FieldVisitor, tree construction)\n- src/main.rs (register MetricsLayer in subscriber stack)\n\n## TDD Loop\nRED:\n - test_metrics_layer_single_span: enter/exit one span, extract, assert one StageTiming\n - test_metrics_layer_nested_spans: parent + child, assert child in parent.sub_stages\n - test_metrics_layer_parallel_spans: two sibling spans, assert both in parent.sub_stages\n - test_metrics_layer_field_recording: record items_processed=42, assert captured\nGREEN: Implement MetricsLayer with on_new_span, on_record, on_close, extract_timings\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Span ID reuse: tracing may reuse span IDs after close. Using remove on close prevents stale data.\n- Lock contention: Mutex per operation. For high-span-count scenarios, consider parking_lot::Mutex. But lore's span count is low (<100 per run), so std::sync::Mutex is fine.\n- extract_timings tree construction: iterate completed Vec, build parent->children map, then recursively construct StageTiming tree. Root entries have parent_id matching the root span or None.\n- MetricsLayer has no filter: it sees ALL spans. 
To avoid noise from dependency spans, check if span name starts with known stage names, or rely on the \"stage\" field being present.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:31.960669Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:25:25.523811Z","closed_at":"2026-02-04T17:25:25.523730Z","close_reason":"Implemented MetricsLayer custom tracing subscriber layer with span timing capture, rate-limit/retry event detection, tree extraction, and 12 unit tests","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-34ek","depends_on_id":"bd-1o4h","type":"blocks","created_at":"2026-02-04T15:55:19.851554Z","created_by":"tayloreernisse"},{"issue_id":"bd-34ek","depends_on_id":"bd-24j1","type":"blocks","created_at":"2026-02-04T15:55:19.905554Z","created_by":"tayloreernisse"},{"issue_id":"bd-34ek","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:31.961646Z","created_by":"tayloreernisse"}]}
{"id":"bd-34o","title":"Implement MR transformer","description":"## Background\nTransforms GitLab MR API responses into normalized schema for database storage. Handles deprecated field fallbacks and extracts metadata (labels, assignees, reviewers).\n\n## Approach\nCreate new transformer module following existing issue transformer pattern:\n- `NormalizedMergeRequest` - Database-ready struct\n- `MergeRequestWithMetadata` - MR + extracted labels/assignees/reviewers\n- `transform_merge_request()` - Main transformation function\n- `extract_labels()` - Label extraction helper\n\n## Files\n- `src/gitlab/transformers/merge_request.rs` - New transformer module\n- `src/gitlab/transformers/mod.rs` - Export new module\n- `tests/mr_transformer_tests.rs` - Unit tests\n\n## Acceptance Criteria\n- [ ] `NormalizedMergeRequest` struct exists with all DB columns\n- [ ] `MergeRequestWithMetadata` contains MR + label_names + assignee_usernames + reviewer_usernames\n- [ ] `transform_merge_request()` returns `Result<MergeRequestWithMetadata, String>`\n- [ ] `draft` computed as `gitlab_mr.draft || gitlab_mr.work_in_progress`\n- [ ] `detailed_merge_status` prefers `detailed_merge_status` over `merge_status_legacy`\n- [ ] `merge_user_username` prefers `merge_user` over `merged_by`\n- [ ] `head_sha` extracted from `sha` field\n- [ ] `references_short` and `references_full` extracted from `references` Option\n- [ ] Timestamps parsed with `iso_to_ms()`, errors returned (not zeroed)\n- [ ] `last_seen_at` set to `now_ms()`\n- [ ] `cargo test mr_transformer` passes\n\n## TDD Loop\nRED: `cargo test mr_transformer` -> module not found\nGREEN: Add transformer with all fields\nVERIFY: `cargo test mr_transformer`\n\n## Struct Definitions\n```rust\n#[derive(Debug, Clone)]\npub struct NormalizedMergeRequest {\n pub gitlab_id: i64,\n pub project_id: i64,\n pub iid: i64,\n pub title: String,\n pub description: Option<String>,\n pub state: String,\n pub draft: bool,\n pub author_username: String,\n pub 
source_branch: String,\n pub target_branch: String,\n pub head_sha: Option<String>,\n pub references_short: Option<String>,\n pub references_full: Option<String>,\n pub detailed_merge_status: Option<String>,\n pub merge_user_username: Option<String>,\n pub created_at: i64,\n pub updated_at: i64,\n pub merged_at: Option<i64>,\n pub closed_at: Option<i64>,\n pub last_seen_at: i64,\n pub web_url: String,\n}\n\n#[derive(Debug, Clone)]\npub struct MergeRequestWithMetadata {\n pub merge_request: NormalizedMergeRequest,\n pub label_names: Vec<String>,\n pub assignee_usernames: Vec<String>,\n pub reviewer_usernames: Vec<String>,\n}\n```\n\n## Function Signature\n```rust\npub fn transform_merge_request(\n gitlab_mr: &GitLabMergeRequest,\n local_project_id: i64,\n) -> Result<MergeRequestWithMetadata, String>\n```\n\n## Key Logic\n```rust\n// Draft: prefer draft, fallback to work_in_progress\nlet is_draft = gitlab_mr.draft || gitlab_mr.work_in_progress;\n\n// Merge status: prefer detailed_merge_status\nlet detailed_merge_status = gitlab_mr.detailed_merge_status\n .clone()\n .or_else(|| gitlab_mr.merge_status_legacy.clone());\n\n// Merge user: prefer merge_user\nlet merge_user_username = gitlab_mr.merge_user\n .as_ref()\n .map(|u| u.username.clone())\n .or_else(|| gitlab_mr.merged_by.as_ref().map(|u| u.username.clone()));\n\n// References extraction\nlet (references_short, references_full) = gitlab_mr.references\n .as_ref()\n .map(|r| (Some(r.short.clone()), Some(r.full.clone())))\n .unwrap_or((None, None));\n\n// Head SHA\nlet head_sha = gitlab_mr.sha.clone();\n```\n\n## Edge Cases\n- Invalid timestamps should return `Err`, not zero values\n- Empty labels/assignees/reviewers should return empty Vecs, not None\n- `state` must pass through as-is (including 
\"locked\")","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:40.849049Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:11:48.501301Z","closed_at":"2026-01-27T00:11:48.501241Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-34o","depends_on_id":"bd-3ir","type":"blocks","created_at":"2026-01-26T22:08:54.023616Z","created_by":"tayloreernisse"},{"issue_id":"bd-34o","depends_on_id":"bd-5ta","type":"blocks","created_at":"2026-01-26T22:08:54.059646Z","created_by":"tayloreernisse"}]}
{"id":"bd-35o","title":"Create golden query test suite","description":"## Background\nGolden query tests verify end-to-end search quality with known-good expected results. They use a seeded SQLite DB with deterministic fixture data and fixed embedding vectors (no Ollama dependency). Each test query must return at least one expected URL in the top 10 results. These tests catch search regressions (ranking changes, filter bugs, missing results).\n\n## Approach\nCreate test infrastructure:\n\n**1. tests/fixtures/golden_queries.json:**\n```json\n[\n {\n \"query\": \"authentication login\",\n \"mode\": \"lexical\",\n \"filters\": {},\n \"expected_urls\": [\"https://gitlab.example.com/group/project/-/issues/234\"],\n \"min_results\": 1,\n \"max_rank\": 10\n },\n {\n \"query\": \"jwt token refresh\",\n \"mode\": \"hybrid\",\n \"filters\": {\"type\": \"merge_request\"},\n \"expected_urls\": [\"https://gitlab.example.com/group/project/-/merge_requests/456\"],\n \"min_results\": 1,\n \"max_rank\": 10\n }\n]\n```\n\n**2. 
Test harness (tests/golden_query_tests.rs):**\n- Load golden_queries.json\n- Create in-memory DB, apply all migrations\n- Seed with deterministic fixture documents (issues, MRs, discussions)\n- For hybrid/semantic queries: seed with fixed embedding vectors (768-dim, manually constructed for known similarity)\n- For each query: run search, verify expected URL in top N results\n\n**Fixture data design:**\n- 10-20 documents covering different source types\n- Known content that matches expected queries\n- Fixed embeddings: construct vectors where similar documents have small cosine distance\n- No randomness — fully deterministic\n\n## Acceptance Criteria\n- [ ] Golden queries file exists with at least 5 test queries\n- [ ] Test harness loads queries and validates each\n- [ ] All golden queries pass: expected URL in top 10\n- [ ] No external dependencies (no Ollama, no GitLab)\n- [ ] Deterministic fixture data (fixed embeddings, fixed content)\n- [ ] `cargo test --test golden_query_tests` passes in CI\n\n## Files\n- `tests/fixtures/golden_queries.json` — new file\n- `tests/golden_query_tests.rs` — new file (or tests/golden_queries.rs)\n\n## TDD Loop\nRED: Create golden_queries.json with expected results, harness fails (no fixture data)\nGREEN: Seed fixture data that satisfies expected results\nVERIFY: `cargo test --test golden_query_tests`\n\n## Edge Cases\n- Query matches multiple expected URLs: all must be present\n- Lexical queries: FTS ranking determines position, not vector\n- Hybrid queries: RRF combines both signals — fixed vectors must be designed to produce expected ranking\n- Empty result for a golden query: test failure with clear message showing actual results","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:27:21.788493Z","created_by":"tayloreernisse","updated_at":"2026-01-30T18:12:47.085563Z","closed_at":"2026-01-30T18:12:47.085363Z","close_reason":"Golden query test suite: 7 golden queries in fixture, 8 seeded documents, 2 
test functions (all_pass + fixture_valid), deterministic in-memory DB, no external deps. 312 total tests pass.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-35o","depends_on_id":"bd-2no","type":"blocks","created_at":"2026-01-30T15:29:35.641568Z","created_by":"tayloreernisse"}]}
{"id":"bd-35r","title":"[CP1] Discussion and note transformers","description":"Transform GitLab discussion/note payloads to normalized database schema.\n\nFunctions to implement:\n- transformDiscussion(gitlabDiscussion, localProjectId, localIssueId) → NormalizedDiscussion\n- transformNotes(gitlabDiscussion, localProjectId) → NormalizedNote[]\n\nTransformation rules:\n- Compute first_note_at/last_note_at from notes array\n- Compute resolvable/resolved status from notes\n- Set is_system from note.system\n- Preserve note order via position (array index)\n- Convert ISO timestamps to ms epoch\n\nFiles: src/gitlab/transformers/discussion.ts\nTests: tests/unit/discussion-transformer.test.ts\nDone when: Unit tests pass for discussion/note transformation with system note flagging","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:16.861421Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.154646Z","deleted_at":"2026-01-25T15:21:35.154643Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-36m","title":"Final validation and test coverage","description":"## Background\nFinal validation gate ensuring all CP2 features work correctly. Verifies tests, lint, and manual smoke tests pass.\n\n## Approach\nRun comprehensive validation:\n1. Automated tests (unit + integration)\n2. Clippy and formatting\n3. Critical test case verification\n4. Gate A/B/C/D/E checklist\n5. Manual smoke tests\n\n## Files\nNone - validation only\n\n## Acceptance Criteria\n- [ ] `cargo test` passes (all tests green)\n- [ ] `cargo test --release` passes\n- [ ] `cargo clippy -- -D warnings` passes (zero warnings)\n- [ ] `cargo fmt --check` passes\n- [ ] Critical tests pass (see list below)\n- [ ] Gate A/B/C/D/E verification complete\n- [ ] Manual smoke tests pass\n\n## Validation Commands\n```bash\n# 1. Build and test\ncargo build --release\ncargo test --release\n\n# 2. Lint\ncargo clippy -- -D warnings\ncargo fmt --check\n\n# 3. Run specific critical tests\ncargo test does_not_advance_discussion_watermark_on_partial_failure\ncargo test prefers_detailed_merge_status_when_both_fields_present\ncargo test prefers_merge_user_when_both_fields_present\ncargo test prefers_draft_when_both_draft_and_work_in_progress_present\ncargo test atomic_note_replacement_preserves_data_on_parse_failure\ncargo test full_sync_resets_discussion_watermarks\n```\n\n## Critical Test Cases\n| Test | What It Verifies |\n|------|------------------|\n| `does_not_advance_discussion_watermark_on_partial_failure` | Pagination failure doesn't lose data |\n| `prefers_detailed_merge_status_when_both_fields_present` | Non-deprecated field wins |\n| `prefers_merge_user_when_both_fields_present` | Non-deprecated field wins |\n| `prefers_draft_when_both_draft_and_work_in_progress_present` | OR semantics for draft |\n| `atomic_note_replacement_preserves_data_on_parse_failure` | Parse before delete |\n| `full_sync_resets_discussion_watermarks` | --full truly refreshes |\n\n## Gate Checklist\n\n### Gate A: MRs Only\n- 
[ ] `gi ingest --type=merge_requests` fetches all MRs\n- [ ] MR state supports: opened, merged, closed, locked\n- [ ] draft field captured with work_in_progress fallback\n- [ ] detailed_merge_status used with merge_status fallback\n- [ ] head_sha and references captured\n- [ ] Cursor-based sync is resumable\n\n### Gate B: Labels + Assignees + Reviewers\n- [ ] Labels linked via mr_labels junction\n- [ ] Stale labels removed on resync\n- [ ] Assignees linked via mr_assignees\n- [ ] Reviewers linked via mr_reviewers\n\n### Gate C: Dependent Discussion Sync\n- [ ] Discussions fetched for MRs with updated_at advancement\n- [ ] DiffNote position metadata captured\n- [ ] DiffNote SHA triplet captured\n- [ ] Upsert + sweep pattern for notes\n- [ ] Watermark NOT advanced on partial failure\n- [ ] Unchanged MRs skip discussion refetch\n\n### Gate D: Resumability Proof\n- [ ] Kill mid-run, rerun -> bounded redo\n- [ ] `--full` resets cursor AND discussion watermarks\n- [ ] Single-flight lock prevents concurrent runs\n\n### Gate E: CLI Complete\n- [ ] `gi list mrs` with all filters including --draft/--no-draft\n- [ ] `gi show mr <iid>` with discussions and DiffNote context\n- [ ] `gi count mrs` with state breakdown\n- [ ] `gi sync-status` shows MR cursors\n\n## Manual Smoke Tests\n| Command | Expected |\n|---------|----------|\n| `gi ingest --type=merge_requests` | Completes, shows counts |\n| `gi list mrs --limit=10` | Shows 10 MRs with correct columns |\n| `gi list mrs --state=merged` | Only merged MRs |\n| `gi list mrs --draft` | Only draft MRs with [DRAFT] prefix |\n| `gi show mr <iid>` | Full detail with discussions |\n| `gi count mrs` | Count with state breakdown |\n| Re-run ingest | \"0 new MRs\", skipped discussion count |\n| `gi ingest --type=merge_requests --full` | Full resync |\n\n## Data Integrity Checks\n```sql\n-- MR count matches GitLab\nSELECT COUNT(*) FROM merge_requests;\n\n-- Every MR has raw payload\nSELECT COUNT(*) FROM merge_requests WHERE raw_payload_id 
IS NULL;\n-- Should be 0\n\n-- Labels linked correctly\nSELECT m.iid, COUNT(ml.label_id) \nFROM merge_requests m\nLEFT JOIN mr_labels ml ON ml.merge_request_id = m.id\nGROUP BY m.id;\n\n-- DiffNotes have position metadata\nSELECT COUNT(*) FROM notes WHERE position_new_path IS NOT NULL;\n```","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:43.697983Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:45:17.794393Z","closed_at":"2026-01-27T00:45:17.794325Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-36m","depends_on_id":"bd-3js","type":"blocks","created_at":"2026-01-26T22:08:55.409785Z","created_by":"tayloreernisse"},{"issue_id":"bd-36m","depends_on_id":"bd-mk3","type":"blocks","created_at":"2026-01-26T22:08:55.340118Z","created_by":"tayloreernisse"}]}
{"id":"bd-36p","title":"Implement document types and extractor module","description":"## Background\nThe document types module is the foundational data layer for all document operations. It defines SourceType (the enum used everywhere to identify entity types), DocumentData (the struct passed between extractors, regenerator, and storage), and hash functions (for change detection that skips unchanged documents). This bead has 7 downstream dependents — it's the most-depended-on implementation bead.\n\n## Approach\nCreate `src/documents/` module with `mod.rs` and `extractor.rs`.\n\n**src/documents/mod.rs:**\n```rust\nmod extractor;\nmod regenerator; // placeholder: pub mod added by later bead\nmod truncation; // placeholder: pub mod added by later bead\n\npub use extractor::{\n extract_discussion_document, extract_issue_document, extract_mr_document,\n DocumentData, SourceType, compute_content_hash, compute_list_hash,\n};\npub use regenerator::regenerate_dirty_documents;\npub use truncation::{truncate_content, TruncationResult};\n```\n\n**src/documents/extractor.rs — types only (this bead):**\n- `SourceType` enum with `Issue`, `MergeRequest`, `Discussion` variants\n- `SourceType::as_str()`, `SourceType::parse()` (accepts aliases), `Display` impl\n- `DocumentData` struct with all fields per PRD Section 2.2\n- `compute_content_hash(content: &str) -> String` using SHA-256\n- `compute_list_hash(items: &[String]) -> String` using sorted join + SHA-256\n\n**Dependencies:** Add `sha2 = \"0.10\"` to Cargo.toml if not already present.\n\nNote: The `extract_*_document()` functions are stub signatures in this bead. 
Their implementations are in separate beads (bd-247, bd-1yz, bd-2fp).\n\n**src/lib.rs:** Add `pub mod documents;`\n\n## Acceptance Criteria\n- [ ] `SourceType::parse(\"issue\")` returns `Some(Issue)`\n- [ ] `SourceType::parse(\"mr\")` returns `Some(MergeRequest)` (alias)\n- [ ] `SourceType::parse(\"mrs\")` returns `Some(MergeRequest)` (alias)\n- [ ] `SourceType::parse(\"merge_request\")` returns `Some(MergeRequest)`\n- [ ] `SourceType::parse(\"discussion\")` returns `Some(Discussion)`\n- [ ] `SourceType::parse(\"invalid\")` returns `None`\n- [ ] `SourceType::as_str()` returns \"issue\", \"merge_request\", \"discussion\"\n- [ ] `compute_content_hash(\"hello\")` returns deterministic SHA-256 hex string\n- [ ] `compute_list_hash([\"b\", \"a\"])` == `compute_list_hash([\"a\", \"b\"])` (sorted)\n- [ ] `DocumentData` struct compiles with all fields per PRD\n- [ ] `cargo build` succeeds\n- [ ] `cargo test documents` passes\n\n## Files\n- `src/documents/mod.rs` — new file (module root with pub use)\n- `src/documents/extractor.rs` — new file (types + hash functions)\n- `src/documents/regenerator.rs` — new file (placeholder, empty module or todo!())\n- `src/documents/truncation.rs` — new file (placeholder, empty module or todo!())\n- `src/lib.rs` — add `pub mod documents;`\n- `Cargo.toml` — add `sha2 = \"0.10\"` if not present\n\n## TDD Loop\nRED: Tests in `extractor.rs` `#[cfg(test)] mod tests`:\n- `test_source_type_parse_aliases` — all parse variants\n- `test_source_type_as_str` — roundtrip as_str\n- `test_content_hash_deterministic` — same input = same hash\n- `test_list_hash_order_independent` — sorted before hashing\n- `test_list_hash_empty` — empty vec produces consistent hash\nGREEN: Implement types and hash functions\nVERIFY: `cargo test documents`\n\n## Edge Cases\n- Empty string content hash: must produce valid SHA-256 (not panic)\n- Empty labels list: compute_list_hash returns hash of empty string\n- SourceType Display shows snake_case (same as 
as_str)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:45.456566Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:55:18.636263Z","closed_at":"2026-01-30T16:55:18.636205Z","close_reason":"Completed: SourceType enum with parse/as_str/Display, DocumentData struct, compute_content_hash, compute_list_hash, 8 tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-36p","depends_on_id":"bd-3lc","type":"blocks","created_at":"2026-01-30T15:29:15.607888Z","created_by":"tayloreernisse"}]}
{"id":"bd-37qw","title":"OBSERV: Generate run_id at command entry in main.rs","description":"## Background\nEvery sync/ingest run needs a unique correlation ID so log lines, database records, and robot JSON can be linked. The uuid crate (v1, v4 feature) is already in Cargo.toml (line ~48). Using first 8 chars of UUIDv4 gives ~4 billion unique values.\n\n## Approach\nIn src/main.rs, after CLI parsing (line ~60) and before command dispatch (line ~73), generate run_id:\n\n```rust\nlet run_id = uuid::Uuid::new_v4().to_string();\nlet run_id = &run_id[..8]; // First 8 hex chars: e.g., \"a1b2c3d4\"\n```\n\nPass run_id to command handlers that need it (sync, ingest). This requires adding a run_id parameter to handle_sync_cmd() and handle_ingest(). Other commands (doctor, list, show, etc.) don't need correlation IDs.\n\nAlternative: generate run_id inside each command handler instead of main(). This avoids changing signatures of commands that don't need it. Prefer this approach -- generate inside run_sync() and run_ingest() directly.\n\nPreferred approach (generate in command handlers):\n```rust\n// In src/cli/commands/sync.rs run_sync():\nlet run_id = &uuid::Uuid::new_v4().to_string()[..8];\n\n// In src/cli/commands/ingest.rs run_ingest():\nlet run_id = &uuid::Uuid::new_v4().to_string()[..8];\n```\n\nThis keeps main.rs clean and only adds run_id where it's used.\n\n## Acceptance Criteria\n- [ ] run_id is generated for every sync and ingest invocation\n- [ ] run_id is exactly 8 characters, all hex (0-9, a-f)\n- [ ] run_id is unique across invocations (probabilistic, ~4 billion space)\n- [ ] No new dependencies needed (uuid already present)\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/sync.rs (generate run_id in run_sync)\n- src/cli/commands/ingest.rs (generate run_id in run_ingest)\n\n## TDD Loop\nRED:\n - test_run_id_format: generate 100 run_ids, assert each is 8 chars, all hex\n - test_run_id_uniqueness: generate 1000 run_ids, 
assert no duplicates\nGREEN: Add run_id generation to run_sync and run_ingest\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- UUID string format: Uuid::new_v4().to_string() produces \"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx\". First 8 chars are always hex (no hyphens). Safe to slice.\n- &run_id[..8] on a String: this is a byte slice on ASCII chars, always valid UTF-8. No panic risk.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:54:07.673765Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:19:31.361619Z","closed_at":"2026-02-04T17:19:31.361575Z","close_reason":"Generated run_id via uuid::Uuid::new_v4().simple() at entry of run_sync and run_ingest, truncated to 8 hex chars","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-37qw","depends_on_id":"bd-2ni","type":"parent-child","created_at":"2026-02-04T15:54:07.677239Z","created_by":"tayloreernisse"}]}
{"id":"bd-38e","title":"[CP0] gi init command - interactive setup wizard","description":"## Background\n\nThe init command is the user's first interaction with gi. It must guide them through setup, validate everything works before writing config, and leave the system in a ready-to-use state. Poor UX here will frustrate new users.\n\nReference: docs/prd/checkpoint-0.md section \"gi init\"\n\n## Approach\n\n**src/cli/commands/init.ts:**\n\nInteractive flow (using inquirer):\n1. Check if config exists at target path\n - If exists and no --force: prompt \"Config exists. Overwrite? [y/N]\"\n - If --non-interactive and config exists: exit 2\n2. Prompt for GitLab base URL (validate URL format)\n3. Prompt for token env var name (default: GITLAB_TOKEN)\n4. Check token is set in environment\n - If not set: exit 1 with \"Export GITLAB_TOKEN first\"\n5. Test auth with GET /api/v4/user\n - If 401: exit 1 with \"Authentication failed\"\n - Show \"Authenticated as @username (Display Name)\"\n6. Prompt for project paths (comma-separated or add one at a time)\n7. Validate each project with GET /api/v4/projects/:encoded_path\n - If 404: exit 1 with \"Project not found: group/project\"\n - Show \"✓ group/project (Project Name)\"\n8. Write config.json to target path\n9. Initialize database with migrations\n10. Insert validated projects into projects table\n11. Show \"Setup complete! 
Run 'gi doctor' to verify.\"\n\n**Flags:**\n- `--config <path>`: Write config to specific path\n- `--force`: Skip overwrite confirmation\n- `--non-interactive`: Fail if prompts would be shown (for CI/scripting)\n\n## Acceptance Criteria\n\n- [ ] Creates config.json with valid structure\n- [ ] Validates GitLab URL is reachable before writing config\n- [ ] Validates token with GET /api/v4/user before writing config\n- [ ] Validates each project path exists in GitLab before writing config\n- [ ] Fails with exit 1 if token not set in environment\n- [ ] Fails with exit 1 if GitLab auth fails\n- [ ] Fails with exit 1 if any project not found\n- [ ] Prompts before overwriting existing config (unless --force)\n- [ ] --force skips overwrite confirmation\n- [ ] --non-interactive fails if prompts would be shown\n- [ ] Creates data directory and applies DB migrations\n- [ ] Inserts validated projects into projects table\n- [ ] tests/integration/init.test.ts passes (11 tests)\n\n## Files\n\nCREATE:\n- src/cli/commands/init.ts\n- tests/integration/init.test.ts\n\n## TDD Loop\n\nRED:\n```typescript\n// tests/integration/init.test.ts\ndescribe('gi init', () => {\n it('creates config file with valid structure')\n it('validates GitLab URL format')\n it('validates GitLab connection before writing config')\n it('validates each project path exists in GitLab')\n it('fails if token not set')\n it('fails if GitLab auth fails')\n it('fails if any project path not found')\n it('prompts before overwriting existing config')\n it('respects --force to skip confirmation')\n it('generates config with sensible defaults')\n it('creates data directory if missing')\n})\n```\n\nGREEN: Implement init.ts\n\nVERIFY: `npm run test -- tests/integration/init.test.ts`\n\n## Edge Cases\n\n- User cancels at any prompt: exit 2 (user cancelled)\n- Network error during validation: show specific error, exit 1\n- Token has wrong scopes (no read_api): auth succeeds but project fetch fails\n- Project path with 
special characters must be URL-encoded\n- Config directory might not exist - create with mkdirSync recursive\n- --non-interactive with missing env var should fail immediately","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:50.810720Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:27:07.775170Z","closed_at":"2026-01-25T03:27:07.774984Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-38e","depends_on_id":"bd-13b","type":"blocks","created_at":"2026-01-24T16:13:09.682253Z","created_by":"tayloreernisse"},{"issue_id":"bd-38e","depends_on_id":"bd-1l1","type":"blocks","created_at":"2026-01-24T16:13:09.733568Z","created_by":"tayloreernisse"},{"issue_id":"bd-38e","depends_on_id":"bd-3ng","type":"blocks","created_at":"2026-01-24T16:13:09.715644Z","created_by":"tayloreernisse"},{"issue_id":"bd-38e","depends_on_id":"bd-epj","type":"blocks","created_at":"2026-01-24T16:13:09.699092Z","created_by":"tayloreernisse"}]}
{"id":"bd-38q","title":"Implement dirty source tracking module","description":"## Background\nDirty source tracking drives incremental document regeneration. When entities are upserted during ingestion, they're marked dirty. The regenerator drains this queue. The key constraint: mark_dirty_tx() takes &Transaction to enforce atomic marking inside the entity upsert transaction. Uses ON CONFLICT DO UPDATE (not INSERT OR IGNORE) to reset backoff on re-queue.\n\n## Approach\nCreate `src/ingestion/dirty_tracker.rs` per PRD Section 6.1.\n\n```rust\nconst DIRTY_SOURCES_BATCH_SIZE: usize = 500;\n\n/// Mark dirty INSIDE existing transaction. Takes &Transaction, NOT &Connection.\n/// ON CONFLICT resets ALL backoff/error state (not INSERT OR IGNORE).\n/// This ensures fresh updates are immediately eligible, not stuck behind stale backoff.\npub fn mark_dirty_tx(\n tx: &rusqlite::Transaction<'_>,\n source_type: SourceType,\n source_id: i64,\n) -> Result<()>;\n\n/// Convenience wrapper for non-transactional contexts.\npub fn mark_dirty(conn: &Connection, source_type: SourceType, source_id: i64) -> Result<()>;\n\n/// Get dirty sources ready for processing.\n/// WHERE next_attempt_at IS NULL OR next_attempt_at <= now\n/// ORDER BY attempt_count ASC, queued_at ASC (failed items deprioritized)\n/// LIMIT 500\npub fn get_dirty_sources(conn: &Connection) -> Result<Vec<(SourceType, i64)>>;\n\n/// Clear dirty entry after successful processing.\npub fn clear_dirty(conn: &Connection, source_type: SourceType, source_id: i64) -> Result<()>;\n```\n\n**PRD-specific details:**\n- get_dirty_sources ORDER BY: `attempt_count ASC, queued_at ASC` (failed items processed AFTER fresh items)\n- mark_dirty_tx ON CONFLICT resets: queued_at, attempt_count=0, last_attempt_at=NULL, last_error=NULL, next_attempt_at=NULL\n- SourceType parsed from string in query results via match on \"issue\"/\"merge_request\"/\"discussion\"\n- Invalid source_type in DB -> 
rusqlite::Error::FromSqlConversionFailure\n\n**Error recording is in regenerator.rs (bd-1u1)**, not dirty_tracker. The dirty_tracker only marks, gets, and clears.\n\n## Acceptance Criteria\n- [ ] mark_dirty_tx takes &Transaction<'_>, NOT &Connection\n- [ ] ON CONFLICT DO UPDATE resets: attempt_count=0, next_attempt_at=NULL, last_error=NULL, last_attempt_at=NULL\n- [ ] Uses ON CONFLICT DO UPDATE, NOT INSERT OR IGNORE (PRD explains why)\n- [ ] get_dirty_sources WHERE next_attempt_at IS NULL OR <= now\n- [ ] get_dirty_sources ORDER BY attempt_count ASC, queued_at ASC\n- [ ] get_dirty_sources LIMIT 500\n- [ ] get_dirty_sources returns Vec<(SourceType, i64)>\n- [ ] clear_dirty DELETEs entry\n- [ ] Queue drains completely when called in loop\n- [ ] `cargo test dirty_tracker` passes\n\n## Files\n- `src/ingestion/dirty_tracker.rs` — new file\n- `src/ingestion/mod.rs` — add `pub mod dirty_tracker;`\n\n## TDD Loop\nRED: Tests:\n- `test_mark_dirty_tx_inserts` — entry appears in dirty_sources\n- `test_requeue_resets_backoff` — mark, simulate error state, re-mark -> attempt_count=0, next_attempt_at=NULL\n- `test_get_respects_backoff` — entry with future next_attempt_at not returned\n- `test_get_orders_by_attempt_count` — fresh items before failed items\n- `test_batch_size_500` — insert 600, get returns 500\n- `test_clear_removes` — entry gone after clear\n- `test_drain_loop` — insert 1200, loop 3 times = empty\nGREEN: Implement all functions\nVERIFY: `cargo test dirty_tracker`\n\n## Edge Cases\n- Empty queue: get returns empty Vec\n- Invalid source_type string in DB: FromSqlConversionFailure error\n- Concurrent mark + get: ON CONFLICT handles race condition","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:27:09.434845Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:31:35.455315Z","closed_at":"2026-01-30T17:31:35.455127Z","close_reason":"Implemented dirty_tracker with mark_dirty_tx, 
get_dirty_sources, clear_dirty, record_dirty_error + 8 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-38q","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:34.914038Z","created_by":"tayloreernisse"},{"issue_id":"bd-38q","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:34.961390Z","created_by":"tayloreernisse"},{"issue_id":"bd-38q","depends_on_id":"bd-mem","type":"blocks","created_at":"2026-01-30T15:29:34.995197Z","created_by":"tayloreernisse"}]}
{"id":"bd-39w","title":"[CP1] Test fixtures for mocked GitLab responses","description":"## Background\n\nTest fixtures provide mocked GitLab API responses for unit and integration tests. They enable testing without a live GitLab instance and ensure consistent test data across runs.\n\n## Approach\n\n### Fixture Files\n\nCreate JSON fixtures that match GitLab API response shapes:\n\n```\ntests/fixtures/\n├── gitlab_issue.json # Single issue\n├── gitlab_issues_page.json # Array of issues (pagination test)\n├── gitlab_discussion.json # Single discussion with notes\n└── gitlab_discussions_page.json # Array of discussions\n```\n\n### gitlab_issue.json\n\n```json\n{\n \"id\": 12345,\n \"iid\": 42,\n \"project_id\": 100,\n \"title\": \"Test issue title\",\n \"description\": \"Test issue description\",\n \"state\": \"opened\",\n \"created_at\": \"2024-01-15T10:00:00.000Z\",\n \"updated_at\": \"2024-01-20T15:30:00.000Z\",\n \"closed_at\": null,\n \"author\": {\n \"id\": 1,\n \"username\": \"testuser\",\n \"name\": \"Test User\"\n },\n \"labels\": [\"bug\", \"priority::high\"],\n \"web_url\": \"https://gitlab.example.com/group/project/-/issues/42\"\n}\n```\n\n### gitlab_discussion.json\n\n```json\n{\n \"id\": \"6a9c1750b37d513a43987b574953fceb50b03ce7\",\n \"individual_note\": false,\n \"notes\": [\n {\n \"id\": 1001,\n \"type\": \"DiscussionNote\",\n \"body\": \"First comment in thread\",\n \"author\": { \"id\": 1, \"username\": \"testuser\", \"name\": \"Test User\" },\n \"created_at\": \"2024-01-16T09:00:00.000Z\",\n \"updated_at\": \"2024-01-16T09:00:00.000Z\",\n \"system\": false,\n \"resolvable\": true,\n \"resolved\": false,\n \"resolved_by\": null,\n \"resolved_at\": null,\n \"position\": null\n },\n {\n \"id\": 1002,\n \"type\": \"DiscussionNote\",\n \"body\": \"Reply to first comment\",\n \"author\": { \"id\": 2, \"username\": \"reviewer\", \"name\": \"Reviewer\" },\n \"created_at\": \"2024-01-16T10:00:00.000Z\",\n \"updated_at\": \"2024-01-16T10:00:00.000Z\",\n 
\"system\": false,\n \"resolvable\": true,\n \"resolved\": false,\n \"resolved_by\": null,\n \"resolved_at\": null,\n \"position\": null\n }\n ]\n}\n```\n\n### Helper Module\n\n```rust\n// tests/fixtures/mod.rs\n\npub fn load_fixture<T: DeserializeOwned>(name: &str) -> T {\n let path = PathBuf::from(env!(\"CARGO_MANIFEST_DIR\"))\n .join(\"tests/fixtures\")\n .join(name);\n let content = std::fs::read_to_string(&path)\n .expect(&format!(\"Failed to read fixture: {}\", name));\n serde_json::from_str(&content)\n .expect(&format!(\"Failed to parse fixture: {}\", name))\n}\n\npub fn gitlab_issue() -> GitLabIssue {\n load_fixture(\"gitlab_issue.json\")\n}\n\npub fn gitlab_issues_page() -> Vec<GitLabIssue> {\n load_fixture(\"gitlab_issues_page.json\")\n}\n\npub fn gitlab_discussion() -> GitLabDiscussion {\n load_fixture(\"gitlab_discussion.json\")\n}\n```\n\n## Acceptance Criteria\n\n- [ ] gitlab_issue.json deserializes to GitLabIssue correctly\n- [ ] gitlab_issues_page.json contains 3+ issues for pagination tests\n- [ ] gitlab_discussion.json contains multi-note thread\n- [ ] gitlab_discussions_page.json contains mix of individual_note true/false\n- [ ] At least one fixture includes system: true note\n- [ ] Helper functions load fixtures without panic\n\n## Files\n\n- tests/fixtures/gitlab_issue.json (create)\n- tests/fixtures/gitlab_issues_page.json (create)\n- tests/fixtures/gitlab_discussion.json (create)\n- tests/fixtures/gitlab_discussions_page.json (create)\n- tests/fixtures/mod.rs (create)\n\n## TDD Loop\n\nRED:\n```rust\n#[test] fn fixture_gitlab_issue_deserializes()\n#[test] fn fixture_gitlab_discussion_deserializes()\n#[test] fn fixture_has_system_note()\n```\n\nGREEN: Create JSON fixtures and helper module\n\nVERIFY: `cargo test fixture`\n\n## Edge Cases\n\n- Include issue with empty labels array\n- Include issue with null description\n- Include system note (system: true)\n- Include individual_note: true discussion (standalone comment)\n- Timestamps must be 
valid ISO 8601","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-25T17:02:38.433752Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:48:08.415195Z","closed_at":"2026-01-25T22:48:08.415132Z","close_reason":"Created 4 JSON fixture files (issue, issues_page, discussion, discussions_page) with helper tests - 6 tests passing","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-39w","depends_on_id":"bd-1np","type":"blocks","created_at":"2026-01-25T17:04:05.770848Z","created_by":"tayloreernisse"}]}
{"id":"bd-3ae","title":"Epic: CP2 Gate A - MRs Only","description":"## Background\nGate A validates core MR ingestion works before adding complexity. Proves the cursor-based sync, pagination, and basic CLI work. This is the foundation - if Gate A fails, nothing else matters.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] `gi ingest --type=merge_requests` completes without error\n- [ ] `SELECT COUNT(*) FROM merge_requests` > 0\n- [ ] `gi list mrs --limit=5` shows 5 MRs with iid, title, state, author\n- [ ] `gi count mrs` shows total count matching DB query\n- [ ] MR with `state=locked` can be stored (if exists in test data)\n- [ ] Draft MR shows `draft=1` in DB and `[DRAFT]` in list output\n- [ ] `work_in_progress=true` MR shows `draft=1` (fallback works)\n- [ ] `head_sha` populated for MRs with commits\n- [ ] `references_short` and `references_full` populated\n- [ ] Re-run ingest shows \"0 new MRs\" or minimal refetch (cursor working)\n- [ ] Cursor saved at page boundary, not item boundary\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate A: MRs Only ===\"\n\n# 1. Clear any existing MR data for clean test\necho \"Step 1: Reset MR cursor for clean test...\"\nsqlite3 \"$DB_PATH\" \"DELETE FROM sync_cursors WHERE resource_type = 'merge_requests';\"\n\n# 2. Run MR ingestion\necho \"Step 2: Ingest MRs...\"\ngi ingest --type=merge_requests\n\n# 3. Verify MRs exist\necho \"Step 3: Verify MR count...\"\nMR_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM merge_requests;\")\necho \" MR count: $MR_COUNT\"\n[ \"$MR_COUNT\" -gt 0 ] || { echo \"FAIL: No MRs ingested\"; exit 1; }\n\n# 4. Verify list command\necho \"Step 4: Test list command...\"\ngi list mrs --limit=5\n\n# 5. Verify count command\necho \"Step 5: Test count command...\"\ngi count mrs\n\n# 6. 
Verify draft handling\necho \"Step 6: Check draft MRs...\"\nDRAFT_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM merge_requests WHERE draft = 1;\")\necho \" Draft MR count: $DRAFT_COUNT\"\n\n# 7. Verify head_sha population\necho \"Step 7: Check head_sha...\"\nSHA_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM merge_requests WHERE head_sha IS NOT NULL;\")\necho \" MRs with head_sha: $SHA_COUNT\"\n\n# 8. Verify references\necho \"Step 8: Check references...\"\nREF_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM merge_requests WHERE references_short IS NOT NULL;\")\necho \" MRs with references: $REF_COUNT\"\n\n# 9. Verify cursor saved\necho \"Step 9: Check cursor...\"\nCURSOR=$(sqlite3 \"$DB_PATH\" \"SELECT updated_at, gitlab_id FROM sync_cursors WHERE resource_type = 'merge_requests';\")\necho \" Cursor: $CURSOR\"\n[ -n \"$CURSOR\" ] || { echo \"FAIL: Cursor not saved\"; exit 1; }\n\n# 10. Re-run and verify minimal refetch\necho \"Step 10: Re-run ingest (should be minimal)...\"\ngi ingest --type=merge_requests\n# Output should show minimal or zero new MRs\n\necho \"\"\necho \"=== Gate A: PASSED ===\"\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Run these in order:\ngi ingest --type=merge_requests\ngi list mrs --limit=10\ngi count mrs\n\n# Verify in DB:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n COUNT(*) as total,\n SUM(CASE WHEN draft = 1 THEN 1 ELSE 0 END) as drafts,\n SUM(CASE WHEN head_sha IS NOT NULL THEN 1 ELSE 0 END) as with_sha,\n SUM(CASE WHEN references_short IS NOT NULL THEN 1 ELSE 0 END) as with_refs\n FROM merge_requests;\n\"\n\n# Re-run (should be no-op):\ngi ingest --type=merge_requests\n```\n\n## Dependencies\nThis gate requires these beads to be complete:\n- bd-3ir (Database migration)\n- bd-5ta (GitLab MR types)\n- bd-34o (MR transformer)\n- bd-iba (GitLab client pagination)\n- bd-ser (MR ingestion module)\n\n## Edge Cases\n- `locked` state is transitional (merge in progress); may not exist in test 
data\n- Some older GitLab instances may not return `head_sha` for all MRs\n- `work_in_progress` is deprecated but should still work as fallback\n- Very large projects (10k+ MRs) may take significant time on first sync","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:00.966522Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.057298Z","closed_at":"2026-01-27T00:48:21.057225Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3ae","depends_on_id":"bd-iba","type":"blocks","created_at":"2026-01-26T22:08:55.576626Z","created_by":"tayloreernisse"},{"issue_id":"bd-3ae","depends_on_id":"bd-ser","type":"blocks","created_at":"2026-01-26T22:08:55.446814Z","created_by":"tayloreernisse"}]}
{"id":"bd-3as","title":"Implement timeline event collection and chronological interleaving","description":"## Background\n\nThe event collection phase is steps 4-5 of the timeline pipeline (spec Section 3.2). It takes the seed + expanded entity sets and collects all their events from the database, then interleaves them chronologically. This produces the final ordered event list that the human/robot renderers consume.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 3.2 steps 4-5, Section 3.3 (Event Model).\n\n## Approach\n\nCreate `src/core/timeline_collect.rs`:\n\n```rust\nuse rusqlite::Connection;\nuse crate::core::timeline::{TimelineEvent, TimelineEventType, EntityRef, ExpandedEntityRef};\n\npub fn collect_events(\n conn: &Connection,\n seed_entities: &[EntityRef],\n expanded_entities: &[ExpandedEntityRef],\n evidence_notes: &[TimelineEvent], // from seed phase\n limit: usize, // -n flag (default 100)\n) -> Result<Vec<TimelineEvent>> {\n let mut events: Vec<TimelineEvent> = Vec::new();\n\n let all_entities = /* combine seeds and expanded */;\n\n for entity in all_entities {\n let is_seed = /* check if entity is in seeds */;\n\n // Spec Section 3.2 step 4: COLLECT EVENTS\n // 1. Entity creation (created_at from issues/merge_requests)\n events.push(creation_event(conn, &entity, is_seed)?);\n\n // 2. State changes (resource_state_events)\n events.extend(state_events(conn, &entity, is_seed)?);\n\n // 3. Label changes (resource_label_events)\n events.extend(label_events(conn, &entity, is_seed)?);\n\n // 4. Milestone changes (resource_milestone_events)\n events.extend(milestone_events(conn, &entity, is_seed)?);\n\n // 5. Merge events (merged_at from merge_requests, for MRs only)\n if entity.entity_type == \"merge_request\" {\n events.extend(merge_event(conn, &entity, is_seed)?);\n }\n }\n\n // 6. 
Evidence notes from seed phase (spec: \"Evidence-bearing notes\")\n events.extend(evidence_notes.iter().cloned());\n\n // Spec Section 3.2 step 5: INTERLEAVE (chronological sort)\n events.sort();\n\n // Apply limit\n events.truncate(limit);\n\n Ok(events)\n}\n```\n\n### SQL Queries\n\n**Creation event** -> `TimelineEventType::Created`:\n```sql\n-- For issues:\nSELECT created_at, author_username, title, web_url FROM issues WHERE id = ?1\n-- For MRs:\nSELECT created_at, author_username, title, web_url FROM merge_requests WHERE id = ?1\n```\n\n**State events** -> `TimelineEventType::StateChanged { state }`:\nNote: spec Section 3.3 uses `StateChanged { state: String }` (just the target state).\n```sql\nSELECT state, actor_username, created_at\nFROM resource_state_events\nWHERE issue_id = ?1 OR merge_request_id = ?1\nORDER BY created_at ASC\n```\n\n**Label events** -> `TimelineEventType::LabelAdded { label }` / `LabelRemoved { label }`:\n```sql\nSELECT action, label_name, actor_username, created_at\nFROM resource_label_events\nWHERE issue_id = ?1 OR merge_request_id = ?1\nORDER BY created_at ASC\n```\n\n**Milestone events** -> `TimelineEventType::MilestoneSet { milestone }` / `MilestoneRemoved { milestone }`:\n```sql\nSELECT action, milestone_title, actor_username, created_at\nFROM resource_milestone_events\nWHERE issue_id = ?1 OR merge_request_id = ?1\nORDER BY created_at ASC\n```\n\n**Merge event** -> `TimelineEventType::Merged` (unit variant per spec):\n```sql\nSELECT rse.created_at, rse.actor_username\nFROM resource_state_events rse\nWHERE rse.merge_request_id = ?1 AND rse.state = 'merged'\n```\n\n### URL Construction\n\nEach event needs a `url` field (spec Section 3.3). 
Pattern:\n- Issues: `{gitlab_url}/{project_path}/-/issues/{iid}`\n- MRs: `{gitlab_url}/{project_path}/-/merge_requests/{iid}`\n- Evidence notes: `{issue_or_mr_url}#note_{note_id}` (when note_id is available)\n\nUse `web_url` from the entity's DB row when available; fall back to construction.\n\nRegister in `src/core/mod.rs`: `pub mod timeline_collect;`\n\n## Acceptance Criteria\n\n- [ ] Collects all 6 event types per spec Section 3.2 step 4: created, state_changed, label_added/removed, milestone_set/removed, merged, note_evidence\n- [ ] StateChanged uses `{ state: String }` per spec (not from/to)\n- [ ] Merged is a unit variant per spec (no inner fields)\n- [ ] Each event has `url: Option<String>` populated from entity web_url\n- [ ] Events marked with is_seed=true for seed entities, false for expanded\n- [ ] Chronological sort with stable tiebreak (timestamp, entity_id, event_type)\n- [ ] Limit applied AFTER sorting (last N events truncated, not random)\n- [ ] Evidence notes from seed phase included in final list\n- [ ] MR merge events only collected for merge_request entities\n- [ ] Empty entity set returns empty events list\n- [ ] Module registered in `src/core/mod.rs`\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/core/timeline_collect.rs` (NEW)\n- `src/core/mod.rs` (add `pub mod timeline_collect;`)\n\n## TDD Loop\n\nRED: Create tests:\n- `test_collect_creation_event` - entity produces Created event at created_at timestamp\n- `test_collect_state_events` - state changes produce StateChanged { state } events\n- `test_collect_label_events` - label add/remove produce LabelAdded/LabelRemoved events\n- `test_collect_chronological_sort` - events from different entities interleave correctly\n- `test_collect_respects_limit` - limit=5 returns at most 5 events\n- `test_collect_marks_seed_flag` - seed entities have is_seed=true\n- `test_collect_includes_url` - events have url populated from entity 
web_url\n\nTests need in-memory DB with migrations 001-011 and test data in issues, merge_requests, resource_*_events tables.\n\nGREEN: Implement the SQL queries and event assembly.\n\nVERIFY: `cargo test --lib -- timeline_collect`\n\n## Edge Cases\n\n- Entity with no events (just created): returns only the Created event\n- Entity with 1000+ events: collect all, then apply limit at the end\n- State event with NULL actor: actor field is None in TimelineEvent\n- Label/milestone events may have same timestamp: tiebreak by event_type discriminant\n- MR with state='merged' but no merge_commit_sha (legacy data): Merged variant is unit, no issue\n- URL construction: handle entities without web_url (old data) by setting url to None","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:08.703942Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:10:47.601301Z","compaction_level":0,"original_size":0,"labels":["gate-3","phase-b","query"],"dependencies":[{"issue_id":"bd-3as","depends_on_id":"bd-1ep","type":"blocks","created_at":"2026-02-02T21:33:37.618171Z","created_by":"tayloreernisse"},{"issue_id":"bd-3as","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:08.705605Z","created_by":"tayloreernisse"},{"issue_id":"bd-3as","depends_on_id":"bd-ypa","type":"blocks","created_at":"2026-02-02T21:33:37.575585Z","created_by":"tayloreernisse"}]}
{"id":"bd-3bo","title":"[CP1] gi count issues/discussions/notes commands","description":"Count entities in the database.\n\nCommands:\n- gi count issues → 'Issues: N'\n- gi count discussions --type=issue → 'Issue Discussions: N'\n- gi count notes --type=issue → 'Issue Notes: N (excluding M system)'\n\nFiles: src/cli/commands/count.ts\nDone when: Counts match expected values from GitLab","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T15:20:16.190875Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.156293Z","deleted_at":"2026-01-25T15:21:35.156290Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-3er","title":"OBSERV Epic: Phase 3 - Performance Metrics Collection","description":"StageTiming struct, custom MetricsLayer tracing subscriber layer, span-to-metrics extraction, robot JSON enrichment with meta.stages, human-readable timing summary.\n\nDepends on: Phase 2 (spans must exist to extract timing from)\nUnblocks: Phase 4 (sync history needs Vec<StageTiming> to store)\n\nFiles: src/core/metrics.rs (new), src/cli/commands/sync.rs, src/cli/commands/ingest.rs, src/main.rs\n\nAcceptance criteria (PRD Section 6.3):\n- lore --robot sync includes meta.run_id and meta.stages array\n- Each stage has name, elapsed_ms, items_processed\n- Top-level stages have sub_stages arrays\n- Interactive sync prints timing summary table\n- Zero-value fields omitted from JSON","status":"closed","priority":2,"issue_type":"epic","created_at":"2026-02-04T15:53:27.415566Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:32:56.743477Z","closed_at":"2026-02-04T17:32:56.743430Z","close_reason":"All Phase 3 tasks complete: StageTiming struct, MetricsLayer, span field recording, robot JSON enrichment with stages, and human-readable timing summary","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-3er","depends_on_id":"bd-2ni","type":"blocks","created_at":"2026-02-04T15:55:19.101775Z","created_by":"tayloreernisse"}]}
{"id":"bd-3eu","title":"Implement hybrid search with adaptive recall","description":"## Background\nHybrid search is the top-level search orchestrator that combines FTS5 lexical results with sqlite-vec semantic results via RRF ranking. It supports three modes (Lexical, Semantic, Hybrid) and implements adaptive recall (wider initial fetch when filters are applied) and graceful degradation (falls back to FTS when Ollama is unavailable). All modes use RRF for consistent --explain output.\n\n## Approach\nCreate `src/search/hybrid.rs` per PRD Section 5.3.\n\n**Key types:**\n```rust\n#[derive(Debug, Clone, Copy, PartialEq, Eq)]\npub enum SearchMode {\n Hybrid, // Vector + FTS with RRF\n Lexical, // FTS only\n Semantic, // Vector only\n}\n\nimpl SearchMode {\n pub fn from_str(s: &str) -> Option<Self> {\n match s.to_lowercase().as_str() {\n \"hybrid\" => Some(Self::Hybrid),\n \"lexical\" | \"fts\" => Some(Self::Lexical),\n \"semantic\" | \"vector\" => Some(Self::Semantic),\n _ => None,\n }\n }\n\n pub fn as_str(&self) -> &'static str {\n match self {\n Self::Hybrid => \"hybrid\",\n Self::Lexical => \"lexical\",\n Self::Semantic => \"semantic\",\n }\n }\n}\n\npub struct HybridResult {\n pub document_id: i64,\n pub score: f64, // Normalized RRF score (0-1)\n pub vector_rank: Option<usize>,\n pub fts_rank: Option<usize>,\n pub rrf_score: f64, // Raw RRF score\n}\n```\n\n**Core function (ASYNC, PRD-exact signature):**\n```rust\npub async fn search_hybrid(\n conn: &Connection,\n client: Option<&OllamaClient>, // None if Ollama unavailable\n ollama_base_url: Option<&str>, // For actionable error messages\n query: &str,\n mode: SearchMode,\n filters: &SearchFilters,\n fts_mode: FtsQueryMode,\n) -> Result<(Vec<HybridResult>, Vec<String>)>\n```\n\n**IMPORTANT — client is `Option<&OllamaClient>`:** This enables graceful degradation. When Ollama is unavailable, the caller passes `None` and hybrid mode falls back to FTS-only with a warning. 
The `ollama_base_url` is separate so error messages can include it even when client is None.\n\n**Adaptive recall constants (PRD Section 5.3):**\n```rust\nconst BASE_RECALL_MIN: usize = 50;\nconst FILTERED_RECALL_MIN: usize = 200;\nconst RECALL_CAP: usize = 1500;\n```\n\n**Recall formula:**\n```rust\nlet requested = filters.clamp_limit();\nlet top_k = if filters.has_any_filter() {\n (requested * 50).max(FILTERED_RECALL_MIN).min(RECALL_CAP)\n} else {\n (requested * 10).max(BASE_RECALL_MIN).min(RECALL_CAP)\n};\n```\n\n**Mode behavior:**\n- **Lexical:** FTS only -> rank_rrf with empty vector list (single-list RRF)\n- **Semantic:** Vector only -> requires client (error if None) -> rank_rrf with empty FTS list\n- **Hybrid:** Both FTS + vector -> rank_rrf with both lists\n- **Hybrid with client=None:** Graceful degradation to Lexical with warning, NOT error\n\n**Graceful degradation logic:**\n```rust\nSearchMode::Hybrid => {\n let fts_results = search_fts(conn, query, top_k, fts_mode)?;\n let fts_tuples: Vec<_> = fts_results.iter().map(|r| (r.document_id, r.rank)).collect();\n\n match client {\n Some(client) => {\n let query_embedding = client.embed_batch(vec![query.to_string()]).await?;\n let embedding = query_embedding.into_iter().next().unwrap();\n let vec_results = search_vector(conn, &embedding, top_k)?;\n let vec_tuples: Vec<_> = vec_results.iter().map(|r| (r.document_id, r.distance)).collect();\n let ranked = rank_rrf(&vec_tuples, &fts_tuples);\n // ... map to HybridResult\n Ok((results, warnings))\n }\n None => {\n warnings.push(\"Ollama unavailable, falling back to lexical search\".into());\n let ranked = rank_rrf(&[], &fts_tuples);\n // ... 
map to HybridResult\n Ok((results, warnings))\n }\n }\n}\n```\n\n## Acceptance Criteria\n- [ ] Function is `async` (per PRD — Ollama client methods are async)\n- [ ] Signature takes `client: Option<&OllamaClient>` (not required)\n- [ ] Signature takes `ollama_base_url: Option<&str>` for actionable error messages\n- [ ] Returns `(Vec<HybridResult>, Vec<String>)` — results + warnings\n- [ ] Lexical mode: FTS-only results ranked via RRF (single list)\n- [ ] Semantic mode: vector-only results ranked via RRF; error if client is None\n- [ ] Hybrid mode: both FTS + vector results merged via RRF\n- [ ] Graceful degradation: client=None in Hybrid falls back to FTS with warning (not error)\n- [ ] Adaptive recall: unfiltered max(50, limit*10), filtered max(200, limit*50), capped 1500\n- [ ] All modes produce consistent --explain output (vector_rank, fts_rank, rrf_score)\n- [ ] SearchMode::from_str accepts aliases: \"fts\" for Lexical, \"vector\" for Semantic\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/search/hybrid.rs` — new file\n- `src/search/mod.rs` — add `pub use hybrid::{search_hybrid, HybridResult, SearchMode};`\n\n## TDD Loop\nRED: Tests (some integration, some unit):\n- `test_lexical_mode` — FTS results only\n- `test_semantic_mode` — vector results only\n- `test_hybrid_mode` — both lists merged\n- `test_graceful_degradation` — None client falls back to FTS with warning in warnings vec\n- `test_adaptive_recall_unfiltered` — recall = max(50, limit*10)\n- `test_adaptive_recall_filtered` — recall = max(200, limit*50)\n- `test_recall_cap` — never exceeds 1500\n- `test_search_mode_from_str` — \"hybrid\", \"lexical\", \"fts\", \"semantic\", \"vector\", invalid\nGREEN: Implement search_hybrid\nVERIFY: `cargo test hybrid`\n\n## Edge Cases\n- Both FTS and vector return zero results: empty output (not error)\n- FTS returns results but vector returns empty: RRF still works (single-list)\n- Very high limit (100) with filters: recall = min(5000, 1500) = 1500\n- Semantic mode 
with client=None: error (OllamaUnavailable), not degradation\n- Semantic mode with 0% coverage: return LoreError::EmbeddingsNotBuilt","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:50.343002Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:56:16.631748Z","closed_at":"2026-01-30T17:56:16.631682Z","close_reason":"Implemented hybrid search with 3 modes (lexical/semantic/hybrid), graceful degradation when Ollama unavailable, adaptive recall (50-1500), RRF fusion. 6 tests pass.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3eu","depends_on_id":"bd-1k1","type":"blocks","created_at":"2026-01-30T15:29:24.913458Z","created_by":"tayloreernisse"},{"issue_id":"bd-3eu","depends_on_id":"bd-335","type":"blocks","created_at":"2026-01-30T15:29:25.025502Z","created_by":"tayloreernisse"},{"issue_id":"bd-3eu","depends_on_id":"bd-3ez","type":"blocks","created_at":"2026-01-30T15:29:24.987809Z","created_by":"tayloreernisse"},{"issue_id":"bd-3eu","depends_on_id":"bd-bjo","type":"blocks","created_at":"2026-01-30T15:29:24.950761Z","created_by":"tayloreernisse"}]}
{"id":"bd-3ez","title":"Implement RRF ranking","description":"## Background\nReciprocal Rank Fusion (RRF) combines results from multiple retrieval systems (FTS5 lexical + sqlite-vec semantic) into a single ranked list without requiring score normalization. Documents appearing in both lists rank higher than single-list documents. This is the core ranking algorithm for hybrid search in Gate B.\n\n## Approach\nCreate `src/search/rrf.rs` per PRD Section 5.2.\n\n```rust\nuse std::collections::HashMap;\n\nconst RRF_K: f64 = 60.0;\n\npub struct RrfResult {\n pub document_id: i64,\n pub rrf_score: f64, // Raw RRF score\n pub normalized_score: f64, // Normalized to 0-1 (rrf_score / max)\n pub vector_rank: Option<usize>, // 1-indexed rank in vector list\n pub fts_rank: Option<usize>, // 1-indexed rank in FTS list\n}\n\n/// Input: tuples of (document_id, score/distance) — already sorted by retriever.\n/// Ranks are 1-indexed (first result = rank 1).\n/// Score = sum of 1/(k + rank) for each list containing the document.\npub fn rank_rrf(\n vector_results: &[(i64, f64)], // (doc_id, distance)\n fts_results: &[(i64, f64)], // (doc_id, bm25_score)\n) -> Vec<RrfResult>\n```\n\n**Algorithm (per PRD):**\n1. Build HashMap<doc_id, (rrf_score, vector_rank, fts_rank)>\n2. For each vector result at position i: score += 1/(K + (i+1)), record vector_rank = i+1 (**1-indexed**)\n3. For each FTS result at position i: score += 1/(K + (i+1)), record fts_rank = i+1 (**1-indexed**)\n4. Sort descending by rrf_score\n5. 
Normalize: each result.normalized_score = result.rrf_score / max_score (best = 1.0)\n\n**Key PRD details:**\n- Ranks are **1-indexed** (rank 1 = best, not rank 0)\n- Input is `&[(i64, f64)]` tuples, NOT custom structs\n- Output has both `rrf_score` (raw) and `normalized_score` (0-1)\n\n## Acceptance Criteria\n- [ ] Documents in both lists score higher than single-list documents\n- [ ] Single-list documents are included (not dropped)\n- [ ] Ranks are 1-indexed (first element = rank 1)\n- [ ] Raw RRF score available in rrf_score field\n- [ ] Normalized score: best = 1.0, all in [0, 1]\n- [ ] Results sorted descending by rrf_score\n- [ ] vector_rank and fts_rank tracked per result for --explain\n- [ ] Empty input lists handled (return empty)\n- [ ] One empty list + one non-empty returns results from non-empty list\n\n## Files\n- `src/search/rrf.rs` — new file\n- `src/search/mod.rs` — add `mod rrf; pub use rrf::{rank_rrf, RrfResult};`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_dual_list_ranks_higher` — doc in both lists scores > doc in one list\n- `test_single_list_included` — FTS-only and vector-only docs appear\n- `test_normalization` — best score is 1.0, all in [0, 1]\n- `test_empty_inputs` — empty returns empty\n- `test_ranks_are_1_indexed` — verify vector_rank/fts_rank start at 1\n- `test_raw_and_normalized_scores` — both fields populated correctly\nGREEN: Implement rank_rrf()\nVERIFY: `cargo test rrf`\n\n## Edge Cases\n- Duplicate document_id within same list: shouldn't happen, use first occurrence\n- Single result in one list, zero in other: normalized_score = 1.0\n- Very large input lists: HashMap handles efficiently","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:50.309012Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:53:04.128560Z","closed_at":"2026-01-30T16:53:04.128498Z","close_reason":"Completed: RRF ranking with 1-indexed ranks, 
raw+normalized scores, vector_rank/fts_rank provenance, 7 tests pass","compaction_level":0,"original_size":0}
{"id":"bd-3hy","title":"[CP1] Test fixtures for mocked GitLab responses","description":"Create mock response files for integration tests using wiremock.\n\n## Fixtures to Create\n\n### tests/fixtures/gitlab_issue.json\nSingle issue with labels:\n- id, iid, project_id, title, description, state\n- author object\n- labels array (string names)\n- timestamps\n- web_url\n\n### tests/fixtures/gitlab_issues_page.json\nArray of issues simulating paginated response:\n- 3-5 issues with varying states\n- Mix of labels\n\n### tests/fixtures/gitlab_discussion.json\nSingle discussion:\n- id (string)\n- individual_note: false\n- notes array with 2+ notes\n- Include one system note\n\n### tests/fixtures/gitlab_discussions_page.json\nArray of discussions:\n- Mix of individual_note true/false\n- Include resolvable/resolved examples\n\n## Edge Cases to Cover\n- Issue with no labels (empty array)\n- Issue with labels_details (ignored in CP1)\n- Discussion with individual_note=true (single note)\n- System notes with system=true\n- Resolvable notes\n\nFiles: tests/fixtures/gitlab_issue.json, gitlab_issues_page.json, gitlab_discussion.json, gitlab_discussions_page.json\nDone when: wiremock handlers can use fixtures for deterministic tests","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:01.206436Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.991367Z","deleted_at":"2026-01-25T17:02:01.991362Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-3ia","title":"Fetch closes_issues API and populate entity_references","description":"## Background\nGET /projects/:id/merge_requests/:iid/closes_issues returns issues that will close when MR merges. This is the most reliable source for MR→issue relationships. Uses the generic dependent fetch queue (job_type = 'mr_closes_issues').\n\n## Approach\n\n**1. Add API endpoint to GitLab client (src/gitlab/client.rs):**\n```rust\n/// Fetch issues that will be closed when this MR merges.\npub async fn fetch_mr_closes_issues(\n &self, \n project_id: i64, \n iid: i64\n) -> Result<Vec<GitLabIssueRef>>\n```\n\nNew type in src/gitlab/types.rs:\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabIssueRef {\n pub id: i64,\n pub iid: i64,\n pub project_id: i64,\n pub title: String,\n pub state: String,\n pub web_url: String,\n}\n```\n\nURL: `GET /api/v4/projects/{project_id}/merge_requests/{iid}/closes_issues?per_page=100`\n\n**2. Enqueue jobs during MR ingestion:**\nIn orchestrator.rs, after MR upsert:\n```rust\nenqueue_job(conn, project_id, \"merge_request\", iid, local_id, \"mr_closes_issues\", None)?;\n```\n\nThis is always enqueued (not gated by a config flag) because cross-reference data is fundamental to all temporal queries.\n\n**3. 
Process jobs in drain step:**\nIn the drain dispatcher (from bd-1ep), handle \"mr_closes_issues\" job_type:\n```rust\nlet closes_issues = client.fetch_mr_closes_issues(gitlab_project_id, job.entity_iid).await?;\nfor issue_ref in &closes_issues {\n let target_id = resolve_issue_local_id(conn, project_id, issue_ref.iid);\n insert_entity_reference(conn, EntityReference {\n source_entity_type: \"merge_request\",\n source_entity_id: job.entity_local_id,\n target_entity_type: \"issue\",\n target_entity_id: target_id, // Some(id) or None for cross-project\n target_project_path: if target_id.is_none() { Some(resolve_project_path(issue_ref.project_id)) } else { None },\n target_entity_iid: if target_id.is_none() { Some(issue_ref.iid) } else { None },\n reference_type: \"closes\",\n source_method: \"api_closes_issues\",\n created_at: None,\n })?;\n}\n```\n\n**4. Insert helper for entity_references:**\nAdd to src/core/references.rs:\n```rust\npub fn insert_entity_reference(conn: &Connection, ref_: &EntityReference) -> Result<bool>\n// INSERT OR IGNORE, returns true if inserted\n```\n\n## Acceptance Criteria\n- [ ] closes_issues API called for all MRs during sync\n- [ ] Entity references created with reference_type='closes', source_method='api_closes_issues'\n- [ ] Source = MR, target = issue (correct directionality)\n- [ ] Cross-project issues stored as unresolved (target_entity_id=NULL, target_project_path set)\n- [ ] Idempotent: re-sync doesn't create duplicate references\n- [ ] 404 on deleted MR handled gracefully (fail_job)\n\n## Files\n- src/gitlab/client.rs (add fetch_mr_closes_issues)\n- src/gitlab/types.rs (add GitLabIssueRef)\n- src/core/references.rs (add insert_entity_reference helper)\n- src/ingestion/orchestrator.rs (enqueue mr_closes_issues jobs)\n- src/core/drain.rs or sync.rs (handle mr_closes_issues in drain dispatcher)\n\n## TDD Loop\nRED: tests/references_tests.rs:\n- `test_closes_issues_creates_references` - mock closes_issues response, verify 
entity_references rows\n- `test_closes_issues_cross_project_unresolved` - issue from different project stored as unresolved\n- `test_closes_issues_idempotent` - process same job twice, verify no duplicates\n\ntests/gitlab_types_tests.rs:\n- `test_deserialize_issue_ref` - verify GitLabIssueRef deserialization\n\nGREEN: Implement API endpoint, enqueue hook, drain handler, insert helper\n\nVERIFY: `cargo test references -- --nocapture && cargo test gitlab_types -- --nocapture`\n\n## Edge Cases\n- closes_issues API returns issues from OTHER projects (cross-project closing) — must check if issue is in local DB\n- Empty response (MR doesn't close any issues) — no refs created, job still completed\n- MR may close the same issue via description (\"Closes #123\") and via commits — API deduplicates, but our INSERT OR IGNORE handles it too\n- The closes_issues API may return stale data for draft MRs (issues that *would* close but haven't yet)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:32:33.561956Z","created_by":"tayloreernisse","updated_at":"2026-02-04T20:15:54.763773Z","closed_at":"2026-02-04T20:15:54.763643Z","compaction_level":0,"original_size":0,"labels":["api","gate-2","phase-b"],"dependencies":[{"issue_id":"bd-3ia","depends_on_id":"bd-1se","type":"parent-child","created_at":"2026-02-02T21:32:33.563366Z","created_by":"tayloreernisse"},{"issue_id":"bd-3ia","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T22:41:50.613776Z","created_by":"tayloreernisse"},{"issue_id":"bd-3ia","depends_on_id":"bd-tir","type":"blocks","created_at":"2026-02-02T21:32:42.860463Z","created_by":"tayloreernisse"}]}
{"id":"bd-3ir","title":"Add database migration 006_merge_requests.sql","description":"## Background\nFoundation for all CP2 MR features. This migration defines the schema that all other MR components depend on. Must complete BEFORE any other CP2 work can proceed.\n\n## Approach\nCreate migration file that adds:\n1. `merge_requests` table with all CP2 fields\n2. `mr_labels`, `mr_assignees`, `mr_reviewers` junction tables\n3. Indexes on discussions for MR queries\n4. DiffNote position columns on notes table\n\n## Files\n- `migrations/006_merge_requests.sql` - New migration file\n- `src/core/db.rs` - Update MIGRATIONS const to include version 6\n\n## Acceptance Criteria\n- [ ] Migration file exists at `migrations/006_merge_requests.sql`\n- [ ] `merge_requests` table has columns: id, gitlab_id, project_id, iid, title, description, state, draft, author_username, source_branch, target_branch, head_sha, references_short, references_full, detailed_merge_status, merge_user_username, created_at, updated_at, merged_at, closed_at, last_seen_at, discussions_synced_for_updated_at, discussions_sync_last_attempt_at, discussions_sync_attempts, discussions_sync_last_error, web_url, raw_payload_id\n- [ ] `mr_labels` junction table exists with (merge_request_id, label_id) PK\n- [ ] `mr_assignees` junction table exists with (merge_request_id, username) PK\n- [ ] `mr_reviewers` junction table exists with (merge_request_id, username) PK\n- [ ] `idx_discussions_mr_id` and `idx_discussions_mr_resolved` indexes exist\n- [ ] `notes` table has new columns: position_type, position_line_range_start, position_line_range_end, position_base_sha, position_start_sha, position_head_sha\n- [ ] `gi doctor` runs without migration errors\n- [ ] `cargo test` passes\n\n## TDD Loop\nRED: Cannot open DB with version 6 schema\nGREEN: Add migration file with full SQL\nVERIFY: `cargo run -- doctor` shows healthy DB\n\n## SQL Reference (from PRD)\n```sql\n-- Merge requests table\nCREATE TABLE merge_requests (\n 
id INTEGER PRIMARY KEY,\n gitlab_id INTEGER UNIQUE NOT NULL,\n project_id INTEGER NOT NULL REFERENCES projects(id),\n iid INTEGER NOT NULL,\n title TEXT,\n description TEXT,\n state TEXT, -- opened | merged | closed | locked\n draft INTEGER NOT NULL DEFAULT 0, -- SQLite boolean\n author_username TEXT,\n source_branch TEXT,\n target_branch TEXT,\n head_sha TEXT,\n references_short TEXT,\n references_full TEXT,\n detailed_merge_status TEXT,\n merge_user_username TEXT,\n created_at INTEGER, -- ms epoch UTC\n updated_at INTEGER,\n merged_at INTEGER,\n closed_at INTEGER,\n last_seen_at INTEGER NOT NULL,\n discussions_synced_for_updated_at INTEGER,\n discussions_sync_last_attempt_at INTEGER,\n discussions_sync_attempts INTEGER DEFAULT 0,\n discussions_sync_last_error TEXT,\n web_url TEXT,\n raw_payload_id INTEGER REFERENCES raw_payloads(id)\n);\nCREATE INDEX idx_mrs_project_updated ON merge_requests(project_id, updated_at);\nCREATE UNIQUE INDEX uq_mrs_project_iid ON merge_requests(project_id, iid);\n-- ... 
(see PRD for full index list)\n\n-- Junction tables\nCREATE TABLE mr_labels (\n merge_request_id INTEGER REFERENCES merge_requests(id) ON DELETE CASCADE,\n label_id INTEGER REFERENCES labels(id) ON DELETE CASCADE,\n PRIMARY KEY(merge_request_id, label_id)\n);\n\nCREATE TABLE mr_assignees (\n merge_request_id INTEGER REFERENCES merge_requests(id) ON DELETE CASCADE,\n username TEXT NOT NULL,\n PRIMARY KEY(merge_request_id, username)\n);\n\nCREATE TABLE mr_reviewers (\n merge_request_id INTEGER REFERENCES merge_requests(id) ON DELETE CASCADE,\n username TEXT NOT NULL,\n PRIMARY KEY(merge_request_id, username)\n);\n\n-- DiffNote position columns (ALTER TABLE)\nALTER TABLE notes ADD COLUMN position_type TEXT;\nALTER TABLE notes ADD COLUMN position_line_range_start INTEGER;\nALTER TABLE notes ADD COLUMN position_line_range_end INTEGER;\nALTER TABLE notes ADD COLUMN position_base_sha TEXT;\nALTER TABLE notes ADD COLUMN position_start_sha TEXT;\nALTER TABLE notes ADD COLUMN position_head_sha TEXT;\n\nINSERT INTO schema_version (version, applied_at, description)\nVALUES (6, strftime('%s', 'now') * 1000, 'Merge requests, MR labels, assignees, reviewers');\n```\n\n## Edge Cases\n- SQLite does not support ADD CONSTRAINT - FK defined as nullable in CP1\n- `locked` state is transitional (merge-in-progress) - store as first-class\n- discussions_synced_for_updated_at prevents redundant discussion refetch","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:40.101470Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:06:43.899079Z","closed_at":"2026-01-27T00:06:43.898875Z","close_reason":"Migration 006_merge_requests.sql created and verified. Schema v6 applied successfully with all tables, indexes, and position columns.","compaction_level":0,"original_size":0}
{"id":"bd-3j6","title":"Add transform_mr_discussion and transform_notes_with_diff_position","description":"## Background\nExtends discussion transformer for MR context. MR discussions can contain DiffNotes with file position metadata. This is critical for code review context in CP3 document generation.\n\n## Approach\nAdd two new functions to existing `src/gitlab/transformers/discussion.rs`:\n1. `transform_mr_discussion()` - Transform discussion with MR reference\n2. `transform_notes_with_diff_position()` - Extract DiffNote position metadata\n\nCP1 already has the polymorphic `NormalizedDiscussion` with `NoteableRef` enum - reuse that pattern.\n\n## Files\n- `src/gitlab/transformers/discussion.rs` - Add new functions\n- `tests/diffnote_tests.rs` - DiffNote position extraction tests\n- `tests/mr_discussion_tests.rs` - MR discussion transform tests\n\n## Acceptance Criteria\n- [ ] `transform_mr_discussion()` returns `NormalizedDiscussion` with `merge_request_id: Some(local_mr_id)`\n- [ ] `transform_notes_with_diff_position()` returns `Result<Vec<NormalizedNote>, String>`\n- [ ] DiffNote position fields extracted: `position_old_path`, `position_new_path`, `position_old_line`, `position_new_line`\n- [ ] Extended position fields extracted: `position_type`, `position_line_range_start`, `position_line_range_end`\n- [ ] SHA triplet extracted: `position_base_sha`, `position_start_sha`, `position_head_sha`\n- [ ] Strict timestamp parsing - returns `Err` on invalid timestamps (no `unwrap_or(0)`)\n- [ ] `cargo test diffnote` passes\n- [ ] `cargo test mr_discussion` passes\n\n## TDD Loop\nRED: `cargo test diffnote_position` -> test fails\nGREEN: Add position extraction logic\nVERIFY: `cargo test diffnote`\n\n## Function Signatures\n```rust\n/// Transform GitLab discussion for MR context.\n/// Reuses existing transform_discussion logic, just with MR reference.\npub fn transform_mr_discussion(\n gitlab_discussion: &GitLabDiscussion,\n local_project_id: i64,\n local_mr_id: i64,\n) 
-> NormalizedDiscussion {\n // Use existing transform_discussion with NoteableRef::MergeRequest(local_mr_id)\n transform_discussion(\n gitlab_discussion,\n local_project_id,\n NoteableRef::MergeRequest(local_mr_id),\n )\n}\n\n/// Transform notes with DiffNote position extraction.\n/// Returns Result to enforce strict timestamp parsing.\npub fn transform_notes_with_diff_position(\n gitlab_discussion: &GitLabDiscussion,\n local_project_id: i64,\n) -> Result<Vec<NormalizedNote>, String>\n```\n\n## DiffNote Position Extraction\n```rust\n// Extract position metadata if present\nlet (old_path, new_path, old_line, new_line, position_type, lr_start, lr_end, base_sha, start_sha, head_sha) = note\n .position\n .as_ref()\n .map(|pos| (\n pos.old_path.clone(),\n pos.new_path.clone(),\n pos.old_line,\n pos.new_line,\n pos.position_type.clone(), // \"text\" | \"image\" | \"file\"\n pos.line_range.as_ref().map(|r| r.start_line),\n pos.line_range.as_ref().map(|r| r.end_line),\n pos.base_sha.clone(),\n pos.start_sha.clone(),\n pos.head_sha.clone(),\n ))\n .unwrap_or((None, None, None, None, None, None, None, None, None, None));\n```\n\n## Strict Timestamp Parsing\n```rust\n// CRITICAL: Return error on invalid timestamps, never zero\nlet created_at = iso_to_ms(&note.created_at)\n .ok_or_else(|| format!(\n \"Invalid note.created_at for note {}: {}\",\n note.id, note.created_at\n ))?;\n```\n\n## NormalizedNote Fields for DiffNotes\n```rust\nNormalizedNote {\n // ... 
existing fields ...\n // DiffNote position metadata\n position_old_path: old_path,\n position_new_path: new_path,\n position_old_line: old_line,\n position_new_line: new_line,\n // Extended position\n position_type,\n position_line_range_start: lr_start,\n position_line_range_end: lr_end,\n // SHA triplet\n position_base_sha: base_sha,\n position_start_sha: start_sha,\n position_head_sha: head_sha,\n}\n```\n\n## Edge Cases\n- Notes without position should have all position fields as None\n- Invalid timestamp should fail the entire discussion (no partial results)\n- File renames: `old_path != new_path` indicates a renamed file\n- Multi-line comments: `line_range` present means the comment spans multiple lines (e.g. 45-48)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:41.208380Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:20:13.473091Z","closed_at":"2026-01-27T00:20:13.473031Z","close_reason":"Implemented transform_mr_discussion() and transform_notes_with_diff_position() with full DiffNote position extraction:\n- Extended NormalizedNote with 10 DiffNote position fields (path, line, type, line_range, SHA triplet)\n- Added strict timestamp parsing that returns Err on invalid timestamps\n- Created 13 diffnote_position_tests covering all extraction paths and error cases\n- Created 6 mr_discussion_tests verifying MR reference handling\n- All 161 tests passing","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3j6","depends_on_id":"bd-3ir","type":"blocks","created_at":"2026-01-26T22:08:54.207801Z","created_by":"tayloreernisse"},{"issue_id":"bd-3j6","depends_on_id":"bd-5ta","type":"blocks","created_at":"2026-01-26T22:08:54.244201Z","created_by":"tayloreernisse"}]}
{"id":"bd-3js","title":"Implement MR CLI commands (list, show, count)","description":"## Background\nCLI commands for viewing and filtering merge requests. Includes list, show, and count commands with MR-specific filters.\n\n## Approach\nUpdate existing CLI command files:\n1. `list.rs` - Add MR listing with filters\n2. `show.rs` - Add MR detail view with discussions\n3. `count.rs` - Add MR counting with state breakdown\n\n## Files\n- `src/cli/commands/list.rs` - Add MR subcommand\n- `src/cli/commands/show.rs` - Add MR detail view\n- `src/cli/commands/count.rs` - Add MR counting\n\n## Acceptance Criteria\n- [ ] `gi list mrs` shows MR table with iid, title, state, author, branches\n- [ ] `gi list mrs --state=merged` filters by state\n- [ ] `gi list mrs --state=locked` filters locally (not server-side)\n- [ ] `gi list mrs --draft` shows only draft MRs\n- [ ] `gi list mrs --no-draft` excludes draft MRs\n- [ ] `gi list mrs --reviewer=username` filters by reviewer\n- [ ] `gi list mrs --target-branch=main` filters by target branch\n- [ ] `gi list mrs --source-branch=feature/x` filters by source branch\n- [ ] Draft MRs show `[DRAFT]` prefix in title\n- [ ] `gi show mr <iid>` displays full detail including discussions\n- [ ] DiffNote shows file context: `[src/file.ts:45]`\n- [ ] Multi-line DiffNote shows: `[src/file.ts:45-48]`\n- [ ] `gi show mr` shows `detailed_merge_status`\n- [ ] `gi count mrs` shows total with state breakdown\n- [ ] `gi sync-status` shows MR cursor positions\n- [ ] `cargo test cli_commands` passes\n\n## TDD Loop\nRED: `cargo test list_mrs` -> command not found\nGREEN: Add MR subcommand\nVERIFY: `gi list mrs --help`\n\n## gi list mrs Output\n```\nMerge Requests (showing 20 of 1,234)\n\n !847 Refactor auth to use JWT tokens merged @johndoe main <- feature/jwt 3 days ago\n !846 Fix memory leak in websocket handler opened @janedoe main <- fix/websocket 5 days ago\n !845 [DRAFT] Add dark mode CSS variables opened @bobsmith main <- ui/dark-mode 1 week 
ago\n```\n\n## SQL for MR Listing\n```sql\nSELECT \n m.iid, m.title, m.state, m.draft, m.author_username,\n m.target_branch, m.source_branch, m.updated_at\nFROM merge_requests m\nWHERE m.project_id = ?\n AND (? IS NULL OR m.state = ?) -- state filter\n AND (? IS NULL OR m.draft = ?) -- draft filter\n AND (? IS NULL OR m.author_username = ?) -- author filter\n AND (? IS NULL OR m.target_branch = ?) -- target-branch filter\n AND (? IS NULL OR m.source_branch = ?) -- source-branch filter\n AND (? IS NULL OR EXISTS ( -- reviewer filter\n SELECT 1 FROM mr_reviewers r \n WHERE r.merge_request_id = m.id AND r.username = ?\n ))\nORDER BY m.updated_at DESC\nLIMIT ?\n```\n\n## gi show mr Output\n```\nMerge Request !847: Refactor auth to use JWT tokens\n================================================================================\n\nProject: group/project-one\nState: merged\nDraft: No\nAuthor: @johndoe\nAssignees: @janedoe, @bobsmith\nReviewers: @alice, @charlie\nSource: feature/jwt\nTarget: main\nMerge Status: mergeable\nMerged By: @alice\nMerged At: 2024-03-20 14:30:00\nLabels: enhancement, auth, reviewed\n\nDescription:\n Moving away from session cookies to JWT-based authentication...\n\nDiscussions (8):\n\n @janedoe (2024-03-16) [src/auth/jwt.ts:45]:\n Should we use a separate signing key for refresh tokens?\n\n @johndoe (2024-03-16):\n Good point. I'll add a separate key with rotation support.\n\n @alice (2024-03-18) [RESOLVED]:\n Looks good! 
Just one nit about the token expiry constant.\n```\n\n## DiffNote File Context Display\n```rust\n// Build file context string\nlet file_context = match (note.position_new_path, note.position_new_line, note.position_line_range_end) {\n (Some(path), Some(line), Some(end_line)) if line != end_line => {\n format!(\"[{}:{}-{}]\", path, line, end_line)\n }\n (Some(path), Some(line), _) => {\n format!(\"[{}:{}]\", path, line)\n }\n _ => String::new(),\n};\n```\n\n## gi count mrs Output\n```\nMerge Requests: 1,234\n opened: 89\n merged: 1,045\n closed: 100\n```\n\n## Filter Arguments (clap)\n```rust\n#[derive(Parser)]\nstruct ListMrsArgs {\n #[arg(long)]\n state: Option<String>, // opened|merged|closed|locked|all\n #[arg(long)]\n draft: bool,\n #[arg(long)]\n no_draft: bool,\n #[arg(long)]\n author: Option<String>,\n #[arg(long)]\n assignee: Option<String>,\n #[arg(long)]\n reviewer: Option<String>,\n #[arg(long)]\n target_branch: Option<String>,\n #[arg(long)]\n source_branch: Option<String>,\n #[arg(long)]\n label: Vec<String>,\n #[arg(long)]\n project: Option<String>,\n #[arg(long, default_value = \"20\")]\n limit: u32,\n}\n```\n\n## Edge Cases\n- `--state=locked` must filter locally (GitLab API doesn't support it)\n- Ambiguous MR iid across projects: prompt for `--project`\n- Empty discussions: show \"No discussions\" message\n- Multi-line DiffNotes: show line range in context","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:43.354939Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:37:31.792569Z","closed_at":"2026-01-27T00:37:31.792504Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3js","depends_on_id":"bd-20h","type":"blocks","created_at":"2026-01-26T22:08:55.209249Z","created_by":"tayloreernisse"},{"issue_id":"bd-3js","depends_on_id":"bd-ser","type":"blocks","created_at":"2026-01-26T22:08:55.117728Z","created_by":"tayloreernisse"}]}
{"id":"bd-3kj","title":"[CP0] gi version, backup, reset, sync-status commands","description":"## Background\n\nThese are the remaining utility commands for CP0. version is trivial. backup creates safety copies before destructive operations. reset provides clean-slate capability. sync-status is a stub for CP0 that will be implemented in CP1.\n\nReference: docs/prd/checkpoint-0.md sections \"gi version\", \"gi backup\", \"gi reset\", \"gi sync-status\"\n\n## Approach\n\n**src/cli/commands/version.ts:**\n```typescript\nimport { Command } from 'commander';\n// JSON modules expose only a default export, so import the whole object\nimport pkg from '../../../package.json' with { type: 'json' };\n\nexport const versionCommand = new Command('version')\n .description('Show version information')\n .action(() => {\n console.log(`gi version ${pkg.version}`);\n });\n```\n\n**src/cli/commands/backup.ts:**\n```typescript\nimport { Command } from 'commander';\nimport { copyFileSync, mkdirSync } from 'node:fs';\nimport { loadConfig } from '../../core/config';\nimport { getDbPath, getBackupDir } from '../../core/paths';\n\nexport const backupCommand = new Command('backup')\n .description('Create timestamped database backup')\n .action(async (options, command) => {\n const globalOpts = command.optsWithGlobals();\n const config = loadConfig(globalOpts.config);\n \n const dbPath = getDbPath(config.storage?.dbPath);\n const backupDir = getBackupDir(config.storage?.backupDir);\n \n mkdirSync(backupDir, { recursive: true });\n \n // Format: data-2026-01-24T10-30-00.db (colons replaced for Windows compat)\n const timestamp = new Date().toISOString().replace(/:/g, '-').replace(/\\..*/, '');\n const backupPath = `${backupDir}/data-${timestamp}.db`;\n \n copyFileSync(dbPath, backupPath);\n console.log(`Created backup: ${backupPath}`);\n });\n```\n\n**src/cli/commands/reset.ts:**\n```typescript\nimport { Command } from 'commander';\nimport { unlinkSync, existsSync } from 'node:fs';\nimport { createInterface } from 'node:readline';\nimport { 
loadConfig } from '../../core/config';\nimport { getDbPath } from '../../core/paths';\n\nexport const resetCommand = new Command('reset')\n .description('Delete database and reset all state')\n .option('--confirm', 'Skip confirmation prompt')\n .action(async (options, command) => {\n const globalOpts = command.optsWithGlobals();\n const config = loadConfig(globalOpts.config);\n const dbPath = getDbPath(config.storage?.dbPath);\n \n if (!existsSync(dbPath)) {\n console.log('No database to reset.');\n return;\n }\n \n if (!options.confirm) {\n console.log(`This will delete:\\n - Database: ${dbPath}\\n - All sync cursors\\n - All cached data\\n`);\n // Prompt for 'yes' confirmation\n // If not 'yes', exit 2\n }\n \n unlinkSync(dbPath);\n // Also delete WAL and SHM files if they exist\n if (existsSync(`${dbPath}-wal`)) unlinkSync(`${dbPath}-wal`);\n if (existsSync(`${dbPath}-shm`)) unlinkSync(`${dbPath}-shm`);\n \n console.log(\"Database reset. Run 'gi sync' to repopulate.\");\n });\n```\n\n**src/cli/commands/sync-status.ts:**\n```typescript\n// CP0 stub - full implementation in CP1\nimport { Command } from 'commander';\n\nexport const syncStatusCommand = new Command('sync-status')\n .description('Show sync state')\n .action(() => {\n console.log(\"No sync runs yet. 
Run 'gi sync' to start.\");\n });\n```\n\n## Acceptance Criteria\n\n- [ ] `gi version` outputs \"gi version X.Y.Z\"\n- [ ] `gi backup` creates timestamped copy of database\n- [ ] Backup filename is Windows-compatible (no colons)\n- [ ] Backup directory created if missing\n- [ ] `gi reset` prompts for 'yes' confirmation\n- [ ] `gi reset --confirm` skips prompt\n- [ ] Reset deletes .db, .db-wal, and .db-shm files\n- [ ] Reset exits 2 if user doesn't type 'yes'\n- [ ] `gi sync-status` outputs stub message\n\n## Files\n\nCREATE:\n- src/cli/commands/version.ts\n- src/cli/commands/backup.ts\n- src/cli/commands/reset.ts\n- src/cli/commands/sync-status.ts\n\n## TDD Loop\n\nN/A - simple commands, verify manually:\n\n```bash\ngi version\ngi backup\nls ~/.local/share/gi/backups/\ngi reset # type 'no'\ngi reset --confirm\nls ~/.local/share/gi/data.db # should not exist\ngi sync-status\n```\n\n## Edge Cases\n\n- Backup when database doesn't exist - show clear error\n- Reset when database doesn't exist - show \"No database to reset\"\n- WAL/SHM files may not exist - check before unlinking\n- Timestamp with milliseconds could cause very long filename\n- readline prompt in non-interactive terminal - handle SIGINT","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:51.774210Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:31:46.227285Z","closed_at":"2026-01-25T03:31:46.227220Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3kj","depends_on_id":"bd-13b","type":"blocks","created_at":"2026-01-24T16:13:10.810953Z","created_by":"tayloreernisse"},{"issue_id":"bd-3kj","depends_on_id":"bd-3ng","type":"blocks","created_at":"2026-01-24T16:13:10.827689Z","created_by":"tayloreernisse"}]}
{"id":"bd-3lc","title":"Rename GiError to LoreError across codebase","description":"## Background\nThe codebase currently uses `GiError` as the primary error enum name (legacy from when the project was called \"gi\"). Checkpoint 3 introduces new modules (documents, search, embedding) that import error types. Renaming before Gate A work begins prevents every subsequent bead from needing to reference the old name and avoids merge conflicts across parallel work streams.\n\n## Approach\nMechanical find-and-replace using `ast-grep` or `sed`:\n1. Rename the enum declaration in `src/core/error.rs`: `pub enum GiError` -> `pub enum LoreError`\n2. Update the type alias: `pub type Result<T> = std::result::Result<T, LoreError>;`\n3. Update re-exports in `src/core/mod.rs` and `src/lib.rs`\n4. Update all `use` statements across ~16 files that import `GiError`\n5. Update any `GiError::` variant construction sites\n6. Run `cargo build` to verify no references remain\n\n**Do NOT change:**\n- Error variant names (ConfigNotFound, etc.) 
— only the enum name\n- ErrorCode enum — it's already named correctly\n- RobotError — already named correctly\n\n## Acceptance Criteria\n- [ ] `cargo build` succeeds with zero warnings about GiError\n- [ ] `rg GiError src/` returns zero results\n- [ ] `rg LoreError src/core/error.rs` shows the enum declaration\n- [ ] `src/core/mod.rs` re-exports `LoreError` (not `GiError`)\n- [ ] `src/lib.rs` re-exports `LoreError`\n- [ ] All `use crate::core::error::LoreError` imports compile\n\n## Files\n- `src/core/error.rs` — enum rename + type alias\n- `src/core/mod.rs` — re-export update\n- `src/lib.rs` — re-export update\n- All files matching `rg 'GiError' src/` (~16 files: ingestion/*.rs, cli/commands/*.rs, gitlab/*.rs, main.rs)\n\n## TDD Loop\nRED: `cargo build` fails after renaming enum but before fixing imports\nGREEN: Fix all imports; `cargo build` succeeds\nVERIFY: `cargo build && rg GiError src/ && echo \"FAIL: GiError references remain\" || echo \"PASS: clean\"`\n\n## Edge Cases\n- Some files may use `GiError` in string literals (error messages) — do NOT rename those, only type references\n- `impl From<X> for GiError` blocks must become `impl From<X> for LoreError`\n- The `thiserror` derive macro on the enum does not reference the name, so no macro changes needed","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:25.694773Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:50:10.612340Z","closed_at":"2026-01-30T16:50:10.612278Z","close_reason":"Completed: renamed GiError to LoreError across all 16 files, cargo build + 164 tests pass","compaction_level":0,"original_size":0}
{"id":"bd-3lu","title":"Implement lore search CLI command (lexical mode)","description":"## Background\nThe search CLI command is the user-facing entry point for Gate A lexical search. It orchestrates the search pipeline: query parsing -> FTS5 search -> filter application -> result hydration (single round-trip) -> display. Gate B extends this same command with --mode=hybrid and --mode=semantic. The hydration query is critical for performance — it fetches all display fields + labels + paths in one SQL query using json_each() + json_group_array().\n\n## Approach\nCreate `src/cli/commands/search.rs` per PRD Section 3.4.\n\n**Key types:**\n- `SearchResultDisplay` — display-ready result with all fields (dates as ISO via `ms_to_iso`)\n- `ExplainData` — ranking explanation for --explain flag (vector_rank, fts_rank, rrf_score)\n- `SearchResponse` — wrapper with query, mode, total_results, results, warnings\n\n**Core function:**\n```rust\npub fn run_search(\n config: &Config,\n query: &str,\n mode: SearchMode,\n filters: SearchFilters,\n explain: bool,\n) -> Result<SearchResponse>\n```\n\n**Pipeline:**\n1. Parse query + filters\n2. Execute search based on mode -> ranked doc_ids (+ explain ranks)\n3. Apply post-retrieval filters via apply_filters() preserving ranking order\n4. Hydrate results in single DB round-trip using json_each + json_group_array\n5. Attach snippets: prefer FTS snippet, fallback to `generate_fallback_snippet()` for semantic-only\n6. Convert timestamps via `ms_to_iso()` from `crate::core::time`\n7. 
Build SearchResponse\n\n**Hydration query (critical — single round-trip, replaces 60 queries with 1):**\n```sql\nSELECT d.id, d.source_type, d.title, d.url, d.author_username,\n d.created_at, d.updated_at, d.content_text,\n p.path_with_namespace AS project_path,\n (SELECT json_group_array(dl.label_name)\n FROM document_labels dl WHERE dl.document_id = d.id) AS labels,\n (SELECT json_group_array(dp.path)\n FROM document_paths dp WHERE dp.document_id = d.id) AS paths\nFROM json_each(?) AS j\nJOIN documents d ON d.id = j.value\nJOIN projects p ON p.id = d.project_id\nORDER BY j.key\n```\n\n**Human output uses `console::style` for terminal formatting:**\n```rust\nuse console::style;\n// Type prefix in cyan\nprintln!(\"[{}] {} - {} ({})\", i+1, style(type_prefix).cyan(), title, score);\n// URL in dim\nprintln!(\" {}\", style(url).dim());\n```\n\n**JSON robot mode includes elapsed_ms in meta (PRD Section 3.4):**\n```rust\npub fn print_search_results_json(response: &SearchResponse, elapsed_ms: u64) {\n let output = serde_json::json!({\n \"ok\": true,\n \"data\": response,\n \"meta\": { \"elapsed_ms\": elapsed_ms }\n });\n println!(\"{}\", serde_json::to_string_pretty(&output).unwrap());\n}\n```\n\n**CLI args in `src/cli/mod.rs` (PRD Section 3.4):**\n```rust\n#[derive(Args)]\npub struct SearchArgs {\n query: String,\n #[arg(long, default_value = \"hybrid\")]\n mode: String,\n #[arg(long, value_name = \"TYPE\")]\n r#type: Option<String>,\n #[arg(long)]\n author: Option<String>,\n #[arg(long)]\n project: Option<String>,\n #[arg(long, action = clap::ArgAction::Append)]\n label: Vec<String>,\n #[arg(long)]\n path: Option<String>,\n #[arg(long)]\n after: Option<String>,\n #[arg(long)]\n updated_after: Option<String>,\n #[arg(long, default_value = \"20\")]\n limit: usize,\n #[arg(long)]\n explain: bool,\n #[arg(long, default_value = \"safe\")]\n fts_mode: String,\n}\n```\n\n**IMPORTANT: default_value = \"hybrid\"** — When Ollama is unavailable, hybrid mode gracefully degrades to 
FTS-only with a warning (not an error). `lore search` works without Ollama.\n\n## Acceptance Criteria\n- [ ] Default mode is \"hybrid\" (not \"lexical\") per PRD\n- [ ] Hybrid mode degrades gracefully to FTS-only when Ollama unavailable (warning, not error)\n- [ ] All filters work (type, author, project, label, path, after, updated_after, limit)\n- [ ] Label filter uses `clap::ArgAction::Append` for repeatable --label flags\n- [ ] Hydration in single query (not N+1) — uses json_each + json_group_array\n- [ ] Timestamps converted via `ms_to_iso()` for display (ISO format)\n- [ ] Human output uses `console::style` for colored type prefix (cyan) and dim URLs\n- [ ] JSON robot mode includes `elapsed_ms` in `meta` field\n- [ ] Semantic-only results get fallback snippets via `generate_fallback_snippet()`\n- [ ] Empty results show friendly message: \"No results found for 'query'\"\n- [ ] \"No data indexed\" message if documents table empty\n- [ ] --explain shows vector_rank, fts_rank, rrf_score per result\n- [ ] --fts-mode=safe preserves prefix `*` while escaping special chars\n- [ ] --fts-mode=raw passes FTS5 MATCH syntax through unchanged\n- [ ] --mode=semantic with 0% embedding coverage returns LoreError::EmbeddingsNotBuilt (not OllamaUnavailable)\n- [ ] SearchArgs registered in cli/mod.rs with Clap derive\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/cli/commands/search.rs` — new file\n- `src/cli/commands/mod.rs` — add `pub mod search;`\n- `src/cli/mod.rs` — add SearchArgs struct, wire up search subcommand\n- `src/main.rs` — add search command handler\n\n## TDD Loop\nRED: Integration test requiring DB with documents\n- `test_lexical_search_returns_results` — FTS search returns hits\n- `test_hydration_single_query` — verify no N+1 (mock/inspect query count)\n- `test_json_output_includes_elapsed` — robot mode JSON has meta.elapsed_ms\n- `test_empty_results_message` — zero results shows friendly message\n- `test_fallback_snippet` — semantic-only result uses truncated 
content\nGREEN: Implement run_search + hydrate_results + print functions\nVERIFY: `cargo build && cargo test search`\n\n## Edge Cases\n- Zero results: display friendly empty message, JSON returns empty array\n- --mode=semantic with 0% embedding coverage: return LoreError::EmbeddingsNotBuilt\n- json_group_array returns \"[]\" for documents with no labels — parse as empty array\n- Very long snippets: truncated at display time\n- Hybrid default works without Ollama: degrades to FTS-only with warning\n- ms_to_iso with epoch 0: return valid ISO string (not crash)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:13.109876Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:52:24.320923Z","closed_at":"2026-01-30T17:52:24.320857Z","close_reason":"Implemented search CLI with FTS5 + RRF ranking, single-query hydration (json_each + json_group_array), adaptive recall, all filters, --explain, human + JSON output. Builds clean.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3lu","depends_on_id":"bd-1k1","type":"blocks","created_at":"2026-01-30T15:29:24.482877Z","created_by":"tayloreernisse"},{"issue_id":"bd-3lu","depends_on_id":"bd-3q2","type":"blocks","created_at":"2026-01-30T15:29:24.520379Z","created_by":"tayloreernisse"},{"issue_id":"bd-3lu","depends_on_id":"bd-3qs","type":"blocks","created_at":"2026-01-30T15:29:24.556323Z","created_by":"tayloreernisse"}]}
{"id":"bd-3mk","title":"[CP1] gi list issues command","description":"List issues from the database.\n\nFlags:\n- --limit=N (default: 20)\n- --project=PATH (filter by project)\n- --state=opened|closed|all (default: all)\n\nOutput: Table with iid, title, state, author, relative time\n\nFiles: src/cli/commands/list.ts\nDone when: List displays issues with proper filtering and formatting","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T15:20:10.400664Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.155211Z","deleted_at":"2026-01-25T15:21:35.155209Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-3n1","title":"[CP1] gi list issues command","description":"## Background\n\nThe `gi list issues` command displays a paginated list of issues from the local database. It supports filtering by project and state, with configurable limit. This provides quick access to synced issues without opening GitLab.\n\n## Approach\n\n### Module: src/cli/commands/list.rs\n\n### Clap Definition\n\n```rust\n#[derive(Args)]\npub struct ListArgs {\n /// Entity type to list\n #[arg(value_parser = [\"issues\", \"mrs\"])]\n pub entity: String,\n\n /// Maximum results\n #[arg(long, default_value = \"20\")]\n pub limit: usize,\n\n /// Filter by project path\n #[arg(long)]\n pub project: Option<String>,\n\n /// Filter by state\n #[arg(long, value_parser = [\"opened\", \"closed\", \"all\"])]\n pub state: Option<String>,\n}\n```\n\n### Handler Function\n\n```rust\npub async fn handle_list(args: ListArgs, conn: &Connection) -> Result<()>\n```\n\n### Query (for issues)\n\n```sql\nSELECT i.iid, i.title, i.state, i.author_username, i.updated_at, p.path\nFROM issues i\nJOIN projects p ON i.project_id = p.id\nWHERE (p.path = ? OR ? IS NULL)\n AND (i.state = ? OR ? IS NULL OR ? 
= 'all')\nORDER BY i.updated_at DESC\nLIMIT ?\n```\n\n### Output Format (matches PRD)\n\n```\nIssues (showing 20 of 3,801)\n\n #1234 Authentication redesign opened @johndoe 3 days ago\n #1233 Fix memory leak in cache closed @janedoe 5 days ago\n #1232 Add dark mode support opened @bobsmith 1 week ago\n ...\n```\n\n### Column Layout\n\n| Column | Width | Alignment |\n|--------|-------|-----------|\n| IID | 6 | right |\n| Title | 45 | left (truncate) |\n| State | 8 | left |\n| Author | 12 | left |\n| Updated | 12 | right (relative) |\n\n### Relative Time Formatting\n\n```rust\nfn format_relative_time(ms_epoch: i64) -> String {\n let now = now_ms();\n let diff = now - ms_epoch;\n match diff {\n d if d < 60_000 => \"just now\".to_string(),\n d if d < 3_600_000 => format!(\"{} min ago\", d / 60_000),\n d if d < 86_400_000 => format!(\"{} hours ago\", d / 3_600_000),\n d if d < 604_800_000 => format!(\"{} days ago\", d / 86_400_000),\n d if d < 2_592_000_000 => format!(\"{} weeks ago\", d / 604_800_000),\n _ => format!(\"{} months ago\", diff / 2_592_000_000),\n }\n}\n```\n\n## Acceptance Criteria\n\n- [ ] Lists issues ordered by updated_at DESC\n- [ ] Shows \"showing X of Y\" with total count\n- [ ] Respects --limit parameter\n- [ ] --project filters to single project\n- [ ] --state filters to opened/closed/all\n- [ ] Title truncated if longer than column width\n- [ ] Updated time shown as relative (\"3 days ago\")\n\n## Files\n\n- src/cli/commands/mod.rs (add `pub mod list;`)\n- src/cli/commands/list.rs (create)\n- src/cli/mod.rs (add List variant to Commands enum)\n\n## TDD Loop\n\nRED:\n```rust\n#[tokio::test] async fn list_issues_shows_correct_columns()\n#[tokio::test] async fn list_issues_respects_limit()\n#[tokio::test] async fn list_issues_filters_by_project()\n#[tokio::test] async fn list_issues_filters_by_state()\n```\n\nGREEN: Implement handler with query and formatting\n\nVERIFY: `cargo test list_issues`\n\n## Edge Cases\n\n- No issues match filters - show 
\"No issues found\"\n- Title exactly 45 chars - no truncation\n- Title 46+ chars - truncate with \"...\"\n- --state=all shows both opened and closed\n- Default state filter is all (not just opened)","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-25T17:02:38.336352Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:58:56.619167Z","closed_at":"2026-01-25T22:58:56.619106Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3n1","depends_on_id":"bd-208","type":"blocks","created_at":"2026-01-25T17:04:05.653278Z","created_by":"tayloreernisse"}]}
{"id":"bd-3nd","title":"[CP1] Issue transformer with label extraction","description":"## Background\n\nThe issue transformer converts GitLab API responses into our local schema format. It extracts core fields and, critically, the array of label names from each issue. This transformer is pure logic with no I/O, making it easy to test.\n\n## Approach\n\nCreate a transformer module with a function that:\n1. Takes a `GitLabIssue` and returns an `IssueRow` struct\n2. Extracts the `labels: Vec<String>` directly from the issue (GitLab returns label names as strings)\n\n### Structs\n\n```rust\n// src/gitlab/transformers/issue.rs\n\npub struct IssueRow {\n pub gitlab_id: i64,\n pub iid: i64,\n pub project_id: i64,\n pub title: String,\n pub description: Option<String>,\n pub state: String,\n pub author_username: String,\n pub created_at: i64, // ms epoch UTC\n pub updated_at: i64, // ms epoch UTC\n pub web_url: String,\n}\n\npub struct IssueWithLabels {\n pub issue: IssueRow,\n pub label_names: Vec<String>,\n}\n```\n\n### Function\n\n```rust\npub fn transform_issue(issue: GitLabIssue) -> Result<IssueWithLabels, TransformError> {\n // Parse ISO 8601 timestamps to ms epoch\n // Extract author.username\n // Return IssueWithLabels with label_names from issue.labels\n}\n```\n\n## Acceptance Criteria\n\n- [ ] `IssueRow` struct exists with all fields from schema\n- [ ] `IssueWithLabels` struct bundles issue + label names\n- [ ] `transform_issue` parses ISO 8601 to ms epoch correctly\n- [ ] `transform_issue` handles missing description (None)\n- [ ] Label names are preserved exactly as received from GitLab\n- [ ] Unit tests cover all edge cases\n\n## Files\n\n- src/gitlab/transformers/mod.rs (create, add `pub mod issue;`)\n- src/gitlab/transformers/issue.rs (create)\n\n## TDD Loop\n\nRED: \n```rust\n// tests/unit/issue_transformer_test.rs\n#[test] fn transforms_issue_with_all_fields()\n#[test] fn handles_missing_description()\n#[test] fn extracts_label_names()\n#[test] fn 
parses_timestamps_to_ms_epoch()\n```\n\nGREEN: Implement transform_issue function\n\nVERIFY: `cargo test issue_transformer`\n\n## Edge Cases\n\n- GitLab timestamps are ISO 8601 with timezone - use chrono::DateTime::parse_from_rfc3339\n- Description can be null in GitLab API - map to Option<String>\n- Empty labels array is valid - return empty Vec\n- Do NOT parse labels_details - it varies across GitLab versions","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.174071Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:27:11.430611Z","closed_at":"2026-01-25T22:27:11.430439Z","close_reason":"Implemented IssueRow, IssueWithLabels, transform_issue with 6 passing unit tests covering all edge cases","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3nd","depends_on_id":"bd-1np","type":"blocks","created_at":"2026-01-25T17:04:05.314883Z","created_by":"tayloreernisse"}]}
{"id":"bd-3ng","title":"[CP0] Database setup with migrations and app lock","description":"## Background\n\nThe database is the backbone of gitlab-inbox. SQLite with WAL mode for performance, foreign keys for integrity, and proper pragmas for reliability. Migrations allow schema evolution. App lock prevents concurrent sync corruption.\n\nReference: docs/prd/checkpoint-0.md sections \"Database Schema\", \"SQLite Runtime Pragmas\", \"App Lock Mechanism\"\n\n## Approach\n\n**src/core/db.ts:**\n```typescript\nimport Database from 'better-sqlite3';\nimport { join, dirname } from 'node:path';\nimport { mkdirSync, readdirSync, readFileSync } from 'node:fs';\nimport { getDbPath } from './paths';\nimport { dbLogger } from './logger';\n\nexport function createConnection(dbPath: string): Database.Database {\n mkdirSync(dirname(dbPath), { recursive: true });\n const db = new Database(dbPath);\n \n // Production-grade pragmas\n db.pragma('journal_mode = WAL');\n db.pragma('synchronous = NORMAL');\n db.pragma('foreign_keys = ON');\n db.pragma('busy_timeout = 5000');\n db.pragma('temp_store = MEMORY');\n \n return db;\n}\n\nexport function runMigrations(db: Database.Database, migrationsDir: string): void {\n // Create schema_version table if not exists\n // Read migration files sorted by version\n // Apply migrations not yet applied\n // Track in schema_version table\n}\n```\n\n**migrations/001_initial.sql:**\nFull schema with tables: schema_version, projects, sync_runs, app_locks, sync_cursors, raw_payloads\n\n**src/core/lock.ts:**\nAppLock class with:\n- acquire(force?): acquires lock or throws DatabaseLockError\n- release(): releases lock and stops heartbeat\n- Heartbeat timer that updates heartbeat_at every N seconds\n- Stale lock detection (heartbeat_at > staleLockMinutes ago)\n\n## Acceptance Criteria\n\n- [ ] createConnection() creates parent directories if missing\n- [ ] WAL mode verified: `db.pragma('journal_mode')` returns 'wal'\n- [ ] Foreign keys verified: 
`db.pragma('foreign_keys')` returns 1\n- [ ] busy_timeout verified: `db.pragma('busy_timeout')` returns 5000\n- [ ] 001_initial.sql creates all 6 tables\n- [ ] schema_version shows version 1 after migration\n- [ ] AppLock.acquire() succeeds for first caller\n- [ ] AppLock.acquire() throws DatabaseLockError for second concurrent caller\n- [ ] Stale lock (heartbeat > 10 min old) can be taken over\n- [ ] tests/unit/db.test.ts passes (8 tests)\n- [ ] tests/integration/app-lock.test.ts passes (6 tests)\n\n## Files\n\nCREATE:\n- src/core/db.ts\n- src/core/lock.ts\n- migrations/001_initial.sql\n- tests/unit/db.test.ts\n- tests/integration/app-lock.test.ts\n\n## TDD Loop\n\nRED:\n```typescript\n// tests/unit/db.test.ts\ndescribe('Database', () => {\n it('creates database file if not exists')\n it('applies migrations in order')\n it('sets WAL journal mode')\n it('enables foreign keys')\n it('sets busy_timeout=5000')\n it('sets synchronous=NORMAL')\n it('sets temp_store=MEMORY')\n it('tracks schema version')\n})\n\n// tests/integration/app-lock.test.ts\ndescribe('App Lock', () => {\n it('acquires lock successfully')\n it('updates heartbeat during operation')\n it('detects stale lock and recovers')\n it('refuses concurrent acquisition')\n it('allows force override')\n it('releases lock on completion')\n})\n```\n\nGREEN: Implement db.ts, lock.ts, 001_initial.sql\n\nVERIFY: \n```bash\nnpm run test -- tests/unit/db.test.ts\nnpm run test -- tests/integration/app-lock.test.ts\n```\n\n## Edge Cases\n\n- Migration file with syntax error should rollback and throw MigrationError\n- Lock heartbeat timer must be unref()'d to not block process exit\n- Database file permissions - fail clearly if not writable\n- Concurrent lock tests need separate database 
files","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:49.481012Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:08:38.612669Z","closed_at":"2026-01-25T03:08:38.612543Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3ng","depends_on_id":"bd-epj","type":"blocks","created_at":"2026-01-24T16:13:08.349356Z","created_by":"tayloreernisse"}]}
{"id":"bd-3pk","title":"OBSERV Epic: Phase 5 - Rate Limit + Retry Instrumentation","description":"Enhanced logging in GitLab HTTP client for rate limits and retries. Structured fields on retry/rate-limit events. StageTiming gets rate_limit_hits and retries counters.\n\nDepends on: Phase 2 (spans for context)\nParallel with: Phase 3\n\nFiles: src/gitlab/client.rs, src/core/metrics.rs\n\nAcceptance criteria (PRD Section 6.5):\n- 429 events log at INFO with path, attempt, retry_after_secs, status_code\n- Retry events log with path, attempt, error\n- StageTiming includes rate_limit_hits and retries (omitted when zero)\n- lore -v sync shows retry activity on stderr\n- Rate limit counts included in metrics_json","status":"closed","priority":2,"issue_type":"epic","created_at":"2026-02-04T15:53:27.517023Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:31:35.239820Z","closed_at":"2026-02-04T17:31:35.239776Z","close_reason":"All Phase 5 tasks complete: structured rate-limit logging (bd-12ae) and rate_limit_hits/retries counters in StageTiming (bd-3vqk)","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-3pk","depends_on_id":"bd-2ni","type":"blocks","created_at":"2026-02-04T15:55:19.208152Z","created_by":"tayloreernisse"}]}
{"id":"bd-3pz","title":"OBSERV Epic: Phase 4 - Sync History Enrichment","description":"Wire up sync_runs INSERT/UPDATE lifecycle (table exists but nothing writes to it), schema migration 014, enhanced sync-status with recent runs and metrics.\n\nDepends on: Phase 3 (needs Vec<StageTiming> to store in metrics_json)\nUnblocks: nothing (terminal phase)\n\nFiles: migrations/014_sync_runs_enrichment.sql (new), src/core/sync_run.rs (new), src/cli/commands/sync.rs, src/cli/commands/ingest.rs, src/cli/commands/sync_status.rs\n\nAcceptance criteria (PRD Section 6.4):\n- lore sync creates sync_runs row with status=running, updated to succeeded/failed\n- sync_runs.run_id matches log files and robot JSON\n- metrics_json contains serialized Vec<StageTiming>\n- lore sync-status shows last 10 runs with metrics\n- Failed syncs record error and partial metrics\n- Migration 014 applies cleanly","status":"closed","priority":2,"issue_type":"epic","created_at":"2026-02-04T15:53:27.469149Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:43:07.375047Z","closed_at":"2026-02-04T17:43:07.375Z","close_reason":"Phase 4 complete: migration 014, SyncRunRecorder, wiring, sync-status enhancement","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-3pz","depends_on_id":"bd-3er","type":"blocks","created_at":"2026-02-04T15:55:19.153053Z","created_by":"tayloreernisse"}]}
{"id":"bd-3q2","title":"Implement search filters module","description":"## Background\nSearch filters are applied post-retrieval to narrow results by source type, author, project, date, labels, and file paths. The filter module must preserve ranking order from the search pipeline (FTS/RRF scores). It uses SQLite's JSON1 extension (json_each) to pass ranked document IDs efficiently and maintain their original order.\n\n## Approach\nCreate `src/search/filters.rs` per PRD Section 3.3. The full implementation is specified in the PRD including the SQL query.\n\n**Key types:**\n- `SearchFilters` struct with all filter fields + `has_any_filter()` + `clamp_limit()`\n- `PathFilter` enum: `Prefix(String)` (trailing `/`) or `Exact(String)`\n\n**Core function:**\n```rust\npub fn apply_filters(\n conn: &Connection,\n document_ids: &[i64],\n filters: &SearchFilters,\n) -> Result<Vec<i64>>\n```\n\n**SQL pattern (JSON1 for ordered ID passing):**\n```sql\nSELECT d.id\nFROM json_each(?) AS j\nJOIN documents d ON d.id = j.value\nWHERE 1=1\n AND d.source_type = ? -- if source_type filter set\n AND d.author_username = ? -- if author filter set\n -- ... dynamic WHERE clauses\nORDER BY j.key -- preserves ranking order\nLIMIT ?\n```\n\n**Filter logic:**\n- Labels: AND logic via `EXISTS (SELECT 1 FROM document_labels dl WHERE dl.document_id = d.id AND dl.label_name = ?)`\n- Path prefix: `LIKE ? 
ESCAPE '\\\\'` with escaped wildcards\n- Path exact: `= ?`\n- Limit: clamped to [1, 100], default 20\n\n## Acceptance Criteria\n- [ ] source_type filter works (issue, merge_request, discussion)\n- [ ] author filter: exact username match\n- [ ] project_id filter: restricts to single project\n- [ ] after filter: created_at >= value\n- [ ] updated_after filter: updated_at >= value\n- [ ] labels filter: AND logic (all specified labels must be present)\n- [ ] path exact filter: matches exact path string\n- [ ] path prefix filter: trailing `/` triggers LIKE with escaped wildcards\n- [ ] Ranking order preserved (ORDER BY j.key from json_each)\n- [ ] Limit clamped: 0 -> 20 (default), 200 -> 100 (max)\n- [ ] Empty document_ids returns empty Vec\n- [ ] Multiple filters compose correctly (all applied via AND)\n- [ ] `cargo test filters` passes\n\n## Files\n- `src/search/filters.rs` — new file\n- `src/search/mod.rs` — add `pub use filters::{SearchFilters, PathFilter, apply_filters};`\n\n## TDD Loop\nRED: Tests in `filters.rs` `#[cfg(test)] mod tests`:\n- `test_no_filters` — all docs returned up to limit\n- `test_source_type_filter` — only issues returned\n- `test_author_filter` — exact match\n- `test_labels_and_logic` — must have ALL specified labels\n- `test_path_exact` — matches exact path\n- `test_path_prefix` — trailing slash matches prefix\n- `test_limit_clamping` — 0 -> 20, 200 -> 100\n- `test_ranking_preserved` — output order matches input order\n- `test_has_any_filter` — true when any filter set, false when default\nGREEN: Implement apply_filters with dynamic SQL\nVERIFY: `cargo test filters`\n\n## Edge Cases\n- Path containing SQL LIKE wildcards (`%`, `_`): must be escaped before LIKE\n- Empty labels list: no label filter applied (not \"must have zero labels\")\n- `has_any_filter()` returns false for default SearchFilters (no filters set)\n- Large document_ids array (1000+): JSON1 handles 
efficiently","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:13.042512Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:24:38.402483Z","closed_at":"2026-01-30T17:24:38.402302Z","close_reason":"Completed: SearchFilters with has_any_filter/clamp_limit, PathFilter enum, apply_filters with dynamic SQL + json_each ordering, escape_like, 8 tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3q2","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:24.412357Z","created_by":"tayloreernisse"}]}
{"id":"bd-3qm","title":"[CP1] Final validation - tests, smoke tests, integrity checks","description":"Run all tests and perform data integrity checks.\n\nValidation steps:\n1. Run all unit tests (vitest)\n2. Run all integration tests\n3. Run ESLint\n4. Run TypeScript strict check\n5. Manual smoke tests per PRD table\n6. Data integrity SQL checks:\n - Issue count matches GitLab\n - Every issue has raw_payload\n - Labels in junction exist in labels table\n - sync_cursors has entry per project\n - Re-run fetches 0 new items\n - Discussion count > 0\n - Every discussion has >= 1 note\n - individual_note=true has exactly 1 note\n\nFiles: All CP1 files\nDone when: All gate criteria from Definition of Done pass","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:20:51.994183Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.152852Z","deleted_at":"2026-01-25T15:21:35.152849Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-3qs","title":"Implement lore generate-docs CLI command","description":"## Background\nThe generate-docs CLI command is the user-facing wrapper around the document regeneration pipeline. It has two modes: incremental (default, processes dirty_sources queue only) and full (seeds dirty_sources with ALL entities, then drains). Both modes use the same regenerator codepath to avoid logic divergence. Full mode uses keyset pagination (WHERE id > last_id) for seeding to avoid O(n^2) OFFSET degradation on large tables.\n\n## Approach\nCreate `src/cli/commands/generate_docs.rs` per PRD Section 2.4.\n\n**Core function:**\n```rust\npub fn run_generate_docs(\n config: &Config,\n full: bool,\n project_filter: Option<&str>,\n) -> Result<GenerateDocsResult>\n```\n\n**Full mode seeding (keyset pagination):**\n```rust\nconst FULL_MODE_CHUNK_SIZE: usize = 2000;\n\n// For each source type (issues, MRs, discussions):\nlet mut last_id: i64 = 0;\nloop {\n let tx = conn.transaction()?;\n let inserted = tx.execute(\n \"INSERT INTO dirty_sources (source_type, source_id, queued_at, ...)\n SELECT 'issue', id, ?, 0, NULL, NULL, NULL\n FROM issues WHERE id > ? 
ORDER BY id LIMIT ?\n ON CONFLICT(source_type, source_id) DO NOTHING\",\n params![now_ms(), last_id, FULL_MODE_CHUNK_SIZE],\n )?;\n if inserted == 0 { tx.commit()?; break; }\n // Advance keyset cursor...\n tx.commit()?;\n}\n```\n\n**After draining (full mode only):**\n```sql\nINSERT INTO documents_fts(documents_fts) VALUES('optimize')\n```\n\n**CLI args:**\n```rust\n#[derive(Args)]\npub struct GenerateDocsArgs {\n #[arg(long)]\n full: bool,\n #[arg(long)]\n project: Option<String>,\n}\n```\n\n**Output:** Human-readable table + JSON robot mode.\n\n## Acceptance Criteria\n- [ ] Default mode (no --full): processes only existing dirty_sources entries\n- [ ] --full mode: seeds dirty_sources with ALL issues, MRs, and discussions\n- [ ] Full mode uses keyset pagination (WHERE id > last_id, not OFFSET)\n- [ ] Full mode chunk size is 2000\n- [ ] Full mode does FTS optimize after completion\n- [ ] Both modes use regenerate_dirty_documents() (same codepath)\n- [ ] Progress bar shown in human mode (via indicatif)\n- [ ] JSON output in robot mode with GenerateDocsResult\n- [ ] GenerateDocsResult has issues/mrs/discussions/total/truncated/skipped counts\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/cli/commands/generate_docs.rs` — new file\n- `src/cli/commands/mod.rs` — add `pub mod generate_docs;`\n- `src/cli/mod.rs` — add GenerateDocsArgs, wire up generate-docs subcommand\n- `src/main.rs` — add generate-docs command handler\n\n## TDD Loop\nRED: Integration test with seeded DB\nGREEN: Implement run_generate_docs with seeding + drain\nVERIFY: `cargo build && cargo test generate_docs`\n\n## Edge Cases\n- Empty database (no issues/MRs/discussions): full mode seeds nothing, returns all-zero counts\n- --project filter in full mode: only seed dirty_sources for entities in that project\n- Interrupted full mode: dirty_sources entries persist (ON CONFLICT DO NOTHING), resume by re-running\n- FTS optimize on empty FTS table: no-op 
(safe)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:55.226666Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:49:23.397157Z","closed_at":"2026-01-30T17:49:23.397098Z","close_reason":"Implemented generate-docs command with incremental + full mode, keyset pagination seeding, FTS optimize, project filter, human + JSON output. Builds clean.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3qs","depends_on_id":"bd-1u1","type":"blocks","created_at":"2026-01-30T15:29:16.089769Z","created_by":"tayloreernisse"},{"issue_id":"bd-3qs","depends_on_id":"bd-221","type":"blocks","created_at":"2026-01-30T15:29:16.125158Z","created_by":"tayloreernisse"}]}
{"id":"bd-3rl","title":"Epic: Gate C - Sync MVP","description":"## Background\nGate C adds the sync orchestrator and queue infrastructure that makes the search pipeline incremental and self-maintaining. It introduces dirty source tracking (change detection during ingestion), the discussion fetch queue, and the unified lore sync command that orchestrates the full pipeline. Gate C also adds integrity checks and repair paths.\n\n## Gate C Deliverables\n1. Orchestrated lore sync command with incremental doc regen + re-embedding\n2. Integrity checks + repair paths for FTS/embeddings consistency\n\n## Bead Dependencies (execution order, after Gate A)\n1. **bd-mem** — Shared backoff utility (no deps, shared with Gate B)\n2. **bd-38q** — Dirty source tracking (blocked by bd-36p, bd-hrs, bd-mem)\n3. **bd-1je** — Discussion queue (blocked by bd-hrs, bd-mem)\n4. **bd-1i2** — Integrate dirty tracking into ingestion (blocked by bd-38q)\n5. **bd-1x6** — Sync CLI (blocked by bd-38q, bd-1je, bd-1i2, bd-3qs, bd-2sx)\n\n## Acceptance Criteria\n- [ ] `lore sync` runs full pipeline: ingest -> generate-docs -> embed\n- [ ] `lore sync --full` does full re-sync + regeneration\n- [ ] `lore sync --no-embed` skips embedding stage\n- [ ] Dirty tracking: upserted entities automatically marked for regeneration\n- [ ] Queue draining: dirty_sources fully drained in bounded batch loop\n- [ ] Backoff: failed items use exponential backoff with jitter\n- [ ] `lore stats --check` detects inconsistencies\n- [ ] `lore stats --repair` fixes FTS/embedding inconsistencies","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-30T15:25:13.494698Z","created_by":"tayloreernisse","updated_at":"2026-01-30T18:05:52.121666Z","closed_at":"2026-01-30T18:05:52.121619Z","close_reason":"All Gate C sub-beads complete: backoff utility, dirty tracking, discussion queue, ingestion integration, sync CLI, stats 
CLI","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-3rl","depends_on_id":"bd-1x6","type":"blocks","created_at":"2026-01-30T15:29:35.853817Z","created_by":"tayloreernisse"},{"issue_id":"bd-3rl","depends_on_id":"bd-pr1","type":"blocks","created_at":"2026-01-30T15:29:35.892441Z","created_by":"tayloreernisse"}]}
{"id":"bd-3sh","title":"Add 'lore count events' command with robot mode","description":"## Background\nNeed to verify event ingestion and report counts by type. The existing count command (src/cli/commands/count.rs) handles issues, mrs, discussions, notes with both human and robot output. This adds 'events' as a new count subcommand.\n\n## Approach\nExtend the existing count command in src/cli/commands/count.rs:\n\n1. Add CountTarget::Events variant (or string match) in the count dispatcher\n2. Query each event table with GROUP BY entity type:\n```sql\nSELECT \n CASE WHEN issue_id IS NOT NULL THEN 'issue' ELSE 'merge_request' END as entity_type,\n COUNT(*) as count\nFROM resource_state_events\nGROUP BY entity_type;\n-- (repeat for label and milestone events)\n```\n\n3. Human output: table format\n```\nEvent Type Issues MRs Total\nState events 1,234 567 1,801\nLabel events 2,345 890 3,235\nMilestone events 456 123 579\nTotal 4,035 1,580 5,615\n```\n\n4. Robot JSON:\n```json\n{\n \"ok\": true,\n \"data\": {\n \"state_events\": {\"issue\": 1234, \"merge_request\": 567, \"total\": 1801},\n \"label_events\": {\"issue\": 2345, \"merge_request\": 890, \"total\": 3235},\n \"milestone_events\": {\"issue\": 456, \"merge_request\": 123, \"total\": 579},\n \"total\": 5615\n }\n}\n```\n\n5. 
Register in CLI: add \"events\" to count's entity_type argument in src/cli/mod.rs\n\n## Acceptance Criteria\n- [ ] `lore count events` shows correct counts by event type and entity type\n- [ ] Robot JSON matches the schema above\n- [ ] Works with empty tables (all zeros)\n- [ ] Does not error if migration 011 hasn't been applied (graceful degradation or \"no event tables\" message)\n\n## Files\n- src/cli/commands/count.rs (add events counting logic)\n- src/cli/mod.rs (add \"events\" to count's accepted entity types)\n\n## TDD Loop\nRED: tests/count_tests.rs (or extend existing):\n- `test_count_events_empty_tables` - verify all zeros on fresh DB\n- `test_count_events_with_data` - seed state + label events, verify correct counts\n- `test_count_events_robot_json` - verify JSON structure\n\nGREEN: Add the events branch to count command\n\nVERIFY: `cargo test count -- --nocapture`\n\n## Edge Cases\n- Tables don't exist if user hasn't run migrate — check table existence first or catch the error\n- COUNT with GROUP BY returns no rows for empty tables — need to handle missing entity types as 0","status":"closed","priority":3,"issue_type":"task","created_at":"2026-02-02T21:31:57.379702Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:21:21.408874Z","closed_at":"2026-02-03T16:21:21.408806Z","close_reason":"Added 'events' to count CLI parser, run_count_events function, print_event_count (table format) and print_event_count_json (structured JSON). Wired into handle_count in main.rs.","compaction_level":0,"original_size":0,"labels":["cli","gate-1","phase-b"],"dependencies":[{"issue_id":"bd-3sh","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:57.380927Z","created_by":"tayloreernisse"},{"issue_id":"bd-3sh","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T21:32:06.308285Z","created_by":"tayloreernisse"}]}
{"id":"bd-3vqk","title":"OBSERV: Add rate_limit_hits and retries counters to StageTiming","description":"## Background\nMetricsLayer counts span timing but doesn't yet count rate-limit hits and retries. These counters complete the observability picture, showing HOW MUCH time was spent waiting vs. working.\n\n## Approach\n### src/core/metrics.rs - StageTiming struct\n\nAdd two new fields:\n```rust\n#[derive(Debug, Clone, Serialize)]\npub struct StageTiming {\n // ... existing fields ...\n #[serde(skip_serializing_if = \"is_zero\")]\n pub rate_limit_hits: usize,\n #[serde(skip_serializing_if = \"is_zero\")]\n pub retries: usize,\n}\n```\n\n### src/core/metrics.rs - MetricsLayer\n\nThe structured log events from bd-12ae use info!() with specific fields (status_code=429, \"Rate limited, retrying\"). MetricsLayer needs to count these events within each span.\n\nAdd to SpanData:\n```rust\nstruct SpanData {\n // ... existing fields ...\n rate_limit_hits: usize,\n retries: usize,\n}\n```\n\nAdd on_event() to MetricsLayer:\n```rust\nfn on_event(&self, event: &tracing::Event<'_>, ctx: Context<'_, S>) {\n // Check if event message contains rate-limit or retry indicators\n // Increment counters on the current span\n if let Some(span_ref) = ctx.event_span(event) {\n let id = span_ref.id();\n if let Some(data) = self.spans.lock().unwrap().get_mut(&id.into_u64()) {\n let mut visitor = EventVisitor::default();\n event.record(&mut visitor);\n\n if visitor.status_code == Some(429) {\n data.rate_limit_hits += 1;\n }\n if visitor.is_retry {\n data.retries += 1;\n }\n }\n }\n}\n```\n\nThe EventVisitor checks for status_code=429 and message containing \"retrying\" to classify events.\n\nOn span close, propagate counts to parent (bubble up):\n```rust\nfn on_close(&self, id: Id, _ctx: Context<'_, S>) {\n if let Some(data) = self.spans.lock().unwrap().remove(&id.into_u64()) {\n let timing = StageTiming {\n // ... 
existing fields ...\n rate_limit_hits: data.rate_limit_hits,\n retries: data.retries,\n };\n // ... push to completed\n }\n}\n```\n\n## Acceptance Criteria\n- [ ] StageTiming has rate_limit_hits and retries fields\n- [ ] Fields omitted when zero in JSON serialization\n- [ ] MetricsLayer counts 429 events as rate_limit_hits\n- [ ] MetricsLayer counts retry events as retries\n- [ ] Counts bubble up to parent spans in extract_timings()\n- [ ] Rate limit counts appear in metrics_json stored in sync_runs\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/metrics.rs (add fields to StageTiming, add on_event to MetricsLayer, add EventVisitor)\n\n## TDD Loop\nRED:\n - test_stage_timing_rate_limit_counts: simulate 3 rate-limit events, extract, assert rate_limit_hits=3\n - test_stage_timing_retry_counts: simulate 2 retries, extract, assert retries=2\n - test_rate_limit_fields_omitted_when_zero: StageTiming with zero counts, serialize, assert no keys\nGREEN: Add fields to StageTiming, implement on_event in MetricsLayer\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Events outside any span: ctx.event_span() returns None. Skip counting. This shouldn't happen in practice since all GitLab calls happen within stage spans.\n- Event classification: rely on structured fields (status_code=429) not message text. More reliable and less fragile.\n- Count bubbling: parent stage should aggregate child counts. 
In extract_timings(), sum children's counts into parent.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:55:02.523778Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:25:25.456758Z","closed_at":"2026-02-04T17:25:25.456708Z","close_reason":"Implemented rate_limit_hits and retries counters in StageTiming with skip_serializing_if for zero values","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-3vqk","depends_on_id":"bd-12ae","type":"blocks","created_at":"2026-02-04T15:55:20.563473Z","created_by":"tayloreernisse"},{"issue_id":"bd-3vqk","depends_on_id":"bd-1o4h","type":"blocks","created_at":"2026-02-04T15:55:20.503024Z","created_by":"tayloreernisse"},{"issue_id":"bd-3vqk","depends_on_id":"bd-3pk","type":"parent-child","created_at":"2026-02-04T15:55:02.524557Z","created_by":"tayloreernisse"}]}
{"id":"bd-4qd","title":"Write unit tests for core algorithms","description":"## Background\nUnit tests verify the core algorithms in isolation: document extraction formatting, FTS query sanitization, RRF scoring, content hashing, backoff curves, and filter helpers. These tests don't require a database or external services — they test pure functions and logic.\n\n## Approach\nAdd #[cfg(test)] mod tests blocks to each module:\n\n**1. src/documents/extractor.rs:**\n- test_source_type_parse_all_aliases — every alias resolves correctly\n- test_source_type_parse_unknown — returns None\n- test_source_type_as_str_roundtrip — as_str matches parse input\n- test_content_hash_deterministic — same input = same hash\n- test_list_hash_order_independent — sorted before hashing\n- test_list_hash_empty — empty vec produces consistent hash\n\n**2. src/documents/truncation.rs:**\n- test_truncation_edge_cases (per bd-18t TDD Loop)\n\n**3. src/search/fts.rs:**\n- test_to_fts_query_basic — \"auth error\" -> quoted tokens\n- test_to_fts_query_prefix — \"auth*\" preserves prefix\n- test_to_fts_query_special_chars — \"C++\" quoted correctly\n- test_to_fts_query_dash — \"-DWITH_SSL\" quoted (not NOT operator)\n- test_to_fts_query_internal_quotes — escaped by doubling\n- test_to_fts_query_empty — empty string returns empty\n\n**4. src/search/rrf.rs:**\n- test_rrf_dual_list — docs in both lists score higher\n- test_rrf_normalization — best score = 1.0\n- test_rrf_empty — empty returns empty\n\n**5. src/core/backoff.rs:**\n- test_exponential_curve — delays double each attempt\n- test_cap_at_one_hour — high attempt_count capped\n- test_jitter_range — within [0.9, 1.1) factor\n\n**6. src/search/filters.rs:**\n- test_has_any_filter — true/false for various filter combos\n- test_clamp_limit — 0->20, 200->100, 50->50\n- test_path_filter_from_str — trailing slash = Prefix\n\n**7. 
src/search/hybrid.rs (hydration round-trip):**\n- test_single_round_trip_query — verify hydration SQL produces correct structure\n\n## Acceptance Criteria\n- [ ] All edge cases covered per PRD acceptance criteria\n- [ ] Tests are unit tests (no DB, no network, no Ollama)\n- [ ] `cargo test` passes with all new tests\n- [ ] No test depends on execution order\n- [ ] Tests cover: document extractor formats, truncation, RRF, hashing, FTS sanitization, backoff, filters\n\n## Files\n- In-module tests in: extractor.rs, truncation.rs, fts.rs, rrf.rs, backoff.rs, filters.rs, hybrid.rs\n\n## TDD Loop\nThese tests ARE the TDD loop for their respective beads. Each implementation bead should write its tests first (RED), then implement (GREEN).\nVERIFY: `cargo test`\n\n## Edge Cases\n- Tests with Unicode: include emoji, CJK characters in truncation tests\n- Tests with empty strings: empty queries, empty content, empty labels\n- Tests with boundary values: limit=0, limit=100, limit=101","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:27:21.712924Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:46:00.059346Z","closed_at":"2026-01-30T17:46:00.059292Z","close_reason":"All acceptance criteria tests already exist across modules. 
276 tests passing (189 unit + 87 integration).","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-4qd","depends_on_id":"bd-18t","type":"blocks","created_at":"2026-01-30T15:29:35.356715Z","created_by":"tayloreernisse"},{"issue_id":"bd-4qd","depends_on_id":"bd-1k1","type":"blocks","created_at":"2026-01-30T15:29:35.320913Z","created_by":"tayloreernisse"},{"issue_id":"bd-4qd","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:35.465589Z","created_by":"tayloreernisse"},{"issue_id":"bd-4qd","depends_on_id":"bd-3ez","type":"blocks","created_at":"2026-01-30T15:29:35.393455Z","created_by":"tayloreernisse"},{"issue_id":"bd-4qd","depends_on_id":"bd-mem","type":"blocks","created_at":"2026-01-30T15:29:35.427448Z","created_by":"tayloreernisse"}]}
{"id":"bd-5ta","title":"Add GitLab MR types to types.rs","description":"## Background\nGitLab API types for merge requests. These structs define how we deserialize GitLab API responses. Must handle deprecated field aliases for backward compatibility with older GitLab instances.\n\n## Approach\nAdd new structs to `src/gitlab/types.rs`:\n- `GitLabMergeRequest` - Main MR struct with all fields\n- `GitLabReviewer` - Reviewer with optional approval state\n- `GitLabReferences` - Short and full reference strings\n\nUse serde `#[serde(alias = \"...\")]` for deprecated field fallbacks.\n\n## Files\n- `src/gitlab/types.rs` - Add new structs after existing GitLabIssue\n- `tests/fixtures/gitlab_merge_request.json` - Test fixture\n\n## Acceptance Criteria\n- [ ] `GitLabMergeRequest` struct exists with all fields from PRD\n- [ ] `detailed_merge_status` field exists (non-deprecated)\n- [ ] `#[serde(alias = \"merge_status\")]` on `merge_status_legacy` for fallback\n- [ ] `merge_user` field exists (non-deprecated)\n- [ ] `merged_by` field exists for fallback\n- [ ] `draft` and `work_in_progress` both exist (draft preferred, WIP fallback)\n- [ ] `sha` field maps to `head_sha` in transformer\n- [ ] `references: Option<GitLabReferences>` for short/full refs\n- [ ] `state: String` supports \"opened\", \"merged\", \"closed\", \"locked\"\n- [ ] Fixture deserializes without error\n- [ ] `cargo test` passes\n\n## TDD Loop\nRED: Add test that deserializes fixture -> struct not found\nGREEN: Add GitLabMergeRequest, GitLabReviewer, GitLabReferences structs\nVERIFY: `cargo test gitlab_types`\n\n## Struct Definitions (from PRD)\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabMergeRequest {\n pub id: i64,\n pub iid: i64,\n pub project_id: i64,\n pub title: String,\n pub description: Option<String>,\n pub state: String, // \"opened\" | \"merged\" | \"closed\" | \"locked\"\n #[serde(default)]\n pub draft: bool,\n #[serde(default)]\n pub work_in_progress: bool, // Deprecated 
fallback\n pub source_branch: String,\n pub target_branch: String,\n pub sha: Option<String>, // head_sha\n pub references: Option<GitLabReferences>,\n pub detailed_merge_status: Option<String>,\n #[serde(alias = \"merge_status\")]\n pub merge_status_legacy: Option<String>,\n pub created_at: String,\n pub updated_at: String,\n pub merged_at: Option<String>,\n pub closed_at: Option<String>,\n pub author: GitLabAuthor,\n pub merge_user: Option<GitLabAuthor>,\n pub merged_by: Option<GitLabAuthor>,\n #[serde(default)]\n pub labels: Vec<String>,\n #[serde(default)]\n pub assignees: Vec<GitLabAuthor>,\n #[serde(default)]\n pub reviewers: Vec<GitLabReviewer>,\n pub web_url: String,\n}\n\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabReferences {\n pub short: String, // e.g. \"\\!123\"\n pub full: String, // e.g. \"group/project\\!123\"\n}\n\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabReviewer {\n pub id: i64,\n pub username: String,\n pub name: String,\n}\n```\n\n## Test Fixture (create tests/fixtures/gitlab_merge_request.json)\n```json\n{\n \"id\": 12345,\n \"iid\": 42,\n \"project_id\": 100,\n \"title\": \"Add user authentication\",\n \"description\": \"Implements JWT auth flow\",\n \"state\": \"merged\",\n \"draft\": false,\n \"work_in_progress\": false,\n \"source_branch\": \"feature/auth\",\n \"target_branch\": \"main\",\n \"sha\": \"abc123def456\",\n \"references\": { \"short\": \"\\!42\", \"full\": \"group/project\\!42\" },\n \"detailed_merge_status\": \"mergeable\",\n \"merge_status\": \"can_be_merged\",\n \"created_at\": \"2024-01-15T10:00:00Z\",\n \"updated_at\": \"2024-01-20T14:30:00Z\",\n \"merged_at\": \"2024-01-20T14:30:00Z\",\n \"closed_at\": null,\n \"author\": { \"id\": 1, \"username\": \"johndoe\", \"name\": \"John Doe\" },\n \"merge_user\": { \"id\": 2, \"username\": \"janedoe\", \"name\": \"Jane Doe\" },\n \"merged_by\": { \"id\": 2, \"username\": \"janedoe\", \"name\": \"Jane Doe\" },\n \"labels\": [\"enhancement\", \"auth\"],\n 
\"assignees\": [{ \"id\": 3, \"username\": \"bob\", \"name\": \"Bob Smith\" }],\n \"reviewers\": [{ \"id\": 4, \"username\": \"alice\", \"name\": \"Alice Wong\" }],\n \"web_url\": \"https://gitlab.example.com/group/project/-/merge_requests/42\"\n}\n```\n\n## Edge Cases\n- `locked` state is transitional (merge in progress) - rare but valid\n- Some older instances may not return `detailed_merge_status`\n- Some older instances may not return `merge_user` (use `merged_by` fallback)\n- `work_in_progress` is deprecated but still returned by some instances","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:40.498088Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:08:35.520229Z","closed_at":"2026-01-27T00:08:35.520167Z","close_reason":"Added GitLabMergeRequest, GitLabReviewer, GitLabReferences structs. Updated GitLabNotePosition with position_type, line_range, and SHA triplet fields. All 23 type tests passing.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-5ta","depends_on_id":"bd-3ir","type":"blocks","created_at":"2026-01-26T22:08:53.981911Z","created_by":"tayloreernisse"}]}
{"id":"bd-88m","title":"[CP1] Issue ingestion module","description":"Fetch and store issues with cursor-based incremental sync.\n\n## Module\nsrc/ingestion/issues.rs\n\n## Key Structs\n\n### IngestIssuesResult\n- fetched: usize\n- upserted: usize\n- labels_created: usize\n- issues_needing_discussion_sync: Vec<IssueForDiscussionSync>\n\n### IssueForDiscussionSync\n- local_issue_id: i64\n- iid: i64\n- updated_at: i64\n\n## Main Function\npub async fn ingest_issues(conn, client, config, project_id, gitlab_project_id) -> Result<IngestIssuesResult>\n\n## Logic\n1. Get current cursor from sync_cursors (updated_at_cursor, tie_breaker_id)\n2. Paginate through issues updated after cursor with cursor_rewind_seconds\n3. Apply local filtering for tuple cursor semantics:\n - Skip if issue.updated_at < cursor_updated_at\n - Skip if issue.updated_at == cursor_updated_at AND issue.id <= cursor_gitlab_id\n4. For each issue passing filter:\n - Begin transaction\n - Store raw payload (compressed)\n - Transform and upsert issue\n - Clear existing label links (DELETE FROM issue_labels)\n - Extract and upsert labels\n - Link issue to labels via junction\n - Commit transaction\n - Track for discussion sync eligibility\n5. Incremental cursor update every 100 issues\n6. Final cursor update\n7. 
Determine issues needing discussion sync: where updated_at > discussions_synced_for_updated_at\n\n## Helper Functions\n- get_cursor(conn, project_id) -> (Option<i64>, Option<i64>)\n- get_discussions_synced_at(conn, issue_id) -> Option<i64>\n- upsert_issue(conn, issue, payload_id) -> usize\n- get_local_issue_id(conn, gitlab_id) -> i64\n- clear_issue_labels(conn, issue_id)\n- upsert_label(conn, label) -> bool\n- get_label_id(conn, project_id, name) -> i64\n- link_issue_label(conn, issue_id, label_id)\n- update_cursor(conn, project_id, resource_type, updated_at, gitlab_id)\n\nFiles: src/ingestion/mod.rs, src/ingestion/issues.rs\nTests: tests/issue_ingestion_tests.rs\nDone when: Issues, labels, issue_labels populated correctly with resumable cursor","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:35.655708Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.806982Z","deleted_at":"2026-01-25T17:02:01.806977Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-8t4","title":"Extract cross-references from resource_state_events","description":"## Background\nresource_state_events includes source_merge_request (with iid) for 'closed by MR' events. After state events are stored (Gate 1), post-processing extracts these into entity_references for the cross-reference graph.\n\n## Approach\nCreate src/core/references.rs (new module) or add to events_db.rs:\n\n```rust\n/// Extract cross-references from stored state events and insert into entity_references.\n/// Looks for state events with source_merge_request_id IS NOT NULL (meaning \"closed by MR\").\n/// \n/// Directionality: source = MR (that caused the close), target = issue (that was closed)\npub fn extract_refs_from_state_events(\n conn: &Connection,\n project_id: i64,\n) -> Result<usize> // returns count of new references inserted\n```\n\nSQL logic:\n```sql\nINSERT OR IGNORE INTO entity_references (\n source_entity_type, source_entity_id,\n target_entity_type, target_entity_id,\n reference_type, source_method, created_at\n)\nSELECT\n 'merge_request',\n mr.id,\n 'issue',\n rse.issue_id,\n 'closes',\n 'api_state_event',\n rse.created_at\nFROM resource_state_events rse\nJOIN merge_requests mr ON mr.project_id = rse.project_id AND mr.iid = rse.source_merge_request_id\nWHERE rse.source_merge_request_id IS NOT NULL\n AND rse.issue_id IS NOT NULL\n AND rse.project_id = ?1;\n```\n\nKey: source_merge_request_id stores the MR iid, so we JOIN on merge_requests.iid to get the local DB id.\n\nRegister in src/core/mod.rs: `pub mod references;`\n\nCall this after drain_dependent_queue in the sync pipeline (after all state events are stored).\n\n## Acceptance Criteria\n- [ ] State events with source_merge_request_id produce 'closes' references\n- [ ] Source = MR (resolved by iid), target = issue\n- [ ] source_method = 'api_state_event'\n- [ ] INSERT OR IGNORE prevents duplicates with api_closes_issues data\n- [ ] Returns count of newly inserted references\n- [ ] No-op when no 
state events have source_merge_request_id\n\n## Files\n- src/core/references.rs (new)\n- src/core/mod.rs (add `pub mod references;`)\n- src/cli/commands/sync.rs (call after drain step)\n\n## TDD Loop\nRED: tests/references_tests.rs:\n- `test_extract_refs_from_state_events_basic` - seed a \"closed\" state event with source_merge_request_id, verify entity_reference created\n- `test_extract_refs_dedup_with_closes_issues` - insert ref from closes_issues API first, verify state event extraction doesn't duplicate\n- `test_extract_refs_no_source_mr` - state events without source_merge_request_id produce no refs\n\nSetup: create_test_db with migrations 001-011, seed project + issue + MR + state events.\n\nGREEN: Implement extract_refs_from_state_events\n\nVERIFY: `cargo test references -- --nocapture`\n\n## Edge Cases\n- source_merge_request_id may reference an MR not synced locally (cross-project close) — the JOIN will produce no match, which is correct behavior (ref simply not created)\n- Multiple state events can reference the same MR for the same issue (reopen + re-close) — INSERT OR IGNORE handles dedup\n- The merge_requests table might not have the MR yet if sync is still running — call this after all dependent fetches complete","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:32:33.619606Z","created_by":"tayloreernisse","updated_at":"2026-02-04T20:13:28.219791Z","closed_at":"2026-02-04T20:13:28.219633Z","compaction_level":0,"original_size":0,"labels":["extraction","gate-2","phase-b"],"dependencies":[{"issue_id":"bd-8t4","depends_on_id":"bd-1ep","type":"blocks","created_at":"2026-02-02T21:32:42.945176Z","created_by":"tayloreernisse"},{"issue_id":"bd-8t4","depends_on_id":"bd-1se","type":"parent-child","created_at":"2026-02-02T21:32:33.621025Z","created_by":"tayloreernisse"},{"issue_id":"bd-8t4","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T22:41:50.562935Z","created_by":"tayloreernisse"}]}
{"id":"bd-9av","title":"[CP1] gi sync-status enhancement","description":"Enhance sync-status from CP0 stub to show issue cursors.\n\n## Changes to src/cli/commands/sync_status.rs\n\nUpdate the existing stub to show:\n- Last run timestamp and duration\n- Cursor positions per project (issues resource_type)\n- Entity counts (issues, discussions, notes)\n\n## Output Format\nLast sync: 2026-01-25 10:30:00 (succeeded, 45s)\n\nCursors:\n group/project-one\n issues: 2026-01-25T10:25:00Z (gitlab_id: 12345678)\n\nCounts:\n Issues: 1,234\n Discussions: 5,678\n Notes: 23,456 (4,567 system)\n\nFiles: src/cli/commands/sync_status.rs\nDone when: Shows cursor positions and counts after ingestion","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:58:27.246825Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.968507Z","deleted_at":"2026-01-25T17:02:01.968503Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-9dd","title":"Implement 'lore trace' command with human and robot output","description":"## Background\n\nThe trace command is the capstone CLI of Gate 5. It answers \"Why was this code introduced?\" by tracing from a file path through the MR that modified it, to the issue that motivated the MR, to the discussions that contain the decision rationale. It combines data from mr_file_changes (Gate 4), entity_references (Gate 2), and discussions/notes (existing).\n\n## Approach\n\n### 1. Add Trace subcommand in `src/cli/mod.rs`\n\n```rust\n/// Trace the decision chain behind a file: file -> MR -> issue -> discussions\nTrace(TraceArgs),\n```\n\n```rust\n#[derive(Parser)]\npub struct TraceArgs {\n /// File path to trace (supports :line suffix for future Tier 2)\n pub path: String,\n\n /// Filter by project path\n #[arg(short = 'p', long, help_heading = \"Filters\")]\n pub project: Option<String>,\n\n /// Include discussion snippets in output\n #[arg(long, help_heading = \"Output\", overrides_with = \"no_discussions\")]\n pub discussions: bool,\n\n #[arg(long = \"no-discussions\", hide = true, overrides_with = \"discussions\")]\n pub no_discussions: bool,\n\n /// Maximum trace chains to return\n #[arg(short = 'n', long = \"limit\", default_value = \"20\", help_heading = \"Output\")]\n pub limit: usize,\n}\n```\n\n### 2. Path Parsing\n\nSupport `src/foo.rs:45` syntax:\n```rust\nfn parse_trace_path(input: &str) -> (String, Option<u32>) {\n if let Some((path, line)) = input.rsplit_once(':') {\n if let Ok(line_num) = line.parse::<u32>() {\n return (path.to_string(), Some(line_num));\n }\n }\n (input.to_string(), None)\n}\n```\n\nIf line number is present, emit warning: \"Line-level tracing requires Tier 2 (git2 integration). Showing file-level results.\"\n\n### 3. Handler in `src/main.rs`\n\n```rust\nSome(Commands::Trace(args)) => handle_trace(cli.config.as_deref(), args, robot_mode),\n```\n\n### 4. 
Human Output\n\n```\nTrace: src/auth/oauth.rs (3 chains)\n\nChain 1: MR !456 \"Implement OAuth2 flow\" (@alice, merged 2024-01-22)\n Issue: #123 \"Add OAuth2 support\" (closed)\n Discussions:\n @carol (2024-01-19): \"We should use PKCE here because it prevents...\"\n @dave (2024-01-20): \"Agreed. Also need to handle token refresh...\"\n\nChain 2: MR !489 \"Fix OAuth token expiry\" (@bob, merged 2024-02-15)\n Issue: #145 \"OAuth tokens expire silently\" (closed)\n (no discussions with DiffNote on this file)\n```\n\n### 5. Robot JSON\n\n```json\n{\n \"ok\": true,\n \"data\": {\n \"path\": \"src/auth/oauth.rs\",\n \"resolved_paths\": [\"src/auth/oauth.rs\", \"src/authentication/oauth.rs\"],\n \"trace_chains\": [\n {\n \"merge_request\": { \"iid\": 456, \"title\": \"...\", \"state\": \"merged\", ... },\n \"issues\": [\n { \"iid\": 123, \"title\": \"...\", \"state\": \"closed\", ... }\n ],\n \"discussions\": [\n { \"author\": \"carol\", \"body_snippet\": \"...\", \"created_at\": \"...\" }\n ]\n }\n ]\n },\n \"meta\": {\n \"tier\": \"api_only\",\n \"line_requested\": null,\n \"rename_chain_length\": 2,\n \"total_chains\": 3,\n \"showing\": 3\n }\n}\n```\n\n## Acceptance Criteria\n\n- [ ] `lore trace src/foo.rs` works with human output\n- [ ] `lore --robot trace src/foo.rs` works with JSON output\n- [ ] Path with `:line` suffix parses correctly and emits Tier 2 warning\n- [ ] `-p group/repo` filters to single project\n- [ ] `--discussions` includes discussion snippets\n- [ ] `-n 5` limits to 5 chains\n- [ ] Rename-aware: traces through renamed files\n- [ ] Robot JSON includes `meta.tier: \"api_only\"` and `meta.line_requested`\n- [ ] Trace command appears in `lore robot-docs` output\n- [ ] Added to VALID_COMMANDS for fuzzy matching\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/cli/mod.rs` (add TraceArgs struct + Commands::Trace variant)\n- `src/cli/commands/trace.rs` (NEW - handler + output 
functions)\n- `src/cli/commands/mod.rs` (re-export trace functions)\n- `src/main.rs` (add handle_trace + match arm + VALID_COMMANDS + robot-docs)\n\n## TDD Loop\n\nRED: No unit tests for CLI wiring. Create one test for path parsing:\n- `test_parse_trace_path_simple` - \"src/foo.rs\" -> (\"src/foo.rs\", None)\n- `test_parse_trace_path_with_line` - \"src/foo.rs:42\" -> (\"src/foo.rs\", Some(42))\n- `test_parse_trace_path_colon_in_path` - \"C:/foo.rs\" -> (\"C:/foo.rs\", None)\n\nGREEN: Implement CLI wiring and handlers.\n\nVERIFY: `cargo check --all-targets && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n\n- File not found in mr_file_changes: \"No MR history found for 'src/foo.rs'\" (exit 0, not error)\n- Ambiguous project: exit 18 with suggestions\n- Line number in path with Windows-style path (C:\\foo:42): don't misparse the drive letter\n- Very deep rename chain: bounded by resolve_rename_chain's max_hops","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:32.788530Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:52:06.287955Z","compaction_level":0,"original_size":0,"labels":["cli","gate-5","phase-b"],"dependencies":[{"issue_id":"bd-9dd","depends_on_id":"bd-1ht","type":"parent-child","created_at":"2026-02-02T21:34:32.789920Z","created_by":"tayloreernisse"},{"issue_id":"bd-9dd","depends_on_id":"bd-2n4","type":"blocks","created_at":"2026-02-02T21:34:37.941327Z","created_by":"tayloreernisse"}]}
{"id":"bd-am7","title":"Implement embedding pipeline with chunking","description":"## Background\nThe embedding pipeline takes documents, chunks them (paragraph-boundary splitting with overlap), sends chunks to Ollama for embedding via async HTTP, and stores vectors in sqlite-vec + metadata. It uses keyset pagination, concurrent HTTP requests via FuturesUnordered, per-batch transactions, and dimension validation.\n\n## Approach\nCreate \\`src/embedding/pipeline.rs\\` per PRD Section 4.4. **The pipeline is async.**\n\n**Constants (per PRD):**\n```rust\nconst BATCH_SIZE: usize = 32; // texts per Ollama API call\nconst DB_PAGE_SIZE: usize = 500; // keyset pagination page size\nconst EXPECTED_DIMS: usize = 768; // nomic-embed-text dimensions\nconst CHUNK_MAX_CHARS: usize = 32_000; // max chars per chunk\nconst CHUNK_OVERLAP_CHARS: usize = 500; // overlap between chunks\n```\n\n**Core async function:**\n```rust\npub async fn embed_documents(\n conn: &Connection,\n client: &OllamaClient,\n selection: EmbedSelection,\n concurrency: usize, // max in-flight HTTP requests\n progress_callback: Option<Box<dyn Fn(usize, usize)>>,\n) -> Result<EmbedResult>\n```\n\n**EmbedSelection:** Pending | RetryFailed\n**EmbedResult:** { embedded, failed, skipped }\n\n**Algorithm (per PRD):**\n1. count_pending_documents(conn, selection) for progress total\n2. Keyset pagination loop: find_pending_documents(conn, DB_PAGE_SIZE, last_id, selection)\n3. For each page:\n a. Begin transaction\n b. For each doc: clear_document_embeddings(&tx, doc.id), split_into_chunks(&doc.content)\n c. Build ChunkWork items with doc_hash + chunk_hash\n d. Commit clearing transaction\n4. Batch ChunkWork texts into Ollama calls (BATCH_SIZE=32)\n5. Use **FuturesUnordered** for concurrent HTTP, cap at \\`concurrency\\`\n6. collect_writes() in per-batch transactions: validate dims (768), store LE bytes, write metadata\n7. On error: record_embedding_error per chunk (not abort)\n8. 
Advance keyset cursor\n\n**ChunkWork struct:**\n```rust\nstruct ChunkWork {\n doc_id: i64,\n chunk_index: usize,\n doc_hash: String, // SHA-256 of FULL document (staleness detection)\n chunk_hash: String, // SHA-256 of THIS chunk (provenance)\n text: String,\n}\n```\n\n**Splitting:** split_into_chunks(content) -> Vec<(usize, String)>\n- Documents <= CHUNK_MAX_CHARS: single chunk (index 0)\n- Longer: split at paragraph boundaries (\\\\n\\\\n), fallback to sentence/word, with CHUNK_OVERLAP_CHARS overlap\n\n**Storage:** embeddings as raw LE bytes, rowid = encode_rowid(doc_id, chunk_idx)\n**Staleness detection:** uses document_hash (not chunk_hash) because it's document-level\n\nAlso create \\`src/embedding/change_detector.rs\\` (referenced in PRD module structure):\n```rust\npub fn detect_embedding_changes(conn: &Connection) -> Result<Vec<PendingDocument>>;\n```\n\n## Acceptance Criteria\n- [ ] Pipeline is async (uses FuturesUnordered for concurrent HTTP)\n- [ ] concurrency parameter caps in-flight HTTP requests\n- [ ] progress_callback reports (processed, total)\n- [ ] New documents embedded, changed re-embedded, unchanged skipped\n- [ ] clear_document_embeddings before re-embedding (range delete vec0 + metadata)\n- [ ] Chunking at paragraph boundaries with 500-char overlap\n- [ ] Short documents (<32k chars) produce exactly 1 chunk\n- [ ] Embeddings stored as raw LE bytes in vec0\n- [ ] Rowids encoded via encode_rowid(doc_id, chunk_index)\n- [ ] Dimension validation: 768 floats per embedding (mismatch -> record error, not store)\n- [ ] Per-batch transactions for writes\n- [ ] Errors recorded in embedding_metadata per chunk (last_error, attempt_count)\n- [ ] Keyset pagination (d.id > last_id, not OFFSET)\n- [ ] Pending detection uses document_hash (not chunk_hash)\n- [ ] \\`cargo build\\` succeeds\n\n## Files\n- \\`src/embedding/pipeline.rs\\` — new file (async)\n- \\`src/embedding/change_detector.rs\\` — new file\n- \\`src/embedding/mod.rs\\` — add \\`pub mod 
pipeline; pub mod change_detector;\\` + re-exports\n\n## TDD Loop\nRED: Unit tests for chunking:\n- \\`test_short_document_single_chunk\\` — <32k produces [(0, full_content)]\n- \\`test_long_document_multiple_chunks\\` — >32k splits at paragraph boundaries\n- \\`test_chunk_overlap\\` — adjacent chunks share 500-char overlap\n- \\`test_no_paragraph_boundary\\` — falls back to char boundary\nIntegration tests need Ollama or mock.\nGREEN: Implement split_into_chunks, embed_documents (async)\nVERIFY: \\`cargo test pipeline\\`\n\n## Edge Cases\n- Empty document content_text: skip (don't embed)\n- No paragraph boundaries: split at CHUNK_MAX_CHARS with overlap\n- Ollama error for one batch: record error per chunk, continue with next batch\n- Dimension mismatch (model returns 512 instead of 768): record error, don't store corrupt data\n- Document deleted between pagination and embedding: skip gracefully","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:34.093701Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:58:58.908585Z","closed_at":"2026-01-30T17:58:58.908525Z","close_reason":"Implemented embedding pipeline: chunking at paragraph boundaries with 500-char overlap, change detector (keyset pagination, hash-based staleness), async embed via Ollama with batch processing, dimension validation, per-chunk error recording, LE byte vector storage. 7 chunking tests pass. 289 total tests.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-am7","depends_on_id":"bd-1y8","type":"blocks","created_at":"2026-01-30T15:29:24.697418Z","created_by":"tayloreernisse"},{"issue_id":"bd-am7","depends_on_id":"bd-2ac","type":"blocks","created_at":"2026-01-30T15:29:24.732567Z","created_by":"tayloreernisse"},{"issue_id":"bd-am7","depends_on_id":"bd-335","type":"blocks","created_at":"2026-01-30T15:29:24.660199Z","created_by":"tayloreernisse"}]}
{"id":"bd-apmo","title":"OBSERV: Create migration 014 for sync_runs enrichment","description":"## Background\nThe sync_runs table (created in migration 001) has columns id, started_at, heartbeat_at, finished_at, status, command, error, metrics_json but NOTHING writes to it. This migration adds columns for the observability correlation ID and aggregate counts, enabling queryable sync history.\n\n## Approach\nCreate migrations/014_sync_runs_enrichment.sql:\n\n```sql\n-- Migration 014: sync_runs enrichment for observability\n-- Adds correlation ID and aggregate counts for queryable sync history\n\nALTER TABLE sync_runs ADD COLUMN run_id TEXT;\nALTER TABLE sync_runs ADD COLUMN total_items_processed INTEGER DEFAULT 0;\nALTER TABLE sync_runs ADD COLUMN total_errors INTEGER DEFAULT 0;\n\n-- Index for correlation queries (find run by run_id from logs)\nCREATE INDEX IF NOT EXISTS idx_sync_runs_run_id ON sync_runs(run_id);\n```\n\nMigration naming convention: check migrations/ directory. Current latest is 013_resource_event_watermarks.sql. Next is 014.\n\nNote: SQLite ALTER TABLE ADD COLUMN is always safe -- it sets NULL for existing rows. 
DEFAULT 0 applies to new INSERTs only.\n\n## Acceptance Criteria\n- [ ] Migration 014 applies cleanly on a fresh DB (all migrations 001-014)\n- [ ] Migration 014 applies cleanly on existing DB with 001-013 already applied\n- [ ] sync_runs table has run_id TEXT column\n- [ ] sync_runs table has total_items_processed INTEGER DEFAULT 0 column\n- [ ] sync_runs table has total_errors INTEGER DEFAULT 0 column\n- [ ] idx_sync_runs_run_id index exists\n- [ ] Existing sync_runs rows (if any) have NULL run_id, 0 for counts\n- [ ] cargo clippy --all-targets -- -D warnings passes (no code changes, but verify migration is picked up)\n\n## Files\n- migrations/014_sync_runs_enrichment.sql (new file)\n\n## TDD Loop\nRED:\n - test_migration_014_applies: apply all migrations on fresh in-memory DB, query sync_runs schema\n - test_migration_014_idempotent: CREATE INDEX IF NOT EXISTS makes re-run safe; ALTER TABLE ADD COLUMN is NOT idempotent in SQLite (will error). Consider: skip this test or use IF NOT EXISTS workaround\nGREEN: Create migration file\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- ALTER TABLE ADD COLUMN in SQLite: NOT idempotent. Running migration twice will error \"duplicate column name.\" The migration system should prevent re-runs, but IF NOT EXISTS is not available for ALTER TABLE in SQLite. Rely on migration tracking.\n- Migration numbering conflict: if another PR adds 014 first, renumber to 015. Check before merging.\n- metrics_json already exists (from migration 001): we don't touch it. 
The new columns supplement it with queryable aggregates.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:51.311879Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:34:05.309761Z","closed_at":"2026-02-04T17:34:05.309714Z","close_reason":"Created migration 014 adding run_id TEXT, total_items_processed INTEGER, total_errors INTEGER to sync_runs, with idx_sync_runs_run_id index","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-apmo","depends_on_id":"bd-3pz","type":"parent-child","created_at":"2026-02-04T15:54:51.314770Z","created_by":"tayloreernisse"}]}
{"id":"bd-bjo","title":"Implement vector search function","description":"## Background\nVector search queries the sqlite-vec virtual table for nearest-neighbor documents. Because documents may have multiple chunks, the raw KNN results need deduplication by document_id (keeping the best/lowest distance per document). The function over-fetches 3x to ensure enough unique documents after dedup.\n\n## Approach\nCreate `src/search/vector.rs`:\n\n```rust\npub struct VectorResult {\n pub document_id: i64,\n pub distance: f64, // Lower = closer match\n}\n\n/// Search documents using sqlite-vec KNN query.\n/// Over-fetches 3x limit to handle chunk dedup.\npub fn search_vector(\n conn: &Connection,\n query_embedding: &[f32], // 768-dim embedding of search query\n limit: usize,\n) -> Result<Vec<VectorResult>>\n```\n\n**SQL (KNN query):**\n```sql\nSELECT rowid, distance\nFROM embeddings\nWHERE embedding MATCH ?\n AND k = ?\nORDER BY distance\n```\n\n**Algorithm:**\n1. Convert query_embedding to raw LE bytes\n2. Execute KNN with k = limit * 3 (over-fetch for dedup)\n3. Decode each rowid via decode_rowid() -> (document_id, chunk_index)\n4. Group by document_id, keep minimum distance (best chunk)\n5. Sort by distance ascending\n6. 
Take first `limit` results\n\n## Acceptance Criteria\n- [ ] Returns deduplicated document-level results (not chunk-level)\n- [ ] Best chunk distance kept per document (lowest distance wins)\n- [ ] KNN with k parameter (3x limit)\n- [ ] Query embedding passed as raw LE bytes\n- [ ] Results sorted by distance ascending (closest first)\n- [ ] Returns at most `limit` results\n- [ ] Empty embeddings table returns empty Vec\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/search/vector.rs` — new file\n- `src/search/mod.rs` — add `pub use vector::{search_vector, VectorResult};`\n\n## TDD Loop\nRED: Integration tests need sqlite-vec + seeded embeddings:\n- `test_vector_search_basic` — finds nearest document\n- `test_vector_search_dedup` — multi-chunk doc returns once with best distance\n- `test_vector_search_empty` — empty table returns empty\n- `test_vector_search_limit` — respects limit parameter\nGREEN: Implement search_vector\nVERIFY: `cargo test vector`\n\n## Edge Cases\n- All chunks belong to same document: returns single result\n- Query embedding wrong dimension: sqlite-vec may error — handle gracefully\n- Over-fetch returns fewer than limit unique docs: return what we have\n- Distance = 0.0: exact match (valid result)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:50.270357Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:44:56.233611Z","closed_at":"2026-01-30T17:44:56.233512Z","close_reason":"Implemented search_vector with KNN query, 3x over-fetch, chunk dedup. 3 tests pass.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-bjo","depends_on_id":"bd-1y8","type":"blocks","created_at":"2026-01-30T15:29:24.842469Z","created_by":"tayloreernisse"},{"issue_id":"bd-bjo","depends_on_id":"bd-2ac","type":"blocks","created_at":"2026-01-30T15:29:24.878048Z","created_by":"tayloreernisse"}]}
{"id":"bd-cbo","title":"[CP1] Cargo.toml updates - async-stream and futures","description":"Add required dependencies for async pagination streams.\n\n## Changes\nAdd to Cargo.toml:\n- async-stream = \"0.3\"\n- futures = \"0.3\"\n\n## Why\nThe pagination methods use async generators which require async-stream crate.\nfutures crate provides StreamExt for consuming the streams.\n\n## Done When\n- cargo check passes with new deps\n- No unused dependency warnings\n\nFiles: Cargo.toml","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:31.143927Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.661666Z","deleted_at":"2026-01-25T17:02:01.661662Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-cq2","title":"[CP1] Integration tests for label linkage","description":"Integration tests verifying label linkage and stale removal.\n\n## Tests (tests/label_linkage_tests.rs)\n\n- clears_existing_labels_before_linking_new_set\n- removes_stale_label_links_on_issue_update\n- handles_issue_with_all_labels_removed\n- preserves_labels_that_still_exist\n\n## Test Scenario\n1. Create issue with labels [A, B]\n2. Verify issue_labels has links to A and B\n3. Update issue with labels [B, C]\n4. Verify A link removed, B preserved, C added\n\n## Why This Matters\nThe clear-and-relink pattern ensures GitLab reality is reflected locally.\nIf we only INSERT, removed labels would persist incorrectly.\n\nFiles: tests/label_linkage_tests.rs\nDone when: Stale label links correctly removed on resync","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:10.665771Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.062192Z","deleted_at":"2026-01-25T17:02:02.062188Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-czk","title":"Add entity_references table to migration 010","description":"## Background\nThe entity_references table is now part of migration 011 (combined with resource event tables and dependent fetch queue). This bead is satisfied by bd-hu3 since the entity_references table schema is included in the same migration.\n\n## Approach\nThis bead's work is folded into bd-hu3 (Write migration 011). The entity_references table from Phase B spec §2.2 is included in migrations/011_resource_events.sql alongside the event tables and queue.\n\nThe entity_references schema includes:\n- source/target entity type + id with reference_type and source_method\n- Unresolved reference support (target_entity_id NULL with target_project_path + target_entity_iid)\n- UNIQUE constraint using COALESCE for nullable columns\n- Partial indexes for source, target (where not null), and unresolved refs\n\nNo separate migration file needed — this is in 011.\n\n## Acceptance Criteria\n- [ ] entity_references table exists in migration 011 (verified by bd-hu3)\n- [ ] UNIQUE constraint handles NULL columns via COALESCE\n- [ ] Indexes created: source composite, target composite (partial), unresolved (partial)\n- [ ] reference_type CHECK includes 'closes', 'mentioned', 'related'\n- [ ] source_method CHECK includes 'api_closes_issues', 'api_state_event', 'system_note_parse'\n\n## Files\n- migrations/011_resource_events.sql (part of bd-hu3)\n\n## TDD Loop\nCovered by bd-hu3's test_migration_011_entity_references_dedup test.\n\nVERIFY: `cargo test migration_tests -- --nocapture`\n\n## Edge Cases\n- Same as bd-hu3's entity_references edge cases","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:32:33.506883Z","created_by":"tayloreernisse","updated_at":"2026-02-02T22:42:06.104237Z","closed_at":"2026-02-02T22:42:06.104190Z","close_reason":"Work folded into bd-hu3 (migration 011 includes entity_references 
table)","compaction_level":0,"original_size":0,"labels":["gate-2","phase-b","schema"]}
{"id":"bd-dty","title":"Implement timeline robot mode JSON output","description":"## Background\n\nThe robot mode JSON output for timeline must follow the established pattern from other commands: `{ok: true, data: {...}, meta: {...}}`. This bead defines the exact JSON schema and implements the serialization.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 3.5 (Robot Mode JSON). The JSON output structure MUST match the spec exactly -- this is the contract for AI agent consumers.\n\n## Approach\n\nCreate `print_timeline_json()` in `src/cli/commands/timeline.rs`:\n\n```rust\n#[derive(Serialize)]\nstruct TimelineJsonOutput {\n ok: bool,\n data: TimelineJsonData,\n meta: TimelineJsonMeta,\n}\n\n#[derive(Serialize)]\nstruct TimelineJsonData {\n query: String,\n event_count: usize,\n seed_entities: Vec<SeedEntityJson>,\n expanded_entities: Vec<ExpandedEntityJson>,\n unresolved_references: Vec<UnresolvedRefJson>,\n events: Vec<TimelineEventJson>,\n}\n\n#[derive(Serialize)]\nstruct TimelineJsonMeta {\n search_mode: String, // \"lexical\"\n expansion_depth: u32,\n expand_mentions: bool,\n total_entities: usize, // seeds + expanded\n total_events: usize, // before limit\n evidence_notes_included: usize,\n unresolved_references: usize,\n showing: usize, // after limit\n}\n\n/// Spec Section 3.5: seed_entities format\n#[derive(Serialize)]\nstruct SeedEntityJson {\n #[serde(rename = \"type\")]\n entity_type: String, // \"issue\" | \"merge_request\"\n iid: i64,\n project: String, // NOTE: \"project\" not \"project_path\"\n}\n\n/// Spec Section 3.5: expanded_entities format with nested \"via\" object\n#[derive(Serialize)]\nstruct ExpandedEntityJson {\n #[serde(rename = \"type\")]\n entity_type: String,\n iid: i64,\n project: String,\n depth: u32,\n via: ViaJson, // NESTED per spec -- not flat fields\n}\n\n/// Spec Section 3.5: the \"via\" provenance object\n#[derive(Serialize)]\nstruct ViaJson {\n from: SeedEntityJson, // which entity referenced this one\n 
reference_type: String, // \"closes\", \"mentioned\", \"related\"\n source_method: String, // \"api\", \"note_parse\", etc.\n}\n\n/// Spec Section 3.5: unresolved_references format\n#[derive(Serialize)]\nstruct UnresolvedRefJson {\n source: SeedEntityJson, // entity containing the reference\n target_project: String,\n target_type: String,\n target_iid: i64,\n reference_type: String,\n}\n\n/// Spec Section 3.5: events array element format\n#[derive(Serialize)]\nstruct TimelineEventJson {\n timestamp: String, // ISO 8601\n entity_type: String,\n entity_iid: i64,\n project: String, // NOTE: \"project\" not \"project_path\"\n event_type: String, // flat string: \"created\", \"state_changed\", etc.\n summary: String,\n actor: Option<String>,\n url: Option<String>, // web URL for the event (spec requirement)\n is_seed: bool,\n details: Option<serde_json::Value>, // event-type-specific details (spec Section 3.5)\n}\n```\n\n### JSON Key Names (from spec)\n\nThe spec uses `\"project\"` not `\"project_path\"`, and `\"type\"` not `\"entity_type\"` at the top level of entity objects. Use `#[serde(rename = \"type\")]` to achieve this.\n\n### Details Object\n\nThe spec (Section 3.5) shows a `details` object on events with type-specific fields:\n- created: `{ \"labels\": [...] }`\n- note_evidence: `{ \"note_id\": 12345, \"snippet\": \"...\" }`\n- state_changed: `{ \"state\": \"closed\" }`\n- label_added: `{ \"label\": \"security-review\" }`\n\n### Timestamp Conversion\n\nAll internal timestamps are ms epoch UTC. 
Robot output uses ISO 8601:\n```rust\nuse crate::core::time::ms_to_iso;\nlet iso = ms_to_iso(event.timestamp); // \"2024-01-15T10:30:00Z\"\n```\n\n### Example Output (from spec Section 3.5)\n\n```json\n{\n \"ok\": true,\n \"data\": {\n \"query\": \"auth migration\",\n \"event_count\": 12,\n \"seed_entities\": [\n { \"type\": \"issue\", \"iid\": 234, \"project\": \"group/repo\" }\n ],\n \"expanded_entities\": [\n {\n \"type\": \"issue\", \"iid\": 299, \"project\": \"group/repo\",\n \"depth\": 1,\n \"via\": {\n \"from\": { \"type\": \"merge_request\", \"iid\": 567, \"project\": \"group/repo\" },\n \"reference_type\": \"closes\",\n \"source_method\": \"api_closes_issues\"\n }\n }\n ],\n \"unresolved_references\": [\n {\n \"source\": { \"type\": \"merge_request\", \"iid\": 567, \"project\": \"group/repo\" },\n \"target_project\": \"group/other-repo\",\n \"target_type\": \"issue\",\n \"target_iid\": 42,\n \"reference_type\": \"mentioned\"\n }\n ],\n \"events\": [\n {\n \"timestamp\": \"2024-03-15T10:00:00Z\",\n \"entity_type\": \"issue\",\n \"entity_iid\": 234,\n \"project\": \"group/repo\",\n \"event_type\": \"created\",\n \"summary\": \"Migrate to OAuth2\",\n \"actor\": \"alice\",\n \"url\": \"https://gitlab.com/group/repo/-/issues/234\",\n \"is_seed\": true,\n \"details\": { \"labels\": [\"auth\", \"breaking-change\"] }\n }\n ]\n },\n \"meta\": {\n \"search_mode\": \"lexical\",\n \"expansion_depth\": 1,\n \"expand_mentions\": false,\n \"total_entities\": 3,\n \"total_events\": 12,\n \"evidence_notes_included\": 4,\n \"unresolved_references\": 1\n }\n}\n```\n\n## Acceptance Criteria\n\n- [ ] `lore --robot timeline \"query\"` outputs valid JSON to stdout\n- [ ] JSON matches the `{ok, data, meta}` envelope pattern\n- [ ] All timestamps in ISO 8601 format (not ms epoch)\n- [ ] Entity objects use `\"type\"` key (not `\"entity_type\"`) and `\"project\"` key (not `\"project_path\"`) per spec\n- [ ] `expanded_entities` uses nested `\"via\"` object with `\"from\"`, 
`\"reference_type\"`, `\"source_method\"` per spec Section 3.5\n- [ ] `unresolved_references` includes `\"source\"` entity object per spec\n- [ ] Events include `\"url\"` field when available\n- [ ] Events include `\"details\"` object with event-type-specific fields per spec\n- [ ] meta.total_events = count before limit; meta.showing = count after limit\n- [ ] meta.evidence_notes_included counts NoteEvidence events\n- [ ] meta.search_mode = \"lexical\" (FTS5-based)\n- [ ] unresolved_references listed in both data and meta (data has details, meta has count)\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/cli/commands/timeline.rs` (add print_timeline_json function + JSON structs)\n- `src/cli/commands/mod.rs` (re-export print_timeline_json)\n\n## TDD Loop\n\nRED: No unit test for JSON output shape -- verify with:\n```bash\nlore --robot timeline \"test\" | jq '.data.expanded_entities[0].via.from'\nlore --robot timeline \"test\" | jq '.data.events[0].url'\nlore --robot timeline \"test\" | jq '.data.events[0].details'\nlore --robot timeline \"test\" | jq '.meta'\n```\n\nGREEN: Implement the JSON structs and serialization.\n\nVERIFY: `cargo check --all-targets && lore --robot timeline \"test\" | python3 -m json.tool`\n\n## Edge Cases\n\n- Empty results: events array is [], meta.showing=0, meta.total_events=0\n- Very long query string: include full query in data.query (no truncation)\n- Unicode in event summaries: serde handles UTF-8 natively\n- Null actor: serializes as `\"actor\": null` (not omitted)\n- Null url: serializes as `\"url\": null` (not omitted)\n- Events without details: `\"details\": null`\n- source_method value: use actual DB values (\"api\", \"note_parse\", \"description_parse\") not spec's original values -- migration 011 CHECK constraint is the source of 
truth","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:28.374690Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:09:40.287155Z","compaction_level":0,"original_size":0,"labels":["cli","gate-3","phase-b","robot-mode"],"dependencies":[{"issue_id":"bd-dty","depends_on_id":"bd-3as","type":"blocks","created_at":"2026-02-02T21:33:37.703617Z","created_by":"tayloreernisse"},{"issue_id":"bd-dty","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:28.377349Z","created_by":"tayloreernisse"}]}
{"id":"bd-epj","title":"[CP0] Config loading with Zod validation","description":"## Background\n\nConfig loading is critical infrastructure - every CLI command needs the config. Uses Zod for schema validation with sensible defaults. Must handle missing files gracefully with typed errors.\n\nReference: docs/prd/checkpoint-0.md sections \"Configuration Schema\", \"Config Resolution Order\"\n\n## Approach\n\n**src/core/config.ts:**\n```typescript\nimport { z } from 'zod';\nimport { readFileSync } from 'node:fs';\nimport { ConfigNotFoundError, ConfigValidationError } from './errors';\nimport { getConfigPath } from './paths';\n\nexport const ConfigSchema = z.object({\n gitlab: z.object({\n baseUrl: z.string().url(),\n tokenEnvVar: z.string().default('GITLAB_TOKEN'),\n }),\n projects: z.array(z.object({\n path: z.string().min(1),\n })).min(1),\n sync: z.object({\n backfillDays: z.number().int().positive().default(14),\n staleLockMinutes: z.number().int().positive().default(10),\n heartbeatIntervalSeconds: z.number().int().positive().default(30),\n cursorRewindSeconds: z.number().int().nonnegative().default(2),\n primaryConcurrency: z.number().int().positive().default(4),\n dependentConcurrency: z.number().int().positive().default(2),\n }).default({}),\n storage: z.object({\n dbPath: z.string().optional(),\n backupDir: z.string().optional(),\n compressRawPayloads: z.boolean().default(true),\n }).default({}),\n embedding: z.object({\n provider: z.literal('ollama').default('ollama'),\n model: z.string().default('nomic-embed-text'),\n baseUrl: z.string().url().default('http://localhost:11434'),\n concurrency: z.number().int().positive().default(4),\n }).default({}),\n});\n\nexport type Config = z.infer<typeof ConfigSchema>;\n\nexport function loadConfig(cliOverride?: string): Config {\n const path = getConfigPath(cliOverride);\n // throws ConfigNotFoundError if missing\n // throws ConfigValidationError if invalid\n}\n```\n\n## Acceptance Criteria\n\n- [ ] `loadConfig()` 
returns validated Config object\n- [ ] `loadConfig()` throws ConfigNotFoundError if file missing\n- [ ] `loadConfig()` throws ConfigValidationError with Zod errors if invalid\n- [ ] Empty optional fields get default values\n- [ ] projects array must have at least 1 item\n- [ ] gitlab.baseUrl must be valid URL\n- [ ] All number fields must be positive integers\n- [ ] tests/unit/config.test.ts passes (8 tests)\n\n## Files\n\nCREATE:\n- src/core/config.ts\n- tests/unit/config.test.ts\n- tests/fixtures/mock-responses/valid-config.json\n- tests/fixtures/mock-responses/invalid-config.json\n\n## TDD Loop\n\nRED:\n```typescript\n// tests/unit/config.test.ts\ndescribe('Config', () => {\n it('loads config from file path')\n it('throws ConfigNotFoundError if file missing')\n it('throws ConfigValidationError if required fields missing')\n it('validates project paths are non-empty strings')\n it('applies default values for optional fields')\n it('loads from XDG path by default')\n it('respects GI_CONFIG_PATH override')\n it('respects --config flag override')\n})\n```\n\nGREEN: Implement loadConfig() function\n\nVERIFY: `npm run test -- tests/unit/config.test.ts`\n\n## Edge Cases\n\n- JSON parse error should wrap in ConfigValidationError\n- Zod error messages should be human-readable\n- File exists but empty → ConfigValidationError\n- File has extra fields → should pass (Zod strips by default)","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:49.091078Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:04:32.592139Z","closed_at":"2026-01-25T03:04:32.592003Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-epj","depends_on_id":"bd-gg1","type":"blocks","created_at":"2026-01-24T16:13:07.835800Z","created_by":"tayloreernisse"}]}
{"id":"bd-gba","title":"OBSERV: Add tracing-appender dependency to Cargo.toml","description":"## Background\ntracing-appender provides non-blocking, daily-rotating file writes for the tracing ecosystem. It's the canonical solution used by tokio-rs projects. We need it for the file logging layer (Phase 1) that writes JSON logs to ~/.local/share/lore/logs/.\n\n## Approach\nAdd tracing-appender to [dependencies] in Cargo.toml (line ~54, after the existing tracing-subscriber entry):\n\n```toml\ntracing-appender = \"0.2\"\n```\n\nAlso add the \"json\" feature to tracing-subscriber since the file layer and --log-format json both need it:\n\n```toml\ntracing-subscriber = { version = \"0.3\", features = [\"env-filter\", \"json\"] }\n```\n\nCurrent tracing deps (Cargo.toml lines 53-54):\n tracing = \"0.1\"\n tracing-subscriber = { version = \"0.3\", features = [\"env-filter\"] }\n\n## Acceptance Criteria\n- [ ] cargo check --all-targets succeeds with tracing-appender available\n- [ ] tracing_appender::rolling::daily() is importable\n- [ ] tracing-subscriber json feature is available (fmt::layer().json() compiles)\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- Cargo.toml (modify lines 53-54 region)\n\n## TDD Loop\nRED: Not applicable (dependency addition)\nGREEN: Add deps, run cargo check\nVERIFY: cargo check --all-targets && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Ensure tracing-appender 0.2 is compatible with tracing-subscriber 0.3 (both from tokio-rs/tracing monorepo, always compatible)\n- The \"json\" feature on tracing-subscriber pulls in serde_json, which is already a dependency","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.364100Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:10:22.520471Z","closed_at":"2026-02-04T17:10:22.520423Z","close_reason":"Added tracing-appender 0.2 and json feature to 
tracing-subscriber","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-gba","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.366945Z","created_by":"tayloreernisse"}]}
{"id":"bd-gg1","title":"[CP0] Core utilities - paths, time, errors, logger","description":"## Background\n\nCore utilities provide the foundation for all other modules. Path resolution enables XDG-compliant config/data locations. Time utilities ensure consistent timestamp handling (ms epoch for DB, ISO for API). Error classes provide typed exceptions for clean error handling. Logger provides structured logging to stderr.\n\nReference: docs/prd/checkpoint-0.md sections \"Config + Data Locations\", \"Timestamp Convention\", \"Error Classes\", \"Logging Configuration\"\n\n## Approach\n\n**src/core/paths.ts:**\n- `getConfigPath(cliOverride?)`: resolution order is CLI flag → GI_CONFIG_PATH env → XDG default → local fallback\n- `getDataDir()`: uses XDG_DATA_HOME or ~/.local/share/gi\n- `getDbPath(configOverride?)`: returns data dir + data.db\n- `getBackupDir(configOverride?)`: returns data dir + backups/\n\n**src/core/time.ts:**\n- `isoToMs(isoString)`: converts GitLab API ISO 8601 → ms epoch\n- `msToIso(ms)`: converts ms epoch → ISO 8601\n- `nowMs()`: returns Date.now() for DB storage\n\n**src/core/errors.ts:**\nError hierarchy (all extend GiError base class with code and cause):\n- ConfigNotFoundError, ConfigValidationError\n- GitLabAuthError, GitLabNotFoundError, GitLabRateLimitError, GitLabNetworkError\n- DatabaseLockError, MigrationError\n- TokenNotSetError\n\n**src/core/logger.ts:**\n- pino logger to stderr (fd 2) with pino-pretty in dev\n- Child loggers: dbLogger, gitlabLogger, configLogger\n- LOG_LEVEL env var support (default: info)\n\n## Acceptance Criteria\n\n- [ ] `getConfigPath()` returns ~/.config/gi/config.json when no overrides\n- [ ] `getConfigPath()` respects GI_CONFIG_PATH env var\n- [ ] `getConfigPath(\"./custom.json\")` returns \"./custom.json\"\n- [ ] `isoToMs(\"2024-01-27T00:00:00.000Z\")` returns 1706313600000\n- [ ] `msToIso(1706313600000)` returns \"2024-01-27T00:00:00.000Z\"\n- [ ] All error classes have correct code property\n- [ ] Logger 
outputs to stderr (not stdout)\n- [ ] tests/unit/paths.test.ts passes\n- [ ] tests/unit/errors.test.ts passes\n\n## Files\n\nCREATE:\n- src/core/paths.ts\n- src/core/time.ts\n- src/core/errors.ts\n- src/core/logger.ts\n- tests/unit/paths.test.ts\n- tests/unit/errors.test.ts\n\n## TDD Loop\n\nRED: Write tests first\n```typescript\n// tests/unit/paths.test.ts\ndescribe('getConfigPath', () => {\n it('uses XDG_CONFIG_HOME if set')\n it('falls back to ~/.config/gi if XDG not set')\n it('prefers --config flag over environment')\n it('prefers environment over XDG default')\n it('falls back to local gi.config.json in dev')\n})\n```\n\nGREEN: Implement paths.ts, errors.ts, time.ts, logger.ts\n\nVERIFY: `npm run test -- tests/unit/paths.test.ts tests/unit/errors.test.ts`\n\n## Edge Cases\n\n- XDG_CONFIG_HOME may not exist - don't create, just return path\n- existsSync() check for local fallback - only return if file exists\n- Time conversion must handle timezone edge cases - always use UTC\n- Logger must work even if pino-pretty not installed (production)","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:48.604382Z","created_by":"tayloreernisse","updated_at":"2026-01-25T02:53:26.527997Z","closed_at":"2026-01-25T02:53:26.527862Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-gg1","depends_on_id":"bd-327","type":"blocks","created_at":"2026-01-24T16:13:07.368187Z","created_by":"tayloreernisse"}]}
{"id":"bd-hbo","title":"[CP1] Discussion ingestion module","description":"## Background\n\nDiscussion ingestion fetches all discussions and notes for a single issue. It is called as part of dependent sync - only for issues whose `updated_at` has advanced beyond `discussions_synced_for_updated_at`. After successful sync, it updates the watermark to prevent redundant refetches.\n\n## Approach\n\n### Module: src/ingestion/discussions.rs\n\n### Key Structs\n\n```rust\n#[derive(Debug, Default)]\npub struct IngestDiscussionsResult {\n pub discussions_fetched: usize,\n pub discussions_upserted: usize,\n pub notes_upserted: usize,\n pub system_notes_count: usize,\n}\n```\n\n### Main Function\n\n```rust\npub async fn ingest_issue_discussions(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n project_id: i64, // Local DB project ID\n gitlab_project_id: i64, // GitLab project ID\n issue_iid: i64,\n local_issue_id: i64,\n issue_updated_at: i64, // For watermark update\n) -> Result<IngestDiscussionsResult>\n```\n\n### Logic\n\n1. Stream discussions via `client.paginate_issue_discussions()`\n2. For each discussion:\n - Begin transaction\n - Store raw payload (compressed based on config)\n - Transform to NormalizedDiscussion\n - Upsert discussion\n - Get local discussion ID\n - Transform notes via `transform_notes()`\n - For each note: store raw payload, upsert note\n - Track system_notes_count\n - Commit transaction\n3. 
After all discussions processed: `mark_discussions_synced(conn, local_issue_id, issue_updated_at)`\n\n### Helper Functions\n\n```rust\nfn upsert_discussion(conn, discussion, payload_id) -> Result<()>\nfn get_local_discussion_id(conn, project_id, gitlab_id) -> Result<i64>\nfn upsert_note(conn, discussion_id, note, payload_id) -> Result<()>\nfn mark_discussions_synced(conn, issue_id, issue_updated_at) -> Result<()>\n```\n\n### Critical Invariant\n\n`discussions_synced_for_updated_at` MUST be updated only AFTER all discussions are successfully synced. This watermark prevents redundant refetches on subsequent runs.\n\n## Acceptance Criteria\n\n- [ ] `ingest_issue_discussions` streams all discussions for an issue\n- [ ] Each discussion wrapped in transaction for atomicity\n- [ ] Raw payloads stored for discussions and notes\n- [ ] `discussions_synced_for_updated_at` updated after successful sync\n- [ ] System notes tracked in result.system_notes_count\n- [ ] Notes linked to correct discussion via local discussion ID\n\n## Files\n\n- src/ingestion/mod.rs (add `pub mod discussions;`)\n- src/ingestion/discussions.rs (create)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/discussion_watermark_tests.rs\n#[tokio::test] async fn fetches_discussions_when_updated_at_advanced()\n#[tokio::test] async fn updates_watermark_after_successful_discussion_sync()\n#[tokio::test] async fn does_not_update_watermark_on_discussion_sync_failure()\n#[tokio::test] async fn stores_raw_payload_for_each_discussion()\n#[tokio::test] async fn stores_raw_payload_for_each_note()\n```\n\nGREEN: Implement ingest_issue_discussions with watermark logic\n\nVERIFY: `cargo test discussion_watermark`\n\n## Edge Cases\n\n- Issue with 0 discussions - mark synced anyway (empty is valid)\n- Discussion with 0 notes - should not happen per GitLab API (discussions always have >= 1 note)\n- Network failure mid-sync - watermark NOT updated, next run retries\n- individual_note=true discussions - have exactly 1 
note","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.267582Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:52:47.500700Z","closed_at":"2026-01-25T22:52:47.500644Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-hbo","depends_on_id":"bd-1qf","type":"blocks","created_at":"2026-01-25T17:04:05.534265Z","created_by":"tayloreernisse"},{"issue_id":"bd-hbo","depends_on_id":"bd-2iq","type":"blocks","created_at":"2026-01-25T17:04:05.499474Z","created_by":"tayloreernisse"},{"issue_id":"bd-hbo","depends_on_id":"bd-xhz","type":"blocks","created_at":"2026-01-25T17:04:05.559260Z","created_by":"tayloreernisse"}]}
{"id":"bd-hrs","title":"Create migration 007_documents.sql","description":"## Background\nMigration 007 creates the document storage layer that Gate A's entire search pipeline depends on. It introduces 5 tables: `documents` (the searchable unit), `document_labels` and `document_paths` (for filtered search), and two queue tables (`dirty_sources`, `pending_discussion_fetches`) that drive incremental document regeneration and discussion fetching in Gate C. This is the most-depended-on bead in the project (6 downstream beads block on it).\n\n## Approach\nCreate `migrations/007_documents.sql` with the exact SQL from PRD Section 1.1. The schema is fully specified in the PRD — no design decisions remain.\n\nKey implementation details:\n- `documents` table has `UNIQUE(source_type, source_id)` constraint for upsert support\n- `document_labels` and `document_paths` use `WITHOUT ROWID` for compact storage\n- `dirty_sources` uses composite PK `(source_type, source_id)` with `ON CONFLICT` upsert semantics\n- `pending_discussion_fetches` uses composite PK `(project_id, noteable_type, noteable_iid)`\n- Both queue tables have `next_attempt_at` indexed for efficient backoff queries\n- `labels_hash` and `paths_hash` on documents enable write optimization (skip unchanged labels/paths)\n\nRegister the migration in `src/core/db.rs` by adding entry 7 to the `MIGRATIONS` array.\n\n## Acceptance Criteria\n- [ ] `migrations/007_documents.sql` file exists with all 5 CREATE TABLE statements\n- [ ] Migration applies cleanly on fresh DB (`cargo test migration_tests`)\n- [ ] Migration applies cleanly after CP2 schema (migrations 001-006 already applied)\n- [ ] All foreign keys enforced: `documents.project_id -> projects(id)`, `document_labels.document_id -> documents(id) ON DELETE CASCADE`, `document_paths.document_id -> documents(id) ON DELETE CASCADE`, `pending_discussion_fetches.project_id -> projects(id)`\n- [ ] All indexes created: `idx_documents_project_updated`, `idx_documents_author`, 
`idx_documents_source`, `idx_documents_hash`, `idx_document_labels_label`, `idx_document_paths_path`, `idx_dirty_sources_next_attempt`, `idx_pending_discussions_next_attempt`\n- [ ] `labels_hash TEXT NOT NULL DEFAULT ''` and `paths_hash TEXT NOT NULL DEFAULT ''` columns present on `documents`\n- [ ] Schema version 7 recorded in `schema_version` table\n- [ ] `cargo build` succeeds after registering migration in db.rs\n\n## Files\n- `migrations/007_documents.sql` — new file (copy exact SQL from PRD Section 1.1)\n- `src/core/db.rs` — add migration 7 to `MIGRATIONS` array\n\n## TDD Loop\nRED: Add migration to db.rs, run `cargo test migration_tests` — fails because SQL file missing\nGREEN: Create `migrations/007_documents.sql` with full schema\nVERIFY: `cargo test migration_tests && cargo build`\n\n## Edge Cases\n- Migration must be idempotent-safe if applied twice (INSERT into schema_version will fail on second run — this is expected and handled by the migration runner's version check)\n- `WITHOUT ROWID` tables (document_labels, document_paths) require explicit PK — already defined\n- `CHECK` constraint on `documents.source_type` must match exactly: `'issue','merge_request','discussion'`\n- `CHECK` constraint on `documents.truncated_reason` allows NULL or one of 4 specific values","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:25.734380Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:54:12.854351Z","closed_at":"2026-01-30T16:54:12.854149Z","close_reason":"Completed: migration 007_documents.sql with 5 tables (documents, document_labels, document_paths, dirty_sources, pending_discussion_fetches), 8 indexes, registered in db.rs, cargo build + migration tests pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-hrs","depends_on_id":"bd-3lc","type":"blocks","created_at":"2026-01-30T15:29:15.536304Z","created_by":"tayloreernisse"}]}
{"id":"bd-hu3","title":"Write migration 011: resource event tables, entity_references, and dependent fetch queue","description":"## Background\nPhase B needs three new event tables and a generic dependent fetch queue to power temporal queries (timeline, file-history, trace). These tables store structured event data from GitLab Resource Events APIs, replacing fragile system note parsing for state/label/milestone changes.\n\nMigration 010_chunk_config.sql already exists, so Phase B starts at migration 011.\n\n## Approach\nCreate migrations/011_resource_events.sql with the exact schema from the Phase B spec (§1.2 + §2.2):\n\n**Event tables:**\n- resource_state_events: state changes (opened/closed/reopened/merged/locked) with source_merge_request_id for \"closed by MR\" linking\n- resource_label_events: label add/remove with label_name\n- resource_milestone_events: milestone add/remove with milestone_title + milestone_id\n\n**Cross-reference table (Gate 2):**\n- entity_references: source/target entity pairs with reference_type (closes/mentioned/related), source_method provenance, and unresolved reference support (target_entity_id NULL with target_project_path + target_entity_iid)\n\n**Dependent fetch queue:**\n- pending_dependent_fetches: generic job queue with job_type IN ('resource_events', 'mr_closes_issues', 'mr_diffs'), locked_at crash recovery, exponential backoff via attempts + next_retry_at\n\n**All tables must have:**\n- CHECK constraints for entity exclusivity (issue XOR merge_request) on event tables\n- UNIQUE constraints (gitlab_id + project_id for events, composite for queue, multi-column for references)\n- Partial indexes (WHERE issue_id IS NOT NULL, WHERE target_entity_id IS NULL, etc.)\n- CASCADE deletes on project_id and entity FKs\n\nRegister in src/core/db.rs MIGRATIONS array:\n```rust\n(\"011\", include_str!(\"../../migrations/011_resource_events.sql\")),\n```\n\nEnd migration with:\n```sql\nINSERT INTO schema_version (version, applied_at, 
description)\nVALUES (11, strftime('%s', 'now') * 1000, 'Resource events, entity references, and dependent fetch queue');\n```\n\n## Acceptance Criteria\n- [ ] migrations/011_resource_events.sql exists with all 5 tables + indexes + constraints\n- [ ] src/core/db.rs MIGRATIONS array includes (\"011\", include_str!(...))\n- [ ] `cargo build` succeeds (migration SQL compiles into binary)\n- [ ] `cargo test migration` passes (migration applies cleanly on fresh DB)\n- [ ] All CHECK constraints enforced (issue XOR merge_request on event tables)\n- [ ] All UNIQUE constraints present (prevents duplicate events/refs/jobs)\n- [ ] entity_references UNIQUE handles NULL coalescing correctly\n- [ ] pending_dependent_fetches job_type CHECK includes all three types\n\n## Files\n- migrations/011_resource_events.sql (new)\n- src/core/db.rs (add to MIGRATIONS array, line ~46)\n\n## TDD Loop\nRED: Add test in tests/migration_tests.rs:\n- `test_migration_011_creates_event_tables` - verify all 5 tables exist after migration\n- `test_migration_011_entity_exclusivity_constraint` - verify CHECK rejects both NULL and both non-NULL for issue_id/merge_request_id\n- `test_migration_011_event_dedup` - verify UNIQUE(gitlab_id, project_id) rejects duplicate events\n- `test_migration_011_entity_references_dedup` - verify UNIQUE constraint with NULL coalescing\n- `test_migration_011_queue_dedup` - verify UNIQUE(project_id, entity_type, entity_iid, job_type)\n\nGREEN: Write the migration SQL + register in db.rs\n\nVERIFY: `cargo test migration_tests -- --nocapture`\n\n## Edge Cases\n- entity_references UNIQUE uses COALESCE for NULLable columns — test with both resolved and unresolved refs\n- pending_dependent_fetches job_type CHECK — ensure 'mr_diffs' is included (Gate 4 needs it)\n- SQLite doesn't enforce CHECK on INSERT OR REPLACE — verify constraint behavior\n- The entity exclusivity CHECK must allow exactly one of issue_id/merge_request_id to be non-NULL\n- Verify CASCADE deletes work (delete 
project → all events/refs/jobs deleted)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:23.933894Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:06:28.918228Z","closed_at":"2026-02-03T16:06:28.917906Z","close_reason":"Already completed in prior session, re-closing after accidental reopen","compaction_level":0,"original_size":0,"labels":["gate-1","phase-b","schema"],"dependencies":[{"issue_id":"bd-hu3","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:23.937375Z","created_by":"tayloreernisse"}]}
{"id":"bd-iba","title":"Add GitLab client MR pagination methods","description":"## Background\nGitLab client pagination for merge requests and discussions. Must support robust pagination with fallback chain because some GitLab instances/proxies strip headers.\n\n## Approach\nAdd to existing `src/gitlab/client.rs`:\n1. `MergeRequestPage` struct - Items + pagination metadata\n2. `parse_link_header_next()` - RFC 8288 Link header parsing\n3. `fetch_merge_requests_page()` - Single page fetch with metadata\n4. `paginate_merge_requests()` - Async stream for all MRs\n5. `paginate_mr_discussions()` - Async stream for MR discussions\n\n## Files\n- `src/gitlab/client.rs` - Add pagination methods\n\n## Acceptance Criteria\n- [ ] `MergeRequestPage` struct exists with `items`, `next_page`, `is_last_page`\n- [ ] `parse_link_header_next()` extracts `rel=\"next\"` URL from Link header\n- [ ] Pagination fallback chain: Link header > x-next-page > full-page heuristic\n- [ ] `paginate_merge_requests()` returns `Pin<Box<dyn Stream<Item = Result<GitLabMergeRequest>>>>`\n- [ ] `paginate_mr_discussions()` returns `Pin<Box<dyn Stream<Item = Result<GitLabDiscussion>>>>`\n- [ ] MR endpoint uses `scope=all&state=all` to include all MRs\n- [ ] `cargo test client` passes\n\n## TDD Loop\nRED: `cargo test fetch_merge_requests` -> method not found\nGREEN: Add pagination methods\nVERIFY: `cargo test client`\n\n## Struct Definitions\n```rust\n#[derive(Debug)]\npub struct MergeRequestPage {\n pub items: Vec<GitLabMergeRequest>,\n pub next_page: Option<u32>,\n pub is_last_page: bool,\n}\n```\n\n## Link Header Parsing (RFC 8288)\n```rust\n/// Parse Link header to extract rel=\"next\" URL.\nfn parse_link_header_next(headers: &reqwest::header::HeaderMap) -> Option<String> {\n headers\n .get(\"link\")\n .and_then(|v| v.to_str().ok())\n .and_then(|link_str| {\n // Format: <url>; rel=\"next\", <url>; rel=\"last\"\n for part in link_str.split(',') {\n let part = part.trim();\n if 
part.contains(\"rel=\\\"next\\\"\") || part.contains(\"rel=next\") {\n if let Some(start) = part.find('<') {\n if let Some(end) = part.find('>') {\n return Some(part[start + 1..end].to_string());\n }\n }\n }\n }\n None\n })\n}\n```\n\n## Pagination Fallback Chain\n```rust\nlet next_page = match (link_next, x_next_page, items.len() as u32 == per_page) {\n (Some(_), _, _) => Some(page + 1), // Link header present: continue\n (None, Some(np), _) => Some(np), // x-next-page present: use it\n (None, None, true) => Some(page + 1), // Full page, no headers: try next\n (None, None, false) => None, // Partial page: we're done\n};\n```\n\n## Fetch Single Page\n```rust\npub async fn fetch_merge_requests_page(\n &self,\n gitlab_project_id: i64,\n updated_after: Option<i64>,\n cursor_rewind_seconds: u32,\n page: u32,\n per_page: u32,\n) -> Result<MergeRequestPage> {\n let mut params = vec![\n (\"scope\", \"all\".to_string()),\n (\"state\", \"all\".to_string()),\n (\"order_by\", \"updated_at\".to_string()),\n (\"sort\", \"asc\".to_string()),\n (\"per_page\", per_page.to_string()),\n (\"page\", page.to_string()),\n ];\n // Apply cursor rewind for safety\n // ...\n}\n```\n\n## Async Stream Pattern\n```rust\npub fn paginate_merge_requests(\n &self,\n gitlab_project_id: i64,\n updated_after: Option<i64>,\n cursor_rewind_seconds: u32,\n) -> Pin<Box<dyn Stream<Item = Result<GitLabMergeRequest>> + Send + '_>> {\n Box::pin(async_stream::try_stream! 
{\n let mut page = 1u32;\n let per_page = 100u32;\n loop {\n let page_result = self.fetch_merge_requests_page(...).await?;\n for mr in page_result.items {\n yield mr;\n }\n if page_result.is_last_page {\n break;\n }\n match page_result.next_page {\n Some(np) => page = np,\n None => break,\n }\n }\n })\n}\n```\n\n## Edge Cases\n- `scope=all` required to include all MRs (not just authored by current user)\n- `state=all` required to include merged/closed (GitLab defaults may exclude)\n- `locked` state cannot be filtered server-side (use local SQL filtering)\n- Cursor rewind should clamp to 0 to avoid negative timestamps","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:41.633065Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:13:05.613625Z","closed_at":"2026-01-27T00:13:05.613440Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-iba","depends_on_id":"bd-5ta","type":"blocks","created_at":"2026-01-26T22:08:54.364647Z","created_by":"tayloreernisse"}]}
{"id":"bd-ike","title":"Epic: Gate 3 - Decision Timeline (lore timeline)","description":"## Background\nGate 3 is the flagship feature of Phase B — the \"lore timeline\" command that answers \"What happened with X?\" by producing a keyword-driven chronological narrative across issues, MRs, and discussions. This is the forcing function for the entire phase: if timeline works, the architecture is validated.\n\nThe query pipeline has 5 stages: SEED (FTS5 search) → HYDRATE (map docs to entities) → EXPAND (BFS cross-reference traversal) → COLLECT EVENTS → INTERLEAVE (chronological sort). Evidence-bearing notes from FTS5 are included as first-class timeline events to surface decision rationale, not just activity counts.\n\n## Architecture\n- **No new tables.** Timeline reads across existing tables at query time and produces a virtual event stream.\n- **TimelineEvent model:** Unified struct covering created/state_changed/label/milestone/merged/note_evidence/cross_referenced events.\n- **Expansion:** BFS over entity_references, depth-limited (default 1). Default follows closes+related edges; --expand-mentions adds mentioned edges (high fan-out).\n- **Evidence notes:** Top 10 FTS5-matched notes included as NoteEvidence events with ~200 char snippets.\n- **Dual output:** Human-readable colored format + robot JSON with expansion provenance.\n\n## Children (Execution Order)\n1. **bd-20e** [OPEN] — TimelineEvent model and TimelineEventType enum (src/core/timeline.rs types)\n2. **bd-32q** [OPEN] — Seed phase: FTS5 keyword search → entity IDs + evidence note candidates\n3. **bd-ypa** [OPEN] — Expand phase: BFS cross-reference expansion over entity_references\n4. **bd-3as** [OPEN] — Collect events: gather state/label/milestone/creation/merge events + interleave\n5. **bd-2f2** [OPEN] — Human output renderer (colored, spec §3.4 format)\n6. **bd-dty** [OPEN] — Robot mode JSON output (spec §3.5 format)\n7. 
**bd-1nf** [OPEN] — CLI wiring: register command with all flags (--depth, --since, --expand-mentions, -p, -n)\n\n## Gate Completion Criteria\n- [ ] `lore timeline <query>` returns chronologically ordered events\n- [ ] Seed entities found via FTS5 (issues, MRs, and notes)\n- [ ] State, label, milestone events interleaved from resource event tables\n- [ ] Entity creation and merge events included\n- [ ] Evidence notes included as note_evidence events (top 10 FTS5 matches)\n- [ ] Cross-reference expansion follows entity_references to configurable depth\n- [ ] Default: closes+related edges; --expand-mentions adds mentioned edges\n- [ ] --depth 0 disables expansion\n- [ ] --since filters by event timestamp\n- [ ] -p scopes to project\n- [ ] Human output colored and readable\n- [ ] Robot JSON with expansion provenance (via) and unresolved references\n- [ ] Query latency < 200ms for < 50 seed entities\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) for event tables, Gate 2 (bd-1se) for entity_references\n- Downstream: Gates 4 and 5 can proceed in parallel (no dependency on Gate 3)","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.036474Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:32:48.598603Z","compaction_level":0,"original_size":0,"labels":["epic","gate-3","phase-b"],"dependencies":[{"issue_id":"bd-ike","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:33:37.875622Z","created_by":"tayloreernisse"},{"issue_id":"bd-ike","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:33:37.831914Z","created_by":"tayloreernisse"}]}
{"id":"bd-jec","title":"Add fetchMrFileChanges config flag","description":"## Background\n\nGate 4 requires MR diff data to populate mr_file_changes. This config flag controls whether diff fetching is enabled, following the exact same pattern as the existing fetchResourceEvents flag in SyncConfig.\n\n## Approach\n\n### 1. Add field to SyncConfig in `src/core/config.rs`\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\n#[serde(default)]\npub struct SyncConfig {\n // ... existing fields ...\n\n #[serde(rename = \"fetchMrFileChanges\", default = \"default_true\")]\n pub fetch_mr_file_changes: bool,\n}\n```\n\nUpdate the `Default` impl:\n```rust\nimpl Default for SyncConfig {\n fn default() -> Self {\n Self {\n // ... existing defaults ...\n fetch_mr_file_changes: true,\n }\n }\n}\n```\n\n### 2. Add CLI override in SyncArgs (`src/cli/mod.rs`)\n\n```rust\n/// Skip MR file change fetching (overrides config)\n#[arg(long = \"no-file-changes\")]\npub no_file_changes: bool,\n```\n\n### 3. Apply override in `src/main.rs` handle_sync_cmd\n\n```rust\nif args.no_file_changes {\n config.sync.fetch_mr_file_changes = false;\n}\n```\n\nSame pattern as the existing `args.no_events` / `config.sync.fetch_resource_events` override.\n\n### 4. 
Guard in orchestrator\n\nWherever mr_diffs jobs are enqueued (src/ingestion/orchestrator.rs), check:\n```rust\nif config.sync.fetch_mr_file_changes {\n // enqueue mr_diffs dependent fetch jobs\n}\n```\n\n## Acceptance Criteria\n\n- [ ] `fetchMrFileChanges` appears in SyncConfig with default `true`\n- [ ] Config file without the field defaults to true (backwards compatible)\n- [ ] `lore sync --no-file-changes` disables MR diff fetching\n- [ ] Orchestrator skips mr_diffs jobs when flag is false\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/core/config.rs` (add field to SyncConfig + Default impl)\n- `src/cli/mod.rs` (add --no-file-changes to SyncArgs)\n- `src/main.rs` (apply override in handle_sync_cmd)\n- `src/ingestion/orchestrator.rs` (guard mr_diffs enqueue)\n\n## TDD Loop\n\nRED: Add test in config.rs tests:\n- `test_config_default_fetch_mr_file_changes` - default SyncConfig has fetch_mr_file_changes=true\n- `test_config_deserialize_fetch_mr_file_changes_false` - JSON with \"fetchMrFileChanges\": false\n\nGREEN: Add the field, default, and serde attribute.\n\nVERIFY: `cargo test --lib -- config`\n\n## Edge Cases\n\n- Existing config files without this field: serde default makes it true (no migration needed)\n- Config has fetchMrFileChanges=true but --no-file-changes flag: CLI flag wins\n- Config has fetchMrFileChanges=false: no need for CLI flag, but --no-file-changes is harmless","status":"open","priority":3,"issue_type":"task","created_at":"2026-02-02T21:34:08.892666Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:52:46.752923Z","compaction_level":0,"original_size":0,"labels":["config","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-jec","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.895167Z","created_by":"tayloreernisse"}]}
{"id":"bd-jov","title":"[CP1] Discussion and note transformers","description":"Transform GitLab discussion/note payloads to normalized database schema.\n\n## Module\nsrc/gitlab/transformers/discussion.rs\n\n## Structs\n\n### NormalizedDiscussion\n- gitlab_discussion_id: String\n- project_id: i64\n- issue_id: i64\n- noteable_type: String (\"Issue\")\n- individual_note: bool\n- first_note_at, last_note_at: Option<i64>\n- last_seen_at: i64\n- resolvable, resolved: bool\n\n### NormalizedNote\n- gitlab_id: i64\n- project_id: i64\n- note_type: Option<String>\n- is_system: bool\n- author_username: String\n- body: String\n- created_at, updated_at, last_seen_at: i64\n- position: i32 (array index in notes[])\n- resolvable, resolved: bool\n- resolved_by: Option<String>\n- resolved_at: Option<i64>\n\n## Functions\n\n### transform_discussion(gitlab_discussion, local_project_id, local_issue_id) -> NormalizedDiscussion\n- Compute first_note_at/last_note_at from notes array min/max created_at\n- Compute resolvable (any note resolvable)\n- Compute resolved (resolvable AND all resolvable notes resolved)\n\n### transform_notes(gitlab_discussion, local_project_id) -> Vec<NormalizedNote>\n- Enumerate notes to get position (array index)\n- Set is_system from note.system\n- Convert timestamps to ms epoch\n\nFiles: src/gitlab/transformers/discussion.rs\nTests: tests/discussion_transformer_tests.rs\nDone when: Unit tests pass for discussion/note transformation with system note flagging","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:43:04.481361Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.759691Z","deleted_at":"2026-01-25T17:02:01.759684Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-k7b","title":"[CP1] gi show issue command","description":"Show issue details with discussions.\n\n## Module\nsrc/cli/commands/show.rs\n\n## Clap Definition\nShow {\n #[arg(value_parser = [\"issue\", \"mr\"])]\n entity: String,\n \n iid: i64,\n \n #[arg(long)]\n project: Option<String>,\n}\n\n## Output Format\nIssue #1234: Authentication redesign\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n\nProject: group/project-one\nState: opened\nAuthor: @johndoe\nCreated: 2024-01-15\nUpdated: 2024-03-20\nLabels: enhancement, auth\nURL: https://gitlab.example.com/group/project-one/-/issues/1234\n\nDescription:\n We need to redesign the authentication flow to support...\n\nDiscussions (5):\n\n @janedoe (2024-01-16):\n I agree we should move to JWT-based auth...\n\n @johndoe (2024-01-16):\n What about refresh token strategy?\n\n @bobsmith (2024-01-17):\n Have we considered OAuth2?\n\n## Ambiguity Handling\nIf multiple projects have same iid, either:\n- Prompt for --project flag\n- Show error listing which projects have that iid\n\nFiles: src/cli/commands/show.rs\nDone when: Issue detail view displays all fields including threaded discussions","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:58:26.904813Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.944183Z","deleted_at":"2026-01-25T17:02:01.944179Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-lcb","title":"Epic: CP2 Gate E - CLI Complete","description":"## Background\nGate E validates all CLI commands are functional and user-friendly. This is the final usability gate - even if all data is correct, users need good CLI UX to access it.\n\n## Acceptance Criteria (Pass/Fail)\n\n### List Command\n- [ ] `gi list mrs` shows MR table with columns: iid, title, state, author, branches, updated\n- [ ] `gi list mrs --state=opened` filters to only opened MRs\n- [ ] `gi list mrs --state=merged` filters to only merged MRs\n- [ ] `gi list mrs --state=closed` filters to only closed MRs\n- [ ] `gi list mrs --state=locked` filters locally (not server-side filter)\n- [ ] `gi list mrs --draft` shows only draft MRs\n- [ ] `gi list mrs --no-draft` excludes draft MRs\n- [ ] Draft MRs show `[DRAFT]` prefix in title column\n- [ ] `gi list mrs --author=username` filters by author\n- [ ] `gi list mrs --assignee=username` filters by assignee\n- [ ] `gi list mrs --reviewer=username` filters by reviewer\n- [ ] `gi list mrs --target-branch=main` filters by target branch\n- [ ] `gi list mrs --source-branch=feature/x` filters by source branch\n- [ ] `gi list mrs --label=bugfix` filters by label\n- [ ] `gi list mrs --limit=N` limits output\n\n### Show Command\n- [ ] `gi show mr <iid>` displays full MR detail\n- [ ] Show includes: title, description, state, draft status, author\n- [ ] Show includes: assignees, reviewers, labels\n- [ ] Show includes: source_branch, target_branch\n- [ ] Show includes: detailed_merge_status (e.g., \"mergeable\")\n- [ ] Show includes: merge_user and merged_at for merged MRs\n- [ ] Show includes: discussions with author and date\n- [ ] DiffNote shows file context: `[src/file.ts:45]`\n- [ ] Multi-line DiffNote shows range: `[src/file.ts:45-48]`\n- [ ] Resolved discussions show `[RESOLVED]` marker\n\n### Count Command\n- [ ] `gi count mrs` shows total count\n- [ ] Count shows state breakdown: opened, merged, closed\n\n### Sync Status\n- [ ] `gi 
sync-status` shows MR cursor position\n- [ ] Sync status shows last sync timestamp\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate E: CLI Complete ===\"\n\n# 1. Test list command (basic)\necho \"Step 1: Basic list...\"\ngi list mrs --limit=5 || { echo \"FAIL: list mrs failed\"; exit 1; }\n\n# 2. Test state filters\necho \"Step 2: State filters...\"\nfor state in opened merged closed; do\n echo \" Testing --state=$state\"\n gi list mrs --state=$state --limit=3 || echo \" Warning: No $state MRs\"\ndone\n\n# 3. Test draft filters\necho \"Step 3: Draft filters...\"\ngi list mrs --draft --limit=3 || echo \" Note: No draft MRs found\"\ngi list mrs --no-draft --limit=3 || echo \" Note: All MRs are drafts?\"\n\n# 4. Check [DRAFT] prefix\necho \"Step 4: Check [DRAFT] prefix...\"\nDRAFT_IID=$(sqlite3 \"$DB_PATH\" \"SELECT iid FROM merge_requests WHERE draft = 1 LIMIT 1;\")\nif [ -n \"$DRAFT_IID\" ]; then\n if gi list mrs --limit=100 | grep -q \"\\[DRAFT\\]\"; then\n echo \" PASS: [DRAFT] prefix found\"\n else\n echo \" FAIL: Draft MR exists but no [DRAFT] prefix in output\"\n fi\nelse\n echo \" Skip: No draft MRs to test\"\nfi\n\n# 5. Test author/assignee/reviewer filters\necho \"Step 5: User filters...\"\nAUTHOR=$(sqlite3 \"$DB_PATH\" \"SELECT author_username FROM merge_requests LIMIT 1;\")\nif [ -n \"$AUTHOR\" ]; then\n echo \" Testing --author=$AUTHOR\"\n gi list mrs --author=\"$AUTHOR\" --limit=3\nfi\n\nREVIEWER=$(sqlite3 \"$DB_PATH\" \"SELECT username FROM mr_reviewers LIMIT 1;\")\nif [ -n \"$REVIEWER\" ]; then\n echo \" Testing --reviewer=$REVIEWER\"\n gi list mrs --reviewer=\"$REVIEWER\" --limit=3\nfi\n\n# 6. 
Test branch filters\necho \"Step 6: Branch filters...\"\nTARGET=$(sqlite3 \"$DB_PATH\" \"SELECT target_branch FROM merge_requests LIMIT 1;\")\nif [ -n \"$TARGET\" ]; then\n echo \" Testing --target-branch=$TARGET\"\n gi list mrs --target-branch=\"$TARGET\" --limit=3\nfi\n\n# 7. Test show command\necho \"Step 7: Show command...\"\nMR_IID=$(sqlite3 \"$DB_PATH\" \"SELECT iid FROM merge_requests LIMIT 1;\")\ngi show mr \"$MR_IID\" || { echo \"FAIL: show mr failed\"; exit 1; }\n\n# 8. Test show with DiffNote context\necho \"Step 8: Show with DiffNote...\"\nDIFFNOTE_MR=$(sqlite3 \"$DB_PATH\" \"\n SELECT DISTINCT m.iid\n FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n JOIN notes n ON n.discussion_id = d.id\n WHERE n.position_new_path IS NOT NULL\n LIMIT 1;\n\")\nif [ -n \"$DIFFNOTE_MR\" ]; then\n echo \" Testing MR with DiffNotes: !$DIFFNOTE_MR\"\n OUTPUT=$(gi show mr \"$DIFFNOTE_MR\")\n if echo \"$OUTPUT\" | grep -qE '\\[[^]]+:[0-9]+\\]'; then\n echo \" PASS: File context [path:line] found\"\n else\n echo \" FAIL: DiffNote should show [path:line] context\"\n fi\nelse\n echo \" Skip: No MRs with DiffNotes\"\nfi\n\n# 9. Test count command\necho \"Step 9: Count command...\"\ngi count mrs || { echo \"FAIL: count mrs failed\"; exit 1; }\n\n# 10. 
Test sync-status\necho \"Step 10: Sync status...\"\ngi sync-status || echo \" Note: sync-status may need implementation\"\n\necho \"\"\necho \"=== Gate E: PASSED ===\"\n```\n\n## Test Commands (Quick Verification)\n```bash\n# List with all column types visible:\ngi list mrs --limit=10\n\n# Show a specific MR:\ngi show mr 42\n\n# Count with breakdown:\ngi count mrs\n\n# Complex filter:\ngi list mrs --state=opened --reviewer=alice --target-branch=main --limit=5\n```\n\n## Expected Output Formats\n\n### gi list mrs\n```\nMerge Requests (showing 5 of 1,234)\n\n !847 Refactor auth to use JWT tokens merged @johndoe main <- feature/jwt 3d ago\n !846 Fix memory leak in websocket handler opened @janedoe main <- fix/websocket 5d ago\n !845 [DRAFT] Add dark mode CSS variables opened @bobsmith main <- ui/dark-mode 1w ago\n !844 Update dependencies to latest versions closed @alice main <- chore/deps 2w ago\n```\n\n### gi show mr 847\n```\nMerge Request !847: Refactor auth to use JWT tokens\n================================================================================\n\nProject: group/project-one\nState: merged\nDraft: No\nAuthor: @johndoe\nAssignees: @janedoe, @bobsmith\nReviewers: @alice, @charlie\nLabels: enhancement, auth, reviewed\nSource: feature/jwt\nTarget: main\nMerge Status: merged\nMerged By: @alice\nMerged At: 2024-03-20 14:30:00\n\nDescription:\n Moving away from session cookies to JWT-based authentication...\n\nDiscussions (3):\n\n @janedoe (2024-03-16) [src/auth/jwt.ts:45]:\n Should we use a separate signing key for refresh tokens?\n\n @johndoe (2024-03-16):\n Good point. I'll add a separate key with rotation support.\n\n @alice (2024-03-18) [RESOLVED]:\n Looks good! 
Just one nit about the token expiry constant.\n```\n\n### gi count mrs\n```\nMerge Requests: 1,234\n opened: 89\n merged: 1,045\n closed: 100\n```\n\n## Dependencies\nThis gate requires:\n- bd-3js (CLI commands implementation)\n- All previous gates must pass first\n\n## Edge Cases\n- Ambiguous MR iid across projects: should prompt for `--project` or show error\n- Very long titles: should truncate with `...` in list view\n- Empty description: should show \"No description\" or empty section\n- No discussions: should show \"No discussions\" message\n- Unicode in titles/descriptions: should render correctly","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:02.411132Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.061166Z","closed_at":"2026-01-27T00:48:21.061125Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-lcb","depends_on_id":"bd-3js","type":"blocks","created_at":"2026-01-26T22:08:55.957747Z","created_by":"tayloreernisse"}]}
{"id":"bd-ljf","title":"Add embedding error variants to LoreError","description":"## Background\nGate B introduces Ollama-dependent operations that need distinct error variants for clear diagnostics. Each error has a unique exit code, a descriptive message, and an actionable suggestion. These errors must integrate with the existing LoreError enum pattern (renamed from GiError in bd-3lc).\n\n## Approach\nExtend `src/core/error.rs` with 4 new variants per PRD Section 4.3.\n\n**ErrorCode additions:**\n```rust\npub enum ErrorCode {\n // ... existing (InternalError=1 through TransformError=13)\n OllamaUnavailable, // exit code 14\n OllamaModelNotFound, // exit code 15\n EmbeddingFailed, // exit code 16\n}\n```\n\n**LoreError additions:**\n```rust\n/// Ollama-specific connection failure. Use instead of Http for Ollama errors\n/// because it includes base_url for actionable error messages.\n#[error(\"Cannot connect to Ollama at {base_url}. Is it running?\")]\nOllamaUnavailable {\n base_url: String,\n #[source]\n source: Option<reqwest::Error>,\n},\n\n#[error(\"Ollama model '{model}' not found. Run: ollama pull {model}\")]\nOllamaModelNotFound { model: String },\n\n#[error(\"Embedding failed for document {document_id}: {reason}\")]\nEmbeddingFailed { document_id: i64, reason: String },\n\n#[error(\"No embeddings found. 
Run: lore embed\")]\nEmbeddingsNotBuilt,\n```\n\n**code() mapping:**\n- OllamaUnavailable => ErrorCode::OllamaUnavailable\n- OllamaModelNotFound => ErrorCode::OllamaModelNotFound\n- EmbeddingFailed => ErrorCode::EmbeddingFailed\n- EmbeddingsNotBuilt => ErrorCode::EmbeddingFailed (shares exit code 16)\n\n**suggestion() mapping:**\n- OllamaUnavailable => \"Start Ollama: ollama serve\"\n- OllamaModelNotFound => \"Pull the model: ollama pull nomic-embed-text\"\n- EmbeddingFailed => \"Check Ollama logs or retry with 'lore embed --retry-failed'\"\n- EmbeddingsNotBuilt => \"Generate embeddings first: lore embed\"\n\n## Acceptance Criteria\n- [ ] All 4 error variants compile\n- [ ] Exit codes: OllamaUnavailable=14, OllamaModelNotFound=15, EmbeddingFailed=16\n- [ ] EmbeddingsNotBuilt shares exit code 16 (mapped to ErrorCode::EmbeddingFailed)\n- [ ] OllamaUnavailable has `base_url: String` and `source: Option<reqwest::Error>`\n- [ ] EmbeddingFailed has `document_id: i64` and `reason: String`\n- [ ] Each variant has actionable .suggestion() text per PRD\n- [ ] ErrorCode Display: OLLAMA_UNAVAILABLE, OLLAMA_MODEL_NOT_FOUND, EMBEDDING_FAILED\n- [ ] Robot mode JSON includes code + suggestion for each variant\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/core/error.rs` — extend LoreError enum + ErrorCode enum + impl blocks\n\n## TDD Loop\nRED: Add variants, `cargo build` fails on missing match arms\nGREEN: Add match arms in code(), exit_code(), suggestion(), to_robot_error(), Display\nVERIFY: `cargo build && cargo test error`\n\n## Edge Cases\n- OllamaUnavailable with source=None: still valid (used when no HTTP error available)\n- EmbeddingFailed with document_id=0: used for batch-level failures (not per-doc)\n- EmbeddingsNotBuilt vs OllamaUnavailable: former means \"never ran embed\", latter means \"Ollama down right 
now\"","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:33.994316Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:51:20.385574Z","closed_at":"2026-01-30T16:51:20.385369Z","close_reason":"Completed: Added 4 LoreError variants (OllamaUnavailable, OllamaModelNotFound, EmbeddingFailed, EmbeddingsNotBuilt) and 3 ErrorCode variants with exit codes 14-16. cargo build succeeds.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-ljf","depends_on_id":"bd-3lc","type":"blocks","created_at":"2026-01-30T15:29:15.640924Z","created_by":"tayloreernisse"}]}
{"id":"bd-lsz","title":"Epic: Gate B - Hybrid MVP","description":"## Background\nGate B adds semantic search capabilities via Ollama embeddings and sqlite-vec vector storage. It builds on Gate A's document layer, adding the embedding pipeline, vector search, RRF-based hybrid ranking, and graceful degradation when Ollama is unavailable. Gate B is independently shippable on top of Gate A.\n\n## Gate B Deliverables\n1. Ollama-powered embedding pipeline with sqlite-vec storage\n2. Hybrid search (RRF-ranked vector + lexical) with rich filtering + graceful degradation\n\n## Bead Dependencies (execution order, after Gate A)\n1. **bd-mem** — Shared backoff utility (no deps)\n2. **bd-1y8** — Chunk ID encoding (no deps)\n3. **bd-3ez** — RRF ranking (no deps)\n4. **bd-ljf** — Embedding error variants (blocked by bd-3lc)\n5. **bd-2ac** — Migration 009 embeddings (blocked by bd-hrs)\n6. **bd-335** — Ollama API client (blocked by bd-ljf)\n7. **bd-am7** — Embedding pipeline (blocked by bd-335, bd-2ac, bd-1y8)\n8. **bd-bjo** — Vector search (blocked by bd-2ac, bd-1y8)\n9. **bd-2sx** — Embed CLI (blocked by bd-am7)\n10. 
**bd-3eu** — Hybrid search (blocked by bd-3ez, bd-bjo, bd-1k1, bd-3q2)\n\n## Acceptance Criteria\n- [ ] `lore embed` builds embeddings for all documents via Ollama\n- [ ] `lore embed --retry-failed` re-attempts failed embeddings\n- [ ] `lore search --mode=hybrid \"query\"` uses both FTS + vector\n- [ ] `lore search --mode=semantic \"query\"` uses vector only\n- [ ] Graceful degradation: Ollama down -> FTS fallback with warning\n- [ ] `lore search --explain` shows vector_rank, fts_rank, rrf_score\n- [ ] sqlite-vec loaded before migration 009","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-30T15:25:13.462602Z","created_by":"tayloreernisse","updated_at":"2026-01-30T18:02:57.669194Z","closed_at":"2026-01-30T18:02:57.669142Z","close_reason":"All Gate B sub-beads complete: backoff, chunk IDs, RRF, error variants, migration 009, Ollama client, embedding pipeline, vector search, embed CLI, hybrid search","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-lsz","depends_on_id":"bd-2sx","type":"blocks","created_at":"2026-01-30T15:29:35.818914Z","created_by":"tayloreernisse"},{"issue_id":"bd-lsz","depends_on_id":"bd-3eu","type":"blocks","created_at":"2026-01-30T15:29:35.783218Z","created_by":"tayloreernisse"}]}
{"id":"bd-mem","title":"Implement shared backoff utility","description":"## Background\nBoth `dirty_sources` and `pending_discussion_fetches` tables use exponential backoff with `next_attempt_at` timestamps. Without a shared utility, each module would duplicate the backoff curve logic, risking drift. The shared backoff module ensures consistent retry behavior across all queue consumers in Gate C.\n\n## Approach\nCreate `src/core/backoff.rs` per PRD Section 6.X.\n\n**IMPORTANT — PRD-exact signature and implementation:**\n```rust\nuse rand::Rng;\n\n/// Compute next_attempt_at with exponential backoff and jitter.\n///\n/// Formula: now + min(3600000, 1000 * 2^attempt_count) * (0.9 to 1.1)\n/// - Capped at 1 hour to prevent runaway delays\n/// - ±10% jitter prevents synchronized retries after outages\n///\n/// Used by:\n/// - `dirty_sources` retry scheduling (document regeneration failures)\n/// - `pending_discussion_fetches` retry scheduling (API fetch failures)\n///\n/// Having one implementation prevents subtle divergence between queues\n/// (e.g., different caps or jitter ranges).\npub fn compute_next_attempt_at(now: i64, attempt_count: i64) -> i64 {\n // Cap attempt_count to prevent overflow (2^30 > 1 hour anyway)\n let capped_attempts = attempt_count.min(30) as u32;\n let base_delay_ms = 1000_i64.saturating_mul(1 << capped_attempts);\n let capped_delay_ms = base_delay_ms.min(3_600_000); // 1 hour cap\n\n // Add ±10% jitter\n let jitter_factor = rand::thread_rng().gen_range(0.9..=1.1);\n let delay_with_jitter = (capped_delay_ms as f64 * jitter_factor) as i64;\n\n now + delay_with_jitter\n}\n```\n\n**Key PRD details (must match exactly):**\n- `attempt_count` parameter is `i64` (not `u32`) — matches SQLite integer type from DB columns\n- Overflow prevention: `.min(30) as u32` caps before shift\n- Base delay: `1000_i64.saturating_mul(1 << capped_attempts)` — uses `saturating_mul` for safety\n- Cap: `3_600_000` (1 hour)\n- Jitter: `gen_range(0.9..=1.1)` — inclusive 
range\n- Return: `i64` (milliseconds epoch)\n\n**Cargo.toml change:** Add `rand = \"0.8\"` to `[dependencies]`.\n\n## Acceptance Criteria\n- [ ] Single shared implementation used by both dirty_tracker and discussion_queue\n- [ ] Signature: `pub fn compute_next_attempt_at(now: i64, attempt_count: i64) -> i64`\n- [ ] attempt_count is i64 (matches SQLite column type), not u32\n- [ ] Overflow prevention: `.min(30) as u32` before shift\n- [ ] Base delay uses `1000_i64.saturating_mul(1 << capped_attempts)`\n- [ ] Cap at 1 hour (3,600,000 ms)\n- [ ] Jitter: `gen_range(0.9..=1.1)` inclusive range\n- [ ] Exponential curve: 1s, 2s, 4s, 8s, ... up to 1h cap\n- [ ] `cargo test backoff` passes\n\n## Files\n- `src/core/backoff.rs` — new file\n- `src/core/mod.rs` — add `pub mod backoff;`\n- `Cargo.toml` — add `rand = \"0.8\"`\n\n## TDD Loop\nRED: `src/core/backoff.rs` with `#[cfg(test)] mod tests`:\n- `test_exponential_curve` — verify delays double each attempt (within jitter range)\n- `test_cap_at_one_hour` — attempt 20+ still produces delay <= MAX_DELAY_MS * 1.1\n- `test_jitter_range` — run 100 iterations, all delays within [0.9x, 1.1x] of base\n- `test_first_retry_is_about_one_second` — attempt 1 produces ~1000ms delay\n- `test_overflow_safety` — very large attempt_count doesn't panic\nGREEN: Implement compute_next_attempt_at()\nVERIFY: `cargo test backoff`\n\n## Edge Cases\n- `attempt_count` > 30: `.min(30)` caps the shift amount, keeping the base delay far below the 1-hour cap\n- `attempt_count` = 0: not used in practice (callers pass `attempt_count + 1`)\n- `attempt_count` = 1: delay is ~1 second (first retry)\n- Negative attempt_count: `.min(30)` preserves the negative value and `as u32` wraps it to a huge shift amount (shift-overflow panic in debug; a masked, meaningless shift in release — `saturating_mul` does not guard the shift itself); callers must pass non-negative counts, or use `.clamp(0, 30)` instead of `.min(30)`","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:27:09.474Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:57:24.900137Z","closed_at":"2026-01-30T16:57:24.899942Z","close_reason":"Completed: compute_next_attempt_at with exp backoff (1s base, 1h cap, +-10% jitter), i64 params matching SQLite, overflow-safe, 5 tests pass","compaction_level":0,"original_size":0}
{"id":"bd-mk3","title":"Update ingest command for merge_requests type","description":"## Background\nCLI entry point for MR ingestion. Routes `--type=merge_requests` to the orchestrator. Must ensure `--full` resets both MR cursor AND discussion watermarks. This is the user-facing command that kicks off the entire MR sync pipeline.\n\n## Approach\nUpdate `src/cli/commands/ingest.rs` to handle `merge_requests` type:\n1. Add `merge_requests` branch to the resource type match statement\n2. Validate resource type early with helpful error message\n3. Pass `full` flag through to orchestrator (it handles the watermark reset internally)\n\n## Files\n- `src/cli/commands/ingest.rs` - Add merge_requests branch to `run_ingest`\n\n## Acceptance Criteria\n- [ ] `gi ingest --type=merge_requests` runs MR ingestion successfully\n- [ ] `gi ingest --type=merge_requests --full` resets cursor AND discussion watermarks\n- [ ] `gi ingest --type=invalid` returns helpful error listing valid types\n- [ ] Progress output shows MR counts, discussion counts, and skip counts\n- [ ] Default type remains `issues` for backward compatibility\n- [ ] `cargo test ingest_command` passes\n\n## TDD Loop\nRED: `gi ingest --type=merge_requests` -> \"invalid type: merge_requests\"\nGREEN: Add merge_requests to match statement in run_ingest\nVERIFY: `gi ingest --type=merge_requests --help` shows merge_requests as valid\n\n## Function Signature\n```rust\npub async fn run_ingest(\n config: &Config,\n args: &IngestArgs,\n) -> Result<(), GiError>\n```\n\n## IngestArgs Reference (existing)\n```rust\n#[derive(Parser, Debug)]\npub struct IngestArgs {\n /// Resource type to ingest\n #[arg(long, short = 't', default_value = \"issues\")]\n pub r#type: String,\n \n /// Filter to specific project (by path or ID)\n #[arg(long, short = 'p')]\n pub project: Option<String>,\n \n /// Force run even if another ingest is in progress\n #[arg(long, short = 'f')]\n pub force: bool,\n \n /// Full sync - reset cursor and refetch 
all\n #[arg(long)]\n pub full: bool,\n}\n```\n\n## Code Change\n```rust\nuse crate::core::errors::GiError;\nuse crate::ingestion::orchestrator::Orchestrator;\n\npub async fn run_ingest(\n config: &Config,\n args: &IngestArgs,\n) -> Result<(), GiError> {\n let resource_type = args.r#type.as_str();\n \n // Validate resource type early\n match resource_type {\n \"issues\" | \"merge_requests\" => {}\n _ => {\n return Err(GiError::InvalidArgument {\n name: \"type\".to_string(),\n value: resource_type.to_string(),\n expected: \"issues or merge_requests\".to_string(),\n });\n }\n }\n \n // Acquire single-flight lock (unless --force)\n if !args.force {\n acquire_ingest_lock(config, resource_type)?;\n }\n \n // Get projects to ingest (filtered if --project specified)\n let projects = get_projects_to_ingest(config, args.project.as_deref())?;\n \n for project in projects {\n println!(\"Ingesting {} for {}...\", resource_type, project.path);\n \n let orchestrator = Orchestrator::new(\n &config,\n project.id,\n project.gitlab_id,\n )?;\n \n let result = orchestrator.run_ingestion(resource_type, args.full).await?;\n \n // Print results based on resource type\n match resource_type {\n \"issues\" => {\n println!(\" {}: {} issues fetched, {} upserted\",\n project.path, result.issues_fetched, result.issues_upserted);\n }\n \"merge_requests\" => {\n println!(\" {}: {} MRs fetched, {} new labels, {} assignees, {} reviewers\",\n project.path,\n result.mrs_fetched,\n result.labels_created,\n result.assignees_linked,\n result.reviewers_linked,\n );\n println!(\" Discussions: {} synced, {} notes ({} DiffNotes)\",\n result.discussions_synced,\n result.notes_synced,\n result.diffnotes_count,\n );\n if result.mrs_skipped_discussion_sync > 0 {\n println!(\" Skipped discussion sync for {} unchanged MRs\",\n result.mrs_skipped_discussion_sync);\n }\n if result.failed_discussion_syncs > 0 {\n eprintln!(\" Warning: {} MRs failed discussion sync (will retry next run)\",\n 
result.failed_discussion_syncs);\n }\n }\n _ => unreachable!(),\n }\n }\n \n // Release lock\n if !args.force {\n release_ingest_lock(config, resource_type)?;\n }\n \n Ok(())\n}\n```\n\n## Output Format\n```\nIngesting merge_requests for group/project-one...\n group/project-one: 567 MRs fetched, 12 new labels, 89 assignees, 45 reviewers\n Discussions: 456 synced, 1,234 notes (89 DiffNotes)\n Skipped discussion sync for 444 unchanged MRs\n\nTotal: 567 MRs, 456 discussions, 1,234 notes\n```\n\n## Full Sync Behavior\nWhen `--full` is passed:\n1. MR cursor reset to NULL (handled by `ingest_merge_requests` with `full_sync: true`)\n2. Discussion watermarks reset to NULL (handled by `reset_discussion_watermarks` called from ingestion)\n3. All MRs re-fetched from GitLab API\n4. All discussions re-fetched for every MR\n\n## Error Types (from GiError enum)\n```rust\n// In src/core/errors.rs\npub enum GiError {\n InvalidArgument {\n name: String,\n value: String,\n expected: String,\n },\n LockError {\n resource: String,\n message: String,\n },\n // ... other variants\n}\n```\n\n## Edge Cases\n- Default type is `issues` for backward compatibility with CP1\n- Project filter (`--project`) can limit to specific project by path or ID\n- Force flag (`--force`) bypasses single-flight lock for debugging\n- If no projects configured, return helpful error about running `gi project add` first\n- Empty project (no MRs): completes successfully with \"0 MRs fetched\"","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:43.034952Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:28:52.711235Z","closed_at":"2026-01-27T00:28:52.711166Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-mk3","depends_on_id":"bd-10f","type":"blocks","created_at":"2026-01-26T22:08:55.003544Z","created_by":"tayloreernisse"}]}
{"id":"bd-o7b","title":"[CP1] gi show issue command","description":"## Background\n\nThe `gi show issue <iid>` command displays detailed information about a single issue including metadata, description, labels, and all discussions with their notes. It provides a complete view similar to the GitLab web UI.\n\n## Approach\n\n### Module: src/cli/commands/show.rs\n\n### Clap Definition\n\n```rust\n#[derive(Args)]\npub struct ShowArgs {\n /// Entity type\n #[arg(value_parser = [\"issue\", \"mr\"])]\n pub entity: String,\n\n /// Entity IID\n pub iid: i64,\n\n /// Project path (required if ambiguous)\n #[arg(long)]\n pub project: Option<String>,\n}\n```\n\n### Handler Function\n\n```rust\npub async fn handle_show(args: ShowArgs, conn: &Connection) -> Result<()>\n```\n\n### Logic (for entity=\"issue\")\n\n1. **Find issue**: Query by iid, optionally filtered by project\n - If multiple projects have same iid, require --project or error\n2. **Load metadata**: title, state, author, created_at, updated_at, web_url\n3. **Load labels**: JOIN through issue_labels to labels table\n4. **Load discussions**: All discussions for this issue\n5. **Load notes**: All notes for each discussion, ordered by position\n6. 
**Format output**: Rich display with sections\n\n### Output Format (matches PRD)\n\n```\nIssue #1234: Authentication redesign\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n\nProject: group/project-one\nState: opened\nAuthor: @johndoe\nCreated: 2024-01-15\nUpdated: 2024-03-20\nLabels: enhancement, auth\nURL: https://gitlab.example.com/group/project-one/-/issues/1234\n\nDescription:\n We need to redesign the authentication flow to support...\n\nDiscussions (5):\n\n @janedoe (2024-01-16):\n I agree we should move to JWT-based auth...\n\n @johndoe (2024-01-16):\n What about refresh token strategy?\n\n @bobsmith (2024-01-17):\n Have we considered OAuth2?\n```\n\n### Queries\n\n```sql\n-- Find issue\nSELECT i.*, p.path as project_path\nFROM issues i\nJOIN projects p ON i.project_id = p.id\nWHERE i.iid = ? AND (p.path = ? OR ? IS NULL)\n\n-- Get labels\nSELECT l.name FROM labels l\nJOIN issue_labels il ON l.id = il.label_id\nWHERE il.issue_id = ?\n\n-- Get discussions with notes\nSELECT d.*, n.* FROM discussions d\nJOIN notes n ON d.id = n.discussion_id\nWHERE d.issue_id = ?\nORDER BY d.first_note_at, n.position\n```\n\n## Acceptance Criteria\n\n- [ ] Shows issue metadata (title, state, author, dates, URL)\n- [ ] Shows labels as comma-separated list\n- [ ] Shows description (truncated if very long)\n- [ ] Shows discussions grouped with notes indented\n- [ ] Handles --project filter correctly\n- [ ] Errors clearly if iid is ambiguous without --project\n\n## Files\n\n- src/cli/commands/mod.rs (add `pub mod show;`)\n- src/cli/commands/show.rs (create)\n- src/cli/mod.rs (add Show variant to Commands enum)\n\n## TDD Loop\n\nRED:\n```rust\n#[tokio::test] async fn show_issue_displays_metadata()\n#[tokio::test] async fn show_issue_displays_labels()\n#[tokio::test] async fn show_issue_displays_discussions()\n#[tokio::test] async fn show_issue_requires_project_when_ambiguous()\n```\n\nGREEN: Implement handler with queries and 
formatting\n\nVERIFY: `cargo test show_issue`\n\n## Edge Cases\n\n- Issue with no labels - show \"Labels: (none)\"\n- Issue with no discussions - show \"Discussions: (none)\"\n- Issue with very long description - truncate with \"...\"\n- System notes in discussions - filter out or show with [system] prefix\n- Individual notes (not threaded) - show without reply indentation","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-25T17:02:38.384702Z","created_by":"tayloreernisse","updated_at":"2026-01-25T23:05:25.688102Z","closed_at":"2026-01-25T23:05:25.688043Z","close_reason":"Implemented gi show issue command with metadata, labels, and discussions display","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-o7b","depends_on_id":"bd-208","type":"blocks","created_at":"2026-01-25T17:04:05.701560Z","created_by":"tayloreernisse"},{"issue_id":"bd-o7b","depends_on_id":"bd-hbo","type":"blocks","created_at":"2026-01-25T17:04:05.725767Z","created_by":"tayloreernisse"}]}
{"id":"bd-ozy","title":"[CP1] Ingestion orchestrator","description":"## Background\n\nThe ingestion orchestrator coordinates issue sync followed by dependent discussion sync. It implements the CP1 canonical pattern: fetch issues, identify which need discussion sync (updated_at advanced), then execute discussion sync with bounded concurrency.\n\n## Approach\n\n### Module: src/ingestion/orchestrator.rs\n\n### Main Function\n\n```rust\npub async fn ingest_project_issues(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n project_id: i64, // Local DB project ID\n gitlab_project_id: i64,\n) -> Result<IngestProjectResult>\n\n#[derive(Debug, Default)]\npub struct IngestProjectResult {\n pub issues_fetched: usize,\n pub issues_upserted: usize,\n pub labels_created: usize,\n pub discussions_fetched: usize,\n pub notes_fetched: usize,\n pub system_notes_count: usize,\n pub issues_skipped_discussion_sync: usize,\n}\n```\n\n### Orchestration Steps\n\n1. **Call issue ingestion**: `ingest_issues(conn, client, config, project_id, gitlab_project_id)`\n2. **Get issues needing discussion sync**: From IngestIssuesResult.issues_needing_discussion_sync\n3. **Execute bounded discussion sync**:\n - Use `tokio::task::LocalSet` for single-threaded runtime\n - Respect `config.sync.dependent_concurrency` (default: 5)\n - For each IssueForDiscussionSync:\n - Call `ingest_issue_discussions(...)`\n - Aggregate results\n4. 
**Calculate skipped count**: total_issues - issues_needing_discussion_sync.len()\n\n### Bounded Concurrency Pattern\n\n```rust\nuse futures::stream::{self, StreamExt};\n\nlet local_set = LocalSet::new();\nlocal_set.run_until(async {\n stream::iter(issues_needing_sync)\n .map(|issue| async {\n ingest_issue_discussions(\n conn, client, config,\n project_id, gitlab_project_id,\n issue.iid, issue.local_issue_id, issue.updated_at,\n ).await\n })\n .buffer_unordered(config.sync.dependent_concurrency)\n .try_collect::<Vec<_>>()\n .await\n}).await\n```\n\nNote: Single-threaded runtime means concurrency is I/O-bound, not parallel execution.\n\n## Acceptance Criteria\n\n- [ ] Orchestrator calls issue ingestion first\n- [ ] Only issues with updated_at > discussions_synced_for_updated_at get discussion sync\n- [ ] Bounded concurrency respects dependent_concurrency config\n- [ ] Results aggregated from both issue and discussion ingestion\n- [ ] issues_skipped_discussion_sync accurately reflects unchanged issues\n\n## Files\n\n- src/ingestion/mod.rs (add `pub mod orchestrator;`)\n- src/ingestion/orchestrator.rs (create)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/orchestrator_tests.rs\n#[tokio::test] async fn orchestrates_issue_then_discussion_sync()\n#[tokio::test] async fn skips_discussion_sync_for_unchanged_issues()\n#[tokio::test] async fn respects_bounded_concurrency()\n#[tokio::test] async fn aggregates_results_correctly()\n```\n\nGREEN: Implement orchestrator with bounded concurrency\n\nVERIFY: `cargo test orchestrator`\n\n## Edge Cases\n\n- All issues unchanged - no discussion sync calls\n- All issues new - all get discussion sync\n- dependent_concurrency=1 - sequential discussion fetches\n- Issue ingestion fails - orchestrator returns error, no discussion 
sync","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.289941Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:54:07.447647Z","closed_at":"2026-01-25T22:54:07.447577Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-ozy","depends_on_id":"bd-208","type":"blocks","created_at":"2026-01-25T17:04:05.583955Z","created_by":"tayloreernisse"},{"issue_id":"bd-ozy","depends_on_id":"bd-hbo","type":"blocks","created_at":"2026-01-25T17:04:05.605851Z","created_by":"tayloreernisse"}]}
{"id":"bd-pgdw","title":"OBSERV: Add root tracing span with run_id to sync and ingest","description":"## Background\nA root tracing span per command invocation provides the top of the span hierarchy. All child spans (ingest_issues, fetch_pages, etc.) inherit the run_id field, making every log line within a run filterable by jq.\n\n## Approach\nIn run_sync() (src/cli/commands/sync.rs:54), after generating run_id, create a root span:\n\n```rust\npub async fn run_sync(config: &Config, options: SyncOptions) -> Result<SyncResult> {\n let run_id = &uuid::Uuid::new_v4().to_string()[..8];\n let _root = tracing::info_span!(\"sync\", %run_id).entered();\n // ... existing sync pipeline code\n}\n```\n\nIn run_ingest() (src/cli/commands/ingest.rs:107), same pattern:\n\n```rust\npub async fn run_ingest(...) -> Result<IngestResult> {\n let run_id = &uuid::Uuid::new_v4().to_string()[..8];\n let _root = tracing::info_span!(\"ingest\", %run_id, resource_type).entered();\n // ... existing ingest code\n}\n```\n\nCRITICAL: The _root guard must live for the entire function scope. If it drops early (e.g., shadowed or moved into a block), child spans lose their parent context. Use let _root (underscore prefix) to signal intentional unused binding that's kept alive for its Drop impl.\n\nFor async functions, use .entered() NOT .enter(). In async Rust, Span::enter() returns a guard that is NOT Send, which prevents the future from being sent across threads. However, .entered() on an info_span! creates an Entered<Span> which is also !Send. For async, prefer:\n\n```rust\nlet root_span = tracing::info_span!(\"sync\", %run_id);\nasync move {\n // ... body\n}.instrument(root_span).await\n```\n\nOr use #[instrument] on the function itself with the run_id field.\n\n## Acceptance Criteria\n- [ ] Root span established for every sync and ingest invocation\n- [ ] run_id appears in span context of all child log lines\n- [ ] jq 'select(.spans[]? 
| .run_id)' can extract all lines from a run\n- [ ] Span is active for entire function duration (not dropped early)\n- [ ] Works correctly with async/await (span propagated across .await points)\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/sync.rs (add root span in run_sync, line ~54)\n- src/cli/commands/ingest.rs (add root span in run_ingest, line ~107)\n\n## TDD Loop\nRED: test_root_span_propagates_run_id (capture JSON log output, verify run_id in span context)\nGREEN: Add root spans to run_sync and run_ingest\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Async span propagation: .entered() is !Send. For async functions, use .instrument() or #[instrument]. The run_sync function is async (line 54: pub async fn run_sync).\n- Nested command calls: run_sync calls run_ingest internally. If both create root spans, we get a nested hierarchy: sync > ingest. This is correct behavior -- the ingest span becomes a child of sync.\n- Span storage: tracing-subscriber registry handles span storage automatically. No manual setup needed beyond adding the layer.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:54:07.771605Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:19:33.006274Z","closed_at":"2026-02-04T17:19:33.006227Z","close_reason":"Added root tracing spans with run_id to run_sync() and run_ingest() using .instrument() pattern for async compatibility","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-pgdw","depends_on_id":"bd-2ni","type":"parent-child","created_at":"2026-02-04T15:54:07.772319Z","created_by":"tayloreernisse"},{"issue_id":"bd-pgdw","depends_on_id":"bd-37qw","type":"blocks","created_at":"2026-02-04T15:55:19.742022Z","created_by":"tayloreernisse"}]}
{"id":"bd-pr1","title":"Implement lore stats CLI command","description":"## Background\nThe stats command provides visibility into the document/search/embedding pipeline health. It reports counts (DocumentStats, EmbeddingStats, FtsStats, QueueStats), verifies consistency between tables (--check), and repairs inconsistencies (--repair). This is essential for diagnosing sync issues and validating Gate A/B/C correctness.\n\n## Approach\nCreate `src/cli/commands/stats.rs` per PRD Section 4.6.\n\n**Stats structs (PRD-exact):**\n```rust\n#[derive(Debug, Serialize)]\npub struct Stats {\n pub documents: DocumentStats,\n pub embeddings: EmbeddingStats,\n pub fts: FtsStats,\n pub queues: QueueStats,\n}\n\n#[derive(Debug, Serialize)]\npub struct DocumentStats {\n pub issues: usize,\n pub mrs: usize,\n pub discussions: usize,\n pub total: usize,\n pub truncated: usize,\n}\n\n#[derive(Debug, Serialize)]\npub struct EmbeddingStats {\n /// Documents with at least one embedding (chunk_index=0 exists in embedding_metadata)\n pub embedded: usize,\n pub pending: usize,\n pub failed: usize,\n /// embedded / total_documents * 100 (document-level, not chunk-level)\n pub coverage_pct: f64,\n /// Total chunks across all embedded documents\n pub total_chunks: usize,\n}\n\n#[derive(Debug, Serialize)]\npub struct FtsStats { pub indexed: usize }\n\n#[derive(Debug, Serialize)]\npub struct QueueStats {\n pub dirty_sources: usize,\n pub dirty_sources_failed: usize,\n pub pending_discussion_fetches: usize,\n pub pending_discussion_fetches_failed: usize,\n}\n```\n\n**IntegrityCheck struct (PRD-exact):**\n```rust\n#[derive(Debug, Serialize)]\npub struct IntegrityCheck {\n pub documents_count: usize,\n pub fts_count: usize,\n pub embeddings_count: usize,\n pub metadata_count: usize,\n pub orphaned_embeddings: usize,\n pub hash_mismatches: usize,\n pub ok: bool,\n}\n```\n\n**RepairResult struct (PRD-exact):**\n```rust\n#[derive(Debug, Serialize)]\npub struct RepairResult {\n pub 
orphaned_embeddings_deleted: usize,\n pub stale_embeddings_cleared: usize,\n pub missing_fts_repopulated: usize,\n}\n```\n\n**Core functions:**\n- `run_stats(config) -> Result<Stats>` — gather all stats\n- `run_integrity_check(config) -> Result<IntegrityCheck>` — verify consistency\n- `run_repair(config) -> Result<RepairResult>` — fix issues\n\n**Integrity checks (per PRD):**\n1. documents count == documents_fts count\n2. All `embeddings.rowid / 1000` map to valid `documents.id` (orphan detection)\n3. `embedding_metadata.document_hash == documents.content_hash` for chunk_index=0 rows (staleness uses `document_hash`, NOT `chunk_hash`)\n\n**Repair operations (PRD-exact):**\n1. Delete orphaned embedding_metadata (document_id NOT IN documents)\n2. Delete orphaned vec0 rows: `DELETE FROM embeddings WHERE rowid / 1000 NOT IN (SELECT id FROM documents)` — uses `rowid / 1000` for chunked scheme\n3. Clear stale embeddings: find documents where `embedding_metadata.document_hash != documents.content_hash` (chunk_index=0 comparison), delete ALL chunks for those docs (range-based: `rowid >= doc_id * 1000 AND rowid < (doc_id + 1) * 1000`)\n4. FTS rebuild: `INSERT INTO documents_fts(documents_fts) VALUES('rebuild')` — full rebuild, NOT optimize. 
PRD note: partial fix is fragile with external-content FTS; rebuild is guaranteed correct.\n\n**CLI args (PRD-exact):**\n```rust\n#[derive(Args)]\npub struct StatsArgs {\n #[arg(long)]\n check: bool,\n #[arg(long, requires = \"check\")]\n repair: bool, // --repair requires --check\n}\n```\n\n## Acceptance Criteria\n- [ ] Document counts by type: issues, mrs, discussions, total, truncated\n- [ ] Embedding coverage is document-level (not chunk-level): `embedded / total * 100`\n- [ ] Embedding stats include total_chunks count\n- [ ] FTS indexed count reported\n- [ ] Queue stats: dirty_sources + dirty_sources_failed, pending_discussion_fetches + pending_discussion_fetches_failed\n- [ ] --check verifies: FTS count == documents count, orphan embeddings, hash mismatches\n- [ ] Orphan detection uses `rowid / 1000` for chunked embedding scheme\n- [ ] Hash mismatch uses `document_hash` (not `chunk_hash`) for document-level staleness\n- [ ] --repair deletes orphaned embeddings (range-based for chunks)\n- [ ] --repair clears stale metadata (document_hash != content_hash at chunk_index=0)\n- [ ] --repair uses FTS `rebuild` (not `optimize`) for correct-by-construction repair\n- [ ] --repair requires --check (Clap `requires` attribute)\n- [ ] Human output: formatted with aligned columns\n- [ ] JSON output: `{\"ok\": true, \"data\": stats}`\n- [ ] `cargo build` succeeds\n\n## Files\n- `src/cli/commands/stats.rs` — new file\n- `src/cli/commands/mod.rs` — add `pub mod stats;`\n- `src/cli/mod.rs` — add StatsArgs, wire up stats subcommand\n- `src/main.rs` — add stats command handler\n\n## TDD Loop\nRED: Integration tests:\n- `test_stats_empty_db` — all counts 0, coverage 0%\n- `test_stats_with_documents` — correct counts by type\n- `test_integrity_check_healthy` — ok=true when consistent\n- `test_integrity_check_fts_mismatch` — detects FTS/doc count divergence\n- `test_integrity_check_orphan_embeddings` — detects orphaned rowids\n- `test_repair_rebuilds_fts` — FTS count matches after 
repair\n- `test_repair_cleans_orphans` — orphaned embeddings deleted\n- `test_repair_clears_stale` — stale metadata cleared (doc_hash mismatch)\nGREEN: Implement stats, integrity check, repair\nVERIFY: `cargo build && cargo test stats`\n\n## Edge Cases\n- Empty database: all counts 0, coverage 0%, no integrity issues\n- Gate A only (no embeddings table): skip embedding stats gracefully\n- --repair on healthy DB: no-op, reports \"no issues found\" / zero counts\n- FTS rebuild on large DB: may be slow\n- --repair without --check: Clap rejects (requires attribute enforces dependency)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:50.232629Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:54:31.065586Z","closed_at":"2026-01-30T17:54:31.065501Z","close_reason":"Implemented stats CLI with document counts by type, embedding coverage, FTS index count, queue stats, --check integrity (FTS mismatch, orphan embeddings, stale metadata), --repair (rebuild FTS, delete orphans, clear stale). Human + JSON output. Builds clean.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-pr1","depends_on_id":"bd-3qs","type":"blocks","created_at":"2026-01-30T15:29:24.806108Z","created_by":"tayloreernisse"}]}
{"id":"bd-ser","title":"Implement MR ingestion module","description":"## Background\nMR ingestion module with cursor-based sync. Follows the same pattern as issue ingestion from CP1. Discussion sync eligibility is determined via DB query AFTER ingestion (not in-memory collection) to avoid memory growth on large projects.\n\n## Approach\nCreate `src/ingestion/merge_requests.rs` with:\n1. `IngestMergeRequestsResult` - Aggregated stats\n2. `ingest_merge_requests()` - Main ingestion function\n3. `upsert_merge_request()` - Single MR upsert\n4. Helper functions for labels, assignees, reviewers, cursor management\n\n## Files\n- `src/ingestion/merge_requests.rs` - New module\n- `src/ingestion/mod.rs` - Export new module\n- `tests/mr_ingestion_tests.rs` - Integration tests\n\n## Acceptance Criteria\n- [ ] `IngestMergeRequestsResult` has: fetched, upserted, labels_created, assignees_linked, reviewers_linked\n- [ ] `ingest_merge_requests()` returns `Result<IngestMergeRequestsResult>`\n- [ ] Page-boundary cursor updates (not item-count modulo)\n- [ ] Tuple-based cursor filtering: `(updated_at, gitlab_id)`\n- [ ] Transaction per MR for atomicity\n- [ ] Raw payload stored for each MR\n- [ ] Labels: clear-and-relink pattern (removes stale)\n- [ ] Assignees: clear-and-relink pattern\n- [ ] Reviewers: clear-and-relink pattern\n- [ ] `reset_discussion_watermarks()` for --full sync\n- [ ] `cargo test mr_ingestion` passes\n\n## TDD Loop\nRED: `cargo test ingest_mr` -> module not found\nGREEN: Add ingestion module with full logic\nVERIFY: `cargo test mr_ingestion`\n\n## Main Function Signature\n```rust\npub async fn ingest_merge_requests(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n project_id: i64, // Local DB project ID\n gitlab_project_id: i64, // GitLab project ID\n full_sync: bool, // Reset cursor if true\n) -> Result<IngestMergeRequestsResult>\n```\n\n## Ingestion Loop (page-based)\n```rust\nlet mut page = 1u32;\nloop {\n let page_result = 
client.fetch_merge_requests_page(...).await?;\n \n for mr in &page_result.items {\n // Tuple cursor filtering\n if let (Some(cursor_ts), Some(cursor_id)) = (cursor_updated_at, cursor_gitlab_id) {\n if mr_updated_at < cursor_ts { continue; }\n if mr_updated_at == cursor_ts && mr.id <= cursor_id { continue; }\n }\n \n // Begin transaction\n let tx = conn.unchecked_transaction()?;\n \n // Store raw payload\n let payload_id = store_payload(&tx, ...)?;\n \n // Transform and upsert\n let transformed = transform_merge_request(&mr, project_id)?;\n let upsert_result = upsert_merge_request(&tx, &transformed.merge_request, payload_id)?;\n \n // Clear-and-relink labels\n clear_mr_labels(&tx, local_mr_id)?;\n for label in &labels { ... }\n \n // Clear-and-relink assignees\n clear_mr_assignees(&tx, local_mr_id)?;\n for username in &transformed.assignee_usernames { ... }\n \n // Clear-and-relink reviewers\n clear_mr_reviewers(&tx, local_mr_id)?;\n for username in &transformed.reviewer_usernames { ... }\n \n tx.commit()?;\n \n // Track for cursor\n last_updated_at = Some(mr_updated_at);\n last_gitlab_id = Some(mr.id);\n }\n \n // Page-boundary cursor flush\n if let (Some(updated_at), Some(gitlab_id)) = (last_updated_at, last_gitlab_id) {\n update_cursor(conn, project_id, \"merge_requests\", updated_at, gitlab_id)?;\n }\n \n if page_result.is_last_page { break; }\n page = page_result.next_page.unwrap_or(page + 1);\n}\n```\n\n## Full Sync Watermark Reset\n```rust\nfn reset_discussion_watermarks(conn: &Connection, project_id: i64) -> Result<()> {\n conn.execute(\n \"UPDATE merge_requests\n SET discussions_synced_for_updated_at = NULL,\n discussions_sync_attempts = 0,\n discussions_sync_last_error = NULL\n WHERE project_id = ?\",\n [project_id],\n )?;\n Ok(())\n}\n```\n\n## DB Helper Functions\n- `get_cursor(conn, project_id) -> (Option<i64>, Option<i64>)` - Get (updated_at, gitlab_id)\n- `update_cursor(conn, project_id, resource_type, updated_at, gitlab_id)`\n- `reset_cursor(conn, 
project_id, resource_type)`\n- `upsert_merge_request(conn, mr, payload_id) -> Result<UpsertMergeRequestResult>`\n- `clear_mr_labels(conn, mr_id)`\n- `link_mr_label(conn, mr_id, label_id)`\n- `clear_mr_assignees(conn, mr_id)`\n- `upsert_mr_assignee(conn, mr_id, username)`\n- `clear_mr_reviewers(conn, mr_id)`\n- `upsert_mr_reviewer(conn, mr_id, username)`\n\n## Edge Cases\n- Cursor rewind may cause refetch of already-seen MRs (tuple filtering handles this)\n- Large projects: 10k+ MRs - page-based cursor prevents massive refetch on crash\n- Labels/assignees/reviewers may change - clear-and-relink ensures correctness","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:41.967459Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:15:24.526208Z","closed_at":"2026-01-27T00:15:24.526142Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-ser","depends_on_id":"bd-34o","type":"blocks","created_at":"2026-01-26T22:08:54.519486Z","created_by":"tayloreernisse"},{"issue_id":"bd-ser","depends_on_id":"bd-3ir","type":"blocks","created_at":"2026-01-26T22:08:54.440174Z","created_by":"tayloreernisse"},{"issue_id":"bd-ser","depends_on_id":"bd-iba","type":"blocks","created_at":"2026-01-26T22:08:54.593550Z","created_by":"tayloreernisse"}]}
{"id":"bd-sqw","title":"Add Resource Events API endpoints to GitLab client","description":"## Background\nNeed paginated fetching of state/label/milestone events per entity from GitLab Resource Events APIs. The existing client uses reqwest with rate limiting and has stream_issues/stream_merge_requests patterns for paginated endpoints. However, resource events are per-entity (not project-wide), so they should return Vec rather than use streaming.\n\nExisting pagination pattern in client.rs: follow Link headers with per_page=100.\n\n## Approach\nAdd to src/gitlab/client.rs a generic helper and 6 endpoint methods:\n\n1. Generic paginated fetch helper (if not already present):\n```rust\nasync fn fetch_all_pages<T: DeserializeOwned>(&self, url: &str) -> Result<Vec<T>> {\n let mut results = Vec::new();\n let mut next_url = Some(url.to_string());\n while let Some(current_url) = next_url {\n self.rate_limiter.lock().unwrap().wait();\n let resp = self.client.get(&current_url)\n .header(\"PRIVATE-TOKEN\", &self.token)\n .query(&[(\"per_page\", \"100\")])\n .send().await?;\n // ... parse Link header for next page\n let page: Vec<T> = resp.json().await?;\n results.extend(page);\n next_url = parse_next_link(&resp_headers);\n }\n Ok(results)\n}\n```\n\n2. 
Six endpoint methods:\n```rust\npub async fn fetch_issue_state_events(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabStateEvent>>\npub async fn fetch_issue_label_events(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabLabelEvent>>\npub async fn fetch_issue_milestone_events(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabMilestoneEvent>>\npub async fn fetch_mr_state_events(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabStateEvent>>\npub async fn fetch_mr_label_events(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabLabelEvent>>\npub async fn fetch_mr_milestone_events(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabMilestoneEvent>>\n```\n\nURL patterns:\n- Issues: `/api/v4/projects/{project_id}/issues/{iid}/resource_{type}_events`\n- MRs: `/api/v4/projects/{project_id}/merge_requests/{iid}/resource_{type}_events`\n\n3. Consider a convenience method that fetches all 3 event types for an entity in one call:\n```rust\npub async fn fetch_all_resource_events(&self, project_id: i64, entity_type: &str, iid: i64) \n -> Result<(Vec<GitLabStateEvent>, Vec<GitLabLabelEvent>, Vec<GitLabMilestoneEvent>)>\n```\n\n## Acceptance Criteria\n- [ ] All 6 endpoints construct correct URLs\n- [ ] Pagination follows Link headers (handles entities with >100 events)\n- [ ] Rate limiter respected for each page request\n- [ ] 404 returns GitLabNotFound error (entity may have been deleted)\n- [ ] Network errors wrapped in GitLabNetworkError\n- [ ] Types from bd-2fm used for deserialization\n\n## Files\n- src/gitlab/client.rs (add methods + optionally generic helper)\n\n## TDD Loop\nRED: Add to tests/gitlab_client_tests.rs (or new file):\n- `test_fetch_issue_state_events_url` - verify URL construction (mock or inspect)\n- `test_fetch_mr_label_events_url` - verify URL construction\n- Note: Full integration tests require a mock HTTP server (mockito or wiremock). 
If the project doesn't already have one, write URL-construction unit tests only.\n\nGREEN: Implement the 6 methods using the generic helper\n\nVERIFY: `cargo test gitlab_client -- --nocapture && cargo build`\n\n## Edge Cases\n- project_id here is the GitLab project ID (not local DB id) — callers must pass gitlab_project_id\n- Empty results (new entity with no events) should return Ok(Vec::new()), not error\n- GitLab returns 403 for projects where Resource Events API is disabled — map to appropriate error\n- Very old entities may have thousands of events — pagination is essential\n- Rate limiter must be called per-page, not per-entity","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:24.137296Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:19:18.432602Z","closed_at":"2026-02-03T16:19:18.432559Z","close_reason":"Added fetch_all_pages<T> generic paginator, 6 per-entity endpoint methods (state/label/milestone for issues and MRs), and fetch_all_resource_events convenience method in src/gitlab/client.rs.","compaction_level":0,"original_size":0,"labels":["api","gate-1","phase-b"],"dependencies":[{"issue_id":"bd-sqw","depends_on_id":"bd-2fm","type":"blocks","created_at":"2026-02-02T21:32:06.101374Z","created_by":"tayloreernisse"},{"issue_id":"bd-sqw","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:24.138647Z","created_by":"tayloreernisse"}]}
{"id":"bd-tir","title":"Implement generic dependent fetch queue (enqueue + drain)","description":"## Background\nThe pending_dependent_fetches table (migration 011) provides a generic job queue for all dependent resource fetches across Gates 1, 2, and 4. This module implements the queue operations: enqueue, claim, complete, fail, and stale lock reclamation. It generalizes the existing discussion_queue.rs pattern.\n\n## Approach\nCreate src/core/dependent_queue.rs with:\n\n```rust\nuse rusqlite::Connection;\nuse super::error::Result;\n\n/// A pending job from the dependent fetch queue.\npub struct PendingJob {\n pub id: i64,\n pub project_id: i64,\n pub entity_type: String, // \"issue\" | \"merge_request\"\n pub entity_iid: i64,\n pub entity_local_id: i64,\n pub job_type: String, // \"resource_events\" | \"mr_closes_issues\" | \"mr_diffs\"\n pub payload_json: Option<String>,\n pub attempts: i32,\n}\n\n/// Enqueue a dependent fetch job. Idempotent via UNIQUE constraint (INSERT OR IGNORE).\npub fn enqueue_job(\n conn: &Connection,\n project_id: i64,\n entity_type: &str,\n entity_iid: i64,\n entity_local_id: i64,\n job_type: &str,\n payload_json: Option<&str>,\n) -> Result<bool> // returns true if actually inserted (not deduped)\n\n/// Claim a batch of jobs for processing. Atomically sets locked_at.\n/// Only claims jobs where locked_at IS NULL AND (next_retry_at IS NULL OR next_retry_at <= now).\npub fn claim_jobs(\n conn: &Connection,\n job_type: &str,\n batch_size: usize,\n) -> Result<Vec<PendingJob>>\n\n/// Mark a job as complete (DELETE the row).\npub fn complete_job(conn: &Connection, job_id: i64) -> Result<()>\n\n/// Mark a job as failed. 
Increment attempts, set next_retry_at with exponential backoff, clear locked_at.\n/// Backoff: 30s * 2^(attempts-1), capped at 480s.\npub fn fail_job(conn: &Connection, job_id: i64, error: &str) -> Result<()>\n\n/// Reclaim stale locks (locked_at older than threshold).\n/// Returns count of reclaimed jobs.\npub fn reclaim_stale_locks(conn: &Connection, stale_threshold_minutes: u32) -> Result<usize>\n\n/// Count pending jobs by job_type (for stats/progress).\npub fn count_pending_jobs(conn: &Connection) -> Result<HashMap<String, usize>>\n```\n\nRegister in src/core/mod.rs: `pub mod dependent_queue;`\n\n**Key implementation details:**\n- claim_jobs uses a two-step approach: SELECT ids WHERE available, then UPDATE SET locked_at for those ids. Use a single transaction.\n- enqueued_at = current time in ms epoch UTC\n- locked_at = current time in ms epoch UTC when claimed\n- Backoff formula: next_retry_at = now + min(30_000 * 2^(attempts-1), 480_000) ms\n\n## Acceptance Criteria\n- [ ] enqueue_job is idempotent (INSERT OR IGNORE on UNIQUE constraint)\n- [ ] enqueue_job returns true on insert, false on dedup\n- [ ] claim_jobs only claims unlocked, non-retrying jobs\n- [ ] claim_jobs respects batch_size limit\n- [ ] complete_job DELETEs the row\n- [ ] fail_job increments attempts, sets next_retry_at, clears locked_at, records last_error\n- [ ] Backoff: 30s, 60s, 120s, 240s, 480s (capped)\n- [ ] reclaim_stale_locks clears locked_at for jobs older than threshold\n- [ ] count_pending_jobs returns accurate counts by job_type\n\n## Files\n- src/core/dependent_queue.rs (new)\n- src/core/mod.rs (add `pub mod dependent_queue;`)\n\n## TDD Loop\nRED: tests/dependent_queue_tests.rs (new):\n- `test_enqueue_job_basic` - enqueue a job, verify it exists\n- `test_enqueue_job_idempotent` - enqueue same job twice, verify single row\n- `test_claim_jobs_batch` - enqueue 5, claim 3, verify 3 returned and locked\n- `test_claim_jobs_skips_locked` - lock a job, claim again, verify it's skipped\n- 
`test_claim_jobs_respects_retry_at` - set next_retry_at in future, verify skipped\n- `test_claim_jobs_includes_retryable` - set next_retry_at in past, verify claimed\n- `test_complete_job_deletes` - complete a job, verify gone\n- `test_fail_job_backoff` - fail 3 times, verify exponential next_retry_at values\n- `test_reclaim_stale_locks` - set old locked_at, reclaim, verify cleared\n\nSetup: create_test_db() with migrations 001-011, seed project + issue.\n\nGREEN: Implement all functions\n\nVERIFY: `cargo test dependent_queue -- --nocapture`\n\n## Edge Cases\n- claim_jobs with batch_size=0 should return empty vec (not error)\n- enqueue_job with invalid job_type will be rejected by CHECK constraint — map rusqlite error to LoreError\n- fail_job on a non-existent job_id should be a no-op (job may have been completed by another path)\n- reclaim_stale_locks with 0 threshold would reclaim everything — ensure threshold is reasonable (minimum 1 min)\n- Timestamps must use consistent ms epoch UTC (not seconds)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-02T21:31:57.290181Z","created_by":"tayloreernisse","updated_at":"2026-02-03T16:19:14.222626Z","closed_at":"2026-02-03T16:19:14.222579Z","close_reason":"Implemented PendingJob struct, enqueue_job, claim_jobs, complete_job, fail_job (with exponential backoff), reclaim_stale_locks, count_pending_jobs in src/core/dependent_queue.rs.","compaction_level":0,"original_size":0,"labels":["gate-1","phase-b","queue"],"dependencies":[{"issue_id":"bd-tir","depends_on_id":"bd-2zl","type":"parent-child","created_at":"2026-02-02T21:31:57.291894Z","created_by":"tayloreernisse"},{"issue_id":"bd-tir","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T21:31:57.292472Z","created_by":"tayloreernisse"}]}
{"id":"bd-v6i","title":"[CP1] gi ingest --type=issues command","description":"## Background\n\nThe `gi ingest --type=issues` command is the main entry point for issue ingestion. It acquires a single-flight lock, calls the orchestrator for each configured project, and outputs progress/summary to the user.\n\n## Approach\n\n### Module: src/cli/commands/ingest.rs\n\n### Clap Definition\n\n```rust\n#[derive(Args)]\npub struct IngestArgs {\n /// Resource type to ingest\n #[arg(long, value_parser = [\"issues\", \"merge_requests\"])]\n pub r#type: String,\n\n /// Filter to single project\n #[arg(long)]\n pub project: Option<String>,\n\n /// Override stale sync lock\n #[arg(long)]\n pub force: bool,\n}\n```\n\n### Handler Function\n\n```rust\npub async fn handle_ingest(args: IngestArgs, config: &Config) -> Result<()>\n```\n\n### Logic\n\n1. **Acquire single-flight lock**: `acquire_sync_lock(conn, args.force)?`\n2. **Get projects to sync**:\n - If `args.project` specified, filter to that one\n - Otherwise, get all configured projects from DB\n3. **For each project**:\n - Print \"Ingesting issues for {project_path}...\"\n - Call `ingest_project_issues(conn, client, config, project_id, gitlab_project_id)`\n - Print \"{N} issues fetched, {M} new labels\"\n4. **Print discussion sync summary**:\n - \"Fetching discussions ({N} issues with updates)...\"\n - \"{N} discussions, {M} notes (excluding {K} system notes)\"\n - \"Skipped discussion sync for {N} unchanged issues.\"\n5. 
**Release lock**: Lock auto-released when handler returns\n\n### Output Format (matches PRD)\n\n```\nIngesting issues...\n\n group/project-one: 1,234 issues fetched, 45 new labels\n\nFetching discussions (312 issues with updates)...\n\n group/project-one: 312 issues → 1,234 discussions, 5,678 notes\n\nTotal: 1,234 issues, 1,234 discussions, 5,678 notes (excluding 1,234 system notes)\nSkipped discussion sync for 922 unchanged issues.\n```\n\n## Acceptance Criteria\n\n- [ ] Clap args parse --type, --project, --force correctly\n- [ ] Single-flight lock acquired before sync starts\n- [ ] Lock error message is clear if concurrent run attempted\n- [ ] Progress output shows per-project counts\n- [ ] Summary includes unchanged issues skipped count\n- [ ] --force flag allows overriding stale lock\n\n## Files\n\n- src/cli/commands/mod.rs (add `pub mod ingest;`)\n- src/cli/commands/ingest.rs (create)\n- src/cli/mod.rs (add Ingest variant to Commands enum)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/cli_ingest_tests.rs\n#[tokio::test] async fn ingest_issues_acquires_lock()\n#[tokio::test] async fn ingest_issues_fails_on_concurrent_run()\n#[tokio::test] async fn ingest_issues_respects_project_filter()\n#[tokio::test] async fn ingest_issues_force_overrides_stale_lock()\n```\n\nGREEN: Implement handler with lock and orchestrator calls\n\nVERIFY: `cargo test cli_ingest`\n\n## Edge Cases\n\n- No projects configured - return early with helpful message\n- Project filter matches nothing - error with \"project not found\"\n- Lock already held - clear error \"Sync already in progress\"\n- Ctrl-C during sync - lock should be released (via Drop or SIGINT 
handler)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.312565Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:56:44.090142Z","closed_at":"2026-01-25T22:56:44.090086Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-v6i","depends_on_id":"bd-ozy","type":"blocks","created_at":"2026-01-25T17:04:05.629772Z","created_by":"tayloreernisse"}]}
{"id":"bd-xhz","title":"[CP1] GitLab client pagination methods","description":"## Background\n\nGitLab pagination methods enable fetching large result sets (issues, discussions) as async streams. The client uses `x-next-page` headers to determine continuation and applies cursor rewind for tuple-based incremental sync.\n\n## Approach\n\nAdd pagination methods to GitLabClient using `async-stream` crate:\n\n### Methods to Add\n\n```rust\nimpl GitLabClient {\n /// Paginate through issues for a project.\n pub fn paginate_issues(\n &self,\n gitlab_project_id: i64,\n updated_after: Option<i64>, // ms epoch cursor\n cursor_rewind_seconds: u32,\n ) -> Pin<Box<dyn Stream<Item = Result<GitLabIssue>> + Send + '_>>\n\n /// Paginate through discussions for an issue.\n pub fn paginate_issue_discussions(\n &self,\n gitlab_project_id: i64,\n issue_iid: i64,\n ) -> Pin<Box<dyn Stream<Item = Result<GitLabDiscussion>> + Send + '_>>\n\n /// Make request and return response with headers for pagination.\n async fn request_with_headers<T: DeserializeOwned>(\n &self,\n path: &str,\n params: &[(&str, String)],\n ) -> Result<(T, HeaderMap)>\n}\n```\n\n### Pagination Logic\n\n1. Start at page 1, per_page=100\n2. For issues: add scope=all, state=all, order_by=updated_at, sort=asc\n3. Apply cursor rewind: `updated_after = cursor - rewind_seconds` (clamped to 0)\n4. Yield each item from response\n5. Check `x-next-page` header for continuation\n6. 
Stop when header is empty/absent OR response is empty\n\n### Cursor Rewind\n\n```rust\nif let Some(ts) = updated_after {\n let rewind_ms = (cursor_rewind_seconds as i64) * 1000;\n let rewound = (ts - rewind_ms).max(0); // Clamp to avoid underflow\n // Convert to ISO 8601 for updated_after param\n}\n```\n\n## Acceptance Criteria\n\n- [ ] `paginate_issues` returns Stream of GitLabIssue\n- [ ] `paginate_issues` adds scope=all, state=all, order_by=updated_at, sort=asc\n- [ ] `paginate_issues` applies cursor rewind with max(0) clamping\n- [ ] `paginate_issue_discussions` returns Stream of GitLabDiscussion\n- [ ] Both methods follow x-next-page header until empty\n- [ ] Both methods stop on empty response (fallback)\n- [ ] `request_with_headers` returns (T, HeaderMap) tuple\n\n## Files\n\n- src/gitlab/client.rs (edit - add methods)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/pagination_tests.rs\n#[tokio::test] async fn fetches_all_pages_when_multiple_exist()\n#[tokio::test] async fn respects_per_page_parameter()\n#[tokio::test] async fn follows_x_next_page_header_until_empty()\n#[tokio::test] async fn falls_back_to_empty_page_stop_if_headers_missing()\n#[tokio::test] async fn applies_cursor_rewind_for_tuple_semantics()\n#[tokio::test] async fn clamps_negative_rewind_to_zero()\n```\n\nGREEN: Implement pagination methods with async-stream\n\nVERIFY: `cargo test pagination`\n\n## Edge Cases\n\n- cursor_updated_at near zero - rewind must not underflow (use max(0))\n- GitLab returns empty x-next-page - treat as end of pages\n- GitLab omits pagination headers entirely - use empty response as stop condition\n- DateTime conversion fails - omit updated_after and fetch all (safe fallback)","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.222168Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:28:39.192876Z","closed_at":"2026-01-25T22:28:39.192815Z","close_reason":"Implemented paginate_issues and paginate_issue_discussions with 
async-stream, cursor rewind with max(0) clamping, x-next-page header following, 4 unit tests passing","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-xhz","depends_on_id":"bd-1np","type":"blocks","created_at":"2026-01-25T17:04:05.398212Z","created_by":"tayloreernisse"},{"issue_id":"bd-xhz","depends_on_id":"bd-2ys","type":"blocks","created_at":"2026-01-25T17:04:05.371440Z","created_by":"tayloreernisse"}]}
{"id":"bd-ymd","title":"[CP1] Final validation - Gate A through D","description":"Run all tests and verify all internal gates pass.\n\n## Gate A: Issues Only (Must Pass First)\n- [ ] gi ingest --type=issues fetches all issues from configured projects\n- [ ] Issues stored with correct schema, including last_seen_at\n- [ ] Cursor-based sync is resumable (re-run fetches only new/updated)\n- [ ] Incremental cursor updates every 100 issues\n- [ ] Raw payloads stored for each issue\n- [ ] gi list issues and gi count issues work\n\n## Gate B: Labels Correct (Must Pass)\n- [ ] Labels extracted and stored (name-only)\n- [ ] Label links created correctly\n- [ ] Stale label links removed on re-sync (verified with test)\n- [ ] Label count per issue matches GitLab\n\n## Gate C: Dependent Discussion Sync (Must Pass)\n- [ ] Discussions fetched for issues with updated_at advancement\n- [ ] Notes stored with is_system flag correctly set\n- [ ] Raw payloads stored for discussions and notes\n- [ ] discussions_synced_for_updated_at watermark updated after sync\n- [ ] Unchanged issues skip discussion refetch (verified with test)\n- [ ] Bounded concurrency (dependent_concurrency respected)\n\n## Gate D: Resumability Proof (Must Pass)\n- [ ] Kill mid-run, rerun; bounded redo (cursor progress preserved)\n- [ ] No redundant discussion refetch after crash recovery\n- [ ] Single-flight lock prevents concurrent runs\n\n## Final Gate (Must Pass)\n- [ ] All unit tests pass (cargo test)\n- [ ] All integration tests pass (mocked with wiremock)\n- [ ] cargo clippy passes with no warnings\n- [ ] cargo fmt --check passes\n- [ ] Compiles with --release\n\n## Validation Commands\ncargo test\ncargo clippy -- -D warnings\ncargo fmt --check\ncargo build --release\n\nFiles: All CP1 files\nDone when: All gate criteria 
pass","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:59:26.795633Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.132613Z","deleted_at":"2026-01-25T17:02:02.132608Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0}
{"id":"bd-ypa","title":"Implement timeline expand phase: BFS cross-reference expansion","description":"## Background\n\nThe expand phase is step 3 of the timeline pipeline (spec Section 3.2). Starting from seed entities, it performs BFS over entity_references to discover related entities. This enriches the timeline with events from linked issues and MRs that the keyword search alone wouldn't find.\n\n**Spec reference:** `docs/phase-b-temporal-intelligence.md` Section 3.2 step 3, Section 3.5 (expanded_entities JSON).\n\n## Approach\n\nCreate `src/core/timeline_expand.rs`:\n\n```rust\nuse std::collections::{HashSet, VecDeque};\nuse rusqlite::Connection;\nuse crate::core::timeline::{EntityRef, ExpandedEntityRef, UnresolvedRef};\n\npub struct ExpandResult {\n pub expanded_entities: Vec<ExpandedEntityRef>,\n pub unresolved_references: Vec<UnresolvedRef>,\n}\n\npub fn expand_timeline(\n conn: &Connection,\n seeds: &[EntityRef],\n depth: u32, // 0=no expansion, 1=default, 2+=deep\n include_mentions: bool, // --expand-mentions flag\n max_entities: usize, // cap at 100 to prevent explosion\n) -> Result<ExpandResult> {\n // BFS over entity_references\n // Default edge types: \"closes\", \"related\" (spec: skip \"mentioned\" by default)\n // If include_mentions: also traverse \"mentioned\" edges\n // Track provenance: which entity + edge type led to discovery\n // Collect unresolved refs (target_entity_id IS NULL)\n}\n```\n\n### BFS Algorithm\n\n```\nfn expand_timeline(conn, seeds, depth, include_mentions, max_entities):\n visited: HashSet<(entity_type, entity_id)> = seeds as set\n queue: VecDeque<(EntityRef, u32 current_depth)>\n expanded: Vec<ExpandedEntityRef> = []\n unresolved: Vec<UnresolvedRef> = []\n\n // Seed initial expansion targets\n for seed in seeds:\n for neighbor in query_neighbors(conn, &seed, edge_types):\n if neighbor is unresolved:\n add to unresolved with source=seed\n elif (type, id) not in visited AND expanded.len() < max_entities:\n 
visited.insert((type, id))\n expanded.push(ExpandedEntityRef {\n entity_ref: neighbor,\n depth: 1,\n via_from: seed.clone(),\n via_reference_type: edge.reference_type,\n via_source_method: edge.source_method,\n })\n if 1 < depth: queue.push_back((neighbor, 1))\n\n // Continue BFS for depth > 1\n while let Some((entity, current_depth)) = queue.pop_front():\n if current_depth >= depth: continue\n for neighbor in query_neighbors(conn, &entity, edge_types):\n ...same pattern with current_depth + 1...\n```\n\n### Provenance Tracking (Critical for spec compliance)\n\nPer spec Section 3.5, each expanded entity needs a `via` object:\n```json\n\"via\": {\n \"from\": { \"type\": \"merge_request\", \"iid\": 567, \"project\": \"group/repo\" },\n \"reference_type\": \"closes\",\n \"source_method\": \"api_closes_issues\"\n}\n```\n\nThe `ExpandedEntityRef` struct (defined in bd-20e) carries these fields:\n- `via_from: EntityRef` - the entity that referenced this one\n- `via_reference_type: String` - \"closes\", \"mentioned\", \"related\"\n- `via_source_method: String` - from entity_references.source_method column\n\n### Query for neighbors\n\nTraverses BOTH outgoing AND incoming edges:\n```sql\n-- Outgoing: this entity references others\nSELECT target_entity_type, target_entity_id, target_project_path,\n target_entity_iid, reference_type, source_method\nFROM entity_references\nWHERE source_entity_type = ?1 AND source_entity_id = ?2\n AND reference_type IN ('closes', 'related' [, 'mentioned'])\n\n-- Incoming: other entities reference this one\nSELECT source_entity_type, source_entity_id, reference_type, source_method\nFROM entity_references\nWHERE target_entity_type = ?1 AND target_entity_id = ?2\n AND reference_type IN ('closes', 'related' [, 'mentioned'])\n```\n\n### Unresolved References\n\nPer spec Section 3.5, when `target_entity_id IS NULL`, the reference is to an external entity. 
Collect these in `UnresolvedRef` with:\n- `source`: the entity containing the reference\n- `target_project`: from `target_project_path` column\n- `target_type`: from `target_entity_type` column\n- `target_iid`: from `target_entity_iid` column\n- `reference_type`: from `reference_type` column\n\n### Edge Type Filtering (spec Section 3.1)\n\n```rust\nfn edge_types(include_mentions: bool) -> Vec<&'static str> {\n if include_mentions {\n vec![\"closes\", \"related\", \"mentioned\"]\n } else {\n vec![\"closes\", \"related\"]\n }\n}\n```\n\nRegister in `src/core/mod.rs`: `pub mod timeline_expand;`\n\n## Acceptance Criteria\n\n- [ ] BFS traverses outgoing AND incoming edges in entity_references\n- [ ] Default: only \"closes\" and \"related\" edges traversed (spec Section 3.1)\n- [ ] `--expand-mentions`: also traverses \"mentioned\" edges\n- [ ] depth=0: returns empty expanded list (no expansion)\n- [ ] depth=1: one hop from seeds\n- [ ] max_entities cap prevents explosion (default 100)\n- [ ] Provenance tracked per spec: via_from, via_reference_type, via_source_method\n- [ ] source_method value comes from entity_references.source_method column\n- [ ] Unresolved references (target_entity_id IS NULL) collected per spec Section 3.5\n- [ ] No duplicates: visited set prevents re-expansion\n- [ ] Module registered in `src/core/mod.rs`\n- [ ] `cargo check --all-targets` passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/core/timeline_expand.rs` (NEW)\n- `src/core/mod.rs` (add `pub mod timeline_expand;`)\n\n## TDD Loop\n\nRED: Create tests in `src/core/timeline_expand.rs`:\n- `test_expand_depth_zero` - returns empty expanded list\n- `test_expand_finds_linked_entity` - seed issue -> closes -> linked MR found\n- `test_expand_bidirectional` - if A closes B, starting from B finds A\n- `test_expand_respects_max_entities` - stops at cap\n- `test_expand_skips_mentions_by_default` - \"mentioned\" edges not traversed without flag\n- 
`test_expand_includes_mentions_when_flagged` - \"mentioned\" edges traversed with flag\n- `test_expand_collects_unresolved` - NULL target_entity_id goes to unresolved list\n- `test_expand_tracks_provenance` - expanded entity has correct via_from, via_reference_type, via_source_method\n\nTests need in-memory DB with migration 011 applied and test entity_references rows.\n\nGREEN: Implement BFS with visited set, provenance tracking, and edge type filtering.\n\nVERIFY: `cargo test --lib -- timeline_expand`\n\n## Edge Cases\n\n- Circular references (A closes B, B related to A): visited set prevents infinite loop\n- Entity referenced from multiple seeds: only first discovery tracked (first-come provenance)\n- Empty entity_references table: returns empty expanded list, not error\n- Self-referencing entity (source = target): skip, don't add to expanded\n- Cross-project references where target_entity_id is NULL: add to unresolved","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:08.659381Z","created_by":"tayloreernisse","updated_at":"2026-02-05T19:11:19.095262Z","compaction_level":0,"original_size":0,"labels":["gate-3","phase-b","query"],"dependencies":[{"issue_id":"bd-ypa","depends_on_id":"bd-32q","type":"blocks","created_at":"2026-02-02T21:33:37.448515Z","created_by":"tayloreernisse"},{"issue_id":"bd-ypa","depends_on_id":"bd-3ia","type":"blocks","created_at":"2026-02-02T21:33:37.528233Z","created_by":"tayloreernisse"},{"issue_id":"bd-ypa","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:08.661036Z","created_by":"tayloreernisse"}]}
{"id":"bd-z0s","title":"[CP1] Final validation - Gate A through D","description":"Run all tests and verify all internal gates pass.\n\n## Gate A: Issues Only (Must Pass First)\n- [ ] gi ingest --type=issues fetches all issues from configured projects\n- [ ] Issues stored with correct schema, including last_seen_at\n- [ ] Cursor-based sync is resumable (re-run fetches only new/updated)\n- [ ] Incremental cursor updates every 100 issues\n- [ ] Raw payloads stored for each issue\n- [ ] gi list issues and gi count issues work\n\n## Gate B: Labels Correct (Must Pass)\n- [ ] Labels extracted and stored (name-only)\n- [ ] Label links created correctly\n- [ ] **Stale label links removed on re-sync** (verified with test)\n- [ ] Label count per issue matches GitLab\n\n## Gate C: Dependent Discussion Sync (Must Pass)\n- [ ] Discussions fetched for issues with updated_at advancement\n- [ ] Notes stored with is_system flag correctly set\n- [ ] Raw payloads stored for discussions and notes\n- [ ] discussions_synced_for_updated_at watermark updated after sync\n- [ ] **Unchanged issues skip discussion refetch** (verified with test)\n- [ ] Bounded concurrency (dependent_concurrency respected)\n\n## Gate D: Resumability Proof (Must Pass)\n- [ ] Kill mid-run, rerun; bounded redo (cursor progress preserved)\n- [ ] No redundant discussion refetch after crash recovery\n- [ ] Single-flight lock prevents concurrent runs\n\n## Final Gate (Must Pass)\n- [ ] All unit tests pass (cargo test)\n- [ ] All integration tests pass (mocked with wiremock)\n- [ ] cargo clippy passes with no warnings\n- [ ] cargo fmt --check passes\n- [ ] Compiles with --release\n\n## Validation Commands\ncargo test\ncargo clippy -- -D warnings\ncargo fmt --check\ncargo build --release\n\n## Data Integrity Checks\n- SELECT COUNT(*) FROM issues matches GitLab issue count\n- Every issue has a raw_payloads row\n- Every discussion has a raw_payloads row\n- Labels in issue_labels junction all exist in labels table\n- 
Re-running gi ingest --type=issues fetches 0 new items\n- After removing a label in GitLab and re-syncing, the link is removed\n\nFiles: All CP1 files\nDone when: All gate criteria pass","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.459095Z","created_by":"tayloreernisse","updated_at":"2026-01-25T23:27:09.567537Z","closed_at":"2026-01-25T23:27:09.567478Z","close_reason":"All gates pass: 71 tests, clippy clean, fmt clean, release build successful","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-z0s","depends_on_id":"bd-17v","type":"blocks","created_at":"2026-01-25T17:04:05.889114Z","created_by":"tayloreernisse"},{"issue_id":"bd-z0s","depends_on_id":"bd-2f0","type":"blocks","created_at":"2026-01-25T17:04:05.841210Z","created_by":"tayloreernisse"},{"issue_id":"bd-z0s","depends_on_id":"bd-39w","type":"blocks","created_at":"2026-01-25T17:04:05.913316Z","created_by":"tayloreernisse"},{"issue_id":"bd-z0s","depends_on_id":"bd-3n1","type":"blocks","created_at":"2026-01-25T17:04:05.817830Z","created_by":"tayloreernisse"},{"issue_id":"bd-z0s","depends_on_id":"bd-o7b","type":"blocks","created_at":"2026-01-25T17:04:05.864480Z","created_by":"tayloreernisse"},{"issue_id":"bd-z0s","depends_on_id":"bd-v6i","type":"blocks","created_at":"2026-01-25T17:04:05.794555Z","created_by":"tayloreernisse"}]}
{"id":"bd-z94","title":"Implement 'lore file-history' command with human and robot output","description":"## Background\n\nThe file-history command is Gate 4's user-facing CLI. It answers \"which MRs touched this file, and why?\" by combining mr_file_changes data with rename chain resolution. Follows the same CLI pattern as issues/mrs commands with both human and robot output.\n\n## Approach\n\n### 1. Add FileHistory subcommand in \\`src/cli/mod.rs\\`\n\n```rust\n/// Show MR history for a file path with rename chain resolution\n#[command(name = \"file-history\")]\nFileHistory(FileHistoryArgs),\n```\n\n```rust\n#[derive(Parser)]\npub struct FileHistoryArgs {\n /// File path to query\n pub path: String,\n\n /// Filter by project path\n #[arg(short = 'p', long, help_heading = \"Filters\")]\n pub project: Option<String>,\n\n /// Include DiffNote discussion snippets\n #[arg(long, help_heading = \"Output\", overrides_with = \"no_discussions\")]\n pub discussions: bool,\n\n #[arg(long = \"no-discussions\", hide = true, overrides_with = \"discussions\")]\n pub no_discussions: bool,\n\n /// Disable rename chain following\n #[arg(long = \"no-follow-renames\", help_heading = \"Output\")]\n pub no_follow_renames: bool,\n\n /// Only show merged MRs (skip open/closed)\n #[arg(long, help_heading = \"Filters\", overrides_with = \"no_merged\")]\n pub merged: bool,\n\n #[arg(long = \"no-merged\", hide = true, overrides_with = \"merged\")]\n pub no_merged: bool,\n\n /// Maximum results\n #[arg(short = 'n', long = \"limit\", default_value = \"50\", help_heading = \"Output\")]\n pub limit: usize,\n}\n```\n\n### 2. Handler in \\`src/main.rs\\`\n\n```rust\nSome(Commands::FileHistory(args)) => handle_file_history(cli.config.as_deref(), args, robot_mode),\n```\n\n### 3. 
Query Logic\n\n```rust\npub fn run_file_history(\n conn: &Connection,\n project_id: i64,\n path: &str,\n follow_renames: bool,\n merged_only: bool,\n include_discussions: bool,\n limit: usize,\n) -> Result<FileHistoryResult> {\n // 1. Resolve rename chain (unless no_follow_renames)\n // 2. Query mr_file_changes for all resolved paths\n // 3. Filter by state if merged_only\n // 4. Optionally fetch DiffNote discussions per MR\n // 5. Order by MR updated_at DESC, limit\n}\n```\n\n### 4. Human Output\n\n```\nFile History: src/auth/oauth.rs (via 3 paths, 5 MRs)\nRename chain: src/authentication/oauth.rs -> src/auth/oauth.rs\n\nMR !456 \"Implement OAuth2 flow\" merged @alice 2024-01-22 modified\nMR !489 \"Fix OAuth token expiry\" merged @bob 2024-02-15 modified\nMR !234 \"Initial auth module\" merged @alice 2023-11-01 added\n```\n\n### 5. Robot JSON\n\n```json\n{\n \"ok\": true,\n \"data\": {\n \"path\": \"src/auth/oauth.rs\",\n \"rename_chain\": [\"src/authentication/oauth.rs\", \"src/auth/oauth.rs\"],\n \"merge_requests\": [\n {\n \"iid\": 456,\n \"title\": \"Implement OAuth2 flow\",\n \"state\": \"merged\",\n \"author\": \"alice\",\n \"change_type\": \"modified\",\n \"discussions\": []\n }\n ]\n },\n \"meta\": {\n \"total_paths_searched\": 2,\n \"total_mrs\": 5,\n \"showing\": 5,\n \"follow_renames\": true\n }\n}\n```\n\n## Acceptance Criteria\n\n- [ ] `lore file-history src/foo.rs` works with human output\n- [ ] `lore --robot file-history src/foo.rs` works with JSON output\n- [ ] Rename chain displayed in human output\n- [ ] `--no-follow-renames` disables rename resolution\n- [ ] `--merged` filters to only merged MRs\n- [ ] `--discussions` includes DiffNote snippets\n- [ ] `-p group/repo` filters to single project (ambiguous -> exit 18)\n- [ ] `-n 10` limits results\n- [ ] File with no MR history: \"No MR history found for 'path'\" (exit 0)\n- [ ] Command appears in robot-docs and VALID_COMMANDS\n- [ ] `cargo check --all-targets` 
passes\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n\n- `src/cli/mod.rs` (add FileHistoryArgs + Commands::FileHistory)\n- `src/cli/commands/file_history.rs` (NEW -- handler + output functions)\n- `src/cli/commands/mod.rs` (re-export)\n- `src/main.rs` (add handle_file_history + match arm + VALID_COMMANDS + robot-docs)\n\n## TDD Loop\n\nRED: No unit tests for CLI wiring. Verify with:\n- `cargo check --all-targets`\n- Manual: `lore file-history src/main.rs` against a synced database with diffs\n\nGREEN: Wire up the clap struct, handler, and output functions.\n\nVERIFY: `cargo check --all-targets && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n\n- File path with spaces: clap handles quoting, but query must preserve spaces exactly\n- Path not in any MR: return empty result, not error\n- Path only in renamed MRs (old name): rename chain must find it\n- Large result set (100+ MRs): limit prevents unbounded output\n- `--merged` combined with `--discussions`: only show discussions on merged MRs","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:09.027259Z","created_by":"tayloreernisse","updated_at":"2026-02-05T18:55:04.304507Z","compaction_level":0,"original_size":0,"labels":["cli","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-z94","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:09.028633Z","created_by":"tayloreernisse"},{"issue_id":"bd-z94","depends_on_id":"bd-1yx","type":"blocks","created_at":"2026-02-02T21:34:16.784122Z","created_by":"tayloreernisse"},{"issue_id":"bd-z94","depends_on_id":"bd-2yo","type":"blocks","created_at":"2026-02-02T21:34:16.741201Z","created_by":"tayloreernisse"},{"issue_id":"bd-z94","depends_on_id":"bd-3ia","type":"blocks","created_at":"2026-02-02T21:34:16.824983Z","created_by":"tayloreernisse"}]}