From c2f34d3a4fb874583444be9cf7c189656390ff04 Mon Sep 17 00:00:00 2001 From: Taylor Eernisse Date: Thu, 5 Feb 2026 11:23:13 -0500 Subject: [PATCH] chore(beads): Update issue tracker metadata Co-Authored-By: Claude Opus 4.5 --- .beads/issues.jsonl | 9 ++++++--- .beads/last-touched | 2 +- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/.beads/issues.jsonl b/.beads/issues.jsonl index 2e31f21..03ad5d8 100644 --- a/.beads/issues.jsonl +++ b/.beads/issues.jsonl @@ -2,6 +2,7 @@ {"id":"bd-10i","title":"Epic: CP2 Gate D - Resumability Proof","description":"## Background\nGate D validates resumability and crash recovery. Proves that cursor and watermark mechanics prevent massive refetch after interruption. This is critical for large projects where a full refetch would take hours.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] Kill mid-run, rerun -> bounded redo (not full refetch from beginning)\n- [ ] Cursor saved at page boundary (not item boundary)\n- [ ] No redundant discussion refetch after crash recovery\n- [ ] No watermark advancement on partial pagination failure\n- [ ] Single-flight lock prevents concurrent ingest runs\n- [ ] `--full` flag resets MR cursor to NULL\n- [ ] `--full` flag resets ALL `discussions_synced_for_updated_at` to NULL\n- [ ] `--force` bypasses single-flight lock\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate D: Resumability Proof ===\"\n\n# 1. Test single-flight lock\necho \"Step 1: Test single-flight lock...\"\ngi ingest --type=merge_requests &\nFIRST_PID=$!\nsleep 1\n\n# Try second ingest - should fail with lock error\nif gi ingest --type=merge_requests 2>&1 | grep -q \"lock\\|already running\"; then\n echo \" PASS: Second ingest blocked by lock\"\nelse\n echo \" FAIL: Lock not working\"\nfi\nwait $FIRST_PID 2>/dev/null || true\n\n# 2. 
Test --force bypasses lock\necho \"Step 2: Test --force flag...\"\ngi ingest --type=merge_requests &\nFIRST_PID=$!\nsleep 1\nif gi ingest --type=merge_requests --force 2>&1; then\n echo \" PASS: --force bypassed lock\"\nelse\n echo \" Note: --force test inconclusive\"\nfi\nwait $FIRST_PID 2>/dev/null || true\n\n# 3. Check cursor state\necho \"Step 3: Check cursor state...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT resource_type, updated_at, gitlab_id\n FROM sync_cursors \n WHERE resource_type = 'merge_requests';\n\"\n\n# 4. Test crash recovery\necho \"Step 4: Test crash recovery...\"\n\n# Record current cursor\nCURSOR_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor before: $CURSOR_BEFORE\"\n\n# Force full sync and kill\necho \" Starting full sync then killing...\"\ngi ingest --type=merge_requests --full &\nPID=$!\nsleep 5 && kill -9 $PID 2>/dev/null || true\nwait $PID 2>/dev/null || true\n\n# Check cursor was saved (should be non-null if any page completed)\nCURSOR_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor after kill: $CURSOR_AFTER\"\n\n# Re-run and verify bounded redo\necho \" Re-running (should resume from cursor)...\"\ntime gi ingest --type=merge_requests\n# Should be faster than first full sync\n\n# 5. 
Test --full reset\necho \"Step 5: Test --full resets watermarks...\"\n\n# Check watermarks before\nWATERMARKS_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NOT NULL;\n\")\necho \" Watermarks set before --full: $WATERMARKS_BEFORE\"\n\n# Record cursor before\nCURSOR_BEFORE_FULL=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at, gitlab_id FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor before --full: $CURSOR_BEFORE_FULL\"\n\n# Run --full\ngi ingest --type=merge_requests --full\n\n# Check cursor was reset then rebuilt\nCURSOR_AFTER_FULL=$(sqlite3 \"$DB_PATH\" \"\n SELECT updated_at, gitlab_id FROM sync_cursors WHERE resource_type = 'merge_requests';\n\")\necho \" Cursor after --full: $CURSOR_AFTER_FULL\"\n\n# Watermarks should be set again (sync completed)\nWATERMARKS_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NOT NULL;\n\")\necho \" Watermarks set after --full: $WATERMARKS_AFTER\"\n\necho \"\"\necho \"=== Gate D: PASSED ===\"\n```\n\n## Watermark Safety Test (Simulated Network Failure)\n```bash\n# This tests that watermark doesn't advance on partial failure\n# Requires ability to simulate network issues\n\n# 1. Get an MR that needs discussion sync\nMR_ID=$(sqlite3 \"$DB_PATH\" \"\n SELECT id FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NULL \n OR updated_at > discussions_synced_for_updated_at\n LIMIT 1;\n\")\n\n# 2. Note current watermark\nWATERMARK_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark before: $WATERMARK_BEFORE\"\n\n# 3. 
Simulate network failure (requires network manipulation)\n# Option A: Block GitLab API temporarily\n# Option B: Run in a container with network limits\n# Option C: Use the automated test instead:\ncargo test does_not_advance_discussion_watermark_on_partial_failure\n\n# 4. Verify watermark unchanged after failure\nWATERMARK_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark after failure: $WATERMARK_AFTER\"\n[ \"$WATERMARK_BEFORE\" = \"$WATERMARK_AFTER\" ] && echo \"PASS: Watermark preserved\"\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check cursor state:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT * FROM sync_cursors WHERE resource_type = 'merge_requests';\n\"\n\n# Check watermark distribution:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n SUM(CASE WHEN discussions_synced_for_updated_at IS NULL THEN 1 ELSE 0 END) as needs_sync,\n SUM(CASE WHEN discussions_synced_for_updated_at IS NOT NULL THEN 1 ELSE 0 END) as synced\n FROM merge_requests;\n\"\n\n# Test --full resets (check before/after):\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"SELECT COUNT(*) FROM merge_requests WHERE discussions_synced_for_updated_at IS NOT NULL;\"\ngi ingest --type=merge_requests --full\n# During full sync, watermarks should be NULL, then repopulated\n```\n\n## Critical Automated Tests\nThese tests MUST pass for Gate D:\n```bash\ncargo test does_not_advance_discussion_watermark_on_partial_failure\ncargo test full_sync_resets_discussion_watermarks\ncargo test cursor_saved_at_page_boundary\n```\n\n## Dependencies\nThis gate requires:\n- bd-mk3 (ingest command with --full and --force support)\n- bd-ser (MR ingestion with cursor mechanics)\n- bd-20h (MR discussion ingestion with watermark safety)\n- Gates A, B, C must pass first\n\n## Edge Cases\n- Very fast sync: May complete before kill signal reaches; retest with larger project\n- Lock file stale: If 
previous run crashed, lock file may exist; --force handles this\n- Clock skew: Cursor timestamps should use server time, not local time","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:02.124186Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.060596Z","closed_at":"2026-01-27T00:48:21.060555Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-10i","depends_on_id":"bd-mk3","type":"blocks","created_at":"2026-01-26T22:08:55.875790Z","created_by":"tayloreernisse"}]} {"id":"bd-12ae","title":"OBSERV: Add structured tracing fields to rate-limit/retry handling","description":"## Background\nRate limit and retry events are currently logged at WARN with minimal context (src/gitlab/client.rs:~157). This enriches them with structured fields so MetricsLayer can count them and -v mode shows actionable retry information.\n\n## Approach\n### src/gitlab/client.rs - request() method (line ~119-171)\n\nCurrent 429 handling (~line 155-158):\n```rust\nif response.status() == StatusCode::TOO_MANY_REQUESTS && attempt < Self::MAX_RETRIES {\n let retry_after = Self::parse_retry_after(&response);\n tracing::warn!(retry_after_secs = retry_after, attempt, path, \"Rate limited by GitLab, retrying\");\n sleep(Duration::from_secs(retry_after)).await;\n continue;\n}\n```\n\nReplace with INFO-level structured log:\n```rust\nif response.status() == StatusCode::TOO_MANY_REQUESTS && attempt < Self::MAX_RETRIES {\n let retry_after = Self::parse_retry_after(&response);\n tracing::info!(\n path = %path,\n attempt = attempt,\n retry_after_secs = retry_after,\n status_code = 429u16,\n \"Rate limited, retrying\"\n );\n sleep(Duration::from_secs(retry_after)).await;\n continue;\n}\n```\n\nFor transient errors (network errors, 5xx responses), add similar structured logging:\n```rust\ntracing::info!(\n path = %path,\n attempt = attempt,\n error = %e,\n \"Retrying after transient error\"\n);\n```\n\nKey changes:\n- 
Level: WARN -> INFO (visible in -v mode, not alarming in default mode)\n- Added: status_code field for 429\n- Added: structured path, attempt fields for all retry events\n- These structured fields enable MetricsLayer (bd-3vqk) to count rate_limit_hits and retries\n\n## Acceptance Criteria\n- [ ] 429 responses log at INFO with fields: path, attempt, retry_after_secs, status_code=429\n- [ ] Transient error retries log at INFO with fields: path, attempt, error\n- [ ] lore -v sync shows retry activity on stderr (INFO is visible in -v mode)\n- [ ] Default mode (no -v) does NOT show retry lines on stderr (INFO filtered out)\n- [ ] File layer captures all retry events (always at DEBUG+)\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/gitlab/client.rs (modify request() method, lines ~119-171)\n\n## TDD Loop\nRED:\n - test_rate_limit_log_fields: mock 429 response, capture log output, parse JSON, assert fields\n - test_retry_log_fields: mock network error + retry, assert structured fields\nGREEN: Change log level and add structured fields\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- parse_retry_after returns 0 or very large values: the existing logic handles this\n- All retries exhausted: the final attempt returns the error normally. 
No special logging needed (the error propagates).\n- path may contain sensitive data (project IDs): project IDs are not sensitive in this context","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:55:02.448070Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:21:42.304259Z","closed_at":"2026-02-04T17:21:42.304213Z","close_reason":"Changed 429 rate-limit logging from WARN to INFO with structured fields: path, attempt, retry_after_secs, status_code=429 in both request() and request_with_headers()","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-12ae","depends_on_id":"bd-3pk","type":"parent-child","created_at":"2026-02-04T15:55:02.450343Z","created_by":"tayloreernisse"}]} {"id":"bd-13b","title":"[CP0] CLI entry point with Commander.js","description":"## Background\n\nCommander.js provides the CLI framework. The main entry point sets up the program with all subcommands. Uses ESM with proper shebang for npx/global installation.\n\nReference: docs/prd/checkpoint-0.md section \"CLI Commands\"\n\n## Approach\n\n**src/cli/index.ts:**\n```typescript\n#!/usr/bin/env node\n\nimport { Command } from 'commander';\nimport { version } from '../../package.json' with { type: 'json' };\nimport { initCommand } from './commands/init';\nimport { authTestCommand } from './commands/auth-test';\nimport { doctorCommand } from './commands/doctor';\nimport { versionCommand } from './commands/version';\nimport { backupCommand } from './commands/backup';\nimport { resetCommand } from './commands/reset';\nimport { syncStatusCommand } from './commands/sync-status';\n\nconst program = new Command();\n\nprogram\n .name('gi')\n .description('GitLab Inbox - Unified notification management')\n .version(version);\n\n// Global --config flag available to all commands\nprogram.option('-c, --config ', 'Path to config file');\n\n// Register 
subcommands\nprogram.addCommand(initCommand);\nprogram.addCommand(authTestCommand);\nprogram.addCommand(doctorCommand);\nprogram.addCommand(versionCommand);\nprogram.addCommand(backupCommand);\nprogram.addCommand(resetCommand);\nprogram.addCommand(syncStatusCommand);\n\nprogram.parse();\n```\n\nEach command file exports a Command instance:\n```typescript\n// src/cli/commands/version.ts\nimport { Command } from 'commander';\n\nexport const versionCommand = new Command('version')\n .description('Show version information')\n .action(() => {\n console.log(`gi version ${version}`);\n });\n```\n\n## Acceptance Criteria\n\n- [ ] `gi --help` shows all commands and global options\n- [ ] `gi --version` shows version from package.json\n- [ ] `gi --help` shows command-specific help\n- [ ] `gi --config ./path` passes config path to commands\n- [ ] Unknown command shows error and suggests --help\n- [ ] Exit code 0 on success, non-zero on error\n- [ ] Shebang line works for npx execution\n\n## Files\n\nCREATE:\n- src/cli/index.ts (main entry point)\n- src/cli/commands/version.ts (simple command as template)\n\nMODIFY (later beads):\n- package.json (add \"bin\" field pointing to dist/cli/index.js)\n\n## TDD Loop\n\nN/A for CLI entry point - verify with manual testing:\n\n```bash\nnpm run build\nnode dist/cli/index.js --help\nnode dist/cli/index.js version\nnode dist/cli/index.js unknown-command # should error\n```\n\n## Edge Cases\n\n- package.json import requires Node 20+ with { type: 'json' } assertion\n- Alternative: read version from package.json with readFileSync\n- Command registration order affects help display - alphabetical preferred\n- Global options must be defined before 
subcommands","status":"closed","priority":1,"issue_type":"task","created_at":"2026-01-24T16:09:50.499023Z","created_by":"tayloreernisse","updated_at":"2026-01-25T03:10:49.224627Z","closed_at":"2026-01-25T03:10:49.224499Z","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-13b","depends_on_id":"bd-gg1","type":"blocks","created_at":"2026-01-24T16:13:09.370408Z","created_by":"tayloreernisse"}]} +{"id":"bd-13pt","title":"Display closing MRs in lore issues output","description":"## Background\nThe `entity_references` table stores MR->Issue 'closes' relationships (from the closes_issues API), but this data is never displayed when viewing an issue. This is the 'Development' section in GitLab UI showing which MRs will close an issue when merged.\n\n**System fit**: Data already flows through `fetch_mr_closes_issues()` -> `store_closes_issues_refs()` -> `entity_references` table. We just need to query and display it.\n\n## Approach\n\nAll changes in `src/cli/commands/show.rs`:\n\n### 1. Add ClosingMrRef struct (after DiffNotePosition ~line 57)\n```rust\n#[derive(Debug, Clone, Serialize)]\npub struct ClosingMrRef {\n pub iid: i64,\n pub title: String,\n pub state: String,\n pub web_url: Option,\n}\n```\n\n### 2. Update IssueDetail struct (line ~59)\n```rust\npub struct IssueDetail {\n // ... existing fields ...\n pub closing_merge_requests: Vec, // NEW - add after discussions\n}\n```\n\n### 3. Add ClosingMrRefJson struct (after NoteDetailJson ~line 797)\n```rust\n#[derive(Serialize)]\npub struct ClosingMrRefJson {\n pub iid: i64,\n pub title: String,\n pub state: String,\n pub web_url: Option,\n}\n```\n\n### 4. Update IssueDetailJson struct (line ~770)\n```rust\npub struct IssueDetailJson {\n // ... existing fields ...\n pub closing_merge_requests: Vec, // NEW\n}\n```\n\n### 5. 
Add get_closing_mrs() function (after get_issue_discussions ~line 245)\n```rust\nfn get_closing_mrs(conn: &Connection, issue_id: i64) -> Result> {\n let mut stmt = conn.prepare(\n \"SELECT mr.iid, mr.title, mr.state, mr.web_url\n FROM entity_references er\n JOIN merge_requests mr ON mr.id = er.source_entity_id\n WHERE er.target_entity_type = 'issue'\n AND er.target_entity_id = ?\n AND er.source_entity_type = 'merge_request'\n AND er.reference_type = 'closes'\n ORDER BY mr.iid\"\n )?;\n \n let mrs = stmt\n .query_map([issue_id], |row| {\n Ok(ClosingMrRef {\n iid: row.get(0)?,\n title: row.get(1)?,\n state: row.get(2)?,\n web_url: row.get(3)?,\n })\n })?\n .collect::, _>>()?;\n \n Ok(mrs)\n}\n```\n\n### 6. Update run_show_issue() (line ~89)\n```rust\nlet closing_mrs = get_closing_mrs(&conn, issue.id)?;\n// In return struct:\nclosing_merge_requests: closing_mrs,\n```\n\n### 7. Update print_show_issue() (after Labels section ~line 556)\n```rust\nif !issue.closing_merge_requests.is_empty() {\n println!(\"Development:\");\n for mr in &issue.closing_merge_requests {\n let state_indicator = match mr.state.as_str() {\n \"merged\" => style(\"merged\").green(),\n \"opened\" => style(\"opened\").cyan(),\n \"closed\" => style(\"closed\").red(),\n _ => style(&mr.state).dim(),\n };\n println!(\" !{} {} ({})\", mr.iid, mr.title, state_indicator);\n }\n}\n```\n\n### 8. 
Update From<&IssueDetail> for IssueDetailJson (line ~799)\n```rust\nclosing_merge_requests: issue.closing_merge_requests.iter().map(|mr| ClosingMrRefJson {\n iid: mr.iid,\n title: mr.title.clone(),\n state: mr.state.clone(),\n web_url: mr.web_url.clone(),\n}).collect(),\n```\n\n## Acceptance Criteria\n- [ ] `cargo test test_get_closing_mrs` passes (4 tests)\n- [ ] `lore issues ` shows Development section when closing MRs exist\n- [ ] Development section shows MR iid, title, and state\n- [ ] State is color-coded (green=merged, cyan=opened, red=closed)\n- [ ] `lore -J issues ` includes closing_merge_requests array\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n- `src/cli/commands/show.rs` - ALL changes\n\n## TDD Loop\n\n**RED** - Add tests to `src/cli/commands/show.rs` `#[cfg(test)] mod tests`:\n\n```rust\nfn seed_issue_with_closing_mr(conn: &Connection) -> (i64, i64) {\n conn.execute(\n \"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)\n VALUES (1, 100, 'group/repo', 'https://gitlab.example.com', 1000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO issues (id, gitlab_id, iid, project_id, title, state, author_username,\n created_at, updated_at, last_seen_at) VALUES (1, 200, 10, 1, 'Bug fix', 'opened', 'dev', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO merge_requests (id, gitlab_id, iid, project_id, title, state, author_username,\n source_branch, target_branch, created_at, updated_at, last_seen_at)\n VALUES (1, 300, 5, 1, 'Fix the bug', 'merged', 'dev', 'fix', 'main', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO entity_references (project_id, source_entity_type, source_entity_id,\n target_entity_type, target_entity_id, reference_type, source_method, created_at)\n VALUES (1, 'merge_request', 1, 'issue', 1, 'closes', 'api', 3000)\", []\n ).unwrap();\n (1, 1) // (issue_id, mr_id)\n}\n\n#[test]\nfn test_get_closing_mrs_empty() {\n let 
conn = setup_test_db();\n // seed project + issue with no closing MRs\n conn.execute(\"INSERT INTO projects ...\", []).unwrap();\n conn.execute(\"INSERT INTO issues ...\", []).unwrap();\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert!(result.is_empty());\n}\n\n#[test]\nfn test_get_closing_mrs_single() {\n let conn = setup_test_db();\n seed_issue_with_closing_mr(&conn);\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert_eq!(result.len(), 1);\n assert_eq!(result[0].iid, 5);\n assert_eq!(result[0].title, \"Fix the bug\");\n assert_eq!(result[0].state, \"merged\");\n}\n\n#[test]\nfn test_get_closing_mrs_ignores_mentioned() {\n let conn = setup_test_db();\n seed_issue_with_closing_mr(&conn);\n // Add a 'mentioned' reference that should be ignored\n conn.execute(\n \"INSERT INTO merge_requests (id, gitlab_id, iid, project_id, title, state, author_username,\n source_branch, target_branch, created_at, updated_at, last_seen_at)\n VALUES (2, 301, 6, 1, 'Other MR', 'opened', 'dev', 'other', 'main', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO entity_references (project_id, source_entity_type, source_entity_id,\n target_entity_type, target_entity_id, reference_type, source_method, created_at)\n VALUES (1, 'merge_request', 2, 'issue', 1, 'mentioned', 'note_parse', 3000)\", []\n ).unwrap();\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert_eq!(result.len(), 1); // Only the 'closes' ref\n}\n\n#[test]\nfn test_get_closing_mrs_multiple_sorted() {\n let conn = setup_test_db();\n seed_issue_with_closing_mr(&conn);\n // Add second closing MR with higher iid\n conn.execute(\n \"INSERT INTO merge_requests (id, gitlab_id, iid, project_id, title, state, author_username,\n source_branch, target_branch, created_at, updated_at, last_seen_at)\n VALUES (2, 301, 8, 1, 'Another fix', 'opened', 'dev', 'fix2', 'main', 1000, 2000, 2000)\", []\n ).unwrap();\n conn.execute(\n \"INSERT INTO entity_references (project_id, source_entity_type, 
source_entity_id,\n target_entity_type, target_entity_id, reference_type, source_method, created_at)\n VALUES (1, 'merge_request', 2, 'issue', 1, 'closes', 'api', 3000)\", []\n ).unwrap();\n let result = get_closing_mrs(&conn, 1).unwrap();\n assert_eq!(result.len(), 2);\n assert_eq!(result[0].iid, 5); // Lower iid first\n assert_eq!(result[1].iid, 8);\n}\n```\n\n**GREEN** - Implement get_closing_mrs() and struct updates\n\n**VERIFY**: `cargo test test_get_closing_mrs && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n- Empty closing MRs -> don't print Development section\n- MR in different states -> color-coded appropriately \n- Cross-project closes (target_entity_id IS NULL) -> not displayed (unresolved refs)\n- Multiple MRs closing same issue -> all shown, ordered by iid","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-05T15:15:37.598249Z","created_by":"tayloreernisse","updated_at":"2026-02-05T15:26:09.522557Z","closed_at":"2026-02-05T15:26:09.522506Z","close_reason":"Implemented: closing MRs (Development section) now display in lore issues . 
All 4 new tests pass.","compaction_level":0,"original_size":0,"labels":["ISSUE"]} {"id":"bd-140","title":"[CP1] Database migration 002_issues.sql","description":"Create migration file with tables for issues, labels, issue_labels, discussions, and notes.\n\nTables to create:\n- issues: gitlab_id, project_id, iid, title, description, state, author_username, timestamps, web_url, raw_payload_id\n- labels: gitlab_id, project_id, name, color, description (unique on project_id+name)\n- issue_labels: junction table\n- discussions: gitlab_discussion_id, project_id, issue_id, noteable_type, individual_note, timestamps, resolvable/resolved\n- notes: gitlab_id, discussion_id, project_id, type, is_system, author_username, body, timestamps, position, resolution fields, DiffNote position fields\n\nInclude appropriate indexes:\n- idx_issues_project_updated, idx_issues_author, uq_issues_project_iid\n- uq_labels_project_name, idx_labels_name\n- idx_issue_labels_label\n- uq_discussions_project_discussion_id, idx_discussions_issue/mr/last_note\n- idx_notes_discussion/author/system\n\nFiles: migrations/002_issues.sql\nDone when: Migration applies cleanly on top of 001_initial.sql","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:18:53.954039Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.154936Z","deleted_at":"2026-01-25T15:21:35.154934Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-14q","title":"Epic: Gate 4 - File Decision History (lore file-history)","description":"## Background\nGate 4 adds file-level decision history — \"which MRs touched this file, and why?\" This bridges the gap between code and project management by linking file paths to MRs, MRs to issues, and issues to discussions. 
The key innovation is rename chain resolution: when a file was renamed from src/auth/handler.rs to src/auth/oauth.rs, querying either path finds all historical MRs.\n\nGate 4 also captures merge_commit_sha and squash_commit_sha on merge_requests, which Gate 5 uses for code tracing and which Phase C will use for git blame integration.\n\n## Architecture\n- **New table:** mr_file_changes (migration 012) — file paths + change types per MR\n- **New columns:** merge_requests.merge_commit_sha, merge_requests.squash_commit_sha (migration 012)\n- **Opt-in config:** sync.fetchMrFileChanges (default true). --no-file-changes CLI flag.\n- **Data source:** GET /projects/:id/merge_requests/:iid/diffs — extract file metadata only, discard diff content.\n- **Queue integration:** Uses dependent fetch queue with job_type=mr_diffs\n- **Rename chain resolution:** BFS over mr_file_changes WHERE change_type=renamed, bounded at 10 hops with cycle detection.\n\n## Children (Execution Order)\n1. **bd-1oo** [OPEN] — Migration 012: mr_file_changes + merge_commit_sha + squash_commit_sha\n2. **bd-jec** [OPEN] — fetchMrFileChanges config flag + --no-file-changes CLI flag\n3. **bd-2yo** [OPEN] — Fetch MR diffs API, populate mr_file_changes, capture commit SHAs\n4. **bd-1yx** [OPEN] — Rename chain resolution algorithm (src/core/file_history.rs)\n5. 
**bd-z94** [OPEN] — lore file-history command with human + robot output\n\n## Gate Completion Criteria\n- [ ] mr_file_changes populated from GitLab diffs API for synced MRs\n- [ ] merge_commit_sha and squash_commit_sha captured in merge_requests\n- [ ] `lore file-history ` returns MRs ordered by merge/creation date\n- [ ] Output includes MR title, state, author, change type, discussion count\n- [ ] --discussions shows DiffNote snippets on the queried file\n- [ ] Rename chains resolved with bounded hop count (default 10) + cycle detection\n- [ ] --no-follow-renames disables chain resolution\n- [ ] Robot JSON includes rename_chain when renames detected\n- [ ] -p required when path exists in multiple projects (Ambiguous error, exit 18)\n- [ ] Graceful empty state: \"No MR data found. Run lore sync with fetchMrFileChanges: true\"\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) for dependent fetch queue, Gate 2 (bd-1se) for entity_references (MR→issue linking)\n- Downstream: Gate 5 (bd-1ht) depends on mr_file_changes and commit SHAs","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:01.094024Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:33:06.778936Z","compaction_level":0,"original_size":0,"labels":["epic","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-14q","depends_on_id":"bd-1se","type":"blocks","created_at":"2026-02-02T21:34:16.913465Z","created_by":"tayloreernisse"},{"issue_id":"bd-14q","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:34:16.870058Z","created_by":"tayloreernisse"}]} {"id":"bd-157","title":"[CP1] Issue transformer with label extraction","description":"Transform GitLab issue payloads to normalized database schema.\n\n## Module\nsrc/gitlab/transformers/issue.rs\n\n## Structs\n\n### NormalizedIssue\n- gitlab_id: i64\n- project_id: i64 (local DB project ID)\n- iid: i64\n- title: String\n- description: Option\n- state: String\n- author_username: String\n- created_at, updated_at, 
last_seen_at: i64 (ms epoch)\n- web_url: String\n\n### NormalizedLabel (CP1: name-only)\n- project_id: i64\n- name: String\n\n## Functions\n\n### transform_issue(gitlab_issue: &GitLabIssue, local_project_id: i64) -> NormalizedIssue\n- Convert ISO timestamps to ms epoch using iso_to_ms()\n- Set last_seen_at to now_ms()\n- Clone string fields\n\n### extract_labels(gitlab_issue: &GitLabIssue, local_project_id: i64) -> Vec\n- Map labels vec to NormalizedLabel structs\n\nFiles: \n- src/gitlab/transformers/mod.rs\n- src/gitlab/transformers/issue.rs\nTests: tests/issue_transformer_tests.rs\nDone when: Unit tests pass for payload transformation and label extraction","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:47.719562Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.736142Z","deleted_at":"2026-01-25T17:02:01.736129Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} @@ -32,12 +33,12 @@ {"id":"bd-1np","title":"[CP1] GitLab types for issues, discussions, notes","description":"## Background\n\nGitLab types define the Rust structs for deserializing GitLab API responses. 
These types are the foundation for all ingestion work - issues, discussions, and notes must be correctly typed for serde to parse them.\n\n## Approach\n\nAdd types to `src/gitlab/types.rs` with serde derives:\n\n### GitLabIssue\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabIssue {\n pub id: i64, // GitLab global ID\n pub iid: i64, // Project-scoped issue number\n pub project_id: i64,\n pub title: String,\n pub description: Option,\n pub state: String, // \"opened\" | \"closed\"\n pub created_at: String, // ISO 8601\n pub updated_at: String, // ISO 8601\n pub closed_at: Option,\n pub author: GitLabAuthor,\n pub labels: Vec, // Array of label names (CP1 canonical)\n pub web_url: String,\n}\n```\n\nNOTE: `labels_details` intentionally NOT modeled - varies across GitLab versions.\n\n### GitLabAuthor\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabAuthor {\n pub id: i64,\n pub username: String,\n pub name: String,\n}\n```\n\n### GitLabDiscussion\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabDiscussion {\n pub id: String, // String ID like \"6a9c1750b37d...\"\n pub individual_note: bool, // true = standalone comment\n pub notes: Vec,\n}\n```\n\n### GitLabNote\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabNote {\n pub id: i64,\n #[serde(rename = \"type\")]\n pub note_type: Option, // \"DiscussionNote\" | \"DiffNote\" | null\n pub body: String,\n pub author: GitLabAuthor,\n pub created_at: String, // ISO 8601\n pub updated_at: String, // ISO 8601\n pub system: bool, // true for system-generated notes\n #[serde(default)]\n pub resolvable: bool,\n #[serde(default)]\n pub resolved: bool,\n pub resolved_by: Option,\n pub resolved_at: Option,\n pub position: Option,\n}\n```\n\n### GitLabNotePosition\n\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabNotePosition {\n pub old_path: Option,\n pub new_path: Option,\n pub old_line: Option,\n pub new_line: Option,\n}\n```\n\n## 
Acceptance Criteria\n\n- [ ] GitLabIssue deserializes from API response JSON\n- [ ] GitLabAuthor embedded correctly in issue and note\n- [ ] GitLabDiscussion with notes array deserializes\n- [ ] GitLabNote handles null note_type (use Option)\n- [ ] GitLabNote uses #[serde(rename = \"type\")] for reserved keyword\n- [ ] resolvable/resolved default to false via #[serde(default)]\n- [ ] All timestamp fields are String (ISO 8601 parsed elsewhere)\n\n## Files\n\n- src/gitlab/types.rs (edit - add types)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/gitlab_types_tests.rs\n#[test] fn deserializes_gitlab_issue_from_json()\n#[test] fn deserializes_gitlab_discussion_from_json()\n#[test] fn handles_null_note_type()\n#[test] fn handles_missing_resolvable_field()\n#[test] fn deserializes_labels_as_string_array()\n```\n\nGREEN: Add type definitions with serde attributes\n\nVERIFY: `cargo test gitlab_types`\n\n## Edge Cases\n\n- note_type can be null, \"DiscussionNote\", or \"DiffNote\"\n- labels array can be empty\n- description can be null\n- resolved_by/resolved_at can be null\n- position is only present for DiffNotes","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.150472Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:17:08.842965Z","closed_at":"2026-01-25T22:17:08.842895Z","close_reason":"Implemented GitLabAuthor, GitLabIssue, GitLabDiscussion, GitLabNote, GitLabNotePosition types with 10 passing tests","compaction_level":0,"original_size":0} {"id":"bd-1o1","title":"OBSERV: Add -v/--verbose and --log-format CLI flags","description":"## Background\nUsers and agents need CLI-controlled verbosity without knowing RUST_LOG syntax. The -v flag convention (cargo, curl, ssh) is universally understood. 
--log-format json enables lore sync 2>&1 | jq workflows without reading log files.\n\n## Approach\nAdd two new global flags to the Cli struct in src/cli/mod.rs (insert after the quiet field at line ~37):\n\n```rust\n/// Increase log verbosity (-v, -vv, -vvv)\n#[arg(short = 'v', long = \"verbose\", action = clap::ArgAction::Count, global = true)]\npub verbose: u8,\n\n/// Log format for stderr output: text (default) or json\n#[arg(long = \"log-format\", global = true, value_parser = [\"text\", \"json\"], default_value = \"text\")]\npub log_format: String,\n```\n\nThe existing Cli struct (src/cli/mod.rs:13-42) has these global flags: config, robot, json, color, quiet. The new flags follow the same pattern.\n\nNote: clap::ArgAction::Count allows -v, -vv, -vvv as a single flag with increasing count (0, 1, 2, 3).\n\n## Acceptance Criteria\n- [ ] lore -v sync parses without error (verbose=1)\n- [ ] lore -vv sync parses (verbose=2)\n- [ ] lore -vvv sync parses (verbose=3)\n- [ ] lore --log-format json sync parses (log_format=\"json\")\n- [ ] lore --log-format text sync parses (default)\n- [ ] lore --log-format xml sync errors (invalid value)\n- [ ] Existing commands unaffected (verbose defaults to 0, log_format to \"text\")\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/mod.rs (modify Cli struct, lines 13-42)\n\n## TDD Loop\nRED: Write test that parses Cli with -v flag and asserts verbose=1\nGREEN: Add the two fields to Cli struct\nVERIFY: cargo test -p lore && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- -v and -q together: both parse fine; conflict resolution happens in subscriber setup (bd-2rr), not here\n- -v flag must be global=true so it works before and after subcommands: lore -v sync AND lore sync -v\n- --log-format is a string, not enum, to keep Cli struct 
simple","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.421339Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:10:22.585947Z","closed_at":"2026-02-04T17:10:22.585905Z","close_reason":"Added -v/--verbose (count) and --log-format (text|json) global CLI flags","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1o1","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.422103Z","created_by":"tayloreernisse"}]} {"id":"bd-1o4h","title":"OBSERV: Define StageTiming struct in src/core/metrics.rs","description":"## Background\nStageTiming is the materialized view of span timing data. It's the data structure that flows through robot JSON output, sync_runs.metrics_json, and the human-readable timing summary. Defined in a new file because it's genuinely new functionality that doesn't fit existing modules.\n\n## Approach\nCreate src/core/metrics.rs:\n\n```rust\nuse serde::Serialize;\n\nfn is_zero(v: &usize) -> bool { *v == 0 }\n\n#[derive(Debug, Clone, Serialize)]\npub struct StageTiming {\n pub name: String,\n #[serde(skip_serializing_if = \"Option::is_none\")]\n pub project: Option,\n pub elapsed_ms: u64,\n pub items_processed: usize,\n #[serde(skip_serializing_if = \"is_zero\")]\n pub items_skipped: usize,\n #[serde(skip_serializing_if = \"is_zero\")]\n pub errors: usize,\n #[serde(skip_serializing_if = \"Vec::is_empty\")]\n pub sub_stages: Vec,\n}\n```\n\nRegister module in src/core/mod.rs (line ~11, add):\n```rust\npub mod metrics;\n```\n\nThe is_zero helper is a private function used by serde's skip_serializing_if. 
It must take &usize (reference) and return bool.\n\n## Acceptance Criteria\n- [ ] StageTiming serializes to JSON matching PRD Section 4.6.2 example\n- [ ] items_skipped omitted when 0\n- [ ] errors omitted when 0\n- [ ] sub_stages omitted when empty vec\n- [ ] project omitted when None\n- [ ] name, elapsed_ms, items_processed always present\n- [ ] Struct is Debug + Clone + Serialize\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/metrics.rs (new file)\n- src/core/mod.rs (register module, add line after existing pub mod declarations)\n\n## TDD Loop\nRED:\n - test_stage_timing_serialization: create StageTiming with sub_stages, serialize, assert JSON structure\n - test_stage_timing_zero_fields_omitted: errors=0, items_skipped=0, assert no \"errors\" or \"items_skipped\" keys\n - test_stage_timing_empty_sub_stages: sub_stages=vec![], assert no \"sub_stages\" key\nGREEN: Create metrics.rs with StageTiming struct and is_zero helper\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- is_zero must be a function, not a closure (serde skip_serializing_if requires a function path)\n- Vec::is_empty is a method on Vec, and serde accepts \"Vec::is_empty\" as a path for skip_serializing_if\n- Recursive StageTiming (sub_stages contains StageTiming): serde handles this naturally, no special handling needed","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:31.907234Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:21:40.915842Z","closed_at":"2026-02-04T17:21:40.915794Z","close_reason":"Created src/core/metrics.rs with StageTiming struct, serde skip_serializing_if for zero/empty fields, 5 tests","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1o4h","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:31.910015Z","created_by":"tayloreernisse"}]} -{"id":"bd-1oo","title":"Write migration 012: 
mr_file_changes + commit SHA columns","description":"## Background\nNeed to track which files each MR touched, plus commit SHAs for git linking. This is migration 012 (since 011 is resource events + entity_references + dependent fetch queue).\n\n## Approach\nCreate migrations/012_file_changes.sql with the exact schema from spec §4.1:\n\n```sql\n-- Files changed by each merge request\nCREATE TABLE mr_file_changes (\n id INTEGER PRIMARY KEY,\n merge_request_id INTEGER NOT NULL REFERENCES merge_requests(id) ON DELETE CASCADE,\n project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,\n old_path TEXT,\n new_path TEXT NOT NULL,\n change_type TEXT NOT NULL CHECK (change_type IN ('added', 'modified', 'deleted', 'renamed')),\n UNIQUE(merge_request_id, new_path)\n);\n\nCREATE INDEX idx_mr_files_new_path ON mr_file_changes(new_path);\nCREATE INDEX idx_mr_files_old_path ON mr_file_changes(old_path) WHERE old_path IS NOT NULL;\nCREATE INDEX idx_mr_files_mr ON mr_file_changes(merge_request_id);\n\n-- Add commit SHAs to merge_requests (cherry-picked from Phase A)\nALTER TABLE merge_requests ADD COLUMN merge_commit_sha TEXT;\nALTER TABLE merge_requests ADD COLUMN squash_commit_sha TEXT;\n\nINSERT INTO schema_version (version, applied_at, description)\nVALUES (12, strftime('%s', 'now') * 1000, 'MR file changes and commit SHA columns');\n```\n\nRegister in src/core/db.rs MIGRATIONS array:\n```rust\n(\"012\", include_str!(\"../../migrations/012_file_changes.sql\")),\n```\n\n## Acceptance Criteria\n- [ ] migrations/012_file_changes.sql exists with mr_file_changes table + indexes\n- [ ] merge_requests table has merge_commit_sha and squash_commit_sha columns\n- [ ] src/core/db.rs MIGRATIONS array includes (\"012\", ...)\n- [ ] Migration applies cleanly after 011\n- [ ] `cargo test migration` passes\n- [ ] UNIQUE(merge_request_id, new_path) enforced\n- [ ] change_type CHECK constraint enforced\n\n## Files\n- migrations/012_file_changes.sql (new)\n- src/core/db.rs (add to 
MIGRATIONS array)\n\n## TDD Loop\nRED: tests/migration_tests.rs:\n- `test_migration_012_creates_mr_file_changes` - verify table exists\n- `test_migration_012_adds_commit_sha_columns` - verify columns on merge_requests\n- `test_migration_012_change_type_constraint` - verify CHECK rejects invalid values\n- `test_migration_012_unique_constraint` - verify duplicate (mr_id, new_path) rejected\n\nGREEN: Write the migration SQL + register in db.rs\n\nVERIFY: `cargo test migration_tests -- --nocapture`\n\n## Edge Cases\n- ALTER TABLE on merge_requests with existing data — SQLite handles this gracefully (new columns default to NULL)\n- old_path can be NULL (new files) or same as new_path (modified files) — only populated for renames\n- If migration 011 hasn't been applied, 012 should still fail cleanly (schema_version check in runner)","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.837816Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:48:00.372836Z","compaction_level":0,"original_size":0,"labels":["gate-4","phase-b","schema"],"dependencies":[{"issue_id":"bd-1oo","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.843541Z","created_by":"tayloreernisse"},{"issue_id":"bd-1oo","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T21:34:16.505965Z","created_by":"tayloreernisse"}]} +{"id":"bd-1oo","title":"Write migration 015: mr_file_changes + commit SHA columns","description":"## Background\nNeed to track which files each MR touched, plus commit SHAs for git linking. 
This is migration 015 (next available after 014_sync_runs_enrichment).\n\n## Approach\nCreate migrations/015_file_changes.sql with the exact schema from spec §4.1:\n\n```sql\n-- Files changed by each merge request\nCREATE TABLE mr_file_changes (\n id INTEGER PRIMARY KEY,\n merge_request_id INTEGER NOT NULL REFERENCES merge_requests(id) ON DELETE CASCADE,\n project_id INTEGER NOT NULL REFERENCES projects(id) ON DELETE CASCADE,\n old_path TEXT,\n new_path TEXT NOT NULL,\n change_type TEXT NOT NULL CHECK (change_type IN ('added', 'modified', 'deleted', 'renamed')),\n UNIQUE(merge_request_id, new_path)\n);\n\nCREATE INDEX idx_mr_files_new_path ON mr_file_changes(new_path);\nCREATE INDEX idx_mr_files_old_path ON mr_file_changes(old_path) WHERE old_path IS NOT NULL;\nCREATE INDEX idx_mr_files_mr ON mr_file_changes(merge_request_id);\n\n-- Add commit SHAs to merge_requests\nALTER TABLE merge_requests ADD COLUMN merge_commit_sha TEXT;\nALTER TABLE merge_requests ADD COLUMN squash_commit_sha TEXT;\n\nINSERT INTO schema_version (version, applied_at, description)\nVALUES (15, strftime('%s', 'now') * 1000, 'MR file changes and commit SHA columns');\n```\n\nRegister in src/core/db.rs MIGRATIONS array:\n```rust\n(\"015\", include_str!(\"../../migrations/015_file_changes.sql\")),\n```\n\n## Acceptance Criteria\n- [ ] migrations/015_file_changes.sql exists with mr_file_changes table + indexes\n- [ ] merge_requests table has merge_commit_sha and squash_commit_sha columns\n- [ ] src/core/db.rs MIGRATIONS array includes (\"015\", ...)\n- [ ] Migration applies cleanly after 014\n- [ ] `cargo test migration` passes\n- [ ] UNIQUE(merge_request_id, new_path) enforced\n- [ ] change_type CHECK constraint enforced\n\n## Files\n- migrations/015_file_changes.sql (new)\n- src/core/db.rs (add to MIGRATIONS array)\n\n## TDD Loop\nRED: tests/migration_tests.rs:\n- `test_migration_015_creates_mr_file_changes` - verify table exists\n- `test_migration_015_adds_commit_sha_columns` - verify columns 
on merge_requests\n- `test_migration_015_change_type_constraint` - verify CHECK rejects invalid values\n- `test_migration_015_unique_constraint` - verify duplicate (mr_id, new_path) rejected\n\nGREEN: Write the migration SQL + register in db.rs\n\nVERIFY: `cargo test migration_tests -- --nocapture`\n\n## Edge Cases\n- ALTER TABLE on merge_requests with existing data — SQLite handles this gracefully (new columns default to NULL)\n- old_path can be NULL (new files) or same as new_path (modified files) — only populated for renames\n- If migration 014 hasn't been applied, 015 should still fail cleanly (schema_version check in runner)","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.837816Z","created_by":"tayloreernisse","updated_at":"2026-02-05T16:16:22.172528Z","compaction_level":0,"original_size":0,"labels":["gate-4","phase-b","schema"],"dependencies":[{"issue_id":"bd-1oo","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.843541Z","created_by":"tayloreernisse"},{"issue_id":"bd-1oo","depends_on_id":"bd-hu3","type":"blocks","created_at":"2026-02-02T21:34:16.505965Z","created_by":"tayloreernisse"}]} {"id":"bd-1qf","title":"[CP1] Discussion and note transformers","description":"## Background\n\nDiscussion and note transformers convert GitLab API discussion responses into our normalized schema. They compute derived fields like `first_note_at`, `last_note_at`, resolvable/resolved status, and note positions. 
These are pure functions with no I/O.\n\n## Approach\n\nCreate transformer module with:\n\n### Structs\n\n```rust\n// src/gitlab/transformers/discussion.rs\n\npub struct NormalizedDiscussion {\n pub gitlab_discussion_id: String,\n pub project_id: i64,\n pub issue_id: i64,\n pub noteable_type: String, // \"Issue\"\n pub individual_note: bool,\n pub first_note_at: Option, // min(note.created_at) in ms epoch\n pub last_note_at: Option, // max(note.created_at) in ms epoch\n pub last_seen_at: i64,\n pub resolvable: bool, // any note is resolvable\n pub resolved: bool, // all resolvable notes are resolved\n}\n\npub struct NormalizedNote {\n pub gitlab_id: i64,\n pub project_id: i64,\n pub note_type: Option, // \"DiscussionNote\" | \"DiffNote\" | null\n pub is_system: bool, // from note.system\n pub author_username: String,\n pub body: String,\n pub created_at: i64, // ms epoch\n pub updated_at: i64, // ms epoch\n pub last_seen_at: i64,\n pub position: i32, // 0-indexed array position\n pub resolvable: bool,\n pub resolved: bool,\n pub resolved_by: Option,\n pub resolved_at: Option,\n}\n```\n\n### Functions\n\n```rust\npub fn transform_discussion(\n gitlab_discussion: &GitLabDiscussion,\n local_project_id: i64,\n local_issue_id: i64,\n) -> NormalizedDiscussion\n\npub fn transform_notes(\n gitlab_discussion: &GitLabDiscussion,\n local_project_id: i64,\n) -> Vec\n```\n\n## Acceptance Criteria\n\n- [ ] `NormalizedDiscussion` struct with all fields\n- [ ] `NormalizedNote` struct with all fields\n- [ ] `transform_discussion` computes first_note_at/last_note_at from notes array\n- [ ] `transform_discussion` computes resolvable (any note is resolvable)\n- [ ] `transform_discussion` computes resolved (all resolvable notes resolved)\n- [ ] `transform_notes` preserves array order via position field (0-indexed)\n- [ ] `transform_notes` maps system flag to is_system\n- [ ] Unit tests cover all computed fields\n\n## Files\n\n- src/gitlab/transformers/mod.rs (add `pub mod 
discussion;`)\n- src/gitlab/transformers/discussion.rs (create)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/discussion_transformer_tests.rs\n#[test] fn transforms_discussion_payload_to_normalized_schema()\n#[test] fn extracts_notes_array_from_discussion()\n#[test] fn sets_individual_note_flag_correctly()\n#[test] fn flags_system_notes_with_is_system_true()\n#[test] fn preserves_note_order_via_position_field()\n#[test] fn computes_first_note_at_and_last_note_at_correctly()\n#[test] fn computes_resolvable_and_resolved_status()\n```\n\nGREEN: Implement transform_discussion and transform_notes\n\nVERIFY: `cargo test discussion_transformer`\n\n## Edge Cases\n\n- Discussion with single note - first_note_at == last_note_at\n- All notes are system notes - still compute timestamps\n- No notes resolvable - resolvable=false, resolved=false\n- Mix of resolved/unresolved notes - resolved=false until all done","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.196079Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:27:11.485112Z","closed_at":"2026-01-25T22:27:11.485058Z","close_reason":"Implemented NormalizedDiscussion, NormalizedNote, transform_discussion, transform_notes with 9 passing unit tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1qf","depends_on_id":"bd-1np","type":"blocks","created_at":"2026-01-25T17:04:05.347218Z","created_by":"tayloreernisse"}]} {"id":"bd-1qz","title":"[CP1] Database migration 002_issues.sql","description":"Create migration file with tables for issues, labels, issue_labels, discussions, and notes.\n\n## Tables\n\n### issues\n- id INTEGER PRIMARY KEY\n- gitlab_id INTEGER UNIQUE NOT NULL\n- project_id INTEGER NOT NULL REFERENCES projects(id)\n- iid INTEGER NOT NULL\n- title TEXT, description TEXT, state TEXT\n- author_username TEXT\n- created_at, updated_at, last_seen_at INTEGER (ms epoch UTC)\n- discussions_synced_for_updated_at INTEGER (watermark for dependent sync)\n- web_url 
TEXT\n- raw_payload_id INTEGER REFERENCES raw_payloads(id)\n\n### labels (name-only for CP1)\n- id INTEGER PRIMARY KEY\n- gitlab_id INTEGER (optional, for future Labels API)\n- project_id INTEGER NOT NULL REFERENCES projects(id)\n- name TEXT NOT NULL\n- color TEXT, description TEXT (nullable, deferred)\n- UNIQUE(project_id, name)\n\n### issue_labels (junction)\n- issue_id, label_id with CASCADE DELETE\n- Clear existing links before INSERT to handle removed labels\n\n### discussions\n- gitlab_discussion_id TEXT (string ID from API)\n- project_id, issue_id/merge_request_id FKs\n- noteable_type TEXT ('Issue' | 'MergeRequest')\n- individual_note INTEGER, first_note_at, last_note_at, last_seen_at\n- resolvable, resolved flags\n- CHECK constraint for Issue vs MR exclusivity\n\n### notes\n- gitlab_id INTEGER UNIQUE NOT NULL\n- discussion_id, project_id FKs\n- note_type, is_system, author_username, body\n- timestamps, position (array order)\n- resolution fields, DiffNote position fields\n\n## Indexes\n- idx_issues_project_updated, idx_issues_author, idx_issues_discussions_sync\n- uq_issues_project_iid, uq_labels_project_name\n- idx_issue_labels_label\n- uq_discussions_project_discussion_id, idx_discussions_issue/mr/last_note\n- idx_notes_discussion/author/system\n\nFiles: migrations/002_issues.sql\nDone when: Migration applies cleanly on top of 001_initial.sql, schema_version = 2","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:42:31.464544Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.685262Z","deleted_at":"2026-01-25T17:02:01.685258Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-1re","title":"[CP1] gi show issue command","description":"Show issue details with discussions.\n\nFlags:\n- --project=PATH (required if iid is ambiguous across projects)\n\nOutput:\n- Title, project, state, author, dates, labels, URL\n- 
Description text\n- All discussions with notes (formatted thread view)\n\nHandle ambiguity: If multiple projects have same iid, prompt for --project or show error.\n\nFiles: src/cli/commands/show.ts\nDone when: Issue detail view displays all fields including threaded discussions","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T15:20:29.826786Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.153211Z","deleted_at":"2026-01-25T15:21:35.153208Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-1s1","title":"[CP1] Integration tests for issue ingestion","description":"Full integration tests for issue ingestion module.\n\n## Tests (tests/issue_ingestion_tests.rs)\n\n- inserts_issues_into_database\n- creates_labels_from_issue_payloads\n- links_issues_to_labels_via_junction_table\n- removes_stale_label_links_on_resync\n- stores_raw_payload_for_each_issue\n- stores_raw_payload_for_each_discussion\n- updates_cursor_incrementally_per_page\n- resumes_from_cursor_on_subsequent_runs\n- handles_issues_with_no_labels\n- upserts_existing_issues_on_refetch\n- skips_discussion_refetch_for_unchanged_issues\n\n## Test Setup\n- tempfile::TempDir for isolated database\n- wiremock::MockServer for GitLab API\n- Mock handlers returning fixture data\n\nFiles: tests/issue_ingestion_tests.rs\nDone when: All integration tests pass with mocked GitLab","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:12.158586Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.109109Z","deleted_at":"2026-01-25T17:02:02.109105Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} -{"id":"bd-1se","title":"Epic: Gate 2 - Cross-Reference Extraction","description":"## Background\nGate 2 builds the entity relationship graph that connects issues, MRs, and 
discussions. Without cross-references, temporal queries can only show events for individually-matched entities. With them, \"lore timeline auth migration\" can discover that MR !567 closed issue #234, which spawned follow-up issue #299 — even if #299 does not contain the words \"auth migration.\"\n\nThree data sources feed entity_references:\n1. **Structured API (reliable):** GET /projects/:id/merge_requests/:iid/closes_issues\n2. **State events (reliable):** resource_state_events.source_merge_request_id\n3. **System note parsing (best-effort):** \"mentioned in !456\", \"closed by !789\" patterns\n\n## Architecture\n- **entity_references table:** Already created in migration 011 (bd-hu3/bd-czk). Stores source→target relationships with reference_type (closes/mentioned/related) and source_method provenance.\n- **Directionality convention:** source = entity where reference was observed, target = entity being referenced. Consistent across all source_methods.\n- **Unresolved references:** Cross-project refs stored with target_entity_id=NULL, target_project_path populated. Still valuable for timeline narratives.\n- **closes_issues fetch:** Uses generic dependent fetch queue (job_type = mr_closes_issues). One API call per MR.\n- **System note parsing:** Local post-processing after all dependent fetches complete. No API calls. English-only, best-effort.\n\n## Children (Execution Order)\n1. **bd-czk** [CLOSED] — entity_references schema (folded into migration 011)\n2. **bd-8t4** [OPEN] — Extract cross-references from resource_state_events (source_merge_request_id)\n3. **bd-3ia** [OPEN] — Fetch closes_issues API and populate entity_references\n4. 
**bd-1ji** [OPEN] — Parse system notes for cross-reference patterns\n\n## Gate Completion Criteria\n- [ ] entity_references populated from closes_issues API for all synced MRs\n- [ ] entity_references populated from state events where source_merge_request_id present\n- [ ] System notes parsed for cross-reference patterns (English instances)\n- [ ] Cross-project references stored as unresolved (target_entity_id=NULL)\n- [ ] source_method tracks provenance of each reference\n- [ ] Deduplication: same relationship from multiple sources stored once (UNIQUE constraint)\n- [ ] Timeline JSON includes expansion provenance (via) for expanded entities\n- [ ] Integration test: sync with all three extraction methods, verify entity_references populated\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) — event tables and dependent fetch queue\n- Downstream: Gate 3 (bd-ike) depends on entity_references for BFS expansion","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:00.981132Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:32:29.146809Z","compaction_level":0,"original_size":0,"labels":["epic","gate-2","phase-b"],"dependencies":[{"issue_id":"bd-1se","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:32:43.028033Z","created_by":"tayloreernisse"}]} +{"id":"bd-1se","title":"Epic: Gate 2 - Cross-Reference Extraction","description":"## Background\nGate 2 builds the entity relationship graph that connects issues, MRs, and discussions. Without cross-references, temporal queries can only show events for individually-matched entities. With them, \"lore timeline auth migration\" can discover that MR !567 closed issue #234, which spawned follow-up issue #299 — even if #299 does not contain the words \"auth migration.\"\n\nThree data sources feed entity_references:\n1. **Structured API (reliable):** GET /projects/:id/merge_requests/:iid/closes_issues\n2. 
**State events (reliable):** resource_state_events.source_merge_request_id\n3. **System note parsing (best-effort):** \"mentioned in !456\", \"closed by !789\" patterns\n\n## Architecture\n- **entity_references table:** Already created in migration 011 (bd-hu3/bd-czk). Stores source→target relationships with reference_type (closes/mentioned/related) and source_method provenance.\n- **Directionality convention:** source = entity where reference was observed, target = entity being referenced. Consistent across all source_methods.\n- **Unresolved references:** Cross-project refs stored with target_entity_id=NULL, target_project_path populated. Still valuable for timeline narratives.\n- **closes_issues fetch:** Uses generic dependent fetch queue (job_type = mr_closes_issues). One API call per MR.\n- **System note parsing:** Local post-processing after all dependent fetches complete. No API calls. English-only, best-effort.\n\n## Children (Execution Order)\n1. **bd-czk** [CLOSED] — entity_references schema (folded into migration 011)\n2. **bd-8t4** [OPEN] — Extract cross-references from resource_state_events (source_merge_request_id)\n3. **bd-3ia** [OPEN] — Fetch closes_issues API and populate entity_references\n4. 
**bd-1ji** [OPEN] — Parse system notes for cross-reference patterns\n\n## Gate Completion Criteria\n- [ ] entity_references populated from closes_issues API for all synced MRs\n- [ ] entity_references populated from state events where source_merge_request_id present\n- [ ] System notes parsed for cross-reference patterns (English instances)\n- [ ] Cross-project references stored as unresolved (target_entity_id=NULL)\n- [ ] source_method tracks provenance of each reference\n- [ ] Deduplication: same relationship from multiple sources stored once (UNIQUE constraint)\n- [ ] Timeline JSON includes expansion provenance (via) for expanded entities\n- [ ] Integration test: sync with all three extraction methods, verify entity_references populated\n\n## Dependencies\n- Depends on: Gate 1 (bd-2zl) — event tables and dependent fetch queue\n- Downstream: Gate 3 (bd-ike) depends on entity_references for BFS expansion","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:31:00.981132Z","created_by":"tayloreernisse","updated_at":"2026-02-05T16:08:26.965177Z","closed_at":"2026-02-05T16:08:26.964997Z","close_reason":"All child beads completed: bd-8t4 (state event extraction), bd-3ia (closes_issues API), bd-1ji (system note parsing)","compaction_level":0,"original_size":0,"labels":["epic","gate-2","phase-b"],"dependencies":[{"issue_id":"bd-1se","depends_on_id":"bd-2zl","type":"blocks","created_at":"2026-02-02T21:32:43.028033Z","created_by":"tayloreernisse"}]} {"id":"bd-1t4","title":"Epic: CP2 Gate C - Dependent Discussion Sync","description":"## Background\nGate C validates the dependent discussion sync with DiffNote position capture. 
This is critical for code review context preservation - without DiffNote positions, we lose the file/line context for review comments.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] Discussions fetched for MRs with updated_at > discussions_synced_for_updated_at\n- [ ] `SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL` > 0\n- [ ] DiffNotes have `position_new_path` populated (file path)\n- [ ] DiffNotes have `position_new_line` populated (line number)\n- [ ] DiffNotes have `position_type` populated (text/image/file)\n- [ ] DiffNotes have SHA triplet: `position_base_sha`, `position_start_sha`, `position_head_sha`\n- [ ] Multi-line DiffNotes have `position_line_range_start` and `position_line_range_end`\n- [ ] Unchanged MRs skip discussion refetch (watermark comparison works)\n- [ ] Watermark NOT advanced on HTTP error mid-pagination\n- [ ] Watermark NOT advanced on note timestamp parse failure\n- [ ] `gi show mr ` displays DiffNote with file context `[path:line]`\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate C: Dependent Discussion Sync ===\"\n\n# 1. Check discussion count for MRs\necho \"Step 1: Check MR discussion count...\"\nMR_DISC_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL;\")\necho \" MR discussions: $MR_DISC_COUNT\"\n[ \"$MR_DISC_COUNT\" -gt 0 ] || { echo \"FAIL: No MR discussions found\"; exit 1; }\n\n# 2. Check note count\necho \"Step 2: Check note count...\"\nNOTE_COUNT=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id IS NOT NULL;\n\")\necho \" MR notes: $NOTE_COUNT\"\n\n# 3. 
Check DiffNote position data\necho \"Step 3: Check DiffNote positions...\"\nDIFFNOTE_COUNT=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM notes WHERE position_new_path IS NOT NULL;\")\necho \" DiffNotes with position: $DIFFNOTE_COUNT\"\n\n# 4. Sample DiffNote data\necho \"Step 4: Sample DiffNote data...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT \n n.gitlab_id,\n n.position_new_path,\n n.position_new_line,\n n.position_type,\n SUBSTR(n.position_head_sha, 1, 7) as head_sha\n FROM notes n\n WHERE n.position_new_path IS NOT NULL\n LIMIT 5;\n\"\n\n# 5. Check multi-line DiffNotes\necho \"Step 5: Check multi-line DiffNotes...\"\nMULTILINE_COUNT=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes \n WHERE position_line_range_start IS NOT NULL \n AND position_line_range_end IS NOT NULL\n AND position_line_range_start != position_line_range_end;\n\")\necho \" Multi-line DiffNotes: $MULTILINE_COUNT\"\n\n# 6. Check watermarks set\necho \"Step 6: Check watermarks...\"\nWATERMARKED=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM merge_requests \n WHERE discussions_synced_for_updated_at IS NOT NULL;\n\")\necho \" MRs with watermark set: $WATERMARKED\"\n\n# 7. Check last_seen_at for sweep pattern\necho \"Step 7: Check last_seen_at (sweep pattern)...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT \n MIN(last_seen_at) as oldest,\n MAX(last_seen_at) as newest\n FROM discussions \n WHERE merge_request_id IS NOT NULL;\n\"\n\n# 8. Test show command with DiffNote\necho \"Step 8: Find MR with DiffNotes for show test...\"\nMR_IID=$(sqlite3 \"$DB_PATH\" \"\n SELECT DISTINCT m.iid\n FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n JOIN notes n ON n.discussion_id = d.id\n WHERE n.position_new_path IS NOT NULL\n LIMIT 1;\n\")\nif [ -n \"$MR_IID\" ]; then\n echo \" Testing: gi show mr $MR_IID\"\n gi show mr \"$MR_IID\" | head -50\nfi\n\n# 9. 
Re-run and verify skip count\necho \"Step 9: Re-run ingest (should skip unchanged MRs)...\"\ngi ingest --type=merge_requests\n# Should report \"Skipped discussion sync for N unchanged MRs\"\n\necho \"\"\necho \"=== Gate C: PASSED ===\"\n```\n\n## Atomicity Test (Manual - Kill Test)\n```bash\n# This tests that partial failure preserves data\n\n# 1. Get an MR with discussions\nMR_ID=$(sqlite3 \"$DB_PATH\" \"\n SELECT m.id FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n LIMIT 1;\n\")\n\n# 2. Note current note count\nBEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id = $MR_ID;\n\")\necho \"Notes before: $BEFORE\"\n\n# 3. Note watermark\nWATERMARK_BEFORE=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark before: $WATERMARK_BEFORE\"\n\n# 4. Force full sync and kill mid-run\ngi ingest --type=merge_requests --full &\nPID=$!\nsleep 3 && kill -9 $PID 2>/dev/null || true\nwait $PID 2>/dev/null || true\n\n# 5. Verify notes preserved (should be same or more, never less)\nAFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM notes n\n JOIN discussions d ON d.id = n.discussion_id\n WHERE d.merge_request_id = $MR_ID;\n\")\necho \"Notes after kill: $AFTER\"\n[ \"$AFTER\" -ge \"$BEFORE\" ] || echo \"WARNING: Notes decreased - atomicity may be broken\"\n\n# 6. 
Note watermark should NOT have advanced if killed mid-pagination\nWATERMARK_AFTER=$(sqlite3 \"$DB_PATH\" \"\n SELECT discussions_synced_for_updated_at FROM merge_requests WHERE id = $MR_ID;\n\")\necho \"Watermark after: $WATERMARK_AFTER\"\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check DiffNote data:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n (SELECT COUNT(*) FROM discussions WHERE merge_request_id IS NOT NULL) as mr_discussions,\n (SELECT COUNT(*) FROM notes WHERE position_new_path IS NOT NULL) as diffnotes,\n (SELECT COUNT(*) FROM merge_requests WHERE discussions_synced_for_updated_at IS NOT NULL) as watermarked;\n\"\n\n# Find MR with DiffNotes and show it:\ngi show mr $(sqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT DISTINCT m.iid FROM merge_requests m\n JOIN discussions d ON d.merge_request_id = m.id\n JOIN notes n ON n.discussion_id = d.id\n WHERE n.position_new_path IS NOT NULL LIMIT 1;\n\")\n```\n\n## Dependencies\nThis gate requires:\n- bd-3j6 (Discussion transformer with DiffNote position extraction)\n- bd-20h (MR discussion ingestion with atomicity guarantees)\n- bd-iba (Client pagination for MR discussions)\n- Gates A and B must pass first\n\n## Edge Cases\n- MRs without discussions: should sync successfully, just with 0 discussions\n- Discussions without DiffNotes: regular comments have NULL position fields\n- Deleted discussions in GitLab: sweep pattern should remove them locally\n- Invalid note timestamps: should NOT advance watermark, should log warning","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:01.769694Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.060017Z","closed_at":"2026-01-27T00:48:21.059974Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1t4","depends_on_id":"bd-20h","type":"blocks","created_at":"2026-01-26T22:08:55.778989Z","created_by":"tayloreernisse"}]} {"id":"bd-1ta","title":"[CP1] 
Integration tests for pagination","description":"Integration tests for GitLab pagination with wiremock.\n\n## Tests (tests/pagination_tests.rs)\n\n### Page Navigation\n- fetches_all_pages_when_multiple_exist\n- respects_per_page_parameter\n- follows_x_next_page_header_until_empty\n- falls_back_to_empty_page_stop_if_headers_missing\n\n### Cursor Behavior\n- applies_cursor_rewind_for_tuple_semantics\n- clamps_negative_rewind_to_zero\n\n## Test Setup\n- Use wiremock::MockServer\n- Set up handlers for /api/v4/projects/:id/issues\n- Return x-next-page headers\n- Verify request params (updated_after, per_page)\n\nFiles: tests/pagination_tests.rs\nDone when: All pagination tests pass with mocked server","status":"tombstone","priority":3,"issue_type":"task","created_at":"2026-01-25T16:59:07.806593Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:02.038945Z","deleted_at":"2026-01-25T17:02:02.038939Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-1u1","title":"Implement document regenerator","description":"## Background\nThe document regenerator drains the dirty_sources queue, regenerating documents for each entry. It uses per-item transactions for crash safety, a triple-hash fast path to skip unchanged documents entirely (no writes at all), and a bounded batch loop that drains completely. Error recording includes backoff computation.\n\n## Approach\nCreate `src/documents/regenerator.rs` per PRD Section 6.3.\n\n**Core function:**\n```rust\npub fn regenerate_dirty_documents(conn: &Connection) -> Result\n```\n\n**RegenerateResult:** { regenerated, unchanged, errored }\n\n**Algorithm (per PRD):**\n1. Loop: get_dirty_sources(conn) -> Vec<(SourceType, i64)>\n2. If empty, break (queue fully drained)\n3. For each (source_type, source_id):\n a. Begin transaction\n b. Call regenerate_one_tx(&tx, source_type, source_id) -> Result\n c. 
If Ok(changed): clear_dirty_tx, commit, count regenerated or unchanged\n d. If Err: record_dirty_error_tx (with backoff), commit, count errored\n\n**regenerate_one_tx (per PRD):**\n1. Extract document via extract_{type}_document(conn, source_id)\n2. If None (deleted): delete_document, return Ok(true)\n3. If Some(doc): call get_existing_hash() to check current state\n4. **If ALL THREE hashes match: return Ok(false) — skip ALL writes** (fast path)\n5. Otherwise: upsert_document with conditional label/path relinking\n6. Return Ok(content changed)\n\n**Helper functions (PRD-exact):**\n\n`get_existing_hash` — uses `optional()` to distinguish missing rows from DB errors:\n```rust\nfn get_existing_hash(\n conn: &Connection,\n source_type: SourceType,\n source_id: i64,\n) -> Result> {\n use rusqlite::OptionalExtension;\n let hash: Option = stmt\n .query_row(params, |row| row.get(0))\n .optional()?; // IMPORTANT: Not .ok() — .ok() would hide real DB errors\n Ok(hash)\n}\n```\n\n`get_document_id` — resolve document ID after upsert:\n```rust\nfn get_document_id(conn: &Connection, source_type: SourceType, source_id: i64) -> Result\n```\n\n`upsert_document` — checks existing triple hash before writing:\n```rust\nfn upsert_document(conn: &Connection, doc: &DocumentData) -> Result<()> {\n // 1. Query existing (id, content_hash, labels_hash, paths_hash) via OptionalExtension\n // 2. Triple-hash fast path: all match -> return Ok(())\n // 3. Upsert document row (ON CONFLICT DO UPDATE)\n // 4. Get doc_id (from existing or query after insert)\n // 5. Only delete+reinsert labels if labels_hash changed\n // 6. 
Only delete+reinsert paths if paths_hash changed\n}\n```\n\n**Key PRD detail — triple-hash fast path:**\n```rust\nif old_content_hash == &doc.content_hash\n && old_labels_hash == &doc.labels_hash\n && old_paths_hash == &doc.paths_hash\n{ return Ok(()); } // Skip ALL writes — prevents WAL churn\n```\n\n**Error recording with backoff:**\nrecord_dirty_error_tx reads current attempt_count from DB, computes next_attempt_at via shared backoff utility:\n```rust\nlet next_attempt_at = crate::core::backoff::compute_next_attempt_at(now, attempt_count + 1);\n```\n\n**All internal functions use _tx suffix** (take &Transaction) for atomicity.\n\n## Acceptance Criteria\n- [ ] Queue fully drained (bounded batch loop until empty)\n- [ ] Per-item transactions (crash loses at most 1 doc)\n- [ ] Triple-hash fast path: ALL THREE hashes match -> skip ALL writes (return Ok(false))\n- [ ] Content change: upsert document, update labels/paths\n- [ ] Labels-only change: relabels but skips path writes (paths_hash unchanged)\n- [ ] Deleted entity: delete document (cascade handles FTS/labels/paths/embeddings)\n- [ ] get_existing_hash uses `.optional()` (not `.ok()`) to preserve DB errors\n- [ ] get_document_id resolves document ID after upsert\n- [ ] Error recording: increment attempt_count, compute next_attempt_at via backoff\n- [ ] FTS triggers fire on insert/update/delete (verified by trigger, not regenerator)\n- [ ] RegenerateResult counts accurate (regenerated, unchanged, errored)\n- [ ] Errors do not abort batch (log, increment, continue)\n- [ ] `cargo test regenerator` passes\n\n## Files\n- `src/documents/regenerator.rs` — new file\n- `src/documents/mod.rs` — add `pub use regenerator::regenerate_dirty_documents;`\n\n## TDD Loop\nRED: Tests requiring DB:\n- `test_creates_new_document` — dirty source -> document created\n- `test_skips_unchanged_triple_hash` — all 3 hashes match -> unchanged count incremented, no DB writes\n- `test_updates_changed_content` — content_hash mismatch -> 
updated\n- `test_updates_changed_labels_only` — content same but labels_hash different -> updated\n- `test_updates_changed_paths_only` — content same but paths_hash different -> updated\n- `test_deletes_missing_source` — source deleted -> document deleted\n- `test_drains_queue` — queue empty after regeneration\n- `test_error_records_backoff` — error -> attempt_count incremented, next_attempt_at set\n- `test_get_existing_hash_not_found` — returns Ok(None) for missing document\nGREEN: Implement regenerate_dirty_documents + all helpers\nVERIFY: `cargo test regenerator`\n\n## Edge Cases\n- Empty queue: return immediately with all-zero counts\n- Extractor error for one item: record_dirty_error_tx, commit, continue\n- Triple-hash prevents WAL churn on incremental syncs (most entities unchanged)\n- Labels change but content does not: labels_hash mismatch triggers upsert with label relinking\n- get_existing_hash on missing document: returns Ok(None) via .optional() (not DB error)\n- get_existing_hash on corrupt DB: propagates real DB error (not masked by .ok())","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:25:55.178825Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:41:29.942386Z","closed_at":"2026-01-30T17:41:29.942324Z","close_reason":"Implemented document regenerator with triple-hash fast path, queue draining, fail-soft error handling + 5 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1u1","depends_on_id":"bd-1yz","type":"blocks","created_at":"2026-01-30T15:29:16.020686Z","created_by":"tayloreernisse"},{"issue_id":"bd-1u1","depends_on_id":"bd-247","type":"blocks","created_at":"2026-01-30T15:29:15.982772Z","created_by":"tayloreernisse"},{"issue_id":"bd-1u1","depends_on_id":"bd-2fp","type":"blocks","created_at":"2026-01-30T15:29:16.055043Z","created_by":"tayloreernisse"}]} @@ -50,6 +51,7 @@ {"id":"bd-1yx","title":"Implement rename chain resolution for file-history","description":"## 
Background\nFile renames need to be resolved as bounded chains to find all historical paths. When a user queries for src/auth/oauth.rs, the system should also find MRs that touched src/auth/handler.rs if it was renamed. This is a core algorithm for both file-history and trace commands.\n\n## Approach\nCreate src/core/file_history.rs:\n\n```rust\n/// Result of rename chain resolution.\npub struct RenameChain {\n /// All paths the file has been known by, ordered chronologically.\n pub paths: Vec<String>,\n /// The original query path.\n pub query_path: String,\n}\n\n/// Resolve the rename chain for a file path.\n/// Starts with the query path and follows renames in both directions\n/// up to max_hops (default 10).\npub fn resolve_rename_chain(\n conn: &Connection,\n query_path: &str,\n max_hops: usize, // default 10, configurable\n) -> Result<RenameChain>\n```\n\nAlgorithm:\n```\n1. path_set = HashSet::from([query_path])\n2. for hop in 0..max_hops:\n3. new_paths = query:\n SELECT old_path, new_path FROM mr_file_changes\n WHERE change_type = 'renamed'\n AND (new_path IN path_set OR old_path IN path_set)\n4. for (old, new) in new_paths:\n5. add the \"other side\" (not already in path_set) to path_set\n6. if no new paths discovered: break\n7. return ordered path set (attempt chronological ordering by MR date)\n```\n\nThe ordering uses MR merge/creation dates to present paths chronologically:\n```sql\nSELECT DISTINCT mfc.old_path, mfc.new_path, COALESCE(mr.merged_at, mr.created_at) as date\nFROM mr_file_changes mfc\nJOIN merge_requests mr ON mr.id = mfc.merge_request_id\nWHERE mfc.change_type = 'renamed'\n AND (mfc.new_path IN (...) 
OR mfc.old_path IN (...))\nORDER BY date;\n```\n\nRegister in src/core/mod.rs: `pub mod file_history;`\n\n## Acceptance Criteria\n- [ ] Single rename resolved: query \"b.rs\" finds \"a.rs\" (renamed a→b)\n- [ ] Multi-hop: a→b→c chain resolved from any entry point\n- [ ] Cycle detection: a→b→a doesn't infinite loop\n- [ ] 10-hop cap enforced\n- [ ] Paths returned in chronological order\n- [ ] --no-follow-renames returns single-element chain (just query path)\n\n## Files\n- src/core/file_history.rs (new)\n- src/core/mod.rs (add `pub mod file_history;`)\n\n## TDD Loop\nRED: tests/file_history_tests.rs:\n- `test_rename_chain_single_hop` - a.rs renamed to b.rs, query b.rs → [a.rs, b.rs]\n- `test_rename_chain_multi_hop` - a→b→c, query c → [a, b, c]\n- `test_rename_chain_reverse_direction` - query a.rs also discovers b.rs\n- `test_rename_chain_cycle_detection` - a→b, b→a in different MRs → terminates\n- `test_rename_chain_hop_cap` - 15-hop chain with cap 10 → truncated at 10\n- `test_rename_chain_no_renames` - file never renamed → single-element chain\n\nSetup: create_test_db with migrations 001-012, seed mr_file_changes with renamed entries.\n\nGREEN: Implement resolve_rename_chain\n\nVERIFY: `cargo test file_history -- --nocapture`\n\n## Edge Cases\n- Same file renamed back and forth (a→b in MR !1, b→a in MR !2): cycle detection handles this\n- File renamed in two different projects: the query is project-scoped, so only same-project renames matter\n- old_path in mr_file_changes is only populated for change_type='renamed' — the WHERE clause already filters this\n- Path matching is exact (case-sensitive on Linux, should be 
consistent)","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.985345Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:49:00.894547Z","compaction_level":0,"original_size":0,"labels":["gate-4","phase-b","query"],"dependencies":[{"issue_id":"bd-1yx","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.986730Z","created_by":"tayloreernisse"},{"issue_id":"bd-1yx","depends_on_id":"bd-1oo","type":"blocks","created_at":"2026-02-02T21:34:16.698782Z","created_by":"tayloreernisse"}]} {"id":"bd-1yz","title":"Implement MR document extraction","description":"## Background\nMR documents are similar to issue documents but include source/target branch information in the header. The extractor queries merge_requests and mr_labels tables. Like issue extraction, it produces a DocumentData struct for the regeneration pipeline.\n\n## Approach\nImplement `extract_mr_document()` in `src/documents/extractor.rs`:\n\n```rust\n/// Extract a searchable document from a merge request.\n/// Returns None if the MR has been deleted from the DB.\npub fn extract_mr_document(conn: &Connection, mr_id: i64) -> Result<Option<DocumentData>>\n```\n\n**SQL queries (from PRD Section 2.2):**\n```sql\n-- Main entity\nSELECT m.id, m.iid, m.title, m.description, m.state, m.author_username,\n m.source_branch, m.target_branch,\n m.created_at, m.updated_at, m.web_url,\n p.path_with_namespace, p.id AS project_id\nFROM merge_requests m\nJOIN projects p ON p.id = m.project_id\nWHERE m.id = ?\n\n-- Labels\nSELECT l.name FROM mr_labels ml\nJOIN labels l ON l.id = ml.label_id\nWHERE ml.merge_request_id = ?\nORDER BY l.name\n```\n\n**Document format:**\n```\n[[MergeRequest]] !456: Implement JWT authentication\nProject: group/project-one\nURL: https://gitlab.example.com/group/project-one/-/merge_requests/456\nLabels: [\"feature\", \"auth\"]\nState: opened\nAuthor: @johndoe\nSource: feature/jwt-auth -> main\n\n--- Description ---\n\nThis MR implements JWT-based 
authentication...\n```\n\n**Key difference from issues:** The `Source:` line with `source_branch -> target_branch`.\n\n## Acceptance Criteria\n- [ ] Deleted MR returns Ok(None)\n- [ ] MR document has `[[MergeRequest]]` prefix with `!` before iid (not `#`)\n- [ ] Source line shows `source_branch -> target_branch`\n- [ ] Labels sorted alphabetically in JSON array\n- [ ] content_hash computed from full content_text\n- [ ] labels_hash computed from sorted labels\n- [ ] paths is empty (MR-level docs don't have DiffNote paths; those are on discussion docs)\n- [ ] `cargo test extract_mr` passes\n\n## Files\n- `src/documents/extractor.rs` — implement `extract_mr_document()`\n\n## TDD Loop\nRED: Tests in `#[cfg(test)] mod tests`:\n- `test_mr_document_format` — verify header matches PRD template with Source line\n- `test_mr_not_found` — returns Ok(None)\n- `test_mr_no_description` — header only\n- `test_mr_branch_info` — Source line correct\nGREEN: Implement extract_mr_document with SQL queries\nVERIFY: `cargo test extract_mr`\n\n## Edge Cases\n- MR with NULL description: skip \"--- Description ---\" section\n- MR with NULL source_branch or target_branch: omit Source line (shouldn't happen in practice)\n- Draft MRs: state field captures this, no special handling needed","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:25:45.521703Z","created_by":"tayloreernisse","updated_at":"2026-01-30T17:30:04.308781Z","closed_at":"2026-01-30T17:30:04.308598Z","close_reason":"Implemented extract_mr_document() with Source line, PRD format, and 5 tests","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-1yz","depends_on_id":"bd-36p","type":"blocks","created_at":"2026-01-30T15:29:15.749264Z","created_by":"tayloreernisse"},{"issue_id":"bd-1yz","depends_on_id":"bd-hrs","type":"blocks","created_at":"2026-01-30T15:29:15.814729Z","created_by":"tayloreernisse"}]} {"id":"bd-1zj6","title":"OBSERV: Enrich robot JSON meta with run_id and 
stages","description":"## Background\nRobot JSON currently has a flat meta.elapsed_ms. This enriches it with run_id and a stages array, making every lore --robot sync output a complete performance profile.\n\n## Approach\nThe robot JSON output is built in src/cli/commands/sync.rs. The current SyncResult (line 15-25) is serialized into the data field. The meta field is built alongside it.\n\n1. Find or create the SyncMeta struct (likely near SyncResult). Add fields:\n```rust\n#[derive(Debug, Serialize)]\nstruct SyncMeta {\n run_id: String,\n elapsed_ms: u64,\n stages: Vec<StageTiming>,\n}\n```\n\n2. After run_sync() completes, extract timings from MetricsLayer:\n```rust\nlet stages = metrics_handle.extract_timings();\nlet meta = SyncMeta {\n run_id: run_id.to_string(),\n elapsed_ms: start.elapsed().as_millis() as u64,\n stages,\n};\n```\n\n3. Build the JSON envelope:\n```rust\nlet output = serde_json::json!({\n \"ok\": true,\n \"data\": result,\n \"meta\": meta,\n});\n```\n\nThe metrics_handle (Arc) must be passed from main.rs to the command handler. This requires adding a parameter to handle_sync_cmd() and run_sync(), or using a global. 
Prefer parameter passing.\n\nSame pattern for standalone ingest: add stages to IngestMeta.\n\n## Acceptance Criteria\n- [ ] lore --robot sync output includes meta.run_id (string, 8 hex chars)\n- [ ] lore --robot sync output includes meta.stages (array of StageTiming)\n- [ ] meta.elapsed_ms still present (total wall clock time)\n- [ ] Each stage has name, elapsed_ms, items_processed at minimum\n- [ ] Top-level stages have sub_stages when applicable\n- [ ] lore --robot ingest also includes run_id and stages\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/cli/commands/sync.rs (add SyncMeta struct, wire extract_timings)\n- src/cli/commands/ingest.rs (same for standalone ingest)\n- src/main.rs (pass metrics_handle to command handlers)\n\n## TDD Loop\nRED: test_sync_meta_includes_stages (run robot-mode sync, parse JSON, assert meta.stages is array)\nGREEN: Add SyncMeta, extract timings, include in JSON output\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Empty stages: if sync runs with --no-docs --no-embed, some stages won't exist. stages array is shorter, not padded.\n- extract_timings() called before root span closes: returns incomplete tree. 
Must call AFTER run_sync returns (span is dropped on function exit).\n- metrics_handle clone: MetricsLayer uses Arc internally, clone is cheap (reference count increment).","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:32.062410Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:31:11.073580Z","closed_at":"2026-02-04T17:31:11.073534Z","close_reason":"Wired MetricsLayer into subscriber stack (all 4 branches), added run_id to SyncResult, enriched SyncMeta with run_id + stages Vec, updated print_sync_json to accept MetricsLayer and extract timings","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-1zj6","depends_on_id":"bd-34ek","type":"blocks","created_at":"2026-02-04T15:55:20.085372Z","created_by":"tayloreernisse"},{"issue_id":"bd-1zj6","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:32.063354Z","created_by":"tayloreernisse"}]} +{"id":"bd-1zwv","title":"Display assignees, due_date, and milestone in lore issues output","description":"## Background\nThe `lore issues ` command displays issue details but omits key metadata that exists in the database: assignees, due dates, and milestones. Users need this information to understand issue context without opening GitLab.\n\n**System fit**: This data is already ingested during issue sync (migration 005) but the show command never queries it.\n\n## Approach\n\nAll changes in `src/cli/commands/show.rs`:\n\n### 1. Update IssueRow struct (line ~119)\nAdd fields to internal row struct:\n```rust\nstruct IssueRow {\n // ... existing 10 fields ...\n due_date: Option, // NEW\n milestone_title: Option, // NEW\n}\n```\n\n### 2. 
Update find_issue() SQL (line ~137)\nExtend SELECT:\n```sql\nSELECT i.id, i.iid, i.title, i.description, i.state, i.author_username,\n i.created_at, i.updated_at, i.web_url, p.path_with_namespace,\n i.due_date, i.milestone_title -- ADD THESE\nFROM issues i ...\n```\n\nUpdate row mapping to extract columns 10 and 11.\n\n### 3. Add get_issue_assignees() (after get_issue_labels ~line 189)\n```rust\nfn get_issue_assignees(conn: &Connection, issue_id: i64) -> Result> {\n let mut stmt = conn.prepare(\n \"SELECT username FROM issue_assignees WHERE issue_id = ? ORDER BY username\"\n )?;\n let assignees = stmt\n .query_map([issue_id], |row| row.get(0))?\n .collect::, _>>()?;\n Ok(assignees)\n}\n```\n\n### 4. Update IssueDetail struct (line ~59)\n```rust\npub struct IssueDetail {\n // ... existing 12 fields ...\n pub assignees: Vec, // NEW\n pub due_date: Option, // NEW\n pub milestone: Option, // NEW\n}\n```\n\n### 5. Update IssueDetailJson struct (line ~770)\nAdd same 3 fields with identical types.\n\n### 6. Update run_show_issue() (line ~89)\n```rust\nlet assignees = get_issue_assignees(&conn, issue.id)?;\n// In return struct:\nassignees,\ndue_date: issue.due_date,\nmilestone: issue.milestone_title,\n```\n\n### 7. Update print_show_issue() (line ~533, after Author line ~548)\n```rust\nif !issue.assignees.is_empty() {\n println!(\"Assignee{}: {}\",\n if issue.assignees.len() > 1 { \"s\" } else { \"\" },\n issue.assignees.iter().map(|a| format!(\"@{}\", a)).collect::>().join(\", \"));\n}\nif let Some(due) = &issue.due_date {\n println!(\"Due: {}\", due);\n}\nif let Some(ms) = &issue.milestone {\n println!(\"Milestone: {}\", ms);\n}\n```\n\n### 8. 
Update From<&IssueDetail> for IssueDetailJson (line ~799)\n```rust\nassignees: issue.assignees.clone(),\ndue_date: issue.due_date.clone(),\nmilestone: issue.milestone.clone(),\n```\n\n## Acceptance Criteria\n- [ ] `cargo test test_get_issue_assignees` passes (3 tests)\n- [ ] `lore issues ` shows Assignees line when assignees exist\n- [ ] `lore issues ` shows Due line when due_date set\n- [ ] `lore issues ` shows Milestone line when milestone set\n- [ ] `lore -J issues ` includes assignees/due_date/milestone in JSON\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n- `src/cli/commands/show.rs` - ALL changes\n\n## TDD Loop\n\n**RED** - Add tests to `src/cli/commands/show.rs` `#[cfg(test)] mod tests`:\n\n```rust\nuse crate::core::db::{create_connection, run_migrations};\nuse std::path::Path;\n\nfn setup_test_db() -> Connection {\n let conn = create_connection(Path::new(\":memory:\")).unwrap();\n run_migrations(&conn).unwrap();\n conn\n}\n\n#[test]\nfn test_get_issue_assignees_empty() {\n let conn = setup_test_db();\n // seed project + issue with no assignees\n let result = get_issue_assignees(&conn, 1).unwrap();\n assert!(result.is_empty());\n}\n\n#[test]\nfn test_get_issue_assignees_multiple_sorted() {\n let conn = setup_test_db();\n // seed with alice, bob\n let result = get_issue_assignees(&conn, 1).unwrap();\n assert_eq!(result, vec![\"alice\", \"bob\"]); // alphabetical\n}\n\n#[test]\nfn test_get_issue_assignees_single() {\n let conn = setup_test_db();\n // seed with charlie only\n let result = get_issue_assignees(&conn, 1).unwrap();\n assert_eq!(result, vec![\"charlie\"]);\n}\n```\n\n**GREEN** - Implement get_issue_assignees() and struct updates\n\n**VERIFY**: `cargo test test_get_issue_assignees && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n- Empty assignees list -> don't print Assignees line\n- NULL due_date -> don't print Due line \n- NULL milestone_title -> don't print Milestone line\n- Single vs multiple assignees -> 
\"Assignee\" vs \"Assignees\" grammar","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-05T15:16:00.105830Z","created_by":"tayloreernisse","updated_at":"2026-02-05T15:26:08.147202Z","closed_at":"2026-02-05T15:26:08.147154Z","close_reason":"Implemented: assignees, due_date, milestone now display in lore issues . All 7 new tests pass.","compaction_level":0,"original_size":0,"labels":["ISSUE"]} {"id":"bd-208","title":"[CP1] Issue ingestion module","description":"## Background\n\nThe issue ingestion module fetches and stores issues with cursor-based incremental sync. It is the primary data ingestion component, establishing the pattern reused for MR ingestion in CP2. The module handles tuple-cursor semantics, raw payload storage, label extraction, and tracking which issues need discussion sync.\n\n## Approach\n\n### Module: src/ingestion/issues.rs\n\n### Key Structs\n\n```rust\n#[derive(Debug, Default)]\npub struct IngestIssuesResult {\n pub fetched: usize,\n pub upserted: usize,\n pub labels_created: usize,\n pub issues_needing_discussion_sync: Vec,\n}\n\n#[derive(Debug, Clone)]\npub struct IssueForDiscussionSync {\n pub local_issue_id: i64,\n pub iid: i64,\n pub updated_at: i64, // ms epoch\n}\n```\n\n### Main Function\n\n```rust\npub async fn ingest_issues(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n project_id: i64, // Local DB project ID\n gitlab_project_id: i64,\n) -> Result\n```\n\n### Logic (Step by Step)\n\n1. **Get current cursor** from sync_cursors table:\n```sql\nSELECT updated_at_cursor, tie_breaker_id\nFROM sync_cursors\nWHERE project_id = ? AND resource_type = 'issues'\n```\n\n2. **Call pagination method** with cursor rewind:\n```rust\nlet issues_stream = client.paginate_issues(\n gitlab_project_id,\n cursor.updated_at_cursor,\n config.sync.cursor_rewind_seconds,\n);\n```\n\n3. 
**Apply local filtering** for tuple cursor semantics:\n```rust\n// Skip if issue.updated_at < cursor_updated_at\n// Skip if issue.updated_at == cursor_updated_at AND issue.gitlab_id <= cursor_gitlab_id\nfn passes_cursor_filter(issue: &GitLabIssue, cursor: &SyncCursor) -> bool {\n if issue.updated_at < cursor.updated_at_cursor {\n return false;\n }\n if issue.updated_at == cursor.updated_at_cursor \n && issue.gitlab_id <= cursor.tie_breaker_id {\n return false;\n }\n true\n}\n```\n\n4. **For each issue passing filter**:\n```rust\n// Begin transaction (unchecked_transaction for rusqlite)\nlet tx = conn.unchecked_transaction()?;\n\n// Store raw payload (compressed based on config)\nlet payload_id = store_raw_payload(&tx, &issue_json, config.storage.compress_raw_payloads)?;\n\n// Transform and upsert issue\nlet issue_row = transform_issue(&issue)?;\nupsert_issue(&tx, &issue_row, project_id, payload_id)?;\nlet local_issue_id = get_local_issue_id(&tx, project_id, issue.iid)?;\n\n// Clear existing label links (stale removal!)\ntx.execute(\"DELETE FROM issue_labels WHERE issue_id = ?\", [local_issue_id])?;\n\n// Extract and upsert labels\nfor label_name in &issue_row.label_names {\n let label_id = upsert_label(&tx, project_id, label_name)?;\n link_issue_label(&tx, local_issue_id, label_id)?;\n}\n\ntx.commit()?;\n```\n\n5. **Incremental cursor update** every 100 issues:\n```rust\nif batch_count % 100 == 0 {\n update_sync_cursor(conn, project_id, \"issues\", last_updated_at, last_gitlab_id)?;\n}\n```\n\n6. **Final cursor update** after all issues processed\n\n7. 
**Determine issues needing discussion sync**:\n```sql\nSELECT id, iid, updated_at\nFROM issues\nWHERE project_id = ?\n AND updated_at > COALESCE(discussions_synced_for_updated_at, 0)\n```\n\n### Helper Functions\n\n```rust\nfn store_raw_payload(conn, json: &Value, compress: bool) -> Result\nfn upsert_issue(conn, issue: &IssueRow, project_id: i64, payload_id: i64) -> Result<()>\nfn get_local_issue_id(conn, project_id: i64, iid: i64) -> Result\nfn upsert_label(conn, project_id: i64, name: &str) -> Result\nfn link_issue_label(conn, issue_id: i64, label_id: i64) -> Result<()>\nfn update_sync_cursor(conn, project_id: i64, resource: &str, updated_at: i64, gitlab_id: i64) -> Result<()>\n```\n\n### Critical Invariant\n\nStale label links MUST be removed on resync. The \"DELETE then INSERT\" pattern ensures GitLab reality is reflected locally. If an issue had labels [A, B] and now has [A, C], the B link must be removed.\n\n## Acceptance Criteria\n\n- [ ] `ingest_issues` returns IngestIssuesResult with all counts\n- [ ] Cursor fetched from sync_cursors at start\n- [ ] Cursor rewind applied before API call\n- [ ] Local filtering skips already-processed issues\n- [ ] Each issue wrapped in transaction for atomicity\n- [ ] Raw payload stored with correct compression\n- [ ] Issue upserted (INSERT OR REPLACE pattern)\n- [ ] Existing label links deleted before new links inserted\n- [ ] Labels upserted (INSERT OR IGNORE by project+name)\n- [ ] Cursor updated every 100 issues (crash recovery)\n- [ ] Final cursor update after all issues\n- [ ] issues_needing_discussion_sync populated correctly\n\n## Files\n\n- src/ingestion/mod.rs (add `pub mod issues;`)\n- src/ingestion/issues.rs (create)\n\n## TDD Loop\n\nRED:\n```rust\n// tests/issue_ingestion_tests.rs\n#[tokio::test] async fn ingests_issues_from_stream()\n#[tokio::test] async fn applies_cursor_filter_correctly()\n#[tokio::test] async fn updates_cursor_every_100_issues()\n#[tokio::test] async fn 
stores_raw_payload_for_each_issue()\n#[tokio::test] async fn upserts_issues_correctly()\n\n// tests/label_linkage_tests.rs\n#[tokio::test] async fn extracts_and_stores_labels()\n#[tokio::test] async fn removes_stale_label_links_on_resync()\n#[tokio::test] async fn handles_empty_labels_array()\n\n// tests/discussion_eligibility_tests.rs\n#[tokio::test] async fn identifies_issues_needing_discussion_sync()\n#[tokio::test] async fn skips_issues_with_current_watermark()\n```\n\nGREEN: Implement ingest_issues with all helper functions\n\nVERIFY: `cargo test issue_ingestion && cargo test label_linkage && cargo test discussion_eligibility`\n\n## Edge Cases\n\n- Empty issues stream - return result with all zeros\n- Cursor at epoch 0 - fetch all issues (no filtering)\n- Issue with no labels - empty Vec, no label links created\n- Issue with 50+ labels - all should be linked\n- Crash mid-batch - cursor at last 100-boundary, some issues re-fetched\n- Label already exists - upsert via INSERT OR IGNORE\n- Same issue fetched twice (due to rewind) - upsert handles it","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.245404Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:52:38.003964Z","closed_at":"2026-01-25T22:52:38.003868Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-208","depends_on_id":"bd-2iq","type":"blocks","created_at":"2026-01-25T17:04:05.425224Z","created_by":"tayloreernisse"},{"issue_id":"bd-208","depends_on_id":"bd-3nd","type":"blocks","created_at":"2026-01-25T17:04:05.450341Z","created_by":"tayloreernisse"},{"issue_id":"bd-208","depends_on_id":"bd-xhz","type":"blocks","created_at":"2026-01-25T17:04:05.473203Z","created_by":"tayloreernisse"}]} {"id":"bd-20e","title":"Define TimelineEvent model and TimelineEventType enum","description":"## Background\nThe timeline needs a unified event model that spans multiple source tables (resource events, issues, MRs, notes). 
This is a read-time virtual event stream — not a separate stored table. The spec (§3.3) defines the exact shapes.\n\n## Approach\nCreate src/core/timeline.rs with the core types:\n\n```rust\nuse serde::Serialize;\n\n/// A single event in a timeline query result.\n#[derive(Debug, Clone, Serialize)]\npub struct TimelineEvent {\n pub timestamp: i64, // ms epoch UTC\n pub entity_type: String, // \"issue\" | \"merge_request\"\n pub entity_iid: i64,\n pub project_path: String,\n pub event_type: TimelineEventType,\n pub summary: String, // human-readable one-liner\n pub actor: Option, // username\n pub url: Option,\n pub is_seed: bool, // matched by keyword (vs. expanded via reference)\n}\n\n/// Types of events in the timeline.\n#[derive(Debug, Clone, Serialize)]\n#[serde(tag = \"type\", rename_all = \"snake_case\")]\npub enum TimelineEventType {\n Created,\n StateChanged { state: String },\n LabelAdded { label: String },\n LabelRemoved { label: String },\n MilestoneSet { milestone: String },\n MilestoneRemoved { milestone: String },\n Merged,\n NoteEvidence {\n note_id: i64,\n snippet: String, // first ~200 chars\n discussion_id: Option,\n },\n CrossReferenced { target: String },\n}\n\nimpl TimelineEvent {\n /// Create a summary string for human output.\n pub fn format_summary(&self) -> String { ... }\n}\n\nimpl Ord for TimelineEvent {\n /// Sort chronologically by timestamp, then by entity_iid for stable tiebreak.\n fn cmp(&self, other: &Self) -> Ordering {\n self.timestamp.cmp(&other.timestamp)\n .then_with(|| self.entity_iid.cmp(&other.entity_iid))\n }\n}\nimpl PartialOrd for TimelineEvent { ... }\nimpl Eq for TimelineEvent {}\nimpl PartialEq for TimelineEvent { ... 
}\n\n/// Result of a timeline query.\npub struct TimelineResult {\n pub query: String,\n pub events: Vec<TimelineEvent>,\n pub seed_entities: Vec<EntityRef>,\n pub expanded_entities: Vec<ExpandedEntityRef>,\n pub unresolved_references: Vec<UnresolvedRef>,\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct EntityRef {\n pub entity_type: String,\n pub iid: i64,\n pub project: String,\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct ExpandedEntityRef {\n pub entity_type: String,\n pub iid: i64,\n pub project: String,\n pub depth: usize,\n pub via: ExpansionProvenance,\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct ExpansionProvenance {\n pub from: EntityRef,\n pub reference_type: String,\n pub source_method: String,\n}\n\n#[derive(Debug, Clone, Serialize)]\npub struct UnresolvedRef {\n pub source: EntityRef,\n pub target_project: String,\n pub target_type: String,\n pub target_iid: i64,\n pub reference_type: String,\n}\n```\n\nRegister in src/core/mod.rs: `pub mod timeline;`\n\n## Acceptance Criteria\n- [ ] All types compile with Serialize derive\n- [ ] TimelineEventType serde serializes with tagged union (type field)\n- [ ] TimelineEvent Ord sorts chronologically with stable tiebreak\n- [ ] format_summary produces readable strings for each event type\n- [ ] All helper types (EntityRef, ExpandedEntityRef, UnresolvedRef) defined\n\n## Files\n- src/core/timeline.rs (new)\n- src/core/mod.rs (add `pub mod timeline;`)\n\n## TDD Loop\nRED: tests/timeline_types_tests.rs:\n- `test_timeline_event_sort_chronological` - events at different timestamps sort correctly\n- `test_timeline_event_sort_stable_tiebreak` - same timestamp, different iids, stable order\n- `test_timeline_event_type_serialize` - verify JSON tag serialization for each variant\n- `test_format_summary_created` - verify human string for Created\n- `test_format_summary_state_changed` - verify \"closed\" / \"reopened\" etc.\n- `test_format_summary_note_evidence` - verify snippet truncation\n\nGREEN: Define all types and implement traits\n\nVERIFY: `cargo test 
timeline_types -- --nocapture`\n\n## Edge Cases\n- TimelineEventType::NoteEvidence snippet should be truncated at ~200 chars with \"...\" suffix — handle multi-byte UTF-8 correctly (don't split in middle of char)\n- serde tag=\"type\" may conflict if any variant has a field named \"type\" — none do, so safe\n- Ord comparison on timestamps should handle equal timestamps gracefully — iid tiebreak ensures stability","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:08.569126Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:45:25.651058Z","compaction_level":0,"original_size":0,"labels":["gate-3","phase-b","types"],"dependencies":[{"issue_id":"bd-20e","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:08.573079Z","created_by":"tayloreernisse"}]} {"id":"bd-20h","title":"Implement MR discussion ingestion module","description":"## Background\nMR discussion ingestion with critical atomicity guarantees. Parse notes BEFORE destructive DB operations to prevent data loss. Watermark ONLY advanced on full success.\n\n## Approach\nCreate `src/ingestion/mr_discussions.rs` with:\n1. `IngestMrDiscussionsResult` - Per-MR stats\n2. `ingest_mr_discussions()` - Main function with atomicity guarantees\n3. Upsert + sweep pattern for notes (not delete-all-then-insert)\n4. 
Sync health telemetry for debugging failures\n\n## Files\n- `src/ingestion/mr_discussions.rs` - New module\n- `tests/mr_discussion_ingestion_tests.rs` - Integration tests\n\n## Acceptance Criteria\n- [ ] `IngestMrDiscussionsResult` has: discussions_fetched, discussions_upserted, notes_upserted, notes_skipped_bad_timestamp, diffnotes_count, pagination_succeeded\n- [ ] `ingest_mr_discussions()` returns `Result<IngestMrDiscussionsResult>`\n- [ ] CRITICAL: Notes parsed BEFORE any DELETE operations\n- [ ] CRITICAL: Watermark NOT advanced if `pagination_succeeded == false`\n- [ ] CRITICAL: Watermark NOT advanced if any note parse fails\n- [ ] Upsert + sweep pattern using `last_seen_at`\n- [ ] Stale discussions/notes removed only on full success\n- [ ] Selective raw payload storage (skip system notes without position)\n- [ ] Sync health telemetry recorded on failure\n- [ ] `does_not_advance_discussion_watermark_on_partial_failure` test passes\n- [ ] `atomic_note_replacement_preserves_data_on_parse_failure` test passes\n\n## TDD Loop\nRED: `cargo test does_not_advance_watermark` -> test fails\nGREEN: Add ingestion with atomicity guarantees\nVERIFY: `cargo test mr_discussion_ingestion`\n\n## Main Function\n```rust\npub async fn ingest_mr_discussions(\n conn: &Connection,\n client: &GitLabClient,\n config: &Config,\n project_id: i64,\n gitlab_project_id: i64,\n mr_iid: i64,\n local_mr_id: i64,\n mr_updated_at: i64,\n) -> Result<IngestMrDiscussionsResult>\n```\n\n## CRITICAL: Atomic Note Replacement\n```rust\n// Record sync start time for sweep\nlet run_seen_at = now_ms();\n\nwhile let Some(discussion_result) = stream.next().await {\n let discussion = match discussion_result {\n Ok(d) => d,\n Err(e) => {\n result.pagination_succeeded = false;\n break; // Stop but don't advance watermark\n }\n };\n \n // CRITICAL: Parse BEFORE destructive operations\n let notes = match transform_notes_with_diff_position(&discussion, project_id) {\n Ok(notes) => notes,\n Err(e) => {\n warn!(\"Note transform failed; preserving existing notes\");\n 
result.notes_skipped_bad_timestamp += discussion.notes.len();\n result.pagination_succeeded = false;\n continue; // Skip this discussion, don't delete existing\n }\n };\n \n // Only NOW start transaction (after parse succeeded)\n let tx = conn.unchecked_transaction()?;\n \n // Upsert discussion with run_seen_at\n // Upsert notes with run_seen_at (not delete-all)\n \n tx.commit()?;\n}\n```\n\n## Stale Data Sweep (only on success)\n```rust\nif result.pagination_succeeded {\n // Sweep stale discussions\n conn.execute(\n \"DELETE FROM discussions\n WHERE project_id = ? AND merge_request_id = ?\n AND last_seen_at < ?\",\n params![project_id, local_mr_id, run_seen_at],\n )?;\n \n // Sweep stale notes\n conn.execute(\n \"DELETE FROM notes\n WHERE discussion_id IN (\n SELECT id FROM discussions\n WHERE project_id = ? AND merge_request_id = ?\n )\n AND last_seen_at < ?\",\n params![project_id, local_mr_id, run_seen_at],\n )?;\n}\n```\n\n## Watermark Update (ONLY on success)\n```rust\nif result.pagination_succeeded {\n mark_discussions_synced(conn, local_mr_id, mr_updated_at)?;\n clear_sync_health_error(conn, local_mr_id)?;\n} else {\n record_sync_health_error(conn, local_mr_id, \"Pagination incomplete or parse failure\")?;\n warn!(\"Watermark NOT advanced; will retry on next sync\");\n}\n```\n\n## Selective Payload Storage\n```rust\n// Only store payload for DiffNotes and non-system notes\nlet should_store_note_payload =\n !note.is_system() ||\n note.position_new_path().is_some() ||\n note.position_old_path().is_some();\n```\n\n## Integration Tests (CRITICAL)\n```rust\n#[tokio::test]\nasync fn does_not_advance_discussion_watermark_on_partial_failure() {\n // Setup: MR with updated_at > discussions_synced_for_updated_at\n // Mock: Page 1 returns OK, Page 2 returns 500\n // Assert: discussions_synced_for_updated_at unchanged\n}\n\n#[tokio::test]\nasync fn does_not_advance_discussion_watermark_on_note_parse_failure() {\n // Setup: Existing notes in DB\n // Mock: Discussion 
with note having invalid created_at\n // Assert: Original notes preserved, watermark unchanged\n}\n\n#[tokio::test]\nasync fn atomic_note_replacement_preserves_data_on_parse_failure() {\n // Setup: Discussion with 3 valid notes\n // Mock: Updated discussion where note 2 has bad timestamp\n // Assert: All 3 original notes still in DB\n}\n```\n\n## Edge Cases\n- HTTP error mid-pagination: preserve existing data, log error, no watermark advance\n- Invalid note timestamp: skip discussion, preserve existing notes\n- System notes without position: don't store raw payload (saves space)\n- Empty discussion: still upsert discussion record, no notes","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:42.335714Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:22:43.207057Z","closed_at":"2026-01-27T00:22:43.206996Z","close_reason":"Implemented MR discussion ingestion module with full atomicity guarantees:\n- IngestMrDiscussionsResult with all required fields\n- parse-before-destructive pattern (transform notes before DB ops)\n- Upsert + sweep pattern with last_seen_at timestamps\n- Watermark advanced ONLY on full pagination success\n- Selective payload storage (skip system notes without position)\n- Sync health telemetry for failure debugging\n- All 163 tests passing","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-20h","depends_on_id":"bd-3ir","type":"blocks","created_at":"2026-01-26T22:08:54.649094Z","created_by":"tayloreernisse"},{"issue_id":"bd-20h","depends_on_id":"bd-3j6","type":"blocks","created_at":"2026-01-26T22:08:54.686066Z","created_by":"tayloreernisse"},{"issue_id":"bd-20h","depends_on_id":"bd-iba","type":"blocks","created_at":"2026-01-26T22:08:54.722746Z","created_by":"tayloreernisse"}]} @@ -91,7 +93,7 @@ {"id":"bd-2yo","title":"Fetch MR diffs API and populate mr_file_changes","description":"## Background\nGET /projects/:id/merge_requests/:iid/diffs returns file change metadata. 
We extract file paths and change types but NOT diff content. Uses the generic dependent fetch queue (job_type = 'mr_diffs').\n\n## Approach\n\n**1. Add API endpoint (src/gitlab/client.rs):**\n```rust\npub async fn fetch_mr_diffs(&self, project_id: i64, iid: i64) -> Result<Vec<GitLabDiffFile>>\n```\n\nNew type in src/gitlab/types.rs:\n```rust\n#[derive(Debug, Clone, Deserialize)]\npub struct GitLabDiffFile {\n pub old_path: String,\n pub new_path: String,\n pub new_file: bool,\n pub renamed_file: bool,\n pub deleted_file: bool,\n // diff content fields exist but we ignore them\n}\n```\n\nURL: `GET /api/v4/projects/{project_id}/merge_requests/{iid}/diffs?per_page=100`\n\n**2. Enqueue jobs during MR ingestion:**\nIn orchestrator.rs, after MR upsert (when fetch_mr_file_changes is true):\n```rust\nif config.sync.fetch_mr_file_changes {\n enqueue_job(conn, project_id, \"merge_request\", iid, local_id, \"mr_diffs\", None)?;\n}\n```\n\n**3. Process jobs in drain step:**\nHandle \"mr_diffs\" job_type:\n```rust\nlet diffs = client.fetch_mr_diffs(gitlab_project_id, job.entity_iid).await?;\n// DELETE existing rows for this MR (diffs can change on rebase)\nconn.execute(\"DELETE FROM mr_file_changes WHERE merge_request_id = ?\", [job.entity_local_id])?;\n// Insert new rows\nfor diff in &diffs {\n let change_type = if diff.new_file { \"added\" }\n else if diff.renamed_file { \"renamed\" }\n else if diff.deleted_file { \"deleted\" }\n else { \"modified\" };\n conn.execute(\n \"INSERT INTO mr_file_changes (merge_request_id, project_id, old_path, new_path, change_type) VALUES (?, ?, ?, ?, ?)\",\n params![job.entity_local_id, project_id, \n if diff.renamed_file { Some(&diff.old_path) } else { None },\n &diff.new_path, change_type],\n )?;\n}\n```\n\n**4. Also capture commit SHAs:**\nDuring MR ingestion (orchestrator.rs), update merge_requests with merge_commit_sha and squash_commit_sha from the GitLab API response. 
These fields need to be added to GitLabMergeRequest type and transformer.\n\nAdd to src/gitlab/types.rs GitLabMergeRequest:\n```rust\npub merge_commit_sha: Option<String>,\npub squash_commit_sha: Option<String>,\n```\n\nUpdate MR transformer to pass these through, and UPDATE merge_requests SET merge_commit_sha = ?, squash_commit_sha = ? during upsert.\n\n## Acceptance Criteria\n- [ ] fetch_mr_diffs returns file metadata (no diff content)\n- [ ] Change types correctly derived: new_file→added, renamed_file→renamed, deleted_file→deleted, else→modified\n- [ ] Re-sync DELETEs + re-inserts (handles rebased MRs)\n- [ ] old_path only populated for renamed files\n- [ ] merge_commit_sha and squash_commit_sha captured in merge_requests table\n- [ ] Jobs only enqueued when fetch_mr_file_changes is true\n\n## Files\n- src/gitlab/client.rs (add fetch_mr_diffs)\n- src/gitlab/types.rs (add GitLabDiffFile, add fields to GitLabMergeRequest)\n- src/gitlab/transformers/merge_request.rs (pass through commit SHAs)\n- src/ingestion/orchestrator.rs (enqueue mr_diffs jobs, update commit SHAs)\n- src/core/drain.rs or sync.rs (handle mr_diffs in drain dispatcher)\n\n## TDD Loop\nRED: tests/file_changes_tests.rs:\n- `test_derive_change_type_added` - new_file=true → \"added\"\n- `test_derive_change_type_renamed` - renamed_file=true → \"renamed\", old_path populated\n- `test_derive_change_type_deleted` - deleted_file=true → \"deleted\"\n- `test_derive_change_type_modified` - all false → \"modified\"\n- `test_resync_deletes_and_reinserts` - insert, then re-process with different files, verify old rows gone\n\ntests/gitlab_types_tests.rs:\n- `test_deserialize_diff_file` - verify GitLabDiffFile deserialization\n- `test_deserialize_mr_with_commit_shas` - verify new fields on GitLabMergeRequest\n\nGREEN: Implement API endpoint, change type derivation, drain handler, commit SHA capture\n\nVERIFY: `cargo test file_changes -- --nocapture && cargo test gitlab_types -- --nocapture`\n\n## Edge Cases\n- MR with 1000+ 
files (monorepo): pagination essential on diffs endpoint\n- old_path for non-renames: GitLab still returns old_path (same as new_path) — only store when renamed_file=true\n- Draft MRs: diffs may change frequently — DELETE+INSERT handles this\n- MR with no diffs (empty MR): returns empty array, no rows inserted, job still completed\n- merge_commit_sha is NULL until MR is merged — don't error on NULL","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:34:08.939514Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:48:37.319521Z","compaction_level":0,"original_size":0,"labels":["api","gate-4","phase-b"],"dependencies":[{"issue_id":"bd-2yo","depends_on_id":"bd-14q","type":"parent-child","created_at":"2026-02-02T21:34:08.941359Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-1oo","type":"blocks","created_at":"2026-02-02T21:34:16.555239Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-jec","type":"blocks","created_at":"2026-02-02T21:34:16.656402Z","created_by":"tayloreernisse"},{"issue_id":"bd-2yo","depends_on_id":"bd-tir","type":"blocks","created_at":"2026-02-02T21:34:16.605198Z","created_by":"tayloreernisse"}]} {"id":"bd-2yq","title":"[CP1] Issue transformer with label extraction","description":"Transform GitLab issue payloads to normalized database schema.\n\nFunctions to implement:\n- transformIssue(gitlabIssue, localProjectId) → NormalizedIssue\n- extractLabels(gitlabIssue, localProjectId) → Label[]\n\nTransformation rules:\n- Convert ISO timestamps to ms epoch using isoToMs()\n- Set last_seen_at to nowMs()\n- Handle labels vs labels_details (prefer details when available)\n- Handle missing optional fields gracefully\n\nFiles: src/gitlab/transformers/issue.ts\nTests: tests/unit/issue-transformer.test.ts\nDone when: Unit tests pass for payload transformation and label 
extraction","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:09.660448Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.152259Z","deleted_at":"2026-01-25T15:21:35.152254Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-2ys","title":"[CP1] Cargo.toml updates - async-stream and futures","description":"## Background\n\nThe GitLab client pagination methods require async streaming capabilities. The `async-stream` crate provides the `stream!` macro for creating async iterators, and `futures` provides `StreamExt` for consuming them with `.next()` and other combinators.\n\n## Approach\n\nAdd these dependencies to Cargo.toml:\n\n```toml\n[dependencies]\nasync-stream = \"0.3\"\nfutures = { version = \"0.3\", default-features = false, features = [\"alloc\"] }\n```\n\nUse minimal features on `futures` to avoid pulling unnecessary code.\n\n## Acceptance Criteria\n\n- [ ] `async-stream = \"0.3\"` is in Cargo.toml [dependencies]\n- [ ] `futures` with `alloc` feature is in Cargo.toml [dependencies]\n- [ ] `cargo check` succeeds after adding dependencies\n\n## Files\n\n- Cargo.toml (edit)\n\n## TDD Loop\n\nRED: Not applicable (dependency addition)\nGREEN: Add lines to Cargo.toml\nVERIFY: `cargo check`\n\n## Edge Cases\n\n- If `futures` is already present, merge features rather than duplicate\n- Use exact version pins for reproducibility","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-25T17:02:38.104664Z","created_by":"tayloreernisse","updated_at":"2026-01-25T22:25:10.274787Z","closed_at":"2026-01-25T22:25:10.274727Z","close_reason":"Added async-stream 0.3 and futures 0.3 (alloc feature) to Cargo.toml, cargo check passes","compaction_level":0,"original_size":0} -{"id":"bd-2zl","title":"Epic: Gate 1 - Resource Events Ingestion","description":"## Background\nGate 1 transforms gitlore from a snapshot engine into a temporal data 
store by ingesting structured event data from GitLab Resource Events APIs (state, label, milestone changes). This is the foundation — Gates 2-5 all depend on the event tables and dependent fetch queue that Gate 1 establishes.\n\nCurrently, when an issue is closed or a label changes, gitlore overwrites the current state. The transition is lost. Gate 1 captures these transitions as discrete events with timestamps, actors, and provenance, enabling temporal queries like \"when did this issue become critical?\" and \"who closed this MR?\"\n\n## Architecture\n- **Three new tables:** resource_state_events, resource_label_events, resource_milestone_events (migration 011, already shipped as bd-hu3)\n- **Generic dependent fetch queue:** pending_dependent_fetches table replaces per-type queue tables. Supports job_types: resource_events, mr_closes_issues, mr_diffs. Used by Gates 1, 2, and 4.\n- **Opt-in via config:** sync.fetchResourceEvents (default true). --no-events CLI flag to skip.\n- **Incremental:** Only changed entities enqueued. --full re-enqueues all.\n- **Crash recovery:** locked_at column with 5-minute stale lock reclaim.\n\n## Children (Execution Order)\n1. **bd-hu3** [CLOSED] — Migration 011: event tables + entity_references + dependent fetch queue\n2. **bd-2e8** [CLOSED] — fetchResourceEvents config flag\n3. **bd-2fm** [CLOSED] — GitLab Resource Event serde types\n4. **bd-sqw** [CLOSED] — Resource Events API endpoints in GitLab client\n5. **bd-1uc** [CLOSED] — DB upsert functions for resource events\n6. **bd-tir** [CLOSED] — Generic dependent fetch queue (enqueue + drain)\n7. **bd-1ep** [CLOSED] — Wire resource event fetching into sync pipeline\n8. **bd-3sh** [CLOSED] — lore count events command\n9. 
**bd-1m8** [CLOSED] — lore stats --check for event integrity + queue health\n\n## Gate Completion Criteria\n- [ ] All 9 children closed\n- [ ] `lore sync` fetches resource events for changed entities\n- [ ] `lore sync --no-events` skips event fetching\n- [ ] Event fetch failures queued for retry with exponential backoff\n- [ ] Stale locks auto-reclaimed on next sync run\n- [ ] `lore count events` shows counts by type (state/label/milestone)\n- [ ] `lore stats --check` validates referential integrity + queue health\n- [ ] Robot mode JSON for all new commands\n- [ ] Integration test: full sync cycle with events enabled\n\n## Dependencies\n- None (Gate 1 is the foundation)\n- Downstream: Gate 2 (bd-1se) depends on event tables and dependent fetch queue","status":"open","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:30:49.136036Z","created_by":"tayloreernisse","updated_at":"2026-02-04T19:32:13.737741Z","compaction_level":0,"original_size":0,"labels":["epic","gate-1","phase-b"]} +{"id":"bd-2zl","title":"Epic: Gate 1 - Resource Events Ingestion","description":"## Background\nGate 1 transforms gitlore from a snapshot engine into a temporal data store by ingesting structured event data from GitLab Resource Events APIs (state, label, milestone changes). This is the foundation — Gates 2-5 all depend on the event tables and dependent fetch queue that Gate 1 establishes.\n\nCurrently, when an issue is closed or a label changes, gitlore overwrites the current state. The transition is lost. Gate 1 captures these transitions as discrete events with timestamps, actors, and provenance, enabling temporal queries like \"when did this issue become critical?\" and \"who closed this MR?\"\n\n## Architecture\n- **Three new tables:** resource_state_events, resource_label_events, resource_milestone_events (migration 011, already shipped as bd-hu3)\n- **Generic dependent fetch queue:** pending_dependent_fetches table replaces per-type queue tables. 
Supports job_types: resource_events, mr_closes_issues, mr_diffs. Used by Gates 1, 2, and 4.\n- **Opt-in via config:** sync.fetchResourceEvents (default true). --no-events CLI flag to skip.\n- **Incremental:** Only changed entities enqueued. --full re-enqueues all.\n- **Crash recovery:** locked_at column with 5-minute stale lock reclaim.\n\n## Children (Execution Order)\n1. **bd-hu3** [CLOSED] — Migration 011: event tables + entity_references + dependent fetch queue\n2. **bd-2e8** [CLOSED] — fetchResourceEvents config flag\n3. **bd-2fm** [CLOSED] — GitLab Resource Event serde types\n4. **bd-sqw** [CLOSED] — Resource Events API endpoints in GitLab client\n5. **bd-1uc** [CLOSED] — DB upsert functions for resource events\n6. **bd-tir** [CLOSED] — Generic dependent fetch queue (enqueue + drain)\n7. **bd-1ep** [CLOSED] — Wire resource event fetching into sync pipeline\n8. **bd-3sh** [CLOSED] — lore count events command\n9. **bd-1m8** [CLOSED] — lore stats --check for event integrity + queue health\n\n## Gate Completion Criteria\n- [ ] All 9 children closed\n- [ ] `lore sync` fetches resource events for changed entities\n- [ ] `lore sync --no-events` skips event fetching\n- [ ] Event fetch failures queued for retry with exponential backoff\n- [ ] Stale locks auto-reclaimed on next sync run\n- [ ] `lore count events` shows counts by type (state/label/milestone)\n- [ ] `lore stats --check` validates referential integrity + queue health\n- [ ] Robot mode JSON for all new commands\n- [ ] Integration test: full sync cycle with events enabled\n\n## Dependencies\n- None (Gate 1 is the foundation)\n- Downstream: Gate 2 (bd-1se) depends on event tables and dependent fetch queue","status":"closed","priority":1,"issue_type":"feature","created_at":"2026-02-02T21:30:49.136036Z","created_by":"tayloreernisse","updated_at":"2026-02-05T16:06:52.080788Z","closed_at":"2026-02-05T16:06:52.080725Z","close_reason":"Already implemented: migration 011 exists, events_db.rs has upsert functions, 
client.rs has fetch_*_state_events, orchestrator.rs has drain_resource_events. Full Gate 1 functionality is live.","compaction_level":0,"original_size":0,"labels":["epic","gate-1","phase-b"]} {"id":"bd-2zr","title":"[CP1] GitLab client pagination methods","description":"Add async stream methods for paginated GitLab API calls.\n\n## Methods to Add to GitLabClient\n\n### paginate_issues(gitlab_project_id, updated_after, cursor_rewind_seconds) -> Stream\n- Use async_stream::try_stream! macro\n- Query params: scope=all, state=all, order_by=updated_at, sort=asc, per_page=100\n- If updated_after provided, apply cursor_rewind_seconds (subtract from timestamp)\n- Clamp to 0 to avoid underflow: (ts - rewind_ms).max(0)\n- Follow x-next-page header until empty/absent\n- Fall back to empty-page detection if headers missing\n\n### paginate_issue_discussions(gitlab_project_id, issue_iid) -> Stream\n- Paginate through discussions for single issue\n- per_page=100\n- Follow x-next-page header\n\n### request_with_headers(path, params) -> Result<(T, HeaderMap)>\n- Acquire rate limiter\n- Make request with PRIVATE-TOKEN header\n- Return both deserialized data and response headers\n\n## Dependencies\n- async-stream = \"0.3\" (for try_stream! 
macro)\n- futures = \"0.3\" (for Stream trait and StreamExt)\n\nFiles: src/gitlab/client.rs\nTests: tests/pagination_tests.rs\nDone when: Pagination handles multiple pages and respects cursors, tests pass","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T16:57:13.045971Z","created_by":"tayloreernisse","updated_at":"2026-01-25T17:02:01.784887Z","deleted_at":"2026-01-25T17:02:01.784883Z","deleted_by":"tayloreernisse","delete_reason":"recreating with correct deps","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-31b","title":"[CP1] Discussion ingestion module","description":"Fetch and store discussions/notes for each issue.\n\nImplement ingestIssueDiscussions(options) → { discussionsFetched, discussionsUpserted, notesUpserted, systemNotesCount }\n\nLogic:\n1. Paginate through all discussions for given issue\n2. For each discussion:\n - Store raw payload (compressed)\n - Upsert discussion record with correct issue FK\n - Transform and upsert all notes\n - Store raw payload per note\n - Track system notes count\n\nFiles: src/ingestion/discussions.ts\nTests: tests/integration/issue-discussion-ingestion.test.ts\nDone when: Discussions and notes populated with correct FKs and is_system flags","status":"tombstone","priority":2,"issue_type":"task","created_at":"2026-01-25T15:19:57.131442Z","created_by":"tayloreernisse","updated_at":"2026-01-25T15:21:35.156574Z","deleted_at":"2026-01-25T15:21:35.156571Z","deleted_by":"tayloreernisse","delete_reason":"delete","original_type":"task","compaction_level":0,"original_size":0} {"id":"bd-31i","title":"Epic: CP2 Gate B - Labels + Assignees + Reviewers","description":"## Background\nGate B validates junction tables for labels, assignees, and reviewers. Ensures relationships are tracked correctly and stale links are removed on resync. 
This is critical for filtering (`--reviewer=alice`) and display.\n\n## Acceptance Criteria (Pass/Fail)\n- [ ] `mr_labels` table has rows for MRs with labels\n- [ ] Label count per MR matches GitLab UI (spot check 3 MRs)\n- [ ] `mr_assignees` table has rows for MRs with assignees\n- [ ] Assignee usernames match GitLab UI (spot check 3 MRs)\n- [ ] `mr_reviewers` table has rows for MRs with reviewers\n- [ ] Reviewer usernames match GitLab UI (spot check 3 MRs)\n- [ ] Remove label in GitLab -> resync -> link removed from mr_labels\n- [ ] Add reviewer in GitLab -> resync -> link added to mr_reviewers\n- [ ] `gi list mrs --label=bugfix` filters correctly\n- [ ] `gi list mrs --reviewer=alice` filters correctly\n\n## Validation Script\n```bash\n#!/bin/bash\nset -e\n\nDB_PATH=\"${XDG_DATA_HOME:-$HOME/.local/share}/gitlab-inbox/db.sqlite3\"\n\necho \"=== Gate B: Labels + Assignees + Reviewers ===\"\n\n# 1. Check label linkage exists\necho \"Step 1: Check label linkage...\"\nLABEL_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_labels;\")\necho \" Total label links: $LABEL_LINKS\"\n\n# 2. Show sample label linkage\necho \"Step 2: Sample label linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(l.name, ', ') as labels\n FROM merge_requests m\n JOIN mr_labels ml ON ml.merge_request_id = m.id\n JOIN labels l ON l.id = ml.label_id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 3. Check assignee linkage\necho \"Step 3: Check assignee linkage...\"\nASSIGNEE_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_assignees;\")\necho \" Total assignee links: $ASSIGNEE_LINKS\"\n\n# 4. Show sample assignee linkage\necho \"Step 4: Sample assignee linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(a.username, ', ') as assignees\n FROM merge_requests m\n JOIN mr_assignees a ON a.merge_request_id = m.id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 5. 
Check reviewer linkage\necho \"Step 5: Check reviewer linkage...\"\nREVIEWER_LINKS=$(sqlite3 \"$DB_PATH\" \"SELECT COUNT(*) FROM mr_reviewers;\")\necho \" Total reviewer links: $REVIEWER_LINKS\"\n\n# 6. Show sample reviewer linkage\necho \"Step 6: Sample reviewer linkage...\"\nsqlite3 \"$DB_PATH\" \"\n SELECT m.iid, GROUP_CONCAT(r.username, ', ') as reviewers\n FROM merge_requests m\n JOIN mr_reviewers r ON r.merge_request_id = m.id\n GROUP BY m.id\n LIMIT 5;\n\"\n\n# 7. Test filter commands\necho \"Step 7: Test filter commands...\"\n# Get a label that exists\nLABEL=$(sqlite3 \"$DB_PATH\" \"SELECT name FROM labels LIMIT 1;\")\nif [ -n \"$LABEL\" ]; then\n echo \" Testing --label=$LABEL\"\n gi list mrs --label=\"$LABEL\" --limit=3\nfi\n\n# Get a reviewer that exists\nREVIEWER=$(sqlite3 \"$DB_PATH\" \"SELECT username FROM mr_reviewers LIMIT 1;\")\nif [ -n \"$REVIEWER\" ]; then\n echo \" Testing --reviewer=$REVIEWER\"\n gi list mrs --reviewer=\"$REVIEWER\" --limit=3\nfi\n\necho \"\"\necho \"=== Gate B: PASSED ===\"\n```\n\n## Stale Link Removal Test (Manual)\n```bash\n# 1. Pick an MR with labels in GitLab UI\nMR_IID=123\n\n# 2. Note current label count\nsqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM mr_labels ml\n JOIN merge_requests m ON m.id = ml.merge_request_id\n WHERE m.iid = $MR_IID;\n\"\n# Example: 3 labels\n\n# 3. Remove a label in GitLab UI (manually)\n\n# 4. Resync\ngi ingest --type=merge_requests\n\n# 5. 
Verify label removed\nsqlite3 \"$DB_PATH\" \"\n SELECT COUNT(*) FROM mr_labels ml\n JOIN merge_requests m ON m.id = ml.merge_request_id\n WHERE m.iid = $MR_IID;\n\"\n# Should be: 2 labels (one less)\n```\n\n## Test Commands (Quick Verification)\n```bash\n# Check counts:\nsqlite3 ~/.local/share/gitlab-inbox/db.sqlite3 \"\n SELECT \n (SELECT COUNT(*) FROM mr_labels) as label_links,\n (SELECT COUNT(*) FROM mr_assignees) as assignee_links,\n (SELECT COUNT(*) FROM mr_reviewers) as reviewer_links;\n\"\n\n# Test filtering:\ngi list mrs --label=enhancement --limit=5\ngi list mrs --reviewer=alice --limit=5\ngi list mrs --assignee=bob --limit=5\n```\n\n## Dependencies\nThis gate requires:\n- bd-ser (MR ingestion with label/assignee/reviewer linking via clear-and-relink pattern)\n- Gate A must pass first\n\n## Edge Cases\n- MRs with no labels/assignees/reviewers: junction tables should have no rows for that MR\n- Labels shared across issues and MRs: labels table is shared, only junction differs\n- Usernames are case-sensitive: `Alice` != `alice`","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-26T22:06:01.292318Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:48:21.059422Z","closed_at":"2026-01-27T00:48:21.059378Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-31i","depends_on_id":"bd-ser","type":"blocks","created_at":"2026-01-26T22:08:55.684769Z","created_by":"tayloreernisse"}]} @@ -100,6 +102,7 @@ {"id":"bd-32mc","title":"OBSERV: Implement log retention cleanup at startup","description":"## Background\nLog files accumulate at ~1-10 MB/day. Without cleanup, they grow unbounded. 
Retention runs BEFORE subscriber init so deleted file handles aren't held open by the appender.\n\n## Approach\nAdd a cleanup function, called from main.rs before the subscriber is initialized (before current line 44):\n\n```rust\n/// Delete log files older than retention_days.\n/// Matches files named lore.YYYY-MM-DD.log in the log directory.\npub fn cleanup_old_logs(log_dir: &Path, retention_days: u32) -> std::io::Result {\n if retention_days == 0 {\n return Ok(0); // 0 means file logging disabled, don't delete\n }\n let cutoff = SystemTime::now() - Duration::from_secs(u64::from(retention_days) * 86400);\n let mut deleted = 0;\n\n for entry in std::fs::read_dir(log_dir)? {\n let entry = entry?;\n let name = entry.file_name();\n let name_str = name.to_string_lossy();\n\n // Only match lore.YYYY-MM-DD.log pattern\n if !name_str.starts_with(\"lore.\") || !name_str.ends_with(\".log\") {\n continue;\n }\n\n if let Ok(metadata) = entry.metadata() {\n if let Ok(modified) = metadata.modified() {\n if modified < cutoff {\n std::fs::remove_file(entry.path())?;\n deleted += 1;\n }\n }\n }\n }\n Ok(deleted)\n}\n```\n\nPlace this function in src/core/paths.rs (next to get_log_dir) or a new src/core/log_retention.rs. Prefer paths.rs since it's small and related.\n\nCall from main.rs:\n```rust\nlet log_dir = get_log_dir(config.logging.log_dir.as_deref());\nlet _ = cleanup_old_logs(&log_dir, config.logging.retention_days);\n// THEN init subscriber\n```\n\nNote: Config must be loaded before cleanup runs. Current main.rs parses Cli at line 60, but config loading happens inside command handlers. 
This means we need to either:\n A) Load config early in main() before subscriber init (preferred)\n B) Defer cleanup to after config load\n\nSince the subscriber must also know log_dir, approach A is natural: load config -> cleanup -> init subscriber -> dispatch command.\n\n## Acceptance Criteria\n- [ ] Files matching lore.*.log older than retention_days are deleted\n- [ ] Files matching lore.*.log within retention_days are preserved\n- [ ] Non-matching files (e.g., other.txt) are never deleted\n- [ ] retention_days=0 skips cleanup entirely (no files deleted)\n- [ ] Errors on individual files don't prevent cleanup of remaining files\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/paths.rs (add cleanup_old_logs function)\n- src/main.rs (call cleanup before subscriber init)\n\n## TDD Loop\nRED:\n - test_log_retention_cleanup: create tempdir with lore.2026-01-01.log through lore.2026-02-04.log, run with retention_days=7, assert old deleted, recent preserved\n - test_log_retention_ignores_non_log_files: create other.txt alongside old log files, assert other.txt untouched\n - test_log_retention_zero_days: retention_days=0, assert nothing deleted\nGREEN: Implement cleanup_old_logs\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- SystemTime::now() precision varies by OS; use file modified time, not name parsing (simpler and more reliable)\n- read_dir on non-existent directory: get_log_dir creates it first, so this shouldn't happen. But handle gracefully.\n- Permissions error on individual file: log a warning, continue with remaining files (don't propagate)\n- Race condition: another process creates a file during cleanup. 
Not a concern -- we only delete old files.","status":"closed","priority":1,"issue_type":"task","created_at":"2026-02-04T15:53:55.627901Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:15:04.452086Z","closed_at":"2026-02-04T17:15:04.452039Z","close_reason":"Implemented cleanup_old_logs() with date-pattern matching and retention_days config, runs at startup before subscriber init","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-32mc","depends_on_id":"bd-17n","type":"blocks","created_at":"2026-02-04T15:55:19.523048Z","created_by":"tayloreernisse"},{"issue_id":"bd-32mc","depends_on_id":"bd-1k4","type":"blocks","created_at":"2026-02-04T15:55:19.583155Z","created_by":"tayloreernisse"},{"issue_id":"bd-32mc","depends_on_id":"bd-2nx","type":"parent-child","created_at":"2026-02-04T15:53:55.628795Z","created_by":"tayloreernisse"}]} {"id":"bd-32q","title":"Implement timeline seed phase: FTS5 keyword search to entity IDs","description":"## Background\nStep 1 of timeline query: find seed entities matching the keyword. Uses the existing FTS5 infrastructure from CP3 (documents_fts table). 
Documents map back to source entities (issues, MRs) via documents.source_type and source_id.\n\n## Approach\nAdd to src/core/timeline.rs:\n\n```rust\n/// Seed phase: FTS5 keyword search to find matching entities.\n/// Returns seed entities (issues, MRs) and top evidence notes.\npub fn seed_from_fts(\n conn: &Connection,\n query: &str,\n project_id: Option, // from -p flag\n since: Option, // ms epoch timestamp filter\n limit: usize, // -n flag\n) -> Result\n\npub struct SeedResult {\n pub entities: Vec,\n pub evidence_notes: Vec,\n}\n\npub struct SeedEntity {\n pub entity_type: String, // \"issue\" | \"merge_request\"\n pub local_id: i64, // DB id\n pub iid: i64,\n pub project_id: i64,\n pub project_path: String,\n}\n\npub struct EvidenceNote {\n pub note_id: i64,\n pub discussion_id: Option,\n pub entity_type: String, // parent entity type\n pub entity_iid: i64,\n pub snippet: String, // first ~200 chars\n pub author: String,\n pub created_at: i64,\n pub url: Option,\n pub fts_rank: f64,\n}\n```\n\nSQL approach:\n1. FTS5 query on documents_fts:\n```sql\nSELECT d.id, d.source_type, d.source_id, d.project_id, d.title,\n rank AS fts_rank\nFROM documents_fts\nJOIN documents d ON documents_fts.rowid = d.id\nWHERE documents_fts MATCH ?1\nORDER BY rank\nLIMIT ?2;\n```\n\n2. Map document results to source entities:\n - source_type 'issue' → get issue by id\n - source_type 'merge_request' → get MR by id\n - source_type 'note' / 'discussion' → get parent entity\n\n3. Collect top 10 note matches as evidence candidates:\n - Filter documents where source relates to a note/discussion\n - Get note body, truncate to ~200 chars\n - Record note_id, discussion_id, parent entity info\n\n4. Deduplicate entities (if a note match and its parent entity both appear)\n\n5. 
Apply filters: --since (filter by entity created_at), -p (project scope)\n\n## Acceptance Criteria\n- [ ] FTS5 search returns matching documents\n- [ ] Documents correctly mapped to source entities (issues, MRs)\n- [ ] Note matches produce evidence note entries (top 10 by FTS rank)\n- [ ] Deduplication: note match + parent entity don't double-count\n- [ ] --since filter applied to entity timestamps\n- [ ] -p filter scopes to project\n- [ ] Empty query returns error (not all entities)\n\n## Files\n- src/core/timeline.rs (add seed_from_fts function)\n\n## TDD Loop\nRED: tests/timeline_seed_tests.rs:\n- `test_seed_finds_matching_issue` - FTS match on issue title\n- `test_seed_finds_matching_note` - FTS match on note body, maps to parent entity\n- `test_seed_deduplicates_entities` - note match and issue match for same entity = 1 seed\n- `test_seed_collects_evidence_notes` - top notes returned with snippets\n- `test_seed_applies_since_filter` - old entities excluded\n- `test_seed_applies_project_filter` - wrong project excluded\n\nSetup: create_test_db with full migrations, seed documents + FTS index.\n\nGREEN: Implement seed_from_fts\n\nVERIFY: `cargo test timeline_seed -- --nocapture`\n\n## Edge Cases\n- FTS5 MATCH syntax: user may pass natural language — FTS5 handles this but special chars may need escaping\n- Documents table may have stale data (entity deleted but document remains) — skip missing entities\n- Note-type documents: need to map from note → discussion → parent entity. 
The documents table has source_type and source_id but the chain may be complex\n- Empty result set is valid (no matches) — return empty SeedResult, not error","status":"open","priority":2,"issue_type":"task","created_at":"2026-02-02T21:33:08.615908Z","created_by":"tayloreernisse","updated_at":"2026-02-02T21:45:48.702921Z","compaction_level":0,"original_size":0,"labels":["gate-3","phase-b","query"],"dependencies":[{"issue_id":"bd-32q","depends_on_id":"bd-20e","type":"blocks","created_at":"2026-02-02T21:33:37.368005Z","created_by":"tayloreernisse"},{"issue_id":"bd-32q","depends_on_id":"bd-ike","type":"parent-child","created_at":"2026-02-02T21:33:08.617483Z","created_by":"tayloreernisse"}]} {"id":"bd-335","title":"Implement Ollama API client","description":"## Background\nThe Ollama API client provides the HTTP interface to the local Ollama embedding server. It handles health checks (is Ollama running? does the model exist?), batch embedding requests (up to 32 texts per call), and error translation to LoreError variants. This is the lowest-level embedding component — the pipeline (bd-am7) builds on top of it.\n\n## Approach\nCreate \\`src/embedding/ollama.rs\\` per PRD Section 4.2. 
**Uses async reqwest (not blocking).**\n\n```rust\nuse reqwest::Client; // NOTE: async Client, not reqwest::blocking\nuse serde::{Deserialize, Serialize};\nuse crate::core::error::{LoreError, Result};\n\npub struct OllamaConfig {\n pub base_url: String, // default \\\"http://localhost:11434\\\"\n pub model: String, // default \\\"nomic-embed-text\\\"\n pub timeout_secs: u64, // default 60\n}\n\nimpl Default for OllamaConfig { /* PRD defaults */ }\n\npub struct OllamaClient {\n client: Client, // async reqwest::Client\n config: OllamaConfig,\n}\n\n#[derive(Serialize)]\nstruct EmbedRequest { model: String, input: Vec }\n\n#[derive(Deserialize)]\nstruct EmbedResponse { model: String, embeddings: Vec> }\n\n#[derive(Deserialize)]\nstruct TagsResponse { models: Vec }\n\n#[derive(Deserialize)]\nstruct ModelInfo { name: String }\n\nimpl OllamaClient {\n pub fn new(config: OllamaConfig) -> Self;\n\n /// Async health check: GET /api/tags\n /// Model matched via starts_with (\\\"nomic-embed-text\\\" matches \\\"nomic-embed-text:latest\\\")\n pub async fn health_check(&self) -> Result<()>;\n\n /// Async batch embedding: POST /api/embed\n /// Input: Vec of texts, Response: Vec> of 768-dim embeddings\n pub async fn embed_batch(&self, texts: Vec) -> Result>>;\n}\n\n/// Quick health check without full client (async).\npub async fn check_ollama_health(base_url: &str) -> bool;\n```\n\n**Error mapping (per PRD):**\n- Connection refused/timeout -> LoreError::OllamaUnavailable { base_url, source: Some(e) }\n- Model not in /api/tags -> LoreError::OllamaModelNotFound { model }\n- Non-200 from /api/embed -> LoreError::EmbeddingFailed { document_id: 0, reason: format!(\\\"HTTP {}: {}\\\", status, body) }\n\n**Key PRD detail:** Model matching uses \\`starts_with\\` (not exact match) so \\\"nomic-embed-text\\\" matches \\\"nomic-embed-text:latest\\\".\n\n## Acceptance Criteria\n- [ ] Uses async reqwest::Client (not blocking)\n- [ ] health_check() is async, detects server availability and 
model presence\n- [ ] Model matched via starts_with (handles \\\":latest\\\" suffix)\n- [ ] embed_batch() is async, sends POST /api/embed\n- [ ] Batch size up to 32 texts\n- [ ] Returns Vec> with 768 dimensions each\n- [ ] OllamaUnavailable error includes base_url and source error\n- [ ] OllamaModelNotFound error includes model name\n- [ ] Non-200 response mapped to EmbeddingFailed with status + body\n- [ ] Timeout: 60 seconds default (configurable via OllamaConfig)\n- [ ] \\`cargo build\\` succeeds\n\n## Files\n- \\`src/embedding/ollama.rs\\` — new file\n- \\`src/embedding/mod.rs\\` — add \\`pub mod ollama;\\` and re-exports\n\n## TDD Loop\nRED: Tests (unit tests with mock, integration needs Ollama):\n- \\`test_config_defaults\\` — verify default base_url, model, timeout\n- \\`test_health_check_model_starts_with\\` — \\\"nomic-embed-text\\\" matches \\\"nomic-embed-text:latest\\\"\n- \\`test_embed_batch_parse\\` — mock response parsed correctly\n- \\`test_connection_error_maps_to_ollama_unavailable\\`\nGREEN: Implement OllamaClient\nVERIFY: \\`cargo test ollama\\`\n\n## Edge Cases\n- Ollama returns model name with version tag (\\\"nomic-embed-text:latest\\\"): starts_with handles this\n- Empty texts array: send empty batch, Ollama returns empty embeddings\n- Ollama returns wrong number of embeddings (2 texts, 1 embedding): caller (pipeline) validates\n- Non-JSON response: reqwest deserialization error -> wrap appropriately","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-30T15:26:34.025099Z","created_by":"tayloreernisse","updated_at":"2026-01-30T16:58:17.546852Z","closed_at":"2026-01-30T16:58:17.546794Z","close_reason":"Completed: OllamaClient with async health_check (starts_with model matching), embed_batch, error mapping to LoreError variants, check_ollama_health helper, 4 tests 
pass","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-335","depends_on_id":"bd-ljf","type":"blocks","created_at":"2026-01-30T15:29:24.627951Z","created_by":"tayloreernisse"}]} +{"id":"bd-343o","title":"Fetch and store GitLab linked issues (Related to)","description":"## Background\nGitLab's 'Linked items' section shows related issues connected via the issue links API. This is distinct from:\n- **closes** references (MR->Issue via closes_issues API) - already implemented\n- **mentioned** references (parsed from notes/descriptions) - partially implemented\n\nThe 'Related to' relationship requires a separate API call and provides bidirectional issue linking.\n\n**System fit**: Extends the existing `entity_references` table with `reference_type='related'` entries.\n\n## Approach\n\n### Phase 1: API Client (src/gitlab/client.rs)\n\nAdd method to fetch issue links:\n```rust\npub async fn fetch_issue_links(\n &self,\n gitlab_project_id: i64,\n issue_iid: i64,\n) -> Result> {\n let path = format!(\n \"/api/v4/projects/{}/issues/{}/links\",\n gitlab_project_id, issue_iid\n );\n self.request(&path).await\n}\n```\n\n### Phase 2: Types (src/gitlab/types.rs)\n\nAdd GitLab response struct:\n```rust\n#[derive(Debug, Clone, Deserialize, Serialize)]\npub struct GitLabIssueLink {\n pub id: i64,\n pub iid: i64,\n pub project_id: i64,\n pub title: String,\n pub state: String,\n pub web_url: String,\n pub link_type: String, // 'relates_to', 'blocks', 'is_blocked_by'\n pub link_created_at: String,\n}\n```\n\n### Phase 3: Ingestion (src/ingestion/issue_links.rs - NEW FILE)\n\nCreate new ingestion module:\n```rust\npub async fn ingest_issue_links(\n conn: &Connection,\n client: &GitLabClient,\n project_id: i64,\n gitlab_project_id: i64,\n issue_local_id: i64,\n issue_iid: i64,\n) -> Result {\n let links = client.fetch_issue_links(gitlab_project_id, issue_iid).await?;\n \n for link in links {\n let target_local_id = resolve_issue_local_id(conn, project_id, 
link.iid)?;\n \n let ref_ = EntityReference {\n project_id,\n source_entity_type: \"issue\",\n source_entity_id: issue_local_id,\n target_entity_type: \"issue\",\n target_entity_id: target_local_id,\n target_project_path: if target_local_id.is_none() {\n resolve_project_path(conn, link.project_id).ok().flatten()\n } else { None },\n target_entity_iid: if target_local_id.is_none() { Some(link.iid) } else { None },\n reference_type: \"related\", // Could also map link_type to blocks/is_blocked_by\n source_method: \"api\",\n };\n \n insert_entity_reference(conn, &ref_)?;\n }\n \n Ok(links.len())\n}\n```\n\n### Phase 4: Orchestrator Integration (src/ingestion/orchestrator.rs)\n\nAdd to dependent fetch queue (similar to mr_closes_issues):\n1. Enqueue `issue_links` jobs for issues\n2. Drain queue in MR sync phase\n3. Track `issue_links_fetched` / `issue_links_failed` counters\n\n### Phase 5: Display (src/cli/commands/show.rs)\n\nAdd `get_related_issues()` function:\n```rust\nfn get_related_issues(conn: &Connection, issue_id: i64) -> Result> {\n let mut stmt = conn.prepare(\n \"SELECT i.iid, i.title, i.state, i.web_url, p.path_with_namespace\n FROM entity_references er\n JOIN issues i ON i.id = er.target_entity_id\n JOIN projects p ON i.project_id = p.id\n WHERE er.source_entity_type = 'issue'\n AND er.source_entity_id = ?\n AND er.target_entity_type = 'issue'\n AND er.reference_type = 'related'\n ORDER BY i.iid\"\n )?;\n // ... 
map to RelatedIssueRef ...\n}\n```\n\nUpdate `IssueDetail` with `related_issues: Vec`.\n\nUpdate `print_show_issue()`:\n```rust\nif !issue.related_issues.is_empty() {\n println!(\"Linked Issues:\");\n for ri in &issue.related_issues {\n println!(\" #{} {} ({})\", ri.iid, ri.title, ri.state);\n }\n}\n```\n\n## Acceptance Criteria\n- [ ] `cargo test test_fetch_issue_links` passes (mock API test)\n- [ ] `cargo test test_ingest_issue_links` passes (DB integration)\n- [ ] `cargo test test_get_related_issues` passes (query test)\n- [ ] `lore sync` fetches and stores issue links\n- [ ] `lore issues ` shows Linked Issues section\n- [ ] `lore -J issues ` includes related_issues array\n- [ ] `cargo clippy --all-targets -- -D warnings` passes\n\n## Files\n- `src/gitlab/client.rs` - Add fetch_issue_links()\n- `src/gitlab/types.rs` - Add GitLabIssueLink struct\n- `src/ingestion/issue_links.rs` - NEW FILE\n- `src/ingestion/mod.rs` - Export issue_links\n- `src/ingestion/orchestrator.rs` - Integrate into sync flow\n- `src/cli/commands/show.rs` - Display related issues\n\n## TDD Loop\n\n**RED** - Start with API type test:\n```rust\n// src/gitlab/types.rs tests\n#[test]\nfn test_issue_link_deserialize() {\n let json = r#\"{\n \"id\": 123,\n \"iid\": 45,\n \"project_id\": 100,\n \"title\": \"Related bug\",\n \"state\": \"opened\",\n \"web_url\": \"https://gitlab.com/...\",\n \"link_type\": \"relates_to\",\n \"link_created_at\": \"2024-01-15T10:00:00Z\"\n }\"#;\n let link: GitLabIssueLink = serde_json::from_str(json).unwrap();\n assert_eq!(link.iid, 45);\n assert_eq!(link.link_type, \"relates_to\");\n}\n```\n\n**GREEN** - Implement incrementally: types -> client -> ingestion -> display\n\n**VERIFY**: `cargo test issue_link && cargo clippy --all-targets -- -D warnings`\n\n## Edge Cases\n- Cross-project links -> store with target_project_path, display with project prefix\n- Bidirectional links -> GitLab returns both directions, dedup via UNIQUE constraint\n- Deleted linked issues -> 
target_entity_id NULL, show as 'unresolved'\n- link_type variations -> 'blocks'/'is_blocked_by' could be stored as separate reference_types\n- Rate limiting -> batch requests, respect retry-after headers","status":"open","priority":2,"issue_type":"feature","created_at":"2026-02-05T15:14:25.202900Z","created_by":"tayloreernisse","updated_at":"2026-02-05T15:18:54.693515Z","compaction_level":0,"original_size":0,"labels":["ISSUE"]} {"id":"bd-34ek","title":"OBSERV: Implement MetricsLayer custom tracing subscriber layer","description":"## Background\nMetricsLayer is a custom tracing subscriber layer that records span timing and structured fields, then materializes them into Vec. This avoids threading a mutable collector through every function signature -- spans are the single source of truth.\n\n## Approach\nAdd to src/core/metrics.rs (same file as StageTiming):\n\n```rust\nuse std::collections::HashMap;\nuse std::sync::{Arc, Mutex};\nuse std::time::Instant;\nuse tracing::span::{Attributes, Id, Record};\nuse tracing::Subscriber;\nuse tracing_subscriber::layer::{Context, Layer};\nuse tracing_subscriber::registry::LookupSpan;\n\n#[derive(Debug)]\nstruct SpanData {\n name: String,\n parent_id: Option,\n start: Instant,\n fields: HashMap,\n}\n\n#[derive(Debug, Clone)]\npub struct MetricsLayer {\n spans: Arc>>,\n completed: Arc>>,\n}\n\nimpl MetricsLayer {\n pub fn new() -> Self {\n Self {\n spans: Arc::new(Mutex::new(HashMap::new())),\n completed: Arc::new(Mutex::new(Vec::new())),\n }\n }\n\n /// Extract timing tree for a completed run.\n /// Call this after the root span closes.\n pub fn extract_timings(&self) -> Vec {\n let completed = self.completed.lock().unwrap();\n // Build tree: find root entries (no parent), attach children\n // ... 
tree construction logic\n }\n}\n\nimpl Layer for MetricsLayer\nwhere\n S: Subscriber + for<'a> LookupSpan<'a>,\n{\n fn on_new_span(&self, attrs: &Attributes<'_>, id: &Id, ctx: Context<'_, S>) {\n let parent_id = ctx.span(id).and_then(|s| s.parent().map(|p| p.id()));\n let mut fields = HashMap::new();\n // Visit attrs to capture initial field values\n let mut visitor = FieldVisitor(&mut fields);\n attrs.record(&mut visitor);\n\n self.spans.lock().unwrap().insert(id.into_u64(), SpanData {\n name: attrs.metadata().name().to_string(),\n parent_id,\n start: Instant::now(),\n fields,\n });\n }\n\n fn on_record(&self, id: &Id, values: &Record<'_>, _ctx: Context<'_, S>) {\n // Capture recorded fields (items_processed, items_skipped, errors)\n if let Some(data) = self.spans.lock().unwrap().get_mut(&id.into_u64()) {\n let mut visitor = FieldVisitor(&mut data.fields);\n values.record(&mut visitor);\n }\n }\n\n fn on_close(&self, id: Id, _ctx: Context<'_, S>) {\n if let Some(data) = self.spans.lock().unwrap().remove(&id.into_u64()) {\n let elapsed = data.start.elapsed();\n let timing = StageTiming {\n name: data.name,\n project: data.fields.get(\"project\").and_then(|v| v.as_str()).map(String::from),\n elapsed_ms: elapsed.as_millis() as u64,\n items_processed: data.fields.get(\"items_processed\").and_then(|v| v.as_u64()).unwrap_or(0) as usize,\n items_skipped: data.fields.get(\"items_skipped\").and_then(|v| v.as_u64()).unwrap_or(0) as usize,\n errors: data.fields.get(\"errors\").and_then(|v| v.as_u64()).unwrap_or(0) as usize,\n sub_stages: vec![], // Will be populated during extract_timings tree construction\n };\n self.completed.lock().unwrap().push((id.into_u64(), timing));\n }\n }\n}\n```\n\nNeed a FieldVisitor struct implementing tracing::field::Visit to capture field values.\n\nRegister in subscriber stack (src/main.rs), alongside stderr and file layers:\n```rust\nlet metrics_layer = MetricsLayer::new();\nlet metrics_handle = metrics_layer.clone(); // Clone Arc for later 
extraction\n\nregistry()\n .with(stderr_layer.with_filter(stderr_filter))\n .with(file_layer.with_filter(file_filter))\n .with(metrics_layer) // No filter -- captures all spans\n .init();\n```\n\nPass metrics_handle to command handlers so they can call extract_timings() after the pipeline completes.\n\n## Acceptance Criteria\n- [ ] MetricsLayer captures span enter/close timing\n- [ ] on_record captures items_processed, items_skipped, errors fields\n- [ ] extract_timings() returns correctly nested Vec tree\n- [ ] Parallel spans (multiple projects) both appear as sub_stages of parent\n- [ ] Thread-safe: Arc> allows concurrent span operations\n- [ ] cargo clippy --all-targets -- -D warnings passes\n\n## Files\n- src/core/metrics.rs (add MetricsLayer, FieldVisitor, tree construction)\n- src/main.rs (register MetricsLayer in subscriber stack)\n\n## TDD Loop\nRED:\n - test_metrics_layer_single_span: enter/exit one span, extract, assert one StageTiming\n - test_metrics_layer_nested_spans: parent + child, assert child in parent.sub_stages\n - test_metrics_layer_parallel_spans: two sibling spans, assert both in parent.sub_stages\n - test_metrics_layer_field_recording: record items_processed=42, assert captured\nGREEN: Implement MetricsLayer with on_new_span, on_record, on_close, extract_timings\nVERIFY: cargo test && cargo clippy --all-targets -- -D warnings\n\n## Edge Cases\n- Span ID reuse: tracing may reuse span IDs after close. Using remove on close prevents stale data.\n- Lock contention: Mutex per operation. For high-span-count scenarios, consider parking_lot::Mutex. But lore's span count is low (<100 per run), so std::sync::Mutex is fine.\n- extract_timings tree construction: iterate completed Vec, build parent->children map, then recursively construct StageTiming tree. Root entries have parent_id matching the root span or None.\n- MetricsLayer has no filter: it sees ALL spans. 
To avoid noise from dependency spans, check if span name starts with known stage names, or rely on the \"stage\" field being present.","status":"closed","priority":2,"issue_type":"task","created_at":"2026-02-04T15:54:31.960669Z","created_by":"tayloreernisse","updated_at":"2026-02-04T17:25:25.523811Z","closed_at":"2026-02-04T17:25:25.523730Z","close_reason":"Implemented MetricsLayer custom tracing subscriber layer with span timing capture, rate-limit/retry event detection, tree extraction, and 12 unit tests","compaction_level":0,"original_size":0,"labels":["observability"],"dependencies":[{"issue_id":"bd-34ek","depends_on_id":"bd-1o4h","type":"blocks","created_at":"2026-02-04T15:55:19.851554Z","created_by":"tayloreernisse"},{"issue_id":"bd-34ek","depends_on_id":"bd-24j1","type":"blocks","created_at":"2026-02-04T15:55:19.905554Z","created_by":"tayloreernisse"},{"issue_id":"bd-34ek","depends_on_id":"bd-3er","type":"parent-child","created_at":"2026-02-04T15:54:31.961646Z","created_by":"tayloreernisse"}]} {"id":"bd-34o","title":"Implement MR transformer","description":"## Background\nTransforms GitLab MR API responses into normalized schema for database storage. 
Handles deprecated field fallbacks and extracts metadata (labels, assignees, reviewers).\n\n## Approach\nCreate new transformer module following existing issue transformer pattern:\n- `NormalizedMergeRequest` - Database-ready struct\n- `MergeRequestWithMetadata` - MR + extracted labels/assignees/reviewers\n- `transform_merge_request()` - Main transformation function\n- `extract_labels()` - Label extraction helper\n\n## Files\n- `src/gitlab/transformers/merge_request.rs` - New transformer module\n- `src/gitlab/transformers/mod.rs` - Export new module\n- `tests/mr_transformer_tests.rs` - Unit tests\n\n## Acceptance Criteria\n- [ ] `NormalizedMergeRequest` struct exists with all DB columns\n- [ ] `MergeRequestWithMetadata` contains MR + label_names + assignee_usernames + reviewer_usernames\n- [ ] `transform_merge_request()` returns `Result`\n- [ ] `draft` computed as `gitlab_mr.draft || gitlab_mr.work_in_progress`\n- [ ] `detailed_merge_status` prefers `detailed_merge_status` over `merge_status_legacy`\n- [ ] `merge_user_username` prefers `merge_user` over `merged_by`\n- [ ] `head_sha` extracted from `sha` field\n- [ ] `references_short` and `references_full` extracted from `references` Option\n- [ ] Timestamps parsed with `iso_to_ms()`, errors returned (not zeroed)\n- [ ] `last_seen_at` set to `now_ms()`\n- [ ] `cargo test mr_transformer` passes\n\n## TDD Loop\nRED: `cargo test mr_transformer` -> module not found\nGREEN: Add transformer with all fields\nVERIFY: `cargo test mr_transformer`\n\n## Struct Definitions\n```rust\n#[derive(Debug, Clone)]\npub struct NormalizedMergeRequest {\n pub gitlab_id: i64,\n pub project_id: i64,\n pub iid: i64,\n pub title: String,\n pub description: Option,\n pub state: String,\n pub draft: bool,\n pub author_username: String,\n pub source_branch: String,\n pub target_branch: String,\n pub head_sha: Option,\n pub references_short: Option,\n pub references_full: Option,\n pub detailed_merge_status: Option,\n pub merge_user_username: 
Option,\n pub created_at: i64,\n pub updated_at: i64,\n pub merged_at: Option,\n pub closed_at: Option,\n pub last_seen_at: i64,\n pub web_url: String,\n}\n\n#[derive(Debug, Clone)]\npub struct MergeRequestWithMetadata {\n pub merge_request: NormalizedMergeRequest,\n pub label_names: Vec,\n pub assignee_usernames: Vec,\n pub reviewer_usernames: Vec,\n}\n```\n\n## Function Signature\n```rust\npub fn transform_merge_request(\n gitlab_mr: &GitLabMergeRequest,\n local_project_id: i64,\n) -> Result\n```\n\n## Key Logic\n```rust\n// Draft: prefer draft, fallback to work_in_progress\nlet is_draft = gitlab_mr.draft || gitlab_mr.work_in_progress;\n\n// Merge status: prefer detailed_merge_status\nlet detailed_merge_status = gitlab_mr.detailed_merge_status\n .clone()\n .or_else(|| gitlab_mr.merge_status_legacy.clone());\n\n// Merge user: prefer merge_user\nlet merge_user_username = gitlab_mr.merge_user\n .as_ref()\n .map(|u| u.username.clone())\n .or_else(|| gitlab_mr.merged_by.as_ref().map(|u| u.username.clone()));\n\n// References extraction\nlet (references_short, references_full) = gitlab_mr.references\n .as_ref()\n .map(|r| (Some(r.short.clone()), Some(r.full.clone())))\n .unwrap_or((None, None));\n\n// Head SHA\nlet head_sha = gitlab_mr.sha.clone();\n```\n\n## Edge Cases\n- Invalid timestamps should return `Err`, not zero values\n- Empty labels/assignees/reviewers should return empty Vecs, not None\n- `state` must pass through as-is (including 
\"locked\")","status":"closed","priority":2,"issue_type":"task","created_at":"2026-01-26T22:06:40.849049Z","created_by":"tayloreernisse","updated_at":"2026-01-27T00:11:48.501301Z","closed_at":"2026-01-27T00:11:48.501241Z","close_reason":"done","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-34o","depends_on_id":"bd-3ir","type":"blocks","created_at":"2026-01-26T22:08:54.023616Z","created_by":"tayloreernisse"},{"issue_id":"bd-34o","depends_on_id":"bd-5ta","type":"blocks","created_at":"2026-01-26T22:08:54.059646Z","created_by":"tayloreernisse"}]} {"id":"bd-35o","title":"Create golden query test suite","description":"## Background\nGolden query tests verify end-to-end search quality with known-good expected results. They use a seeded SQLite DB with deterministic fixture data and fixed embedding vectors (no Ollama dependency). Each test query must return at least one expected URL in the top 10 results. These tests catch search regressions (ranking changes, filter bugs, missing results).\n\n## Approach\nCreate test infrastructure:\n\n**1. tests/fixtures/golden_queries.json:**\n```json\n[\n {\n \"query\": \"authentication login\",\n \"mode\": \"lexical\",\n \"filters\": {},\n \"expected_urls\": [\"https://gitlab.example.com/group/project/-/issues/234\"],\n \"min_results\": 1,\n \"max_rank\": 10\n },\n {\n \"query\": \"jwt token refresh\",\n \"mode\": \"hybrid\",\n \"filters\": {\"type\": \"merge_request\"},\n \"expected_urls\": [\"https://gitlab.example.com/group/project/-/merge_requests/456\"],\n \"min_results\": 1,\n \"max_rank\": 10\n }\n]\n```\n\n**2. 
Test harness (tests/golden_query_tests.rs):**\n- Load golden_queries.json\n- Create in-memory DB, apply all migrations\n- Seed with deterministic fixture documents (issues, MRs, discussions)\n- For hybrid/semantic queries: seed with fixed embedding vectors (768-dim, manually constructed for known similarity)\n- For each query: run search, verify expected URL in top N results\n\n**Fixture data design:**\n- 10-20 documents covering different source types\n- Known content that matches expected queries\n- Fixed embeddings: construct vectors where similar documents have small cosine distance\n- No randomness — fully deterministic\n\n## Acceptance Criteria\n- [ ] Golden queries file exists with at least 5 test queries\n- [ ] Test harness loads queries and validates each\n- [ ] All golden queries pass: expected URL in top 10\n- [ ] No external dependencies (no Ollama, no GitLab)\n- [ ] Deterministic fixture data (fixed embeddings, fixed content)\n- [ ] `cargo test --test golden_query_tests` passes in CI\n\n## Files\n- `tests/fixtures/golden_queries.json` — new file\n- `tests/golden_query_tests.rs` — new file (or tests/golden_queries.rs)\n\n## TDD Loop\nRED: Create golden_queries.json with expected results, harness fails (no fixture data)\nGREEN: Seed fixture data that satisfies expected results\nVERIFY: `cargo test --test golden_query_tests`\n\n## Edge Cases\n- Query matches multiple expected URLs: all must be present\n- Lexical queries: FTS ranking determines position, not vector\n- Hybrid queries: RRF combines both signals — fixed vectors must be designed to produce expected ranking\n- Empty result for a golden query: test failure with clear message showing actual results","status":"closed","priority":3,"issue_type":"task","created_at":"2026-01-30T15:27:21.788493Z","created_by":"tayloreernisse","updated_at":"2026-01-30T18:12:47.085563Z","closed_at":"2026-01-30T18:12:47.085363Z","close_reason":"Golden query test suite: 7 golden queries in fixture, 8 seeded documents, 2 
test functions (all_pass + fixture_valid), deterministic in-memory DB, no external deps. 312 total tests pass.","compaction_level":0,"original_size":0,"dependencies":[{"issue_id":"bd-35o","depends_on_id":"bd-2no","type":"blocks","created_at":"2026-01-30T15:29:35.641568Z","created_by":"tayloreernisse"}]} diff --git a/.beads/last-touched b/.beads/last-touched index bb7812b..66ab0ec 100644 --- a/.beads/last-touched +++ b/.beads/last-touched @@ -1 +1 @@ -bd-3ia +bd-1oo