feat(cli): implement 'lore file-history' command (bd-z94)

Adds file-history command showing which MRs touched a file, with: - Rename chain resolution via BFS (resolve_rename_chain from bd-1yx) - DiffNote discussion snippets with --discussions flag - --merged filter, --no-follow-renames, -n limit - Human output with styled MR list and rename chain display - Robot JSON output with {ok, data, meta} envelope - Autocorrect registry and robot-docs manifest entry - Fixes pre-existing --no-status missing from sync autocorrect registry
2026-02-17 11:43:42 -05:00
parent 491dc52864
commit a1bca10408
30 changed files with 4758 additions and 3380 deletions
--- a/docs/plan-expose-discussion-ids.feedback-1.md
+++ b/docs/plan-expose-discussion-ids.feedback-1.md
@@ -0,0 +1,202 @@
+No `## Rejected Recommendations` section appears in the plan you pasted, so the revisions below are all net-new.
+
+1. **Add an explicit “Bridge Contract” and fix scope inconsistency**
+Analysis: The plan says “Three changes” but defines four. More importantly, identifier requirements are scattered. A single contract section prevents drift and makes every new read surface prove it can drive a write call.
+
+```diff
+@@
+-**Scope**: Three changes, delivered in order:
+**Scope**: Four workstreams, delivered in order:
+ 1. Add `gitlab_discussion_id` to notes output
+ 2. Add `gitlab_discussion_id` to show command discussion groups
+ 3. Add a standalone `discussions` list command
+ 4. Fix robot-docs to list actual field names instead of opaque type references
+
+## Bridge Contract (Cross-Cutting)
+Every read payload that surfaces notes/discussions MUST include:
+- `project_path`
+- `noteable_type`
+- `parent_iid`
+- `gitlab_discussion_id`
+- `gitlab_note_id` (when note-level data is returned)
+This contract is required so agents can deterministically construct `glab api` write calls.
+```
+
+2. **Normalize identifier naming now (break ambiguous names)**
+Analysis: Current `id`/`gitlab_id` naming is ambiguous in mixed payloads. Rename to explicit `note_id` and `gitlab_note_id` now (you explicitly don’t care about backward compatibility). This reduces automation mistakes.
+
+```diff
+@@ 1b. Add field to `NoteListRow`
+-pub struct NoteListRow {
+-    pub id: i64,
+-    pub gitlab_id: i64,
+pub struct NoteListRow {
+    pub note_id: i64,           // local DB id
+    pub gitlab_note_id: i64,    // GitLab note id
+@@
+@@ 1c. Add field to `NoteListRowJson`
+-pub struct NoteListRowJson {
+-    pub id: i64,
+-    pub gitlab_id: i64,
+pub struct NoteListRowJson {
+    pub note_id: i64,
+    pub gitlab_note_id: i64,
+@@
+-#### 2f. Add `gitlab_note_id` to note detail structs in show
+-While we're here, add `gitlab_id` to `NoteDetail`, `MrNoteDetail`, and their JSON
+#### 2f. Add `gitlab_note_id` to note detail structs in show
+While we're here, add `gitlab_note_id` to `NoteDetail`, `MrNoteDetail`, and their JSON
+ counterparts.
+```
+
+3. **Stop positional column indexing for these changes**
+Analysis: In `list.rs`, row extraction is positional (`row.get(18)`, etc.). Adding fields is fragile and easy to break silently. Use named aliases and named lookup for robustness.
+
+```diff
+@@ 1a/1b SQL + query_map
+-    p.path_with_namespace AS project_path
+    p.path_with_namespace AS project_path,
+    d.gitlab_discussion_id AS gitlab_discussion_id
+@@
+-    project_path: row.get(18)?,
+-    gitlab_discussion_id: row.get(19)?,
+    project_path: row.get("project_path")?,
+    gitlab_discussion_id: row.get("gitlab_discussion_id")?,
+```
+
+4. **Redesign `discussions` query to avoid correlated subquery fanout**
+Analysis: Proposed query uses many correlated subqueries per row. That’s acceptable for tiny MR-scoped sets, but degrades for project-wide scans. Use a base CTE + one rollup pass over notes.
+
+```diff
+@@ 3c. SQL Query
+-SELECT
+-    d.id,
+-    ...
+-    (SELECT COUNT(*) FROM notes n2 WHERE n2.discussion_id = d.id AND n2.is_system = 0) AS note_count,
+-    (SELECT n3.author_username FROM notes n3 WHERE n3.discussion_id = d.id ORDER BY n3.position LIMIT 1) AS first_author,
+-    ...
+-FROM discussions d
+WITH base AS (
+  SELECT d.id, d.gitlab_discussion_id, d.noteable_type, d.project_id, d.issue_id, d.merge_request_id,
+         d.individual_note, d.first_note_at, d.last_note_at, d.resolvable, d.resolved
+  FROM discussions d
+  {where_sql}
+),
+note_rollup AS (
+  SELECT n.discussion_id,
+         COUNT(*) FILTER (WHERE n.is_system = 0) AS user_note_count,
+         COUNT(*) AS total_note_count,
+         MIN(CASE WHEN n.is_system = 0 THEN n.position END) AS first_user_pos
+  FROM notes n
+  JOIN base b ON b.id = n.discussion_id
+  GROUP BY n.discussion_id
+)
+SELECT ...
+FROM base b
+LEFT JOIN note_rollup r ON r.discussion_id = b.id
+```
+
+5. **Add explicit index work for new access patterns**
+Analysis: Existing indexes are good but not ideal for new list patterns (`project + last_note`, note position ordering inside discussion). Add migration entries to keep latency stable.
+
+```diff
+@@ ## 3. Add Standalone `discussions` List Command
+#### 3h. Add migration for discussion-list performance
+**File**: `migrations/027_discussions_list_indexes.sql`
+```sql
+CREATE INDEX IF NOT EXISTS idx_discussions_project_last_note
+  ON discussions(project_id, last_note_at DESC, id DESC);
+CREATE INDEX IF NOT EXISTS idx_discussions_project_first_note
+  ON discussions(project_id, first_note_at DESC, id DESC);
+CREATE INDEX IF NOT EXISTS idx_notes_discussion_position
+  ON notes(discussion_id, position);
+```
+```
+
+6. **Add keyset pagination (critical for agent workflows)**
+Analysis: `--limit` alone is not enough for automation over large datasets. Add cursor-based pagination with deterministic sort keys and `next_cursor` in JSON.
+
+```diff
+@@ 3a. CLI Args
+    /// Keyset cursor from previous response
+    #[arg(long, help_heading = "Output")]
+    pub cursor: Option<String>,
+@@
+@@ Response Schema
+-    "total_count": 15,
+-    "showing": 15
+    "total_count": 15,
+    "showing": 15,
+    "next_cursor": "eyJsYXN0X25vdGVfYXQiOjE3MDAwMDAwMDAwMDAsImlkIjoxMjN9"
+@@
+@@ Validation Criteria
+7. `lore -J discussions ... --cursor <token>` returns the next stable page without duplicates/skips
+```
+
+7. **Fix semantic ambiguities in discussion summary fields**
+Analysis: `note_count` is ambiguous, and `first_author` can accidentally be a system note author. Make fields explicit and consistent with non-system default behavior.
+
+```diff
+@@ Response Schema
+-        "note_count": 3,
+-        "first_author": "elovegrove",
+        "user_note_count": 3,
+        "total_note_count": 4,
+        "first_user_author": "elovegrove",
+@@
+@@ 3d. Filters struct / path behavior
+-- `path` → `EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.position_new_path LIKE ?)`
+- `path` → match on BOTH `position_new_path` and `position_old_path` (exact/prefix)
+```
+
+8. **Enrich show outputs with actionable thread metadata**
+Analysis: Adding only discussion id helps, but agents still need thread state and note ids to pick targets correctly. Add `resolvable`, `resolved`, `last_note_at_iso`, and `gitlab_note_id` in show discussion payloads.
+
+```diff
+@@ 2a/2b show discussion structs
+ pub struct DiscussionDetailJson {
+     pub gitlab_discussion_id: String,
+    pub resolvable: bool,
+    pub resolved: bool,
+    pub last_note_at_iso: String,
+     pub notes: Vec<NoteDetailJson>,
+@@
+ pub struct NoteDetailJson {
+    pub gitlab_note_id: i64,
+     pub author_username: String,
+```
+
+9. **Harden robot-docs against schema drift with tests**
+Analysis: Static JSON in `main.rs` will drift again. Add a lightweight contract test that asserts docs include required fields for `notes`, `discussions`, and show payloads.
+
+```diff
+@@ 4. Fix Robot-Docs Response Schemas
+#### 4f. Add robot-docs contract tests
+**File**: `src/main.rs` (or dedicated test module)
+- Assert `robot-docs` contains `gitlab_discussion_id` and `gitlab_note_id` in:
+  - `notes.response_schema`
+  - `issues.response_schema.show`
+  - `mrs.response_schema.show`
+  - `discussions.response_schema`
+```
+
+10. **Adjust delivery order to reduce rework and include missing CSV path**
+Analysis: In your sample `handle_discussions`, `csv` is declared in args but not handled. Also, robot-docs should land after all payload changes. Sequence should minimize churn.
+
+```diff
+@@ Delivery Order
+-3. **Change 4** (robot-docs) — depends on 1 and 2 being done so schemas are accurate.
+-4. **Change 3** (discussions command) — largest change, depends on 1 for design consistency.
+3. **Change 3** (discussions command + indexes + pagination) — largest change.
+4. **Change 4** (robot-docs + contract tests) — last, after payloads are final.
+@@ 3e. Handler wiring
+-    match format {
+    match format {
+         "json" => ...
+         "jsonl" => ...
+        "csv" => print_list_discussions_csv(&result),
+         _ => ...
+     }
+```
+
+If you want, I can produce a single consolidated revised plan markdown with these edits applied so you can drop it in directly.