feat(cli): implement 'lore trace' command (bd-2n4, bd-9dd)

Gate 5 Code Trace - Tier 1 (API-only, no git blame).
Answers 'Why was this code introduced?' by building
file -> MR -> issue -> discussion chains.

New files:
- src/core/trace.rs: run_trace() query logic with rename-aware
  path resolution, entity_reference-based issue linking, and
  DiffNote discussion extraction
- src/core/trace_tests.rs: 7 unit tests for query logic
- src/cli/commands/trace.rs: CLI command with human output,
  robot JSON output, and :line suffix parsing (5 tests)

Human output shows full content (no truncation).
Robot JSON truncates discussion bodies to 500 chars for token efficiency.

Wiring:
- TraceArgs + Commands::Trace in cli/mod.rs
- handle_trace in main.rs
- VALID_COMMANDS + robot-docs manifest entry
- COMMAND_FLAGS autocorrect registry entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
teernisse
2026-02-17 14:16:45 -05:00
parent a1bca10408
commit 171260a772
13 changed files with 1807 additions and 283 deletions

View File

@@ -0,0 +1,147 @@
1. **Make `gitlab_note_id` explicit in all note-level payloads without breaking existing consumers**
Rationale: Your Bridge Contract already requires `gitlab_note_id`, but current plan keeps `gitlab_id` only in `notes` list while adding `gitlab_note_id` only in `show`. That forces agents to special-case commands. Add `gitlab_note_id` as an alias field everywhere note-level data appears, while keeping `gitlab_id` for compatibility.
```diff
@@ Bridge Contract (Cross-Cutting)
-Every read payload that surfaces notes or discussions MUST include:
+Every read payload that surfaces notes or discussions MUST include:
- project_path
- noteable_type
- parent_iid
- gitlab_discussion_id
- gitlab_note_id (when note-level data is returned — i.e., in notes list and show detail)
+ - Back-compat rule: note payloads may continue exposing `gitlab_id`, but MUST also expose `gitlab_note_id` with the same value.
@@ 1. Add `gitlab_discussion_id` to Notes Output
-#### 1c. Add field to `NoteListRowJson`
+#### 1c. Add fields to `NoteListRowJson`
+Add `gitlab_note_id` alias in addition to existing `gitlab_id` (no rename, no breakage).
@@ 1f. Update `--fields minimal` preset
-"notes" => ["id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
+"notes" => ["id", "gitlab_note_id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
```
2. **Avoid duplicate flag semantics for discussion filtering**
Rationale: `notes` already has `--discussion-id` and it already maps to `d.gitlab_discussion_id`. Adding a second independent flag/field (`--gitlab-discussion-id`) increases complexity and precedence bugs. Keep one backing filter field and make the new flag an alias.
```diff
@@ 1g. Add `--gitlab-discussion-id` filter to notes
-Allow filtering notes directly by GitLab discussion thread ID...
+Normalize discussion ID flags:
+- Keep one backing filter field (`discussion_id`)
+- Support both `--discussion-id` (existing) and `--gitlab-discussion-id` (alias)
+- If both are provided, clap should reject as duplicate/alias conflict
```
3. **Add ambiguity guardrails for cross-project discussion IDs**
Rationale: `gitlab_discussion_id` is unique per project, not globally. Filtering by discussion ID without project can return multiple rows across repos, which breaks deterministic write bridging. Fail fast with an `Ambiguous` error and actionable fix (`--project`).
```diff
@@ Bridge Contract (Cross-Cutting)
+### Ambiguity Guardrail
+When filtering by `gitlab_discussion_id` without `--project`, if multiple projects match:
+- return `Ambiguous` error
+- include matching project paths in message
+- suggest retry with `--project <path>`
```
4. **Replace `--include-notes` N+1 retrieval with one batched top-N query**
Rationale: The current plans per-discussion follow-up query scales poorly and creates latency spikes. Use a single window-function query over selected discussion IDs and group rows in Rust. This is both faster and more predictable.
```diff
@@ 3c-ii. Note expansion query (--include-notes)
-When `include_notes > 0`, after the main discussion query, run a follow-up query per discussion...
+When `include_notes > 0`, run one batched query:
+WITH ranked_notes AS (
+ SELECT
+ n.*,
+ d.gitlab_discussion_id,
+ ROW_NUMBER() OVER (
+ PARTITION BY n.discussion_id
+ ORDER BY n.created_at DESC, n.id DESC
+ ) AS rn
+ FROM notes n
+ JOIN discussions d ON d.id = n.discussion_id
+ WHERE n.discussion_id IN ( ...selected discussion ids... )
+)
+SELECT ... FROM ranked_notes WHERE rn <= ?
+ORDER BY discussion_id, rn;
+
+Group by `discussion_id` in Rust and attach notes arrays without per-thread round-trips.
```
5. **Add hard output guardrails and explicit truncation metadata**
Rationale: `--limit` and `--include-notes` are unbounded today. For robot workflows this can accidentally generate huge payloads. Cap values and surface effective limits plus truncation state in `meta`.
```diff
@@ 3a. CLI Args
- pub limit: usize,
+ pub limit: usize, // clamp to max (e.g., 500)
- pub include_notes: usize,
+ pub include_notes: usize, // clamp to max (e.g., 20)
@@ Response Schema
- "meta": { "elapsed_ms": 12 }
+ "meta": {
+ "elapsed_ms": 12,
+ "effective_limit": 50,
+ "effective_include_notes": 2,
+ "has_more": true
+ }
```
6. **Strengthen deterministic ordering and null handling**
Rationale: `first_note_at`, `last_note_at`, and note `position` can be null/incomplete during partial sync states. Add null-safe ordering to avoid unstable output and flaky automation.
```diff
@@ 2c. Update queries to SELECT new fields
-... ORDER BY first_note_at
+... ORDER BY COALESCE(first_note_at, last_note_at, 0), id
@@ show note query
-ORDER BY position
+ORDER BY COALESCE(position, 9223372036854775807), created_at, id
@@ 3c. SQL Query
-ORDER BY {sort_column} {order}
+ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
```
7. **Make write-bridging more useful with optional command hints**
Rationale: Exposing IDs is necessary but not sufficient; agents still need to assemble endpoints repeatedly. Add optional `--with-write-hints` that injects compact endpoint templates (`reply`, `resolve`) derived from row context. This improves usability without bloating default output.
```diff
@@ 3a. CLI Args
+ /// Include machine-actionable glab write hints per row
+ #[arg(long, help_heading = "Output")]
+ pub with_write_hints: bool,
@@ Response Schema (notes/discussions/show)
+ "write_hints?": {
+ "reply_endpoint": "string",
+ "resolve_endpoint?": "string"
+ }
```
8. **Upgrade robot-docs/contract validation from string-contains to parity checks**
Rationale: `contains("gitlab_discussion_id")` catches very little and allows schema drift. Build field-set parity tests that compare actual serialized JSON keys to robot-docs declared fields for `notes`, `discussions`, and `show` discussion nodes.
```diff
@@ 4f. Add robot-docs contract tests
-assert!(notes_schema.contains("gitlab_discussion_id"));
+let declared = parse_schema_field_list(notes_schema);
+let sample = sample_notes_row_json_keys();
+assert_required_subset(&declared, &["project_path","noteable_type","parent_iid","gitlab_discussion_id","gitlab_note_id"]);
+assert_schema_matches_payload(&declared, &sample);
@@ 4g. Add CLI-level contract integration tests
+Add parity tests for:
+- notes list JSON
+- discussions list JSON
+- issues show discussions[*]
+- mrs show discussions[*]
```
If you want, I can produce a full revised v3 plan text with these edits merged end-to-end so its ready to execute directly.