feat(cli): implement 'lore trace' command (bd-2n4, bd-9dd)

Gate 5 Code Trace - Tier 1 (API-only, no git blame). Answers 'Why was this
code introduced?' by building file -> MR -> issue -> discussion chains.

New files:
- src/core/trace.rs: run_trace() query logic with rename-aware path
  resolution, entity_reference-based issue linking, and DiffNote
  discussion extraction
- src/core/trace_tests.rs: 7 unit tests for query logic
- src/cli/commands/trace.rs: CLI command with human output, robot JSON
  output, and :line suffix parsing (5 tests)

Wiring:
- TraceArgs + Commands::Trace in cli/mod.rs
- handle_trace in main.rs
- VALID_COMMANDS + robot-docs manifest entry
- COMMAND_FLAGS autocorrect registry entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 14:16:45 -05:00
parent a1bca10408
commit 415f7e69af
13 changed files with 1514 additions and 78 deletions
--- a/docs/plan-expose-discussion-ids.feedback-3.md
+++ b/docs/plan-expose-discussion-ids.feedback-3.md
@@ -0,0 +1,147 @@
+1. **Make `gitlab_note_id` explicit in all note-level payloads without breaking existing consumers**
+Rationale: Your Bridge Contract already requires `gitlab_note_id`, but the current plan exposes only `gitlab_id` in the `notes` list and adds `gitlab_note_id` only in `show`. That forces agents to special-case commands. Add `gitlab_note_id` as an alias field everywhere note-level data appears, while keeping `gitlab_id` for compatibility.
+
+```diff
+@@ Bridge Contract (Cross-Cutting)
+-Every read payload that surfaces notes or discussions MUST include:
+Every read payload that surfaces notes or discussions MUST include:
+   - project_path
+   - noteable_type
+   - parent_iid
+   - gitlab_discussion_id
+   - gitlab_note_id (when note-level data is returned — i.e., in notes list and show detail)
+  - Back-compat rule: note payloads may continue exposing `gitlab_id`, but MUST also expose `gitlab_note_id` with the same value.
+
+@@ 1. Add `gitlab_discussion_id` to Notes Output
+-#### 1c. Add field to `NoteListRowJson`
+#### 1c. Add fields to `NoteListRowJson`
+Add `gitlab_note_id` alias in addition to existing `gitlab_id` (no rename, no breakage).
+
+@@ 1f. Update `--fields minimal` preset
+-"notes" => ["id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
+"notes" => ["id", "gitlab_note_id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
+```
+
+2. **Avoid duplicate flag semantics for discussion filtering**
+Rationale: `notes` already has `--discussion-id` and it already maps to `d.gitlab_discussion_id`. Adding a second independent flag/field (`--gitlab-discussion-id`) increases complexity and precedence bugs. Keep one backing filter field and make the new flag an alias.
+
+```diff
+@@ 1g. Add `--gitlab-discussion-id` filter to notes
+-Allow filtering notes directly by GitLab discussion thread ID...
+Normalize discussion ID flags:
+- Keep one backing filter field (`discussion_id`)
+- Support both `--discussion-id` (existing) and `--gitlab-discussion-id` (alias)
+- If both are provided, clap should reject as duplicate/alias conflict
+```
+
+3. **Add ambiguity guardrails for cross-project discussion IDs**
+Rationale: `gitlab_discussion_id` is unique per project, not globally. Filtering by discussion ID without project can return multiple rows across repos, which breaks deterministic write bridging. Fail fast with an `Ambiguous` error and actionable fix (`--project`).
+
+```diff
+@@ Bridge Contract (Cross-Cutting)
+### Ambiguity Guardrail
+When filtering by `gitlab_discussion_id` without `--project`, if multiple projects match:
+- return `Ambiguous` error
+- include matching project paths in message
+- suggest retry with `--project <path>`
+```
+
+4. **Replace `--include-notes` N+1 retrieval with one batched top-N query**
+Rationale: The current plan’s per-discussion follow-up query scales poorly and creates latency spikes. Use a single window-function query over selected discussion IDs and group rows in Rust. This is both faster and more predictable.
+
+```diff
+@@ 3c-ii. Note expansion query (--include-notes)
+-When `include_notes > 0`, after the main discussion query, run a follow-up query per discussion...
+When `include_notes > 0`, run one batched query:
+WITH ranked_notes AS (
+  SELECT
+    n.*,
+    d.gitlab_discussion_id,
+    ROW_NUMBER() OVER (
+      PARTITION BY n.discussion_id
+      ORDER BY n.created_at DESC, n.id DESC
+    ) AS rn
+  FROM notes n
+  JOIN discussions d ON d.id = n.discussion_id
+  WHERE n.discussion_id IN ( ...selected discussion ids... )
+)
+SELECT ... FROM ranked_notes WHERE rn <= ?
+ORDER BY discussion_id, rn;
+
+Group by `discussion_id` in Rust and attach notes arrays without per-thread round-trips.
+```
+
+5. **Add hard output guardrails and explicit truncation metadata**
+Rationale: `--limit` and `--include-notes` are unbounded today. For robot workflows this can accidentally generate huge payloads. Cap values and surface effective limits plus truncation state in `meta`.
+
+```diff
+@@ 3a. CLI Args
+-    pub limit: usize,
+    pub limit: usize, // clamp to max (e.g., 500)
+
+-    pub include_notes: usize,
+    pub include_notes: usize, // clamp to max (e.g., 20)
+
+@@ Response Schema
+-  "meta": { "elapsed_ms": 12 }
+  "meta": {
+    "elapsed_ms": 12,
+    "effective_limit": 50,
+    "effective_include_notes": 2,
+    "has_more": true
+  }
+```
+
+6. **Strengthen deterministic ordering and null handling**
+Rationale: `first_note_at`, `last_note_at`, and note `position` can be null/incomplete during partial sync states. Add null-safe ordering to avoid unstable output and flaky automation.
+
+```diff
+@@ 2c. Update queries to SELECT new fields
+-... ORDER BY first_note_at
+... ORDER BY COALESCE(first_note_at, last_note_at, 0), id
+
+@@ show note query
+-ORDER BY position
+ORDER BY COALESCE(position, 9223372036854775807), created_at, id
+
+@@ 3c. SQL Query
+-ORDER BY {sort_column} {order}
+ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
+```
+
+7. **Make write-bridging more useful with optional command hints**
+Rationale: Exposing IDs is necessary but not sufficient; agents still need to assemble endpoints repeatedly. Add optional `--with-write-hints` that injects compact endpoint templates (`reply`, `resolve`) derived from row context. This improves usability without bloating default output.
+
+```diff
+@@ 3a. CLI Args
+    /// Include machine-actionable glab write hints per row
+    #[arg(long, help_heading = "Output")]
+    pub with_write_hints: bool,
+
+@@ Response Schema (notes/discussions/show)
+    "write_hints?": {
+      "reply_endpoint": "string",
+      "resolve_endpoint?": "string"
+    }
+```
+
+8. **Upgrade robot-docs/contract validation from string-contains to parity checks**
+Rationale: `contains("gitlab_discussion_id")` catches very little and allows schema drift. Build field-set parity tests that compare actual serialized JSON keys to robot-docs declared fields for `notes`, `discussions`, and `show` discussion nodes.
+
+```diff
+@@ 4f. Add robot-docs contract tests
+-assert!(notes_schema.contains("gitlab_discussion_id"));
+let declared = parse_schema_field_list(notes_schema);
+let sample = sample_notes_row_json_keys();
+assert_required_subset(&declared, &["project_path","noteable_type","parent_iid","gitlab_discussion_id","gitlab_note_id"]);
+assert_schema_matches_payload(&declared, &sample);
+
+@@ 4g. Add CLI-level contract integration tests
+Add parity tests for:
+- notes list JSON
+- discussions list JSON
+- issues show discussions[*]
+- mrs show discussions[*]
+```
+
+If you want, I can produce a full revised v3 plan text with these edits merged end-to-end so it’s ready to execute directly.
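Item 8 above asks for field-set parity checks in place of `contains(...)`. The following is a minimal sketch of that comparison using only the standard library. `missing_required` and `schema_drift` are hypothetical helper names, not functions from the lore codebase, and the field sets are passed in directly rather than parsed from robot-docs or from serialized JSON.

```rust
use std::collections::BTreeSet;

/// Fields the Bridge Contract requires on note-level payloads (list taken from the plan).
const BRIDGE_FIELDS_NOTES: &[&str] = &[
    "project_path",
    "noteable_type",
    "parent_iid",
    "gitlab_discussion_id",
    "gitlab_note_id",
];

/// Hypothetical helper: required fields that the declared schema does not list.
fn missing_required<'a>(declared: &BTreeSet<&str>, required: &[&'a str]) -> Vec<&'a str> {
    required
        .iter()
        .copied()
        .filter(|field| !declared.contains(*field))
        .collect()
}

/// Hypothetical helper: fields present on one side only.
/// Returns (in payload but not declared, declared but not in payload).
fn schema_drift<'a>(
    declared: &BTreeSet<&'a str>,
    payload: &BTreeSet<&'a str>,
) -> (Vec<&'a str>, Vec<&'a str>) {
    let undocumented = payload.difference(declared).copied().collect();
    let stale = declared.difference(payload).copied().collect();
    (undocumented, stale)
}
```

A payload that gained `gitlab_note_id` without a matching robot-docs update would show up in the first list returned by `schema_drift`, which a substring check on `gitlab_discussion_id` alone would not detect.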
--- a/docs/plan-expose-discussion-ids.feedback-4.md
+++ b/docs/plan-expose-discussion-ids.feedback-4.md
@@ -0,0 +1,207 @@
+Below are the highest-impact revisions I’d make to this plan. I excluded everything listed in your `## Rejected Recommendations` section.
+
+**1. Fix a correctness bug in the ambiguity guardrail (must run before `LIMIT`)**
+
+The current post-query ambiguity check can silently fail when `--limit` truncates results to one project even though multiple projects match the same `gitlab_discussion_id`. That creates non-deterministic write targeting risk.
+
+````diff
+@@ ## Ambiguity Guardrail
+-**Implementation**: After the main query, if `gitlab_discussion_id` is set and no `--project`
+-was provided, check if the result set spans multiple `project_path` values.
+**Implementation**: Run a preflight distinct-project check when `gitlab_discussion_id` is set
+and `--project` was not provided, before the main list query applies `LIMIT`.
+Use:
+```sql
+SELECT DISTINCT p.path_with_namespace
+FROM discussions d
+JOIN projects p ON p.id = d.project_id
+WHERE d.gitlab_discussion_id = ?
+LIMIT 3
+```
+If more than one project is found, return `LoreError::Ambiguous` (exit code 18) with the project
+paths and a suggestion to retry with `--project <path>`.
+````
+
+---
+
+**2. Add `gitlab_project_id` to the Bridge Contract**
+
+`project_path` is human-friendly but mutable (renames/transfers). `gitlab_project_id` gives a stable write target and avoids path re-resolution failures.
+
+```diff
+@@ ## Bridge Contract (Cross-Cutting)
+ Every read payload that surfaces notes or discussions **MUST** include:
+ - `project_path`
+- `gitlab_project_id`
+ - `noteable_type`
+ - `parent_iid`
+ - `gitlab_discussion_id`
+ - `gitlab_note_id`
+@@
+ const BRIDGE_FIELDS_NOTES: &[&str] = &[
+-    "project_path", "noteable_type", "parent_iid",
+    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
+     "gitlab_discussion_id", "gitlab_note_id",
+ ];
+ const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
+-    "project_path", "noteable_type", "parent_iid",
+    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
+     "gitlab_discussion_id",
+ ];
+```
+
+---
+
+**3. Replace stringly-typed filter/sort fields with enums end-to-end**
+
+Right now `sort`, `order`, `resolution`, `noteable_type` are mostly `String`. This is fragile and risks unsafe SQL interpolation drift over time. Typed enums make invalid states unrepresentable.
+
+```diff
+@@ ## 3a. CLI Args
+-    pub resolution: Option<String>,
+    pub resolution: Option<ResolutionFilter>,
+@@
+-    pub noteable_type: Option<String>,
+    pub noteable_type: Option<NoteableTypeFilter>,
+@@
+-    pub sort: String,
+    pub sort: DiscussionSortField,
+@@
+-    pub asc: bool,
+    pub order: SortDirection,
+@@ ## 3d. Filters struct
+-    pub resolution: Option<String>,
+-    pub noteable_type: Option<String>,
+-    pub sort: String,
+-    pub order: String,
+    pub resolution: Option<ResolutionFilter>,
+    pub noteable_type: Option<NoteableTypeFilter>,
+    pub sort: DiscussionSortField,
+    pub order: SortDirection,
+@@
+Map enum -> SQL fragment via `match` in query builder; never interpolate raw strings.
+```
+
+---
+
+**4. Enforce snapshot consistency for multi-query commands**
+
+`discussions` with `--include-notes` does multiple reads. Without a single read transaction, concurrent ingest can produce mismatched `total_count`, row set, and expanded notes.
+
+```diff
+@@ ## 3c. SQL Query
+-pub fn query_discussions(...)
+pub fn query_discussions(...)
+ {
+    // Run count query + page query + note expansion under one deferred read transaction
+    // so output is a single consistent snapshot.
+    let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
+     ...
+    tx.commit()?;
+ }
+@@ ## 1. Add `gitlab_discussion_id` to Notes Output
+Apply the same snapshot rule to `query_notes` when returning `total_count` + paged rows.
+```
+
+---
+
+**5. Correct first-note rollup semantics (current CTE can return null/incorrect `first_author`)**
+
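+In the proposed SQL, `rn=1` is computed over all notes but then filtered with `is_system=0`, so threads with a leading system note may incorrectly lose `first_author`/snippet. The path rollup also uses `MAX(...)`, which is deterministic but arbitrary: it picks the largest path and the largest line independently, so the pair may come from different notes rather than from the first positioned note.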
+
+```diff
+@@ ## 3c. SQL Query
+-ranked_notes AS (
+ranked_notes AS (
+     SELECT
+         n.discussion_id,
+         n.author_username,
+         n.body,
+         n.is_system,
+         n.position_new_path,
+         n.position_new_line,
+-        ROW_NUMBER() OVER (
+-            PARTITION BY n.discussion_id
+-            ORDER BY n.position, n.id
+-        ) AS rn
+        ROW_NUMBER() OVER (
+            PARTITION BY n.discussion_id
+            ORDER BY CASE WHEN n.is_system = 0 THEN 0 ELSE 1 END, n.created_at, n.id
+        ) AS rn_first_note,
+        ROW_NUMBER() OVER (
+            PARTITION BY n.discussion_id
+            ORDER BY CASE WHEN n.position_new_path IS NULL THEN 1 ELSE 0 END, n.created_at, n.id
+        ) AS rn_first_position
+@@
+-        MAX(CASE WHEN rn = 1 AND is_system = 0 THEN author_username END) AS first_author,
+-        MAX(CASE WHEN rn = 1 AND is_system = 0 THEN body END) AS first_note_body,
+-        MAX(CASE WHEN position_new_path IS NOT NULL THEN position_new_path END) AS position_new_path,
+-        MAX(CASE WHEN position_new_line IS NOT NULL THEN position_new_line END) AS position_new_line
+        MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN author_username END) AS first_author,
+        MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN body END) AS first_note_body,
+        MAX(CASE WHEN rn_first_position = 1 THEN position_new_path END) AS position_new_path,
+        MAX(CASE WHEN rn_first_position = 1 THEN position_new_line END) AS position_new_line
+```
+
+---
+
+**6. Add per-discussion truncation signals for `--include-notes`**
+
+Top-level `has_more` is useful, but agents also need to know if an individual thread’s notes were truncated. Otherwise they can’t tell if a thread is complete.
+
+```diff
+@@ ## Response Schema
+       {
+         "gitlab_discussion_id": "...",
+         ...
+-        "notes": []
+        "included_note_count": 0,
+        "has_more_notes": false,
+        "notes": []
+       }
+@@ ## 3b. Domain Structs
+ pub struct DiscussionListRowJson {
+@@
+    pub included_note_count: usize,
+    pub has_more_notes: bool,
+     #[serde(skip_serializing_if = "Vec::is_empty")]
+     pub notes: Vec<NoteListRowJson>,
+ }
+@@ ## 3c-ii. Note expansion query (--include-notes)
+-Group by `discussion_id` in Rust and attach notes arrays...
+Group by `discussion_id` in Rust, attach notes arrays, and set:
+`included_note_count = notes.len()`,
+`has_more_notes = note_count > included_note_count`.
+```
+
+---
+
+**7. Add explicit query-plan gate and targeted index workstream (measured, not speculative)**
+
+This plan introduces heavy discussion-centric reads. You should bake in deterministic performance validation with `EXPLAIN QUERY PLAN` and only then add indexes if missing.
+
+```diff
+@@ ## Scope: Four workstreams, delivered in order:
+-4. Fix robot-docs to list actual field names instead of opaque type references
+4. Add query-plan validation + targeted index updates for new discussion queries
+5. Fix robot-docs to list actual field names instead of opaque type references
+@@
+## 4. Query-Plan Validation and Targeted Indexes
+
+Before and after implementing `query_discussions`, capture `EXPLAIN QUERY PLAN` for:
+- `--for-mr <iid> --resolution unresolved`
+- `--project <path> --since 7d --sort last_note`
+- `--gitlab-discussion-id <id>`
+
+If plans show table scans on `notes`/`discussions`, add indexes in `MIGRATIONS` array:
+- `discussions(project_id, gitlab_discussion_id)`
+- `discussions(merge_request_id, last_note_at, id)`
+- `notes(discussion_id, created_at DESC, id DESC)`
+- `notes(discussion_id, position, id)`
+
+Tests: assert the new query paths return expected rows under indexed schema and no regressions.
+```
+
+---
+
+If you want, I can produce a single consolidated “iteration 4” version of the plan text with all seven revisions merged in place.
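Revision 3 above recommends mapping enums to SQL fragments with `match` instead of interpolating raw strings. A minimal sketch of that idea follows. The variant names, the `parse` helper, and the accepted string `first_note` are assumptions for illustration; only `--sort last_note`, the `fd.first_note_at` / `fd.last_note_at` columns, and the `ORDER BY COALESCE(...)` shape appear in the plan.

```rust
/// Sort fields the command accepts (variant names are assumptions).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DiscussionSortField {
    FirstNote,
    LastNote,
}

/// Sort direction, replacing the free-form "asc"/"desc" string.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SortDirection {
    Asc,
    Desc,
}

impl DiscussionSortField {
    /// Parses user input; anything outside the known set is rejected.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "first_note" => Some(Self::FirstNote),
            "last_note" => Some(Self::LastNote),
            _ => None,
        }
    }

    /// Only fixed, compile-time fragments ever reach the SQL text.
    fn sql_column(self) -> &'static str {
        match self {
            Self::FirstNote => "fd.first_note_at",
            Self::LastNote => "fd.last_note_at",
        }
    }
}

impl SortDirection {
    fn sql_keyword(self) -> &'static str {
        match self {
            Self::Asc => "ASC",
            Self::Desc => "DESC",
        }
    }
}

/// Builds the null-safe ORDER BY clause with the `id` tiebreaker.
fn order_by_clause(sort: DiscussionSortField, order: SortDirection) -> String {
    format!(
        "ORDER BY COALESCE({col}, 0) {dir}, fd.id {dir}",
        col = sort.sql_column(),
        dir = order.sql_keyword()
    )
}
```

Because user input is converted to an enum before the query builder sees it, an unexpected value fails at parse time and never becomes part of the SQL string.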
--- a/docs/plan-expose-discussion-ids.md
+++ b/docs/plan-expose-discussion-ids.md
@@ -2,7 +2,7 @@
 plan: true
 title: ""
 status: iterating
-iteration: 2
+iteration: 3
 target_iterations: 8
 beads_revision: 0
 related_plans: []
@@ -34,6 +34,11 @@ Every read payload that surfaces notes or discussions **MUST** include:
 - `gitlab_discussion_id`
 - `gitlab_note_id` (when note-level data is returned — i.e., in notes list and show detail)

+**Back-compat rule**: Note payloads in the `notes` list command continue exposing `gitlab_id`
+for existing consumers, but **MUST also** expose `gitlab_note_id` with the same value. This
+ensures agents can use a single field name (`gitlab_note_id`) across all commands — `notes`,
+`show`, and `discussions --include-notes` — without special-casing by command.
+
 This contract exists so agents can deterministically construct `glab api` write calls without
 cross-referencing multiple commands. Each workstream below must satisfy these fields in its
 output.
@@ -64,6 +69,37 @@ In `filter_fields`, when entity is `"notes"` or `"discussions"`, merge the bridg
 requested fields before filtering the JSON value. This is a ~5-line change to the existing
 function.

+### Ambiguity Guardrail
+
+When filtering by `gitlab_discussion_id` (on either `notes` or `discussions` commands) without
+`--project`, if the query matches discussions in multiple projects:
+- Return an `Ambiguous` error (exit code 18, matching existing convention)
+- Include matching project paths in the error message
+- Suggest retry with `--project <path>`
+
+**Implementation**: After the main query, if `gitlab_discussion_id` is set and no `--project`
+was provided, check if the result set spans multiple `project_path` values. If so, return
+`LoreError::Ambiguous` with the distinct project paths. This is a post-query check (not a
+pre-query reject) so it only fires when real ambiguity exists.
+
+```rust
+// In query_notes / query_discussions, after collecting results:
+if filters.gitlab_discussion_id.is_some() && filters.project.is_none() {
+    let distinct_projects: HashSet<&str> = results.iter()
+        .map(|r| r.project_path.as_str())
+        .collect();
+    if distinct_projects.len() > 1 {
+        return Err(LoreError::Ambiguous {
+            message: format!(
+                "Discussion ID matches {} projects: {}. Use --project to disambiguate.",
+                distinct_projects.len(),
+                distinct_projects.into_iter().collect::<Vec<_>>().join(", ")
+            ),
+        });
+    }
+}
+```
+
 ---

 ## 1. Add `gitlab_discussion_id` to Notes Output
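The Ambiguity Guardrail added in the hunk above can be exercised in isolation. The sketch below is a standalone approximation: `Ambiguous` stands in for `LoreError::Ambiguous`, `check_ambiguity` is a hypothetical name, and the inputs are plain string slices instead of query rows. It also uses a `BTreeSet` where the plan uses a `HashSet`, as an assumption that a stable ordering of project paths in the message is desirable.

```rust
use std::collections::BTreeSet;

/// Stand-in for the `LoreError::Ambiguous` variant (the real type lives in the crate).
#[derive(Debug, PartialEq)]
struct Ambiguous {
    message: String,
}

/// `project_paths` holds one entry per matched row. Returns an error when the rows
/// span more than one project and no `--project` filter was supplied.
fn check_ambiguity(project_paths: &[&str], project_filter: Option<&str>) -> Result<(), Ambiguous> {
    if project_filter.is_some() {
        return Ok(());
    }
    // BTreeSet deduplicates and iterates in sorted order, so the message is stable.
    let distinct: BTreeSet<&str> = project_paths.iter().copied().collect();
    if distinct.len() > 1 {
        let listed = distinct.iter().copied().collect::<Vec<_>>().join(", ");
        return Err(Ambiguous {
            message: format!(
                "Discussion ID matches {} projects: {}. Use --project to disambiguate.",
                distinct.len(),
                listed
            ),
        });
    }
    Ok(())
}
```

Rows from a single project, or any call that supplies a project filter, pass through unchanged; only a real multi-project match produces the error.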
@@ -175,13 +211,17 @@ etc.) which rusqlite's `row.get("name")` can resolve. This eliminates the fragil
 column-index counting that has caused bugs in the past. If the conversion touches too many
 lines, limit named lookup to just the new field and add a follow-up task.

-#### 1c. Add field to `NoteListRowJson`
+#### 1c. Add fields to `NoteListRowJson`

 **File**: `src/cli/commands/list.rs` line ~1093

+Add both `gitlab_discussion_id` and `gitlab_note_id` (alias for `gitlab_id`):
+
 ```rust
 pub struct NoteListRowJson {
    // ... existing fields ...
+    pub gitlab_id: i64,              // KEEP — existing consumers
+    pub gitlab_note_id: i64,         // ADD — Bridge Contract alias
    pub project_path: String,
    pub gitlab_discussion_id: String,  // ADD
 }
@@ -194,6 +234,8 @@ impl From<&NoteListRow> for NoteListRowJson {
    fn from(row: &NoteListRow) -> Self {
        Self {
            // ... existing fields ...
+            gitlab_id: row.gitlab_id,
+            gitlab_note_id: row.gitlab_id,  // ADD — same value as gitlab_id
            project_path: row.project_path.clone(),
            gitlab_discussion_id: row.gitlab_discussion_id.clone(),  // ADD
        }
@@ -205,7 +247,7 @@ impl From<&NoteListRow> for NoteListRowJson {

 **File**: `src/cli/commands/list.rs` line ~1004

-Add `gitlab_discussion_id` to the CSV header and row output.
+Add `gitlab_discussion_id` and `gitlab_note_id` to the CSV header and row output.

 #### 1e. Add to table display

@@ -218,13 +260,13 @@ Add a column showing a truncated discussion ID (first 8 chars) in the table view
 **File**: `src/cli/robot.rs` line ~67

 ```rust
-"notes" => ["id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
+"notes" => ["id", "gitlab_note_id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
    .iter()
    .map(|s| (*s).to_string())
    .collect(),
 ```

-The discussion ID is critical enough for agent workflows that it belongs in `minimal`.
+The discussion ID and note ID are critical for agent bridge workflows and belong in `minimal`.

 #### 1g. Add `--gitlab-discussion-id` filter to notes

@@ -233,6 +275,10 @@ the internal integer). This enables one-hop note retrieval from external referen
 that received a `gitlab_discussion_id` from another command or webhook can jump straight to
 the relevant notes without knowing the internal discussion ID.

+**Note**: This is distinct from the existing `--discussion-id` filter which takes the internal
+integer ID. The two filters serve different use cases: internal cross-referencing vs. external
+API bridging.
+
 **File**: `src/cli/mod.rs` (NotesArgs)

 ```rust
@@ -286,6 +332,7 @@ fn note_list_row_json_includes_gitlab_discussion_id() {

    let json_row = NoteListRowJson::from(&row);
    assert_eq!(json_row.gitlab_discussion_id, "6a9c1750b37d");
+    assert_eq!(json_row.gitlab_note_id, 100);  // alias matches gitlab_id

    let serialized = serde_json::to_value(&json_row).unwrap();
    assert!(serialized.get("gitlab_discussion_id").is_some());
@@ -293,6 +340,9 @@ fn note_list_row_json_includes_gitlab_discussion_id() {
        serialized["gitlab_discussion_id"].as_str().unwrap(),
        "6a9c1750b37d"
    );
+    // Both gitlab_id and gitlab_note_id present with same value
+    assert_eq!(serialized["gitlab_id"], 100);
+    assert_eq!(serialized["gitlab_note_id"], 100);
 }
 ```

@@ -436,6 +486,19 @@ fn notes_filter_by_gitlab_discussion_id() {
 }
 ```

+#### Test 6: Ambiguity guardrail fires for cross-project discussion ID matches
+
+```rust
+#[test]
+fn notes_ambiguous_gitlab_discussion_id_across_projects() {
+    let conn = create_test_db();
+    // Insert 2 projects, each with a discussion sharing the same gitlab_discussion_id
+    // (this can happen since IDs are per-project)
+    // Filter by gitlab_discussion_id without --project
+    // Assert LoreError::Ambiguous is returned with both project paths
+}
+```
+
 ---

 ## 2. Add `gitlab_discussion_id` to Show Command Discussion Groups
@@ -534,18 +597,24 @@ pub struct MrDiscussionDetailJson {

 **Issue discussions** (`show.rs:325`):
 ```sql
-SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved, last_note_at
+SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved,
+       COALESCE(last_note_at, first_note_at, 0) AS last_note_at
 FROM discussions
-WHERE issue_id = ? ORDER BY first_note_at
+WHERE issue_id = ? ORDER BY COALESCE(first_note_at, last_note_at, 0), id
 ```

 **MR discussions** (`show.rs:537`):
 ```sql
-SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved, last_note_at
+SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved,
+       COALESCE(last_note_at, first_note_at, 0) AS last_note_at
 FROM discussions
-WHERE merge_request_id = ? ORDER BY first_note_at
+WHERE merge_request_id = ? ORDER BY COALESCE(first_note_at, last_note_at, 0), id
 ```

+**Note on ordering**: The `COALESCE` with tiebreaker `id` ensures deterministic ordering even
+when timestamps are NULL (possible during partial sync states). This prevents unstable output
+that could confuse automated workflows.
+
 #### 2d. Update query_map closures

 The `disc_rows` tuple changes from `(i64, bool)` to a richer shape. Use named columns here
@@ -753,7 +822,12 @@ lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2
    "total_count": 15,
    "showing": 15
  },
-  "meta": { "elapsed_ms": 12 }
+  "meta": {
+    "elapsed_ms": 12,
+    "effective_limit": 50,
+    "effective_include_notes": 0,
+    "has_more": false
+  }
 }
 ```

@@ -761,6 +835,10 @@ The `notes` array is empty by default (zero overhead). When `--include-notes N`
 each discussion includes up to N of its most recent notes inline. This covers the common
 agent pattern of "show me unresolved threads with context" in a single round-trip.

+The `meta` block includes `effective_limit` and `effective_include_notes` (the clamped values
+actually used) plus `has_more` (true when total_count > showing). This lets agents detect
+truncation and decide whether to paginate or narrow their query.
+
 ### File Architecture

 **No new files.** Follow the existing pattern:
@@ -789,7 +867,7 @@ Args struct:
 ```rust
 #[derive(Parser)]
 pub struct DiscussionsArgs {
-    /// Maximum results
+    /// Maximum results (clamped to 500)
    #[arg(short = 'n', long = "limit", default_value = "50", help_heading = "Output")]
    pub limit: usize,

@@ -833,7 +911,7 @@ pub struct DiscussionsArgs {
    #[arg(long, value_parser = ["Issue", "MergeRequest"], help_heading = "Filters")]
    pub noteable_type: Option<String>,

-    /// Include up to N latest notes per discussion (0 = none, default)
+    /// Include up to N latest notes per discussion (0 = none, default; clamped to 20)
    #[arg(long, default_value = "0", help_heading = "Output")]
    pub include_notes: usize,

@@ -847,6 +925,11 @@ pub struct DiscussionsArgs {
 }
 ```

+**Output guardrails**: The handler clamps `limit` to `min(limit, 500)` and `include_notes`
+to `min(include_notes, 20)` before passing to the query layer. This prevents accidentally
+huge payloads in robot mode. The clamped values are reported in `meta.effective_limit` and
+`meta.effective_include_notes`.
+
 #### 3b. Domain Structs

 **File**: `src/cli/commands/list.rs`
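The output guardrails described in the hunk above reduce to two clamps and one comparison. A minimal sketch, assuming the caps of 500 and 20 stated in the plan; `EffectiveMeta` and `effective_meta` are hypothetical names, and the real handler builds the `meta` block with `serde_json::json!` rather than a struct.

```rust
/// Caps taken from the plan text.
const MAX_LIMIT: usize = 500;
const MAX_INCLUDE_NOTES: usize = 20;

/// Hypothetical mirror of the fields reported in the `meta` block.
#[derive(Debug, PartialEq)]
struct EffectiveMeta {
    effective_limit: usize,
    effective_include_notes: usize,
    has_more: bool,
}

/// Clamps the requested values and derives the truncation flag.
fn effective_meta(
    limit: usize,
    include_notes: usize,
    total_count: usize,
    showing: usize,
) -> EffectiveMeta {
    EffectiveMeta {
        effective_limit: limit.min(MAX_LIMIT),
        effective_include_notes: include_notes.min(MAX_INCLUDE_NOTES),
        // True when the query matched more rows than were returned.
        has_more: total_count > showing,
    }
}
```

Keeping the clamp in one function gives the handler and its tests a single place to check, instead of asserting on `min` calls directly.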
@@ -981,8 +1064,8 @@ SELECT
    COALESCE(nr.note_count, 0) AS note_count,
    nr.first_author,
    nr.first_note_body,
-    fd.first_note_at,
-    fd.last_note_at,
+    COALESCE(fd.first_note_at, fd.last_note_at, 0) AS first_note_at,
+    COALESCE(fd.last_note_at, fd.first_note_at, 0) AS last_note_at,
    fd.resolvable,
    fd.resolved,
    nr.position_new_path,
@@ -992,7 +1075,7 @@ JOIN projects p ON fd.project_id = p.id
 LEFT JOIN issues i ON fd.issue_id = i.id
 LEFT JOIN merge_requests m ON fd.merge_request_id = m.id
 LEFT JOIN note_rollup nr ON nr.discussion_id = fd.id
-ORDER BY {sort_column} {order}
+ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
 LIMIT ?
 ```

@@ -1004,39 +1087,54 @@ of discussions, the window function approach avoids repeated index probes and pr
 more predictable query plan. The `MAX(CASE WHEN rn = 1 ...)` pattern extracts first-note
 attributes from the grouped output without additional lookups.

+**Note on ordering**: The `COALESCE({sort_column}, 0)` with tiebreaker `fd.id` ensures
+deterministic ordering even when timestamps are NULL (partial sync states). The `id`
+tiebreaker is cheap (primary key) and prevents unstable sort output.
+
 **Note on SQLite FILTER syntax**: SQLite versions before 3.30.0 do not support
 `COUNT(*) FILTER (WHERE ...)` on plain aggregates. `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` (as shown above) works on every version.

 #### 3c-ii. Note expansion query (--include-notes)

-When `include_notes > 0`, after the main discussion query, run a follow-up query per
-discussion to fetch its N most recent notes:
+When `include_notes > 0`, after the main discussion query, run a **single batched query**
+using a window function to fetch the N most recent notes per discussion:

 ```sql
-SELECT n.id, n.gitlab_id, n.author_username, n.body, n.note_type,
-       n.is_system, n.created_at, n.updated_at,
-       n.position_new_path, n.position_new_line,
-       n.position_old_path, n.position_old_line,
-       n.resolvable, n.resolved, n.resolved_by,
-       d.noteable_type,
-       COALESCE(i.iid, m.iid) AS parent_iid,
-       COALESCE(i.title, m.title) AS parent_title,
-       p.path_with_namespace AS project_path,
-       d.gitlab_discussion_id
-FROM notes n
-JOIN discussions d ON n.discussion_id = d.id
-JOIN projects p ON n.project_id = p.id
-LEFT JOIN issues i ON d.issue_id = i.id
-LEFT JOIN merge_requests m ON d.merge_request_id = m.id
-WHERE d.id = ?
-ORDER BY n.created_at DESC
-LIMIT ?
+WITH ranked_expansion AS (
+    SELECT
+        n.id, n.gitlab_id, n.author_username, n.body, n.note_type,
+        n.is_system, n.created_at, n.updated_at,
+        n.position_new_path, n.position_new_line,
+        n.position_old_path, n.position_old_line,
+        n.resolvable, n.resolved, n.resolved_by,
+        d.noteable_type,
+        COALESCE(i.iid, m.iid) AS parent_iid,
+        COALESCE(i.title, m.title) AS parent_title,
+        p.path_with_namespace AS project_path,
+        d.gitlab_discussion_id,
+        n.discussion_id,
+        ROW_NUMBER() OVER (
+            PARTITION BY n.discussion_id
+            ORDER BY n.created_at DESC, n.id DESC
+        ) AS rn
+    FROM notes n
+    JOIN discussions d ON n.discussion_id = d.id
+    JOIN projects p ON n.project_id = p.id
+    LEFT JOIN issues i ON d.issue_id = i.id
+    LEFT JOIN merge_requests m ON d.merge_request_id = m.id
+    WHERE n.discussion_id IN ({placeholders})
+)
+SELECT * FROM ranked_expansion WHERE rn <= ?
+ORDER BY discussion_id, rn
 ```

-**Optimization**: If discussion count is small (<= 50), batch all discussion IDs into a
-single `WHERE d.id IN (?, ?, ...)` query with a secondary partition to split by discussion.
-For larger result sets, fall back to per-discussion queries to avoid huge IN clauses. This
-matches the existing note-loading pattern in `show.rs`.
+Group by `discussion_id` in Rust and attach notes arrays to the corresponding
+`DiscussionListRowJson`. This avoids per-discussion round-trips entirely — one query
+regardless of how many discussions are in the result set.
+
+The `{placeholders}` are the `id` values from the main discussion query result. Since
+the discussion count is already clamped by `--limit` (max 500), the IN clause size is
+bounded and safe.

 The returned `NoteListRow` rows reuse the same struct and `NoteListRowJson` conversion from
 workstream 1, ensuring identical note shape across all commands.
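The "group by `discussion_id` in Rust" step from the hunk above might look like the following. This is a sketch under stated assumptions: `RankedNote` is a cut-down stand-in for `NoteListRow` with only the columns needed for grouping, `group_notes` is a hypothetical name, and the rows are assumed to arrive already ordered by `discussion_id, rn` as the batched query specifies.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a row of the ranked query (the real `NoteListRow` has more columns).
#[derive(Debug, Clone, PartialEq)]
struct RankedNote {
    discussion_id: i64,
    note_id: i64,
    rn: u32,
}

/// Buckets rows by discussion. Rows are pushed in arrival order, so the
/// per-discussion `rn` ordering produced by the query is preserved.
fn group_notes(rows: Vec<RankedNote>) -> BTreeMap<i64, Vec<RankedNote>> {
    let mut grouped: BTreeMap<i64, Vec<RankedNote>> = BTreeMap::new();
    for row in rows {
        grouped.entry(row.discussion_id).or_default().push(row);
    }
    grouped
}
```

Discussions with no notes in the result simply have no entry in the map, so the caller can fall back to an empty `notes` array for them.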
@@ -1093,8 +1191,10 @@ fn handle_discussions(
    let conn = create_connection(&db_path)?;

    let order = if args.asc { "asc" } else { "desc" };
+    let effective_limit = args.limit.min(500);
+    let effective_include_notes = args.include_notes.min(20);
    let filters = DiscussionListFilters {
-        limit: args.limit,
+        limit: effective_limit,
        project: args.project,
        for_issue_iid: args.for_issue,
        for_mr_iid: args.for_mr,
@@ -1105,7 +1205,7 @@ fn handle_discussions(
        noteable_type: args.noteable_type,
        sort: args.sort,
        order: order.to_string(),
-        include_notes: args.include_notes,
+        include_notes: effective_include_notes,
    };

    let result = query_discussions(&conn, &filters, &config)?;
@@ -1122,6 +1222,8 @@ fn handle_discussions(
            start.elapsed().as_millis() as u64,
            args.fields.as_deref(),
            robot_mode,
+            effective_limit,
+            effective_include_notes,
        ),
        "jsonl" => print_list_discussions_jsonl(&result),
        "csv" => print_list_discussions_csv(&result),
@@ -1144,9 +1246,17 @@ pub fn print_list_discussions_json(
    elapsed_ms: u64,
    fields: Option<&[String]>,
    robot_mode: bool,
+    effective_limit: usize,
+    effective_include_notes: usize,
 ) {
    let json_result = DiscussionListResultJson::from(result);
-    let meta = RobotMeta { elapsed_ms };
+    let has_more = result.total_count as usize > json_result.showing;
+    let meta = serde_json::json!({
+        "elapsed_ms": elapsed_ms,
+        "effective_limit": effective_limit,
+        "effective_include_notes": effective_include_notes,
+        "has_more": has_more,
+    });
    let output = serde_json::json!({
        "ok": true,
        "data": json_result,
@@ -1325,7 +1435,7 @@ fn query_discussions_by_gitlab_id() {
 }
 ```

-#### Test 8: --include-notes populates notes array
+#### Test 8: --include-notes populates notes array via batched query

 ```rust
 #[test]
@@ -1381,6 +1491,69 @@ fn discussions_bridge_fields_forced_in_robot_mode() {
 }
 ```

+#### Test 10: Output guardrails clamp limit and include_notes
+
+```rust
+#[test]
+fn discussions_output_guardrails() {
+    // Verify that limit > 500 is clamped to 500
+    // Verify that include_notes > 20 is clamped to 20
+    // The clamps live in the handler (not the query layer); this only pins the min() arithmetic
+    assert_eq!(1000_usize.min(500), 500);
+    assert_eq!(50_usize.min(20), 20);
+    assert_eq!(5_usize.min(20), 5);  // below cap stays unchanged
+}
+```
+
+#### Test 11: Ambiguity guardrail fires for cross-project discussion ID
+
+```rust
+#[test]
+fn discussions_ambiguous_gitlab_discussion_id_across_projects() {
+    let conn = create_test_db();
+    insert_project(&conn, 1);  // "group/repo-a"
+    insert_project(&conn, 2);  // "group/repo-b"
+    // Insert discussions with same gitlab_discussion_id in different projects
+    insert_discussion(&conn, 1, "shared-id", 1, None, None, "Issue");
+    insert_discussion(&conn, 2, "shared-id", 2, None, None, "Issue");
+
+    let filters = DiscussionListFilters {
+        gitlab_discussion_id: Some("shared-id".to_string()),
+        project: None,  // no project specified
+        ..DiscussionListFilters::default()
+    };
+    let result = query_discussions(&conn, &filters, &Config::default());
+    assert!(result.is_err());
+    // Error should be Ambiguous with both project paths
+}
+```
+
+#### Test 12: has_more metadata is accurate
+
+```rust
+#[test]
+fn discussions_has_more_metadata() {
+    let conn = create_test_db();
+    insert_project(&conn, 1);
+    insert_mr(&conn, 1, 1, 99, "Test MR");
+    // Insert 5 discussions
+    for i in 1..=5 {
+        insert_discussion(&conn, i, &format!("disc-{i}"), 1, None, Some(1), "MergeRequest");
+        insert_note_in_discussion(&conn, i, 500 + i, i, 1, "alice", "note");
+    }
+
+    // Limit to 3 — should show has_more = true
+    let filters = DiscussionListFilters {
+        limit: 3,
+        ..DiscussionListFilters::default_for_mr(99)
+    };
+    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
+    assert_eq!(result.discussions.len(), 3);
+    assert_eq!(result.total_count, 5);
+    assert!(result.total_count as usize > result.discussions.len());  // has_more = 5 > 3 = true
+}
+```
+
 ---

 ## 4. Fix Robot-Docs Response Schemas
@@ -1410,7 +1583,7 @@ Replace:
 With:
 ```json
 "data": {
-    "notes": "[{id:int, gitlab_id:int, author_username:string, body:string?, note_type:string?, is_system:bool, created_at_iso:string, updated_at_iso:string, position_new_path:string?, position_new_line:int?, position_old_path:string?, position_old_line:int?, resolvable:bool, resolved:bool, resolved_by:string?, noteable_type:string?, parent_iid:int?, parent_title:string?, project_path:string, gitlab_discussion_id:string}]",
+    "notes": "[{id:int, gitlab_id:int, gitlab_note_id:int, author_username:string, body:string?, note_type:string?, is_system:bool, created_at_iso:string, updated_at_iso:string, position_new_path:string?, position_new_line:int?, position_old_path:string?, position_old_line:int?, resolvable:bool, resolved:bool, resolved_by:string?, noteable_type:string?, parent_iid:int?, parent_title:string?, project_path:string, gitlab_discussion_id:string}]",
    "total_count": "int",
    "showing": "int"
 }
@@ -1442,11 +1615,11 @@ With:
    "response_schema": {
        "ok": "bool",
        "data": {
-            "discussions": "[{gitlab_discussion_id:string, noteable_type:string, parent_iid:int?, parent_title:string?, project_path:string, individual_note:bool, note_count:int, first_author:string?, first_note_body_snippet:string?, first_note_at_iso:string, last_note_at_iso:string, resolvable:bool, resolved:bool, position_new_path:string?, position_new_line:int?, notes:[NoteListRowJson]?}]",
+            "discussions": "[{gitlab_discussion_id:string, noteable_type:string, parent_iid:int?, parent_title:string?, project_path:string, individual_note:bool, note_count:int, first_author:string?, first_note_body_snippet:string?, first_note_at_iso:string, last_note_at_iso:string, resolvable:bool, resolved:bool, position_new_path:string?, position_new_line:int?, notes:[{...NoteListRowJson fields...}]?}]",
            "total_count": "int",
            "showing": "int"
        },
-        "meta": {"elapsed_ms": "int"}
+        "meta": {"elapsed_ms": "int", "effective_limit": "int", "effective_include_notes": "int", "has_more": "bool"}
    }
 }
 ```
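
The `meta` fields in this schema can be derived from the requested values and the result counts alone. The sketch below shows one way to compute them. The `DiscussionsMeta` struct and `build_meta` function are hypothetical stand-ins for the `serde_json::json!` object the plan builds inline; the caps of 500 and 20 come from the plan itself.

```rust
/// Caps taken from the plan: `--limit` <= 500, `--include-notes` <= 20.
const MAX_LIMIT: usize = 500;
const MAX_INCLUDE_NOTES: usize = 20;

/// Hypothetical typed stand-in for the `meta` JSON object.
#[derive(Debug, PartialEq)]
struct DiscussionsMeta {
    elapsed_ms: u64,
    effective_limit: usize,
    effective_include_notes: usize,
    has_more: bool,
}

/// Clamp the requested values and derive the truncation flag.
fn build_meta(
    requested_limit: usize,
    requested_include_notes: usize,
    total_count: usize,
    showing: usize,
    elapsed_ms: u64,
) -> DiscussionsMeta {
    DiscussionsMeta {
        elapsed_ms,
        // Values above the cap are reduced; values at or below it pass through.
        effective_limit: requested_limit.min(MAX_LIMIT),
        effective_include_notes: requested_include_notes.min(MAX_INCLUDE_NOTES),
        // True when the query matched more rows than were returned.
        has_more: total_count > showing,
    }
}
```

Reporting the effective values lets an agent notice when its request was clamped, and `has_more` tells it whether narrowing the filters is worthwhile.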
@@ -1473,33 +1646,83 @@ notes within show discussions now include `gitlab_note_id`.
 "discussions: Thread-level discussion listing with gitlab_discussion_id for API integration"
 ```

-#### 4f. Add robot-docs contract tests
+#### 4f. Add robot-docs contract tests (field-set parity)

 **File**: `src/main.rs` (within `#[cfg(test)]` module)

-Add lightweight tests that parse the robot-docs JSON output and assert required Bridge
-Contract fields are present. This prevents schema drift — if someone adds a field to the
-struct but forgets to update robot-docs, the test fails.
+Add tests that parse the robot-docs JSON output and compare declared fields against actual
+serialized struct fields. This is stronger than string-contains checks — it catches schema
+drift in both directions (field added to struct but not docs, or field listed in docs but
+removed from struct).

 ```rust
-#[test]
-fn robot_docs_notes_schema_includes_bridge_fields() {
-    let docs = get_robot_docs_json();  // helper that builds the robot-docs Value
-    let notes_schema = docs["commands"]["notes"]["response_schema"]["data"]["notes"]
-        .as_str().unwrap();
-    assert!(notes_schema.contains("gitlab_discussion_id"));
-    assert!(notes_schema.contains("project_path"));
-    assert!(notes_schema.contains("parent_iid"));
+/// Parse compact schema string "field1:type, field2:type?" into a set of field names
+fn parse_schema_fields(schema: &str) -> HashSet<String> {
+    // Strip leading "[{" and trailing "}]", split on ", ", extract field name before ":"
+    schema.trim_start_matches("[{").trim_end_matches("}]")
+        .split(", ")
+        .filter_map(|f| f.split(':').next())
+        .map(|f| f.to_string())
+        .collect()
+}
+
+/// Get the actual serialized field names from a sample JSON struct
+fn sample_note_json_keys() -> HashSet<String> {
+    let row = NoteListRow { /* ... test defaults ... */ };
+    let json = NoteListRowJson::from(&row);
+    let value = serde_json::to_value(&json).unwrap();
+    value.as_object().unwrap().keys().cloned().collect()
 }

 #[test]
-fn robot_docs_discussions_schema_includes_bridge_fields() {
+fn robot_docs_notes_schema_matches_actual_fields() {
+    let docs = get_robot_docs_json();
+    let notes_schema = docs["commands"]["notes"]["response_schema"]["data"]["notes"]
+        .as_str().unwrap();
+    let declared = parse_schema_fields(notes_schema);
+    let actual = sample_note_json_keys();
+
+    // All bridge fields must be in both declared and actual
+    for bridge in &["gitlab_discussion_id", "project_path", "parent_iid", "gitlab_note_id"] {
+        assert!(declared.contains(*bridge), "robot-docs missing bridge field: {bridge}");
+        assert!(actual.contains(*bridge), "NoteListRowJson missing bridge field: {bridge}");
+    }
+
+    // Every declared field should exist in the actual struct (no phantom docs)
+    for field in &declared {
+        assert!(actual.contains(field),
+            "robot-docs declares '{field}' but NoteListRowJson doesn't serialize it");
+    }
+
+    // Every actual field should be declared in docs (no undocumented fields)
+    for field in &actual {
+        assert!(declared.contains(field),
+            "NoteListRowJson serializes '{field}' but robot-docs doesn't declare it");
+    }
+}
+
+#[test]
+fn robot_docs_discussions_schema_matches_actual_fields() {
    let docs = get_robot_docs_json();
    let disc_schema = docs["commands"]["discussions"]["response_schema"]["data"]["discussions"]
        .as_str().unwrap();
-    assert!(disc_schema.contains("gitlab_discussion_id"));
-    assert!(disc_schema.contains("project_path"));
-    assert!(disc_schema.contains("parent_iid"));
+    let declared = parse_schema_fields(disc_schema);
+    let actual = sample_discussion_json_keys();
+
+    for bridge in &["gitlab_discussion_id", "project_path", "parent_iid"] {
+        assert!(declared.contains(*bridge), "robot-docs missing bridge field: {bridge}");
+        assert!(actual.contains(*bridge), "DiscussionListRowJson missing bridge field: {bridge}");
+    }
+
+    for field in &declared {
+        assert!(actual.contains(field),
+            "robot-docs declares '{field}' but DiscussionListRowJson doesn't serialize it");
+    }
+
+    for field in &actual {
+        assert!(declared.contains(field),
+            "DiscussionListRowJson serializes '{field}' but robot-docs doesn't declare it");
+    }
 }

 #[test]
@@ -1536,6 +1759,7 @@ fn notes_handler_json_includes_bridge_fields() {

    for note in value["notes"].as_array().unwrap() {
        assert!(note.get("gitlab_discussion_id").is_some(), "missing gitlab_discussion_id");
+        assert!(note.get("gitlab_note_id").is_some(), "missing gitlab_note_id");
        assert!(note.get("project_path").is_some(), "missing project_path");
        assert!(note.get("parent_iid").is_some(), "missing parent_iid");
    }
@@ -1591,7 +1815,7 @@ once and reused by Changes 3 and 4.
 After all changes:

 1. An agent can run `lore -J notes --for-mr 3929 --contains "really do prefer"` and get
-   `gitlab_discussion_id` in the response
+   `gitlab_discussion_id` and `gitlab_note_id` in the response
 2. An agent can run `lore -J discussions --for-mr 3929 --resolution unresolved` to see all
   open threads with their IDs
 3. An agent can run `lore -J mrs 3929` and see `gitlab_discussion_id`, `resolvable`,
@@ -1600,19 +1824,28 @@ After all changes:
 4. `lore robot-docs` lists actual field names for all commands
 5. All existing tests still pass
 6. No clippy warnings (pedantic + nursery)
-7. Robot-docs contract tests pass, preventing future schema drift
+7. Robot-docs contract tests pass with field-set parity (not just string-contains), preventing
+   future schema drift in both directions
 8. Bridge Contract fields (`project_path`, `noteable_type`, `parent_iid`,
   `gitlab_discussion_id`, `gitlab_note_id`) are present in every applicable read payload
 9. Bridge Contract fields survive `--fields` filtering in robot mode (guardrail enforced)
 10. `--gitlab-discussion-id` filter works on both `notes` and `discussions` commands
-11. `--include-notes N` populates inline notes on `discussions` output
+11. `--include-notes N` populates inline notes on `discussions` output via single batched query
 12. CLI-level contract integration tests verify bridge fields through the full handler path
+13. `gitlab_note_id` is available in notes list output (alongside `gitlab_id` for back-compat)
+    and in show detail notes, providing a uniform field name across all commands
+14. Ambiguity guardrail fires when `--gitlab-discussion-id` matches multiple projects without
+    `--project` specified
+15. Output guardrails clamp `--limit` to 500 and `--include-notes` to 20; `meta` reports
+    effective values and `has_more` truncation flag
+16. Discussion and show queries use deterministic ordering (COALESCE + id tiebreaker) to
+    prevent unstable output during partial sync states

 ---

 ## Rejected Recommendations

- **Rename `id`→`note_id` and `gitlab_id`→`gitlab_note_id` in notes list output** — rejected because every existing consumer (agents, scripts, field presets) uses `id` and `gitlab_id`. The fields are unambiguous within the `notes` context. The show-command note structs are a different story (they have no IDs at all), so we add `gitlab_note_id` there where it's genuinely missing. Renaming established fields is churn without proportional benefit.
+- **Rename `id`→`note_id` and `gitlab_id`→`gitlab_note_id` in notes list output** — rejected because every existing consumer (agents, scripts, field presets) uses `id` and `gitlab_id`. The fields are unambiguous within the `notes` context. The show-command note structs are a different story (they have no IDs at all), so we add `gitlab_note_id` there where it's genuinely missing. Renaming established fields is churn without proportional benefit. (Updated: we now ADD `gitlab_note_id` as an alias alongside `gitlab_id` per iteration 3 feedback.)
 - **Keyset cursor-based pagination (`--cursor` flag)** — rejected because no existing lore command has pagination, agents use `--limit` effectively, and adding a cursor mechanism is significant scope creep. Tracked as potential future work if agents hit real pagination needs.
 - **Split `note_count` into `user_note_count`/`total_note_count` and rename `first_author` to `first_user_author`** — rejected because `note_count` already excludes system notes by query design (the `WHERE is_system = 0` / `CASE WHEN` filter), and `first_author` already targets the first non-system note. The current naming is clear and consistent with how `notes --include-system` works elsewhere.
 - **Match path filter on both `position_new_path` and `position_old_path`** — rejected because agents care about where code is *now* (new path), not where it was before a rename. Matching old paths adds complexity and returns confusing results for moved files.
@@ -1621,3 +1854,6 @@ After all changes:
 - **Structured robot-docs schema (JSON objects instead of string blobs)** — rejected because the current compact string format is intentionally token-efficient for agent consumption. Switching to nested JSON objects per field would significantly bloat robot-docs output. The string-based contract tests are sufficient — they test what agents actually parse. Agents already work with the inline field listing format used by `issues` and `mrs`.
 - **`bridge_contract` meta-section in robot-docs output** — rejected because agents don't need a separate meta-contract section; they need correct field listings per command, which we already provide. Adding a cross-cutting contract section to robot-docs adds documentation surface area without improving the agent workflow.
 - **Performance regression benchmark test (ignored by default)** — rejected because timing-based assertions are inherently flaky across machines, CI environments, and load conditions. Performance is validated through query plan analysis (EXPLAIN) and manual profiling, not hard-coded elapsed-time thresholds.
+- **Make `--discussion-id` and `--gitlab-discussion-id` aliases for the same backing filter** — rejected because they filter on different identifiers: `--discussion-id` takes the internal integer ID (existing behavior), while `--gitlab-discussion-id` takes the external string ID. These serve fundamentally different use cases (internal cross-referencing vs. external API bridging) and cannot be collapsed without breaking existing consumers.
+- **`--with-write-hints` flag for inline glab endpoint templates** — rejected because this couples lore's read surface to glab's API surface, violating the read/write split principle. The Bridge Contract gives agents the raw identifiers; constructing glab commands is the agent's responsibility. Adding endpoint templates would require lore to track glab API changes, creating an unnecessary maintenance burden.
+- **Show-command note ordering change (`ORDER BY COALESCE(position, ...), created_at, id`)** — rejected because show-command note ordering within a discussion thread is out of scope for this plan. The existing ordering works correctly for present data; the defensive COALESCE pattern is applied to discussion-level ordering where it matters for agent workflows.
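
Since the compact string format for robot-docs schemas is retained, the field-set parity tests from 4f depend on parsing that string reliably. The sketch below is one possible shape for that check, assuming the `[{name:type, ...}]` layout shown in the schemas above. It mirrors the plan's `parse_schema_fields` but uses an ordered set, and the `schema_drift` helper is a hypothetical addition that reports drift in both directions at once.

```rust
use std::collections::BTreeSet;

/// Parse a compact schema string such as "[{id:int, body:string?}]" into field names.
fn parse_schema_fields(schema: &str) -> BTreeSet<String> {
    schema
        .trim_start_matches("[{")
        .trim_end_matches("}]")
        .split(", ")
        // The field name is everything before the first ':'.
        .filter_map(|field| field.split(':').next())
        .map(|name| name.trim().to_string())
        .filter(|name| !name.is_empty())
        .collect()
}

/// Hypothetical helper: returns (declared but not serialized, serialized but not declared).
fn schema_drift(
    declared: &BTreeSet<String>,
    actual: &BTreeSet<String>,
) -> (Vec<String>, Vec<String>) {
    let phantom: Vec<String> = declared.difference(actual).cloned().collect();
    let undocumented: Vec<String> = actual.difference(declared).cloned().collect();
    (phantom, undocumented)
}
```

A test could assert that both vectors are empty and print them on failure, which names every drifting field in a single run rather than stopping at the first mismatch.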