gitlore/docs/plan-expose-discussion-ids.feedback-2.md

The best non-rejected upgrades I’d make to this plan are below. They focus on reducing schema drift, making robot output safer to consume, and keeping query performance predictable at scale.

1. Add a shared contract model and field constants first (before workstreams 1-4)
Rationale: Right now each command has its own structs and ad-hoc mapping. That is exactly how drift happens. A single contract definition reused by `notes`, `show`, `discussions`, and robot-docs gives compile-time coupling between output payloads and docs. It also makes future fields cheaper and safer to add.

```diff
@@ Scope: Four workstreams, delivered in order:
-1. Add `gitlab_discussion_id` to notes output
-2. Add `gitlab_discussion_id` to show command discussion groups
-3. Add a standalone `discussions` list command
-4. Fix robot-docs to list actual field names instead of opaque type references
+0. Introduce shared Bridge Contract model/constants used by notes/show/discussions/robot-docs
+1. Add `gitlab_discussion_id` to notes output
+2. Add `gitlab_discussion_id` to show command discussion groups
+3. Add a standalone `discussions` list command
+4. Fix robot-docs to list actual field names instead of opaque type references

+## 0. Shared Contract Model (Cross-Cutting)
+Define canonical required-field constants and shared mapping helpers, then consume them in:
+- `src/cli/commands/list.rs`
+- `src/cli/commands/show.rs`
+- `src/cli/robot.rs`
+- `src/main.rs` robot-docs builder
+This removes duplicated field-name strings and prevents docs/output mismatch.
```
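As a rough illustration of what step 0 could look like, here is a minimal sketch. The field names come from the Bridge Contract list in the plan; the module location, the constant names, the `BridgeFields` struct, and the field types are all my assumptions, not existing code.

```rust
/// Hypothetical shared module (for example `src/cli/contract.rs`; the path is an assumption).
/// Field-name strings are taken from the plan's Bridge Contract list.
pub const PROJECT_PATH: &str = "project_path";
pub const NOTEABLE_TYPE: &str = "noteable_type";
pub const PARENT_IID: &str = "parent_iid";
pub const GITLAB_DISCUSSION_ID: &str = "gitlab_discussion_id";
pub const GITLAB_NOTE_ID: &str = "gitlab_note_id";

/// Keys required on discussion-level payloads.
pub const DISCUSSION_CONTRACT: &[&str] =
    &[PROJECT_PATH, NOTEABLE_TYPE, PARENT_IID, GITLAB_DISCUSSION_ID];

/// Keys required when note-level data is returned.
pub const NOTE_CONTRACT: &[&str] = &[
    PROJECT_PATH,
    NOTEABLE_TYPE,
    PARENT_IID,
    GITLAB_DISCUSSION_ID,
    GITLAB_NOTE_ID,
];

/// Illustrative shared mapping target; the concrete types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct BridgeFields {
    pub project_path: String,
    pub noteable_type: String,
    pub parent_iid: i64,
    pub gitlab_discussion_id: String,
    pub gitlab_note_id: Option<i64>,
}

impl BridgeFields {
    /// Returns the contract keys that apply to this payload:
    /// the note-level list when a note ID is present, otherwise the discussion-level list.
    pub fn required_keys(&self) -> &'static [&'static str] {
        if self.gitlab_note_id.is_some() {
            NOTE_CONTRACT
        } else {
            DISCUSSION_CONTRACT
        }
    }
}
```

With something like this in place, the robot-docs builder and the output structs can both read from `NOTE_CONTRACT` and `DISCUSSION_CONTRACT` instead of repeating string literals.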

2. Make bridge fields “non-droppable” in robot mode
Rationale: The current plan adds fields, but `--fields` can still remove them. That breaks the core read/write bridge contract in exactly the workflows this change is trying to fix. In robot mode, contract fields should always be force-included.

```diff
@@ ## Bridge Contract (Cross-Cutting)
 Every read payload that surfaces notes or discussions **MUST** include:
 - `project_path`
 - `noteable_type`
 - `parent_iid`
 - `gitlab_discussion_id`
 - `gitlab_note_id` (when note-level data is returned — i.e., in notes list and show detail)

+### Field Filtering Guardrail
+In robot mode, `filter_fields` must force-include Bridge Contract fields even when users pass a narrower `--fields` list.
+Human/table mode keeps existing behavior.
```
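One way to implement the guardrail is to widen the requested field list before it reaches `filter_fields`. The sketch below assumes a hypothetical `effective_fields` helper and a `robot_mode` flag; I have not checked how `filter_fields` is actually wired, so treat the signature as illustrative.

```rust
/// Bridge Contract keys from the plan; where this list lives is an assumption.
const BRIDGE_FIELDS: &[&str] = &[
    "project_path",
    "noteable_type",
    "parent_iid",
    "gitlab_discussion_id",
    "gitlab_note_id",
];

/// Hypothetical helper: expands a user-supplied `--fields` list so that
/// contract keys survive filtering in robot mode. Human/table mode gets
/// the requested list back unchanged.
pub fn effective_fields(requested: &[String], robot_mode: bool) -> Vec<String> {
    let mut out: Vec<String> = requested.to_vec();
    if robot_mode {
        for field in BRIDGE_FIELDS {
            // Append only the contract keys the user did not already ask for.
            if !out.iter().any(|f| f.as_str() == *field) {
                out.push((*field).to_string());
            }
        }
    }
    out
}
```

Preserving the user's order and appending missing keys at the end keeps the change invisible to callers that already request the contract fields. A key such as `gitlab_note_id` that is absent from a discussion-level payload would simply not appear in the output.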

3. Replace correlated subqueries in `discussions` rollup with a single-pass window/aggregate pattern
Rationale: Your CTE is better than naive fanout, but it still uses multiple correlated sub-selects per discussion for first author/body/path. At 200K+ discussions this can regress badly depending on cache/index state. A window-ranked `notes` CTE with grouped aggregates is usually faster and more predictable in SQLite.

```diff
@@ #### 3c. SQL Query
-Core query uses a CTE + rollup to avoid correlated subquery fanout on larger result sets:
+Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated subqueries:

-WITH filtered_discussions AS (...),
-note_rollup AS (
-    SELECT
-        n.discussion_id,
-        SUM(...) AS note_count,
-        (SELECT ... LIMIT 1) AS first_author,
-        (SELECT ... LIMIT 1) AS first_note_body,
-        (SELECT ... LIMIT 1) AS position_new_path,
-        (SELECT ... LIMIT 1) AS position_new_line
-    FROM notes n
-    ...
-)
+WITH filtered_discussions AS (...),
+ranked_notes AS (
+    SELECT
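+        n.*,
+        -- Rank within (discussion, is_system) so rn = 1 is the first user note even if a system note precedes it
+        ROW_NUMBER() OVER (PARTITION BY n.discussion_id, n.is_system ORDER BY n.position, n.id) AS rn,
+        -- Rank positioned notes separately so path and line are taken from the same note
+        ROW_NUMBER() OVER (PARTITION BY n.discussion_id, n.position_new_path IS NOT NULL ORDER BY n.position, n.id) AS pos_rn
+    FROM notes n
+    WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
+),
+note_rollup AS (
+    SELECT
+        discussion_id,
+        SUM(CASE WHEN is_system = 0 THEN 1 ELSE 0 END) AS note_count,
+        MAX(CASE WHEN rn = 1 AND is_system = 0 THEN author_username END) AS first_author,
+        MAX(CASE WHEN rn = 1 AND is_system = 0 THEN body END) AS first_note_body,
+        MAX(CASE WHEN pos_rn = 1 AND position_new_path IS NOT NULL THEN position_new_path END) AS position_new_path,
+        MAX(CASE WHEN pos_rn = 1 AND position_new_path IS NOT NULL THEN position_new_line END) AS position_new_line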
+    FROM ranked_notes
+    GROUP BY discussion_id
+)
```
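To pin down what the rollup is supposed to return, an in-memory reference implementation can serve as a test oracle for the SQL. The sketch below assumes the intended semantics are "first non-system note" for author and body, and "first note that carries a position" for path and line; the original sub-selects are elided in the plan, so that reading is my assumption. `NoteRow` and `Rollup` are illustrative types, with column names borrowed from the query.

```rust
use std::collections::BTreeMap;

/// Illustrative row shape; column names follow the SQL, types are assumptions.
#[derive(Debug, Clone)]
pub struct NoteRow {
    pub id: i64,
    pub discussion_id: i64,
    pub position: i64,
    pub is_system: bool,
    pub author_username: String,
    pub body: String,
    pub position_new_path: Option<String>,
    pub position_new_line: Option<i64>,
}

#[derive(Debug, Default, PartialEq)]
pub struct Rollup {
    pub note_count: usize,
    pub first_author: Option<String>,
    pub first_note_body: Option<String>,
    pub position_new_path: Option<String>,
    pub position_new_line: Option<i64>,
}

/// Sorts notes by (discussion_id, position, id), then folds them in a single pass.
pub fn rollup(mut notes: Vec<NoteRow>) -> BTreeMap<i64, Rollup> {
    notes.sort_by_key(|n| (n.discussion_id, n.position, n.id));
    let mut out: BTreeMap<i64, Rollup> = BTreeMap::new();
    for n in notes {
        let r = out.entry(n.discussion_id).or_default();
        if !n.is_system {
            r.note_count += 1;
            // The first non-system note supplies author and body.
            if r.first_author.is_none() {
                r.first_author = Some(n.author_username.clone());
                r.first_note_body = Some(n.body.clone());
            }
        }
        // Path and line are taken together from the first positioned note.
        if r.position_new_path.is_none() && n.position_new_path.is_some() {
            r.position_new_path = n.position_new_path.clone();
            r.position_new_line = n.position_new_line;
        }
    }
    out
}
```

Comparing this oracle against the query output on a small fixture catches cases such as a leading system note or several positioned notes in one thread.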

4. Add direct GitLab ID filters for deterministic bridging
Rationale: Bridge workflows often start from one known ID. You already have `gitlab_note_id` in notes filters, but discussion filtering still looks internal-ID-centric. Add explicit GitLab-ID filters so agents do not need extra translation calls.

```diff
@@ #### 3a. CLI Args
 pub struct DiscussionsArgs {
+    /// Filter by GitLab discussion ID
+    #[arg(long, help_heading = "Filters")]
+    pub gitlab_discussion_id: Option<String>,
@@

@@ #### 3d. Filters struct
 pub struct DiscussionListFilters {
+    pub gitlab_discussion_id: Option<String>,
@@
 }
```

```diff
@@ ## 1. Add `gitlab_discussion_id` to Notes Output
+#### 1g. Add `--gitlab-discussion-id` filter to notes
+Allow filtering notes directly by GitLab thread ID (not only internal discussion ID).
+This enables one-hop note retrieval from external references.
```
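A minimal sketch of how the new filter might feed the query builder follows. Only `gitlab_discussion_id` comes from the diff above; the `for_mr_iid` field, the `SqlParam` enum, the `build_where` helper, and the table aliases `d` and `mr` are hypothetical stand-ins for whatever the real query code uses.

```rust
/// Hypothetical parameter type for positional SQL placeholders.
#[derive(Debug, Clone, PartialEq)]
pub enum SqlParam {
    Text(String),
    Int(i64),
}

/// Illustrative subset of the filters struct; only `gitlab_discussion_id` is from the plan.
#[derive(Debug, Default)]
pub struct DiscussionListFilters {
    pub gitlab_discussion_id: Option<String>,
    pub for_mr_iid: Option<i64>, // assumed existing filter, shown for context
}

/// Builds a WHERE fragment plus its positional parameters.
/// Table aliases and column names are assumptions about the schema.
pub fn build_where(filters: &DiscussionListFilters) -> (String, Vec<SqlParam>) {
    let mut clauses: Vec<&str> = Vec::new();
    let mut params: Vec<SqlParam> = Vec::new();

    if let Some(id) = &filters.gitlab_discussion_id {
        clauses.push("d.gitlab_discussion_id = ?");
        params.push(SqlParam::Text(id.clone()));
    }
    if let Some(iid) = filters.for_mr_iid {
        clauses.push("mr.iid = ?");
        params.push(SqlParam::Int(iid));
    }

    let sql = if clauses.is_empty() {
        String::new()
    } else {
        format!("WHERE {}", clauses.join(" AND "))
    };
    (sql, params)
}
```

Binding the ID as a parameter rather than interpolating it keeps the filter safe for IDs that arrive from external references.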

5. Add optional note expansion to `discussions` for fewer round-trips
Rationale: Today the agent flow is often `discussions -> show`. Optional embedded notes (`--include-notes N`) give a fast path for “list unresolved threads with latest context” without forcing full show payloads.

```diff
@@ ### Design
 lore -J discussions --for-mr 99 --resolution unresolved
+lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2

@@ #### 3a. CLI Args
+    /// Include up to N latest notes per discussion (0 = none)
+    #[arg(long, default_value = "0", help_heading = "Output")]
+    pub include_notes: usize,
```
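The selection logic behind `--include-notes N` can stay small. The sketch below assumes "latest" means newest first by creation time with the note ID as a tie-breaker; `NoteSummary`, its fields, and `latest_notes` are hypothetical names for illustration.

```rust
/// Illustrative note shape; field names and types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct NoteSummary {
    pub gitlab_note_id: i64,
    pub created_at: i64, // assumed epoch timestamp
    pub body: String,
}

/// Returns up to `limit` most recent notes, newest first.
/// A limit of 0 yields an empty list, matching the "0 = none" default.
pub fn latest_notes(mut notes: Vec<NoteSummary>, limit: usize) -> Vec<NoteSummary> {
    notes.sort_by(|a, b| {
        b.created_at
            .cmp(&a.created_at)
            .then(b.gitlab_note_id.cmp(&a.gitlab_note_id))
    });
    notes.truncate(limit);
    notes
}
```

In practice the limit would more likely be pushed into SQL so large threads are never fully loaded; this version only fixes the expected ordering and the zero-limit behavior.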

6. Upgrade robot-docs from string blobs to structured schema + explicit contract block
Rationale: `contains("gitlab_discussion_id")` tests on schema strings are brittle. A structured schema object gives machine-checked docs and reliable test assertions. Add a contract section for agent consumers.

```diff
@@ ## 4. Fix Robot-Docs Response Schemas
-#### 4a. Notes response_schema
-Replace stringly-typed schema snippets...
+#### 4a. Notes response_schema (structured)
+Represent response fields as JSON objects (field -> type/nullable), not freeform strings.

+#### 4g. Add `bridge_contract` section in robot-docs
+Publish canonical required fields per entity:
+- notes
+- discussions
+- show.discussions
+- show.notes
```
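To show why a structured schema makes the tests less brittle, here is a minimal sketch. `FieldSpec`, `notes_schema`, and `missing_contract_fields` are hypothetical names, and the type labels (`"string"`, `"integer"`) are my guesses at how the fields would be described, not the actual robot-docs format.

```rust
use std::collections::BTreeMap;

/// Hypothetical structured description of one response field.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldSpec {
    pub ty: &'static str,
    pub nullable: bool,
}

/// Illustrative notes schema; the type labels are assumptions.
pub fn notes_schema() -> BTreeMap<&'static str, FieldSpec> {
    let mut s = BTreeMap::new();
    s.insert("project_path", FieldSpec { ty: "string", nullable: false });
    s.insert("noteable_type", FieldSpec { ty: "string", nullable: false });
    s.insert("parent_iid", FieldSpec { ty: "integer", nullable: false });
    s.insert("gitlab_discussion_id", FieldSpec { ty: "string", nullable: false });
    s.insert("gitlab_note_id", FieldSpec { ty: "integer", nullable: false });
    s
}

/// Returns the required keys that the schema does not declare.
pub fn missing_contract_fields(
    schema: &BTreeMap<&'static str, FieldSpec>,
    required: &[&'static str],
) -> Vec<&'static str> {
    required
        .iter()
        .copied()
        .filter(|k| !schema.contains_key(k))
        .collect()
}
```

A test can then assert that `missing_contract_fields` returns an empty list for each entity, which fails with the exact missing key names instead of a bare substring mismatch.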

7. Strengthen validation: add CLI-level contract tests and perf guardrails
Rationale: Most current tests are unit-level struct/query checks. Add end-to-end JSON contract tests via command handlers, plus a benchmark-style regression test (ignored by default) so performance work stays intentional.

```diff
@@ ## Validation Criteria
 8. Bridge Contract fields (...) are present in every applicable read payload
+9. Contract fields remain present even with `--fields` in robot mode
+10. `discussions` query meets performance guardrail on representative fixture (documented threshold)

@@ ### Tests
+#### Test: robot-mode fields cannot drop bridge contract keys
+Run notes/discussions JSON output through `filter_fields` path and assert required keys remain.
+
+#### Test: CLI contract integration
+Invoke command handlers for `notes`, `discussions`, `mrs <iid>`, parse JSON, assert required keys and types.
+
+#### Test (ignored): large-fixture performance regression
+Generate representative fixture and assert `query_discussions` stays under target elapsed time.
```
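For the performance guardrail, a small timing helper is enough to make the threshold explicit. The sketch below uses only the standard library; `assert_within` is a hypothetical helper, and the call to `query_discussions` in the doc comment is shown only as intended usage, since I do not know its real signature.

```rust
use std::time::{Duration, Instant};

/// Runs `f`, panics if it exceeds `budget`, and returns its result.
///
/// Intended usage inside a test marked `#[test]` and `#[ignore]`, for example:
/// `assert_within(budget, || query_discussions(/* fixture and filters */))`
/// where the argument list is an assumption.
pub fn assert_within<T>(budget: Duration, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let out = f();
    let elapsed = start.elapsed();
    assert!(
        elapsed <= budget,
        "operation took {:?}, budget was {:?}",
        elapsed,
        budget
    );
    out
}
```

Wall-clock assertions are noisy on shared CI runners, which is the main reason to keep this test ignored by default and run it deliberately with `cargo test -- --ignored`.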

If you want, I can now produce a full “v2 plan” document that applies these diffs end-to-end (including revised delivery order and complete updated sections).