Best non-rejected upgrades I’d make to this plan are below. They focus on reducing schema drift, making robot output safer to consume, and improving performance behavior at scale. 1. Add a shared contract model and field constants first (before workstreams 1-4) Rationale: Right now each command has its own structs and ad-hoc mapping. That is exactly how drift happens. A single contract definition reused by `notes`, `show`, `discussions`, and robot-docs gives compile-time coupling between output payloads and docs. It also makes future fields cheaper and safer to add. ```diff @@ Scope: Four workstreams, delivered in order: -1. Add `gitlab_discussion_id` to notes output -2. Add `gitlab_discussion_id` to show command discussion groups -3. Add a standalone `discussions` list command -4. Fix robot-docs to list actual field names instead of opaque type references +0. Introduce shared Bridge Contract model/constants used by notes/show/discussions/robot-docs +1. Add `gitlab_discussion_id` to notes output +2. Add `gitlab_discussion_id` to show command discussion groups +3. Add a standalone `discussions` list command +4. Fix robot-docs to list actual field names instead of opaque type references +## 0. Shared Contract Model (Cross-Cutting) +Define canonical required-field constants and shared mapping helpers, then consume them in: +- `src/cli/commands/list.rs` +- `src/cli/commands/show.rs` +- `src/cli/robot.rs` +- `src/main.rs` robot-docs builder +This removes duplicated field-name strings and prevents docs/output mismatch. ``` 2. Make bridge fields “non-droppable” in robot mode Rationale: The current plan adds fields, but `--fields` can still remove them. That breaks the core read/write bridge contract in exactly the workflows this change is trying to fix. In robot mode, contract fields should always be force-included. ```diff @@ ## Bridge Contract (Cross-Cutting) Every read payload that surfaces notes or discussions **MUST** include: - `project_path` - `noteable_type` - `parent_iid` - `gitlab_discussion_id` - `gitlab_note_id` (when note-level data is returned — i.e., in notes list and show detail) +### Field Filtering Guardrail +In robot mode, `filter_fields` must force-include Bridge Contract fields even when users pass a narrower `--fields` list. +Human/table mode keeps existing behavior. ``` 3. Replace correlated subqueries in `discussions` rollup with a single-pass window/aggregate pattern Rationale: Your CTE is better than naive fanout, but it still uses multiple correlated sub-selects per discussion for first author/body/path. At 200K+ discussions this can regress badly depending on cache/index state. A window-ranked `notes` CTE with grouped aggregates is usually faster and more predictable in SQLite. ```diff @@ #### 3c. SQL Query -Core query uses a CTE + rollup to avoid correlated subquery fanout on larger result sets: +Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated subqueries: -WITH filtered_discussions AS (...), -note_rollup AS ( - SELECT - n.discussion_id, - SUM(...) AS note_count, - (SELECT ... LIMIT 1) AS first_author, - (SELECT ... LIMIT 1) AS first_note_body, - (SELECT ... LIMIT 1) AS position_new_path, - (SELECT ... LIMIT 1) AS position_new_line - FROM notes n - ... -) +WITH filtered_discussions AS (...), +ranked_notes AS ( + SELECT + n.*, + ROW_NUMBER() OVER (PARTITION BY n.discussion_id ORDER BY n.position, n.id) AS rn + FROM notes n + WHERE n.discussion_id IN (SELECT id FROM filtered_discussions) +), +note_rollup AS ( + SELECT + discussion_id, + SUM(CASE WHEN is_system = 0 THEN 1 ELSE 0 END) AS note_count, + MAX(CASE WHEN rn = 1 AND is_system = 0 THEN author_username END) AS first_author, + MAX(CASE WHEN rn = 1 AND is_system = 0 THEN body END) AS first_note_body, + MAX(CASE WHEN position_new_path IS NOT NULL THEN position_new_path END) AS position_new_path, + MAX(CASE WHEN position_new_line IS NOT NULL THEN position_new_line END) AS position_new_line + FROM ranked_notes + GROUP BY discussion_id +) ``` 4. Add direct GitLab ID filters for deterministic bridging Rationale: Bridge workflows often start from one known ID. You already have `gitlab_note_id` in notes filters, but discussion filtering still looks internal-ID-centric. Add explicit GitLab-ID filters so agents do not need extra translation calls. ```diff @@ #### 3a. CLI Args pub struct DiscussionsArgs { + /// Filter by GitLab discussion ID + #[arg(long, help_heading = "Filters")] + pub gitlab_discussion_id: Option, @@ @@ #### 3d. Filters struct pub struct DiscussionListFilters { + pub gitlab_discussion_id: Option, @@ } ``` ```diff @@ ## 1. Add `gitlab_discussion_id` to Notes Output +#### 1g. Add `--gitlab-discussion-id` filter to notes +Allow filtering notes directly by GitLab thread ID (not only internal discussion ID). +This enables one-hop note retrieval from external references. ``` 5. Add optional note expansion to `discussions` for fewer round-trips Rationale: Today the agent flow is often `discussions -> show`. Optional embedded notes (`--include-notes N`) gives a fast path for “list unresolved threads with latest context” without forcing full show payloads. ```diff @@ ### Design lore -J discussions --for-mr 99 --resolution unresolved +lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2 @@ #### 3a. CLI Args + /// Include up to N latest notes per discussion (0 = none) + #[arg(long, default_value = "0", help_heading = "Output")] + pub include_notes: usize, ``` 6. Upgrade robot-docs from string blobs to structured schema + explicit contract block Rationale: `contains("gitlab_discussion_id")` tests on schema strings are brittle. A structured schema object gives machine-checked docs and reliable test assertions. Add a contract section for agent consumers. ```diff @@ ## 4. Fix Robot-Docs Response Schemas -#### 4a. Notes response_schema -Replace stringly-typed schema snippets... +#### 4a. Notes response_schema (structured) +Represent response fields as JSON objects (field -> type/nullable), not freeform strings. +#### 4g. Add `bridge_contract` section in robot-docs +Publish canonical required fields per entity: +- notes +- discussions +- show.discussions +- show.notes ``` 7. Strengthen validation: add CLI-level contract tests and perf guardrails Rationale: Most current tests are unit-level struct/query checks. Add end-to-end JSON contract tests via command handlers, plus a benchmark-style regression test (ignored by default) so performance work stays intentional. ```diff @@ ## Validation Criteria 8. Bridge Contract fields (...) are present in every applicable read payload +9. Contract fields remain present even with `--fields` in robot mode +10. `discussions` query meets performance guardrail on representative fixture (documented threshold) @@ ### Tests +#### Test: robot-mode fields cannot drop bridge contract keys +Run notes/discussions JSON output through `filter_fields` path and assert required keys remain. + +#### Test: CLI contract integration +Invoke command handlers for `notes`, `discussions`, `mrs `, parse JSON, assert required keys and types. + +#### Test (ignored): large-fixture performance regression +Generate representative fixture and assert `query_discussions` stays under target elapsed time. ``` If you want, I can now produce a full “v2 plan” document that applies these diffs end-to-end (including revised delivery order and complete updated sections).