gitlore/docs/plan-expose-discussion-ids.feedback-2.md at 4d41d74ea7a3da12795bc3b5a7b2019106e82a0a

Files

teernisse a1bca10408 feat(cli): implement 'lore file-history' command (bd-z94)

Adds file-history command showing which MRs touched a file, with:
- Rename chain resolution via BFS (resolve_rename_chain from bd-1yx)
- DiffNote discussion snippets with --discussions flag
- --merged filter, --no-follow-renames, -n limit
- Human output with styled MR list and rename chain display
- Robot JSON output with {ok, data, meta} envelope
- Autocorrect registry and robot-docs manifest entry
- Fixes pre-existing --no-status missing from sync autocorrect registry

2026-02-17 12:57:56 -05:00

7.5 KiB

Raw Blame History

Best non-rejected upgrades I’d make to this plan are below. They focus on reducing schema drift, making robot output safer to consume, and improving performance behavior at scale.

Add a shared contract model and field constants first (before workstreams 1-4)
Rationale: Right now each command has its own structs and ad-hoc mapping. That is exactly how drift happens. A single contract definition reused by notes, show, discussions, and robot-docs gives compile-time coupling between output payloads and docs. It also makes future fields cheaper and safer to add.

@@ Scope: Four workstreams, delivered in order:
-1. Add `gitlab_discussion_id` to notes output
-2. Add `gitlab_discussion_id` to show command discussion groups
-3. Add a standalone `discussions` list command
-4. Fix robot-docs to list actual field names instead of opaque type references
+0. Introduce shared Bridge Contract model/constants used by notes/show/discussions/robot-docs
+1. Add `gitlab_discussion_id` to notes output
+2. Add `gitlab_discussion_id` to show command discussion groups
+3. Add a standalone `discussions` list command
+4. Fix robot-docs to list actual field names instead of opaque type references

+## 0. Shared Contract Model (Cross-Cutting)
+Define canonical required-field constants and shared mapping helpers, then consume them in:
+- `src/cli/commands/list.rs`
+- `src/cli/commands/show.rs`
+- `src/cli/robot.rs`
+- `src/main.rs` robot-docs builder
+This removes duplicated field-name strings and prevents docs/output mismatch.

Make bridge fields “non-droppable” in robot mode
Rationale: The current plan adds fields, but --fields can still remove them. That breaks the core read/write bridge contract in exactly the workflows this change is trying to fix. In robot mode, contract fields should always be force-included.

@@ ## Bridge Contract (Cross-Cutting)
 Every read payload that surfaces notes or discussions **MUST** include:
 - `project_path`
 - `noteable_type`
 - `parent_iid`
 - `gitlab_discussion_id`
 - `gitlab_note_id` (when note-level data is returned — i.e., in notes list and show detail)

+### Field Filtering Guardrail
+In robot mode, `filter_fields` must force-include Bridge Contract fields even when users pass a narrower `--fields` list.
+Human/table mode keeps existing behavior.

Replace correlated subqueries in discussions rollup with a single-pass window/aggregate pattern
Rationale: Your CTE is better than naive fanout, but it still uses multiple correlated sub-selects per discussion for first author/body/path. At 200K+ discussions this can regress badly depending on cache/index state. A window-ranked notes CTE with grouped aggregates is usually faster and more predictable in SQLite.

@@ #### 3c. SQL Query
-Core query uses a CTE + rollup to avoid correlated subquery fanout on larger result sets:
+Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated subqueries:

-WITH filtered_discussions AS (...),
-note_rollup AS (
-    SELECT
-        n.discussion_id,
-        SUM(...) AS note_count,
-        (SELECT ... LIMIT 1) AS first_author,
-        (SELECT ... LIMIT 1) AS first_note_body,
-        (SELECT ... LIMIT 1) AS position_new_path,
-        (SELECT ... LIMIT 1) AS position_new_line
-    FROM notes n
-    ...
-)
+WITH filtered_discussions AS (...),
+ranked_notes AS (
+    SELECT
+        n.*,
+        ROW_NUMBER() OVER (PARTITION BY n.discussion_id ORDER BY n.position, n.id) AS rn
+    FROM notes n
+    WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
+),
+note_rollup AS (
+    SELECT
+        discussion_id,
+        SUM(CASE WHEN is_system = 0 THEN 1 ELSE 0 END) AS note_count,
+        MAX(CASE WHEN rn = 1 AND is_system = 0 THEN author_username END) AS first_author,
+        MAX(CASE WHEN rn = 1 AND is_system = 0 THEN body END) AS first_note_body,
+        MAX(CASE WHEN position_new_path IS NOT NULL THEN position_new_path END) AS position_new_path,
+        MAX(CASE WHEN position_new_line IS NOT NULL THEN position_new_line END) AS position_new_line
+    FROM ranked_notes
+    GROUP BY discussion_id
+)

Add direct GitLab ID filters for deterministic bridging
Rationale: Bridge workflows often start from one known ID. You already have gitlab_note_id in notes filters, but discussion filtering still looks internal-ID-centric. Add explicit GitLab-ID filters so agents do not need extra translation calls.

@@ #### 3a. CLI Args
 pub struct DiscussionsArgs {
+    /// Filter by GitLab discussion ID
+    #[arg(long, help_heading = "Filters")]
+    pub gitlab_discussion_id: Option<String>,
@@

@@ #### 3d. Filters struct
 pub struct DiscussionListFilters {
+    pub gitlab_discussion_id: Option<String>,
@@
 }

@@ ## 1. Add `gitlab_discussion_id` to Notes Output
+#### 1g. Add `--gitlab-discussion-id` filter to notes
+Allow filtering notes directly by GitLab thread ID (not only internal discussion ID).
+This enables one-hop note retrieval from external references.

Add optional note expansion to discussions for fewer round-trips
Rationale: Today the agent flow is often discussions -> show. Optional embedded notes (--include-notes N) gives a fast path for “list unresolved threads with latest context” without forcing full show payloads.

@@ ### Design
 lore -J discussions --for-mr 99 --resolution unresolved
+lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2

@@ #### 3a. CLI Args
+    /// Include up to N latest notes per discussion (0 = none)
+    #[arg(long, default_value = "0", help_heading = "Output")]
+    pub include_notes: usize,

Upgrade robot-docs from string blobs to structured schema + explicit contract block
Rationale: contains("gitlab_discussion_id") tests on schema strings are brittle. A structured schema object gives machine-checked docs and reliable test assertions. Add a contract section for agent consumers.

@@ ## 4. Fix Robot-Docs Response Schemas
-#### 4a. Notes response_schema
-Replace stringly-typed schema snippets...
+#### 4a. Notes response_schema (structured)
+Represent response fields as JSON objects (field -> type/nullable), not freeform strings.

+#### 4g. Add `bridge_contract` section in robot-docs
+Publish canonical required fields per entity:
+- notes
+- discussions
+- show.discussions
+- show.notes

Strengthen validation: add CLI-level contract tests and perf guardrails
Rationale: Most current tests are unit-level struct/query checks. Add end-to-end JSON contract tests via command handlers, plus a benchmark-style regression test (ignored by default) so performance work stays intentional.

@@ ## Validation Criteria
 8. Bridge Contract fields (...) are present in every applicable read payload
+9. Contract fields remain present even with `--fields` in robot mode
+10. `discussions` query meets performance guardrail on representative fixture (documented threshold)

@@ ### Tests
+#### Test: robot-mode fields cannot drop bridge contract keys
+Run notes/discussions JSON output through `filter_fields` path and assert required keys remain.
+
+#### Test: CLI contract integration
+Invoke command handlers for `notes`, `discussions`, `mrs <iid>`, parse JSON, assert required keys and types.
+
+#### Test (ignored): large-fixture performance regression
+Generate representative fixture and assert `query_discussions` stays under target elapsed time.

If you want, I can now produce a full “v2 plan” document that applies these diffs end-to-end (including revised delivery order and complete updated sections).

7.5 KiB Raw Blame History Unescape Escape

7.5 KiB

Raw Blame History