gitlore/docs/plan-expose-discussion-ids.md
teernisse 171260a772 feat(cli): implement 'lore trace' command (bd-2n4, bd-9dd)
Gate 5 Code Trace - Tier 1 (API-only, no git blame).
Answers 'Why was this code introduced?' by building
file -> MR -> issue -> discussion chains.

New files:
- src/core/trace.rs: run_trace() query logic with rename-aware
  path resolution, entity_reference-based issue linking, and
  DiffNote discussion extraction
- src/core/trace_tests.rs: 7 unit tests for query logic
- src/cli/commands/trace.rs: CLI command with human output,
  robot JSON output, and :line suffix parsing (5 tests)

Human output shows full content (no truncation).
Robot JSON truncates discussion bodies to 500 chars for token efficiency.

Wiring:
- TraceArgs + Commands::Trace in cli/mod.rs
- handle_trace in main.rs
- VALID_COMMANDS + robot-docs manifest entry
- COMMAND_FLAGS autocorrect registry entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 14:57:21 -05:00

plan: true
title:
status: iterating
iteration: 4
target_iterations: 8
beads_revision: 0
related_plans:
created: 2026-02-17
updated: 2026-02-17

Plan: Expose Discussion IDs Across the Read Surface

Problem: Agents can't bridge from lore's read output to glab's write API because gitlab_discussion_id — stored in the DB — is never exposed in any output. The read/write split contract requires lore to emit every identifier an agent needs to construct a glab write command.

Scope: Four workstreams, delivered in order:

  1. Add gitlab_discussion_id to notes output
  2. Add gitlab_discussion_id to show command discussion groups
  3. Add a standalone discussions list command
  4. Fix robot-docs to list actual field names instead of opaque type references

Bridge Contract (Cross-Cutting)

Every read payload that surfaces notes or discussions MUST include:

  • project_path
  • gitlab_project_id
  • noteable_type
  • parent_iid
  • gitlab_discussion_id
  • gitlab_note_id (when note-level data is returned — i.e., in notes list and show detail)

Rationale for gitlab_project_id: project_path is human-readable but mutable — project renames and group transfers change it. gitlab_project_id is immutable and is the identifier agents need for glab api /projects/:id/... write calls. Including both ensures agents can construct API calls without a separate project-ID lookup, even after path changes.

Back-compat rule: Note payloads in the notes list command continue exposing gitlab_id for existing consumers, but MUST also expose gitlab_note_id with the same value. This ensures agents can use a single field name (gitlab_note_id) across all commands — notes, show, and discussions --include-notes — without special-casing by command.

This contract exists so agents can deterministically construct glab api write calls without cross-referencing multiple commands. Each workstream below must satisfy these fields in its output.
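To make the contract concrete, here is a minimal sketch of how an agent might turn the bridge fields into a reply endpoint. The BridgeFields struct and reply_endpoint function are hypothetical names used for illustration only; the path shape follows GitLab's REST discussions API for issues and merge requests.

```rust
/// Bridge Contract fields as they might appear on one discussion row.
/// (Hypothetical struct for illustration; not a type from the codebase.)
pub struct BridgeFields {
    pub gitlab_project_id: i64,
    pub noteable_type: String, // "Issue" or "MergeRequest"
    pub parent_iid: i64,
    pub gitlab_discussion_id: String,
}

/// Build the REST path for adding a note to an existing discussion thread.
/// Returns None for noteable types this sketch does not handle.
pub fn reply_endpoint(b: &BridgeFields) -> Option<String> {
    let collection = match b.noteable_type.as_str() {
        "Issue" => "issues",
        "MergeRequest" => "merge_requests",
        _ => return None,
    };
    Some(format!(
        "projects/{}/{}/{}/discussions/{}/notes",
        b.gitlab_project_id, collection, b.parent_iid, b.gitlab_discussion_id
    ))
}
```

Because the path uses gitlab_project_id rather than project_path, it stays valid after a project rename or group transfer. An agent would pass the resulting path to glab api as a POST with a body field.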

Field Filtering Guardrail

In robot mode, filter_fields MUST force-include Bridge Contract fields even when the caller passes a narrower --fields list. This prevents agents from accidentally stripping the identifiers they need for write operations.

Implementation: Add a BRIDGE_FIELDS constant map per entity type. In filter_fields(), when operating in robot mode, union the caller's requested fields with the bridge set before filtering. Human/table mode keeps existing behavior (no forced fields).

// src/cli/robot.rs
const BRIDGE_FIELDS_NOTES: &[&str] = &[
    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
    "gitlab_discussion_id", "gitlab_note_id",
];
const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
    "gitlab_discussion_id",
];

In filter_fields, when entity is "notes" or "discussions", merge the bridge set into the requested fields before filtering the JSON value. This is a ~5-line change to the existing function.
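The merge step could look roughly like the sketch below. It is dependency-free and only computes the effective field list; the real filter_fields operates on a JSON value, and the helper name effective_fields and its robot_mode parameter are assumptions made for illustration.

```rust
const BRIDGE_FIELDS_NOTES: &[&str] = &[
    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
    "gitlab_discussion_id", "gitlab_note_id",
];
const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
    "gitlab_discussion_id",
];

/// Return the field list that filtering should actually apply.
/// In robot mode the bridge set for the entity is appended to the caller's
/// list (without duplicates); otherwise the list is returned unchanged.
pub fn effective_fields(entity: &str, requested: &[String], robot_mode: bool) -> Vec<String> {
    let mut fields: Vec<String> = requested.to_vec();
    if !robot_mode {
        return fields;
    }
    let bridge: &[&str] = match entity {
        "notes" => BRIDGE_FIELDS_NOTES,
        "discussions" => BRIDGE_FIELDS_DISCUSSIONS,
        _ => &[],
    };
    for name in bridge {
        if !fields.iter().any(|f| f.as_str() == *name) {
            fields.push((*name).to_string());
        }
    }
    fields
}
```

Appending rather than replacing keeps the caller's requested order intact, so existing consumers see their fields first and the bridge fields after.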

Ambiguity Guardrail

When filtering by gitlab_discussion_id (on either notes or discussions commands) without --project, if the query matches discussions in multiple projects:

  • Return an Ambiguous error (exit code 18, matching existing convention)
  • Include matching project paths in the error message
  • Suggest retry with --project <path>

Implementation: Run a preflight distinct-project check before the main list query executes its LIMIT. This is critical because a post-query check on the paginated result set can silently miss cross-project ambiguity when LIMIT truncates results to rows from a single project. The preflight query is cheap (hits the gitlab_discussion_id index, returns at most a few rows) and eliminates non-deterministic write-targeting risk.

-- Preflight ambiguity check (runs before main query)
SELECT DISTINCT p.path_with_namespace
FROM discussions d
JOIN projects p ON p.id = d.project_id
WHERE d.gitlab_discussion_id = ?
LIMIT 3

If more than one project is found, return LoreError::Ambiguous (exit code 18) with the distinct project paths and suggestion to retry with --project <path>.

// In query_notes / query_discussions, before executing the main query:
if let Some(ref disc_id) = filters.gitlab_discussion_id {
    if filters.project.is_none() {
        let distinct_projects: Vec<String> = conn
            .prepare(
                "SELECT DISTINCT p.path_with_namespace \
                 FROM discussions d \
                 JOIN projects p ON p.id = d.project_id \
                 WHERE d.gitlab_discussion_id = ? \
                 LIMIT 3"
            )?
            .query_map([disc_id], |row| row.get(0))?
            .collect::<std::result::Result<Vec<_>, _>>()?;

        if distinct_projects.len() > 1 {
            return Err(LoreError::Ambiguous {
                message: format!(
                    "Discussion ID matches {} projects: {}. Use --project to disambiguate.",
                    distinct_projects.len(),
                    distinct_projects.join(", ")
                ),
            });
        }
    }
}
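The following in-memory model illustrates why the check has to run before LIMIT. It is a sketch only: rows are plain (project_path, gitlab_discussion_id) tuples standing in for the joined tables, and all function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// (project_path, gitlab_discussion_id) pairs standing in for joined rows.
pub type Row = (&'static str, &'static str);

/// Distinct project paths among the rows that match the discussion ID.
pub fn distinct_projects(rows: &[Row], disc_id: &str) -> BTreeSet<&'static str> {
    rows.iter()
        .filter(|(_, id)| *id == disc_id)
        .map(|(path, _)| *path)
        .collect()
}

/// Post-query check: only inspects rows that survive the page limit.
pub fn ambiguous_after_limit(rows: &[Row], disc_id: &str, limit: usize) -> bool {
    let page: Vec<Row> = rows
        .iter()
        .copied()
        .filter(|(_, id)| *id == disc_id)
        .take(limit)
        .collect();
    distinct_projects(&page, disc_id).len() > 1
}

/// Preflight check: inspects every matching row before any limit applies.
pub fn ambiguous_preflight(rows: &[Row], disc_id: &str) -> bool {
    distinct_projects(rows, disc_id).len() > 1
}
```

With two notes from group/a followed by one from group/b, all sharing one discussion ID, the post-limit check reports no ambiguity at limits 1 and 2, while the preflight check flags it regardless of the limit.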

1h. Wrap query_notes in a read transaction

Wrap the count query and page query in a deferred read transaction per the Snapshot Consistency cross-cutting requirement. See the Bridge Contract section for the pattern.

Tests

File: src/cli/commands/list_tests.rs

Test 1: gitlab_discussion_id present in NoteListRowJson

#[test]
fn note_list_row_json_includes_gitlab_discussion_id() {
    let row = NoteListRow {
        id: 1,
        gitlab_id: 100,
        author_username: "alice".to_string(),
        body: Some("test".to_string()),
        note_type: Some("DiscussionNote".to_string()),
        is_system: false,
        created_at: 1_700_000_000_000,
        updated_at: 1_700_000_000_000,
        position_new_path: None,
        position_new_line: None,
        position_old_path: None,
        position_old_line: None,
        resolvable: false,
        resolved: false,
        resolved_by: None,
        noteable_type: Some("MergeRequest".to_string()),
        parent_iid: Some(42),
        parent_title: Some("Fix bug".to_string()),
        project_path: "group/project".to_string(),
        gitlab_discussion_id: "6a9c1750b37d".to_string(),
        gitlab_project_id: 789,
    };

    let json_row = NoteListRowJson::from(&row);
    assert_eq!(json_row.gitlab_discussion_id, "6a9c1750b37d");
    assert_eq!(json_row.gitlab_note_id, 100);  // alias matches gitlab_id
    assert_eq!(json_row.gitlab_project_id, 789);

    let serialized = serde_json::to_value(&json_row).unwrap();
    assert!(serialized.get("gitlab_discussion_id").is_some());
    assert_eq!(
        serialized["gitlab_discussion_id"].as_str().unwrap(),
        "6a9c1750b37d"
    );
    // Both gitlab_id and gitlab_note_id present with same value
    assert_eq!(serialized["gitlab_id"], 100);
    assert_eq!(serialized["gitlab_note_id"], 100);
    assert_eq!(serialized["gitlab_project_id"], 789);
}

Test 2: query_notes returns gitlab_discussion_id from DB

#[test]
fn query_notes_returns_gitlab_discussion_id() {
    let conn = create_connection(Path::new(":memory:")).unwrap();
    run_migrations(&conn).unwrap();

    // Insert project, discussion, note
    conn.execute(
        "INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url)
         VALUES (1, 1, 'group/project', 'https://gitlab.com/group/project')",
        [],
    ).unwrap();
    conn.execute(
        "INSERT INTO discussions (id, gitlab_discussion_id, project_id, issue_id, noteable_type, last_seen_at)
         VALUES (1, 'abc123def456', 1, NULL, 'MergeRequest', 1000)",
        [],
    ).unwrap();
    conn.execute(
        "INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, author_username, source_branch, target_branch, created_at, updated_at, last_seen_at)
         VALUES (1, 1, 1, 99, 'Test MR', 'opened', 'alice', 'feat', 'main', 1000, 1000, 1000)",
        [],
    ).unwrap();
    // Update discussion to reference MR
    conn.execute(
        "UPDATE discussions SET merge_request_id = 1 WHERE id = 1",
        [],
    ).unwrap();
    conn.execute(
        "INSERT INTO notes (id, gitlab_id, discussion_id, project_id, author_username, body, created_at, updated_at, last_seen_at, is_system, position)
         VALUES (1, 500, 1, 1, 'bob', 'test note', 1000, 1000, 1000, 0, 0)",
        [],
    ).unwrap();

    let config = Config::default();
    let filters = NoteListFilters {
        limit: 10,
        project: None,
        author: None,
        note_type: None,
        include_system: false,
        for_issue_iid: None,
        for_mr_iid: None,
        note_id: None,
        gitlab_note_id: None,
        discussion_id: None,
        gitlab_discussion_id: None,
        since: None,
        until: None,
        path: None,
        contains: None,
        resolution: None,
        sort: "created".to_string(),
        order: "desc".to_string(),
    };

    let result = query_notes(&conn, &filters, &config).unwrap();
    assert_eq!(result.notes.len(), 1);
    assert_eq!(result.notes[0].gitlab_discussion_id, "abc123def456");
}

Test 3: --fields filtering includes gitlab_discussion_id

#[test]
fn fields_filter_retains_gitlab_discussion_id() {
    let mut value = serde_json::json!({
        "data": {
            "notes": [{
                "id": 1,
                "gitlab_id": 100,
                "author_username": "alice",
                "body": "test",
                "gitlab_discussion_id": "abc123"
            }]
        }
    });

    filter_fields(
        &mut value,
        "notes",
        &["id".to_string(), "gitlab_discussion_id".to_string()],
    );

    let note = &value["data"]["notes"][0];
    assert_eq!(note["id"], 1);
    assert_eq!(note["gitlab_discussion_id"], "abc123");
    assert!(note.get("body").is_none());
}

Test 4: Bridge fields survive aggressive --fields filtering in robot mode

#[test]
fn bridge_fields_forced_in_robot_mode() {
    // Agent requests only "body" — bridge fields must still appear
    let mut value = serde_json::json!({
        "data": {
            "notes": [{
                "id": 1,
                "body": "test",
                "project_path": "group/repo",
                "gitlab_project_id": 42,
                "noteable_type": "MergeRequest",
                "parent_iid": 42,
                "gitlab_discussion_id": "abc123",
                "gitlab_note_id": 500
            }]
        }
    });

    // In robot mode, filter_fields merges bridge set
    filter_fields_robot(
        &mut value,
        "notes",
        &["body".to_string()],
    );

    let note = &value["data"]["notes"][0];
    assert_eq!(note["body"], "test");
    // Bridge fields survive despite not being requested:
    assert!(note.get("project_path").is_some());
    assert!(note.get("gitlab_project_id").is_some());
    assert!(note.get("noteable_type").is_some());
    assert!(note.get("parent_iid").is_some());
    assert!(note.get("gitlab_discussion_id").is_some());
    assert!(note.get("gitlab_note_id").is_some());
}

Test 5: --gitlab-discussion-id filter returns matching notes

#[test]
fn notes_filter_by_gitlab_discussion_id() {
    let conn = create_test_db();
    // Insert 2 discussions with different gitlab_discussion_ids, each with notes
    // Filter by one gitlab_discussion_id
    // Assert only notes from matching discussion are returned
}

Test 6: Ambiguity guardrail fires for cross-project discussion ID matches

#[test]
fn notes_ambiguous_gitlab_discussion_id_across_projects() {
    let conn = create_test_db();
    // Insert 2 projects, each with a discussion sharing the same gitlab_discussion_id
    // (this can happen since IDs are per-project)
    // Filter by gitlab_discussion_id without --project
    // Assert LoreError::Ambiguous is returned with both project paths
}

Test 7: Ambiguity preflight catches cross-project match even with LIMIT 1

#[test]
fn notes_ambiguity_preflight_not_defeated_by_limit() {
    let conn = create_test_db();
    // Insert 2 projects, each with a discussion sharing the same gitlab_discussion_id
    // Use --limit 1, which would hide the second project in a post-query check
    // Assert LoreError::Ambiguous is still returned (preflight runs before LIMIT)
}

2. Add gitlab_discussion_id to Show Command Discussion Groups

Why

lore -J issues 42 and lore -J mrs 99 return discussions grouped by thread, but neither DiscussionDetailJson nor MrDiscussionDetailJson includes the gitlab_discussion_id. An agent viewing MR details can see all discussion threads but can't identify which one to reply to.

Current Code

Issue discussions (src/cli/commands/show.rs:99-102):

pub struct DiscussionDetail {
    pub notes: Vec<NoteDetail>,
    pub individual_note: bool,
}

MR discussions (src/cli/commands/show.rs:37-40):

pub struct MrDiscussionDetail {
    pub notes: Vec<MrNoteDetail>,
    pub individual_note: bool,
}

JSON equivalents (show.rs:1001-1003 and show.rs:1100-1103):

pub struct DiscussionDetailJson {
    pub notes: Vec<NoteDetailJson>,
    pub individual_note: bool,
}
pub struct MrDiscussionDetailJson {
    pub notes: Vec<MrNoteDetailJson>,
    pub individual_note: bool,
}

Queries (show.rs:325-328 and show.rs:537-540):

SELECT id, individual_note FROM discussions WHERE issue_id = ? ORDER BY first_note_at
SELECT id, individual_note FROM discussions WHERE merge_request_id = ? ORDER BY first_note_at

Changes Required

2a. Add fields to domain structs

File: src/cli/commands/show.rs

pub struct DiscussionDetail {
    pub gitlab_discussion_id: String,  // ADD
    pub resolvable: bool,              // ADD — agents need thread state
    pub resolved: bool,                // ADD — agents need thread state
    pub last_note_at: i64,             // ADD — for recency sorting
    pub notes: Vec<NoteDetail>,
    pub individual_note: bool,
}

pub struct MrDiscussionDetail {
    pub gitlab_discussion_id: String,  // ADD
    pub resolvable: bool,              // ADD
    pub resolved: bool,                // ADD
    pub last_note_at: i64,             // ADD
    pub notes: Vec<MrNoteDetail>,
    pub individual_note: bool,
}

2b. Add fields to JSON structs

pub struct DiscussionDetailJson {
    pub gitlab_discussion_id: String,  // ADD
    pub resolvable: bool,              // ADD
    pub resolved: bool,                // ADD
    pub last_note_at_iso: String,      // ADD — ISO formatted
    pub notes: Vec<NoteDetailJson>,
    pub individual_note: bool,
}

pub struct MrDiscussionDetailJson {
    pub gitlab_discussion_id: String,  // ADD
    pub resolvable: bool,              // ADD
    pub resolved: bool,                // ADD
    pub last_note_at_iso: String,      // ADD — ISO formatted
    pub notes: Vec<MrNoteDetailJson>,
    pub individual_note: bool,
}

2c. Update queries to SELECT new fields

Issue discussions (show.rs:325):

SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved,
       COALESCE(last_note_at, first_note_at, 0) AS last_note_at
FROM discussions
WHERE issue_id = ? ORDER BY COALESCE(first_note_at, last_note_at, 0), id

MR discussions (show.rs:537):

SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved,
       COALESCE(last_note_at, first_note_at, 0) AS last_note_at
FROM discussions
WHERE merge_request_id = ? ORDER BY COALESCE(first_note_at, last_note_at, 0), id

Note on ordering: The COALESCE with tiebreaker id ensures deterministic ordering even when timestamps are NULL (possible during partial sync states). This prevents unstable output that could confuse automated workflows.
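The same ordering rule can be expressed over in-memory rows, which may help when writing a test for it. This is a sketch with a hypothetical DiscRow type that carries only the sort-relevant columns.

```rust
/// Minimal stand-in for a discussions row (hypothetical; sort columns only).
#[derive(Debug, Clone, PartialEq)]
pub struct DiscRow {
    pub id: i64,
    pub first_note_at: Option<i64>,
    pub last_note_at: Option<i64>,
}

/// Mirror of `ORDER BY COALESCE(first_note_at, last_note_at, 0), id`.
/// The id tiebreaker keeps rows with equal or missing timestamps stable.
pub fn sort_like_query(rows: &mut [DiscRow]) {
    rows.sort_by_key(|r| (r.first_note_at.or(r.last_note_at).unwrap_or(0), r.id));
}
```

Rows with both timestamps NULL sort first (key 0) and are ordered among themselves by id, so repeated runs over the same data produce the same sequence.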

2d. Update query_map closures

The disc_rows tuple changes from (i64, bool) to a richer shape. Use named columns here too for clarity:

Issue path (show.rs:331-335):

let disc_rows: Vec<(i64, String, bool, bool, bool, i64)> = disc_stmt
    .query_map([issue_id], |row| {
        Ok((
            row.get("id")?,
            row.get("gitlab_discussion_id")?,
            row.get::<_, i64>("individual_note").map(|v| v == 1)?,
            row.get::<_, i64>("resolvable").map(|v| v == 1)?,
            row.get::<_, i64>("resolved").map(|v| v == 1)?,
            row.get("last_note_at")?,
        ))
    })?
    .collect::<std::result::Result<Vec<_>, _>>()?;

And where discussions are constructed (show.rs:361):

for (disc_id, gitlab_disc_id, individual_note, resolvable, resolved, last_note_at) in disc_rows {
    // ... existing note query ...
    discussions.push(DiscussionDetail {
        gitlab_discussion_id: gitlab_disc_id,
        resolvable,
        resolved,
        last_note_at,
        notes,
        individual_note,
    });
}

Same pattern for MR discussions (show.rs:543-560, show.rs:598).

2e. Update From impls

impl From<&DiscussionDetail> for DiscussionDetailJson {
    fn from(disc: &DiscussionDetail) -> Self {
        Self {
            gitlab_discussion_id: disc.gitlab_discussion_id.clone(),
            resolvable: disc.resolvable,
            resolved: disc.resolved,
            last_note_at_iso: format_iso_timestamp(disc.last_note_at),
            notes: disc.notes.iter().map(|n| n.into()).collect(),
            individual_note: disc.individual_note,
        }
    }
}

impl From<&MrDiscussionDetail> for MrDiscussionDetailJson {
    fn from(disc: &MrDiscussionDetail) -> Self {
        Self {
            gitlab_discussion_id: disc.gitlab_discussion_id.clone(),
            resolvable: disc.resolvable,
            resolved: disc.resolved,
            last_note_at_iso: format_iso_timestamp(disc.last_note_at),
            notes: disc.notes.iter().map(|n| n.into()).collect(),
            individual_note: disc.individual_note,
        }
    }
}

2f. Add gitlab_note_id to note detail structs in show

While we're here, add gitlab_id (as gitlab_note_id in JSON) to NoteDetail, MrNoteDetail, and their JSON counterparts. Currently show-command notes only have author_username, body, created_at, is_system — no note ID at all, making it impossible to reference a specific note. This satisfies the Bridge Contract requirement for gitlab_note_id on note-level data.

  • Domain structs: add a gitlab_id: i64 field.
  • JSON structs: add a gitlab_note_id: i64 field.
  • Queries: add n.gitlab_id to the note SELECT within show.
  • From impls: map gitlab_id → gitlab_note_id.
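A minimal sketch of the rename at the From boundary follows. The structs carry only a subset of fields and omit the serde derives so the example stays dependency-free; the exact field sets are assumptions, not the real definitions in show.rs.

```rust
/// Domain-side note (hypothetical subset of fields).
pub struct NoteDetail {
    pub gitlab_id: i64,
    pub author_username: String,
    pub body: String,
}

/// JSON-side note; the real struct would also derive Serialize.
#[derive(Debug, PartialEq)]
pub struct NoteDetailJson {
    pub gitlab_note_id: i64,
    pub author_username: String,
    pub body: String,
}

impl From<&NoteDetail> for NoteDetailJson {
    fn from(note: &NoteDetail) -> Self {
        Self {
            // The domain field keeps the DB column name; only the JSON name changes.
            gitlab_note_id: note.gitlab_id,
            author_username: note.author_username.clone(),
            body: note.body.clone(),
        }
    }
}
```

Keeping gitlab_id on the domain struct and renaming only in the JSON struct confines the Bridge Contract naming to the output layer.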

Tests

File: src/cli/commands/show_tests.rs (or within show.rs #[cfg(test)])

Test 1: Issue show includes gitlab_discussion_id

#[test]
fn show_issue_includes_gitlab_discussion_id() {
    // Setup: project, issue, discussion with known gitlab_discussion_id, note
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_issue(&conn, 1, 1, 42);
    conn.execute(
        "INSERT INTO discussions (id, gitlab_discussion_id, project_id, issue_id, noteable_type, last_seen_at)
         VALUES (1, 'abc123hex', 1, 1, 'Issue', 1000)",
        [],
    ).unwrap();
    insert_note(&conn, 1, 500, 1, 1, "alice", "hello", false);

    let detail = run_show_issue_with_conn(&conn, 42, None).unwrap();
    assert_eq!(detail.discussions.len(), 1);
    assert_eq!(detail.discussions[0].gitlab_discussion_id, "abc123hex");
}

Test 2: MR show includes gitlab_discussion_id

Same pattern for MR path.

Test 3: JSON serialization includes the field

#[test]
fn discussion_detail_json_has_gitlab_discussion_id() {
    let detail = DiscussionDetail {
        gitlab_discussion_id: "deadbeef".to_string(),
        resolvable: true,
        resolved: false,
        last_note_at: 1_700_000_000_000,
        notes: vec![],
        individual_note: false,
    };
    let json = DiscussionDetailJson::from(&detail);
    let value = serde_json::to_value(&json).unwrap();
    assert_eq!(value["gitlab_discussion_id"], "deadbeef");
    assert_eq!(value["resolvable"], true);
    assert_eq!(value["resolved"], false);
    assert!(value.get("last_note_at_iso").is_some());
}

Test 4: Show note includes gitlab_note_id

#[test]
fn show_note_detail_json_has_gitlab_note_id() {
    // Verify NoteDetailJson serialization includes gitlab_note_id
}

3. Add Standalone discussions List Command

Why

Discussions are a first-class entity in the DB (211K rows) but invisible to agents as a collection. An agent working on an MR needs to see all threads at a glance — which are unresolved, who started them, what file they're on — with the gitlab_discussion_id needed to reply. Currently the only way is lore notes --for-mr 99 which returns flat notes without discussion grouping.

Design

lore discussions [OPTIONS]

# List all discussions on MR 99 (most common agent use case)
lore -J discussions --for-mr 99

# List unresolved discussions on MR 99
lore -J discussions --for-mr 99 --resolution unresolved

# List discussions on issue 42
lore -J discussions --for-issue 42

# List discussions across a project
lore -J discussions -p group/repo --since 7d

# Look up a specific discussion by GitLab ID
lore -J discussions --gitlab-discussion-id 6a9c1750b37d

# List unresolved threads with latest 2 notes inline (fewer round-trips)
lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2

Response Schema

{
  "ok": true,
  "data": {
    "discussions": [
      {
        "gitlab_discussion_id": "6a9c1750b37d513a...",
        "noteable_type": "MergeRequest",
        "parent_iid": 3929,
        "parent_title": "Resolve \"Switch Health Card\"",
        "project_path": "vs/typescript-code",
        "gitlab_project_id": 42,
        "individual_note": false,
        "note_count": 3,
        "first_author": "elovegrove",
        "first_note_body_snippet": "Ok @teernisse well I really do prefer...",
        "first_note_at_iso": "2026-02-16T14:31:34Z",
        "last_note_at_iso": "2026-02-16T15:02:11Z",
        "resolvable": true,
        "resolved": false,
        "position_new_path": "src/components/SwitchHealthCard.vue",
        "position_new_line": 42,
        "included_note_count": 0,
        "has_more_notes": true,
        "notes": []
      }
    ],
    "total_count": 15,
    "showing": 15
  },
  "meta": {
    "elapsed_ms": 12,
    "effective_limit": 50,
    "effective_include_notes": 0,
    "has_more": false
  }
}

The notes array is empty by default (zero overhead). When --include-notes N is provided, each discussion includes up to N of its most recent notes inline. This covers the common agent pattern of "show me unresolved threads with context" in a single round-trip.

The included_note_count and has_more_notes fields provide per-discussion truncation signals. included_note_count is the number of notes actually included in the notes array, and has_more_notes is true when note_count > included_note_count. This lets agents know whether a thread's notes were fully returned or truncated, enabling them to decide whether a follow-up lore notes --gitlab-discussion-id <id> call is needed for the complete thread.
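The per-discussion signals can be derived as in the sketch below. The InlinedNotes type and inline_latest function are hypothetical; the sketch assumes the candidate notes are ordered oldest-first and that note_count is counted over the same set of notes that can be inlined.

```rust
/// Outcome of inlining notes for one discussion (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct InlinedNotes<T> {
    pub notes: Vec<T>,
    pub included_note_count: usize,
    pub has_more_notes: bool,
}

/// Keep up to `n` of the most recent notes and report truncation.
/// `all_notes` is assumed to be ordered oldest-first.
pub fn inline_latest<T: Clone>(all_notes: &[T], note_count: usize, n: usize) -> InlinedNotes<T> {
    // saturating_sub avoids underflow when fewer than `n` notes exist.
    let start = all_notes.len().saturating_sub(n);
    let notes: Vec<T> = all_notes[start..].to_vec();
    let included_note_count = notes.len();
    InlinedNotes {
        has_more_notes: note_count > included_note_count,
        included_note_count,
        notes,
    }
}
```

With the default --include-notes 0, any discussion that has at least one note reports has_more_notes as true, which tells the agent a follow-up notes call is needed to read the thread.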

The meta block includes effective_limit and effective_include_notes (the clamped values actually used) plus has_more (true when total_count > showing). This lets agents detect truncation and decide whether to paginate or narrow their query.

File Architecture

No new files. Follow the existing pattern:

  • Args struct: src/cli/mod.rs (alongside NotesArgs, IssuesArgs)
  • Query + print functions: src/cli/commands/list.rs (alongside query_notes, print_list_notes_json)
  • Handler: src/main.rs (alongside handle_notes)
  • Tests: src/cli/commands/list_tests.rs
  • Robot-docs: src/main.rs robot-docs JSON block

Implementation Details

3a. CLI Args

File: src/cli/mod.rs

Add variant to Commands enum (after Notes):

/// List discussions
#[command(visible_alias = "discussion")]
Discussions(DiscussionsArgs),

Args struct (with typed enums for filter/sort fields):

/// Resolution filter for discussion queries
#[derive(Clone, Debug, ValueEnum)]
pub enum ResolutionFilter {
    Unresolved,
    Resolved,
}

/// Noteable type filter
#[derive(Clone, Debug, ValueEnum)]
pub enum NoteableTypeFilter {
    Issue,
    MergeRequest,
}

/// Sort field for discussion queries
#[derive(Clone, Debug, ValueEnum)]
pub enum DiscussionSortField {
    FirstNote,
    LastNote,
}

/// Sort direction
#[derive(Clone, Debug, ValueEnum)]
pub enum SortDirection {
    Asc,
    Desc,
}

#[derive(Parser)]
pub struct DiscussionsArgs {
    /// Maximum results (clamped to 500)
    #[arg(short = 'n', long = "limit", default_value = "50", help_heading = "Output")]
    pub limit: usize,

    /// Select output fields (comma-separated, or 'minimal' preset)
    #[arg(long, help_heading = "Output", value_delimiter = ',')]
    pub fields: Option<Vec<String>>,

    /// Output format (table, json, jsonl, csv)
    #[arg(long, default_value = "table", value_parser = ["table", "json", "jsonl", "csv"], help_heading = "Output")]
    pub format: String,

    /// Filter to discussions on a specific issue IID
    #[arg(long, conflicts_with = "for_mr", help_heading = "Filters")]
    pub for_issue: Option<i64>,

    /// Filter to discussions on a specific MR IID
    #[arg(long, conflicts_with = "for_issue", help_heading = "Filters")]
    pub for_mr: Option<i64>,

    /// Filter by project path
    #[arg(short = 'p', long, help_heading = "Filters")]
    pub project: Option<String>,

    /// Filter by GitLab discussion ID
    #[arg(long, help_heading = "Filters")]
    pub gitlab_discussion_id: Option<String>,

    /// Filter by resolution status
    #[arg(long, value_enum, help_heading = "Filters")]
    pub resolution: Option<ResolutionFilter>,

    /// Filter by time (7d, 2w, 1m, or YYYY-MM-DD)
    #[arg(long, help_heading = "Filters")]
    pub since: Option<String>,

    /// Filter by file path (exact match or prefix with trailing /)
    #[arg(long, help_heading = "Filters")]
    pub path: Option<String>,

    /// Filter by noteable type
    #[arg(long, value_enum, help_heading = "Filters")]
    pub noteable_type: Option<NoteableTypeFilter>,

    /// Include up to N latest notes per discussion (0 = none, default; clamped to 20)
    #[arg(long, default_value = "0", help_heading = "Output")]
    pub include_notes: usize,

    /// Sort field
    #[arg(long, value_enum, default_value = "last-note", help_heading = "Sorting")]
    pub sort: DiscussionSortField,

    /// Sort direction (default: descending)
    #[arg(long, value_enum, default_value = "desc", help_heading = "Sorting")]
    pub order: SortDirection,
}

Typed enum rationale: Using ValueEnum for resolution, noteable_type, sort, and order makes invalid states unrepresentable at parse time. clap handles validation and help text automatically. In the query builder, match on the enum variant to produce the SQL fragment. Because every fragment is a 'static string selected by a match, no user-supplied text is interpolated into the query, which rules out SQL injection through these parameters.

// Example: enum -> SQL fragment mapping
impl DiscussionSortField {
    pub fn to_sql_column(&self) -> &'static str {
        match self {
            Self::FirstNote => "fd.first_note_at",
            Self::LastNote => "fd.last_note_at",
        }
    }
}

impl SortDirection {
    pub fn to_sql(&self) -> &'static str {
        match self {
            Self::Asc => "ASC",
            Self::Desc => "DESC",
        }
    }
}
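As a sketch of how these fragments might be combined, the function below fills the ORDER BY template used by the discussions query. The enums are redeclared without the clap derive so the example compiles on its own, and order_by_clause is a hypothetical helper name.

```rust
#[derive(Clone, Copy, Debug)]
pub enum DiscussionSortField {
    FirstNote,
    LastNote,
}

#[derive(Clone, Copy, Debug)]
pub enum SortDirection {
    Asc,
    Desc,
}

impl DiscussionSortField {
    pub fn to_sql_column(&self) -> &'static str {
        match self {
            Self::FirstNote => "fd.first_note_at",
            Self::LastNote => "fd.last_note_at",
        }
    }
}

impl SortDirection {
    pub fn to_sql(&self) -> &'static str {
        match self {
            Self::Asc => "ASC",
            Self::Desc => "DESC",
        }
    }
}

/// Fill the ORDER BY template using only compile-time string fragments.
pub fn order_by_clause(sort: DiscussionSortField, order: SortDirection) -> String {
    let col = sort.to_sql_column();
    let dir = order.to_sql();
    format!("ORDER BY COALESCE({col}, 0) {dir}, fd.id {dir}")
}
```

Only the two match results reach format!, so the set of possible clauses is fixed at compile time (two columns times two directions).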

Output guardrails: The handler clamps limit to min(limit, 500) and include_notes to min(include_notes, 20) before passing to the query layer. This prevents accidentally huge payloads in robot mode. The clamped values are reported in meta.effective_limit and meta.effective_include_notes.
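The clamping and the has_more signal could be computed as in this sketch. The constant names, EffectiveParams, clamp_params, and has_more are illustrative assumptions; only the limits (500 and 20) and the meta field names come from this plan.

```rust
pub const MAX_LIMIT: usize = 500;
pub const MAX_INCLUDE_NOTES: usize = 20;

/// Values actually passed to the query layer and echoed back in `meta`.
#[derive(Debug, PartialEq)]
pub struct EffectiveParams {
    pub effective_limit: usize,
    pub effective_include_notes: usize,
}

/// Clamp caller-supplied values to the documented ceilings.
pub fn clamp_params(limit: usize, include_notes: usize) -> EffectiveParams {
    EffectiveParams {
        effective_limit: limit.min(MAX_LIMIT),
        effective_include_notes: include_notes.min(MAX_INCLUDE_NOTES),
    }
}

/// `meta.has_more`: true when the result set was truncated by the limit.
pub fn has_more(total_count: i64, showing: usize) -> bool {
    total_count > showing as i64
}
```

Reporting the clamped values rather than the requested ones lets an agent notice that a request for, say, 10000 rows was served with a limit of 500.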

3b. Domain Structs

File: src/cli/commands/list.rs

#[derive(Debug)]
pub struct DiscussionListRow {
    pub id: i64,
    pub gitlab_discussion_id: String,
    pub noteable_type: String,
    pub parent_iid: Option<i64>,
    pub parent_title: Option<String>,
    pub project_path: String,
    pub gitlab_project_id: i64,
    pub individual_note: bool,
    pub note_count: i64,
    pub first_author: Option<String>,
    pub first_note_body: Option<String>,
    pub first_note_at: i64,
    pub last_note_at: i64,
    pub resolvable: bool,
    pub resolved: bool,
    pub position_new_path: Option<String>,
    pub position_new_line: Option<i64>,
}

#[derive(Serialize)]
pub struct DiscussionListRowJson {
    pub gitlab_discussion_id: String,
    pub noteable_type: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub parent_iid: Option<i64>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub parent_title: Option<String>,
    pub project_path: String,
    pub gitlab_project_id: i64,
    pub individual_note: bool,
    pub note_count: i64,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub first_author: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub first_note_body_snippet: Option<String>,
    pub first_note_at_iso: String,
    pub last_note_at_iso: String,
    pub resolvable: bool,
    pub resolved: bool,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub position_new_path: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub position_new_line: Option<i64>,
    pub included_note_count: usize,
    pub has_more_notes: bool,
    // Always serialized, even when empty, so the payload matches the Response Schema ("notes": []).
    pub notes: Vec<NoteListRowJson>,
}

pub struct DiscussionListResult {
    pub discussions: Vec<DiscussionListRow>,
    pub total_count: i64,
}

#[derive(Serialize)]
pub struct DiscussionListResultJson {
    pub discussions: Vec<DiscussionListRowJson>,
    pub total_count: i64,
    pub showing: usize,
}

The From impl truncates first_note_body to ~120 chars for the snippet.
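One way to do the truncation without splitting a multi-byte character is sketched below. The snippet function name and the trailing "..." marker are assumptions (the sample payload above ends its snippet with an ellipsis); the limit is counted in characters, not bytes.

```rust
/// Truncate `body` to at most `max_chars` characters, appending "..." when cut.
/// Iterating over chars() keeps the cut on a UTF-8 character boundary.
pub fn snippet(body: &str, max_chars: usize) -> String {
    let mut chars = body.chars();
    let head: String = chars.by_ref().take(max_chars).collect();
    if chars.next().is_some() {
        // At least one character remained, so the body was truncated.
        format!("{head}...")
    } else {
        head
    }
}
```

Slicing by byte index (for example &body[..120]) would panic if the cut fell inside a multi-byte character, which is why the sketch counts characters instead.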

The notes field on DiscussionListRowJson is populated only when --include-notes N > 0. It reuses the existing NoteListRowJson struct for consistency — agents get the same note shape whether they come from notes, show, or discussions --include-notes.

The included_note_count is set to notes.len() and has_more_notes is set to note_count > included_note_count during the JSON conversion, providing per-discussion truncation signals.

3c. SQL Query

File: src/cli/commands/list.rs

pub fn query_discussions(
    conn: &Connection,
    filters: &DiscussionListFilters,
    config: &Config,
) -> Result<DiscussionListResult> {
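    // Wrap all queries in a deferred read transaction for snapshot consistency.
    // `conn` is a shared reference, so use unchecked_transaction() (DEFERRED by
    // default); transaction_with_behavior() requires `&mut Connection`.
    let tx = conn.unchecked_transaction()?;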

    // Preflight ambiguity check (if gitlab_discussion_id without project)
    // ... see Ambiguity Guardrail section ...

    // Main query + count query ...
    // ... note expansion query (if include_notes > 0) ...

    tx.commit()?;

Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated subqueries. The ROW_NUMBER() approach produces a single scan over the notes table, which is more predictable than repeated LIMIT 1 sub-selects at scale (200K+ discussions):

WITH filtered_discussions AS (
    SELECT
        d.id, d.gitlab_discussion_id, d.noteable_type, d.project_id,
        d.issue_id, d.merge_request_id, d.individual_note,
        d.first_note_at, d.last_note_at, d.resolvable, d.resolved
    FROM discussions d
    JOIN projects p ON d.project_id = p.id
    {where_sql}
),
ranked_notes AS (
    SELECT
        n.discussion_id,
        n.author_username,
        n.body,
        n.is_system,
        n.position_new_path,
        n.position_new_line,
        ROW_NUMBER() OVER (
            PARTITION BY n.discussion_id
            ORDER BY CASE WHEN n.is_system = 0 THEN 0 ELSE 1 END,
                     n.created_at, n.id
        ) AS rn_first_note,
        ROW_NUMBER() OVER (
            PARTITION BY n.discussion_id
            ORDER BY CASE WHEN n.position_new_path IS NULL THEN 1 ELSE 0 END,
                     n.created_at, n.id
        ) AS rn_first_position
    FROM notes n
    WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
),
note_rollup AS (
    SELECT
        discussion_id,
        SUM(CASE WHEN is_system = 0 THEN 1 ELSE 0 END) AS note_count,
        MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN author_username END) AS first_author,
        MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN body END) AS first_note_body,
        MAX(CASE WHEN rn_first_position = 1 THEN position_new_path END) AS position_new_path,
        MAX(CASE WHEN rn_first_position = 1 THEN position_new_line END) AS position_new_line
    FROM ranked_notes
    GROUP BY discussion_id
)
SELECT
    fd.id,
    fd.gitlab_discussion_id,
    fd.noteable_type,
    COALESCE(i.iid, m.iid) AS parent_iid,
    COALESCE(i.title, m.title) AS parent_title,
    p.path_with_namespace AS project_path,
    p.gitlab_project_id,
    fd.individual_note,
    COALESCE(nr.note_count, 0) AS note_count,
    nr.first_author,
    nr.first_note_body,
    COALESCE(fd.first_note_at, fd.last_note_at, 0) AS first_note_at,
    COALESCE(fd.last_note_at, fd.first_note_at, 0) AS last_note_at,
    fd.resolvable,
    fd.resolved,
    nr.position_new_path,
    nr.position_new_line
FROM filtered_discussions fd
JOIN projects p ON fd.project_id = p.id
LEFT JOIN issues i ON fd.issue_id = i.id
LEFT JOIN merge_requests m ON fd.merge_request_id = m.id
LEFT JOIN note_rollup nr ON nr.discussion_id = fd.id
ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
LIMIT ?

Dual window function rationale: The ranked_notes CTE uses two separate ROW_NUMBER() windows: rn_first_note ranks non-system notes first (so the first human comment is always rn=1), while rn_first_position ranks notes with file positions first (so the diff anchor is always rn=1). This prevents a leading system note (e.g., "assigned to @alice") from displacing the first human author/body, and prevents a non-positioned note from displacing the file location. The MAX(CASE WHEN rn_xxx = 1 ...) pattern extracts the correct value from each independently-ranked sequence.
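The two rankings can be illustrated outside SQL. The sketch below is an in-memory analogue, not the query itself: the `Note` struct and function names are hypothetical, and the sort keys mirror the ORDER BY clauses of rn_first_note and rn_first_position.

```rust
#[derive(Debug)]
struct Note {
    id: i64,
    created_at: i64,
    is_system: bool,
    author: &'static str,
    position_new_path: Option<&'static str>,
}

/// Mirrors rn_first_note = 1 AND is_system = 0: human notes sort first
/// (false < true), then created_at, then id. A discussion that contains
/// only system notes yields None, like the NULL produced by the rollup.
fn first_human_author(notes: &[Note]) -> Option<&'static str> {
    notes
        .iter()
        .min_by_key(|n| (n.is_system, n.created_at, n.id))
        .filter(|n| !n.is_system)
        .map(|n| n.author)
}

/// Mirrors rn_first_position = 1: positioned notes sort first.
fn first_position_path(notes: &[Note]) -> Option<&'static str> {
    notes
        .iter()
        .min_by_key(|n| (n.position_new_path.is_none(), n.created_at, n.id))
        .and_then(|n| n.position_new_path)
}

fn sample_notes() -> Vec<Note> {
    vec![
        Note { id: 1, created_at: 100, is_system: true, author: "system", position_new_path: None },
        Note { id: 2, created_at: 200, is_system: false, author: "bob", position_new_path: None },
        Note { id: 3, created_at: 300, is_system: false, author: "carol", position_new_path: Some("src/auth.rs") },
    ]
}

fn main() {
    let notes = sample_notes();
    // The leading system note does not displace the first human author.
    assert_eq!(first_human_author(&notes), Some("bob"));
    // The unpositioned earlier notes do not displace the diff anchor.
    assert_eq!(first_position_path(&notes), Some("src/auth.rs"));
    println!("{:?}", notes[0]);
}
```

Because each ranking has its own sort key, the first author and the file position can come from different notes, which is the behavior the MAX(CASE WHEN ...) extraction relies on.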

Performance rationale: The CTE pre-filters discussions before joining notes. The ranked_notes CTE uses ROW_NUMBER() (a single pass over the notes index) instead of correlated (SELECT ... LIMIT 1) sub-selects per discussion. For MR-scoped queries (50-200 discussions) the performance is equivalent. For project-wide scans with thousands of discussions, the window function approach avoids repeated index probes and produces a more predictable query plan.

Note on ordering: The COALESCE({sort_column}, 0) with tiebreaker fd.id ensures deterministic ordering even when timestamps are NULL (partial sync states). The id tiebreaker is cheap (primary key) and prevents unstable sort output.
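As a sketch of how {sort_column} and {order} might be filled in without string interpolation from user input, the ORDER BY clause can be derived from typed enums. The variant names (FirstNote, LastNote, Asc, Desc) and the function name are assumptions; the plan only names the types DiscussionSortField and SortDirection.

```rust
#[derive(Clone, Copy)]
enum DiscussionSortField {
    FirstNote,
    LastNote,
}

#[derive(Clone, Copy)]
enum SortDirection {
    Asc,
    Desc,
}

/// Build the ORDER BY clause from typed values only; every SQL fragment is a
/// compile-time literal, so nothing user-supplied reaches the query text.
fn order_by_clause(sort: DiscussionSortField, order: SortDirection) -> String {
    let column = match sort {
        DiscussionSortField::FirstNote => "fd.first_note_at",
        DiscussionSortField::LastNote => "fd.last_note_at",
    };
    let dir = match order {
        SortDirection::Asc => "ASC",
        SortDirection::Desc => "DESC",
    };
    // The same direction is applied to the id tiebreaker, as in the query above.
    format!("ORDER BY COALESCE({column}, 0) {dir}, fd.id {dir}")
}

fn main() {
    let sql = order_by_clause(DiscussionSortField::LastNote, SortDirection::Desc);
    assert_eq!(sql, "ORDER BY COALESCE(fd.last_note_at, 0) DESC, fd.id DESC");
    println!("{sql}");
}
```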

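Note on SQLite FILTER syntax: SQLite supports COUNT(*) FILTER (WHERE ...) on aggregate functions only from version 3.30.0 onward. Use SUM(CASE WHEN ... THEN 1 ELSE 0 END) instead (as shown above); it is equivalent and does not depend on the bundled SQLite version.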

3c-ii. Note expansion query (--include-notes)

When include_notes > 0, after the main discussion query, run a single batched query using a window function to fetch the N most recent notes per discussion:

WITH ranked_expansion AS (
    SELECT
        n.id, n.gitlab_id, n.author_username, n.body, n.note_type,
        n.is_system, n.created_at, n.updated_at,
        n.position_new_path, n.position_new_line,
        n.position_old_path, n.position_old_line,
        n.resolvable, n.resolved, n.resolved_by,
        d.noteable_type,
        COALESCE(i.iid, m.iid) AS parent_iid,
        COALESCE(i.title, m.title) AS parent_title,
        p.path_with_namespace AS project_path,
        p.gitlab_project_id,
        d.gitlab_discussion_id,
        n.discussion_id,
        ROW_NUMBER() OVER (
            PARTITION BY n.discussion_id
            ORDER BY n.created_at DESC, n.id DESC
        ) AS rn
    FROM notes n
    JOIN discussions d ON n.discussion_id = d.id
    JOIN projects p ON n.project_id = p.id
    LEFT JOIN issues i ON d.issue_id = i.id
    LEFT JOIN merge_requests m ON d.merge_request_id = m.id
    WHERE n.discussion_id IN ({placeholders})
)
SELECT * FROM ranked_expansion WHERE rn <= ?
ORDER BY discussion_id, rn

Group by discussion_id in Rust and attach notes arrays to the corresponding DiscussionListRowJson. Set included_note_count = notes.len() and has_more_notes = note_count > included_note_count for each discussion. This avoids per-discussion round-trips entirely — one query regardless of how many discussions are in the result set.
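The grouping step described above could look roughly like the following. The struct shapes are simplified stand-ins (hypothetical), not the real NoteListRowJson or DiscussionListRowJson; only the grouping and the two truncation signals are the point.

```rust
use std::collections::HashMap;

// Simplified stand-in for one row of the expansion query.
struct ExpandedNote {
    discussion_id: i64,
    body: String,
}

// Simplified stand-in for the JSON discussion row.
struct DiscussionJson {
    id: i64,
    note_count: i64,
    included_note_count: usize,
    has_more_notes: bool,
    notes: Vec<ExpandedNote>,
}

/// Attach the batched note rows to their discussions and set the
/// per-discussion truncation signals.
fn attach_notes(discussions: &mut [DiscussionJson], rows: Vec<ExpandedNote>) {
    // Rows arrive ordered by (discussion_id, rn), so per-discussion order is kept.
    let mut by_discussion: HashMap<i64, Vec<ExpandedNote>> = HashMap::new();
    for row in rows {
        by_discussion.entry(row.discussion_id).or_default().push(row);
    }
    for d in discussions.iter_mut() {
        d.notes = by_discussion.remove(&d.id).unwrap_or_default();
        d.included_note_count = d.notes.len();
        // try_from avoids a lossy cast if note_count were ever negative.
        d.has_more_notes = usize::try_from(d.note_count).unwrap_or(0) > d.included_note_count;
    }
}

fn main() {
    let mut discussions = vec![DiscussionJson {
        id: 1,
        note_count: 5,
        included_note_count: 0,
        has_more_notes: false,
        notes: Vec::new(),
    }];
    let rows = vec![
        ExpandedNote { discussion_id: 1, body: "newest".to_string() },
        ExpandedNote { discussion_id: 1, body: "second newest".to_string() },
    ];
    attach_notes(&mut discussions, rows);
    assert_eq!(discussions[0].included_note_count, 2);
    assert!(discussions[0].has_more_notes);
    println!("first included note: {}", discussions[0].notes[0].body);
}
```

One thing to check when implementing this: note_count in the main query counts non-system notes only, while the expansion query as written does not filter on is_system, so the comparison is only exact if both sides count the same population.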

The {placeholders} are the id values from the main discussion query result. Since the discussion count is already clamped by --limit (max 500), the IN clause size is bounded and safe.
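A small sketch of generating the {placeholders} text, assuming one positional `?` per discussion id. The helper name is hypothetical. It returns None for an empty id list so the caller can skip the expansion query entirely.

```rust
/// Build "?, ?, ?" for `n` bound parameters, or None when there is nothing
/// to look up (no discussions in the main result).
fn in_placeholders(n: usize) -> Option<String> {
    if n == 0 {
        return None;
    }
    Some(vec!["?"; n].join(", "))
}

fn main() {
    let discussion_ids = [11_i64, 12, 13];
    if let Some(placeholders) = in_placeholders(discussion_ids.len()) {
        let clause = format!("WHERE n.discussion_id IN ({placeholders})");
        assert_eq!(clause, "WHERE n.discussion_id IN (?, ?, ?)");
        println!("{clause}");
    }
    assert_eq!(in_placeholders(0), None);
}
```

The ids themselves are still passed as bound parameters; only the number of placeholders depends on the result size.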

The returned NoteListRow rows reuse the same struct and NoteListRowJson conversion from workstream 1, ensuring identical note shape across all commands.

3d. Filters struct

pub struct DiscussionListFilters {
    pub limit: usize,
    pub project: Option<String>,
    pub for_issue_iid: Option<i64>,
    pub for_mr_iid: Option<i64>,
    pub gitlab_discussion_id: Option<String>,
    pub resolution: Option<ResolutionFilter>,
    pub since: Option<String>,
    pub path: Option<String>,
    pub noteable_type: Option<NoteableTypeFilter>,
    pub sort: DiscussionSortField,
    pub order: SortDirection,
    pub include_notes: usize,
}

Where-clause construction uses match on typed enums — never raw string interpolation:

  • for_issue_iid → subquery to resolve issue ID from IID + project
  • for_mr_iid → subquery to resolve MR ID from IID + project
  • gitlab_discussion_id → d.gitlab_discussion_id = ?
  • resolution → match: Unresolved → d.resolvable = 1 AND d.resolved = 0, Resolved → d.resolvable = 1 AND d.resolved = 1
  • since → d.first_note_at >= ? (using parse_since())
  • path → EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.position_new_path LIKE ?)
  • noteable_type → match: Issue → d.noteable_type = 'Issue', MergeRequest → d.noteable_type = 'MergeRequest'
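The match-to-SQL mapping in the list above can be sketched as follows. This covers only three of the filters and uses a hypothetical `build_where` function; the enum variants follow the names in the list, and the returned parameter vector stands in for whatever bound-parameter type the real query uses.

```rust
#[derive(Clone, Copy)]
enum ResolutionFilter {
    Unresolved,
    Resolved,
}

#[derive(Clone, Copy)]
enum NoteableTypeFilter {
    Issue,
    MergeRequest,
}

fn resolution_sql(filter: ResolutionFilter) -> &'static str {
    match filter {
        ResolutionFilter::Unresolved => "d.resolvable = 1 AND d.resolved = 0",
        ResolutionFilter::Resolved => "d.resolvable = 1 AND d.resolved = 1",
    }
}

fn noteable_type_sql(filter: NoteableTypeFilter) -> &'static str {
    match filter {
        NoteableTypeFilter::Issue => "d.noteable_type = 'Issue'",
        NoteableTypeFilter::MergeRequest => "d.noteable_type = 'MergeRequest'",
    }
}

/// Returns the WHERE text plus the values to bind, in placeholder order.
/// User-supplied strings only ever appear in the params vector.
fn build_where(
    gitlab_discussion_id: Option<&str>,
    resolution: Option<ResolutionFilter>,
    noteable_type: Option<NoteableTypeFilter>,
) -> (String, Vec<String>) {
    let mut clauses: Vec<&'static str> = Vec::new();
    let mut params: Vec<String> = Vec::new();
    if let Some(id) = gitlab_discussion_id {
        clauses.push("d.gitlab_discussion_id = ?");
        params.push(id.to_string());
    }
    if let Some(r) = resolution {
        clauses.push(resolution_sql(r));
    }
    if let Some(t) = noteable_type {
        clauses.push(noteable_type_sql(t));
    }
    let where_sql = if clauses.is_empty() {
        String::new()
    } else {
        format!("WHERE {}", clauses.join(" AND "))
    };
    (where_sql, params)
}

fn main() {
    let (sql, params) = build_where(
        Some("abc123"),
        Some(ResolutionFilter::Unresolved),
        Some(NoteableTypeFilter::MergeRequest),
    );
    println!("{sql} with params {params:?}");
    let (resolved_sql, _) = build_where(None, Some(ResolutionFilter::Resolved), Some(NoteableTypeFilter::Issue));
    println!("{resolved_sql}");
}
```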

3e. Handler wiring

File: src/main.rs

Add match arm:

Some(Commands::Discussions(args)) => handle_discussions(cli.config.as_deref(), args, robot_mode),

Handler function:

fn handle_discussions(
    config_override: Option<&str>,
    args: DiscussionsArgs,
    robot_mode: bool,
) -> Result<(), Box<dyn std::error::Error>> {
    let start = std::time::Instant::now();
    let config = Config::load(config_override)?;
    let db_path = get_db_path(config.storage.db_path.as_deref());
    let conn = create_connection(&db_path)?;

    let effective_limit = args.limit.min(500);
    let effective_include_notes = args.include_notes.min(20);
    let filters = DiscussionListFilters {
        limit: effective_limit,
        project: args.project,
        for_issue_iid: args.for_issue,
        for_mr_iid: args.for_mr,
        gitlab_discussion_id: args.gitlab_discussion_id,
        resolution: args.resolution,
        since: args.since,
        path: args.path,
        noteable_type: args.noteable_type,
        sort: args.sort,
        order: args.order,
        include_notes: effective_include_notes,
    };

    let result = query_discussions(&conn, &filters, &config)?;

    let format = if robot_mode && args.format == "table" {
        "json"
    } else {
        &args.format
    };

    match format {
        "json" => print_list_discussions_json(
            &result,
            start.elapsed().as_millis() as u64,
            args.fields.as_deref(),
            robot_mode,
            effective_limit,
            effective_include_notes,
        ),
        "jsonl" => print_list_discussions_jsonl(&result),
        "csv" => print_list_discussions_csv(&result),
        _ => print_list_discussions(&result),
    }

    Ok(())
}

3f. Print functions

File: src/cli/commands/list.rs

Follow same pattern as print_list_notes_json:

pub fn print_list_discussions_json(
    result: &DiscussionListResult,
    elapsed_ms: u64,
    fields: Option<&[String]>,
    robot_mode: bool,
    effective_limit: usize,
    effective_include_notes: usize,
) {
    let json_result = DiscussionListResultJson::from(result);
    let has_more = result.total_count as usize > json_result.showing;
    let meta = serde_json::json!({
        "elapsed_ms": elapsed_ms,
        "effective_limit": effective_limit,
        "effective_include_notes": effective_include_notes,
        "has_more": has_more,
    });
    let mut output = serde_json::json!({
        "ok": true,
        "data": json_result,
        "meta": meta,
    });
    if let Some(f) = fields {
        let expanded = expand_fields_preset(f, "discussions");
        if robot_mode {
            filter_fields_robot(&mut output, "discussions", &expanded);
        } else {
            filter_fields(&mut output, "discussions", &expanded);
        }
    }
    match serde_json::to_string(&output) {
        Ok(json) => println!("{json}"),
        Err(e) => eprintln!("Error serializing to JSON: {e}"),
    }
}

Table view: compact format showing discussion_id (first 8 chars), first author, note count, resolved status, path, snippet.

CSV view: all fields, following same pattern as print_list_notes_csv.

3g. Fields preset

File: src/cli/robot.rs

"discussions" => [
    "gitlab_discussion_id", "parent_iid", "note_count",
    "resolvable", "resolved", "first_author", "gitlab_project_id"
]
    .iter()
    .map(|s| (*s).to_string())
    .collect(),

3h. Query-plan validation

Before merging the discussions command, capture EXPLAIN QUERY PLAN output for the three primary query patterns:

  • --for-mr <iid> --resolution unresolved
  • --project <path> --since 7d --sort last-note
  • --gitlab-discussion-id <id>

If plans show table scans on notes or discussions for these patterns, add targeted indexes to the MIGRATIONS array in src/core/db.rs:

Candidate indexes (add only if EXPLAIN QUERY PLAN shows they're needed):

  • discussions(project_id, gitlab_discussion_id) — for ambiguity preflight + direct ID lookup
  • discussions(merge_request_id, last_note_at, id) — for MR-scoped + sorted queries
  • notes(discussion_id, created_at DESC, id DESC) — for --include-notes expansion
  • notes(discussion_id, is_system, created_at, id) — for ranked_notes CTE ordering

This is a measured approach: profile first, add indexes only where the query plan demands them. No speculative index creation.

Tests

Test 1: Basic query returns discussions with gitlab_discussion_id

#[test]
fn query_discussions_basic() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    insert_discussion(&conn, 1, "hexhex123", 1, None, Some(1), "MergeRequest");
    insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "first comment");
    insert_note_in_discussion(&conn, 2, 501, 1, 1, "bob", "reply");

    let filters = DiscussionListFilters::default_for_mr(99);
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();

    assert_eq!(result.discussions.len(), 1);
    assert_eq!(result.discussions[0].gitlab_discussion_id, "hexhex123");
    assert_eq!(result.discussions[0].note_count, 2);
    assert_eq!(result.discussions[0].first_author.as_deref(), Some("alice"));
}

Test 2: Resolution filter

#[test]
fn query_discussions_resolution_filter() {
    let conn = create_test_db();
    // Insert 2 discussions: one resolved, one unresolved
    // ...
    let filters = DiscussionListFilters {
        resolution: Some(ResolutionFilter::Unresolved),
        ..DiscussionListFilters::default_for_mr(99)
    };
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
    assert_eq!(result.total_count, 1);
    assert!(!result.discussions[0].resolved);
}

Test 3: Path filter

#[test]
fn query_discussions_path_filter() {
    // Insert discussions: one with diff notes on src/auth.rs, one general
    // Filter by path "src/auth.rs"
    // Assert only the diff note discussion is returned
}

Test 4: JSON serialization round-trip

#[test]
fn discussion_list_json_serialization() {
    let row = DiscussionListRow {
        id: 1,
        gitlab_discussion_id: "abc123".to_string(),
        noteable_type: "MergeRequest".to_string(),
        parent_iid: Some(99),
        parent_title: Some("Fix auth".to_string()),
        project_path: "group/repo".to_string(),
        gitlab_project_id: 42,
        individual_note: false,
        note_count: 3,
        first_author: Some("alice".to_string()),
        // Body must exceed the ~120-char snippet limit so truncation is actually exercised
        first_note_body: Some("This comment is long enough to need truncation. ".repeat(4)),
        first_note_at: 1_700_000_000_000,
        last_note_at: 1_700_001_000_000,
        resolvable: true,
        resolved: false,
        position_new_path: Some("src/auth.rs".to_string()),
        position_new_line: Some(42),
    };

    let json_row = DiscussionListRowJson::from(&row);
    let value = serde_json::to_value(&json_row).unwrap();
    assert_eq!(value["gitlab_discussion_id"], "abc123");
    assert_eq!(value["gitlab_project_id"], 42);
    assert_eq!(value["note_count"], 3);
    assert!(value["first_note_body_snippet"].as_str().unwrap().len() <= 120);
    assert_eq!(value["included_note_count"], 0);
    assert_eq!(value["has_more_notes"], false);
}

Test 5: Fields filtering works for discussions

#[test]
fn discussions_fields_minimal_preset() {
    let expanded = expand_fields_preset(&["minimal".to_string()], "discussions");
    assert!(expanded.contains(&"gitlab_discussion_id".to_string()));
    assert!(expanded.contains(&"parent_iid".to_string()));
}

Test 6: CTE query handles empty note_rollup gracefully

#[test]
fn query_discussions_with_no_notes() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    // Insert discussion with no notes (edge case: possible after sync issues)
    insert_discussion(&conn, 1, "orphan123", 1, None, Some(1), "MergeRequest");

    let filters = DiscussionListFilters::default_for_mr(99);
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();

    assert_eq!(result.discussions.len(), 1);
    assert_eq!(result.discussions[0].note_count, 0);
    assert!(result.discussions[0].first_author.is_none());
}

Test 7: --gitlab-discussion-id filter returns exact match

#[test]
fn query_discussions_by_gitlab_id() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    insert_discussion(&conn, 1, "target123", 1, None, Some(1), "MergeRequest");
    insert_discussion(&conn, 2, "other456", 1, None, Some(1), "MergeRequest");

    let filters = DiscussionListFilters {
        gitlab_discussion_id: Some("target123".to_string()),
        ..DiscussionListFilters::default_for_mr(99)
    };
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();

    assert_eq!(result.discussions.len(), 1);
    assert_eq!(result.discussions[0].gitlab_discussion_id, "target123");
}

Test 8: --include-notes populates notes array via batched query

#[test]
fn query_discussions_with_included_notes() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    insert_discussion(&conn, 1, "disc123", 1, None, Some(1), "MergeRequest");
    insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "first");
    insert_note_in_discussion(&conn, 2, 501, 1, 1, "bob", "second");
    insert_note_in_discussion(&conn, 3, 502, 1, 1, "carol", "third");

    let filters = DiscussionListFilters {
        include_notes: 2,
        ..DiscussionListFilters::default_for_mr(99)
    };
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();

    assert_eq!(result.discussions.len(), 1);
    // Note: notes populated during JSON conversion, not in raw result
    // Test at handler/print level for full integration
}

Test 9: Bridge fields survive --fields filtering in robot mode

#[test]
fn discussions_bridge_fields_forced_in_robot_mode() {
    // Request only "note_count" — bridge fields must still appear
    let mut value = serde_json::json!({
        "data": {
            "discussions": [{
                "gitlab_discussion_id": "abc",
                "noteable_type": "MergeRequest",
                "parent_iid": 99,
                "project_path": "group/repo",
                "gitlab_project_id": 42,
                "note_count": 3
            }]
        }
    });

    filter_fields_robot(
        &mut value,
        "discussions",
        &["note_count".to_string()],
    );

    let disc = &value["data"]["discussions"][0];
    assert_eq!(disc["note_count"], 3);
    assert!(disc.get("gitlab_discussion_id").is_some());
    assert!(disc.get("project_path").is_some());
    assert!(disc.get("gitlab_project_id").is_some());
}
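The guardrail tested above amounts to a set union between the requested fields and the bridge fields. A minimal sketch of that merge, independent of serde_json: the constant name and helper are assumptions (the plan refers only to BRIDGE_FIELDS_* constants), and the field list is taken from the Bridge Contract for discussions.

```rust
// Hypothetical constant; the real name and contents live in robot.rs.
const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
    "project_path",
    "gitlab_project_id",
    "noteable_type",
    "parent_iid",
    "gitlab_discussion_id",
];

/// Return the requested fields followed by any bridge fields that were not
/// requested, without duplicates and without reordering the request.
fn with_bridge_fields(requested: &[String], bridge: &[&str]) -> Vec<String> {
    let mut out = requested.to_vec();
    for field in bridge {
        if !out.iter().any(|f| f.as_str() == *field) {
            out.push((*field).to_string());
        }
    }
    out
}

fn main() {
    let requested = vec!["note_count".to_string(), "parent_iid".to_string()];
    let effective = with_bridge_fields(&requested, BRIDGE_FIELDS_DISCUSSIONS);
    // Requested fields come first; parent_iid is not duplicated.
    assert_eq!(effective[0], "note_count");
    assert_eq!(effective.len(), 6);
    println!("{effective:?}");
}
```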

Test 10: Output guardrails clamp limit and include_notes

#[test]
fn discussions_output_guardrails() {
    // Verify that limit > 500 is clamped to 500
    // Verify that include_notes > 20 is clamped to 20
    // These are handler-level tests (not query-level)
    assert_eq!(1000_usize.min(500), 500);
    assert_eq!(50_usize.min(20), 20);
    assert_eq!(5_usize.min(20), 5);  // below cap stays unchanged
}
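As written, Test 10 asserts on usize::min directly. One way to make it exercise project code is to extract the clamping into a small helper that the handler also calls. The sketch below assumes hypothetical names (MAX_LIMIT, MAX_INCLUDE_NOTES, clamp_guardrails); the caps of 500 and 20 come from the handler above.

```rust
// Hypothetical constants mirroring the caps used in handle_discussions.
const MAX_LIMIT: usize = 500;
const MAX_INCLUDE_NOTES: usize = 20;

/// Clamp user-supplied values to the output guardrails and return the
/// effective (limit, include_notes) pair reported in `meta`.
fn clamp_guardrails(limit: usize, include_notes: usize) -> (usize, usize) {
    (limit.min(MAX_LIMIT), include_notes.min(MAX_INCLUDE_NOTES))
}

fn main() {
    // Values above the caps are clamped.
    assert_eq!(clamp_guardrails(1000, 50), (500, 20));
    // Values below the caps stay unchanged.
    assert_eq!(clamp_guardrails(10, 5), (10, 5));
    println!("{:?}", clamp_guardrails(1000, 50));
}
```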

Test 11: Ambiguity guardrail fires for cross-project discussion ID

#[test]
fn discussions_ambiguous_gitlab_discussion_id_across_projects() {
    let conn = create_test_db();
    insert_project(&conn, 1);  // "group/repo-a"
    insert_project(&conn, 2);  // "group/repo-b"
    // Insert discussions with same gitlab_discussion_id in different projects
    insert_discussion(&conn, 1, "shared-id", 1, None, None, "Issue");
    insert_discussion(&conn, 2, "shared-id", 2, None, None, "Issue");

    let filters = DiscussionListFilters {
        gitlab_discussion_id: Some("shared-id".to_string()),
        project: None,  // no project specified
        ..DiscussionListFilters::default()
    };
    let result = query_discussions(&conn, &filters, &Config::default());
    assert!(result.is_err());
    // Error should be Ambiguous with both project paths
}

Test 12: has_more metadata is accurate

#[test]
fn discussions_has_more_metadata() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    // Insert 5 discussions
    for i in 1..=5 {
        insert_discussion(&conn, i, &format!("disc-{i}"), 1, None, Some(1), "MergeRequest");
        insert_note_in_discussion(&conn, i, 500 + i, i, 1, "alice", "note");
    }

    // Limit to 3 — should show has_more = true
    let filters = DiscussionListFilters {
        limit: 3,
        ..DiscussionListFilters::default_for_mr(99)
    };
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
    assert_eq!(result.discussions.len(), 3);
    assert_eq!(result.total_count, 5);
    // has_more = total_count > showing = 5 > 3 = true
}

Test 13: Per-discussion truncation signals are accurate

#[test]
fn discussions_per_discussion_truncation_signals() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    insert_discussion(&conn, 1, "disc123", 1, None, Some(1), "MergeRequest");
    // Insert 5 notes
    for i in 1..=5 {
        insert_note_in_discussion(&conn, i, 500 + i, 1, 1, "alice", &format!("note {i}"));
    }

    // Request 2 notes — should show has_more_notes = true
    let filters = DiscussionListFilters {
        include_notes: 2,
        ..DiscussionListFilters::default_for_mr(99)
    };
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();

    // Convert to JSON and verify truncation signals
    let json_result = DiscussionListResultJson::from_with_notes(&result, 2);
    assert_eq!(json_result.discussions[0].included_note_count, 2);
    assert!(json_result.discussions[0].has_more_notes);  // 5 notes > 2 included
}

Test 14: First-note rollup handles leading system notes correctly

#[test]
fn discussions_first_note_rollup_skips_system_notes() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    insert_discussion(&conn, 1, "disc123", 1, None, Some(1), "MergeRequest");
    // First note is system, second is human
    insert_note_in_discussion_system(&conn, 1, 500, 1, 1, "system", "assigned to @alice", true);
    insert_note_in_discussion(&conn, 2, 501, 1, 1, "bob", "actual first comment");

    let filters = DiscussionListFilters::default_for_mr(99);
    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();

    assert_eq!(result.discussions[0].first_author.as_deref(), Some("bob"));
    assert!(result.discussions[0].first_note_body.as_ref().unwrap().contains("actual first comment"));
}

4. Fix Robot-Docs Response Schemas

Why

The notes command robot-docs says:

"data": {"notes": "[NoteListRowJson]", "total_count": "int", "showing": "int"}

An agent sees [NoteListRowJson] — a Rust type name — and has no idea what fields are available. Compare with the issues command which inline-lists every field. This forces agents into trial-and-error field discovery.

Changes Required

File: src/main.rs, robot-docs JSON block

4a. Notes response_schema

Replace:

"data": {"notes": "[NoteListRowJson]", "total_count": "int", "showing": "int"}

With:

"data": {
    "notes": "[{id:int, gitlab_id:int, gitlab_note_id:int, author_username:string, body:string?, note_type:string?, is_system:bool, created_at_iso:string, updated_at_iso:string, position_new_path:string?, position_new_line:int?, position_old_path:string?, position_old_line:int?, resolvable:bool, resolved:bool, resolved_by:string?, noteable_type:string?, parent_iid:int?, parent_title:string?, project_path:string, gitlab_project_id:int, gitlab_discussion_id:string}]",
    "total_count": "int",
    "showing": "int"
}

4b. Add discussions response_schema

"discussions": {
    "description": "List discussions with thread-level metadata",
    "flags": [
        "--limit/-n <N>",
        "--for-issue <iid>",
        "--for-mr <iid>",
        "-p/--project <path>",
        "--gitlab-discussion-id <id>",
        "--resolution <unresolved|resolved>",
        "--since <period>",
        "--path <filepath>",
        "--noteable-type <Issue|MergeRequest>",
        "--include-notes <N>",
        "--sort <first-note|last-note>",
        "--order <asc|desc>",
        "--fields <list|minimal>",
        "--format <table|json|jsonl|csv>"
    ],
    "robot_flags": ["--format json", "--fields minimal"],
    "example": "lore --robot discussions --for-mr 99 --resolution unresolved",
    "response_schema": {
        "ok": "bool",
        "data": {
            "discussions": "[{gitlab_discussion_id:string, noteable_type:string, parent_iid:int?, parent_title:string?, project_path:string, gitlab_project_id:int, individual_note:bool, note_count:int, first_author:string?, first_note_body_snippet:string?, first_note_at_iso:string, last_note_at_iso:string, resolvable:bool, resolved:bool, position_new_path:string?, position_new_line:int?, included_note_count:int, has_more_notes:bool, notes:[{...NoteListRowJson fields...}]?}]",
            "total_count": "int",
            "showing": "int"
        },
        "meta": {"elapsed_ms": "int", "effective_limit": "int", "effective_include_notes": "int", "has_more": "bool"}
    }
}

4c. Add to glab_equivalents

{
    "glab": "glab api /projects/:id/merge_requests/:iid/discussions",
    "lore": "lore -J discussions --for-mr <iid>",
    "note": "Includes note counts, first author, resolution status, file positions"
}

4d. Update show response_schema

Update the issues and mrs show schemas to reflect that discussions now include gitlab_discussion_id, resolvable, resolved, and last_note_at_iso. Also reflect that notes within show discussions now include gitlab_note_id.

4e. Add to lore_exclusive list

"discussions: Thread-level discussion listing with gitlab_discussion_id for API integration"

4f. Add robot-docs contract tests (field-set parity)

File: src/main.rs (within #[cfg(test)] module)

Add tests that parse the robot-docs JSON output and compare declared fields against actual serialized struct fields. This is stronger than string-contains checks — it catches schema drift in both directions (field added to struct but not docs, or field listed in docs but removed from struct).

/// Parse compact schema string "[{field1:type, field2:type?}]" into a set of field names.
/// Requires `use std::collections::HashSet;` in the test module.
fn parse_schema_fields(schema: &str) -> HashSet<String> {
    // Strip leading "[{" and trailing "}]", split on ", ", extract field name before ":"
    schema.trim_start_matches("[{").trim_end_matches("}]")
        .split(", ")
        .filter_map(|f| f.split(':').next())
        .map(str::trim)
        .filter(|f| !f.is_empty()) // an empty schema must not yield a "" field
        .map(str::to_string)
        .collect()
}

/// Get the actual serialized field names from a sample JSON struct
fn sample_note_json_keys() -> HashSet<String> {
    let row = NoteListRow { /* ... test defaults ... */ };
    let json = NoteListRowJson::from(&row);
    let value = serde_json::to_value(&json).unwrap();
    value.as_object().unwrap().keys().cloned().collect()
}

#[test]
fn robot_docs_notes_schema_matches_actual_fields() {
    let docs = get_robot_docs_json();
    let notes_schema = docs["commands"]["notes"]["response_schema"]["data"]["notes"]
        .as_str().unwrap();
    let declared = parse_schema_fields(notes_schema);
    let actual = sample_note_json_keys();

    // All bridge fields must be in both declared and actual
    for bridge in &["gitlab_discussion_id", "gitlab_project_id", "project_path", "parent_iid", "gitlab_note_id"] {
        assert!(declared.contains(*bridge), "robot-docs missing bridge field: {bridge}");
        assert!(actual.contains(*bridge), "NoteListRowJson missing bridge field: {bridge}");
    }

    // Every declared field should exist in the actual struct (no phantom docs)
    for field in &declared {
        assert!(actual.contains(field),
            "robot-docs declares '{field}' but NoteListRowJson doesn't serialize it");
    }

    // Every actual field should be declared in docs (no undocumented fields)
    for field in &actual {
        assert!(declared.contains(field),
            "NoteListRowJson serializes '{field}' but robot-docs doesn't declare it");
    }
}

#[test]
fn robot_docs_discussions_schema_matches_actual_fields() {
    let docs = get_robot_docs_json();
    let disc_schema = docs["commands"]["discussions"]["response_schema"]["data"]["discussions"]
        .as_str().unwrap();
    let declared = parse_schema_fields(disc_schema);
    let actual = sample_discussion_json_keys();

    for bridge in &["gitlab_discussion_id", "gitlab_project_id", "project_path", "parent_iid"] {
        assert!(declared.contains(*bridge), "robot-docs missing bridge field: {bridge}");
        assert!(actual.contains(*bridge), "DiscussionListRowJson missing bridge field: {bridge}");
    }

    for field in &declared {
        assert!(actual.contains(field),
            "robot-docs declares '{field}' but DiscussionListRowJson doesn't serialize it");
    }

    for field in &actual {
        assert!(declared.contains(field),
            "DiscussionListRowJson serializes '{field}' but robot-docs doesn't declare it");
    }
}

#[test]
fn robot_docs_show_schema_includes_discussion_id() {
    let docs = get_robot_docs_json();
    // Verify issues and mrs show schemas reference gitlab_discussion_id
    // in their discussion sub-schemas
}
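The two-directional parity check in the tests above is a pair of set differences. A sketch of a helper that reports both directions at once, so a failing test can list every drifted field in one message; the function name is hypothetical, and BTreeSet is used here only to make the output order deterministic.

```rust
use std::collections::BTreeSet;

/// Returns (phantom, undocumented):
/// - phantom: declared in robot-docs but not serialized by the struct
/// - undocumented: serialized by the struct but missing from robot-docs
fn schema_drift(
    declared: &BTreeSet<String>,
    actual: &BTreeSet<String>,
) -> (Vec<String>, Vec<String>) {
    let phantom = declared.difference(actual).cloned().collect();
    let undocumented = actual.difference(declared).cloned().collect();
    (phantom, undocumented)
}

fn to_set(fields: &[&str]) -> BTreeSet<String> {
    fields.iter().map(|f| (*f).to_string()).collect()
}

fn main() {
    let declared = to_set(&["gitlab_discussion_id", "note_count", "old_field"]);
    let actual = to_set(&["gitlab_discussion_id", "note_count", "has_more_notes"]);
    let (phantom, undocumented) = schema_drift(&declared, &actual);
    assert_eq!(phantom, vec!["old_field".to_string()]);
    assert_eq!(undocumented, vec!["has_more_notes".to_string()]);
    println!("phantom: {phantom:?}, undocumented: {undocumented:?}");
}
```

A contract test could then assert that both vectors are empty and print them in the failure message.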

4g. Add CLI-level contract integration tests

File: src/cli/commands/list_tests.rs or src/main.rs #[cfg(test)]

Add handler-level tests that invoke the command handlers with an in-memory DB and parse the JSON output, asserting Bridge Contract fields are present. These are stronger than unit tests on structs because they exercise the full path from query through serialization.

#[test]
fn notes_handler_json_includes_bridge_fields() {
    // Setup in-memory DB with project, discussion, note
    // Capture stdout from handle_notes (or call query_notes + print_list_notes_json)
    // Parse JSON, assert bridge fields present on every note
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    insert_discussion(&conn, 1, "abc123", 1, None, Some(1), "MergeRequest");
    insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "hello");

    let result = query_notes(&conn, &NoteListFilters::default_for_mr(99), &Config::default()).unwrap();
    let json_result = NoteListResultJson::from(&result);
    let value = serde_json::to_value(&json_result).unwrap();

    for note in value["notes"].as_array().unwrap() {
        assert!(note.get("gitlab_discussion_id").is_some(), "missing gitlab_discussion_id");
        assert!(note.get("gitlab_note_id").is_some(), "missing gitlab_note_id");
        assert!(note.get("gitlab_project_id").is_some(), "missing gitlab_project_id");
        assert!(note.get("project_path").is_some(), "missing project_path");
        assert!(note.get("parent_iid").is_some(), "missing parent_iid");
    }
}

#[test]
fn discussions_handler_json_includes_bridge_fields() {
    let conn = create_test_db();
    insert_project(&conn, 1);
    insert_mr(&conn, 1, 1, 99, "Test MR");
    insert_discussion(&conn, 1, "abc123", 1, None, Some(1), "MergeRequest");
    insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "hello");

    let result = query_discussions(&conn, &DiscussionListFilters::default_for_mr(99), &Config::default()).unwrap();
    let json_result = DiscussionListResultJson::from(&result);
    let value = serde_json::to_value(&json_result).unwrap();

    for disc in value["discussions"].as_array().unwrap() {
        assert!(disc.get("gitlab_discussion_id").is_some(), "missing gitlab_discussion_id");
        assert!(disc.get("gitlab_project_id").is_some(), "missing gitlab_project_id");
        assert!(disc.get("project_path").is_some(), "missing project_path");
        assert!(disc.get("parent_iid").is_some(), "missing parent_iid");
    }
}

Tests

Beyond the contract tests above, robot-docs changes are verified by running lore robot-docs and inspecting output.


Delivery Order

  1. Change 1 (notes output) — standalone, no dependencies. Can be released immediately.
  2. Change 2 (show output) — standalone, no dependencies. Can be released alongside 1.
  3. Change 3 (discussions command) — largest change, benefits from 1+2 being reviewed first to lock down field naming and serialization patterns.
  4. Change 4 (robot-docs + contract tests) — last, after all payloads are finalized.

Changes 1 and 2 can be done in parallel. Change 4 must come last since it documents the final schema of all preceding changes.

Cross-cutting: The Bridge Contract field guardrail (force-including bridge fields in robot mode) should be implemented as part of Change 1, since it modifies filter_fields in robot.rs which all subsequent changes depend on. The BRIDGE_FIELDS_* constants are defined once and reused by Changes 3 and 4.

Cross-cutting: The snapshot consistency pattern (deferred read transaction) should be implemented in Change 1 for query_notes and carried forward to Change 3 for query_discussions. This is a one-line wrapper that makes every query in a command read from the same database snapshot, at negligible performance cost.


Validation Criteria

After all changes:

  1. An agent can run lore -J notes --for-mr 3929 --contains "really do prefer" and get gitlab_discussion_id, gitlab_note_id, and gitlab_project_id in the response
  2. An agent can run lore -J discussions --for-mr 3929 --resolution unresolved to see all open threads with their IDs
  3. An agent can run lore -J mrs 3929 and see gitlab_discussion_id, resolvable, resolved, and last_note_at_iso on each discussion group, plus gitlab_note_id on each note within
  4. lore robot-docs lists actual field names for all commands
  5. All existing tests still pass
  6. No clippy warnings (pedantic + nursery)
  7. Robot-docs contract tests pass with field-set parity (not just string-contains), preventing future schema drift in both directions
  8. Bridge Contract fields (project_path, gitlab_project_id, noteable_type, parent_iid, gitlab_discussion_id, gitlab_note_id) are present in every applicable read payload
  9. Bridge Contract fields survive --fields filtering in robot mode (guardrail enforced)
  10. --gitlab-discussion-id filter works on both notes and discussions commands
  11. --include-notes N populates inline notes on discussions output via single batched query
  12. CLI-level contract integration tests verify bridge fields through the full handler path
  13. gitlab_note_id is available in notes list output (alongside gitlab_id for back-compat) and in show detail notes, providing a uniform field name across all commands
  14. Ambiguity guardrail fires when --gitlab-discussion-id matches multiple projects without --project specified — including when LIMIT would have hidden the ambiguity (preflight query runs before LIMIT)
  15. Output guardrails clamp --limit to 500 and --include-notes to 20; meta reports effective values and has_more truncation flag
  16. Discussion and show queries use deterministic ordering (COALESCE + id tiebreaker) to prevent unstable output during partial sync states
  17. Per-discussion truncation signals (included_note_count, has_more_notes) are accurate for --include-notes output
  18. Multi-query commands (query_notes, query_discussions) use deferred read transactions for snapshot consistency during concurrent ingest
  19. Discussion filters (resolution, noteable_type, sort, order) use typed enums with match-to-SQL mapping — no raw string interpolation in query construction
  20. First-note rollup correctly handles discussions with leading system notes — first_author and first_note_body_snippet always reflect the first non-system note
  21. Query plans for primary discussion query patterns (--for-mr, --project --since, --gitlab-discussion-id) have been validated via EXPLAIN QUERY PLAN; targeted indexes added only where scans were observed
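
Criteria 15 and 19 can be illustrated with a short sketch. It assumes the caps are plain constants and that each filter enum maps to a fixed SQL fragment. The type `ResolutionFilter`, its `Resolved` variant, the helper names, and the `d.resolvable` / `d.resolved` column references are all hypothetical, not taken from the codebase.

```rust
/// Caps from the validation criteria: --limit 500, --include-notes 20.
const MAX_LIMIT: usize = 500;
const MAX_INCLUDE_NOTES: usize = 20;

/// Hypothetical clamp helpers; the effective value is what meta would report.
fn clamp_limit(requested: usize) -> usize {
    requested.min(MAX_LIMIT)
}

fn clamp_include_notes(requested: usize) -> usize {
    requested.min(MAX_INCLUDE_NOTES)
}

/// Hypothetical typed filter. Only "unresolved" appears in the plan's
/// examples; the `Resolved` variant is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResolutionFilter {
    Unresolved,
    Resolved,
}

impl ResolutionFilter {
    /// Parses CLI input; anything outside the known set is rejected,
    /// so user text never reaches the SQL string.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "unresolved" => Some(Self::Unresolved),
            "resolved" => Some(Self::Resolved),
            _ => None,
        }
    }

    /// Maps each variant to a static SQL fragment (assumed column names).
    fn as_sql(self) -> &'static str {
        match self {
            Self::Unresolved => "d.resolvable = 1 AND d.resolved = 0",
            Self::Resolved => "d.resolvable = 1 AND d.resolved = 1",
        }
    }
}
```

Because `as_sql` returns only `&'static str` values chosen by `match`, the query builder has no path by which raw user input could be interpolated into the statement.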

Rejected Recommendations

  • Rename id → note_id and gitlab_id → gitlab_note_id in notes list output — rejected because every existing consumer (agents, scripts, field presets) uses id and gitlab_id. The fields are unambiguous within the notes context. The show-command note structs are a different story (they have no IDs at all), so we add gitlab_note_id there where it's genuinely missing. Renaming established fields is churn without proportional benefit. (Updated: we now ADD gitlab_note_id as an alias alongside gitlab_id per iteration 3 feedback.)
  • Keyset cursor-based pagination (--cursor flag) — rejected because no existing lore command has pagination, agents use --limit effectively, and adding a cursor mechanism is significant scope creep. Tracked as potential future work if agents hit real pagination needs.
  • Split note_count into user_note_count/total_note_count and rename first_author to first_user_author — rejected because note_count already excludes system notes by query design (the WHERE is_system = 0 / CASE WHEN filter), and first_author already targets the first non-system note. The current naming is clear and consistent with how notes --include-system works elsewhere.
  • Match path filter on both position_new_path and position_old_path — rejected because agents care about where code is now (new path), not where it was before a rename. Matching old paths adds complexity and returns confusing results for moved files.
  • Separate migration file for discussion-list indexes — rejected because this project uses a MIGRATIONS array in src/core/db.rs, not separate migration files. If profiling shows the new query needs indexes, they'll be added to the migration array in the standard way. Premature index creation without measurement is against project practice.
  • Shared contract model / workstream 0 (shared constants module) — rejected because 4 structs sharing field names in a codebase this size isn't drift-prone. We have compile-time contract tests (robot-docs assertions + handler-level JSON tests) that catch drift. A constants module for field name strings adds indirection without proportional gain. The Bridge Contract field guardrail (BRIDGE_FIELDS_* arrays in robot.rs) provides the centralized definition where it matters — at the filtering enforcement point.
  • Structured robot-docs schema (JSON objects instead of string blobs) — rejected because the current compact string format is intentionally token-efficient for agent consumption. Switching to nested JSON objects per field would significantly bloat robot-docs output. The string-based contract tests are sufficient — they test what agents actually parse. Agents already work with the inline field listing format used by issues and mrs.
  • bridge_contract meta-section in robot-docs output — rejected because agents don't need a separate meta-contract section; they need correct field listings per command, which we already provide. Adding a cross-cutting contract section to robot-docs adds documentation surface area without improving the agent workflow.
  • Performance regression benchmark test (ignored by default) — rejected because timing-based assertions are inherently flaky across machines, CI environments, and load conditions. Performance is validated through query plan analysis (EXPLAIN) and manual profiling, not hard-coded elapsed-time thresholds.
  • Make --discussion-id and --gitlab-discussion-id aliases for the same backing filter — rejected because they filter on different identifiers: --discussion-id takes the internal integer ID (existing behavior), while --gitlab-discussion-id takes the external string ID. These serve fundamentally different use cases (internal cross-referencing vs. external API bridging) and cannot be collapsed without breaking existing consumers.
  • --with-write-hints flag for inline glab endpoint templates — rejected because this couples lore's read surface to glab's API surface, violating the read/write split principle. The Bridge Contract gives agents the raw identifiers; constructing glab commands is the agent's responsibility. Adding endpoint templates would require lore to track glab API changes, creating an unnecessary maintenance burden.
  • Show-command note ordering change (ORDER BY COALESCE(position, ...), created_at, id) — rejected because show-command note ordering within a discussion thread is out of scope for this plan. The existing ordering works correctly for present data; the defensive COALESCE pattern is applied to discussion-level ordering where it matters for agent workflows.
  • Query-plan validation as a separate numbered workstream — rejected because it adds delivery overhead without proportional benefit. Query-plan validation is integrated into workstream 3 as a pre-merge validation step (section 3h), with candidate indexes listed but only added when EXPLAIN QUERY PLAN shows they're needed. This keeps the measured approach without inflating the workstream count.
  • Add gitlab_note_id to show-command note detail structs — rejected because show-command note detail structs already have gitlab_id (same value as id). The field is unambiguous and consistent with the Bridge Contract. Adding gitlab_note_id would create a duplicate and increase payload size without benefit.
  • Add gitlab_discussion_id to show-command discussion detail structs — rejected because show-command discussion detail structs already have gitlab_discussion_id. The field is unambiguous and consistent with the Bridge Contract. Adding gitlab_discussion_id would create a duplicate and increase payload size without benefit.
  • Add gitlab_project_id to show-command discussion detail structs — rejected because show-command discussion detail structs already have gitlab_project_id. The field is unambiguous and consistent with the Bridge Contract. Adding gitlab_project_id would create a duplicate and increase payload size without benefit.