Files
gitlore/docs/plan-expose-discussion-ids.md
teernisse 171260a772 feat(cli): implement 'lore trace' command (bd-2n4, bd-9dd)
Gate 5 Code Trace - Tier 1 (API-only, no git blame).
Answers 'Why was this code introduced?' by building
file -> MR -> issue -> discussion chains.

New files:
- src/core/trace.rs: run_trace() query logic with rename-aware
  path resolution, entity_reference-based issue linking, and
  DiffNote discussion extraction
- src/core/trace_tests.rs: 7 unit tests for query logic
- src/cli/commands/trace.rs: CLI command with human output,
  robot JSON output, and :line suffix parsing (5 tests)

Human output shows full content (no truncation).
Robot JSON truncates discussion bodies to 500 chars for token efficiency.

Wiring:
- TraceArgs + Commands::Trace in cli/mod.rs
- handle_trace in main.rs
- VALID_COMMANDS + robot-docs manifest entry
- COMMAND_FLAGS autocorrect registry entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 14:57:21 -05:00

1908 lines
72 KiB
Markdown

---
plan: true
title: ""
status: iterating
iteration: 4
target_iterations: 8
beads_revision: 0
related_plans: []
created: 2026-02-17
updated: 2026-02-17
---
# Plan: Expose Discussion IDs Across the Read Surface
**Problem**: Agents can't bridge from lore's read output to glab's write API because
`gitlab_discussion_id` — stored in the DB — is never exposed in any output. The read/write
split contract requires lore to emit every identifier an agent needs to construct a glab write
command.
**Scope**: Four workstreams, delivered in order:
1. Add `gitlab_discussion_id` to notes output
2. Add `gitlab_discussion_id` to show command discussion groups
3. Add a standalone `discussions` list command
4. Fix robot-docs to list actual field names instead of opaque type references
---
## Bridge Contract (Cross-Cutting)
Every read payload that surfaces notes or discussions **MUST** include:
- `project_path`
- `gitlab_project_id`
- `noteable_type`
- `parent_iid`
- `gitlab_discussion_id`
- `gitlab_note_id` (when note-level data is returned — i.e., in notes list and show detail)
**Rationale for `gitlab_project_id`**: `project_path` is human-readable but mutable — project
renames and group transfers change it. `gitlab_project_id` is immutable and is the identifier
agents need for `glab api /projects/:id/...` write calls. Including both ensures agents can
construct API calls without a separate project-ID lookup, even after path changes.
**Back-compat rule**: Note payloads in the `notes` list command continue exposing `gitlab_id`
for existing consumers, but **MUST also** expose `gitlab_note_id` with the same value. This
ensures agents can use a single field name (`gitlab_note_id`) across all commands — `notes`,
`show`, and `discussions --include-notes` — without special-casing by command.
This contract exists so agents can deterministically construct `glab api` write calls without
cross-referencing multiple commands. Each workstream below must satisfy these fields in its
output.
### Field Filtering Guardrail
In robot mode, `filter_fields` **MUST** force-include Bridge Contract fields even when the
caller passes a narrower `--fields` list. This prevents agents from accidentally stripping
the identifiers they need for write operations.
**Implementation**: Add a `BRIDGE_FIELDS` constant map per entity type. In `filter_fields()`,
when operating in robot mode, union the caller's requested fields with the bridge set before
filtering. Human/table mode keeps existing behavior (no forced fields).
```rust
// src/cli/robot.rs
const BRIDGE_FIELDS_NOTES: &[&str] = &[
"project_path", "gitlab_project_id", "noteable_type", "parent_iid",
"gitlab_discussion_id", "gitlab_note_id",
];
const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
"project_path", "gitlab_project_id", "noteable_type", "parent_iid",
"gitlab_discussion_id",
];
```
In `filter_fields`, when entity is `"notes"` or `"discussions"`, merge the bridge set into the
requested fields before filtering the JSON value. This is a ~5-line change to the existing
function.
### Ambiguity Guardrail
When filtering by `gitlab_discussion_id` (on either `notes` or `discussions` commands) without
`--project`, if the query matches discussions in multiple projects:
- Return an `Ambiguous` error (exit code 18, matching existing convention)
- Include matching project paths in the error message
- Suggest retry with `--project <path>`
**Implementation**: Run a **preflight distinct-project check** before the main list query
executes its `LIMIT`. This is critical because a post-query check on the paginated result set
can silently miss cross-project ambiguity when `LIMIT` truncates results to rows from a single
project. The preflight query is cheap (hits the `gitlab_discussion_id` index, returns at most
a few rows) and eliminates non-deterministic write-targeting risk.
```sql
-- Preflight ambiguity check (runs before main query)
SELECT DISTINCT p.path_with_namespace
FROM discussions d
JOIN projects p ON p.id = d.project_id
WHERE d.gitlab_discussion_id = ?
LIMIT 3
```
If more than one project is found, return `LoreError::Ambiguous` (exit code 18) with the
distinct project paths and suggestion to retry with `--project <path>`.
```rust
// In query_notes / query_discussions, before executing the main query:
if let Some(ref disc_id) = filters.gitlab_discussion_id {
if filters.project.is_none() {
let distinct_projects: Vec<String> = conn
.prepare(
"SELECT DISTINCT p.path_with_namespace \
FROM discussions d \
JOIN projects p ON p.id = d.project_id \
WHERE d.gitlab_discussion_id = ? \
LIMIT 3"
)?
.query_map([disc_id], |row| row.get(0))?
.collect::<std::result::Result<Vec<_>, _>>()?;
if distinct_projects.len() > 1 {
return Err(LoreError::Ambiguous {
message: format!(
"Discussion ID matches {} projects: {}. Use --project to disambiguate.",
distinct_projects.len(),
distinct_projects.join(", ")
),
});
}
}
}
```
#### 1h. Wrap `query_notes` in a read transaction
Wrap the count query and page query in a deferred read transaction per the Snapshot Consistency
cross-cutting requirement. See the Bridge Contract section for the pattern.
### Tests
**File**: `src/cli/commands/list_tests.rs`
#### Test 1: `gitlab_discussion_id` present in NoteListRowJson
```rust
#[test]
fn note_list_row_json_includes_gitlab_discussion_id() {
let row = NoteListRow {
id: 1,
gitlab_id: 100,
author_username: "alice".to_string(),
body: Some("test".to_string()),
note_type: Some("DiscussionNote".to_string()),
is_system: false,
created_at: 1_700_000_000_000,
updated_at: 1_700_000_000_000,
position_new_path: None,
position_new_line: None,
position_old_path: None,
position_old_line: None,
resolvable: false,
resolved: false,
resolved_by: None,
noteable_type: Some("MergeRequest".to_string()),
parent_iid: Some(42),
parent_title: Some("Fix bug".to_string()),
project_path: "group/project".to_string(),
gitlab_discussion_id: "6a9c1750b37d".to_string(),
gitlab_project_id: 789,
};
let json_row = NoteListRowJson::from(&row);
assert_eq!(json_row.gitlab_discussion_id, "6a9c1750b37d");
assert_eq!(json_row.gitlab_note_id, 100); // alias matches gitlab_id
assert_eq!(json_row.gitlab_project_id, 789);
let serialized = serde_json::to_value(&json_row).unwrap();
assert!(serialized.get("gitlab_discussion_id").is_some());
assert_eq!(
serialized["gitlab_discussion_id"].as_str().unwrap(),
"6a9c1750b37d"
);
// Both gitlab_id and gitlab_note_id present with same value
assert_eq!(serialized["gitlab_id"], 100);
assert_eq!(serialized["gitlab_note_id"], 100);
assert_eq!(serialized["gitlab_project_id"], 789);
}
```
#### Test 2: query_notes returns gitlab_discussion_id from DB
```rust
#[test]
fn query_notes_returns_gitlab_discussion_id() {
let conn = create_connection(Path::new(":memory:")).unwrap();
run_migrations(&conn).unwrap();
// Insert project, discussion, note
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url)
VALUES (1, 1, 'group/project', 'https://gitlab.com/group/project')",
[],
).unwrap();
conn.execute(
"INSERT INTO discussions (id, gitlab_discussion_id, project_id, issue_id, noteable_type, last_seen_at)
VALUES (1, 'abc123def456', 1, NULL, 'MergeRequest', 1000)",
[],
).unwrap();
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, author_username, source_branch, target_branch, created_at, updated_at, last_seen_at)
VALUES (1, 1, 1, 99, 'Test MR', 'opened', 'alice', 'feat', 'main', 1000, 1000, 1000)",
[],
).unwrap();
// Update discussion to reference MR
conn.execute(
"UPDATE discussions SET merge_request_id = 1 WHERE id = 1",
[],
).unwrap();
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, author_username, body, created_at, updated_at, last_seen_at, is_system, position)
VALUES (1, 500, 1, 1, 'bob', 'test note', 1000, 1000, 1000, 0, 0)",
[],
).unwrap();
let config = Config::default();
let filters = NoteListFilters {
limit: 10,
project: None,
author: None,
note_type: None,
include_system: false,
for_issue_iid: None,
for_mr_iid: None,
note_id: None,
gitlab_note_id: None,
discussion_id: None,
since: None,
until: None,
path: None,
contains: None,
resolution: None,
sort: "created".to_string(),
order: "desc".to_string(),
};
let result = query_notes(&conn, &filters, &config).unwrap();
assert_eq!(result.notes.len(), 1);
assert_eq!(result.notes[0].gitlab_discussion_id, "abc123def456");
}
```
#### Test 3: --fields filtering includes gitlab_discussion_id
```rust
#[test]
fn fields_filter_retains_gitlab_discussion_id() {
let mut value = serde_json::json!({
"data": {
"notes": [{
"id": 1,
"gitlab_id": 100,
"author_username": "alice",
"body": "test",
"gitlab_discussion_id": "abc123"
}]
}
});
filter_fields(
&mut value,
"notes",
&["id".to_string(), "gitlab_discussion_id".to_string()],
);
let note = &value["data"]["notes"][0];
assert_eq!(note["id"], 1);
assert_eq!(note["gitlab_discussion_id"], "abc123");
assert!(note.get("body").is_none());
}
```
#### Test 4: Bridge fields survive aggressive --fields filtering in robot mode
```rust
#[test]
fn bridge_fields_forced_in_robot_mode() {
// Agent requests only "body" — bridge fields must still appear
let mut value = serde_json::json!({
"data": {
"notes": [{
"id": 1,
"body": "test",
"project_path": "group/repo",
"gitlab_project_id": 42,
"noteable_type": "MergeRequest",
"parent_iid": 42,
"gitlab_discussion_id": "abc123",
"gitlab_note_id": 500
}]
}
});
// In robot mode, filter_fields merges bridge set
filter_fields_robot(
&mut value,
"notes",
&["body".to_string()],
);
let note = &value["data"]["notes"][0];
assert_eq!(note["body"], "test");
// Bridge fields survive despite not being requested:
assert!(note.get("project_path").is_some());
assert!(note.get("gitlab_project_id").is_some());
assert!(note.get("gitlab_discussion_id").is_some());
assert!(note.get("parent_iid").is_some());
}
```
#### Test 5: --gitlab-discussion-id filter returns matching notes
```rust
#[test]
fn notes_filter_by_gitlab_discussion_id() {
let conn = create_test_db();
// Insert 2 discussions with different gitlab_discussion_ids, each with notes
// Filter by one gitlab_discussion_id
// Assert only notes from matching discussion are returned
}
```
#### Test 6: Ambiguity guardrail fires for cross-project discussion ID matches
```rust
#[test]
fn notes_ambiguous_gitlab_discussion_id_across_projects() {
let conn = create_test_db();
// Insert 2 projects, each with a discussion sharing the same gitlab_discussion_id
// (this can happen since IDs are per-project)
// Filter by gitlab_discussion_id without --project
// Assert LoreError::Ambiguous is returned with both project paths
}
```
#### Test 7: Ambiguity preflight catches cross-project match even with LIMIT 1
```rust
#[test]
fn notes_ambiguity_preflight_not_defeated_by_limit() {
let conn = create_test_db();
// Insert 2 projects, each with a discussion sharing the same gitlab_discussion_id
// Use --limit 1, which would hide the second project in a post-query check
// Assert LoreError::Ambiguous is still returned (preflight runs before LIMIT)
}
```
---
## 2. Add `gitlab_discussion_id` to Show Command Discussion Groups
### Why
`lore -J issues 42` and `lore -J mrs 99` return discussions grouped by thread, but neither
`DiscussionDetailJson` nor `MrDiscussionDetailJson` includes the `gitlab_discussion_id`. An
agent viewing MR details can see all discussion threads but can't identify which one to reply to.
### Current Code
**Issue discussions** (`src/cli/commands/show.rs:99-102`):
```rust
pub struct DiscussionDetail {
pub notes: Vec<NoteDetail>,
pub individual_note: bool,
}
```
**MR discussions** (`src/cli/commands/show.rs:37-40`):
```rust
pub struct MrDiscussionDetail {
pub notes: Vec<MrNoteDetail>,
pub individual_note: bool,
}
```
**JSON equivalents** (`show.rs:1001-1003` and `show.rs:1100-1103`):
```rust
pub struct DiscussionDetailJson {
pub notes: Vec<NoteDetailJson>,
pub individual_note: bool,
}
pub struct MrDiscussionDetailJson {
pub notes: Vec<MrNoteDetailJson>,
pub individual_note: bool,
}
```
**Queries** (`show.rs:325-328` and `show.rs:537-540`):
```sql
SELECT id, individual_note FROM discussions WHERE issue_id = ? ORDER BY first_note_at
SELECT id, individual_note FROM discussions WHERE merge_request_id = ? ORDER BY first_note_at
```
### Changes Required
#### 2a. Add fields to domain structs
**File**: `src/cli/commands/show.rs`
```rust
pub struct DiscussionDetail {
pub gitlab_discussion_id: String, // ADD
pub resolvable: bool, // ADD — agents need thread state
pub resolved: bool, // ADD — agents need thread state
pub last_note_at: i64, // ADD — for recency sorting
pub notes: Vec<NoteDetail>,
pub individual_note: bool,
}
pub struct MrDiscussionDetail {
pub gitlab_discussion_id: String, // ADD
pub resolvable: bool, // ADD
pub resolved: bool, // ADD
pub last_note_at: i64, // ADD
pub notes: Vec<MrNoteDetail>,
pub individual_note: bool,
}
```
#### 2b. Add fields to JSON structs
```rust
pub struct DiscussionDetailJson {
pub gitlab_discussion_id: String, // ADD
pub resolvable: bool, // ADD
pub resolved: bool, // ADD
pub last_note_at_iso: String, // ADD — ISO formatted
pub notes: Vec<NoteDetailJson>,
pub individual_note: bool,
}
pub struct MrDiscussionDetailJson {
pub gitlab_discussion_id: String, // ADD
pub resolvable: bool, // ADD
pub resolved: bool, // ADD
pub last_note_at_iso: String, // ADD — ISO formatted
pub notes: Vec<MrNoteDetailJson>,
pub individual_note: bool,
}
```
#### 2c. Update queries to SELECT new fields
**Issue discussions** (`show.rs:325`):
```sql
SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved,
COALESCE(last_note_at, first_note_at, 0) AS last_note_at
FROM discussions
WHERE issue_id = ? ORDER BY COALESCE(first_note_at, last_note_at, 0), id
```
**MR discussions** (`show.rs:537`):
```sql
SELECT id, gitlab_discussion_id, individual_note, resolvable, resolved,
COALESCE(last_note_at, first_note_at, 0) AS last_note_at
FROM discussions
WHERE merge_request_id = ? ORDER BY COALESCE(first_note_at, last_note_at, 0), id
```
**Note on ordering**: The `COALESCE` with tiebreaker `id` ensures deterministic ordering even
when timestamps are NULL (possible during partial sync states). This prevents unstable output
that could confuse automated workflows.
#### 2d. Update query_map closures
The `disc_rows` tuple changes from `(i64, bool)` to a richer shape. Use named columns here
too for clarity:
Issue path (`show.rs:331-335`):
```rust
let disc_rows: Vec<(i64, String, bool, bool, bool, i64)> = disc_stmt
.query_map([issue_id], |row| {
Ok((
row.get("id")?,
row.get("gitlab_discussion_id")?,
row.get::<_, i64>("individual_note").map(|v| v == 1)?,
row.get::<_, i64>("resolvable").map(|v| v == 1)?,
row.get::<_, i64>("resolved").map(|v| v == 1)?,
row.get("last_note_at")?,
))
})?
.collect::<std::result::Result<Vec<_>, _>>()?;
```
And where discussions are constructed (`show.rs:361`):
```rust
for (disc_id, gitlab_disc_id, individual_note, resolvable, resolved, last_note_at) in disc_rows {
// ... existing note query ...
discussions.push(DiscussionDetail {
gitlab_discussion_id: gitlab_disc_id,
resolvable,
resolved,
last_note_at,
notes,
individual_note,
});
}
```
Same pattern for MR discussions (`show.rs:543-560`, `show.rs:598`).
#### 2e. Update From impls
```rust
impl From<&DiscussionDetail> for DiscussionDetailJson {
fn from(disc: &DiscussionDetail) -> Self {
Self {
gitlab_discussion_id: disc.gitlab_discussion_id.clone(),
resolvable: disc.resolvable,
resolved: disc.resolved,
last_note_at_iso: format_iso_timestamp(disc.last_note_at),
notes: disc.notes.iter().map(|n| n.into()).collect(),
individual_note: disc.individual_note,
}
}
}
impl From<&MrDiscussionDetail> for MrDiscussionDetailJson {
fn from(disc: &MrDiscussionDetail) -> Self {
Self {
gitlab_discussion_id: disc.gitlab_discussion_id.clone(),
resolvable: disc.resolvable,
resolved: disc.resolved,
last_note_at_iso: format_iso_timestamp(disc.last_note_at),
notes: disc.notes.iter().map(|n| n.into()).collect(),
individual_note: disc.individual_note,
}
}
}
```
#### 2f. Add `gitlab_note_id` to note detail structs in show
While we're here, add `gitlab_id` (as `gitlab_note_id` in JSON) to `NoteDetail`,
`MrNoteDetail`, and their JSON counterparts. Currently show-command notes only have
`author_username`, `body`, `created_at`, `is_system` — no note ID at all, making it impossible
to reference a specific note. This satisfies the Bridge Contract requirement for `gitlab_note_id`
on note-level data.
**Domain structs** — add `gitlab_id: i64` field.
**JSON structs** — add `gitlab_note_id: i64` field.
**Queries** — add `n.gitlab_id` to the note SELECT within show.
**From impls** — map `gitlab_id``gitlab_note_id`.
### Tests
**File**: `src/cli/commands/show_tests.rs` (or within show.rs `#[cfg(test)]`)
#### Test 1: Issue show includes gitlab_discussion_id
```rust
#[test]
fn show_issue_includes_gitlab_discussion_id() {
// Setup: project, issue, discussion with known gitlab_discussion_id, note
let conn = create_test_db();
insert_project(&conn, 1);
insert_issue(&conn, 1, 1, 42);
conn.execute(
"INSERT INTO discussions (id, gitlab_discussion_id, project_id, issue_id, noteable_type, last_seen_at)
VALUES (1, 'abc123hex', 1, 1, 'Issue', 1000)",
[],
).unwrap();
insert_note(&conn, 1, 500, 1, 1, "alice", "hello", false);
let detail = run_show_issue_with_conn(&conn, 42, None).unwrap();
assert_eq!(detail.discussions.len(), 1);
assert_eq!(detail.discussions[0].gitlab_discussion_id, "abc123hex");
}
```
#### Test 2: MR show includes gitlab_discussion_id
Same pattern for MR path.
#### Test 3: JSON serialization includes the field
```rust
#[test]
fn discussion_detail_json_has_gitlab_discussion_id() {
let detail = DiscussionDetail {
gitlab_discussion_id: "deadbeef".to_string(),
resolvable: true,
resolved: false,
last_note_at: 1_700_000_000_000,
notes: vec![],
individual_note: false,
};
let json = DiscussionDetailJson::from(&detail);
let value = serde_json::to_value(&json).unwrap();
assert_eq!(value["gitlab_discussion_id"], "deadbeef");
assert_eq!(value["resolvable"], true);
assert_eq!(value["resolved"], false);
assert!(value.get("last_note_at_iso").is_some());
}
```
#### Test 4: Show note includes gitlab_note_id
```rust
#[test]
fn show_note_detail_json_has_gitlab_note_id() {
// Verify NoteDetailJson serialization includes gitlab_note_id
}
```
---
## 3. Add Standalone `discussions` List Command
### Why
Discussions are a first-class entity in the DB (211K rows) but invisible to agents as a
collection. An agent working on an MR needs to see all threads at a glance — which are
unresolved, who started them, what file they're on — with the `gitlab_discussion_id` needed
to reply. Currently the only way is `lore notes --for-mr 99` which returns flat notes without
discussion grouping.
### Design
```
lore discussions [OPTIONS]
# List all discussions on MR 99 (most common agent use case)
lore -J discussions --for-mr 99
# List unresolved discussions on MR 99
lore -J discussions --for-mr 99 --resolution unresolved
# List discussions on issue 42
lore -J discussions --for-issue 42
# List discussions across a project
lore -J discussions -p group/repo --since 7d
# Look up a specific discussion by GitLab ID
lore -J discussions --gitlab-discussion-id 6a9c1750b37d
# List unresolved threads with latest 2 notes inline (fewer round-trips)
lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2
```
### Response Schema
```json
{
"ok": true,
"data": {
"discussions": [
{
"gitlab_discussion_id": "6a9c1750b37d513a...",
"noteable_type": "MergeRequest",
"parent_iid": 3929,
"parent_title": "Resolve \"Switch Health Card\"",
"project_path": "vs/typescript-code",
"gitlab_project_id": 42,
"individual_note": false,
"note_count": 3,
"first_author": "elovegrove",
"first_note_body_snippet": "Ok @teernisse well I really do prefer...",
"first_note_at_iso": "2026-02-16T14:31:34Z",
"last_note_at_iso": "2026-02-16T15:02:11Z",
"resolvable": true,
"resolved": false,
"position_new_path": "src/components/SwitchHealthCard.vue",
"position_new_line": 42,
"included_note_count": 0,
"has_more_notes": false,
"notes": []
}
],
"total_count": 15,
"showing": 15
},
"meta": {
"elapsed_ms": 12,
"effective_limit": 50,
"effective_include_notes": 0,
"has_more": false
}
}
```
The `notes` array is empty by default (zero overhead). When `--include-notes N` is provided,
each discussion includes up to N of its most recent notes inline. This covers the common
agent pattern of "show me unresolved threads with context" in a single round-trip.
The `included_note_count` and `has_more_notes` fields provide per-discussion truncation
signals. `included_note_count` is the number of notes actually included in the `notes` array,
and `has_more_notes` is true when `note_count > included_note_count`. This lets agents know
whether a thread's notes were fully returned or truncated, enabling them to decide whether a
follow-up `lore notes --gitlab-discussion-id <id>` call is needed for the complete thread.
The `meta` block includes `effective_limit` and `effective_include_notes` (the clamped values
actually used) plus `has_more` (true when total_count > showing). This lets agents detect
truncation and decide whether to paginate or narrow their query.
### File Architecture
**No new files.** Follow the existing pattern:
- Args struct: `src/cli/mod.rs` (alongside `NotesArgs`, `IssuesArgs`)
- Query + print functions: `src/cli/commands/list.rs` (alongside `query_notes`, `print_list_notes_json`)
- Handler: `src/main.rs` (alongside `handle_notes`)
- Tests: `src/cli/commands/list_tests.rs`
- Robot-docs: `src/main.rs` robot-docs JSON block
### Implementation Details
#### 3a. CLI Args
**File**: `src/cli/mod.rs`
Add variant to `Commands` enum (after `Notes`):
```rust
/// List discussions
#[command(visible_alias = "discussion")]
Discussions(DiscussionsArgs),
```
Args struct (with typed enums for filter/sort fields):
```rust
/// Resolution filter for discussion queries
#[derive(Clone, Debug, ValueEnum)]
pub enum ResolutionFilter {
Unresolved,
Resolved,
}
/// Noteable type filter
#[derive(Clone, Debug, ValueEnum)]
pub enum NoteableTypeFilter {
Issue,
MergeRequest,
}
/// Sort field for discussion queries
#[derive(Clone, Debug, ValueEnum)]
pub enum DiscussionSortField {
FirstNote,
LastNote,
}
/// Sort direction
#[derive(Clone, Debug, ValueEnum)]
pub enum SortDirection {
Asc,
Desc,
}
#[derive(Parser)]
pub struct DiscussionsArgs {
/// Maximum results (clamped to 500)
#[arg(short = 'n', long = "limit", default_value = "50", help_heading = "Output")]
pub limit: usize,
/// Select output fields (comma-separated, or 'minimal' preset)
#[arg(long, help_heading = "Output", value_delimiter = ',')]
pub fields: Option<Vec<String>>,
/// Output format (table, json, jsonl, csv)
#[arg(long, default_value = "table", value_parser = ["table", "json", "jsonl", "csv"], help_heading = "Output")]
pub format: String,
/// Filter to discussions on a specific issue IID
#[arg(long, conflicts_with = "for_mr", help_heading = "Filters")]
pub for_issue: Option<i64>,
/// Filter to discussions on a specific MR IID
#[arg(long, conflicts_with = "for_issue", help_heading = "Filters")]
pub for_mr: Option<i64>,
/// Filter by project path
#[arg(short = 'p', long, help_heading = "Filters")]
pub project: Option<String>,
/// Filter by GitLab discussion ID
#[arg(long, help_heading = "Filters")]
pub gitlab_discussion_id: Option<String>,
/// Filter by resolution status
#[arg(long, value_enum, help_heading = "Filters")]
pub resolution: Option<ResolutionFilter>,
/// Filter by time (7d, 2w, 1m, or YYYY-MM-DD)
#[arg(long, help_heading = "Filters")]
pub since: Option<String>,
/// Filter by file path (exact match or prefix with trailing /)
#[arg(long, help_heading = "Filters")]
pub path: Option<String>,
/// Filter by noteable type
#[arg(long, value_enum, help_heading = "Filters")]
pub noteable_type: Option<NoteableTypeFilter>,
/// Include up to N latest notes per discussion (0 = none, default; clamped to 20)
#[arg(long, default_value = "0", help_heading = "Output")]
pub include_notes: usize,
/// Sort field
#[arg(long, value_enum, default_value = "last-note", help_heading = "Sorting")]
pub sort: DiscussionSortField,
/// Sort direction (default: descending)
#[arg(long, value_enum, default_value = "desc", help_heading = "Sorting")]
pub order: SortDirection,
}
```
**Typed enum rationale**: Using `ValueEnum` for `resolution`, `noteable_type`, `sort`, and
`order` makes invalid states unrepresentable at parse time. clap handles validation and help
text automatically. In the query builder, `match` on the enum variant to produce the SQL
fragment — no string interpolation, no risk of SQL injection drift.
```rust
// Example: enum -> SQL fragment mapping
impl DiscussionSortField {
pub fn to_sql_column(&self) -> &'static str {
match self {
Self::FirstNote => "fd.first_note_at",
Self::LastNote => "fd.last_note_at",
}
}
}
impl SortDirection {
pub fn to_sql(&self) -> &'static str {
match self {
Self::Asc => "ASC",
Self::Desc => "DESC",
}
}
}
```
**Output guardrails**: The handler clamps `limit` to `min(limit, 500)` and `include_notes`
to `min(include_notes, 20)` before passing to the query layer. This prevents accidentally
huge payloads in robot mode. The clamped values are reported in `meta.effective_limit` and
`meta.effective_include_notes`.
#### 3b. Domain Structs
**File**: `src/cli/commands/list.rs`
```rust
#[derive(Debug)]
pub struct DiscussionListRow {
pub id: i64,
pub gitlab_discussion_id: String,
pub noteable_type: String,
pub parent_iid: Option<i64>,
pub parent_title: Option<String>,
pub project_path: String,
pub gitlab_project_id: i64,
pub individual_note: bool,
pub note_count: i64,
pub first_author: Option<String>,
pub first_note_body: Option<String>,
pub first_note_at: i64,
pub last_note_at: i64,
pub resolvable: bool,
pub resolved: bool,
pub position_new_path: Option<String>,
pub position_new_line: Option<i64>,
}
#[derive(Serialize)]
pub struct DiscussionListRowJson {
pub gitlab_discussion_id: String,
pub noteable_type: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub parent_iid: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub parent_title: Option<String>,
pub project_path: String,
pub gitlab_project_id: i64,
pub individual_note: bool,
pub note_count: i64,
#[serde(skip_serializing_if = "Option::is_none")]
pub first_author: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub first_note_body_snippet: Option<String>,
pub first_note_at_iso: String,
pub last_note_at_iso: String,
pub resolvable: bool,
pub resolved: bool,
#[serde(skip_serializing_if = "Option::is_none")]
pub position_new_path: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub position_new_line: Option<i64>,
pub included_note_count: usize,
pub has_more_notes: bool,
#[serde(skip_serializing_if = "Vec::is_empty")]
pub notes: Vec<NoteListRowJson>,
}
pub struct DiscussionListResult {
pub discussions: Vec<DiscussionListRow>,
pub total_count: i64,
}
#[derive(Serialize)]
pub struct DiscussionListResultJson {
pub discussions: Vec<DiscussionListRowJson>,
pub total_count: i64,
pub showing: usize,
}
```
The `From` impl truncates `first_note_body` to ~120 chars for the snippet.
The `notes` field on `DiscussionListRowJson` is populated only when `--include-notes N > 0`.
It reuses the existing `NoteListRowJson` struct for consistency — agents get the same note
shape whether they come from `notes`, `show`, or `discussions --include-notes`.
The `included_note_count` is set to `notes.len()` and `has_more_notes` is set to
`note_count > included_note_count` during the JSON conversion, providing per-discussion
truncation signals.
#### 3c. SQL Query
**File**: `src/cli/commands/list.rs`
```rust
pub fn query_discussions(
conn: &Connection,
filters: &DiscussionListFilters,
config: &Config,
) -> Result<DiscussionListResult> {
// Wrap all queries in a deferred read transaction for snapshot consistency
let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
// Preflight ambiguity check (if gitlab_discussion_id without project)
// ... see Ambiguity Guardrail section ...
// Main query + count query ...
// ... note expansion query (if include_notes > 0) ...
tx.commit()?;
```
Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated
subqueries. The `ROW_NUMBER()` approach produces a single scan over the notes table, which
is more predictable than repeated LIMIT 1 sub-selects at scale (200K+ discussions):
```sql
WITH filtered_discussions AS (
SELECT
d.id, d.gitlab_discussion_id, d.noteable_type, d.project_id,
d.issue_id, d.merge_request_id, d.individual_note,
d.first_note_at, d.last_note_at, d.resolvable, d.resolved
FROM discussions d
JOIN projects p ON d.project_id = p.id
{where_sql}
),
ranked_notes AS (
SELECT
n.discussion_id,
n.author_username,
n.body,
n.is_system,
n.position_new_path,
n.position_new_line,
ROW_NUMBER() OVER (
PARTITION BY n.discussion_id
ORDER BY CASE WHEN n.is_system = 0 THEN 0 ELSE 1 END,
n.created_at, n.id
) AS rn_first_note,
ROW_NUMBER() OVER (
PARTITION BY n.discussion_id
ORDER BY CASE WHEN n.position_new_path IS NULL THEN 1 ELSE 0 END,
n.created_at, n.id
) AS rn_first_position
FROM notes n
WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
),
note_rollup AS (
SELECT
discussion_id,
SUM(CASE WHEN is_system = 0 THEN 1 ELSE 0 END) AS note_count,
MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN author_username END) AS first_author,
MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN body END) AS first_note_body,
MAX(CASE WHEN rn_first_position = 1 THEN position_new_path END) AS position_new_path,
MAX(CASE WHEN rn_first_position = 1 THEN position_new_line END) AS position_new_line
FROM ranked_notes
GROUP BY discussion_id
)
SELECT
fd.id,
fd.gitlab_discussion_id,
fd.noteable_type,
COALESCE(i.iid, m.iid) AS parent_iid,
COALESCE(i.title, m.title) AS parent_title,
p.path_with_namespace AS project_path,
p.gitlab_project_id,
fd.individual_note,
COALESCE(nr.note_count, 0) AS note_count,
nr.first_author,
nr.first_note_body,
COALESCE(fd.first_note_at, fd.last_note_at, 0) AS first_note_at,
COALESCE(fd.last_note_at, fd.first_note_at, 0) AS last_note_at,
fd.resolvable,
fd.resolved,
nr.position_new_path,
nr.position_new_line
FROM filtered_discussions fd
JOIN projects p ON fd.project_id = p.id
LEFT JOIN issues i ON fd.issue_id = i.id
LEFT JOIN merge_requests m ON fd.merge_request_id = m.id
LEFT JOIN note_rollup nr ON nr.discussion_id = fd.id
ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
LIMIT ?
```
**Dual window function rationale**: The `ranked_notes` CTE uses two separate `ROW_NUMBER()`
windows: `rn_first_note` ranks non-system notes first (so the first human comment is always
rn=1), while `rn_first_position` ranks notes with file positions first (so the diff anchor
is always rn=1). This prevents a leading system note (e.g., "assigned to @alice") from
displacing the first human author/body, and prevents a non-positioned note from displacing
the file location. The `MAX(CASE WHEN rn_xxx = 1 ...)` pattern extracts the correct value
from each independently-ranked sequence.
**Performance rationale**: The CTE pre-filters discussions before joining notes. The
`ranked_notes` CTE uses `ROW_NUMBER()` (a single pass over the notes index) instead of
correlated `(SELECT ... LIMIT 1)` sub-selects per discussion. For MR-scoped queries
(50-200 discussions) the performance is equivalent. For project-wide scans with thousands
of discussions, the window function approach avoids repeated index probes and produces a
more predictable query plan.
**Note on ordering**: The `COALESCE({sort_column}, 0)` with tiebreaker `fd.id` ensures
deterministic ordering even when timestamps are NULL (partial sync states). The `id`
tiebreaker is cheap (primary key) and prevents unstable sort output.
**Note on SQLite FILTER syntax**: SQLite does not support `COUNT(*) FILTER (WHERE ...)`.
Use `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` instead (as shown above).
#### 3c-ii. Note expansion query (--include-notes)
When `include_notes > 0`, after the main discussion query, run a **single batched query**
using a window function to fetch the N most recent notes per discussion:
```sql
WITH ranked_expansion AS (
SELECT
n.id, n.gitlab_id, n.author_username, n.body, n.note_type,
n.is_system, n.created_at, n.updated_at,
n.position_new_path, n.position_new_line,
n.position_old_path, n.position_old_line,
n.resolvable, n.resolved, n.resolved_by,
d.noteable_type,
COALESCE(i.iid, m.iid) AS parent_iid,
COALESCE(i.title, m.title) AS parent_title,
p.path_with_namespace AS project_path,
p.gitlab_project_id,
d.gitlab_discussion_id,
n.discussion_id,
ROW_NUMBER() OVER (
PARTITION BY n.discussion_id
ORDER BY n.created_at DESC, n.id DESC
) AS rn
FROM notes n
JOIN discussions d ON n.discussion_id = d.id
JOIN projects p ON n.project_id = p.id
LEFT JOIN issues i ON d.issue_id = i.id
LEFT JOIN merge_requests m ON d.merge_request_id = m.id
WHERE n.discussion_id IN ({placeholders})
)
SELECT * FROM ranked_expansion WHERE rn <= ?
ORDER BY discussion_id, rn
```
Group by `discussion_id` in Rust and attach notes arrays to the corresponding
`DiscussionListRowJson`. Set `included_note_count = notes.len()` and
`has_more_notes = note_count > included_note_count` for each discussion. This avoids
per-discussion round-trips entirely — one query regardless of how many discussions are in the
result set.
The `{placeholders}` are the `id` values from the main discussion query result. Since
the discussion count is already clamped by `--limit` (max 500), the IN clause size is
bounded and safe.
The returned `NoteListRow` rows reuse the same struct and `NoteListRowJson` conversion from
workstream 1, ensuring identical note shape across all commands.
#### 3d. Filters struct
```rust
pub struct DiscussionListFilters {
pub limit: usize,
pub project: Option<String>,
pub for_issue_iid: Option<i64>,
pub for_mr_iid: Option<i64>,
pub gitlab_discussion_id: Option<String>,
pub resolution: Option<ResolutionFilter>,
pub since: Option<String>,
pub path: Option<String>,
pub noteable_type: Option<NoteableTypeFilter>,
pub sort: DiscussionSortField,
pub order: SortDirection,
pub include_notes: usize,
}
```
Where-clause construction uses `match` on typed enums — never raw string interpolation:
- `for_issue_iid` → subquery to resolve issue ID from IID + project
- `for_mr_iid` → subquery to resolve MR ID from IID + project
- `gitlab_discussion_id``d.gitlab_discussion_id = ?`
- `resolution` → match: `Unresolved``d.resolvable = 1 AND d.resolved = 0`, `Resolved``d.resolvable = 1 AND d.resolved = 1`
- `since``d.first_note_at >= ?` (using `parse_since()`)
- `path``EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.position_new_path LIKE ?)`
- `noteable_type` → match: `Issue``d.noteable_type = 'Issue'`, `MergeRequest``d.noteable_type = 'MergeRequest'`
#### 3e. Handler wiring
**File**: `src/main.rs`
Add match arm:
```rust
Some(Commands::Discussions(args)) => handle_discussions(cli.config.as_deref(), args, robot_mode),
```
Handler function:
```rust
fn handle_discussions(
config_override: Option<&str>,
args: DiscussionsArgs,
robot_mode: bool,
) -> Result<(), Box<dyn std::error::Error>> {
let start = std::time::Instant::now();
let config = Config::load(config_override)?;
let db_path = get_db_path(config.storage.db_path.as_deref());
let conn = create_connection(&db_path)?;
let effective_limit = args.limit.min(500);
let effective_include_notes = args.include_notes.min(20);
let filters = DiscussionListFilters {
limit: effective_limit,
project: args.project,
for_issue_iid: args.for_issue,
for_mr_iid: args.for_mr,
gitlab_discussion_id: args.gitlab_discussion_id,
resolution: args.resolution,
since: args.since,
path: args.path,
noteable_type: args.noteable_type,
sort: args.sort,
order: args.order,
include_notes: effective_include_notes,
};
let result = query_discussions(&conn, &filters, &config)?;
let format = if robot_mode && args.format == "table" {
"json"
} else {
&args.format
};
match format {
"json" => print_list_discussions_json(
&result,
start.elapsed().as_millis() as u64,
args.fields.as_deref(),
robot_mode,
effective_limit,
effective_include_notes,
),
"jsonl" => print_list_discussions_jsonl(&result),
"csv" => print_list_discussions_csv(&result),
_ => print_list_discussions(&result),
}
Ok(())
}
```
#### 3f. Print functions
**File**: `src/cli/commands/list.rs`
Follow same pattern as `print_list_notes_json`:
```rust
pub fn print_list_discussions_json(
result: &DiscussionListResult,
elapsed_ms: u64,
fields: Option<&[String]>,
robot_mode: bool,
effective_limit: usize,
effective_include_notes: usize,
) {
let json_result = DiscussionListResultJson::from(result);
let has_more = result.total_count as usize > json_result.showing;
let meta = serde_json::json!({
"elapsed_ms": elapsed_ms,
"effective_limit": effective_limit,
"effective_include_notes": effective_include_notes,
"has_more": has_more,
});
let output = serde_json::json!({
"ok": true,
"data": json_result,
"meta": meta,
});
let mut output = output;
if let Some(f) = fields {
let expanded = expand_fields_preset(f, "discussions");
if robot_mode {
filter_fields_robot(&mut output, "discussions", &expanded);
} else {
filter_fields(&mut output, "discussions", &expanded);
}
}
match serde_json::to_string(&output) {
Ok(json) => println!("{json}"),
Err(e) => eprintln!("Error serializing to JSON: {e}"),
}
}
```
Table view: compact format showing discussion_id (first 8 chars), first author, note count,
resolved status, path, snippet.
CSV view: all fields, following same pattern as `print_list_notes_csv`.
#### 3g. Fields preset
**File**: `src/cli/robot.rs`
```rust
"discussions" => [
"gitlab_discussion_id", "parent_iid", "note_count",
"resolvable", "resolved", "first_author", "gitlab_project_id"
]
.iter()
.map(|s| (*s).to_string())
.collect(),
```
#### 3h. Query-plan validation
Before merging the discussions command, capture `EXPLAIN QUERY PLAN` output for the three
primary query patterns:
- `--for-mr <iid> --resolution unresolved`
- `--project <path> --since 7d --sort last-note`
- `--gitlab-discussion-id <id>`
If plans show table scans on `notes` or `discussions` for these patterns, add targeted indexes
to the `MIGRATIONS` array in `src/core/db.rs`:
**Candidate indexes** (add only if EXPLAIN QUERY PLAN shows they're needed):
- `discussions(project_id, gitlab_discussion_id)` — for ambiguity preflight + direct ID lookup
- `discussions(merge_request_id, last_note_at, id)` — for MR-scoped + sorted queries
- `notes(discussion_id, created_at DESC, id DESC)` — for `--include-notes` expansion
- `notes(discussion_id, is_system, created_at, id)` — for ranked_notes CTE ordering
This is a measured approach: profile first, add indexes only where the query plan demands them.
No speculative index creation.
### Tests
#### Test 1: Basic query returns discussions with gitlab_discussion_id
```rust
#[test]
fn query_discussions_basic() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "hexhex123", 1, None, Some(1), "MergeRequest");
insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "first comment");
insert_note_in_discussion(&conn, 2, 501, 1, 1, "bob", "reply");
let filters = DiscussionListFilters::default_for_mr(99);
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.discussions.len(), 1);
assert_eq!(result.discussions[0].gitlab_discussion_id, "hexhex123");
assert_eq!(result.discussions[0].note_count, 2);
assert_eq!(result.discussions[0].first_author.as_deref(), Some("alice"));
}
```
#### Test 2: Resolution filter
```rust
#[test]
fn query_discussions_resolution_filter() {
let conn = create_test_db();
// Insert 2 discussions: one resolved, one unresolved
// ...
let filters = DiscussionListFilters {
resolution: Some(ResolutionFilter::Unresolved),
..DiscussionListFilters::default_for_mr(99)
};
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.total_count, 1);
assert!(!result.discussions[0].resolved);
}
```
#### Test 3: Path filter
```rust
#[test]
fn query_discussions_path_filter() {
// Insert discussions: one with diff notes on src/auth.rs, one general
// Filter by path "src/auth.rs"
// Assert only the diff note discussion is returned
}
```
#### Test 4: JSON serialization round-trip
```rust
#[test]
fn discussion_list_json_serialization() {
let row = DiscussionListRow {
id: 1,
gitlab_discussion_id: "abc123".to_string(),
noteable_type: "MergeRequest".to_string(),
parent_iid: Some(99),
parent_title: Some("Fix auth".to_string()),
project_path: "group/repo".to_string(),
gitlab_project_id: 42,
individual_note: false,
note_count: 3,
first_author: Some("alice".to_string()),
first_note_body: Some("This is a very long comment that should be truncated...".to_string()),
first_note_at: 1_700_000_000_000,
last_note_at: 1_700_001_000_000,
resolvable: true,
resolved: false,
position_new_path: Some("src/auth.rs".to_string()),
position_new_line: Some(42),
};
let json_row = DiscussionListRowJson::from(&row);
let value = serde_json::to_value(&json_row).unwrap();
assert_eq!(value["gitlab_discussion_id"], "abc123");
assert_eq!(value["gitlab_project_id"], 42);
assert_eq!(value["note_count"], 3);
assert!(value["first_note_body_snippet"].as_str().unwrap().len() <= 120);
assert_eq!(value["included_note_count"], 0);
assert_eq!(value["has_more_notes"], false);
}
```
#### Test 5: Fields filtering works for discussions
```rust
#[test]
fn discussions_fields_minimal_preset() {
let expanded = expand_fields_preset(&["minimal".to_string()], "discussions");
assert!(expanded.contains(&"gitlab_discussion_id".to_string()));
assert!(expanded.contains(&"parent_iid".to_string()));
}
```
#### Test 6: CTE query handles empty note_rollup gracefully
```rust
#[test]
fn query_discussions_with_no_notes() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
// Insert discussion with no notes (edge case: possible after sync issues)
insert_discussion(&conn, 1, "orphan123", 1, None, Some(1), "MergeRequest");
let filters = DiscussionListFilters::default_for_mr(99);
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.discussions.len(), 1);
assert_eq!(result.discussions[0].note_count, 0);
assert!(result.discussions[0].first_author.is_none());
}
```
#### Test 7: --gitlab-discussion-id filter returns exact match
```rust
#[test]
fn query_discussions_by_gitlab_id() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "target123", 1, None, Some(1), "MergeRequest");
insert_discussion(&conn, 2, "other456", 1, None, Some(1), "MergeRequest");
let filters = DiscussionListFilters {
gitlab_discussion_id: Some("target123".to_string()),
..DiscussionListFilters::default_for_mr(99)
};
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.discussions.len(), 1);
assert_eq!(result.discussions[0].gitlab_discussion_id, "target123");
}
```
#### Test 8: --include-notes populates notes array via batched query
```rust
#[test]
fn query_discussions_with_included_notes() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "disc123", 1, None, Some(1), "MergeRequest");
insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "first");
insert_note_in_discussion(&conn, 2, 501, 1, 1, "bob", "second");
insert_note_in_discussion(&conn, 3, 502, 1, 1, "carol", "third");
let filters = DiscussionListFilters {
include_notes: 2,
..DiscussionListFilters::default_for_mr(99)
};
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.discussions.len(), 1);
// Note: notes populated during JSON conversion, not in raw result
// Test at handler/print level for full integration
}
```
#### Test 9: Bridge fields survive --fields filtering in robot mode
```rust
#[test]
fn discussions_bridge_fields_forced_in_robot_mode() {
// Request only "note_count" — bridge fields must still appear
let mut value = serde_json::json!({
"data": {
"discussions": [{
"gitlab_discussion_id": "abc",
"noteable_type": "MergeRequest",
"parent_iid": 99,
"project_path": "group/repo",
"gitlab_project_id": 42,
"note_count": 3
}]
}
});
filter_fields_robot(
&mut value,
"discussions",
&["note_count".to_string()],
);
let disc = &value["data"]["discussions"][0];
assert_eq!(disc["note_count"], 3);
assert!(disc.get("gitlab_discussion_id").is_some());
assert!(disc.get("project_path").is_some());
assert!(disc.get("gitlab_project_id").is_some());
}
```
#### Test 10: Output guardrails clamp limit and include_notes
```rust
#[test]
fn discussions_output_guardrails() {
// Verify that limit > 500 is clamped to 500
// Verify that include_notes > 20 is clamped to 20
// These are handler-level tests (not query-level)
assert_eq!(1000_usize.min(500), 500);
assert_eq!(50_usize.min(20), 20);
assert_eq!(5_usize.min(20), 5); // below cap stays unchanged
}
```
#### Test 11: Ambiguity guardrail fires for cross-project discussion ID
```rust
#[test]
fn discussions_ambiguous_gitlab_discussion_id_across_projects() {
let conn = create_test_db();
insert_project(&conn, 1); // "group/repo-a"
insert_project(&conn, 2); // "group/repo-b"
// Insert discussions with same gitlab_discussion_id in different projects
insert_discussion(&conn, 1, "shared-id", 1, None, None, "Issue");
insert_discussion(&conn, 2, "shared-id", 2, None, None, "Issue");
let filters = DiscussionListFilters {
gitlab_discussion_id: Some("shared-id".to_string()),
project: None, // no project specified
..DiscussionListFilters::default()
};
let result = query_discussions(&conn, &filters, &Config::default());
assert!(result.is_err());
// Error should be Ambiguous with both project paths
}
```
#### Test 12: has_more metadata is accurate
```rust
#[test]
fn discussions_has_more_metadata() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
// Insert 5 discussions
for i in 1..=5 {
insert_discussion(&conn, i, &format!("disc-{i}"), 1, None, Some(1), "MergeRequest");
insert_note_in_discussion(&conn, i, 500 + i, i, 1, "alice", "note");
}
// Limit to 3 — should show has_more = true
let filters = DiscussionListFilters {
limit: 3,
..DiscussionListFilters::default_for_mr(99)
};
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.discussions.len(), 3);
assert_eq!(result.total_count, 5);
// has_more = total_count > showing = 5 > 3 = true
}
```
#### Test 13: Per-discussion truncation signals are accurate
```rust
#[test]
fn discussions_per_discussion_truncation_signals() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "disc123", 1, None, Some(1), "MergeRequest");
// Insert 5 notes
for i in 1..=5 {
insert_note_in_discussion(&conn, i, 500 + i, 1, 1, "alice", &format!("note {i}"));
}
// Request 2 notes — should show has_more_notes = true
let filters = DiscussionListFilters {
include_notes: 2,
..DiscussionListFilters::default_for_mr(99)
};
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
// Convert to JSON and verify truncation signals
let json_result = DiscussionListResultJson::from_with_notes(&result, 2);
assert_eq!(json_result.discussions[0].included_note_count, 2);
assert!(json_result.discussions[0].has_more_notes); // 5 notes > 2 included
}
```
#### Test 14: First-note rollup handles leading system notes correctly
```rust
#[test]
fn discussions_first_note_rollup_skips_system_notes() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "disc123", 1, None, Some(1), "MergeRequest");
// First note is system, second is human
insert_note_in_discussion_system(&conn, 1, 500, 1, 1, "system", "assigned to @alice", true);
insert_note_in_discussion(&conn, 2, 501, 1, 1, "bob", "actual first comment");
let filters = DiscussionListFilters::default_for_mr(99);
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.discussions[0].first_author.as_deref(), Some("bob"));
assert!(result.discussions[0].first_note_body.as_ref().unwrap().contains("actual first comment"));
}
```
---
## 4. Fix Robot-Docs Response Schemas
### Why
The notes command robot-docs says:
```json
"data": {"notes": "[NoteListRowJson]", "total_count": "int", "showing": "int"}
```
An agent sees `[NoteListRowJson]` — a Rust type name — and has no idea what fields are
available. Compare with the `issues` command which inline-lists every field. This forces
agents into trial-and-error field discovery.
### Changes Required
**File**: `src/main.rs`, robot-docs JSON block
#### 4a. Notes response_schema
Replace:
```json
"data": {"notes": "[NoteListRowJson]", "total_count": "int", "showing": "int"}
```
With:
```json
"data": {
"notes": "[{id:int, gitlab_id:int, gitlab_note_id:int, author_username:string, body:string?, note_type:string?, is_system:bool, created_at_iso:string, updated_at_iso:string, position_new_path:string?, position_new_line:int?, position_old_path:string?, position_old_line:int?, resolvable:bool, resolved:bool, resolved_by:string?, noteable_type:string?, parent_iid:int?, parent_title:string?, project_path:string, gitlab_project_id:int, gitlab_discussion_id:string}]",
"total_count": "int",
"showing": "int"
}
```
#### 4b. Add discussions response_schema
```json
"discussions": {
"description": "List discussions with thread-level metadata",
"flags": [
"--limit/-n <N>",
"--for-issue <iid>",
"--for-mr <iid>",
"-p/--project <path>",
"--gitlab-discussion-id <id>",
"--resolution <unresolved|resolved>",
"--since <period>",
"--path <filepath>",
"--noteable-type <Issue|MergeRequest>",
"--include-notes <N>",
"--sort <first-note|last-note>",
"--order <asc|desc>",
"--fields <list|minimal>",
"--format <table|json|jsonl|csv>"
],
"robot_flags": ["--format json", "--fields minimal"],
"example": "lore --robot discussions --for-mr 99 --resolution unresolved",
"response_schema": {
"ok": "bool",
"data": {
"discussions": "[{gitlab_discussion_id:string, noteable_type:string, parent_iid:int?, parent_title:string?, project_path:string, gitlab_project_id:int, individual_note:bool, note_count:int, first_author:string?, first_note_body_snippet:string?, first_note_at_iso:string, last_note_at_iso:string, resolvable:bool, resolved:bool, position_new_path:string?, position_new_line:int?, included_note_count:int, has_more_notes:bool, notes:[{...NoteListRowJson fields...}]?}]",
"total_count": "int",
"showing": "int"
},
"meta": {"elapsed_ms": "int", "effective_limit": "int", "effective_include_notes": "int", "has_more": "bool"}
}
}
```
#### 4c. Add to glab_equivalents
```json
{
"glab": "glab api /projects/:id/merge_requests/:iid/discussions",
"lore": "lore -J discussions --for-mr <iid>",
"note": "Includes note counts, first author, resolution status, file positions"
}
```
#### 4d. Update show response_schema
Update the `issues` and `mrs` show schemas to reflect that `discussions` now include
`gitlab_discussion_id`, `resolvable`, `resolved`, and `last_note_at_iso`. Also reflect that
notes within show discussions now include `gitlab_note_id`.
#### 4e. Add to lore_exclusive list
```json
"discussions: Thread-level discussion listing with gitlab_discussion_id for API integration"
```
#### 4f. Add robot-docs contract tests (field-set parity)
**File**: `src/main.rs` (within `#[cfg(test)]` module)
Add tests that parse the robot-docs JSON output and compare declared fields against actual
serialized struct fields. This is stronger than string-contains checks — it catches schema
drift in both directions (field added to struct but not docs, or field listed in docs but
removed from struct).
```rust
/// Parse compact schema string "field1:type, field2:type?" into a set of field names
fn parse_schema_fields(schema: &str) -> HashSet<String> {
// Strip leading "[{" and trailing "}]", split on ", ", extract field name before ":"
schema.trim_start_matches("[{").trim_end_matches("}]")
.split(", ")
.filter_map(|f| f.split(':').next())
.map(|f| f.to_string())
.collect()
}
/// Get the actual serialized field names from a sample JSON struct
fn sample_note_json_keys() -> HashSet<String> {
let row = NoteListRow { /* ... test defaults ... */ };
let json = NoteListRowJson::from(&row);
let value = serde_json::to_value(&json).unwrap();
value.as_object().unwrap().keys().cloned().collect()
}
#[test]
fn robot_docs_notes_schema_matches_actual_fields() {
let docs = get_robot_docs_json();
let notes_schema = docs["commands"]["notes"]["response_schema"]["data"]["notes"]
.as_str().unwrap();
let declared = parse_schema_fields(notes_schema);
let actual = sample_note_json_keys();
// All bridge fields must be in both declared and actual
for bridge in &["gitlab_discussion_id", "gitlab_project_id", "project_path", "parent_iid", "gitlab_note_id"] {
assert!(declared.contains(*bridge), "robot-docs missing bridge field: {bridge}");
assert!(actual.contains(*bridge), "NoteListRowJson missing bridge field: {bridge}");
}
// Every declared field should exist in the actual struct (no phantom docs)
for field in &declared {
assert!(actual.contains(field),
"robot-docs declares '{field}' but NoteListRowJson doesn't serialize it");
}
// Every actual field should be declared in docs (no undocumented fields)
for field in &actual {
assert!(declared.contains(field),
"NoteListRowJson serializes '{field}' but robot-docs doesn't declare it");
}
}
#[test]
fn robot_docs_discussions_schema_matches_actual_fields() {
let docs = get_robot_docs_json();
let disc_schema = docs["commands"]["discussions"]["response_schema"]["data"]["discussions"]
.as_str().unwrap();
let declared = parse_schema_fields(disc_schema);
let actual = sample_discussion_json_keys();
for bridge in &["gitlab_discussion_id", "gitlab_project_id", "project_path", "parent_iid"] {
assert!(declared.contains(*bridge), "robot-docs missing bridge field: {bridge}");
assert!(actual.contains(*bridge), "DiscussionListRowJson missing bridge field: {bridge}");
}
for field in &declared {
assert!(actual.contains(field),
"robot-docs declares '{field}' but DiscussionListRowJson doesn't serialize it");
}
for field in &actual {
assert!(declared.contains(field),
"DiscussionListRowJson serializes '{field}' but robot-docs doesn't declare it");
}
}
#[test]
fn robot_docs_show_schema_includes_discussion_id() {
let docs = get_robot_docs_json();
// Verify issues and mrs show schemas reference gitlab_discussion_id
// in their discussion sub-schemas
}
```
#### 4g. Add CLI-level contract integration tests
**File**: `src/cli/commands/list_tests.rs` or `src/main.rs` `#[cfg(test)]`
Add handler-level tests that invoke the command handlers with an in-memory DB and parse the
JSON output, asserting Bridge Contract fields are present. These are stronger than unit tests
on structs because they exercise the full path from query through serialization.
```rust
#[test]
fn notes_handler_json_includes_bridge_fields() {
// Setup in-memory DB with project, discussion, note
// Capture stdout from handle_notes (or call query_notes + print_list_notes_json)
// Parse JSON, assert bridge fields present on every note
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "abc123", 1, None, Some(1), "MergeRequest");
insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "hello");
let result = query_notes(&conn, &NoteListFilters::default_for_mr(99), &Config::default()).unwrap();
let json_result = NoteListResultJson::from(&result);
let value = serde_json::to_value(&json_result).unwrap();
for note in value["notes"].as_array().unwrap() {
assert!(note.get("gitlab_discussion_id").is_some(), "missing gitlab_discussion_id");
assert!(note.get("gitlab_note_id").is_some(), "missing gitlab_note_id");
assert!(note.get("gitlab_project_id").is_some(), "missing gitlab_project_id");
assert!(note.get("project_path").is_some(), "missing project_path");
assert!(note.get("parent_iid").is_some(), "missing parent_iid");
}
}
#[test]
fn discussions_handler_json_includes_bridge_fields() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "abc123", 1, None, Some(1), "MergeRequest");
insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "hello");
let result = query_discussions(&conn, &DiscussionListFilters::default_for_mr(99), &Config::default()).unwrap();
let json_result = DiscussionListResultJson::from(&result);
let value = serde_json::to_value(&json_result).unwrap();
for disc in value["discussions"].as_array().unwrap() {
assert!(disc.get("gitlab_discussion_id").is_some(), "missing gitlab_discussion_id");
assert!(disc.get("gitlab_project_id").is_some(), "missing gitlab_project_id");
assert!(disc.get("project_path").is_some(), "missing project_path");
assert!(disc.get("parent_iid").is_some(), "missing parent_iid");
}
}
```
### Tests
Beyond the contract tests above, robot-docs changes are verified by running
`lore robot-docs` and inspecting output.
---
## Delivery Order
1. **Change 1** (notes output) — standalone, no dependencies. Can be released immediately.
2. **Change 2** (show output) — standalone, no dependencies. Can be released alongside 1.
3. **Change 3** (discussions command) — largest change, benefits from 1+2 being reviewed first
to lock down field naming and serialization patterns.
4. **Change 4** (robot-docs + contract tests) — last, after all payloads are finalized.
Changes 1 and 2 can be done in parallel. Change 4 must come last since it documents the
final schema of all preceding changes.
**Cross-cutting**: The Bridge Contract field guardrail (force-including bridge fields in robot
mode) should be implemented as part of Change 1, since it modifies `filter_fields` in
`robot.rs` which all subsequent changes depend on. The `BRIDGE_FIELDS_*` constants are defined
once and reused by Changes 3 and 4.
**Cross-cutting**: The snapshot consistency pattern (deferred read transaction) should be
implemented in Change 1 for `query_notes` and carried forward to Change 3 for
`query_discussions`. This is a one-line wrapper that provides correctness guarantees with
zero performance cost.
---
## Validation Criteria
After all changes:
1. An agent can run `lore -J notes --for-mr 3929 --contains "really do prefer"` and get
`gitlab_discussion_id`, `gitlab_note_id`, and `gitlab_project_id` in the response
2. An agent can run `lore -J discussions --for-mr 3929 --resolution unresolved` to see all
open threads with their IDs
3. An agent can run `lore -J mrs 3929` and see `gitlab_discussion_id`, `resolvable`,
`resolved`, and `last_note_at_iso` on each discussion group, plus `gitlab_note_id` on
each note within
4. `lore robot-docs` lists actual field names for all commands
5. All existing tests still pass
6. No clippy warnings (pedantic + nursery)
7. Robot-docs contract tests pass with field-set parity (not just string-contains), preventing
future schema drift in both directions
8. Bridge Contract fields (`project_path`, `gitlab_project_id`, `noteable_type`, `parent_iid`,
`gitlab_discussion_id`, `gitlab_note_id`) are present in every applicable read payload
9. Bridge Contract fields survive `--fields` filtering in robot mode (guardrail enforced)
10. `--gitlab-discussion-id` filter works on both `notes` and `discussions` commands
11. `--include-notes N` populates inline notes on `discussions` output via single batched query
12. CLI-level contract integration tests verify bridge fields through the full handler path
13. `gitlab_note_id` is available in notes list output (alongside `gitlab_id` for back-compat)
and in show detail notes, providing a uniform field name across all commands
14. Ambiguity guardrail fires when `--gitlab-discussion-id` matches multiple projects without
`--project` specified — **including when LIMIT would have hidden the ambiguity** (preflight
query runs before LIMIT)
15. Output guardrails clamp `--limit` to 500 and `--include-notes` to 20; `meta` reports
effective values and `has_more` truncation flag
16. Discussion and show queries use deterministic ordering (COALESCE + id tiebreaker) to
prevent unstable output during partial sync states
17. Per-discussion truncation signals (`included_note_count`, `has_more_notes`) are accurate
for `--include-notes` output
18. Multi-query commands (`query_notes`, `query_discussions`) use deferred read transactions
for snapshot consistency during concurrent ingest
19. Discussion filters (`resolution`, `noteable_type`, `sort`, `order`) use typed enums
with match-to-SQL mapping — no raw string interpolation in query construction
20. First-note rollup correctly handles discussions with leading system notes — `first_author`
and `first_note_body_snippet` always reflect the first non-system note
21. Query plans for primary discussion query patterns (`--for-mr`, `--project --since`,
`--gitlab-discussion-id`) have been validated via EXPLAIN QUERY PLAN; targeted indexes
added only where scans were observed
---
## Rejected Recommendations
- **Rename `id``note_id` and `gitlab_id``gitlab_note_id` in notes list output** — rejected because every existing consumer (agents, scripts, field presets) uses `id` and `gitlab_id`. The fields are unambiguous within the `notes` context. The show-command note structs are a different story (they have no IDs at all), so we add `gitlab_note_id` there where it's genuinely missing. Renaming established fields is churn without proportional benefit. (Updated: we now ADD `gitlab_note_id` as an alias alongside `gitlab_id` per iteration 3 feedback.)
- **Keyset cursor-based pagination (`--cursor` flag)** — rejected because no existing lore command has pagination, agents use `--limit` effectively, and adding a cursor mechanism is significant scope creep. Tracked as potential future work if agents hit real pagination needs.
- **Split `note_count` into `user_note_count`/`total_note_count` and rename `first_author` to `first_user_author`** — rejected because `note_count` already excludes system notes by query design (the `WHERE is_system = 0` / `CASE WHEN` filter), and `first_author` already targets the first non-system note. The current naming is clear and consistent with how `notes --include-system` works elsewhere.
- **Match path filter on both `position_new_path` and `position_old_path`** — rejected because agents care about where code is *now* (new path), not where it was before a rename. Matching old paths adds complexity and returns confusing results for moved files.
- **Separate migration file for discussion-list indexes** — rejected because this project uses a `MIGRATIONS` array in `src/core/db.rs`, not separate migration files. If profiling shows the new query needs indexes, they'll be added to the migration array in the standard way. Premature index creation without measurement is against project practice.
- **Shared contract model / workstream 0 (shared constants module)** — rejected because 4 structs sharing field names in a codebase this size isn't drift-prone. We have compile-time contract tests (robot-docs assertions + handler-level JSON tests) that catch drift. A constants module for field name strings adds indirection without proportional gain. The Bridge Contract field guardrail (`BRIDGE_FIELDS_*` arrays in robot.rs) provides the centralized definition where it matters — at the filtering enforcement point.
- **Structured robot-docs schema (JSON objects instead of string blobs)** — rejected because the current compact string format is intentionally token-efficient for agent consumption. Switching to nested JSON objects per field would significantly bloat robot-docs output. The string-based contract tests are sufficient — they test what agents actually parse. Agents already work with the inline field listing format used by `issues` and `mrs`.
- **`bridge_contract` meta-section in robot-docs output** — rejected because agents don't need a separate meta-contract section; they need correct field listings per command, which we already provide. Adding a cross-cutting contract section to robot-docs adds documentation surface area without improving the agent workflow.
- **Performance regression benchmark test (ignored by default)** — rejected because timing-based assertions are inherently flaky across machines, CI environments, and load conditions. Performance is validated through query plan analysis (EXPLAIN) and manual profiling, not hard-coded elapsed-time thresholds.
- **Make `--discussion-id` and `--gitlab-discussion-id` aliases for the same backing filter** — rejected because they filter on different identifiers: `--discussion-id` takes the internal integer ID (existing behavior), while `--gitlab-discussion-id` takes the external string ID. These serve fundamentally different use cases (internal cross-referencing vs. external API bridging) and cannot be collapsed without breaking existing consumers.
- **`--with-write-hints` flag for inline glab endpoint templates** — rejected because this couples lore's read surface to glab's API surface, violating the read/write split principle. The Bridge Contract gives agents the raw identifiers; constructing glab commands is the agent's responsibility. Adding endpoint templates would require lore to track glab API changes, creating an unnecessary maintenance burden.
- **Show-command note ordering change (`ORDER BY COALESCE(position, ...), created_at, id`)** — rejected because show-command note ordering within a discussion thread is out of scope for this plan. The existing ordering works correctly for present data; the defensive COALESCE pattern is applied to discussion-level ordering where it matters for agent workflows.
- **Query-plan validation as a separate numbered workstream** — rejected because it adds delivery overhead without proportional benefit. Query-plan validation is integrated into workstream 3 as a pre-merge validation step (section 3h), with candidate indexes listed but only added when EXPLAIN QUERY PLAN shows they're needed. This keeps the measured approach without inflating the workstream count.
- **Add `gitlab_note_id` to show-command note detail structs** — rejected because show-command note detail structs already have `gitlab_id` (same value as `id`). The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_note_id` would create a duplicate and increase payload size without benefit.
- **Add `gitlab_discussion_id` to show-command discussion detail structs** — rejected because show-command discussion detail structs already have `gitlab_discussion_id`. The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_discussion_id` would create a duplicate and increase payload size without benefit.
- **Add `gitlab_project_id` to show-command discussion detail structs** — rejected because show-command discussion detail structs already have `gitlab_project_id`. The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_project_id` would create a duplicate and increase payload size without benefit.