1 Commit

teernisse
171260a772 feat(cli): implement 'lore trace' command (bd-2n4, bd-9dd)
Gate 5 Code Trace - Tier 1 (API-only, no git blame).
Answers 'Why was this code introduced?' by building
file -> MR -> issue -> discussion chains.

New files:
- src/core/trace.rs: run_trace() query logic with rename-aware
  path resolution, entity_reference-based issue linking, and
  DiffNote discussion extraction
- src/core/trace_tests.rs: 7 unit tests for query logic
- src/cli/commands/trace.rs: CLI command with human output,
  robot JSON output, and :line suffix parsing (5 tests)

Human output shows full content (no truncation).
Robot JSON truncates discussion bodies to 500 chars for token efficiency.

Wiring:
- TraceArgs + Commands::Trace in cli/mod.rs
- handle_trace in main.rs
- VALID_COMMANDS + robot-docs manifest entry
- COMMAND_FLAGS autocorrect registry entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 14:57:21 -05:00
13 changed files with 1807 additions and 283 deletions

File diff suppressed because one or more lines are too long

View File

@@ -1 +1 @@
-bd-z94
+bd-1sc6

View File

@@ -0,0 +1,147 @@
1. **Make `gitlab_note_id` explicit in all note-level payloads without breaking existing consumers**
Rationale: Your Bridge Contract already requires `gitlab_note_id`, but the current plan keeps `gitlab_id` only in the `notes` list while adding `gitlab_note_id` only in `show`. That forces agents to special-case commands. Add `gitlab_note_id` as an alias field everywhere note-level data appears, while keeping `gitlab_id` for compatibility.
```diff
@@ Bridge Contract (Cross-Cutting)
Every read payload that surfaces notes or discussions MUST include:
- project_path
- noteable_type
- parent_iid
- gitlab_discussion_id
- gitlab_note_id (when note-level data is returned — i.e., in notes list and show detail)
+ - Back-compat rule: note payloads may continue exposing `gitlab_id`, but MUST also expose `gitlab_note_id` with the same value.
@@ 1. Add `gitlab_discussion_id` to Notes Output
-#### 1c. Add field to `NoteListRowJson`
+#### 1c. Add fields to `NoteListRowJson`
+Add `gitlab_note_id` alias in addition to existing `gitlab_id` (no rename, no breakage).
@@ 1f. Update `--fields minimal` preset
-"notes" => ["id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
+"notes" => ["id", "gitlab_note_id", "author_username", "body", "created_at_iso", "gitlab_discussion_id"]
```
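A minimal sketch of the alias emission on the Rust side; the struct shape and field set here are illustrative, not the actual `NoteListRowJson` definition:
```rust
use serde::Serialize;

/// Illustrative row shape: `gitlab_note_id` mirrors `gitlab_id`, so old
/// consumers keep working while new agents read the contract field.
#[derive(Serialize)]
struct NoteRow {
    gitlab_id: i64,
    gitlab_note_id: i64, // always set to the same value as gitlab_id
    body: String,
}

fn to_row(gitlab_id: i64, body: String) -> NoteRow {
    NoteRow { gitlab_id, gitlab_note_id: gitlab_id, body }
}
```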
2. **Avoid duplicate flag semantics for discussion filtering**
Rationale: `notes` already has `--discussion-id`, and it already maps to `d.gitlab_discussion_id`. Adding a second independent flag/field (`--gitlab-discussion-id`) adds complexity and invites precedence bugs. Keep one backing filter field and make the new flag an alias.
```diff
@@ 1g. Add `--gitlab-discussion-id` filter to notes
-Allow filtering notes directly by GitLab discussion thread ID...
+Normalize discussion ID flags:
+- Keep one backing filter field (`discussion_id`)
+- Support both `--discussion-id` (existing) and `--gitlab-discussion-id` (alias)
+- If both are provided, clap should reject as duplicate/alias conflict
```
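In clap, a hidden alias covers this with one backing field; a sketch assuming clap 4's derive API (flag and struct names are placeholders):
```rust
use clap::Parser;

#[derive(Parser)]
struct NotesArgs {
    /// One backing field; `--gitlab-discussion-id` is an alias, so
    /// passing both spellings at once trips clap's built-in
    /// duplicate-use error.
    #[arg(long = "discussion-id", alias = "gitlab-discussion-id")]
    discussion_id: Option<String>,
}

fn main() {
    let args = NotesArgs::parse_from(["notes", "--gitlab-discussion-id", "abc123"]);
    assert_eq!(args.discussion_id.as_deref(), Some("abc123"));
}
```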
3. **Add ambiguity guardrails for cross-project discussion IDs**
Rationale: `gitlab_discussion_id` is unique per project, not globally. Filtering by discussion ID without project can return multiple rows across repos, which breaks deterministic write bridging. Fail fast with an `Ambiguous` error and actionable fix (`--project`).
```diff
@@ Bridge Contract (Cross-Cutting)
+### Ambiguity Guardrail
+When filtering by `gitlab_discussion_id` without `--project`, if multiple projects match:
+- return `Ambiguous` error
+- include matching project paths in message
+- suggest retry with `--project <path>`
```
4. **Replace `--include-notes` N+1 retrieval with one batched top-N query**
Rationale: The current plan's per-discussion follow-up query scales poorly and creates latency spikes. Use a single window-function query over the selected discussion IDs and group rows in Rust. This is both faster and more predictable.
```diff
@@ 3c-ii. Note expansion query (--include-notes)
-When `include_notes > 0`, after the main discussion query, run a follow-up query per discussion...
+When `include_notes > 0`, run one batched query:
+WITH ranked_notes AS (
+ SELECT
+ n.*,
+ d.gitlab_discussion_id,
+ ROW_NUMBER() OVER (
+ PARTITION BY n.discussion_id
+ ORDER BY n.created_at DESC, n.id DESC
+ ) AS rn
+ FROM notes n
+ JOIN discussions d ON d.id = n.discussion_id
+ WHERE n.discussion_id IN ( ...selected discussion ids... )
+)
+SELECT ... FROM ranked_notes WHERE rn <= ?
+ORDER BY discussion_id, rn;
+
+Group by `discussion_id` in Rust and attach notes arrays without per-thread round-trips.
```
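The Rust-side grouping step is a single pass over the batched result; a sketch with an assumed flattened row type:
```rust
use std::collections::BTreeMap;

// Hypothetical flattened row from the ranked_notes query.
struct RankedNote {
    discussion_id: i64,
    body: String,
}

/// Bucket rows by discussion id in one pass; no per-thread round-trips.
fn group_notes(rows: Vec<RankedNote>) -> BTreeMap<i64, Vec<RankedNote>> {
    let mut grouped: BTreeMap<i64, Vec<RankedNote>> = BTreeMap::new();
    for row in rows {
        grouped.entry(row.discussion_id).or_default().push(row);
    }
    grouped
}
```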
5. **Add hard output guardrails and explicit truncation metadata**
Rationale: `--limit` and `--include-notes` are unbounded today. For robot workflows this can accidentally generate huge payloads. Cap values and surface effective limits plus truncation state in `meta`.
```diff
@@ 3a. CLI Args
- pub limit: usize,
+ pub limit: usize, // clamp to max (e.g., 500)
- pub include_notes: usize,
+ pub include_notes: usize, // clamp to max (e.g., 20)
@@ Response Schema
- "meta": { "elapsed_ms": 12 }
+ "meta": {
+ "elapsed_ms": 12,
+ "effective_limit": 50,
+ "effective_include_notes": 2,
+ "has_more": true
+ }
```
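The clamp itself is one line per knob; a sketch (cap values illustrative):
```rust
const MAX_LIMIT: usize = 500;
const MAX_INCLUDE_NOTES: usize = 20;

/// Clamp requested values and return what was actually applied, so
/// `meta.effective_*` never disagrees with the query that ran.
fn effective_limits(limit: usize, include_notes: usize) -> (usize, usize) {
    (limit.min(MAX_LIMIT), include_notes.min(MAX_INCLUDE_NOTES))
}
```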
6. **Strengthen deterministic ordering and null handling**
Rationale: `first_note_at`, `last_note_at`, and note `position` can be null/incomplete during partial sync states. Add null-safe ordering to avoid unstable output and flaky automation.
```diff
@@ 2c. Update queries to SELECT new fields
-... ORDER BY first_note_at
+... ORDER BY COALESCE(first_note_at, last_note_at, 0), id
@@ show note query
-ORDER BY position
+ORDER BY COALESCE(position, 9223372036854775807), created_at, id
@@ 3c. SQL Query
-ORDER BY {sort_column} {order}
+ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
```
7. **Make write-bridging more useful with optional command hints**
Rationale: Exposing IDs is necessary but not sufficient; agents still need to assemble endpoints repeatedly. Add optional `--with-write-hints` that injects compact endpoint templates (`reply`, `resolve`) derived from row context. This improves usability without bloating default output.
```diff
@@ 3a. CLI Args
+ /// Include machine-actionable glab write hints per row
+ #[arg(long, help_heading = "Output")]
+ pub with_write_hints: bool,
@@ Response Schema (notes/discussions/show)
+ "write_hints?": {
+ "reply_endpoint": "string",
+ "resolve_endpoint?": "string"
+ }
```
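A sketch of hint assembly from row context; the endpoint shape follows GitLab's discussions API, but the helper and its call sites are hypothetical:
```rust
/// Build a reply-endpoint template for one row. Using the numeric
/// project id sidesteps URL-escaping the project path.
fn reply_endpoint(gitlab_project_id: i64, mr_iid: i64, discussion_id: &str) -> String {
    format!(
        "POST /api/v4/projects/{gitlab_project_id}/merge_requests/{mr_iid}/discussions/{discussion_id}/notes"
    )
}
```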
8. **Upgrade robot-docs/contract validation from string-contains to parity checks**
Rationale: `contains("gitlab_discussion_id")` catches very little and allows schema drift. Build field-set parity tests that compare actual serialized JSON keys to robot-docs declared fields for `notes`, `discussions`, and `show` discussion nodes.
```diff
@@ 4f. Add robot-docs contract tests
-assert!(notes_schema.contains("gitlab_discussion_id"));
+let declared = parse_schema_field_list(notes_schema);
+let sample = sample_notes_row_json_keys();
+assert_required_subset(&declared, &["project_path","noteable_type","parent_iid","gitlab_discussion_id","gitlab_note_id"]);
+assert_schema_matches_payload(&declared, &sample);
@@ 4g. Add CLI-level contract integration tests
+Add parity tests for:
+- notes list JSON
+- discussions list JSON
+- issues show discussions[*]
+- mrs show discussions[*]
```
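The parity helpers reduce to key-set comparisons over serialized samples; a sketch (the helper names are the hypothetical ones used above):
```rust
use std::collections::BTreeSet;

/// Collect the top-level keys of a serialized sample row.
fn json_object_keys(value: &serde_json::Value) -> BTreeSet<String> {
    value
        .as_object()
        .map(|obj| obj.keys().cloned().collect())
        .unwrap_or_default()
}

/// Fail if robot-docs omits any bridge-contract field.
fn assert_required_subset(declared: &BTreeSet<String>, required: &[&str]) {
    for field in required {
        assert!(declared.contains(*field), "robot-docs missing: {field}");
    }
}
```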
If you want, I can produce a full revised v3 plan text with these edits merged end-to-end so it's ready to execute directly.

View File

@@ -0,0 +1,207 @@
Below are the highest-impact revisions I'd make to this plan. I excluded everything listed in your `## Rejected Recommendations` section.
**1. Fix a correctness bug in the ambiguity guardrail (must run before `LIMIT`)**
The current post-query ambiguity check can silently fail when `--limit` truncates results to one project even though multiple projects match the same `gitlab_discussion_id`. That creates a non-deterministic write-targeting risk.
```diff
@@ ## Ambiguity Guardrail
-**Implementation**: After the main query, if `gitlab_discussion_id` is set and no `--project`
-was provided, check if the result set spans multiple `project_path` values.
+**Implementation**: Run a preflight distinct-project check when `gitlab_discussion_id` is set
+and `--project` was not provided, before the main list query applies `LIMIT`.
+Use:
+  SELECT DISTINCT p.path_with_namespace
+  FROM discussions d
+  JOIN projects p ON p.id = d.project_id
+  WHERE d.gitlab_discussion_id = ?
+  LIMIT 3
+If more than one project is found, return `LoreError::Ambiguous` (exit code 18) with project
+paths and suggestion to retry with `--project <path>`.
```
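A sketch of that preflight in Rust, assuming rusqlite and the `LoreError::Ambiguous` variant named above:
```rust
/// Distinct-project preflight; runs before the list query applies
/// LIMIT, so truncation can never hide a multi-project match.
fn preflight_discussion_projects(
    conn: &rusqlite::Connection,
    gitlab_discussion_id: &str,
) -> rusqlite::Result<Vec<String>> {
    let mut stmt = conn.prepare(
        "SELECT DISTINCT p.path_with_namespace \
         FROM discussions d \
         JOIN projects p ON p.id = d.project_id \
         WHERE d.gitlab_discussion_id = ?1 \
         LIMIT 3",
    )?;
    stmt.query_map([gitlab_discussion_id], |row| row.get(0))?
        .collect()
}
// Caller: if the result has more than one path, return
// LoreError::Ambiguous listing the paths and suggesting --project.
```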
---
**2. Add `gitlab_project_id` to the Bridge Contract**
`project_path` is human-friendly but mutable (renames/transfers). `gitlab_project_id` gives a stable write target and avoids path re-resolution failures.
```diff
@@ ## Bridge Contract (Cross-Cutting)
Every read payload that surfaces notes or discussions **MUST** include:
- `project_path`
+- `gitlab_project_id`
- `noteable_type`
- `parent_iid`
- `gitlab_discussion_id`
- `gitlab_note_id`
@@
const BRIDGE_FIELDS_NOTES: &[&str] = &[
- "project_path", "noteable_type", "parent_iid",
+ "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
"gitlab_discussion_id", "gitlab_note_id",
];
const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
- "project_path", "noteable_type", "parent_iid",
+ "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
"gitlab_discussion_id",
];
```
---
**3. Replace stringly-typed filter/sort fields with enums end-to-end**
Right now `sort`, `order`, `resolution`, `noteable_type` are mostly `String`. This is fragile and risks unsafe SQL interpolation drift over time. Typed enums make invalid states unrepresentable.
```diff
@@ ## 3a. CLI Args
- pub resolution: Option<String>,
+ pub resolution: Option<ResolutionFilter>,
@@
- pub noteable_type: Option<String>,
+ pub noteable_type: Option<NoteableTypeFilter>,
@@
- pub sort: String,
+ pub sort: DiscussionSortField,
@@
- pub asc: bool,
+ pub order: SortDirection,
@@ ## 3d. Filters struct
- pub resolution: Option<String>,
- pub noteable_type: Option<String>,
- pub sort: String,
- pub order: String,
+ pub resolution: Option<ResolutionFilter>,
+ pub noteable_type: Option<NoteableTypeFilter>,
+ pub sort: DiscussionSortField,
+ pub order: SortDirection,
@@
+Map enum -> SQL fragment via `match` in query builder; never interpolate raw strings.
```
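A sketch of the enum-to-SQL mapping; the variant sets are assumptions, and only the `match` pattern matters:
```rust
use clap::ValueEnum;

#[derive(Clone, Copy, ValueEnum)]
enum SortDirection { Asc, Desc }

#[derive(Clone, Copy, ValueEnum)]
enum DiscussionSortField { FirstNote, LastNote, NoteCount }

impl DiscussionSortField {
    /// Every variant maps to a fixed fragment; raw strings never reach
    /// the query builder, so interpolation drift is unrepresentable.
    fn sql_column(self) -> &'static str {
        match self {
            Self::FirstNote => "first_note_at",
            Self::LastNote => "last_note_at",
            Self::NoteCount => "note_count",
        }
    }
}

impl SortDirection {
    fn sql(self) -> &'static str {
        match self {
            Self::Asc => "ASC",
            Self::Desc => "DESC",
        }
    }
}
```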
---
**4. Enforce snapshot consistency for multi-query commands**
`discussions` with `--include-notes` does multiple reads. Without a single read transaction, concurrent ingest can produce mismatched `total_count`, row set, and expanded notes.
```diff
@@ ## 3c. SQL Query
-pub fn query_discussions(...)
+pub fn query_discussions(...)
{
+ // Run count query + page query + note expansion under one deferred read transaction
+ // so output is a single consistent snapshot.
+ let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
...
+ tx.commit()?;
}
@@ ## 1. Add `gitlab_discussion_id` to Notes Output
+Apply the same snapshot rule to `query_notes` when returning `total_count` + paged rows.
```
---
**5. Correct first-note rollup semantics (current CTE can return null/incorrect `first_author`)**
In the proposed SQL, `rn=1` is computed over all notes but then filtered with `is_system=0`, so threads with a leading system note may incorrectly lose their `first_author`/snippet. Also, the path rollup uses non-deterministic `MAX(...)`.
```diff
@@ ## 3c. SQL Query
-ranked_notes AS (
+ranked_notes AS (
SELECT
n.discussion_id,
n.author_username,
n.body,
n.is_system,
n.position_new_path,
n.position_new_line,
- ROW_NUMBER() OVER (
- PARTITION BY n.discussion_id
- ORDER BY n.position, n.id
- ) AS rn
+ ROW_NUMBER() OVER (
+ PARTITION BY n.discussion_id
+ ORDER BY CASE WHEN n.is_system = 0 THEN 0 ELSE 1 END, n.created_at, n.id
+ ) AS rn_first_note,
+ ROW_NUMBER() OVER (
+ PARTITION BY n.discussion_id
+ ORDER BY CASE WHEN n.position_new_path IS NULL THEN 1 ELSE 0 END, n.created_at, n.id
+ ) AS rn_first_position
@@
- MAX(CASE WHEN rn = 1 AND is_system = 0 THEN author_username END) AS first_author,
- MAX(CASE WHEN rn = 1 AND is_system = 0 THEN body END) AS first_note_body,
- MAX(CASE WHEN position_new_path IS NOT NULL THEN position_new_path END) AS position_new_path,
- MAX(CASE WHEN position_new_line IS NOT NULL THEN position_new_line END) AS position_new_line
+ MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN author_username END) AS first_author,
+ MAX(CASE WHEN rn_first_note = 1 AND is_system = 0 THEN body END) AS first_note_body,
+ MAX(CASE WHEN rn_first_position = 1 THEN position_new_path END) AS position_new_path,
+ MAX(CASE WHEN rn_first_position = 1 THEN position_new_line END) AS position_new_line
```
---
**6. Add per-discussion truncation signals for `--include-notes`**
Top-level `has_more` is useful, but agents also need to know whether an individual thread's notes were truncated. Otherwise they can't tell if a thread is complete.
```diff
@@ ## Response Schema
{
"gitlab_discussion_id": "...",
...
- "notes": []
+ "included_note_count": 0,
+ "has_more_notes": false,
+ "notes": []
}
@@ ## 3b. Domain Structs
pub struct DiscussionListRowJson {
@@
+ pub included_note_count: usize,
+ pub has_more_notes: bool,
#[serde(skip_serializing_if = "Vec::is_empty")]
pub notes: Vec<NoteListRowJson>,
}
@@ ## 3c-ii. Note expansion query (--include-notes)
-Group by `discussion_id` in Rust and attach notes arrays...
+Group by `discussion_id` in Rust, attach notes arrays, and set:
+`included_note_count = notes.len()`,
+`has_more_notes = note_count > included_note_count`.
```
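The bookkeeping lands in the grouping step; a sketch using the row types declared in 3b above (the helper itself is hypothetical):
```rust
/// Attach expanded notes to one discussion row and record whether the
/// thread was truncated; `note_count` is the thread's total from the
/// main query.
fn attach_notes(
    row: &mut DiscussionListRowJson,
    notes: Vec<NoteListRowJson>,
    note_count: usize,
) {
    row.included_note_count = notes.len();
    row.has_more_notes = note_count > notes.len();
    row.notes = notes;
}
```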
---
**7. Add explicit query-plan gate and targeted index workstream (measured, not speculative)**
This plan introduces heavy discussion-centric reads. Bake in deterministic performance validation with `EXPLAIN QUERY PLAN`, and add indexes only where the measured plans show they are missing.
```diff
@@ ## Scope: Four workstreams, delivered in order:
-4. Fix robot-docs to list actual field names instead of opaque type references
+4. Add query-plan validation + targeted index updates for new discussion queries
+5. Fix robot-docs to list actual field names instead of opaque type references
@@
+## 4. Query-Plan Validation and Targeted Indexes
+
+Before and after implementing `query_discussions`, capture `EXPLAIN QUERY PLAN` for:
+- `--for-mr <iid> --resolution unresolved`
+- `--project <path> --since 7d --sort last_note`
+- `--gitlab-discussion-id <id>`
+
+If plans show table scans on `notes`/`discussions`, add indexes in `MIGRATIONS` array:
+- `discussions(project_id, gitlab_discussion_id)`
+- `discussions(merge_request_id, last_note_at, id)`
+- `notes(discussion_id, created_at DESC, id DESC)`
+- `notes(discussion_id, position, id)`
+
+Tests: assert the new query paths return expected rows under indexed schema and no regressions.
```
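A sketch of the gate as a rusqlite test helper; it treats any `SCAN` plan line without a `USING ... INDEX` clause as a regression:
```rust
/// Run EXPLAIN QUERY PLAN and fail on full table scans. The
/// human-readable plan text is in the last ("detail") column.
fn assert_no_table_scan(conn: &rusqlite::Connection, sql: &str) {
    let mut stmt = conn.prepare(&format!("EXPLAIN QUERY PLAN {sql}")).unwrap();
    let details: Vec<String> = stmt
        .query_map([], |row| row.get::<_, String>(3))
        .unwrap()
        .filter_map(Result::ok)
        .collect();
    for detail in &details {
        assert!(
            !(detail.starts_with("SCAN") && !detail.contains("USING")),
            "table scan in plan: {detail}"
        );
    }
}
```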
---
If you want, I can produce a single consolidated “iteration 4” version of the plan text with all seven revisions merged in place.

File diff suppressed because it is too large

View File

@@ -243,6 +243,15 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
"--limit",
],
),
(
"trace",
&[
"--project",
"--discussions",
"--no-follow-renames",
"--limit",
],
),
("generate-docs", &["--full", "--project"]),
("completions", &[]),
("robot-docs", &["--brief"]),

View File

@@ -14,6 +14,7 @@ pub mod stats;
pub mod sync;
pub mod sync_status;
pub mod timeline;
pub mod trace;
pub mod who;
pub use auth_test::run_auth_test;
@@ -48,4 +49,5 @@ pub use stats::{print_stats, print_stats_json, run_stats};
pub use sync::{SyncOptions, SyncResult, print_sync, print_sync_json, run_sync};
pub use sync_status::{print_sync_status, print_sync_status_json, run_sync_status};
pub use timeline::{TimelineParams, print_timeline, print_timeline_json_with_meta, run_timeline};
pub use trace::{parse_trace_path, print_trace, print_trace_json};
pub use who::{WhoRun, print_who_human, print_who_json, run_who};

src/cli/commands/trace.rs (new file, 242 lines)
View File

@@ -0,0 +1,242 @@
use crate::cli::render::{Icons, Theme};
use crate::core::trace::{TraceChain, TraceResult};
/// Parse a path with optional `:line` suffix.
///
/// Handles Windows drive letters (e.g. `C:/foo.rs`) by checking that the
/// prefix before the colon is not a single ASCII letter.
pub fn parse_trace_path(input: &str) -> (String, Option<u32>) {
if let Some((path, suffix)) = input.rsplit_once(':')
&& !path.is_empty()
&& let Ok(line) = suffix.parse::<u32>()
// Reject Windows drive letters: single ASCII letter before colon
&& (path.len() > 1 || !path.chars().next().unwrap_or(' ').is_ascii_alphabetic())
{
return (path.to_string(), Some(line));
}
(input.to_string(), None)
}
// ── Human output ────────────────────────────────────────────────────────────
pub fn print_trace(result: &TraceResult) {
let chain_info = if result.total_chains == 1 {
"1 chain".to_string()
} else {
format!("{} chains", result.total_chains)
};
let paths_info = if result.resolved_paths.len() > 1 {
format!(", {} paths", result.resolved_paths.len())
} else {
String::new()
};
println!();
println!(
"{}",
Theme::bold().render(&format!(
"Trace: {} ({}{})",
result.path, chain_info, paths_info
))
);
// Rename chain
if result.renames_followed && result.resolved_paths.len() > 1 {
let chain_str: Vec<&str> = result.resolved_paths.iter().map(String::as_str).collect();
println!(
" Rename chain: {}",
Theme::dim().render(&chain_str.join(" -> "))
);
}
if result.trace_chains.is_empty() {
println!(
"\n {} {}",
Icons::info(),
Theme::dim().render("No trace chains found for this file.")
);
println!(
" {}",
Theme::dim()
.render("Hint: Run 'lore sync' to fetch MR file changes and cross-references.")
);
println!();
return;
}
println!();
for chain in &result.trace_chains {
print_chain(chain);
}
println!();
}
fn print_chain(chain: &TraceChain) {
let (icon, state_style) = match chain.mr_state.as_str() {
"merged" => (Icons::mr_merged(), Theme::accent()),
"opened" => (Icons::mr_opened(), Theme::success()),
"closed" => (Icons::mr_closed(), Theme::warning()),
_ => (Icons::mr_opened(), Theme::dim()),
};
let date = chain
.merged_at_iso
.as_deref()
.or(Some(chain.updated_at_iso.as_str()))
.unwrap_or("")
.split('T')
.next()
.unwrap_or("");
println!(
" {} {} {} {} @{} {} {}",
icon,
Theme::accent().render(&format!("!{}", chain.mr_iid)),
chain.mr_title,
state_style.render(&chain.mr_state),
chain.mr_author,
date,
Theme::dim().render(&chain.change_type),
);
// Linked issues
for issue in &chain.issues {
let ref_icon = match issue.reference_type.as_str() {
"closes" => Icons::issue_closed(),
_ => Icons::issue_opened(),
};
println!(
" {} #{} {} {} [{}]",
ref_icon,
issue.iid,
issue.title,
Theme::dim().render(&issue.state),
Theme::dim().render(&issue.reference_type),
);
}
// Discussions
for disc in &chain.discussions {
let date = disc.created_at_iso.split('T').next().unwrap_or("");
println!(
" {} @{} ({}) [{}]: {}",
Icons::note(),
disc.author_username,
date,
Theme::dim().render(&disc.path),
disc.body
);
}
}
// ── Robot (JSON) output ─────────────────────────────────────────────────────
/// Maximum body length in robot JSON output (token efficiency).
const ROBOT_BODY_SNIPPET_LEN: usize = 500;
fn truncate_body(body: &str, max: usize) -> String {
if body.len() <= max {
return body.to_string();
}
let boundary = body.floor_char_boundary(max);
format!("{}...", &body[..boundary])
}
pub fn print_trace_json(result: &TraceResult, elapsed_ms: u64, line_requested: Option<u32>) {
// Truncate discussion bodies for token efficiency in robot mode
let chains: Vec<serde_json::Value> = result
.trace_chains
.iter()
.map(|chain| {
let discussions: Vec<serde_json::Value> = chain
.discussions
.iter()
.map(|d| {
serde_json::json!({
"discussion_id": d.discussion_id,
"mr_iid": d.mr_iid,
"author_username": d.author_username,
"body_snippet": truncate_body(&d.body, ROBOT_BODY_SNIPPET_LEN),
"path": d.path,
"created_at_iso": d.created_at_iso,
})
})
.collect();
serde_json::json!({
"mr_iid": chain.mr_iid,
"mr_title": chain.mr_title,
"mr_state": chain.mr_state,
"mr_author": chain.mr_author,
"change_type": chain.change_type,
"merged_at_iso": chain.merged_at_iso,
"updated_at_iso": chain.updated_at_iso,
"web_url": chain.web_url,
"issues": chain.issues,
"discussions": discussions,
})
})
.collect();
let output = serde_json::json!({
"ok": true,
"data": {
"path": result.path,
"resolved_paths": result.resolved_paths,
"trace_chains": chains,
},
"meta": {
"tier": "api_only",
"line_requested": line_requested,
"elapsed_ms": elapsed_ms,
"total_chains": result.total_chains,
"renames_followed": result.renames_followed,
}
});
println!("{}", serde_json::to_string(&output).unwrap_or_default());
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_trace_path_simple() {
let (path, line) = parse_trace_path("src/foo.rs");
assert_eq!(path, "src/foo.rs");
assert_eq!(line, None);
}
#[test]
fn test_parse_trace_path_with_line() {
let (path, line) = parse_trace_path("src/foo.rs:42");
assert_eq!(path, "src/foo.rs");
assert_eq!(line, Some(42));
}
#[test]
fn test_parse_trace_path_windows() {
let (path, line) = parse_trace_path("C:/foo.rs");
assert_eq!(path, "C:/foo.rs");
assert_eq!(line, None);
}
#[test]
fn test_parse_trace_path_directory() {
let (path, line) = parse_trace_path("src/auth/");
assert_eq!(path, "src/auth/");
assert_eq!(line, None);
}
#[test]
fn test_parse_trace_path_with_line_zero() {
let (path, line) = parse_trace_path("file.rs:0");
assert_eq!(path, "file.rs");
assert_eq!(line, Some(0));
}
}

View File

@@ -238,6 +238,9 @@ pub enum Commands {
#[command(name = "file-history")]
FileHistory(FileHistoryArgs),
/// Trace why code was introduced: file -> MR -> issue -> discussion
Trace(TraceArgs),
/// Detect discussion divergence from original intent
Drift {
/// Entity type (currently only "issues" supported)
@@ -1006,6 +1009,38 @@ pub struct FileHistoryArgs {
pub limit: usize,
}
#[derive(Parser)]
#[command(after_help = "\x1b[1mExamples:\x1b[0m
lore trace src/main.rs # Why was this file changed?
lore trace src/auth/ -p group/repo # Scoped to project
lore trace src/foo.rs --discussions # Include DiffNote context
lore trace src/bar.rs:42 # Line hint (Tier 2 warning)")]
pub struct TraceArgs {
/// File path to trace (supports :line suffix for future Tier 2)
pub path: String,
/// Scope to a specific project (fuzzy match)
#[arg(short = 'p', long, help_heading = "Filters")]
pub project: Option<String>,
/// Include DiffNote discussion snippets
#[arg(long, help_heading = "Output")]
pub discussions: bool,
/// Disable rename chain resolution
#[arg(long = "no-follow-renames", help_heading = "Filters")]
pub no_follow_renames: bool,
/// Maximum trace chains to display
#[arg(
short = 'n',
long = "limit",
default_value = "20",
help_heading = "Output"
)]
pub limit: usize,
}
#[derive(Parser)]
pub struct CountArgs {
/// Entity type to count (issues, mrs, discussions, notes, events)

View File

@@ -21,6 +21,7 @@ pub mod timeline;
pub mod timeline_collect;
pub mod timeline_expand;
pub mod timeline_seed;
pub mod trace;
pub use config::Config;
pub use error::{LoreError, Result};

src/core/trace.rs (new file, 262 lines)
View File

@@ -0,0 +1,262 @@
use serde::Serialize;
use super::error::Result;
use super::file_history::resolve_rename_chain;
use super::time::ms_to_iso;
/// Maximum rename chain BFS depth.
const MAX_RENAME_HOPS: usize = 10;
/// A linked issue found via entity_references on the MR.
#[derive(Debug, Serialize)]
pub struct TraceIssue {
pub iid: i64,
pub title: String,
pub state: String,
pub reference_type: String,
pub web_url: Option<String>,
}
/// A DiffNote discussion relevant to the traced file.
#[derive(Debug, Serialize)]
pub struct TraceDiscussion {
pub discussion_id: String,
pub mr_iid: i64,
pub author_username: String,
pub body: String,
pub path: String,
pub created_at_iso: String,
}
/// A single trace chain: an MR that touched the file, plus linked issues and discussions.
#[derive(Debug, Serialize)]
pub struct TraceChain {
pub mr_iid: i64,
pub mr_title: String,
pub mr_state: String,
pub mr_author: String,
pub change_type: String,
pub merged_at_iso: Option<String>,
pub updated_at_iso: String,
pub web_url: Option<String>,
pub issues: Vec<TraceIssue>,
pub discussions: Vec<TraceDiscussion>,
}
/// Result of a trace query.
#[derive(Debug, Serialize)]
pub struct TraceResult {
pub path: String,
pub resolved_paths: Vec<String>,
pub renames_followed: bool,
pub trace_chains: Vec<TraceChain>,
pub total_chains: usize,
}
/// Run the trace query: file -> MR -> issue chain.
pub fn run_trace(
conn: &rusqlite::Connection,
project_id: Option<i64>,
path: &str,
follow_renames: bool,
include_discussions: bool,
limit: usize,
) -> Result<TraceResult> {
// Resolve rename chain
let (all_paths, renames_followed) = if follow_renames {
if let Some(pid) = project_id {
let chain = resolve_rename_chain(conn, pid, path, MAX_RENAME_HOPS)?;
let followed = chain.len() > 1;
(chain, followed)
} else {
(vec![path.to_string()], false)
}
} else {
(vec![path.to_string()], false)
};
// Build placeholders for IN clause
let placeholders: Vec<String> = (0..all_paths.len())
.map(|i| format!("?{}", i + 2))
.collect();
let in_clause = placeholders.join(", ");
let project_filter = if project_id.is_some() {
"AND mfc.project_id = ?1"
} else {
""
};
// Step 1: Find MRs that touched the file
let mr_sql = format!(
"SELECT DISTINCT \
mr.id, mr.iid, mr.title, mr.state, mr.author_username, \
mfc.change_type, mr.merged_at, mr.updated_at, mr.web_url \
FROM mr_file_changes mfc \
JOIN merge_requests mr ON mr.id = mfc.merge_request_id \
WHERE mfc.new_path IN ({in_clause}) {project_filter} \
ORDER BY COALESCE(mr.merged_at, mr.updated_at) DESC \
LIMIT ?{}",
all_paths.len() + 2
);
let mut stmt = conn.prepare(&mr_sql)?;
let mut params: Vec<Box<dyn rusqlite::types::ToSql>> = Vec::new();
params.push(Box::new(project_id.unwrap_or(0)));
for p in &all_paths {
params.push(Box::new(p.clone()));
}
params.push(Box::new(limit as i64));
let param_refs: Vec<&dyn rusqlite::types::ToSql> = params.iter().map(|p| p.as_ref()).collect();
struct MrRow {
id: i64,
iid: i64,
title: String,
state: String,
author: String,
change_type: String,
merged_at: Option<i64>,
updated_at: i64,
web_url: Option<String>,
}
let mr_rows: Vec<MrRow> = stmt
.query_map(param_refs.as_slice(), |row| {
Ok(MrRow {
id: row.get(0)?,
iid: row.get(1)?,
title: row.get(2)?,
state: row.get(3)?,
author: row.get(4)?,
change_type: row.get(5)?,
merged_at: row.get(6)?,
updated_at: row.get(7)?,
web_url: row.get(8)?,
})
})?
.filter_map(std::result::Result::ok)
.collect();
// Step 2: For each MR, find linked issues + optional discussions
let mut trace_chains = Vec::with_capacity(mr_rows.len());
for mr in &mr_rows {
let issues = fetch_linked_issues(conn, mr.id)?;
let discussions = if include_discussions {
fetch_trace_discussions(conn, mr.id, mr.iid, &all_paths)?
} else {
Vec::new()
};
trace_chains.push(TraceChain {
mr_iid: mr.iid,
mr_title: mr.title.clone(),
mr_state: mr.state.clone(),
mr_author: mr.author.clone(),
change_type: mr.change_type.clone(),
merged_at_iso: mr.merged_at.map(ms_to_iso),
updated_at_iso: ms_to_iso(mr.updated_at),
web_url: mr.web_url.clone(),
issues,
discussions,
});
}
let total_chains = trace_chains.len();
Ok(TraceResult {
path: path.to_string(),
resolved_paths: all_paths,
renames_followed,
trace_chains,
total_chains,
})
}
/// Fetch issues linked to an MR via entity_references.
/// source = merge_request -> target = issue (closes/mentioned/related)
fn fetch_linked_issues(conn: &rusqlite::Connection, mr_id: i64) -> Result<Vec<TraceIssue>> {
let sql = "SELECT DISTINCT i.iid, i.title, i.state, er.reference_type, i.web_url \
FROM entity_references er \
JOIN issues i ON i.id = er.target_entity_id \
WHERE er.source_entity_type = 'merge_request' \
AND er.source_entity_id = ?1 \
AND er.target_entity_type = 'issue' \
AND er.target_entity_id IS NOT NULL \
ORDER BY \
CASE er.reference_type WHEN 'closes' THEN 0 WHEN 'related' THEN 1 ELSE 2 END, \
i.iid";
let mut stmt = conn.prepare(sql)?;
let issues: Vec<TraceIssue> = stmt
.query_map(rusqlite::params![mr_id], |row| {
Ok(TraceIssue {
iid: row.get(0)?,
title: row.get(1)?,
state: row.get(2)?,
reference_type: row.get(3)?,
web_url: row.get(4)?,
})
})?
.filter_map(std::result::Result::ok)
.collect();
Ok(issues)
}
/// Fetch DiffNote discussions on a specific MR that reference the traced paths.
fn fetch_trace_discussions(
conn: &rusqlite::Connection,
mr_id: i64,
mr_iid: i64,
paths: &[String],
) -> Result<Vec<TraceDiscussion>> {
let placeholders: Vec<String> = (0..paths.len()).map(|i| format!("?{}", i + 2)).collect();
let in_clause = placeholders.join(", ");
let sql = format!(
"SELECT d.gitlab_discussion_id, n.author_username, n.body, n.position_new_path, n.created_at \
FROM notes n \
JOIN discussions d ON d.id = n.discussion_id \
WHERE d.merge_request_id = ?1 \
AND n.position_new_path IN ({in_clause}) \
AND n.is_system = 0 \
ORDER BY n.created_at DESC \
LIMIT 20"
);
let mut stmt = conn.prepare(&sql)?;
let mut params: Vec<Box<dyn rusqlite::types::ToSql>> = Vec::new();
params.push(Box::new(mr_id));
for p in paths {
params.push(Box::new(p.clone()));
}
let param_refs: Vec<&dyn rusqlite::types::ToSql> = params.iter().map(|p| p.as_ref()).collect();
let discussions: Vec<TraceDiscussion> = stmt
.query_map(param_refs.as_slice(), |row| {
let created_at: i64 = row.get(4)?;
Ok(TraceDiscussion {
discussion_id: row.get(0)?,
mr_iid,
author_username: row.get(1)?,
body: row.get(2)?,
path: row.get(3)?,
created_at_iso: ms_to_iso(created_at),
})
})?
.filter_map(std::result::Result::ok)
.collect();
Ok(discussions)
}
#[cfg(test)]
#[path = "trace_tests.rs"]
mod tests;

src/core/trace_tests.rs (new file, 260 lines)
View File

@@ -0,0 +1,260 @@
use super::*;
use crate::core::db::{create_connection, run_migrations};
use std::path::Path;
fn setup_test_db() -> rusqlite::Connection {
let conn = create_connection(Path::new(":memory:")).unwrap();
run_migrations(&conn).unwrap();
conn
}
fn seed_project(conn: &rusqlite::Connection) -> i64 {
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'group/repo', 'https://gitlab.example.com/group/repo', 1000, 2000)",
[],
)
.unwrap();
1
}
fn insert_mr(
conn: &rusqlite::Connection,
id: i64,
iid: i64,
title: &str,
state: &str,
merged_at: Option<i64>,
) {
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, iid, project_id, title, state, author_username, \
created_at, updated_at, last_seen_at, source_branch, target_branch, merged_at, web_url)
VALUES (?1, ?2, ?3, 1, ?4, ?5, 'dev', 1000, 2000, 2000, 'feature', 'main', ?6, \
'https://gitlab.example.com/group/repo/-/merge_requests/' || ?3)",
rusqlite::params![id, 300 + id, iid, title, state, merged_at],
)
.unwrap();
}
fn insert_file_change(
conn: &rusqlite::Connection,
mr_id: i64,
old_path: Option<&str>,
new_path: &str,
change_type: &str,
) {
conn.execute(
"INSERT INTO mr_file_changes (merge_request_id, project_id, old_path, new_path, change_type)
VALUES (?1, 1, ?2, ?3, ?4)",
rusqlite::params![mr_id, old_path, new_path, change_type],
)
.unwrap();
}
fn insert_entity_ref(
conn: &rusqlite::Connection,
source_type: &str,
source_id: i64,
target_type: &str,
target_id: i64,
ref_type: &str,
) {
conn.execute(
"INSERT INTO entity_references (project_id, source_entity_type, source_entity_id, \
target_entity_type, target_entity_id, reference_type, source_method, created_at)
VALUES (1, ?1, ?2, ?3, ?4, ?5, 'api', 1000)",
rusqlite::params![source_type, source_id, target_type, target_id, ref_type],
)
.unwrap();
}
fn insert_issue(conn: &rusqlite::Connection, id: i64, iid: i64, title: &str, state: &str) {
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, created_at, updated_at, \
last_seen_at, web_url)
VALUES (?1, ?2, 1, ?3, ?4, ?5, 1000, 2000, 2000, \
'https://gitlab.example.com/group/repo/-/issues/' || ?3)",
rusqlite::params![id, 400 + id, iid, title, state],
)
.unwrap();
}
fn insert_discussion_and_note(
conn: &rusqlite::Connection,
discussion_id: i64,
mr_id: i64,
note_id: i64,
author: &str,
body: &str,
position_new_path: Option<&str>,
) {
conn.execute(
"INSERT INTO discussions (id, gitlab_discussion_id, project_id, merge_request_id, \
noteable_type, last_seen_at)
VALUES (?1, 'disc-' || ?1, 1, ?2, 'MergeRequest', 2000)",
rusqlite::params![discussion_id, mr_id],
)
.unwrap();
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, author_username, body, \
is_system, created_at, updated_at, last_seen_at, position_new_path)
VALUES (?1, ?2, ?3, 1, ?4, ?5, 0, 1500, 1500, 2000, ?6)",
rusqlite::params![
note_id,
500 + note_id,
discussion_id,
author,
body,
position_new_path
],
)
.unwrap();
}
#[test]
fn test_trace_empty_file() {
let conn = setup_test_db();
seed_project(&conn);
let result = run_trace(&conn, Some(1), "src/nonexistent.rs", false, false, 10).unwrap();
assert!(result.trace_chains.is_empty());
assert_eq!(result.resolved_paths, ["src/nonexistent.rs"]);
}
#[test]
fn test_trace_finds_mr() {
let conn = setup_test_db();
seed_project(&conn);
insert_mr(&conn, 1, 10, "Add auth module", "merged", Some(3000));
insert_file_change(&conn, 1, None, "src/auth.rs", "added");
let result = run_trace(&conn, Some(1), "src/auth.rs", false, false, 10).unwrap();
assert_eq!(result.trace_chains.len(), 1);
let chain = &result.trace_chains[0];
assert_eq!(chain.mr_iid, 10);
assert_eq!(chain.mr_title, "Add auth module");
assert_eq!(chain.mr_state, "merged");
assert_eq!(chain.change_type, "added");
assert!(chain.merged_at_iso.is_some());
}
#[test]
fn test_trace_follows_renames() {
let conn = setup_test_db();
seed_project(&conn);
// MR 1: added old_auth.rs
insert_mr(&conn, 1, 10, "Add old auth", "merged", Some(1000));
insert_file_change(&conn, 1, None, "src/old_auth.rs", "added");
// MR 2: renamed old_auth.rs -> auth.rs
insert_mr(&conn, 2, 11, "Rename auth", "merged", Some(2000));
insert_file_change(&conn, 2, Some("src/old_auth.rs"), "src/auth.rs", "renamed");
// Query auth.rs with follow_renames -- should find both MRs
let result = run_trace(&conn, Some(1), "src/auth.rs", true, false, 10).unwrap();
assert!(result.renames_followed);
assert!(
result
.resolved_paths
.contains(&"src/old_auth.rs".to_string())
);
assert!(result.resolved_paths.contains(&"src/auth.rs".to_string()));
// MR 2 touches auth.rs (new_path), MR 1 touches old_auth.rs (new_path in its row)
assert_eq!(result.trace_chains.len(), 2);
}
#[test]
fn test_trace_links_issues() {
let conn = setup_test_db();
seed_project(&conn);
insert_mr(&conn, 1, 10, "Fix login bug", "merged", Some(3000));
insert_file_change(&conn, 1, None, "src/login.rs", "modified");
insert_issue(&conn, 1, 42, "Login broken on mobile", "closed");
insert_entity_ref(&conn, "merge_request", 1, "issue", 1, "closes");
let result = run_trace(&conn, Some(1), "src/login.rs", false, false, 10).unwrap();
assert_eq!(result.trace_chains.len(), 1);
assert_eq!(result.trace_chains[0].issues.len(), 1);
let issue = &result.trace_chains[0].issues[0];
assert_eq!(issue.iid, 42);
assert_eq!(issue.title, "Login broken on mobile");
assert_eq!(issue.reference_type, "closes");
}
#[test]
fn test_trace_limits_chains() {
let conn = setup_test_db();
seed_project(&conn);
for i in 1..=3 {
insert_mr(
&conn,
i,
10 + i,
&format!("MR {i}"),
"merged",
Some(1000 * i),
);
insert_file_change(&conn, i, None, "src/shared.rs", "modified");
}
let result = run_trace(&conn, Some(1), "src/shared.rs", false, false, 1).unwrap();
assert_eq!(result.trace_chains.len(), 1);
}
#[test]
fn test_trace_no_follow_renames() {
let conn = setup_test_db();
seed_project(&conn);
// MR 1: added old_name.rs
insert_mr(&conn, 1, 10, "Add old file", "merged", Some(1000));
insert_file_change(&conn, 1, None, "src/old_name.rs", "added");
// MR 2: renamed old_name.rs -> new_name.rs
insert_mr(&conn, 2, 11, "Rename file", "merged", Some(2000));
insert_file_change(
&conn,
2,
Some("src/old_name.rs"),
"src/new_name.rs",
"renamed",
);
// Without follow_renames -- should only find MR 2 (new_path = new_name.rs)
let result = run_trace(&conn, Some(1), "src/new_name.rs", false, false, 10).unwrap();
assert_eq!(result.resolved_paths, ["src/new_name.rs"]);
assert!(!result.renames_followed);
assert_eq!(result.trace_chains.len(), 1);
assert_eq!(result.trace_chains[0].mr_iid, 11);
}
#[test]
fn test_trace_includes_discussions() {
let conn = setup_test_db();
seed_project(&conn);
insert_mr(&conn, 1, 10, "Refactor auth", "merged", Some(3000));
insert_file_change(&conn, 1, None, "src/auth.rs", "modified");
insert_discussion_and_note(
&conn,
1,
1,
1,
"reviewer",
"This function should handle the error case.",
Some("src/auth.rs"),
);
let result = run_trace(&conn, Some(1), "src/auth.rs", false, true, 10).unwrap();
assert_eq!(result.trace_chains.len(), 1);
assert_eq!(result.trace_chains[0].discussions.len(), 1);
let disc = &result.trace_chains[0].discussions[0];
assert_eq!(disc.author_username, "reviewer");
assert!(disc.body.contains("error case"));
assert_eq!(disc.mr_iid, 10);
}

View File

@@ -11,26 +11,26 @@ use lore::cli::autocorrect::{self, CorrectionResult};
use lore::cli::commands::{
IngestDisplay, InitInputs, InitOptions, InitResult, ListFilters, MrListFilters,
NoteListFilters, SearchCliFilters, SyncOptions, TimelineParams, open_issue_in_browser,
open_mr_in_browser, print_count, print_count_json, print_doctor_results, print_drift_human,
print_drift_json, print_dry_run_preview, print_dry_run_preview_json, print_embed,
print_embed_json, print_event_count, print_event_count_json, print_file_history,
open_mr_in_browser, parse_trace_path, print_count, print_count_json, print_doctor_results,
print_drift_human, print_drift_json, print_dry_run_preview, print_dry_run_preview_json,
print_embed, print_embed_json, print_event_count, print_event_count_json, print_file_history,
print_file_history_json, print_generate_docs, print_generate_docs_json, print_ingest_summary,
print_ingest_summary_json, print_list_issues, print_list_issues_json, print_list_mrs,
print_list_mrs_json, print_list_notes, print_list_notes_csv, print_list_notes_json,
print_list_notes_jsonl, print_search_results, print_search_results_json, print_show_issue,
print_show_issue_json, print_show_mr, print_show_mr_json, print_stats, print_stats_json,
print_sync, print_sync_json, print_sync_status, print_sync_status_json, print_timeline,
print_timeline_json_with_meta, print_who_human, print_who_json, query_notes, run_auth_test,
run_count, run_count_events, run_doctor, run_drift, run_embed, run_file_history,
run_generate_docs, run_ingest, run_ingest_dry_run, run_init, run_list_issues, run_list_mrs,
run_search, run_show_issue, run_show_mr, run_stats, run_sync, run_sync_status, run_timeline,
run_who,
print_timeline_json_with_meta, print_trace, print_trace_json, print_who_human, print_who_json,
query_notes, run_auth_test, run_count, run_count_events, run_doctor, run_drift, run_embed,
run_file_history, run_generate_docs, run_ingest, run_ingest_dry_run, run_init, run_list_issues,
run_list_mrs, run_search, run_show_issue, run_show_mr, run_stats, run_sync, run_sync_status,
run_timeline, run_who,
};
use lore::cli::render::{ColorMode, GlyphMode, Icons, LoreRenderer, Theme};
use lore::cli::robot::{RobotMeta, strip_schemas};
use lore::cli::{
Cli, Commands, CountArgs, EmbedArgs, FileHistoryArgs, GenerateDocsArgs, IngestArgs, IssuesArgs,
MrsArgs, NotesArgs, SearchArgs, StatsArgs, SyncArgs, TimelineArgs, WhoArgs,
MrsArgs, NotesArgs, SearchArgs, StatsArgs, SyncArgs, TimelineArgs, TraceArgs, WhoArgs,
};
use lore::core::db::{
LATEST_SCHEMA_VERSION, create_connection, get_schema_version, run_migrations,
@@ -40,8 +40,10 @@ use lore::core::error::{LoreError, RobotErrorOutput};
use lore::core::logging;
use lore::core::metrics::MetricsLayer;
use lore::core::paths::{get_config_path, get_db_path, get_log_dir};
use lore::core::project::resolve_project;
use lore::core::shutdown::ShutdownSignal;
use lore::core::sync_run::SyncRunRecorder;
use lore::core::trace::run_trace;
#[tokio::main]
async fn main() {
@@ -199,6 +201,7 @@ async fn main() {
Some(Commands::FileHistory(args)) => {
handle_file_history(cli.config.as_deref(), args, robot_mode)
}
Some(Commands::Trace(args)) => handle_trace(cli.config.as_deref(), args, robot_mode),
Some(Commands::Drift {
entity_type,
iid,
@@ -725,6 +728,7 @@ fn suggest_similar_command(invalid: &str) -> String {
("note", "notes"),
("drift", "drift"),
("file-history", "file-history"),
("trace", "trace"),
];
let invalid_lower = invalid.to_lowercase();
@@ -766,6 +770,7 @@ fn command_example(cmd: &str) -> &'static str {
"generate-docs" => "lore --robot generate-docs",
"embed" => "lore --robot embed",
"robot-docs" => "lore robot-docs",
"trace" => "lore --robot trace src/main.rs",
"init" => "lore init",
_ => "lore --robot <command>",
}
@@ -1888,6 +1893,51 @@ fn handle_file_history(
Ok(())
}
fn handle_trace(
config_override: Option<&str>,
args: TraceArgs,
robot_mode: bool,
) -> Result<(), Box<dyn std::error::Error>> {
let start = std::time::Instant::now();
let config = Config::load(config_override)?;
let (path, line_requested) = parse_trace_path(&args.path);
if line_requested.is_some() && !robot_mode {
eprintln!(
"Note: Line-level tracing requires Tier 2 (git blame). Showing file-level results."
);
}
let project = config
.effective_project(args.project.as_deref())
.map(String::from);
let db_path = get_db_path(config.storage.db_path.as_deref());
let conn = create_connection(&db_path)?;
let project_id = project
.as_deref()
.map(|p| resolve_project(&conn, p))
.transpose()?;
let result = run_trace(
&conn,
project_id,
&path,
!args.no_follow_renames,
args.discussions,
args.limit,
)?;
if robot_mode {
let elapsed_ms = start.elapsed().as_millis() as u64;
print_trace_json(&result, elapsed_ms, line_requested);
} else {
print_trace(&result);
}
Ok(())
}
async fn handle_timeline(
config_override: Option<&str>,
args: TimelineArgs,
@@ -2556,6 +2606,16 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
"active_minimal": ["entity_type", "iid", "title", "participants"]
}
},
"trace": {
"description": "Trace why code was introduced: file -> MR -> issue -> discussion. Follows rename chains by default.",
"flags": ["<path>", "-p/--project <path>", "--discussions", "--no-follow-renames", "-n/--limit <N>"],
"example": "lore --robot trace src/main.rs -p group/repo",
"response_schema": {
"ok": "bool",
"data": {"path": "string", "resolved_paths": "[string]", "trace_chains": "[{mr_iid:int, mr_title:string, mr_state:string, mr_author:string, change_type:string, merged_at_iso:string?, updated_at_iso:string, web_url:string?, issues:[{iid:int, title:string, state:string, reference_type:string, web_url:string?}], discussions:[{discussion_id:string, mr_iid:int, author_username:string, body_snippet:string, path:string, created_at_iso:string}]}]"},
"meta": {"tier": "string (api_only)", "line_requested": "int?", "elapsed_ms": "int", "total_chains": "int", "renames_followed": "bool"}
}
},
"file-history": {
"description": "Show MRs that touched a file, with rename chain resolution and optional DiffNote discussions",
"flags": ["<path>", "-p/--project <path>", "--discussions", "--no-follow-renames", "--merged", "-n/--limit <N>"],