2 Commits

teernisse
63bd58c9b4 feat(who): filter unresolved discussions to open entities only
Workload and active modes now exclude discussions on closed issues and
merged/closed MRs by default. Adds --include-closed flag to restore
the previous behavior when needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:34:28 -05:00
teernisse
714c8c2623 feat(path): rename-aware ambiguity resolution for suffix probe
When a bare filename like 'operators.ts' matches multiple full paths,
check if they are the same file connected by renames (via BFS on
mr_file_changes). If so, auto-resolve to the newest path instead of
erroring. Also wires path resolution into file-history and trace
commands so bare filenames work everywhere.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:34:28 -05:00
11 changed files with 848 additions and 132 deletions

File diff suppressed because one or more lines are too long


@@ -1 +1 @@
-bd-1sc6
+bd-1elx


@@ -0,0 +1,140 @@
Your iteration 4 plan is already strong. The highest-impact revisions are around query shape, transaction boundaries, and contract stability for agents.
1. **Switch discussions query to a two-phase page-first architecture**
Analysis: Current `ranked_notes` runs over every filtered discussion before `LIMIT`, which can explode on project-wide queries. A page-first plan keeps complexity proportional to `limit`, improves tail latency, and reduces memory churn.
```diff
@@ ## 3c. SQL Query
-Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated
-subqueries.
+Core query is split into two phases for scalability:
+1) `paged_discussions` applies filters/sort/LIMIT and returns only page IDs.
+2) Note rollups and optional `--include-notes` expansion run only for those page IDs.
+This bounds note scanning to visible results and stabilizes latency on large projects.
WITH filtered_discussions AS (
...
),
-ranked_notes AS (
+paged_discussions AS (
+ SELECT id
+ FROM filtered_discussions
+ ORDER BY COALESCE({sort_column}, 0) {order}, id {order}
+ LIMIT ?
+),
+ranked_notes AS (
...
- WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
+ WHERE n.discussion_id IN (SELECT id FROM paged_discussions)
)
```
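The complexity argument above can be sketched in std-only Rust (names like `paged_ids` and `rollup` are illustrative stand-ins, not the plan's actual helpers): phase 2 work is bounded by the page size rather than by the filtered set.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

// Phase 1 analogue of `paged_discussions`:
// ORDER BY COALESCE(sort_key, 0) DESC, id DESC LIMIT n, returning only page IDs.
fn paged_ids(mut discussions: Vec<(i64, Option<i64>)>, limit: usize) -> Vec<i64> {
    discussions.sort_by_key(|&(id, key)| (Reverse(key.unwrap_or(0)), Reverse(id)));
    discussions.into_iter().take(limit).map(|(id, _)| id).collect()
}

// Phase 2 analogue of the note rollup: it touches only the paged IDs,
// so its cost is proportional to `limit`, not to the full filtered count.
fn rollup(page: &[i64], notes_by_discussion: &HashMap<i64, Vec<String>>) -> Vec<(i64, usize)> {
    page.iter()
        .map(|id| (*id, notes_by_discussion.get(id).map_or(0, Vec::len)))
        .collect()
}
```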
2. **Move snapshot transaction ownership to handlers (not query helpers)**
Analysis: This avoids nested transaction edge cases, keeps function signatures clean, and guarantees one snapshot across count + page + include-notes + serialization metadata.
```diff
@@ ## Cross-cutting: snapshot consistency
-Wrap `query_notes` and `query_discussions` in a deferred read transaction.
+Open one deferred read transaction in each handler (`handle_notes`, `handle_discussions`)
+and pass `&Transaction` into query helpers. Query helpers do not open/commit transactions.
+This guarantees a single snapshot across all subqueries and avoids nested tx pitfalls.
-pub fn query_discussions(conn: &Connection, ...)
+pub fn query_discussions(tx: &rusqlite::Transaction<'_>, ...)
```
3. **Add immutable input filter `--project-id` across notes/discussions/show**
Analysis: You already expose `gitlab_project_id` because paths are mutable; input should support the same immutable selector. This removes failure modes after project renames/transfers.
```diff
@@ ## 3a. CLI Args
+ /// Filter by immutable GitLab project ID
+ #[arg(long, help_heading = "Filters", conflicts_with = "project")]
+ pub project_id: Option<i64>,
@@ ## Bridge Contract
+Input symmetry rule: commands that accept `--project` should also accept `--project-id`.
+If both are present, return usage error (exit code 2).
```
4. **Enforce bridge fields for nested notes in `discussions --include-notes`**
Analysis: Current guardrail is entity-level; nested notes can still lose required IDs under aggressive filtering. This is a contract hole for write-bridging.
```diff
@@ ### Field Filtering Guardrail
-In robot mode, `filter_fields` MUST force-include Bridge Contract fields...
+In robot mode, `filter_fields` MUST force-include Bridge Contract fields at all returned levels:
+- discussion row fields
+- nested note fields when `discussions --include-notes` is used
+const BRIDGE_FIELDS_DISCUSSION_NOTES: &[&str] = &[
+ "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
+ "gitlab_discussion_id", "gitlab_note_id",
+];
```
5. **Make ambiguity preflight scope-aware and machine-actionable**
Analysis: Current preflight checks only `gitlab_discussion_id`, which can produce false ambiguity when additional filters already narrow to one project. Also, agents need structured candidates, not only free-text.
```diff
@@ ### Ambiguity Guardrail
-SELECT DISTINCT p.path_with_namespace
+SELECT DISTINCT p.path_with_namespace, p.gitlab_project_id
FROM discussions d
JOIN projects p ON p.id = d.project_id
WHERE d.gitlab_discussion_id = ?
+ /* plus active scope filters: noteable_type, for_issue/for_mr, since/path when present */
LIMIT 3
-Return LoreError::Ambiguous with message
+Return LoreError::Ambiguous with structured details:
+`{ code, message, candidates:[{project_path, gitlab_project_id}], suggestion }`
```
6. **Add `--contains` filter to `discussions`**
Analysis: This is a high-utility agent workflow gap. Agents frequently need “find thread by text then reply”; forcing a separate `notes` search round-trip is unnecessary.
```diff
@@ ## 3a. CLI Args
+ /// Filter discussions whose notes contain text
+ #[arg(long, help_heading = "Filters")]
+ pub contains: Option<String>,
@@ ## 3d. Filters struct
+ pub contains: Option<String>,
@@ ## 3d. Where-clause construction
- `path` -> EXISTS (...)
+- `contains` -> EXISTS (
+ SELECT 1 FROM notes n
+ WHERE n.discussion_id = d.id
+ AND n.body LIKE ?
+ )
```
7. **Promote two baseline indexes from “candidate” to “required”**
Analysis: These are directly hit by new primary paths; waiting for post-merge profiling risks immediate perf cliffs in real usage.
```diff
@@ ## 3h. Query-plan validation
-Candidate indexes (add only if EXPLAIN QUERY PLAN shows they're needed):
-- discussions(project_id, gitlab_discussion_id)
-- notes(discussion_id, created_at DESC, id DESC)
+Required baseline indexes for this feature:
+- discussions(project_id, gitlab_discussion_id)
+- notes(discussion_id, created_at DESC, id DESC)
+Keep other indexes conditional on EXPLAIN QUERY PLAN.
```
8. **Add schema versioning and remove contradictory rejected items**
Analysis: `robot-docs` contract drift is a long-term agent risk; explicit schema versions let clients fail safely. Also, rejected items currently contradict active sections, which creates implementation ambiguity.
```diff
@@ ## 4. Fix Robot-Docs Response Schemas
-"meta": {"elapsed_ms": "int", ...}
+"meta": {"elapsed_ms":"int", ..., "schema_version":"string"}
+
+Schema version policy:
+- bump minor on additive fields
+- bump major on removals/renames
+- expose per-command versions in `robot-docs`
@@ ## Rejected Recommendations
-- Add `gitlab_note_id` to show-command note detail structs ... rejected ...
-- Add `gitlab_discussion_id` to show-command discussion detail structs ... rejected ...
-- Add `gitlab_project_id` to show-command discussion detail structs ... rejected ...
+Remove stale rejected entries that conflict with accepted workstreams in this plan iteration.
```
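As a sketch of how a consumer might act on `schema_version` (the "MAJOR.MINOR" string format and both helper names here are assumptions, not part of the plan):

```rust
// Hypothetical client-side check against meta.schema_version.
// Per the policy above: majors must match exactly; the server may be ahead
// on minor (additive fields only), never behind what the client requires.
fn parse_schema_version(v: &str) -> Option<(u32, u32)> {
    let (major, minor) = v.split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

fn client_accepts(required: &str, reported: &str) -> bool {
    match (parse_schema_version(required), parse_schema_version(reported)) {
        (Some((req_maj, req_min)), Some((got_maj, got_min))) => {
            req_maj == got_maj && got_min >= req_min
        }
        _ => false, // unparseable versions fail safely
    }
}
```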
If you want, I can produce a fully rewritten iteration 5 plan document that applies all of the above edits cleanly end-to-end.


@@ -2,7 +2,7 @@
plan: true
title: ""
status: iterating
-iteration: 4
+iteration: 5
target_iterations: 8
beads_revision: 0
related_plans: []
@@ -52,8 +52,9 @@ output.
### Field Filtering Guardrail
In robot mode, `filter_fields` **MUST** force-include Bridge Contract fields even when the
-caller passes a narrower `--fields` list. This prevents agents from accidentally stripping
-the identifiers they need for write operations.
+caller passes a narrower `--fields` list. This applies at **all nesting levels**: both the
+top-level entity fields and nested sub-entities (e.g., notes inside `discussions --include-notes`).
+This prevents agents from accidentally stripping the identifiers they need for write operations.
**Implementation**: Add a `BRIDGE_FIELDS` constant map per entity type. In `filter_fields()`,
when operating in robot mode, union the caller's requested fields with the bridge set before
@@ -69,70 +70,127 @@ const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
"project_path", "gitlab_project_id", "noteable_type", "parent_iid",
"gitlab_discussion_id",
];
// Applied to nested notes within discussions --include-notes
const BRIDGE_FIELDS_DISCUSSION_NOTES: &[&str] = &[
"project_path", "gitlab_project_id", "noteable_type", "parent_iid",
"gitlab_discussion_id", "gitlab_note_id",
];
```
In `filter_fields`, when entity is `"notes"` or `"discussions"`, merge the bridge set into the
-requested fields before filtering the JSON value. This is a ~5-line change to the existing
-function.
+requested fields before filtering the JSON value. For `"discussions"`, also apply
+`BRIDGE_FIELDS_DISCUSSION_NOTES` to each element of the nested `notes` array. This is a ~10-line
+change to the existing function.
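The union step itself is small; a std-only sketch (the constant matches the plan, while `effective_fields` is an illustrative name, not the real `filter_fields` signature):

```rust
use std::collections::BTreeSet;

const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
    "gitlab_discussion_id",
];

// In robot mode the effective field set is the caller's request unioned with
// the bridge set, so write-critical identifiers can never be filtered away.
fn effective_fields(requested: &[&str], bridge: &[&str]) -> BTreeSet<String> {
    requested.iter().chain(bridge.iter()).map(|s| s.to_string()).collect()
}
```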
### Snapshot Consistency (Cross-Cutting)
Multi-query commands (`handle_notes`, `handle_discussions`) **MUST** execute all their queries
within a single deferred read transaction. This guarantees snapshot consistency when a concurrent
sync/ingest is modifying the database.
**Transaction ownership lives in handlers, not query helpers.** Each handler opens one deferred
read transaction and passes it to query helpers. Query helpers accept `&Connection` (which
`Transaction` derefs to via `std::ops::Deref`) so they remain testable with plain connections
in unit tests. This avoids nested transaction edge cases and guarantees a single snapshot across
count + page + include-notes + serialization.
```rust
// In handle_notes / handle_discussions:
let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
let result = query_notes(&tx, &filters, &config)?;
// ... serialize ...
tx.commit()?; // read-only, but closes cleanly
```
Query helpers keep their `conn: &Connection` signature — `Transaction<'_>` implements
`Deref<Target = Connection>`, so `&tx` coerces to `&Connection` at call sites.
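The coercion the plan relies on is plain `std::ops::Deref`; a minimal stand-alone illustration with stand-in types (not rusqlite's actual definitions):

```rust
use std::ops::Deref;

// Stand-ins for rusqlite's Connection/Transaction, just to show the coercion.
struct Connection { db_name: &'static str }
struct Transaction<'c> { conn: &'c Connection }

impl<'c> Deref for Transaction<'c> {
    type Target = Connection;
    fn deref(&self) -> &Connection { self.conn }
}

// Query helpers keep the plain &Connection signature...
fn query_helper(conn: &Connection) -> &'static str { conn.db_name }
```

...and `query_helper(&tx)` compiles unchanged, because `&Transaction` deref-coerces to `&Connection` at the call site.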
### Ambiguity Guardrail
When filtering by `gitlab_discussion_id` (on either `notes` or `discussions` commands) without
`--project`, if the query matches discussions in multiple projects:
- Return an `Ambiguous` error (exit code 18, matching existing convention)
-- Include matching project paths in the error message
+- Include matching project paths **and `gitlab_project_id`s** in a structured candidates list
- Suggest retry with `--project <path>`
-**Implementation**: Run a **preflight distinct-project check** before the main list query
-executes its `LIMIT`. This is critical because a post-query check on the paginated result set
-can silently miss cross-project ambiguity when `LIMIT` truncates results to rows from a single
-project. The preflight query is cheap (hits the `gitlab_discussion_id` index, returns at most
-a few rows) and eliminates non-deterministic write-targeting risk.
+**Implementation**: Run a **scope-aware preflight distinct-project check** before the main list
+query executes its `LIMIT`. The preflight applies active scope filters (noteable_type, since,
+for_issue/for_mr) alongside the discussion ID check, so it won't produce false ambiguity when
+other filters already narrow to one project. This is critical because a post-query check on the
+paginated result set can silently miss cross-project ambiguity when `LIMIT` truncates results to
+rows from a single project. The preflight query is cheap (hits the `gitlab_discussion_id` index,
+returns at most a few rows) and eliminates non-deterministic write-targeting risk.
```sql
--- Preflight ambiguity check (runs before main query)
-SELECT DISTINCT p.path_with_namespace
+-- Preflight ambiguity check (runs before main query, includes active scope filters)
+SELECT DISTINCT p.path_with_namespace, p.gitlab_project_id
FROM discussions d
JOIN projects p ON p.id = d.project_id
WHERE d.gitlab_discussion_id = ?
-- scope filters applied dynamically:
-- AND d.noteable_type = ? (when --noteable-type present)
-- AND d.merge_request_id = (SELECT ...) (when --for-mr present)
-- AND d.issue_id = (SELECT ...) (when --for-issue present)
LIMIT 3
```
-If more than one project is found, return `LoreError::Ambiguous` (exit code 18) with the
-distinct project paths and suggestion to retry with `--project <path>`.
+If more than one project is found, return `LoreError::Ambiguous` (exit code 18) with structured
+candidates for machine consumption:
```rust
// In query_notes / query_discussions, before executing the main query:
if let Some(ref disc_id) = filters.gitlab_discussion_id {
if filters.project.is_none() {
-let distinct_projects: Vec<String> = conn
+let candidates: Vec<(String, i64)> = conn
.prepare(
-"SELECT DISTINCT p.path_with_namespace \
+"SELECT DISTINCT p.path_with_namespace, p.gitlab_project_id \
FROM discussions d \
JOIN projects p ON p.id = d.project_id \
WHERE d.gitlab_discussion_id = ? \
LIMIT 3"
+// Note: add scope filter clauses dynamically
)?
-.query_map([disc_id], |row| row.get(0))?
+.query_map([disc_id], |row| Ok((row.get(0)?, row.get(1)?)))?
.collect::<std::result::Result<Vec<_>, _>>()?;
-if distinct_projects.len() > 1 {
+if candidates.len() > 1 {
return Err(LoreError::Ambiguous {
message: format!(
-"Discussion ID matches {} projects: {}. Use --project to disambiguate.",
-distinct_projects.len(),
-distinct_projects.join(", ")
+"Discussion ID matches {} projects. Use --project to disambiguate.",
+candidates.len(),
),
+candidates: candidates.into_iter()
+.map(|(path, id)| AmbiguousCandidate { project_path: path, gitlab_project_id: id })
+.collect(),
});
});
}
}
}
```
In robot mode, the error serializes as:
```json
{
"error": {
"code": "AMBIGUOUS",
"message": "Discussion ID matches 2 projects. Use --project to disambiguate.",
"candidates": [
{"project_path": "group/repo-a", "gitlab_project_id": 42},
{"project_path": "group/repo-b", "gitlab_project_id": 99}
],
"suggestion": "lore -J discussions --gitlab-discussion-id <id> --project <path>",
"actions": ["lore -J discussions --gitlab-discussion-id <id> --project group/repo-a"]
}
}
```
This gives agents machine-actionable candidates: they can pick a project and retry immediately
without parsing free-text error messages.
#### 1h. Wrap `query_notes` in a read transaction
-Wrap the count query and page query in a deferred read transaction per the Snapshot Consistency
-cross-cutting requirement. See the Bridge Contract section for the pattern.
+Per the Snapshot Consistency cross-cutting requirement, `handle_notes` opens a deferred read
+transaction and passes it to `query_notes`. See the Snapshot Consistency section for the pattern.
### Tests
@@ -337,6 +395,7 @@ fn notes_ambiguous_gitlab_discussion_id_across_projects() {
// (this can happen since IDs are per-project)
// Filter by gitlab_discussion_id without --project
// Assert LoreError::Ambiguous is returned with both project paths
// Assert candidates include gitlab_project_id for machine consumption
}
```
@@ -352,6 +411,19 @@ fn notes_ambiguity_preflight_not_defeated_by_limit() {
}
```
#### Test 8: Ambiguity preflight respects scope filters (no false positives)
```rust
#[test]
fn notes_ambiguity_preflight_respects_scope_filters() {
let conn = create_test_db();
// Insert 2 projects, each with a discussion sharing the same gitlab_discussion_id
// But one is Issue-type and the other MergeRequest-type
// Filter by gitlab_discussion_id + --noteable-type MergeRequest (narrows to 1 project)
// Assert NO ambiguity error — scope filters disambiguate
}
```
---
## 2. Add `gitlab_discussion_id` to Show Command Discussion Groups
@@ -644,6 +716,9 @@ lore -J discussions --gitlab-discussion-id 6a9c1750b37d
# List unresolved threads with latest 2 notes inline (fewer round-trips)
lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2
# Find discussions containing specific text
lore -J discussions --for-mr 99 --contains "prefer the approach"
```
### Response Schema
@@ -801,6 +876,10 @@ pub struct DiscussionsArgs {
#[arg(long, value_enum, help_heading = "Filters")]
pub noteable_type: Option<NoteableTypeFilter>,
/// Filter discussions whose notes contain text (case-insensitive LIKE match)
#[arg(long, help_heading = "Filters")]
pub contains: Option<String>,
/// Include up to N latest notes per discussion (0 = none, default; clamped to 20)
#[arg(long, default_value = "0", help_heading = "Output")]
pub include_notes: usize,
@@ -925,7 +1004,7 @@ The `included_note_count` is set to `notes.len()` and `has_more_notes` is set to
`note_count > included_note_count` during the JSON conversion, providing per-discussion
truncation signals.
-#### 3c. SQL Query
+#### 3c. SQL Query — Two-Phase Page-First Architecture
**File**: `src/cli/commands/list.rs`
@@ -935,21 +1014,29 @@ pub fn query_discussions(
filters: &DiscussionListFilters,
config: &Config,
) -> Result<DiscussionListResult> {
-// Wrap all queries in a deferred read transaction for snapshot consistency
-let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
+// NOTE: Transaction is managed by the handler (handle_discussions).
+// This function receives &Connection (which Transaction derefs to via `std::ops::Deref`).
// Preflight ambiguity check (if gitlab_discussion_id without project)
// ... see Ambiguity Guardrail section ...
-// Main query + count query ...
-// ... note expansion query (if include_notes > 0) ...
-tx.commit()?;
+// Phase 1: Filter + sort + LIMIT to get page IDs
+// Phase 2: Note rollups only for paged results
+// Phase 3: Optional --include-notes expansion (separate query)
}
```
-Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated
-subqueries. The `ROW_NUMBER()` approach produces a single scan over the notes table, which
-is more predictable than repeated LIMIT 1 sub-selects at scale (200K+ discussions):
+The query uses a **two-phase page-first architecture** for scalability:
+1. **Phase 1** (`paged_discussions`): Apply all filters, sort, and LIMIT to produce just the
+discussion IDs for the current page. This bounds the result set before any note scanning.
+2. **Phase 2** (`ranked_notes` + `note_rollup`): Run note aggregation only for the paged
+discussion IDs. This ensures note scanning is proportional to `--limit`, not to the total
+filtered discussion count.
+This architecture prevents the performance cliff that occurs on project-wide queries with
+thousands of discussions: instead of scanning notes for all filtered discussions (potentially
+200K+), we scan only for the 50 (or whatever `--limit` is) that will actually be returned.
```sql
WITH filtered_discussions AS (
@@ -961,6 +1048,14 @@ WITH filtered_discussions AS (
JOIN projects p ON d.project_id = p.id
{where_sql}
),
-- Phase 1: Page-first — apply sort + LIMIT before note scanning
paged_discussions AS (
SELECT id
FROM filtered_discussions
ORDER BY COALESCE({sort_column}, 0) {order}, id {order}
LIMIT ?
),
-- Phase 2: Note rollups only for paged results
ranked_notes AS (
SELECT
n.discussion_id,
@@ -980,7 +1075,7 @@ ranked_notes AS (
n.created_at, n.id
) AS rn_first_position
FROM notes n
-WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
+WHERE n.discussion_id IN (SELECT id FROM paged_discussions)
),
note_rollup AS (
SELECT
@@ -1012,12 +1107,12 @@ SELECT
nr.position_new_path,
nr.position_new_line
FROM filtered_discussions fd
JOIN paged_discussions pd ON fd.id = pd.id
JOIN projects p ON fd.project_id = p.id
LEFT JOIN issues i ON fd.issue_id = i.id
LEFT JOIN merge_requests m ON fd.merge_request_id = m.id
LEFT JOIN note_rollup nr ON nr.discussion_id = fd.id
ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
LIMIT ?
```
**Dual window function rationale**: The `ranked_notes` CTE uses two separate `ROW_NUMBER()`
@@ -1028,12 +1123,11 @@ displacing the first human author/body, and prevents a non-positioned note from
the file location. The `MAX(CASE WHEN rn_xxx = 1 ...)` pattern extracts the correct value
from each independently-ranked sequence.
-**Performance rationale**: The CTE pre-filters discussions before joining notes. The
-`ranked_notes` CTE uses `ROW_NUMBER()` (a single pass over the notes index) instead of
-correlated `(SELECT ... LIMIT 1)` sub-selects per discussion. For MR-scoped queries
-(50-200 discussions) the performance is equivalent. For project-wide scans with thousands
-of discussions, the window function approach avoids repeated index probes and produces a
-more predictable query plan.
+**Page-first scalability rationale**: The `paged_discussions` CTE applies LIMIT before note
+scanning. For MR-scoped queries (50-200 discussions) the performance is equivalent to the
+non-paged approach. For project-wide scans with thousands of discussions, the page-first
+architecture avoids scanning notes for discussions that won't appear in the result, keeping
+latency proportional to `--limit` rather than to the total filtered count.
**Note on ordering**: The `COALESCE({sort_column}, 0)` with tiebreaker `fd.id` ensures
deterministic ordering even when timestamps are NULL (partial sync states). The `id`
@@ -1042,6 +1136,10 @@ tiebreaker is cheap (primary key) and prevents unstable sort output.
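The same ordering contract can be mirrored in std-only Rust for intuition (a sketch, not the plan's code): `None` timestamps coalesce to 0, and ties fall back to the primary key.

```rust
use std::cmp::Reverse;

// Mirrors ORDER BY COALESCE(sort_column, 0) DESC, id DESC:
// NULL timestamps sort last and ties break deterministically on id.
fn sort_desc(rows: &mut Vec<(i64, Option<i64>)>) {
    rows.sort_by_key(|&(id, ts)| (Reverse(ts.unwrap_or(0)), Reverse(id)));
}
```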
**Note on SQLite FILTER syntax**: SQLite does not support `COUNT(*) FILTER (WHERE ...)`.
Use `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` instead (as shown above).
**Count query**: The total_count query runs separately against `filtered_discussions` (without
the LIMIT) using `SELECT COUNT(*) FROM filtered_discussions`. This is needed for `has_more`
metadata. The count uses the same filter CTEs but omits notes entirely.
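A sketch of how the two results might combine into `has_more` (struct and field names here are illustrative, not the plan's actual `meta` types):

```rust
// total_count: from SELECT COUNT(*) FROM filtered_discussions (no LIMIT).
// returned: row count of the paged query after LIMIT.
struct ListMeta { total_count: u64, returned: u64, has_more: bool }

fn make_meta(total_count: u64, returned: u64) -> ListMeta {
    ListMeta { total_count, returned, has_more: returned < total_count }
}
```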
#### 3c-ii. Note expansion query (--include-notes)
When `include_notes > 0`, after the main discussion query, run a **single batched query**
@@ -1103,6 +1201,7 @@ pub struct DiscussionListFilters {
pub since: Option<String>,
pub path: Option<String>,
pub noteable_type: Option<NoteableTypeFilter>,
pub contains: Option<String>,
pub sort: DiscussionSortField,
pub order: SortDirection,
pub include_notes: usize,
@@ -1117,6 +1216,7 @@ Where-clause construction uses `match` on typed enums — never raw string inter
- `since``d.first_note_at >= ?` (using `parse_since()`)
- `path``EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.position_new_path LIKE ?)`
- `noteable_type` → match: `Issue``d.noteable_type = 'Issue'`, `MergeRequest``d.noteable_type = 'MergeRequest'`
- `contains``EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.body LIKE '%' || ? || '%')`
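One caveat worth noting for `--contains`: `%` and `_` in user input act as LIKE wildcards. If literal matching is desired, an escape helper (illustrative, not in the plan) could be paired with `... LIKE '%' || ? || '%' ESCAPE '\'`:

```rust
// Escapes LIKE metacharacters so user-supplied --contains text matches
// literally when the SQL clause declares ESCAPE '\'.
fn escape_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        if matches!(c, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}
```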
#### 3e. Handler wiring
@@ -1128,7 +1228,7 @@ Add match arm:
Some(Commands::Discussions(args)) => handle_discussions(cli.config.as_deref(), args, robot_mode),
```
-Handler function:
+Handler function (with transaction ownership):
```rust
fn handle_discussions(
@@ -1143,6 +1243,10 @@ fn handle_discussions(
let effective_limit = args.limit.min(500);
let effective_include_notes = args.include_notes.min(20);
// Snapshot consistency: one transaction across all queries
let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
let filters = DiscussionListFilters {
limit: effective_limit,
project: args.project,
@@ -1153,12 +1257,15 @@ fn handle_discussions(
since: args.since,
path: args.path,
noteable_type: args.noteable_type,
contains: args.contains,
sort: args.sort,
order: args.order,
include_notes: effective_include_notes,
};
-let result = query_discussions(&conn, &filters, &config)?;
+let result = query_discussions(&tx, &filters, &config)?;
+tx.commit()?; // read-only, but closes cleanly
let format = if robot_mode && args.format == "table" {
"json"
@@ -1247,7 +1354,7 @@ CSV view: all fields, following same pattern as `print_list_notes_csv`.
.collect(),
```
-#### 3h. Query-plan validation
+#### 3h. Query-plan validation and indexes
Before merging the discussions command, capture `EXPLAIN QUERY PLAN` output for the three
primary query patterns:
@@ -1255,17 +1362,26 @@ primary query patterns:
- `--project <path> --since 7d --sort last-note`
- `--gitlab-discussion-id <id>`
-If plans show table scans on `notes` or `discussions` for these patterns, add targeted indexes
-to the `MIGRATIONS` array in `src/core/db.rs`:
+**Required baseline index** (directly hit by `--include-notes` expansion, which runs a
+`ROW_NUMBER() OVER (PARTITION BY discussion_id ORDER BY created_at DESC, id DESC)` window
+on the notes table):
-**Candidate indexes** (add only if EXPLAIN QUERY PLAN shows they're needed):
```sql
CREATE INDEX IF NOT EXISTS idx_notes_discussion_created_desc
ON notes(discussion_id, created_at DESC, id DESC);
```
This index is non-negotiable because the include-notes expansion query's performance is
directly proportional to how efficiently it can scan notes per discussion. Without it, SQLite
falls back to a full table scan of the 282K-row notes table for each batch.
+**Conditional indexes** (add only if EXPLAIN QUERY PLAN shows they're needed):
- `discussions(project_id, gitlab_discussion_id)` — for ambiguity preflight + direct ID lookup
- `discussions(merge_request_id, last_note_at, id)` — for MR-scoped + sorted queries
-- `notes(discussion_id, created_at DESC, id DESC)` — for `--include-notes` expansion
- `notes(discussion_id, is_system, created_at, id)` — for ranked_notes CTE ordering
-This is a measured approach: profile first, add indexes only where the query plan demands them.
-No speculative index creation.
+This is a measured approach: one required index for the critical new path, remaining indexes
+added only where the query plan demands them.
### Tests
@@ -1500,7 +1616,7 @@ fn discussions_ambiguous_gitlab_discussion_id_across_projects() {
};
let result = query_discussions(&conn, &filters, &Config::default());
assert!(result.is_err());
-// Error should be Ambiguous with both project paths
+// Error should be Ambiguous with both project paths and gitlab_project_ids
}
```
@@ -1579,6 +1695,99 @@ fn discussions_first_note_rollup_skips_system_notes() {
}
```
#### Test 15: --contains filter returns matching discussions
```rust
#[test]
fn query_discussions_contains_filter() {
let conn = create_test_db();
insert_project(&conn, 1);
insert_mr(&conn, 1, 1, 99, "Test MR");
insert_discussion(&conn, 1, "disc-match", 1, None, Some(1), "MergeRequest");
insert_discussion(&conn, 2, "disc-nomatch", 1, None, Some(1), "MergeRequest");
insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "I really do prefer this approach");
insert_note_in_discussion(&conn, 2, 501, 2, 1, "bob", "Looks good to me");
let filters = DiscussionListFilters {
contains: Some("really do prefer".to_string()),
..DiscussionListFilters::default_for_mr(99)
};
let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
assert_eq!(result.discussions.len(), 1);
assert_eq!(result.discussions[0].gitlab_discussion_id, "disc-match");
}
```
#### Test 16: Nested note bridge fields survive --fields filtering in robot mode
```rust
#[test]
fn discussions_nested_note_bridge_fields_forced_in_robot_mode() {
// When discussions --include-notes returns nested notes,
// bridge fields on nested notes must survive --fields filtering
let mut value = serde_json::json!({
"data": {
"discussions": [{
"gitlab_discussion_id": "abc",
"noteable_type": "MergeRequest",
"parent_iid": 99,
"project_path": "group/repo",
"gitlab_project_id": 42,
"note_count": 1,
"notes": [{
"body": "test note",
"project_path": "group/repo",
"gitlab_project_id": 42,
"noteable_type": "MergeRequest",
"parent_iid": 99,
"gitlab_discussion_id": "abc",
"gitlab_note_id": 500
}]
}]
}
});
// Agent requests only "body" on notes — bridge fields must still appear
filter_fields_robot(
&mut value,
"discussions",
&["note_count".to_string()],
);
let note = &value["data"]["discussions"][0]["notes"][0];
assert!(note.get("gitlab_discussion_id").is_some());
assert!(note.get("gitlab_note_id").is_some());
assert!(note.get("gitlab_project_id").is_some());
}
```
#### Test 17: Ambiguity preflight respects scope filters (no false positives)
```rust
#[test]
fn discussions_ambiguity_preflight_respects_scope_filters() {
let conn = create_test_db();
insert_project(&conn, 1); // "group/repo-a"
insert_project(&conn, 2); // "group/repo-b"
// Same gitlab_discussion_id in both projects
// But different noteable_types
insert_discussion(&conn, 1, "shared-id", 1, Some(1), None, "Issue");
insert_discussion(&conn, 2, "shared-id", 2, None, Some(1), "MergeRequest");
// Filter by noteable_type narrows to one project — should NOT fire ambiguity
let filters = DiscussionListFilters {
gitlab_discussion_id: Some("shared-id".to_string()),
noteable_type: Some(NoteableTypeFilter::MergeRequest),
project: None,
..DiscussionListFilters::default()
};
let result = query_discussions(&conn, &filters, &Config::default());
assert!(result.is_ok());
assert_eq!(result.unwrap().discussions.len(), 1);
}
```
---
## 4. Fix Robot-Docs Response Schemas
@@ -1629,6 +1838,7 @@ With:
"--since <period>",
"--path <filepath>",
"--noteable-type <Issue|MergeRequest>",
"--contains <text>",
"--include-notes <N>",
"--sort <first-note|last-note>",
"--order <asc|desc>",
@@ -1831,14 +2041,13 @@ Changes 1 and 2 can be done in parallel. Change 4 must come last since it docume
final schema of all preceding changes.
**Cross-cutting**: The Bridge Contract field guardrail (force-including bridge fields in robot
-mode) should be implemented as part of Change 1, since it modifies `filter_fields` in
-`robot.rs` which all subsequent changes depend on. The `BRIDGE_FIELDS_*` constants are defined
-once and reused by Changes 3 and 4.
+mode, including nested notes) should be implemented as part of Change 1, since it modifies
+`filter_fields` in `robot.rs` which all subsequent changes depend on. The `BRIDGE_FIELDS_*`
+constants are defined once and reused by Changes 3 and 4.
-**Cross-cutting**: The snapshot consistency pattern (deferred read transaction) should be
-implemented in Change 1 for `query_notes` and carried forward to Change 3 for
-`query_discussions`. This is a one-line wrapper that provides correctness guarantees with
-zero performance cost.
+**Cross-cutting**: The snapshot consistency pattern (deferred read transaction in handlers)
+should be implemented in Change 1 for `handle_notes` and carried forward to Change 3 for
+`handle_discussions`. Transaction ownership lives in handlers; query helpers accept `&Connection`.
---
@@ -1850,40 +2059,52 @@ After all changes:
`gitlab_discussion_id`, `gitlab_note_id`, and `gitlab_project_id` in the response
2. An agent can run `lore -J discussions --for-mr 3929 --resolution unresolved` to see all
open threads with their IDs
-3. An agent can run `lore -J mrs 3929` and see `gitlab_discussion_id`, `resolvable`,
+3. An agent can run `lore -J discussions --for-mr 3929 --contains "prefer the approach"` to
+find threads by text content without a separate `notes` round-trip
+4. An agent can run `lore -J mrs 3929` and see `gitlab_discussion_id`, `resolvable`,
`resolved`, and `last_note_at_iso` on each discussion group, plus `gitlab_note_id` on
each note within
-4. `lore robot-docs` lists actual field names for all commands
-5. All existing tests still pass
-6. No clippy warnings (pedantic + nursery)
-7. Robot-docs contract tests pass with field-set parity (not just string-contains), preventing
+5. `lore robot-docs` lists actual field names for all commands
+6. All existing tests still pass
+7. No clippy warnings (pedantic + nursery)
+8. Robot-docs contract tests pass with field-set parity (not just string-contains), preventing
future schema drift in both directions
-8. Bridge Contract fields (`project_path`, `gitlab_project_id`, `noteable_type`, `parent_iid`,
+9. Bridge Contract fields (`project_path`, `gitlab_project_id`, `noteable_type`, `parent_iid`,
`gitlab_discussion_id`, `gitlab_note_id`) are present in every applicable read payload
-9. Bridge Contract fields survive `--fields` filtering in robot mode (guardrail enforced)
-10. `--gitlab-discussion-id` filter works on both `notes` and `discussions` commands
-11. `--include-notes N` populates inline notes on `discussions` output via single batched query
-12. CLI-level contract integration tests verify bridge fields through the full handler path
-13. `gitlab_note_id` is available in notes list output (alongside `gitlab_id` for back-compat)
+10. Bridge Contract fields survive `--fields` filtering in robot mode (guardrail enforced),
+including nested notes within `discussions --include-notes`
+11. `--gitlab-discussion-id` filter works on both `notes` and `discussions` commands
+12. `--include-notes N` populates inline notes on `discussions` output via single batched query
+13. CLI-level contract integration tests verify bridge fields through the full handler path
+14. `gitlab_note_id` is available in notes list output (alongside `gitlab_id` for back-compat)
and in show detail notes, providing a uniform field name across all commands
-14. Ambiguity guardrail fires when `--gitlab-discussion-id` matches multiple projects without
+15. Ambiguity guardrail fires when `--gitlab-discussion-id` matches multiple projects without
`--project` specified — **including when LIMIT would have hidden the ambiguity** (preflight
-query runs before LIMIT)
-15. Output guardrails clamp `--limit` to 500 and `--include-notes` to 20; `meta` reports
+query runs before LIMIT). Error includes structured candidates with `gitlab_project_id`
+for machine consumption
+16. Ambiguity preflight is scope-aware: active filters (noteable_type, for_issue/for_mr) are
+applied alongside the discussion ID check, preventing false ambiguity when scope already
+narrows to one project
+17. Output guardrails clamp `--limit` to 500 and `--include-notes` to 20; `meta` reports
effective values and `has_more` truncation flag
-16. Discussion and show queries use deterministic ordering (COALESCE + id tiebreaker) to
+18. Discussion and show queries use deterministic ordering (COALESCE + id tiebreaker) to
prevent unstable output during partial sync states
17. Per-discussion truncation signals (`included_note_count`, `has_more_notes`) are accurate
19. Per-discussion truncation signals (`included_note_count`, `has_more_notes`) are accurate
for `--include-notes` output
18. Multi-query commands (`query_notes`, `query_discussions`) use deferred read transactions
for snapshot consistency during concurrent ingest
19. Discussion filters (`resolution`, `noteable_type`, `sort`, `order`) use typed enums
20. Multi-query handlers (`handle_notes`, `handle_discussions`) open deferred read transactions;
query helpers accept `&Connection` for snapshot consistency and testability
21. Discussion filters (`resolution`, `noteable_type`, `sort`, `order`) use typed enums
with match-to-SQL mapping — no raw string interpolation in query construction
20. First-note rollup correctly handles discussions with leading system notes — `first_author`
22. First-note rollup correctly handles discussions with leading system notes — `first_author`
and `first_note_body_snippet` always reflect the first non-system note
21. Query plans for primary discussion query patterns (`--for-mr`, `--project --since`,
23. Query plans for primary discussion query patterns (`--for-mr`, `--project --since`,
`--gitlab-discussion-id`) have been validated via EXPLAIN QUERY PLAN; targeted indexes
added only where scans were observed
24. The `notes(discussion_id, created_at DESC, id DESC)` index is present for `--include-notes`
expansion performance
25. Discussion query uses page-first CTE architecture: note rollups scan only the paged result
set, not all filtered discussions, keeping latency proportional to `--limit`
26. `--contains` filter on `discussions` returns only discussions with matching note text
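The page-first shape referenced in items 25 and 26 can be sketched as SQL assembled in Rust. Table and column names below are illustrative, not the actual schema; the point is that the filter/sort/LIMIT run inside a small CTE yielding only page IDs, and the note rollup joins against that page:

```rust
// Sketch of a page-first discussion query: the `paged` CTE applies
// filters, ordering, and LIMIT first, so the note rollup only ever
// scans notes belonging to the visible page. Names are illustrative.
fn page_first_sql(limit: i64) -> String {
    format!(
        "WITH paged AS (
             SELECT d.id
             FROM discussions d
             WHERE d.resolvable = 1 AND d.resolved = 0
             ORDER BY COALESCE(d.last_note_at, d.created_at) DESC, d.id DESC
             LIMIT {limit}
         )
         SELECT p.id, COUNT(n.id) AS note_count
         FROM paged p
         LEFT JOIN notes n ON n.discussion_id = p.id
         GROUP BY p.id"
    )
}
```

With this shape, cost of the rollup is bounded by `limit` rather than by the total number of filtered discussions in the project.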
---
@@ -1902,6 +2123,6 @@ After all changes:
- **`--with-write-hints` flag for inline glab endpoint templates** — rejected because this couples lore's read surface to glab's API surface, violating the read/write split principle. The Bridge Contract gives agents the raw identifiers; constructing glab commands is the agent's responsibility. Adding endpoint templates would require lore to track glab API changes, creating an unnecessary maintenance burden.
- **Show-command note ordering change (`ORDER BY COALESCE(position, ...), created_at, id`)** — rejected because show-command note ordering within a discussion thread is out of scope for this plan. The existing ordering works correctly for present data; the defensive COALESCE pattern is applied to discussion-level ordering where it matters for agent workflows.
- **Query-plan validation as a separate numbered workstream** — rejected because it adds delivery overhead without proportional benefit. Query-plan validation is integrated into workstream 3 as a pre-merge validation step (section 3h), with candidate indexes listed but only added when EXPLAIN QUERY PLAN shows they're needed. This keeps the measured approach without inflating the workstream count.
- **Add `gitlab_note_id` to show-command note detail structs** — rejected because show-command note detail structs already have `gitlab_id` (same value as `id`). The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_note_id` would create a duplicate and increase payload size without benefit.
- **Add `gitlab_discussion_id` to show-command discussion detail structs** — rejected because show-command discussion detail structs already have `gitlab_discussion_id`. The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_discussion_id` would create a duplicate and increase payload size without benefit.
- **Add `gitlab_project_id` to show-command discussion detail structs** — rejected because show-command discussion detail structs already have `gitlab_project_id`. The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_project_id` would create a duplicate and increase payload size without benefit.
- **`--project-id` immutable input filter across notes/discussions/show** — rejected because this is scope creep touching every command and changing CLI ergonomics. Agents already get `gitlab_project_id` in output to construct API calls; the input-side concern (project renames breaking `--project`) is theoretical and hasn't been observed in practice. The `--project` flag already supports fuzzy matching which handles most rename scenarios. If real-world evidence surfaces, this can be added later without breaking changes.
- **Schema versioning in robot-docs (`schema_version` field + semver policy)** — rejected because this tool has zero external consumers beyond our own agents, and the contract tests (field-set parity assertions) catch drift at compile time. Schema versioning adds bureaucratic overhead (version bumps, compatibility matrices, deprecation policies) without proportional benefit for an internal tool in early development. If lore gains external consumers, this can be reconsidered.
- **Remove "stale" rejected items that "conflict" with active sections** — rejected because the prior entries about show-command structs were stale from iteration 2 and have been cleaned up independently. The rejected section is cumulative by design — it prevents future reviewers from re-proposing changes that have already been evaluated.

View File

@@ -193,6 +193,7 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
"--as-of",
"--explain-score",
"--include-bots",
"--include-closed",
"--all-history",
],
),

View File

@@ -344,7 +344,14 @@ pub fn run_who(config: &Config, args: &WhoArgs) -> Result<WhoRun> {
.map(resolve_since_required)
.transpose()?;
let limit = usize::from(args.limit);
let result = query_workload(&conn, username, project_id, since_ms, limit)?;
let result = query_workload(
&conn,
username,
project_id,
since_ms,
limit,
args.include_closed,
)?;
Ok(WhoRun {
resolved_input: WhoResolvedInput {
mode: "workload".to_string(),
@@ -377,7 +384,7 @@ pub fn run_who(config: &Config, args: &WhoArgs) -> Result<WhoRun> {
WhoMode::Active => {
let since_ms = resolve_since(args.since.as_deref(), "7d")?;
let limit = usize::from(args.limit);
let result = query_active(&conn, project_id, since_ms, limit)?;
let result = query_active(&conn, project_id, since_ms, limit, args.include_closed)?;
Ok(WhoRun {
resolved_input: WhoResolvedInput {
mode: "active".to_string(),
@@ -1149,6 +1156,7 @@ fn query_workload(
project_id: Option<i64>,
since_ms: Option<i64>,
limit: usize,
include_closed: bool,
) -> Result<WorkloadResult> {
let limit_plus_one = (limit + 1) as i64;
@@ -1245,7 +1253,14 @@ fn query_workload(
.collect::<std::result::Result<Vec<_>, _>>()?;
// Query 4: Unresolved discussions where user participated
let disc_sql = "SELECT d.noteable_type,
let state_filter = if include_closed {
""
} else {
" AND (i.id IS NULL OR i.state = 'opened')
AND (m.id IS NULL OR m.state = 'opened')"
};
let disc_sql = format!(
"SELECT d.noteable_type,
COALESCE(i.iid, m.iid) AS entity_iid,
(p.path_with_namespace ||
CASE WHEN d.noteable_type = 'MergeRequest' THEN '!' ELSE '#' END ||
@@ -1266,10 +1281,12 @@ fn query_workload(
)
AND (?2 IS NULL OR d.project_id = ?2)
AND (?3 IS NULL OR d.last_note_at >= ?3)
{state_filter}
ORDER BY d.last_note_at DESC
LIMIT ?4";
LIMIT ?4"
);
let mut stmt = conn.prepare_cached(disc_sql)?;
let mut stmt = conn.prepare_cached(&disc_sql)?;
let unresolved_discussions: Vec<WorkloadDiscussion> = stmt
.query_map(
rusqlite::params![username, project_id, since_ms, limit_plus_one],
@@ -1451,35 +1468,63 @@ fn query_active(
project_id: Option<i64>,
since_ms: i64,
limit: usize,
include_closed: bool,
) -> Result<ActiveResult> {
let limit_plus_one = (limit + 1) as i64;
// Total unresolved count -- two static variants
let total_sql_global = "SELECT COUNT(*) FROM discussions d
WHERE d.resolvable = 1 AND d.resolved = 0
AND d.last_note_at >= ?1";
let total_sql_scoped = "SELECT COUNT(*) FROM discussions d
WHERE d.resolvable = 1 AND d.resolved = 0
AND d.last_note_at >= ?1
AND d.project_id = ?2";
let total_unresolved_in_window: u32 = match project_id {
None => conn.query_row(total_sql_global, rusqlite::params![since_ms], |row| {
row.get(0)
})?,
Some(pid) => conn.query_row(total_sql_scoped, rusqlite::params![since_ms, pid], |row| {
row.get(0)
})?,
// State filter for open-entities-only (default behavior)
let state_joins = if include_closed {
""
} else {
" LEFT JOIN issues i ON d.issue_id = i.id
LEFT JOIN merge_requests m ON d.merge_request_id = m.id"
};
let state_filter = if include_closed {
""
} else {
" AND (i.id IS NULL OR i.state = 'opened')
AND (m.id IS NULL OR m.state = 'opened')"
};
// Active discussions with context -- two static SQL variants
let sql_global = "
// Total unresolved count -- conditionally built
let total_sql_global = format!(
"SELECT COUNT(*) FROM discussions d
{state_joins}
WHERE d.resolvable = 1 AND d.resolved = 0
AND d.last_note_at >= ?1
{state_filter}"
);
let total_sql_scoped = format!(
"SELECT COUNT(*) FROM discussions d
{state_joins}
WHERE d.resolvable = 1 AND d.resolved = 0
AND d.last_note_at >= ?1
AND d.project_id = ?2
{state_filter}"
);
let total_unresolved_in_window: u32 = match project_id {
None => conn.query_row(&total_sql_global, rusqlite::params![since_ms], |row| {
row.get(0)
})?,
Some(pid) => {
conn.query_row(&total_sql_scoped, rusqlite::params![since_ms, pid], |row| {
row.get(0)
})?
}
};
// Active discussions with context -- conditionally built SQL
let sql_global = format!(
"
WITH picked AS (
SELECT d.id, d.noteable_type, d.issue_id, d.merge_request_id,
d.project_id, d.last_note_at
FROM discussions d
{state_joins}
WHERE d.resolvable = 1 AND d.resolved = 0
AND d.last_note_at >= ?1
{state_filter}
ORDER BY d.last_note_at DESC
LIMIT ?2
),
@@ -1520,16 +1565,20 @@ fn query_active(
LEFT JOIN note_counts nc ON nc.discussion_id = p.id
LEFT JOIN participants pa ON pa.discussion_id = p.id
ORDER BY p.last_note_at DESC
";
"
);
let sql_scoped = "
let sql_scoped = format!(
"
WITH picked AS (
SELECT d.id, d.noteable_type, d.issue_id, d.merge_request_id,
d.project_id, d.last_note_at
FROM discussions d
{state_joins}
WHERE d.resolvable = 1 AND d.resolved = 0
AND d.last_note_at >= ?1
AND d.project_id = ?2
{state_filter}
ORDER BY d.last_note_at DESC
LIMIT ?3
),
@@ -1570,7 +1619,8 @@ fn query_active(
LEFT JOIN note_counts nc ON nc.discussion_id = p.id
LEFT JOIN participants pa ON pa.discussion_id = p.id
ORDER BY p.last_note_at DESC
";
"
);
// Row-mapping closure shared between both variants
let map_row = |row: &rusqlite::Row| -> rusqlite::Result<ActiveDiscussion> {
@@ -1613,12 +1663,12 @@ fn query_active(
// Select variant first, then prepare exactly one statement
let discussions: Vec<ActiveDiscussion> = match project_id {
None => {
let mut stmt = conn.prepare_cached(sql_global)?;
let mut stmt = conn.prepare_cached(&sql_global)?;
stmt.query_map(rusqlite::params![since_ms, limit_plus_one], &map_row)?
.collect::<std::result::Result<Vec<_>, _>>()?
}
Some(pid) => {
let mut stmt = conn.prepare_cached(sql_scoped)?;
let mut stmt = conn.prepare_cached(&sql_scoped)?;
stmt.query_map(rusqlite::params![since_ms, pid, limit_plus_one], &map_row)?
.collect::<std::result::Result<Vec<_>, _>>()?
}
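The conditional SQL assembly in this file follows one pattern worth keeping consistent: optional JOIN and WHERE fragments are selected up front from a flag, then spliced in with `format!`. A minimal self-contained sketch (simplified table names, no database required):

```rust
// Builds the unresolved-discussions count SQL with or without the
// open-entities-only filter, mirroring the include_closed handling
// above. ?1 is the time-window bound in the real query.
fn unresolved_sql(include_closed: bool) -> String {
    let state_joins = if include_closed {
        ""
    } else {
        " LEFT JOIN issues i ON d.issue_id = i.id
          LEFT JOIN merge_requests m ON d.merge_request_id = m.id"
    };
    let state_filter = if include_closed {
        ""
    } else {
        " AND (i.id IS NULL OR i.state = 'opened')
          AND (m.id IS NULL OR m.state = 'opened')"
    };
    format!(
        "SELECT COUNT(*) FROM discussions d{state_joins}
         WHERE d.resolvable = 1 AND d.resolved = 0
           AND d.last_note_at >= ?1{state_filter}"
    )
}
```

Because both fragments are constants chosen by a single bool, only two distinct SQL strings ever exist per query site, so `prepare_cached` still gets full cache hits despite the `format!`.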

View File

@@ -54,15 +54,27 @@ fn insert_mr(conn: &Connection, id: i64, project_id: i64, iid: i64, author: &str
}
fn insert_issue(conn: &Connection, id: i64, project_id: i64, iid: i64, author: &str) {
insert_issue_with_state(conn, id, project_id, iid, author, "opened");
}
fn insert_issue_with_state(
conn: &Connection,
id: i64,
project_id: i64,
iid: i64,
author: &str,
state: &str,
) {
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, author_username, created_at, updated_at, last_seen_at)
VALUES (?1, ?2, ?3, ?4, ?5, 'opened', ?6, ?7, ?8, ?9)",
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10)",
rusqlite::params![
id,
id * 10,
project_id,
iid,
format!("Issue {iid}"),
state,
author,
now_ms(),
now_ms(),
@@ -134,6 +146,24 @@ fn insert_diffnote(
.unwrap();
}
fn insert_note(conn: &Connection, id: i64, discussion_id: i64, project_id: i64, author: &str) {
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, note_type, is_system, author_username, body, created_at, updated_at, last_seen_at)
VALUES (?1, ?2, ?3, ?4, 'DiscussionNote', 0, ?5, 'comment', ?6, ?7, ?8)",
rusqlite::params![
id,
id * 10,
discussion_id,
project_id,
author,
now_ms(),
now_ms(),
now_ms()
],
)
.unwrap();
}
fn insert_assignee(conn: &Connection, issue_id: i64, username: &str) {
conn.execute(
"INSERT INTO issue_assignees (issue_id, username) VALUES (?1, ?2)",
@@ -263,6 +293,7 @@ fn test_is_file_path_discrimination() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
})
.unwrap(),
@@ -286,6 +317,7 @@ fn test_is_file_path_discrimination() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
})
.unwrap(),
@@ -309,6 +341,7 @@ fn test_is_file_path_discrimination() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
})
.unwrap(),
@@ -332,6 +365,7 @@ fn test_is_file_path_discrimination() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
})
.unwrap(),
@@ -355,6 +389,7 @@ fn test_is_file_path_discrimination() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
})
.unwrap(),
@@ -378,6 +413,7 @@ fn test_is_file_path_discrimination() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
})
.unwrap(),
@@ -402,6 +438,7 @@ fn test_detail_rejected_outside_expert_mode() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
};
let mode = resolve_mode(&args).unwrap();
@@ -430,6 +467,7 @@ fn test_detail_allowed_in_expert_mode() {
as_of: None,
explain_score: false,
include_bots: false,
include_closed: false,
all_history: false,
};
let mode = resolve_mode(&args).unwrap();
@@ -579,7 +617,7 @@ fn test_workload_query() {
insert_assignee(&conn, 1, "dev_a");
insert_mr(&conn, 1, 1, 100, "dev_a", "opened");
let result = query_workload(&conn, "dev_a", None, None, 20).unwrap();
let result = query_workload(&conn, "dev_a", None, None, 20, true).unwrap();
assert_eq!(result.assigned_issues.len(), 1);
assert_eq!(result.authored_mrs.len(), 1);
}
@@ -626,7 +664,7 @@ fn test_active_query() {
// Second note by same participant -- note_count should be 2, participants still ["reviewer_b"]
insert_diffnote(&conn, 2, 1, 1, "reviewer_b", "src/foo.rs", "follow-up");
let result = query_active(&conn, None, 0, 20).unwrap();
let result = query_active(&conn, None, 0, 20, true).unwrap();
assert_eq!(result.total_unresolved_in_window, 1);
assert_eq!(result.discussions.len(), 1);
assert_eq!(result.discussions[0].participants, vec!["reviewer_b"]);
@@ -878,7 +916,7 @@ fn test_active_participants_sorted() {
insert_diffnote(&conn, 1, 1, 1, "zebra_user", "src/foo.rs", "note 1");
insert_diffnote(&conn, 2, 1, 1, "alpha_user", "src/foo.rs", "note 2");
let result = query_active(&conn, None, 0, 20).unwrap();
let result = query_active(&conn, None, 0, 20, true).unwrap();
assert_eq!(
result.discussions[0].participants,
vec!["alpha_user", "zebra_user"]
@@ -3265,3 +3303,94 @@ fn test_deterministic_accumulation_order() {
);
}
}
// ─── Tests: include_closed filter ────────────────────────────────────────────
#[test]
fn workload_excludes_closed_entity_discussions() {
let conn = setup_test_db();
insert_project(&conn, 1, "group/repo");
// Open issue with unresolved discussion
insert_issue_with_state(&conn, 10, 1, 10, "someone", "opened");
insert_discussion(&conn, 100, 1, None, Some(10), true, false);
insert_note(&conn, 1000, 100, 1, "alice");
// Closed issue with unresolved discussion
insert_issue_with_state(&conn, 20, 1, 20, "someone", "closed");
insert_discussion(&conn, 200, 1, None, Some(20), true, false);
insert_note(&conn, 2000, 200, 1, "alice");
// Default: exclude closed
let result = query_workload(&conn, "alice", None, None, 50, false).unwrap();
assert_eq!(result.unresolved_discussions.len(), 1);
assert_eq!(result.unresolved_discussions[0].entity_iid, 10);
}
#[test]
fn workload_include_closed_flag_shows_all() {
let conn = setup_test_db();
insert_project(&conn, 1, "group/repo");
insert_issue_with_state(&conn, 10, 1, 10, "someone", "opened");
insert_discussion(&conn, 100, 1, None, Some(10), true, false);
insert_note(&conn, 1000, 100, 1, "alice");
insert_issue_with_state(&conn, 20, 1, 20, "someone", "closed");
insert_discussion(&conn, 200, 1, None, Some(20), true, false);
insert_note(&conn, 2000, 200, 1, "alice");
let result = query_workload(&conn, "alice", None, None, 50, true).unwrap();
assert_eq!(result.unresolved_discussions.len(), 2);
}
#[test]
fn workload_excludes_merged_mr_discussions() {
let conn = setup_test_db();
insert_project(&conn, 1, "group/repo");
// Open MR with unresolved discussion
insert_mr(&conn, 10, 1, 10, "someone", "opened");
insert_discussion(&conn, 100, 1, Some(10), None, true, false);
insert_note(&conn, 1000, 100, 1, "alice");
// Merged MR with unresolved discussion
insert_mr(&conn, 20, 1, 20, "someone", "merged");
insert_discussion(&conn, 200, 1, Some(20), None, true, false);
insert_note(&conn, 2000, 200, 1, "alice");
let result = query_workload(&conn, "alice", None, None, 50, false).unwrap();
assert_eq!(result.unresolved_discussions.len(), 1);
assert_eq!(result.unresolved_discussions[0].entity_iid, 10);
// include_closed shows both
let result = query_workload(&conn, "alice", None, None, 50, true).unwrap();
assert_eq!(result.unresolved_discussions.len(), 2);
}
#[test]
fn active_excludes_closed_entity_discussions() {
let conn = setup_test_db();
insert_project(&conn, 1, "group/repo");
// Open issue with unresolved discussion
insert_issue_with_state(&conn, 10, 1, 10, "someone", "opened");
insert_discussion(&conn, 100, 1, None, Some(10), true, false);
insert_note(&conn, 1000, 100, 1, "alice");
// Closed issue with unresolved discussion
insert_issue_with_state(&conn, 20, 1, 20, "someone", "closed");
insert_discussion(&conn, 200, 1, None, Some(20), true, false);
insert_note(&conn, 2000, 200, 1, "alice");
// Default: exclude closed
let result = query_active(&conn, None, 0, 50, false).unwrap();
assert_eq!(result.discussions.len(), 1);
assert_eq!(result.discussions[0].entity_iid, 10);
assert_eq!(result.total_unresolved_in_window, 1);
// include_closed shows both
let result = query_active(&conn, None, 0, 50, true).unwrap();
assert_eq!(result.discussions.len(), 2);
assert_eq!(result.total_unresolved_in_window, 2);
}
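The `limit_plus_one` idiom used by both `query_workload` and `query_active` is how truncation can be reported without a second COUNT query: fetch one row beyond the limit, then truncate. A minimal sketch of the post-fetch step:

```rust
// Given rows fetched with LIMIT limit+1, report whether the page was
// truncated and drop the sentinel row. `rows` stands in for the
// already-collected query_map result.
fn truncate_page<T>(mut rows: Vec<T>, limit: usize) -> (Vec<T>, bool) {
    let has_more = rows.len() > limit;
    rows.truncate(limit);
    (rows, has_more)
}
```

The same shape backs the `has_more` flag in `meta` output elsewhere in the plan.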

View File

@@ -964,6 +964,10 @@ pub struct WhoArgs {
#[arg(long = "include-bots", help_heading = "Scoring")]
pub include_bots: bool,
/// Include discussions on closed issues and merged/closed MRs
#[arg(long, help_heading = "Filters")]
pub include_closed: bool,
/// Remove the default time window (query all history). Conflicts with --since.
#[arg(
long = "all-history",

View File

@@ -1,6 +1,7 @@
use rusqlite::Connection;
use super::error::{LoreError, Result};
use super::file_history::resolve_rename_chain;
// ─── SQL Helpers ─────────────────────────────────────────────────────────────
@@ -149,6 +150,16 @@ pub fn build_path_query(
is_prefix: false,
}),
SuffixResult::Ambiguous(candidates) => {
// Check if all candidates are the same file connected by renames.
// resolve_rename_chain requires a concrete project_id.
if let Some(pid) = project_id
&& let Some(resolved) = try_resolve_rename_ambiguity(conn, pid, &candidates)?
{
return Ok(PathQuery {
value: resolved,
is_prefix: false,
});
}
let list = candidates
.iter()
.map(|p| format!(" {p}"))
@@ -239,6 +250,59 @@ pub fn suffix_probe(
}
}
/// Maximum rename hops when resolving ambiguity.
const AMBIGUITY_MAX_RENAME_HOPS: usize = 10;
/// When suffix probe returns multiple paths, check if they are all the same file
/// connected by renames. If so, return the "newest" path (the leaf of the chain
/// that is never renamed away from). Returns `None` if truly ambiguous.
fn try_resolve_rename_ambiguity(
conn: &Connection,
project_id: i64,
candidates: &[String],
) -> Result<Option<String>> {
// BFS from the first candidate to discover the full rename chain.
let chain = resolve_rename_chain(conn, project_id, &candidates[0], AMBIGUITY_MAX_RENAME_HOPS)?;
// If any candidate is NOT in the chain, these are genuinely different files.
if !candidates.iter().all(|c| chain.contains(c)) {
return Ok(None);
}
// All candidates are the same file. Find the "newest" path: the one that
// appears as new_path in a rename but is never old_path in a subsequent rename
// (within the chain). This is the leaf of the rename DAG.
let placeholders: Vec<String> = (0..chain.len()).map(|i| format!("?{}", i + 2)).collect();
let in_clause = placeholders.join(", ");
// Find paths that are old_path in a rename where new_path is also in the chain.
let sql = format!(
"SELECT DISTINCT old_path FROM mr_file_changes \
WHERE project_id = ?1 \
AND change_type = 'renamed' \
AND old_path IN ({in_clause}) \
AND new_path IN ({in_clause})"
);
let mut stmt = conn.prepare(&sql)?;
let mut params: Vec<Box<dyn rusqlite::types::ToSql>> = Vec::new();
params.push(Box::new(project_id));
for p in &chain {
params.push(Box::new(p.clone()));
}
let param_refs: Vec<&dyn rusqlite::types::ToSql> = params.iter().map(|p| p.as_ref()).collect();
let old_paths: Vec<String> = stmt
.query_map(param_refs.as_slice(), |row| row.get(0))?
.filter_map(std::result::Result::ok)
.collect();
// The newest path is a candidate that is NOT an old_path in any intra-chain
// rename. If every candidate has been renamed away from (a rename cycle),
// fall back to the first candidate rather than erroring.
let newest = candidates.iter().find(|c| !old_paths.contains(c));
Ok(newest.cloned().or_else(|| Some(candidates[0].clone())))
}
#[cfg(test)]
#[path = "path_resolver_tests.rs"]
mod tests;
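Once the same-file check has passed, the "newest path" selection in `try_resolve_rename_ambiguity` reduces to a small graph question: among the candidates, pick the one that is never the source of an intra-chain rename. A dependency-free sketch of just that selection step (hypothetical helper, not the rusqlite-backed version above, and assuming the caller already verified all candidates share one rename chain):

```rust
use std::collections::HashSet;

// candidates: suffix-probe matches; renames: (old_path, new_path) edges.
// The newest path is a candidate that never appears as an old_path in a
// rename whose endpoints are both candidates. Falls back to the first
// candidate if every path was renamed away from (a rename cycle).
fn newest_path(candidates: &[String], renames: &[(String, String)]) -> Option<String> {
    let chain: HashSet<&str> = candidates.iter().map(String::as_str).collect();
    let old_paths: HashSet<&str> = renames
        .iter()
        .filter(|(old, new)| chain.contains(old.as_str()) && chain.contains(new.as_str()))
        .map(|(old, _)| old.as_str())
        .collect();
    candidates
        .iter()
        .find(|c| !old_paths.contains(c.as_str()))
        .or_else(|| candidates.first())
        .cloned()
}
```

This mirrors the three-hop test below: `lib/config.ts -> src/config.ts -> src/core/config.ts` leaves only `src/core/config.ts` outside the old-path set.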

View File

@@ -288,3 +288,80 @@ fn test_exact_match_preferred_over_suffix() {
assert_eq!(pq.value, "README.md");
assert!(!pq.is_prefix);
}
fn seed_rename(conn: &Connection, mr_id: i64, project_id: i64, old_path: &str, new_path: &str) {
conn.execute(
"INSERT INTO mr_file_changes (merge_request_id, project_id, old_path, new_path, change_type)
VALUES (?1, ?2, ?3, ?4, 'renamed')",
rusqlite::params![mr_id, project_id, old_path, new_path],
)
.unwrap();
}
// ─── rename-aware ambiguity resolution ──────────────────────────────────────
#[test]
fn test_ambiguity_resolved_by_rename_chain() {
let conn = setup_test_db();
seed_project(&conn, 1);
seed_mr(&conn, 1, 1);
seed_mr(&conn, 2, 1);
// File was at src/old/operators.ts, then renamed to src/new/operators.ts
seed_file_change(&conn, 1, 1, "src/old/operators.ts");
seed_rename(&conn, 2, 1, "src/old/operators.ts", "src/new/operators.ts");
// Bare "operators.ts" matches both paths via suffix probe, but they're
// connected by a rename — should auto-resolve to the newest path.
let pq = build_path_query(&conn, "operators.ts", Some(1)).unwrap();
assert_eq!(pq.value, "src/new/operators.ts");
assert!(!pq.is_prefix);
}
#[test]
fn test_ambiguity_not_resolved_when_genuinely_different_files() {
let conn = setup_test_db();
seed_project(&conn, 1);
seed_mr(&conn, 1, 1);
// Two genuinely different files with the same name (no rename connecting them)
seed_file_change(&conn, 1, 1, "src/utils/helpers.ts");
seed_file_change(&conn, 1, 1, "tests/utils/helpers.ts");
let err = build_path_query(&conn, "helpers.ts", Some(1)).unwrap_err();
assert!(err.to_string().contains("matches multiple paths"));
}
#[test]
fn test_ambiguity_rename_chain_with_three_hops() {
let conn = setup_test_db();
seed_project(&conn, 1);
seed_mr(&conn, 1, 1);
seed_mr(&conn, 2, 1);
seed_mr(&conn, 3, 1);
// File named "config.ts" moved twice: lib/ -> src/ -> src/core/
seed_file_change(&conn, 1, 1, "lib/config.ts");
seed_rename(&conn, 2, 1, "lib/config.ts", "src/config.ts");
seed_rename(&conn, 3, 1, "src/config.ts", "src/core/config.ts");
// "config.ts" matches lib/config.ts, src/config.ts, src/core/config.ts via suffix
let pq = build_path_query(&conn, "config.ts", Some(1)).unwrap();
assert_eq!(pq.value, "src/core/config.ts");
assert!(!pq.is_prefix);
}
#[test]
fn test_ambiguity_rename_without_project_id_stays_ambiguous() {
let conn = setup_test_db();
seed_project(&conn, 1);
seed_mr(&conn, 1, 1);
seed_mr(&conn, 2, 1);
seed_file_change(&conn, 1, 1, "src/old/utils.ts");
seed_rename(&conn, 2, 1, "src/old/utils.ts", "src/new/utils.ts");
// Without project_id, rename resolution is skipped → stays ambiguous
let err = build_path_query(&conn, "utils.ts", None).unwrap_err();
assert!(err.to_string().contains("matches multiple paths"));
}

View File

@@ -39,6 +39,7 @@ use lore::core::dependent_queue::release_all_locked_jobs;
use lore::core::error::{LoreError, RobotErrorOutput};
use lore::core::logging;
use lore::core::metrics::MetricsLayer;
use lore::core::path_resolver::{build_path_query, normalize_repo_path};
use lore::core::paths::{get_config_path, get_db_path, get_log_dir};
use lore::core::project::resolve_project;
use lore::core::shutdown::ShutdownSignal;
@@ -1874,9 +1875,27 @@ fn handle_file_history(
.effective_project(args.project.as_deref())
.map(String::from);
let normalized = normalize_repo_path(&args.path);
// Resolve bare filenames before querying (same path resolution as trace/who)
let db_path_tmp = get_db_path(config.storage.db_path.as_deref());
let conn_tmp = create_connection(&db_path_tmp)?;
let project_id_tmp = project
.as_deref()
.map(|p| resolve_project(&conn_tmp, p))
.transpose()?;
let pq = build_path_query(&conn_tmp, &normalized, project_id_tmp)?;
let resolved_path = if pq.is_prefix {
// Directory prefix — file-history is file-oriented, pass the raw path.
// Don't use pq.value which contains LIKE-escaped metacharacters.
normalized.trim_end_matches('/').to_string()
} else {
pq.value
};
let result = run_file_history(
&config,
&args.path,
&resolved_path,
project.as_deref(),
args.no_follow_renames,
args.merged,
@@ -1901,7 +1920,8 @@ fn handle_trace(
let start = std::time::Instant::now();
let config = Config::load(config_override)?;
let (path, line_requested) = parse_trace_path(&args.path);
let (raw_path, line_requested) = parse_trace_path(&args.path);
let normalized = normalize_repo_path(&raw_path);
if line_requested.is_some() && !robot_mode {
eprintln!(
@@ -1920,6 +1940,16 @@ fn handle_trace(
.map(|p| resolve_project(&conn, p))
.transpose()?;
// Resolve bare filenames (e.g. "operators.ts" -> "src/utils/operators.ts")
let pq = build_path_query(&conn, &normalized, project_id)?;
let path = if pq.is_prefix {
// Directory prefix — trace is file-oriented, pass the raw path.
// Don't use pq.value which contains LIKE-escaped metacharacters.
normalized.trim_end_matches('/').to_string()
} else {
pq.value
};
let result = run_trace(
&conn,
project_id,