feat(path): rename-aware ambiguity resolution for suffix probe
When a bare filename like 'operators.ts' matches multiple full paths, check whether
they are the same file connected by renames (via BFS on mr_file_changes). If so,
auto-resolve to the newest path instead of erroring.

Also wires path resolution into the file-history and trace commands so bare
filenames work everywhere.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
File diff suppressed because one or more lines are too long
@@ -1 +1 @@
-bd-1sc6
+bd-1elx
docs/plan-expose-discussion-ids.feedback-5.md (new file, 140 lines)
@@ -0,0 +1,140 @@
Your iteration 4 plan is already strong. The highest-impact revisions are around query shape, transaction boundaries, and contract stability for agents.

1. **Switch discussions query to a two-phase page-first architecture**

Analysis: Current `ranked_notes` runs over every filtered discussion before `LIMIT`, which can explode on project-wide queries. A page-first plan keeps complexity proportional to `limit`, improves tail latency, and reduces memory churn.

```diff
@@ ## 3c. SQL Query
-Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated
-subqueries.
+Core query is split into two phases for scalability:
+1) `paged_discussions` applies filters/sort/LIMIT and returns only page IDs.
+2) Note rollups and optional `--include-notes` expansion run only for those page IDs.
+This bounds note scanning to visible results and stabilizes latency on large projects.

-WITH filtered_discussions AS (
+WITH filtered_discussions AS (
...
),
-ranked_notes AS (
+paged_discussions AS (
+  SELECT id
+  FROM filtered_discussions
+  ORDER BY COALESCE({sort_column}, 0) {order}, id {order}
+  LIMIT ?
+),
+ranked_notes AS (
...
- WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
+ WHERE n.discussion_id IN (SELECT id FROM paged_discussions)
)
```

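The `{sort_column}` and `{order}` placeholders above should be filled from typed enums, never from user-supplied strings; only the `LIMIT` value is bound as a parameter. A minimal std-only Rust sketch of that rule (the enum and function names here are illustrative, not the project's actual types):

```rust
// Illustrative sort enums mirroring the plan's {sort_column}/{order} placeholders.
#[derive(Clone, Copy)]
enum DiscussionSortField {
    FirstNoteAt,
    LastNoteAt,
    NoteCount,
}

#[derive(Clone, Copy)]
enum SortDirection {
    Asc,
    Desc,
}

// Map typed enums to fixed SQL fragments; user input never reaches the query
// text, so there is no injection surface in the ORDER BY clause.
fn order_clause(sort: DiscussionSortField, order: SortDirection) -> String {
    let column = match sort {
        DiscussionSortField::FirstNoteAt => "first_note_at",
        DiscussionSortField::LastNoteAt => "last_note_at",
        DiscussionSortField::NoteCount => "note_count",
    };
    let dir = match order {
        SortDirection::Asc => "ASC",
        SortDirection::Desc => "DESC",
    };
    // Stable tiebreaker on id, matching the paged_discussions CTE.
    format!("ORDER BY COALESCE({column}, 0) {dir}, id {dir}")
}
```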
2. **Move snapshot transaction ownership to handlers (not query helpers)**

Analysis: This avoids nested transaction edge cases, keeps function signatures clean, and guarantees one snapshot across count + page + include-notes + serialization metadata.

```diff
@@ ## Cross-cutting: snapshot consistency
-Wrap `query_notes` and `query_discussions` in a deferred read transaction.
+Open one deferred read transaction in each handler (`handle_notes`, `handle_discussions`)
+and pass `&Transaction` into query helpers. Query helpers do not open/commit transactions.
+This guarantees a single snapshot across all subqueries and avoids nested tx pitfalls.

-pub fn query_discussions(conn: &Connection, ...)
+pub fn query_discussions(tx: &rusqlite::Transaction<'_>, ...)
```

3. **Add immutable input filter `--project-id` across notes/discussions/show**

Analysis: You already expose `gitlab_project_id` because paths are mutable; input should support the same immutable selector. This removes failure modes after project renames/transfers.

```diff
@@ ## 3a. CLI Args
+    /// Filter by immutable GitLab project ID
+    #[arg(long, help_heading = "Filters", conflicts_with = "project")]
+    pub project_id: Option<i64>,
@@ ## Bridge Contract
+Input symmetry rule: commands that accept `--project` should also accept `--project-id`.
+If both are present, return usage error (exit code 2).
```

4. **Enforce bridge fields for nested notes in `discussions --include-notes`**

Analysis: Current guardrail is entity-level; nested notes can still lose required IDs under aggressive filtering. This is a contract hole for write-bridging.

```diff
@@ ### Field Filtering Guardrail
-In robot mode, `filter_fields` MUST force-include Bridge Contract fields...
+In robot mode, `filter_fields` MUST force-include Bridge Contract fields at all returned levels:
+- discussion row fields
+- nested note fields when `discussions --include-notes` is used

+const BRIDGE_FIELDS_DISCUSSION_NOTES: &[&str] = &[
+    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
+    "gitlab_discussion_id", "gitlab_note_id",
+];
```

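The nested-note guardrail reduces to a set union computed before filtering. A std-only sketch of the rule (`effective_fields` is a hypothetical helper; the real `filter_fields` operates on JSON values):

```rust
use std::collections::BTreeSet;

// Bridge set for nested notes, as proposed above.
const BRIDGE_FIELDS_DISCUSSION_NOTES: &[&str] = &[
    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
    "gitlab_discussion_id", "gitlab_note_id",
];

// Union the caller's requested fields with the bridge set, so a narrow
// --fields list can never strip write-bridging identifiers.
fn effective_fields<'a>(requested: &[&'a str], bridge: &[&'a str]) -> BTreeSet<&'a str> {
    requested.iter().chain(bridge.iter()).copied().collect()
}
```

Applying the same union to each element of the nested `notes` array closes the gap described in the analysis.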
5. **Make ambiguity preflight scope-aware and machine-actionable**

Analysis: Current preflight checks only `gitlab_discussion_id`, which can produce false ambiguity when additional filters already narrow to one project. Also, agents need structured candidates, not only free-text.

```diff
@@ ### Ambiguity Guardrail
-SELECT DISTINCT p.path_with_namespace
+SELECT DISTINCT p.path_with_namespace, p.gitlab_project_id
 FROM discussions d
 JOIN projects p ON p.id = d.project_id
-WHERE d.gitlab_discussion_id = ?
+WHERE d.gitlab_discussion_id = ?
+  /* plus active scope filters: noteable_type, for_issue/for_mr, since/path when present */
 LIMIT 3

-Return LoreError::Ambiguous with message
+Return LoreError::Ambiguous with structured details:
+`{ code, message, candidates:[{project_path, gitlab_project_id}], suggestion }`
```

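From structured candidates, ready-to-run retry commands can be derived mechanically. A sketch (the struct follows the candidate shape proposed above; the `lore` invocation mirrors the document's CLI examples):

```rust
// Candidate shape following the proposed structured error payload.
struct AmbiguousCandidate {
    project_path: String,
    #[allow(dead_code)] // would be serialized into the JSON error payload
    gitlab_project_id: i64,
}

// Derive one ready-to-run retry command per candidate project, so an agent
// can pick one and re-invoke without parsing free-text error messages.
fn retry_actions(disc_id: &str, candidates: &[AmbiguousCandidate]) -> Vec<String> {
    candidates
        .iter()
        .map(|c| {
            format!(
                "lore -J discussions --gitlab-discussion-id {disc_id} --project {}",
                c.project_path
            )
        })
        .collect()
}
```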
6. **Add `--contains` filter to `discussions`**

Analysis: This is a high-utility agent workflow gap. Agents frequently need “find thread by text then reply”; forcing a separate `notes` search round-trip is unnecessary.

```diff
@@ ## 3a. CLI Args
+    /// Filter discussions whose notes contain text
+    #[arg(long, help_heading = "Filters")]
+    pub contains: Option<String>,
@@ ## 3d. Filters struct
+    pub contains: Option<String>,
@@ ## 3d. Where-clause construction
+- `path` -> EXISTS (...)
+- `contains` -> EXISTS (
+    SELECT 1 FROM notes n
+    WHERE n.discussion_id = d.id
+    AND n.body LIKE ?
+  )
```

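One detail the LIKE clause leaves open: a literal `%` or `_` in `--contains` would act as a wildcard. If that matters, escape the input and add an `ESCAPE` specifier; a std-only sketch, assuming the clause becomes `n.body LIKE '%' || ? || '%' ESCAPE '\'`:

```rust
// Escape LIKE metacharacters in user input so a literal '%' or '_' in
// --contains matches itself instead of acting as a wildcard. Requires the
// SQL clause to declare ESCAPE '\' (backslash as the escape character).
fn escape_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        if matches!(ch, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}
```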
7. **Promote two baseline indexes from “candidate” to “required”**

Analysis: These are directly hit by new primary paths; waiting for post-merge profiling risks immediate perf cliffs in real usage.

```diff
@@ ## 3h. Query-plan validation
-Candidate indexes (add only if EXPLAIN QUERY PLAN shows they're needed):
-- discussions(project_id, gitlab_discussion_id)
-- notes(discussion_id, created_at DESC, id DESC)
+Required baseline indexes for this feature:
+- discussions(project_id, gitlab_discussion_id)
+- notes(discussion_id, created_at DESC, id DESC)
+Keep other indexes conditional on EXPLAIN QUERY PLAN.
```

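If promoted to required, the two indexes could be created as below; the index names are illustrative and should follow the project's migration conventions (SQLite supports `DESC` index columns):

```sql
-- Names are illustrative; align with the project's migration conventions.
CREATE INDEX IF NOT EXISTS idx_discussions_project_gitlab_disc
  ON discussions(project_id, gitlab_discussion_id);
CREATE INDEX IF NOT EXISTS idx_notes_discussion_created
  ON notes(discussion_id, created_at DESC, id DESC);
```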
8. **Add schema versioning and remove contradictory rejected items**

Analysis: `robot-docs` contract drift is a long-term agent risk; explicit schema versions let clients fail safely. Also, rejected items currently contradict active sections, which creates implementation ambiguity.

```diff
@@ ## 4. Fix Robot-Docs Response Schemas
 "meta": {"elapsed_ms": "int", ...}
+"meta": {"elapsed_ms":"int", ..., "schema_version":"string"}
+
+Schema version policy:
+- bump minor on additive fields
+- bump major on removals/renames
+- expose per-command versions in `robot-docs`
@@ ## Rejected Recommendations
-- Add `gitlab_note_id` to show-command note detail structs ... rejected ...
-- Add `gitlab_discussion_id` to show-command discussion detail structs ... rejected ...
-- Add `gitlab_project_id` to show-command discussion detail structs ... rejected ...
+Remove stale rejected entries that conflict with accepted workstreams in this plan iteration.
```

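A client-side check against the proposed `schema_version` policy takes only a few lines. This sketch assumes a plain `major.minor` string, which the plan does not pin down:

```rust
// Parse a "major.minor" schema version string.
fn parse_version(v: &str) -> Option<(u32, u32)> {
    let (major, minor) = v.split_once('.')?;
    Some((major.parse().ok()?, minor.parse().ok()?))
}

// A client built against `expected` can safely consume a response claiming
// `actual` when the major matches and the actual minor is not older
// (additive-only drift under the bump-minor-on-additions policy).
fn schema_compatible(expected: &str, actual: &str) -> bool {
    match (parse_version(expected), parse_version(actual)) {
        (Some((em, en)), Some((am, an))) => em == am && an >= en,
        _ => false,
    }
}
```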
If you want, I can produce a fully rewritten iteration 5 plan document that applies all of the above edits cleanly end-to-end.
@@ -2,7 +2,7 @@
 plan: true
 title: ""
 status: iterating
-iteration: 4
+iteration: 5
 target_iterations: 8
 beads_revision: 0
 related_plans: []
@@ -52,8 +52,9 @@ output.
 ### Field Filtering Guardrail
 
 In robot mode, `filter_fields` **MUST** force-include Bridge Contract fields even when the
-caller passes a narrower `--fields` list. This prevents agents from accidentally stripping
-the identifiers they need for write operations.
+caller passes a narrower `--fields` list. This applies at **all nesting levels**: both the
+top-level entity fields and nested sub-entities (e.g., notes inside `discussions --include-notes`).
+This prevents agents from accidentally stripping the identifiers they need for write operations.
 
 **Implementation**: Add a `BRIDGE_FIELDS` constant map per entity type. In `filter_fields()`,
 when operating in robot mode, union the caller's requested fields with the bridge set before
@@ -69,70 +70,127 @@ const BRIDGE_FIELDS_DISCUSSIONS: &[&str] = &[
     "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
     "gitlab_discussion_id",
 ];
+
+// Applied to nested notes within discussions --include-notes
+const BRIDGE_FIELDS_DISCUSSION_NOTES: &[&str] = &[
+    "project_path", "gitlab_project_id", "noteable_type", "parent_iid",
+    "gitlab_discussion_id", "gitlab_note_id",
+];
 ```
 
 In `filter_fields`, when entity is `"notes"` or `"discussions"`, merge the bridge set into the
-requested fields before filtering the JSON value. This is a ~5-line change to the existing
-function.
+requested fields before filtering the JSON value. For `"discussions"`, also apply
+`BRIDGE_FIELDS_DISCUSSION_NOTES` to each element of the nested `notes` array. This is a ~10-line
+change to the existing function.
+
+### Snapshot Consistency (Cross-Cutting)
+
+Multi-query commands (`handle_notes`, `handle_discussions`) **MUST** execute all their queries
+within a single deferred read transaction. This guarantees snapshot consistency when a concurrent
+sync/ingest is modifying the database.
+
+**Transaction ownership lives in handlers, not query helpers.** Each handler opens one deferred
+read transaction and passes it to query helpers. Query helpers accept `&Connection` (which
+`Transaction` derefs to via `std::ops::Deref`) so they remain testable with plain connections
+in unit tests. This avoids nested transaction edge cases and guarantees a single snapshot across
+count + page + include-notes + serialization.
+
+```rust
+// In handle_notes / handle_discussions:
+let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
+let result = query_notes(&tx, &filters, &config)?;
+// ... serialize ...
+tx.commit()?; // read-only, but closes cleanly
+```
+
+Query helpers keep their `conn: &Connection` signature — `Transaction<'_>` implements
+`Deref<Target = Connection>`, so `&tx` coerces to `&Connection` at call sites.
+
 ### Ambiguity Guardrail
 
 When filtering by `gitlab_discussion_id` (on either `notes` or `discussions` commands) without
 `--project`, if the query matches discussions in multiple projects:
 - Return an `Ambiguous` error (exit code 18, matching existing convention)
-- Include matching project paths in the error message
+- Include matching project paths **and `gitlab_project_id`s** in a structured candidates list
 - Suggest retry with `--project <path>`
 
-**Implementation**: Run a **preflight distinct-project check** before the main list query
-executes its `LIMIT`. This is critical because a post-query check on the paginated result set
-can silently miss cross-project ambiguity when `LIMIT` truncates results to rows from a single
-project. The preflight query is cheap (hits the `gitlab_discussion_id` index, returns at most
-a few rows) and eliminates non-deterministic write-targeting risk.
+**Implementation**: Run a **scope-aware preflight distinct-project check** before the main list
+query executes its `LIMIT`. The preflight applies active scope filters (noteable_type, since,
+for_issue/for_mr) alongside the discussion ID check, so it won't produce false ambiguity when
+other filters already narrow to one project. This is critical because a post-query check on the
+paginated result set can silently miss cross-project ambiguity when `LIMIT` truncates results to
+rows from a single project. The preflight query is cheap (hits the `gitlab_discussion_id` index,
+returns at most a few rows) and eliminates non-deterministic write-targeting risk.
 
 ```sql
--- Preflight ambiguity check (runs before main query)
-SELECT DISTINCT p.path_with_namespace
+-- Preflight ambiguity check (runs before main query, includes active scope filters)
+SELECT DISTINCT p.path_with_namespace, p.gitlab_project_id
 FROM discussions d
 JOIN projects p ON p.id = d.project_id
 WHERE d.gitlab_discussion_id = ?
+-- scope filters applied dynamically:
+-- AND d.noteable_type = ?                    (when --noteable-type present)
+-- AND d.merge_request_id = (SELECT ...)      (when --for-mr present)
+-- AND d.issue_id = (SELECT ...)              (when --for-issue present)
 LIMIT 3
 ```
 
-If more than one project is found, return `LoreError::Ambiguous` (exit code 18) with the
-distinct project paths and suggestion to retry with `--project <path>`.
+If more than one project is found, return `LoreError::Ambiguous` (exit code 18) with structured
+candidates for machine consumption:
 
 ```rust
 // In query_notes / query_discussions, before executing the main query:
 if let Some(ref disc_id) = filters.gitlab_discussion_id {
     if filters.project.is_none() {
-        let distinct_projects: Vec<String> = conn
+        let candidates: Vec<(String, i64)> = conn
             .prepare(
-                "SELECT DISTINCT p.path_with_namespace \
+                "SELECT DISTINCT p.path_with_namespace, p.gitlab_project_id \
                  FROM discussions d \
                  JOIN projects p ON p.id = d.project_id \
                  WHERE d.gitlab_discussion_id = ? \
                  LIMIT 3"
+                // Note: add scope filter clauses dynamically
             )?
-            .query_map([disc_id], |row| row.get(0))?
+            .query_map([disc_id], |row| Ok((row.get(0)?, row.get(1)?)))?
             .collect::<std::result::Result<Vec<_>, _>>()?;
 
-        if distinct_projects.len() > 1 {
+        if candidates.len() > 1 {
             return Err(LoreError::Ambiguous {
                 message: format!(
-                    "Discussion ID matches {} projects: {}. Use --project to disambiguate.",
-                    distinct_projects.len(),
-                    distinct_projects.join(", ")
+                    "Discussion ID matches {} projects. Use --project to disambiguate.",
+                    candidates.len(),
                 ),
+                candidates: candidates.into_iter()
+                    .map(|(path, id)| AmbiguousCandidate { project_path: path, gitlab_project_id: id })
+                    .collect(),
             });
         }
     }
 }
 ```
 
+In robot mode, the error serializes as:
+
+```json
+{
+  "error": {
+    "code": "AMBIGUOUS",
+    "message": "Discussion ID matches 2 projects. Use --project to disambiguate.",
+    "candidates": [
+      {"project_path": "group/repo-a", "gitlab_project_id": 42},
+      {"project_path": "group/repo-b", "gitlab_project_id": 99}
+    ],
+    "suggestion": "lore -J discussions --gitlab-discussion-id <id> --project <path>",
+    "actions": ["lore -J discussions --gitlab-discussion-id <id> --project group/repo-a"]
+  }
+}
+```
+
+This gives agents machine-actionable candidates: they can pick a project and retry immediately
+without parsing free-text error messages.
+
 #### 1h. Wrap `query_notes` in a read transaction
 
-Wrap the count query and page query in a deferred read transaction per the Snapshot Consistency
-cross-cutting requirement. See the Bridge Contract section for the pattern.
+Per the Snapshot Consistency cross-cutting requirement, `handle_notes` opens a deferred read
+transaction and passes it to `query_notes`. See the Snapshot Consistency section for the pattern.
 
 ### Tests
 
@@ -337,6 +395,7 @@ fn notes_ambiguous_gitlab_discussion_id_across_projects() {
     // (this can happen since IDs are per-project)
     // Filter by gitlab_discussion_id without --project
     // Assert LoreError::Ambiguous is returned with both project paths
+    // Assert candidates include gitlab_project_id for machine consumption
 }
 ```
 
@@ -352,6 +411,19 @@ fn notes_ambiguity_preflight_not_defeated_by_limit() {
 }
 ```
 
+#### Test 8: Ambiguity preflight respects scope filters (no false positives)
+
+```rust
+#[test]
+fn notes_ambiguity_preflight_respects_scope_filters() {
+    let conn = create_test_db();
+    // Insert 2 projects, each with a discussion sharing the same gitlab_discussion_id
+    // But one is Issue-type and the other MergeRequest-type
+    // Filter by gitlab_discussion_id + --noteable-type MergeRequest (narrows to 1 project)
+    // Assert NO ambiguity error — scope filters disambiguate
+}
+```
+
 ---
 
 ## 2. Add `gitlab_discussion_id` to Show Command Discussion Groups
@@ -644,6 +716,9 @@ lore -J discussions --gitlab-discussion-id 6a9c1750b37d
 
 # List unresolved threads with latest 2 notes inline (fewer round-trips)
 lore -J discussions --for-mr 99 --resolution unresolved --include-notes 2
+
+# Find discussions containing specific text
+lore -J discussions --for-mr 99 --contains "prefer the approach"
 ```
 
 ### Response Schema
@@ -801,6 +876,10 @@ pub struct DiscussionsArgs {
     #[arg(long, value_enum, help_heading = "Filters")]
     pub noteable_type: Option<NoteableTypeFilter>,
 
+    /// Filter discussions whose notes contain text (case-insensitive LIKE match)
+    #[arg(long, help_heading = "Filters")]
+    pub contains: Option<String>,
+
     /// Include up to N latest notes per discussion (0 = none, default; clamped to 20)
     #[arg(long, default_value = "0", help_heading = "Output")]
     pub include_notes: usize,
@@ -925,7 +1004,7 @@ The `included_note_count` is set to `notes.len()` and `has_more_notes` is set to
 `note_count > included_note_count` during the JSON conversion, providing per-discussion
 truncation signals.
 
-#### 3c. SQL Query
+#### 3c. SQL Query — Two-Phase Page-First Architecture
 
 **File**: `src/cli/commands/list.rs`
 
@@ -935,21 +1014,29 @@ pub fn query_discussions(
     filters: &DiscussionListFilters,
     config: &Config,
 ) -> Result<DiscussionListResult> {
-    // Wrap all queries in a deferred read transaction for snapshot consistency
-    let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
+    // NOTE: Transaction is managed by the handler (handle_discussions).
+    // This function receives &Connection (which Transaction derefs to via `std::ops::Deref`).
 
     // Preflight ambiguity check (if gitlab_discussion_id without project)
     // ... see Ambiguity Guardrail section ...
 
-    // Main query + count query ...
-    // ... note expansion query (if include_notes > 0) ...
-    tx.commit()?;
+    // Phase 1: Filter + sort + LIMIT to get page IDs
+    // Phase 2: Note rollups only for paged results
+    // Phase 3: Optional --include-notes expansion (separate query)
+}
 ```
 
-Core query uses a CTE + ranked-notes rollup (window function) to avoid per-row correlated
-subqueries. The `ROW_NUMBER()` approach produces a single scan over the notes table, which
-is more predictable than repeated LIMIT 1 sub-selects at scale (200K+ discussions):
+The query uses a **two-phase page-first architecture** for scalability:
+
+1. **Phase 1** (`paged_discussions`): Apply all filters, sort, and LIMIT to produce just the
+   discussion IDs for the current page. This bounds the result set before any note scanning.
+2. **Phase 2** (`ranked_notes` + `note_rollup`): Run note aggregation only for the paged
+   discussion IDs. This ensures note scanning is proportional to `--limit`, not to the total
+   filtered discussion count.
+
+This architecture prevents the performance cliff that occurs on project-wide queries with
+thousands of discussions: instead of scanning notes for all filtered discussions (potentially
+200K+), we scan only for the 50 (or whatever `--limit` is) that will actually be returned.
 
 ```sql
 WITH filtered_discussions AS (
@@ -961,6 +1048,14 @@ WITH filtered_discussions AS (
   JOIN projects p ON d.project_id = p.id
   {where_sql}
 ),
+-- Phase 1: Page-first — apply sort + LIMIT before note scanning
+paged_discussions AS (
+  SELECT id
+  FROM filtered_discussions
+  ORDER BY COALESCE({sort_column}, 0) {order}, id {order}
+  LIMIT ?
+),
+-- Phase 2: Note rollups only for paged results
 ranked_notes AS (
   SELECT
     n.discussion_id,
@@ -980,7 +1075,7 @@ ranked_notes AS (
       n.created_at, n.id
     ) AS rn_first_position
   FROM notes n
-  WHERE n.discussion_id IN (SELECT id FROM filtered_discussions)
+  WHERE n.discussion_id IN (SELECT id FROM paged_discussions)
 ),
 note_rollup AS (
   SELECT
@@ -1012,12 +1107,12 @@ SELECT
   nr.position_new_path,
   nr.position_new_line
 FROM filtered_discussions fd
+JOIN paged_discussions pd ON fd.id = pd.id
 JOIN projects p ON fd.project_id = p.id
 LEFT JOIN issues i ON fd.issue_id = i.id
 LEFT JOIN merge_requests m ON fd.merge_request_id = m.id
 LEFT JOIN note_rollup nr ON nr.discussion_id = fd.id
 ORDER BY COALESCE({sort_column}, 0) {order}, fd.id {order}
-LIMIT ?
 ```
 
 **Dual window function rationale**: The `ranked_notes` CTE uses two separate `ROW_NUMBER()`
@@ -1028,12 +1123,11 @@ displacing the first human author/body, and prevents a non-positioned note from
 the file location. The `MAX(CASE WHEN rn_xxx = 1 ...)` pattern extracts the correct value
 from each independently-ranked sequence.
 
-**Performance rationale**: The CTE pre-filters discussions before joining notes. The
-`ranked_notes` CTE uses `ROW_NUMBER()` (a single pass over the notes index) instead of
-correlated `(SELECT ... LIMIT 1)` sub-selects per discussion. For MR-scoped queries
-(50-200 discussions) the performance is equivalent. For project-wide scans with thousands
-of discussions, the window function approach avoids repeated index probes and produces a
-more predictable query plan.
+**Page-first scalability rationale**: The `paged_discussions` CTE applies LIMIT before note
+scanning. For MR-scoped queries (50-200 discussions) the performance is equivalent to the
+non-paged approach. For project-wide scans with thousands of discussions, the page-first
+architecture avoids scanning notes for discussions that won't appear in the result, keeping
+latency proportional to `--limit` rather than to the total filtered count.
 
 **Note on ordering**: The `COALESCE({sort_column}, 0)` with tiebreaker `fd.id` ensures
 deterministic ordering even when timestamps are NULL (partial sync states). The `id`
@@ -1042,6 +1136,10 @@ tiebreaker is cheap (primary key) and prevents unstable sort output.
 **Note on SQLite FILTER syntax**: SQLite does not support `COUNT(*) FILTER (WHERE ...)`.
 Use `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` instead (as shown above).
 
+**Count query**: The total_count query runs separately against `filtered_discussions` (without
+the LIMIT) using `SELECT COUNT(*) FROM filtered_discussions`. This is needed for `has_more`
+metadata. The count uses the same filter CTEs but omits notes entirely.
+
 #### 3c-ii. Note expansion query (--include-notes)
 
 When `include_notes > 0`, after the main discussion query, run a **single batched query**
@@ -1103,6 +1201,7 @@ pub struct DiscussionListFilters {
     pub since: Option<String>,
     pub path: Option<String>,
     pub noteable_type: Option<NoteableTypeFilter>,
+    pub contains: Option<String>,
     pub sort: DiscussionSortField,
     pub order: SortDirection,
     pub include_notes: usize,
@@ -1117,6 +1216,7 @@ Where-clause construction uses `match` on typed enums — never raw string inter
 - `since` → `d.first_note_at >= ?` (using `parse_since()`)
 - `path` → `EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.position_new_path LIKE ?)`
 - `noteable_type` → match: `Issue` → `d.noteable_type = 'Issue'`, `MergeRequest` → `d.noteable_type = 'MergeRequest'`
+- `contains` → `EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.body LIKE '%' || ? || '%')`
 
 #### 3e. Handler wiring
 
@@ -1128,7 +1228,7 @@ Add match arm:
 Some(Commands::Discussions(args)) => handle_discussions(cli.config.as_deref(), args, robot_mode),
 ```
 
-Handler function:
+Handler function (with transaction ownership):
 
 ```rust
 fn handle_discussions(
@@ -1143,6 +1243,10 @@ fn handle_discussions(

let effective_limit = args.limit.min(500);
let effective_include_notes = args.include_notes.min(20);

+    // Snapshot consistency: one transaction across all queries
+    let tx = conn.transaction_with_behavior(rusqlite::TransactionBehavior::Deferred)?;
+
let filters = DiscussionListFilters {
limit: effective_limit,
project: args.project,
@@ -1153,12 +1257,15 @@ fn handle_discussions(
since: args.since,
path: args.path,
noteable_type: args.noteable_type,
+contains: args.contains,
sort: args.sort,
order: args.order,
include_notes: effective_include_notes,
};

-let result = query_discussions(&conn, &filters, &config)?;
+let result = query_discussions(&tx, &filters, &config)?;
+
+tx.commit()?; // read-only, but closes cleanly

let format = if robot_mode && args.format == "table" {
"json"
@@ -1247,7 +1354,7 @@ CSV view: all fields, following same pattern as `print_list_notes_csv`.
.collect(),
```

-#### 3h. Query-plan validation
+#### 3h. Query-plan validation and indexes

Before merging the discussions command, capture `EXPLAIN QUERY PLAN` output for the three
primary query patterns:
@@ -1255,17 +1362,26 @@ primary query patterns:
- `--project <path> --since 7d --sort last-note`
- `--gitlab-discussion-id <id>`

-If plans show table scans on `notes` or `discussions` for these patterns, add targeted indexes
-to the `MIGRATIONS` array in `src/core/db.rs`:
-
-**Candidate indexes** (add only if EXPLAIN QUERY PLAN shows they're needed):
+**Required baseline index** (directly hit by `--include-notes` expansion, which runs a
+`ROW_NUMBER() OVER (PARTITION BY discussion_id ORDER BY created_at DESC, id DESC)` window
+on the notes table):
+
+```sql
+CREATE INDEX IF NOT EXISTS idx_notes_discussion_created_desc
+ON notes(discussion_id, created_at DESC, id DESC);
+```
+
+This index is non-negotiable because the include-notes expansion query's performance is
+directly proportional to how efficiently it can scan notes per discussion. Without it, SQLite
+falls back to a full table scan of the 282K-row notes table for each batch.
+
+**Conditional indexes** (add only if EXPLAIN QUERY PLAN shows they're needed):
- `discussions(project_id, gitlab_discussion_id)` — for ambiguity preflight + direct ID lookup
- `discussions(merge_request_id, last_note_at, id)` — for MR-scoped + sorted queries
-- `notes(discussion_id, created_at DESC, id DESC)` — for `--include-notes` expansion
- `notes(discussion_id, is_system, created_at, id)` — for ranked_notes CTE ordering

-This is a measured approach: profile first, add indexes only where the query plan demands them.
-No speculative index creation.
+This is a measured approach: one required index for the critical new path, remaining indexes
+added only where the query plan demands them.

### Tests

@@ -1500,7 +1616,7 @@ fn discussions_ambiguous_gitlab_discussion_id_across_projects() {
};
let result = query_discussions(&conn, &filters, &Config::default());
assert!(result.is_err());
-// Error should be Ambiguous with both project paths
+// Error should be Ambiguous with both project paths and gitlab_project_ids
}
```

@@ -1579,6 +1695,99 @@ fn discussions_first_note_rollup_skips_system_notes() {
}
```

+#### Test 15: --contains filter returns matching discussions
+
+```rust
+#[test]
+fn query_discussions_contains_filter() {
+    let conn = create_test_db();
+    insert_project(&conn, 1);
+    insert_mr(&conn, 1, 1, 99, "Test MR");
+    insert_discussion(&conn, 1, "disc-match", 1, None, Some(1), "MergeRequest");
+    insert_discussion(&conn, 2, "disc-nomatch", 1, None, Some(1), "MergeRequest");
+    insert_note_in_discussion(&conn, 1, 500, 1, 1, "alice", "I really do prefer this approach");
+    insert_note_in_discussion(&conn, 2, 501, 2, 1, "bob", "Looks good to me");
+
+    let filters = DiscussionListFilters {
+        contains: Some("really do prefer".to_string()),
+        ..DiscussionListFilters::default_for_mr(99)
+    };
+    let result = query_discussions(&conn, &filters, &Config::default()).unwrap();
+
+    assert_eq!(result.discussions.len(), 1);
+    assert_eq!(result.discussions[0].gitlab_discussion_id, "disc-match");
+}
+```
+
+#### Test 16: Nested note bridge fields survive --fields filtering in robot mode
+
+```rust
+#[test]
+fn discussions_nested_note_bridge_fields_forced_in_robot_mode() {
+    // When discussions --include-notes returns nested notes,
+    // bridge fields on nested notes must survive --fields filtering
+    let mut value = serde_json::json!({
+        "data": {
+            "discussions": [{
+                "gitlab_discussion_id": "abc",
+                "noteable_type": "MergeRequest",
+                "parent_iid": 99,
+                "project_path": "group/repo",
+                "gitlab_project_id": 42,
+                "note_count": 1,
+                "notes": [{
+                    "body": "test note",
+                    "project_path": "group/repo",
+                    "gitlab_project_id": 42,
+                    "noteable_type": "MergeRequest",
+                    "parent_iid": 99,
+                    "gitlab_discussion_id": "abc",
+                    "gitlab_note_id": 500
+                }]
+            }]
+        }
+    });
+
+    // Agent requests only "body" on notes — bridge fields must still appear
+    filter_fields_robot(
+        &mut value,
+        "discussions",
+        &["note_count".to_string()],
+    );
+
+    let note = &value["data"]["discussions"][0]["notes"][0];
+    assert!(note.get("gitlab_discussion_id").is_some());
+    assert!(note.get("gitlab_note_id").is_some());
+    assert!(note.get("gitlab_project_id").is_some());
+}
+```
+
+#### Test 17: Ambiguity preflight respects scope filters (no false positives)
+
+```rust
+#[test]
+fn discussions_ambiguity_preflight_respects_scope_filters() {
+    let conn = create_test_db();
+    insert_project(&conn, 1); // "group/repo-a"
+    insert_project(&conn, 2); // "group/repo-b"
+    // Same gitlab_discussion_id in both projects
+    // But different noteable_types
+    insert_discussion(&conn, 1, "shared-id", 1, Some(1), None, "Issue");
+    insert_discussion(&conn, 2, "shared-id", 2, None, Some(1), "MergeRequest");
+
+    // Filter by noteable_type narrows to one project — should NOT fire ambiguity
+    let filters = DiscussionListFilters {
+        gitlab_discussion_id: Some("shared-id".to_string()),
+        noteable_type: Some(NoteableTypeFilter::MergeRequest),
+        project: None,
+        ..DiscussionListFilters::default()
+    };
+    let result = query_discussions(&conn, &filters, &Config::default());
+    assert!(result.is_ok());
+    assert_eq!(result.unwrap().discussions.len(), 1);
+}
+```

---

## 4. Fix Robot-Docs Response Schemas
@@ -1629,6 +1838,7 @@ With:
"--since <period>",
"--path <filepath>",
"--noteable-type <Issue|MergeRequest>",
+"--contains <text>",
"--include-notes <N>",
"--sort <first-note|last-note>",
"--order <asc|desc>",
@@ -1831,14 +2041,13 @@ Changes 1 and 2 can be done in parallel. Change 4 must come last since it docume
final schema of all preceding changes.

**Cross-cutting**: The Bridge Contract field guardrail (force-including bridge fields in robot
-mode) should be implemented as part of Change 1, since it modifies `filter_fields` in
-`robot.rs` which all subsequent changes depend on. The `BRIDGE_FIELDS_*` constants are defined
-once and reused by Changes 3 and 4.
+mode, including nested notes) should be implemented as part of Change 1, since it modifies
+`filter_fields` in `robot.rs` which all subsequent changes depend on. The `BRIDGE_FIELDS_*`
+constants are defined once and reused by Changes 3 and 4.

-**Cross-cutting**: The snapshot consistency pattern (deferred read transaction) should be
-implemented in Change 1 for `query_notes` and carried forward to Change 3 for
-`query_discussions`. This is a one-line wrapper that provides correctness guarantees with
-zero performance cost.
+**Cross-cutting**: The snapshot consistency pattern (deferred read transaction in handlers)
+should be implemented in Change 1 for `handle_notes` and carried forward to Change 3 for
+`handle_discussions`. Transaction ownership lives in handlers; query helpers accept `&Connection`.

---

@@ -1850,40 +2059,52 @@ After all changes:
`gitlab_discussion_id`, `gitlab_note_id`, and `gitlab_project_id` in the response
2. An agent can run `lore -J discussions --for-mr 3929 --resolution unresolved` to see all
open threads with their IDs
-3. An agent can run `lore -J mrs 3929` and see `gitlab_discussion_id`, `resolvable`,
+3. An agent can run `lore -J discussions --for-mr 3929 --contains "prefer the approach"` to
+find threads by text content without a separate `notes` round-trip
+4. An agent can run `lore -J mrs 3929` and see `gitlab_discussion_id`, `resolvable`,
`resolved`, and `last_note_at_iso` on each discussion group, plus `gitlab_note_id` on
each note within
-4. `lore robot-docs` lists actual field names for all commands
-5. All existing tests still pass
-6. No clippy warnings (pedantic + nursery)
-7. Robot-docs contract tests pass with field-set parity (not just string-contains), preventing
+5. `lore robot-docs` lists actual field names for all commands
+6. All existing tests still pass
+7. No clippy warnings (pedantic + nursery)
+8. Robot-docs contract tests pass with field-set parity (not just string-contains), preventing
future schema drift in both directions
-8. Bridge Contract fields (`project_path`, `gitlab_project_id`, `noteable_type`, `parent_iid`,
+9. Bridge Contract fields (`project_path`, `gitlab_project_id`, `noteable_type`, `parent_iid`,
`gitlab_discussion_id`, `gitlab_note_id`) are present in every applicable read payload
-9. Bridge Contract fields survive `--fields` filtering in robot mode (guardrail enforced)
-10. `--gitlab-discussion-id` filter works on both `notes` and `discussions` commands
-11. `--include-notes N` populates inline notes on `discussions` output via single batched query
-12. CLI-level contract integration tests verify bridge fields through the full handler path
-13. `gitlab_note_id` is available in notes list output (alongside `gitlab_id` for back-compat)
+10. Bridge Contract fields survive `--fields` filtering in robot mode (guardrail enforced),
+including nested notes within `discussions --include-notes`
+11. `--gitlab-discussion-id` filter works on both `notes` and `discussions` commands
+12. `--include-notes N` populates inline notes on `discussions` output via single batched query
+13. CLI-level contract integration tests verify bridge fields through the full handler path
+14. `gitlab_note_id` is available in notes list output (alongside `gitlab_id` for back-compat)
and in show detail notes, providing a uniform field name across all commands
-14. Ambiguity guardrail fires when `--gitlab-discussion-id` matches multiple projects without
+15. Ambiguity guardrail fires when `--gitlab-discussion-id` matches multiple projects without
`--project` specified — **including when LIMIT would have hidden the ambiguity** (preflight
-query runs before LIMIT)
-15. Output guardrails clamp `--limit` to 500 and `--include-notes` to 20; `meta` reports
+query runs before LIMIT). Error includes structured candidates with `gitlab_project_id`
+for machine consumption
+16. Ambiguity preflight is scope-aware: active filters (noteable_type, for_issue/for_mr) are
+applied alongside the discussion ID check, preventing false ambiguity when scope already
+narrows to one project
+17. Output guardrails clamp `--limit` to 500 and `--include-notes` to 20; `meta` reports
effective values and `has_more` truncation flag
-16. Discussion and show queries use deterministic ordering (COALESCE + id tiebreaker) to
+18. Discussion and show queries use deterministic ordering (COALESCE + id tiebreaker) to
prevent unstable output during partial sync states
-17. Per-discussion truncation signals (`included_note_count`, `has_more_notes`) are accurate
+19. Per-discussion truncation signals (`included_note_count`, `has_more_notes`) are accurate
for `--include-notes` output
-18. Multi-query commands (`query_notes`, `query_discussions`) use deferred read transactions
-for snapshot consistency during concurrent ingest
+20. Multi-query handlers (`handle_notes`, `handle_discussions`) open deferred read transactions;
+query helpers accept `&Connection` for snapshot consistency and testability
-19. Discussion filters (`resolution`, `noteable_type`, `sort`, `order`) use typed enums
+21. Discussion filters (`resolution`, `noteable_type`, `sort`, `order`) use typed enums
with match-to-SQL mapping — no raw string interpolation in query construction
-20. First-note rollup correctly handles discussions with leading system notes — `first_author`
+22. First-note rollup correctly handles discussions with leading system notes — `first_author`
and `first_note_body_snippet` always reflect the first non-system note
-21. Query plans for primary discussion query patterns (`--for-mr`, `--project --since`,
+23. Query plans for primary discussion query patterns (`--for-mr`, `--project --since`,
`--gitlab-discussion-id`) have been validated via EXPLAIN QUERY PLAN; targeted indexes
added only where scans were observed
+24. The `notes(discussion_id, created_at DESC, id DESC)` index is present for `--include-notes`
+expansion performance
+25. Discussion query uses page-first CTE architecture: note rollups scan only the paged result
+set, not all filtered discussions, keeping latency proportional to `--limit`
+26. `--contains` filter on `discussions` returns only discussions with matching note text

---

@@ -1902,6 +2123,6 @@ After all changes:
- **`--with-write-hints` flag for inline glab endpoint templates** — rejected because this couples lore's read surface to glab's API surface, violating the read/write split principle. The Bridge Contract gives agents the raw identifiers; constructing glab commands is the agent's responsibility. Adding endpoint templates would require lore to track glab API changes, creating an unnecessary maintenance burden.
- **Show-command note ordering change (`ORDER BY COALESCE(position, ...), created_at, id`)** — rejected because show-command note ordering within a discussion thread is out of scope for this plan. The existing ordering works correctly for present data; the defensive COALESCE pattern is applied to discussion-level ordering where it matters for agent workflows.
- **Query-plan validation as a separate numbered workstream** — rejected because it adds delivery overhead without proportional benefit. Query-plan validation is integrated into workstream 3 as a pre-merge validation step (section 3h), with candidate indexes listed but only added when EXPLAIN QUERY PLAN shows they're needed. This keeps the measured approach without inflating the workstream count.
-- **Add `gitlab_note_id` to show-command note detail structs** — rejected because show-command note detail structs already have `gitlab_id` (same value as `id`). The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_note_id` would create a duplicate and increase payload size without benefit.
-- **Add `gitlab_discussion_id` to show-command discussion detail structs** — rejected because show-command discussion detail structs already have `gitlab_discussion_id`. The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_discussion_id` would create a duplicate and increase payload size without benefit.
-- **Add `gitlab_project_id` to show-command discussion detail structs** — rejected because show-command discussion detail structs already have `gitlab_project_id`. The field is unambiguous and consistent with the Bridge Contract. Adding `gitlab_project_id` would create a duplicate and increase payload size without benefit.
+- **`--project-id` immutable input filter across notes/discussions/show** — rejected because this is scope creep touching every command and changing CLI ergonomics. Agents already get `gitlab_project_id` in output to construct API calls; the input-side concern (project renames breaking `--project`) is theoretical and hasn't been observed in practice. The `--project` flag already supports fuzzy matching which handles most rename scenarios. If real-world evidence surfaces, this can be added later without breaking changes.
+- **Schema versioning in robot-docs (`schema_version` field + semver policy)** — rejected because this tool has zero external consumers beyond our own agents, and the contract tests (field-set parity assertions) catch drift at compile time. Schema versioning adds bureaucratic overhead (version bumps, compatibility matrices, deprecation policies) without proportional benefit for an internal tool in early development. If lore gains external consumers, this can be reconsidered.
+- **Remove "stale" rejected items that "conflict" with active sections** — rejected because the prior entries about show-command structs were stale from iteration 2 and have been cleaned up independently. The rejected section is cumulative by design — it prevents future reviewers from re-proposing changes that have already been evaluated.
|||||||
@@ -1,6 +1,7 @@
|
|||||||
use rusqlite::Connection;
|
use rusqlite::Connection;
|
||||||
|
|
||||||
use super::error::{LoreError, Result};
|
use super::error::{LoreError, Result};
|
||||||
|
use super::file_history::resolve_rename_chain;
|
||||||
|
|
||||||
// ─── SQL Helpers ─────────────────────────────────────────────────────────────
|
// ─── SQL Helpers ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
@@ -149,6 +150,16 @@ pub fn build_path_query(
is_prefix: false,
}),
SuffixResult::Ambiguous(candidates) => {
+    // Check if all candidates are the same file connected by renames.
+    // resolve_rename_chain requires a concrete project_id.
+    if let Some(pid) = project_id
+        && let Some(resolved) = try_resolve_rename_ambiguity(conn, pid, &candidates)?
+    {
+        return Ok(PathQuery {
+            value: resolved,
+            is_prefix: false,
+        });
+    }
let list = candidates
.iter()
.map(|p| format!("  {p}"))
@@ -239,6 +250,59 @@ pub fn suffix_probe(
}
}

+/// Maximum rename hops when resolving ambiguity.
+const AMBIGUITY_MAX_RENAME_HOPS: usize = 10;
+
+/// When suffix probe returns multiple paths, check if they are all the same file
+/// connected by renames. If so, return the "newest" path (the leaf of the chain
+/// that is never renamed away from). Returns `None` if truly ambiguous.
+fn try_resolve_rename_ambiguity(
+    conn: &Connection,
+    project_id: i64,
+    candidates: &[String],
+) -> Result<Option<String>> {
+    // BFS from the first candidate to discover the full rename chain.
+    let chain = resolve_rename_chain(conn, project_id, &candidates[0], AMBIGUITY_MAX_RENAME_HOPS)?;
+
+    // If any candidate is NOT in the chain, these are genuinely different files.
+    if !candidates.iter().all(|c| chain.contains(c)) {
+        return Ok(None);
+    }
+
+    // All candidates are the same file. Find the "newest" path: the one that
+    // appears as new_path in a rename but is never old_path in a subsequent rename
+    // (within the chain). This is the leaf of the rename DAG.
+    let placeholders: Vec<String> = (0..chain.len()).map(|i| format!("?{}", i + 2)).collect();
+    let in_clause = placeholders.join(", ");
+
+    // Find paths that are old_path in a rename where new_path is also in the chain.
+    let sql = format!(
+        "SELECT DISTINCT old_path FROM mr_file_changes \
+         WHERE project_id = ?1 \
+         AND change_type = 'renamed' \
+         AND old_path IN ({in_clause}) \
+         AND new_path IN ({in_clause})"
+    );
+
+    let mut stmt = conn.prepare(&sql)?;
+    let mut params: Vec<Box<dyn rusqlite::types::ToSql>> = Vec::new();
+    params.push(Box::new(project_id));
+    for p in &chain {
+        params.push(Box::new(p.clone()));
+    }
+    let param_refs: Vec<&dyn rusqlite::types::ToSql> = params.iter().map(|p| p.as_ref()).collect();
+
+    let old_paths: Vec<String> = stmt
+        .query_map(param_refs.as_slice(), |row| row.get(0))?
+        .filter_map(std::result::Result::ok)
+        .collect();
+
+    // The newest path is a candidate that is NOT an old_path in any intra-chain rename.
+    let newest = candidates.iter().find(|c| !old_paths.contains(c));
+
+    Ok(newest.cloned().or_else(|| Some(candidates[0].clone())))
+}
+
#[cfg(test)]
#[path = "path_resolver_tests.rs"]
mod tests;
@@ -288,3 +288,80 @@ fn test_exact_match_preferred_over_suffix() {
assert_eq!(pq.value, "README.md");
assert!(!pq.is_prefix);
}

+fn seed_rename(conn: &Connection, mr_id: i64, project_id: i64, old_path: &str, new_path: &str) {
+    conn.execute(
+        "INSERT INTO mr_file_changes (merge_request_id, project_id, old_path, new_path, change_type)
+         VALUES (?1, ?2, ?3, ?4, 'renamed')",
+        rusqlite::params![mr_id, project_id, old_path, new_path],
+    )
+    .unwrap();
+}
+
+// ─── rename-aware ambiguity resolution ──────────────────────────────────────
+
+#[test]
+fn test_ambiguity_resolved_by_rename_chain() {
+    let conn = setup_test_db();
+    seed_project(&conn, 1);
+    seed_mr(&conn, 1, 1);
+    seed_mr(&conn, 2, 1);
+
+    // File was at src/old/operators.ts, then renamed to src/new/operators.ts
+    seed_file_change(&conn, 1, 1, "src/old/operators.ts");
+    seed_rename(&conn, 2, 1, "src/old/operators.ts", "src/new/operators.ts");
+
+    // Bare "operators.ts" matches both paths via suffix probe, but they're
+    // connected by a rename — should auto-resolve to the newest path.
+    let pq = build_path_query(&conn, "operators.ts", Some(1)).unwrap();
+    assert_eq!(pq.value, "src/new/operators.ts");
+    assert!(!pq.is_prefix);
+}
+
+#[test]
+fn test_ambiguity_not_resolved_when_genuinely_different_files() {
+    let conn = setup_test_db();
+    seed_project(&conn, 1);
+    seed_mr(&conn, 1, 1);
+
+    // Two genuinely different files with the same name (no rename connecting them)
+    seed_file_change(&conn, 1, 1, "src/utils/helpers.ts");
+    seed_file_change(&conn, 1, 1, "tests/utils/helpers.ts");
+
+    let err = build_path_query(&conn, "helpers.ts", Some(1)).unwrap_err();
+    assert!(err.to_string().contains("matches multiple paths"));
+}
+
+#[test]
+fn test_ambiguity_rename_chain_with_three_hops() {
+    let conn = setup_test_db();
+    seed_project(&conn, 1);
+    seed_mr(&conn, 1, 1);
+    seed_mr(&conn, 2, 1);
+    seed_mr(&conn, 3, 1);
+
+    // File named "config.ts" moved twice: lib/ -> src/ -> src/core/
+    seed_file_change(&conn, 1, 1, "lib/config.ts");
+    seed_rename(&conn, 2, 1, "lib/config.ts", "src/config.ts");
+    seed_rename(&conn, 3, 1, "src/config.ts", "src/core/config.ts");
+
+    // "config.ts" matches lib/config.ts, src/config.ts, src/core/config.ts via suffix
+    let pq = build_path_query(&conn, "config.ts", Some(1)).unwrap();
+    assert_eq!(pq.value, "src/core/config.ts");
+    assert!(!pq.is_prefix);
+}
+
+#[test]
+fn test_ambiguity_rename_without_project_id_stays_ambiguous() {
+    let conn = setup_test_db();
+    seed_project(&conn, 1);
+    seed_mr(&conn, 1, 1);
+    seed_mr(&conn, 2, 1);
+
+    seed_file_change(&conn, 1, 1, "src/old/utils.ts");
+    seed_rename(&conn, 2, 1, "src/old/utils.ts", "src/new/utils.ts");
+
+    // Without project_id, rename resolution is skipped → stays ambiguous
+    let err = build_path_query(&conn, "utils.ts", None).unwrap_err();
+    assert!(err.to_string().contains("matches multiple paths"));
+}
src/main.rs
@@ -39,6 +39,7 @@ use lore::core::dependent_queue::release_all_locked_jobs;
use lore::core::error::{LoreError, RobotErrorOutput};
use lore::core::logging;
use lore::core::metrics::MetricsLayer;
+use lore::core::path_resolver::{build_path_query, normalize_repo_path};
use lore::core::paths::{get_config_path, get_db_path, get_log_dir};
use lore::core::project::resolve_project;
use lore::core::shutdown::ShutdownSignal;
@@ -1874,9 +1875,27 @@ fn handle_file_history(
.effective_project(args.project.as_deref())
.map(String::from);

+    let normalized = normalize_repo_path(&args.path);
+
+    // Resolve bare filenames before querying (same path resolution as trace/who)
+    let db_path_tmp = get_db_path(config.storage.db_path.as_deref());
+    let conn_tmp = create_connection(&db_path_tmp)?;
+    let project_id_tmp = project
+        .as_deref()
+        .map(|p| resolve_project(&conn_tmp, p))
+        .transpose()?;
+    let pq = build_path_query(&conn_tmp, &normalized, project_id_tmp)?;
+    let resolved_path = if pq.is_prefix {
+        // Directory prefix — file-history is file-oriented, pass the raw path.
+        // Don't use pq.value which contains LIKE-escaped metacharacters.
+        normalized.trim_end_matches('/').to_string()
+    } else {
+        pq.value
+    };
+
let result = run_file_history(
&config,
-&args.path,
+&resolved_path,
project.as_deref(),
args.no_follow_renames,
args.merged,
@@ -1901,7 +1920,8 @@ fn handle_trace(
     let start = std::time::Instant::now();
     let config = Config::load(config_override)?;
 
-    let (path, line_requested) = parse_trace_path(&args.path);
+    let (raw_path, line_requested) = parse_trace_path(&args.path);
+    let normalized = normalize_repo_path(&raw_path);
 
     if line_requested.is_some() && !robot_mode {
         eprintln!(
@@ -1920,6 +1940,16 @@ fn handle_trace(
         .map(|p| resolve_project(&conn, p))
         .transpose()?;
 
+    // Resolve bare filenames (e.g. "operators.ts" -> "src/utils/operators.ts")
+    let pq = build_path_query(&conn, &normalized, project_id)?;
+    let path = if pq.is_prefix {
+        // Directory prefix — trace is file-oriented, pass the raw path.
+        // Don't use pq.value which contains LIKE-escaped metacharacters.
+        normalized.trim_end_matches('/').to_string()
+    } else {
+        pq.value
+    };
+
     let result = run_trace(
         &conn,
         project_id,