Adds file-history command showing which MRs touched a file, with:
- Rename chain resolution via BFS (resolve_rename_chain from bd-1yx)
- DiffNote discussion snippets with --discussions flag
- --merged filter, --no-follow-renames, -n limit
- Human output with styled MR list and rename chain display
- Robot JSON output with {ok, data, meta} envelope
- Autocorrect registry and robot-docs manifest entry
- Fixes pre-existing --no-status missing from sync autocorrect registry
160 lines
6.9 KiB
Markdown
160 lines
6.9 KiB
Markdown
I reviewed the plan end-to-end and focused only on new improvements (none of the items in `## Rejected Recommendations` are re-proposed).
|
||
|
||
1. Add direct `--discussion-id` retrieval paths
|
||
Rationale: This removes a full discovery hop for the exact workflow that failed (replying to a known thread). It also reduces ambiguity and query cost when an agent already has the thread ID.
|
||
|
||
```diff
|
||
@@ Core Changes
|
||
| 7 | Fix robot-docs to list actual field names | Docs | Small |
|
||
+| 8 | Add direct `--discussion-id` filter to notes/discussions/show | Core | Small |
|
||
|
||
@@ Change 3: Add Standalone `discussions` List Command
|
||
lore -J discussions --for-mr 99 --cursor <token> # keyset pagination
|
||
+lore -J discussions --discussion-id 6a9c1750b37d... # direct lookup
|
||
|
||
@@ 3a. CLI Args
|
||
+ #[arg(long, conflicts_with_all = ["for_issue", "for_mr"], help_heading = "Filters")]
|
||
+ pub discussion_id: Option<String>,
|
||
|
||
@@ Change 1: Add `gitlab_discussion_id` to Notes Output
|
||
+Add `--discussion-id <hex>` filter to `notes` for direct note retrieval within one thread.
|
||
```
|
||
|
||
2. Add a shared filter compiler to eliminate count/query drift
|
||
Rationale: The plan currently repeats filters across data query, `total_count`, and `incomplete_rows` count queries. That is a classic reliability bug source. A single compiled filter object makes count semantics provably consistent.
|
||
|
||
```diff
|
||
@@ Count Semantics (Cross-Cutting Convention)
|
||
+## Filter Compiler (NEW, Cross-Cutting Convention)
|
||
+All list commands must build predicates via a shared `CompiledFilters` object that emits:
|
||
+- SQL predicate fragment
|
||
+- bind parameters
|
||
+- canonical filter string (for cursor hash)
|
||
+The same compiled object is reused by:
|
||
+- page data query
|
||
+- `total_count` query
|
||
+- `incomplete_rows` query
|
||
```
|
||
|
||
3. Harden keyset pagination semantics for `DESC`, limits, and client ergonomics
|
||
Rationale: `(sort_value, id) > (?, ?)` is only correct for ascending order. Descending sort needs `<`. Also add explicit `has_more` so clients don’t infer from cursor nullability.
|
||
|
||
```diff
|
||
@@ Keyset Pagination (Cross-Cutting, Change B)
|
||
-```sql
|
||
-WHERE (sort_value, id) > (?, ?)
|
||
-```
|
||
+Use comparator by order:
|
||
+- ASC: `(sort_value, id) > (?, ?)`
|
||
+- DESC: `(sort_value, id) < (?, ?)`
|
||
|
||
@@ 3a. CLI Args
|
||
+ #[arg(short = 'n', long = "limit", default_value = "50", value_parser = clap::value_parser!(usize).range(1..=500), help_heading = "Output")]
|
||
+ pub limit: usize,
|
||
|
||
@@ Response Schema
|
||
- "next_cursor": "aW...xyz=="
|
||
+ "next_cursor": "aW...xyz==",
|
||
+ "has_more": true
|
||
```
|
||
|
||
4. Add DB-level entity integrity invariants (not just response invariants)
|
||
Rationale: Response-side filtering is good, but DB correctness should also be guarded. This prevents silent corruption and bad joins from ingestion or future migrations.
|
||
|
||
```diff
|
||
@@ Contract Invariants (NEW)
|
||
+### Entity Integrity Invariants (DB + Ingest)
|
||
+1. `discussions` must belong to exactly one parent (`issue_id XOR merge_request_id`).
|
||
+2. `discussions.noteable_type` must match the populated parent column.
|
||
+3. Natural-key uniqueness is enforced where valid:
|
||
+ - `(project_id, gitlab_discussion_id)` unique for discussions.
|
||
+4. Ingestion must reject/quarantine rows violating invariants and report counts.
|
||
|
||
@@ Supporting Indexes (Cross-Cutting, Change D)
|
||
+CREATE UNIQUE INDEX IF NOT EXISTS idx_discussions_project_gitlab_discussion_id
|
||
+ ON discussions(project_id, gitlab_discussion_id);
|
||
```
|
||
|
||
5. Switch bulk note loading to streaming grouping (avoid large intermediate vecs)
|
||
Rationale: Current bulk strategy still materializes all notes before grouping. Streaming into the map cuts peak memory and improves large-MR stability.
|
||
|
||
```diff
|
||
@@ Change 2e. Constructor — use bulk notes map
|
||
-let all_note_rows: Vec<MrNoteDetail> = ... // From bulk query above
|
||
-let notes_by_discussion: HashMap<i64, Vec<MrNoteDetail>> =
|
||
- all_note_rows.into_iter().fold(HashMap::new(), |mut map, note| {
|
||
- map.entry(note.discussion_id).or_insert_with(Vec::new).push(note);
|
||
- map
|
||
- });
|
||
+let mut notes_by_discussion: HashMap<i64, Vec<MrNoteDetail>> = HashMap::new();
|
||
+for row in bulk_note_stmt.query_map(params, map_note_row)? {
|
||
+ let note = row?;
|
||
+ notes_by_discussion.entry(note.discussion_id).or_default().push(note);
|
||
+}
|
||
```
|
||
|
||
6. Make freshness tri-state (`fresh|stale|unknown`) and fail closed on unknown with `--require-fresh`
|
||
Rationale: `stale: bool` alone cannot represent “never synced / unknown project freshness.” For write safety, unknown freshness should be explicit and reject under freshness constraints.
|
||
|
||
```diff
|
||
@@ Freshness Metadata & Staleness Guards
|
||
pub struct ResponseMeta {
|
||
pub elapsed_ms: i64,
|
||
pub data_as_of_iso: String,
|
||
pub sync_lag_seconds: i64,
|
||
pub stale: bool,
|
||
+ pub freshness_state: String, // "fresh" | "stale" | "unknown"
|
||
+ #[serde(skip_serializing_if = "Option::is_none")]
|
||
+ pub freshness_reason: Option<String>,
|
||
pub incomplete_rows: i64,
|
||
@@
|
||
-if sync_lag_seconds > max_age_secs {
|
||
+if freshness_state == "unknown" || sync_lag_seconds > max_age_secs {
|
||
```
|
||
|
||
7. Tune indexes to match actual ORDER BY paths in window queries
|
||
Rationale: `idx_notes_discussion_position` is likely insufficient for the two window orderings. A covering-style index aligned with partition/order keys reduces random table lookups.
|
||
|
||
```diff
|
||
@@ Supporting Indexes (Cross-Cutting, Change D)
|
||
--- Notes: window function ORDER BY (discussion_id, position) for ROW_NUMBER()
|
||
-CREATE INDEX IF NOT EXISTS idx_notes_discussion_position
|
||
- ON notes(discussion_id, position);
|
||
+-- Notes: support dual ROW_NUMBER() orderings and reduce table lookups
|
||
+CREATE INDEX IF NOT EXISTS idx_notes_discussion_window
|
||
+ ON notes(discussion_id, is_system, position, created_at, gitlab_id);
|
||
```
|
||
|
||
8. Add a phased rollout gate before strict exclusion becomes default
|
||
Rationale: Enforcing `gitlab_* IS NOT NULL` immediately can hide data if existing rows are incomplete. A short observation gate prevents sudden regressions while preserving the end-state contract.
|
||
|
||
```diff
|
||
@@ Delivery Order
|
||
+Batch 0: Observability gate (NEW)
|
||
+- Ship `incomplete_rows` and freshness meta first
|
||
+- Measure incomplete rate across real datasets
|
||
+- If incomplete ratio <= threshold, enable strict exclusion defaults
|
||
+- If above threshold, block rollout and fix ingestion quality first
|
||
+
|
||
Change 1 (notes output) ──┐
|
||
```
|
||
|
||
9. Add property-based invariants for pagination/count correctness
|
||
Rationale: Your current tests are scenario-based and good, but randomized property tests are much better at catching edge-case cursor/count bugs.
|
||
|
||
```diff
|
||
@@ Tests (Change 3 / Change B)
|
||
+**Test 12**: Property-based pagination invariants (`proptest`)
|
||
+```rust
|
||
+#[test]
|
||
+fn prop_discussion_cursor_no_overlap_no_gap_under_random_data() { /* ... */ }
|
||
+```
|
||
+
|
||
+**Test 13**: Property-based count invariants
|
||
+```rust
|
||
+#[test]
|
||
+fn prop_total_count_and_incomplete_rows_match_filter_partition() { /* ... */ }
|
||
+```
|
||
```
|
||
|
||
If you want, I can now produce a fully consolidated “Plan v4” that applies these diffs cleanly into your original document so it reads as a single coherent spec. |