Below are the highest-leverage revisions I’d make. They’re tightly scoped (no new tables/APIs), but fix a few real correctness issues and make the outputs more actionable. 1) Fix a correctness bug in PathQuery: don’t escape for =, and make --path Makefile actually work Why Bug: build_path_query() currently runs escape_like() even when is_prefix = false (exact match). That will break exact matches for paths containing _, %, or \ because = does not treat those as metacharacters (so the escaped string won’t equal the stored path). UX mismatch: The plan says --path handles dotless root files (Makefile/LICENSE), but the current logic still treats them as directory prefixes (Makefile/%) → zero results. Change Only escape for LIKE. Treat root paths (no /) passed via --path as exact matches by default (unless they end with /). diff Copy code diff --git a/plan.md b/plan.md @@ -/// Build a path query from a user-supplied path. -/// -/// Rules: -/// - If the path ends with `/`, it's a directory prefix -> `escaped_path%` (LIKE) -/// - If the last path segment contains `.`, it's a file -> exact match (=) -/// - Otherwise, it's a directory prefix -> `escaped_path/%` (LIKE) +/// Build a path query from a user-supplied path. +/// +/// Rules: +/// - If the path ends with `/`, it's a directory prefix -> `escaped_path/%` (LIKE) +/// - If the path is a root path (no `/`) and does NOT end with `/`, treat as exact (=) +/// (this makes `--path Makefile` and `--path LICENSE` work as intended) +/// - Else if the last path segment contains `.`, treat as exact (=) +/// - Otherwise, treat as directory prefix -> `escaped_path/%` (LIKE) @@ -fn build_path_query(path: &str) -> PathQuery { +fn build_path_query(path: &str) -> PathQuery { let trimmed = path.trim_end_matches('/'); let last_segment = trimmed.rsplit('/').next().unwrap_or(trimmed); - let is_file = !path.ends_with('/') && last_segment.contains('.'); - let escaped = escape_like(trimmed); + let is_root = !trimmed.contains('/'); + let is_file = !path.ends_with('/') && (is_root || last_segment.contains('.')); if is_file { PathQuery { - value: escaped, + // IMPORTANT: do NOT escape for exact match (=) + value: trimmed.to_string(), is_prefix: false, } } else { + let escaped = escape_like(trimmed); PathQuery { value: format!("{escaped}/%"), is_prefix: true, } } } @@ -/// **Known limitation:** Dotless root files (LICENSE, Makefile, Dockerfile) -/// without a trailing `/` will be treated as directory prefixes. Use `--path` -/// for these — the `--path` flag passes through to Expert mode directly, -/// and the `build_path_query` output for "LICENSE" is a prefix `LICENSE/%` -/// which will simply return zero results (a safe, obvious failure mode that the -/// help text addresses). +/// Note: Root file paths passed via `--path` (including dotless files like Makefile/LICENSE) +/// are treated as exact matches unless they end with `/`. Also update the --path help text to be explicit: diff Copy code diff --git a/plan.md b/plan.md @@ - /// Force expert mode for a file/directory path (handles root files like - /// README.md, LICENSE, Makefile that lack a / and can't be auto-detected) + /// Force expert mode for a file/directory path. + /// Root files (README.md, LICENSE, Makefile) are treated as exact matches. + /// Use a trailing `/` to force directory-prefix matching. 2) Fix Active mode: your note_count is currently counting participants, and the CTE scans too broadly Why In note_agg, you do SELECT DISTINCT discussion_id, author_username and then COUNT(*) AS note_count. That’s participant count, not note count. The current note_agg also builds the DISTINCT set from all notes then joins to picked. It’s avoidable work. Change Split into two aggregations scoped to picked: note_counts: counts non-system notes per picked discussion. participants: distinct usernames per picked discussion, then GROUP_CONCAT. diff Copy code diff --git a/plan.md b/plan.md @@ - note_agg AS ( - SELECT - n.discussion_id, - COUNT(*) AS note_count, - GROUP_CONCAT(n.author_username, X'1F') AS participants - FROM ( - SELECT DISTINCT discussion_id, author_username - FROM notes - WHERE is_system = 0 AND author_username IS NOT NULL - ) n - JOIN picked p ON p.id = n.discussion_id - GROUP BY n.discussion_id - ) + note_counts AS ( + SELECT + n.discussion_id, + COUNT(*) AS note_count + FROM notes n + JOIN picked p ON p.id = n.discussion_id + WHERE n.is_system = 0 + GROUP BY n.discussion_id + ), + participants AS ( + SELECT + x.discussion_id, + GROUP_CONCAT(x.author_username, X'1F') AS participants + FROM ( + SELECT DISTINCT n.discussion_id, n.author_username + FROM notes n + JOIN picked p ON p.id = n.discussion_id + WHERE n.is_system = 0 AND n.author_username IS NOT NULL + ) x + GROUP BY x.discussion_id + ) @@ - LEFT JOIN note_agg na ON na.discussion_id = p.id + LEFT JOIN note_counts nc ON nc.discussion_id = p.id + LEFT JOIN participants pa ON pa.discussion_id = p.id @@ - COALESCE(na.note_count, 0) AS note_count, - COALESCE(na.participants, '') AS participants + COALESCE(nc.note_count, 0) AS note_count, + COALESCE(pa.participants, '') AS participants Net effect: correctness fix + more predictable perf. Add a test that would have failed before: diff Copy code diff --git a/plan.md b/plan.md @@ #[test] fn test_active_query() { @@ - insert_diffnote(&conn, 1, 1, 1, "reviewer_b", "src/foo.rs", "needs work"); + insert_diffnote(&conn, 1, 1, 1, "reviewer_b", "src/foo.rs", "needs work"); + insert_diffnote(&conn, 2, 1, 1, "reviewer_b", "src/foo.rs", "follow-up"); @@ - assert_eq!(result.discussions[0].participants, vec!["reviewer_b"]); + assert_eq!(result.discussions[0].participants, vec!["reviewer_b"]); + assert_eq!(result.discussions[0].note_count, 2); 3) Index fix: idx_discussions_unresolved_recent won’t help global --active ordering Why Your index is (project_id, last_note_at) with WHERE resolvable=1 AND resolved=0. When --active is not project-scoped (common default), SQLite can’t use (project_id, last_note_at) to satisfy ORDER BY last_note_at DESC efficiently because project_id isn’t constrained. This can turn into a scan+sort over potentially large unresolved sets. Change Keep the project-scoped index, but add a global ordering index (partial, still small): diff Copy code diff --git a/plan.md b/plan.md @@ CREATE INDEX IF NOT EXISTS idx_discussions_unresolved_recent ON discussions(project_id, last_note_at) WHERE resolvable = 1 AND resolved = 0; + +-- Active (global): unresolved discussions by recency (no project scope). +-- Supports ORDER BY last_note_at DESC LIMIT N when project_id is unconstrained. +CREATE INDEX IF NOT EXISTS idx_discussions_unresolved_recent_global + ON discussions(last_note_at) + WHERE resolvable = 1 AND resolved = 0; 4) Make Overlap “touches” coherent: count MRs for reviewers, not DiffNotes Why Overlap’s question is “Who else has MRs touching my files?” but: reviewer branch uses COUNT(*) (DiffNotes) author branch uses COUNT(DISTINCT m.id) (MRs) Those are different units; summing them into touch_count is misleading. Change Count distinct MRs on the reviewer branch too: diff Copy code diff --git a/plan.md b/plan.md @@ - COUNT(*) AS touch_count, + COUNT(DISTINCT m.id) AS touch_count, MAX(n.created_at) AS last_touch_at, GROUP_CONCAT(DISTINCT (p.path_with_namespace || '!' || m.iid)) AS mr_refs Also update human output labeling: diff Copy code diff --git a/plan.md b/plan.md @@ - style("Touches").bold(), + style("MRs").bold(), (You still preserve “strength” via mr_refs and last_touch_at.) 5) Make outputs more actionable: add a canonical ref field (group/project!iid, group/project#iid) Why You already do this for Overlap (mr_refs). Doing the same for Workload and Active reduces friction for both humans and agents: humans can copy/paste a single token robots don’t need to stitch project_path + iid + prefix Change (Workload structs + SQL) diff Copy code diff --git a/plan.md b/plan.md @@ pub struct WorkloadIssue { pub iid: i64, + pub ref_: String, pub title: String, pub project_path: String, pub updated_at: i64, } @@ pub struct WorkloadMr { pub iid: i64, + pub ref_: String, pub title: String, pub draft: bool, pub project_path: String, @@ - let issues_sql = - "SELECT i.iid, i.title, p.path_with_namespace, i.updated_at + let issues_sql = + "SELECT i.iid, + (p.path_with_namespace || '#' || i.iid) AS ref, + i.title, p.path_with_namespace, i.updated_at @@ - iid: row.get(0)?, - title: row.get(1)?, - project_path: row.get(2)?, - updated_at: row.get(3)?, + iid: row.get(0)?, + ref_: row.get(1)?, + title: row.get(2)?, + project_path: row.get(3)?, + updated_at: row.get(4)?, }) @@ - let authored_sql = - "SELECT m.iid, m.title, m.draft, p.path_with_namespace, m.updated_at + let authored_sql = + "SELECT m.iid, + (p.path_with_namespace || '!' || m.iid) AS ref, + m.title, m.draft, p.path_with_namespace, m.updated_at @@ - iid: row.get(0)?, - title: row.get(1)?, - draft: row.get::<_, i32>(2)? != 0, - project_path: row.get(3)?, + iid: row.get(0)?, + ref_: row.get(1)?, + title: row.get(2)?, + draft: row.get::<_, i32>(3)? != 0, + project_path: row.get(4)?, author_username: None, - updated_at: row.get(4)?, + updated_at: row.get(5)?, }) Then use ref_ in human output + robot JSON. 6) Reviews mode: tolerate leading whitespace before **prefix** Why Many people write " **suggestion**: ...". Current LIKE '**%**%' misses that. Change Use ltrim(n.body) consistently: diff Copy code diff --git a/plan.md b/plan.md @@ - AND n.body LIKE '**%**%' + AND ltrim(n.body) LIKE '**%**%' @@ - SUBSTR(n.body, 3, INSTR(SUBSTR(n.body, 3), '**') - 1) AS raw_prefix, + SUBSTR(ltrim(n.body), 3, INSTR(SUBSTR(ltrim(n.body), 3), '**') - 1) AS raw_prefix, 7) Add two small tests that catch the above regressions Why These are exactly the kind of issues that slip through without targeted tests. diff Copy code diff --git a/plan.md b/plan.md @@ #[test] fn test_escape_like() { @@ } + + #[test] + fn test_build_path_query_exact_does_not_escape() { + // '_' must not be escaped for '=' + let pq = build_path_query("README_with_underscore.md"); + assert_eq!(pq.value, "README_with_underscore.md"); + assert!(!pq.is_prefix); + } + + #[test] + fn test_path_flag_dotless_root_file_is_exact() { + let pq = build_path_query("Makefile"); + assert_eq!(pq.value, "Makefile"); + assert!(!pq.is_prefix); + } Summary of net effect Correctness fixes: exact-path escaping bug; Active.note_count bug. Perf fixes: global --active index; avoid broad note scans in Active. Usefulness upgrades: coherent overlap “touch” metric; canonical refs everywhere; reviews prefix more robust. If you want one extra “stretch” that still isn’t scope creep: add an unscoped warning line in human output when project_id == None (e.g., “Aggregated across projects; use -p to scope”) for Expert/Overlap/Active. That’s pure presentation, but prevents misinterpretation in multi-project DBs.