2026-02-07 21:35:05 -05:00
parent 3e7fa607d3
commit 8dc479e515
9 changed files with 6194 additions and 0 deletions
@@ -0,0 +1,356 @@
Below are the highest-leverage revisions I’d make. They’re tightly scoped (no new tables/APIs), but fix a few real correctness issues and make the outputs more actionable.
1) Fix a correctness bug in PathQuery: don’t escape for =, and make --path Makefile actually work
Why
Bug: build_path_query() currently runs escape_like() even when is_prefix = false (exact match). That will break exact matches for paths containing _, %, or \ because = does not treat those as metacharacters (so the escaped string won’t equal the stored path).
UX mismatch: The plan says --path handles dotless root files (Makefile/LICENSE), but the current logic still treats them as directory prefixes (Makefile/%) → zero results.
Change
Only escape for LIKE.
Treat root paths (no /) passed via --path as exact matches by default (unless they end with /). The trade-off is that a root directory such as src must be written src/ to get prefix matching.
diff --git a/plan.md b/plan.md
@@
 /// Build a path query from a user-supplied path.
 ///
 /// Rules:
-/// - If the path ends with `/`, it's a directory prefix -> `escaped_path%` (LIKE)
-/// - If the last path segment contains `.`, it's a file -> exact match (=)
-/// - Otherwise, it's a directory prefix -> `escaped_path/%` (LIKE)
+/// - If the path ends with `/`, it's a directory prefix -> `escaped_path/%` (LIKE)
+/// - Else if the path is a root path (contains no `/`), treat as exact (=)
+///   (this makes `--path Makefile` and `--path LICENSE` work as intended)
+/// - Else if the last path segment contains `.`, treat as exact (=)
+/// - Otherwise, treat as directory prefix -> `escaped_path/%` (LIKE)
@@
 fn build_path_query(path: &str) -> PathQuery {
     let trimmed = path.trim_end_matches('/');
     let last_segment = trimmed.rsplit('/').next().unwrap_or(trimmed);
-    let is_file = !path.ends_with('/') && last_segment.contains('.');
-    let escaped = escape_like(trimmed);
+    let is_root = !trimmed.contains('/');
+    let is_file = !path.ends_with('/') && (is_root || last_segment.contains('.'));
 
     if is_file {
         PathQuery {
-            value: escaped,
+            // IMPORTANT: do NOT escape for exact match (=)
+            value: trimmed.to_string(),
             is_prefix: false,
         }
     } else {
+        let escaped = escape_like(trimmed);
         PathQuery {
             value: format!("{escaped}/%"),
             is_prefix: true,
         }
     }
 }
@@
-/// **Known limitation:** Dotless root files (LICENSE, Makefile, Dockerfile)
-/// without a trailing `/` will be treated as directory prefixes. Use `--path`
-/// for these — the `--path` flag passes through to Expert mode directly,
-/// and the `build_path_query` output for "LICENSE" is a prefix `LICENSE/%`
-/// which will simply return zero results (a safe, obvious failure mode that the
-/// help text addresses).
+/// Note: Root file paths passed via `--path` (including dotless files like Makefile/LICENSE)
+/// are treated as exact matches unless they end with `/`.
Also update the --path help text to be explicit:
diff --git a/plan.md b/plan.md
@@
-    /// Force expert mode for a file/directory path (handles root files like
-    /// README.md, LICENSE, Makefile that lack a / and can't be auto-detected)
+    /// Force expert mode for a file/directory path.
+    /// Root files (README.md, LICENSE, Makefile) are treated as exact matches.
+    /// Use a trailing `/` to force directory-prefix matching.
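To make the proposed rules concrete, here is a minimal self-contained sketch of the revised function. The `escape_like` body is an assumption (the plan's real helper is not shown here); it backslash-escapes `%`, `_`, and `\`. The `PathQuery` shape mirrors the one in the diff above.

```rust
/// Parsed form of a user-supplied path (mirrors the plan's `PathQuery`).
#[derive(Debug, PartialEq)]
pub struct PathQuery {
    pub value: String,
    pub is_prefix: bool,
}

/// Hypothetical stand-in for the plan's `escape_like`: backslash-escapes
/// the LIKE metacharacters `%` and `_`, plus `\` itself.
pub fn escape_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        if matches!(ch, '\\' | '%' | '_') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}

/// Applies the revised rules: trailing `/` forces a prefix, root paths and
/// dotted last segments are exact, everything else is a directory prefix.
pub fn build_path_query(path: &str) -> PathQuery {
    let trimmed = path.trim_end_matches('/');
    let last_segment = trimmed.rsplit('/').next().unwrap_or(trimmed);
    let is_root = !trimmed.contains('/');
    let is_file = !path.ends_with('/') && (is_root || last_segment.contains('.'));

    if is_file {
        // Exact match (=): the raw path is compared as-is, so no escaping.
        PathQuery { value: trimmed.to_string(), is_prefix: false }
    } else {
        // Prefix match (LIKE): escape first, then append the wildcard.
        PathQuery { value: format!("{}/%", escape_like(trimmed)), is_prefix: true }
    }
}
```

With this sketch, `build_path_query("Makefile")` yields an exact match on `Makefile`, while `build_path_query("src/my_mod/")` yields the escaped prefix `src/my\_mod/%`. The prefix form only behaves as intended if the SQL pairs it with `LIKE ? ESCAPE '\'`, since SQLite's LIKE has no default escape character.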
2) Fix Active mode: your note_count is currently counting participants, and the CTE scans too broadly
Why
In note_agg, you do SELECT DISTINCT discussion_id, author_username and then COUNT(*) AS note_count. That’s participant count, not note count.
The current note_agg also builds the DISTINCT set from all notes then joins to picked. It’s avoidable work.
Change
Split into two aggregations scoped to picked:
note_counts: counts non-system notes per picked discussion.
participants: distinct usernames per picked discussion, then GROUP_CONCAT.
diff --git a/plan.md b/plan.md
@@
-        note_agg AS (
-            SELECT
-                n.discussion_id,
-                COUNT(*) AS note_count,
-                GROUP_CONCAT(n.author_username, X'1F') AS participants
-            FROM (
-                SELECT DISTINCT discussion_id, author_username
-                FROM notes
-                WHERE is_system = 0 AND author_username IS NOT NULL
-            ) n
-            JOIN picked p ON p.id = n.discussion_id
-            GROUP BY n.discussion_id
-        )
+        note_counts AS (
+            SELECT
+                n.discussion_id,
+                COUNT(*) AS note_count
+            FROM notes n
+            JOIN picked p ON p.id = n.discussion_id
+            WHERE n.is_system = 0
+            GROUP BY n.discussion_id
+        ),
+        participants AS (
+            SELECT
+                x.discussion_id,
+                GROUP_CONCAT(x.author_username, X'1F') AS participants
+            FROM (
+                SELECT DISTINCT n.discussion_id, n.author_username
+                FROM notes n
+                JOIN picked p ON p.id = n.discussion_id
+                WHERE n.is_system = 0 AND n.author_username IS NOT NULL
+            ) x
+            GROUP BY x.discussion_id
+        )
@@
-        LEFT JOIN note_agg na ON na.discussion_id = p.id
+        LEFT JOIN note_counts nc ON nc.discussion_id = p.id
+        LEFT JOIN participants pa ON pa.discussion_id = p.id
@@
-            COALESCE(na.note_count, 0) AS note_count,
-            COALESCE(na.participants, '') AS participants
+            COALESCE(nc.note_count, 0) AS note_count,
+            COALESCE(pa.participants, '') AS participants
Net effect: correctness fix + more predictable perf.
Add a test that would have failed before:
diff --git a/plan.md b/plan.md
@@
     #[test]
     fn test_active_query() {
@@
         insert_diffnote(&conn, 1, 1, 1, "reviewer_b", "src/foo.rs", "needs work");
+        insert_diffnote(&conn, 2, 1, 1, "reviewer_b", "src/foo.rs", "follow-up");
@@
         assert_eq!(result.discussions[0].participants, vec!["reviewer_b"]);
+        assert_eq!(result.discussions[0].note_count, 2);
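The difference between the two aggregations can be shown without a database. The sketch below is an in-memory analogue, not the plan's code: `Note` and `aggregate` are hypothetical names, and the struct carries only the three columns the CTEs look at.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Minimal stand-in for a row of the `notes` table (illustrative only).
pub struct Note<'a> {
    pub discussion_id: i64,
    pub author_username: Option<&'a str>,
    pub is_system: bool,
}

/// Returns discussion_id -> (note_count, distinct participants),
/// keeping the two quantities separate as the revised CTEs do.
pub fn aggregate<'a>(notes: &[Note<'a>]) -> BTreeMap<i64, (usize, BTreeSet<&'a str>)> {
    let mut out: BTreeMap<i64, (usize, BTreeSet<&'a str>)> = BTreeMap::new();
    for n in notes.iter().filter(|n| !n.is_system) {
        let entry = out.entry(n.discussion_id).or_default();
        // note_counts: every non-system note counts once.
        entry.0 += 1;
        // participants: each username is recorded at most once.
        if let Some(user) = n.author_username {
            entry.1.insert(user);
        }
    }
    out
}
```

Two notes by the same reviewer on one discussion give a note count of 2 and a participant set of size 1. The original `note_agg` would have reported 1 for both.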
3) Index fix: idx_discussions_unresolved_recent won’t help global --active ordering
Why
Your index is (project_id, last_note_at) with WHERE resolvable=1 AND resolved=0.
When --active is not project-scoped (common default), SQLite can’t use (project_id, last_note_at) to satisfy ORDER BY last_note_at DESC efficiently because project_id isn’t constrained.
This can turn into a scan+sort over potentially large unresolved sets.
Change
Keep the project-scoped index, but add a global ordering index (partial, still small):
diff --git a/plan.md b/plan.md
@@
 CREATE INDEX IF NOT EXISTS idx_discussions_unresolved_recent
     ON discussions(project_id, last_note_at)
     WHERE resolvable = 1 AND resolved = 0;
+
+-- Active (global): unresolved discussions by recency (no project scope).
+-- Supports ORDER BY last_note_at DESC LIMIT N when project_id is unconstrained.
+CREATE INDEX IF NOT EXISTS idx_discussions_unresolved_recent_global
+    ON discussions(last_note_at)
+    WHERE resolvable = 1 AND resolved = 0;
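As a rough analogy for why the composite index cannot serve a global ORDER BY, the sketch below models each index as a sorted set and walks it in reverse. This is an illustration of key ordering only, not of SQLite's planner; the rows and function names are made up.

```rust
use std::collections::BTreeSet;

/// (id, project_id, last_note_at) for unresolved discussions; values are made up.
const ROWS: [(i64, i64, i64); 4] = [(1, 10, 500), (2, 10, 900), (3, 20, 700), (4, 20, 100)];

/// Models the existing index on (project_id, last_note_at).
/// A reverse walk is ordered by project_id first, so the first `limit`
/// entries are NOT the most recent discussions overall.
pub fn reverse_walk_composite(limit: usize) -> Vec<i64> {
    let index: BTreeSet<(i64, i64, i64)> =
        ROWS.iter().map(|&(id, project, ts)| (project, ts, id)).collect();
    index.iter().rev().take(limit).map(|&(_, _, id)| id).collect()
}

/// Models the proposed index on (last_note_at).
/// A reverse walk is already newest-first, so LIMIT can stop early.
pub fn reverse_walk_global(limit: usize) -> Vec<i64> {
    let index: BTreeSet<(i64, i64)> =
        ROWS.iter().map(|&(id, _, ts)| (ts, id)).collect();
    index.iter().rev().take(limit).map(|&(_, id)| id).collect()
}
```

With these rows, the composite walk returns discussions 3 and 4 (the highest project first), while the global walk returns 2 and 3, which are the two most recent. Getting the right answer from the composite ordering requires reading every entry and sorting.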
4) Make Overlap “touches” coherent: count MRs for reviewers, not DiffNotes
Why
Overlap’s question is “Who else has MRs touching my files?” but:
reviewer branch uses COUNT(*) (DiffNotes)
author branch uses COUNT(DISTINCT m.id) (MRs)
Those are different units; summing them into touch_count is misleading.
Change
Count distinct MRs on the reviewer branch too:
diff --git a/plan.md b/plan.md
@@
-                COUNT(*) AS touch_count,
+                COUNT(DISTINCT m.id) AS touch_count,
                 MAX(n.created_at) AS last_touch_at,
                 GROUP_CONCAT(DISTINCT (p.path_with_namespace || '!' || m.iid)) AS mr_refs
Also update human output labeling:
diff --git a/plan.md b/plan.md
@@
-        style("Touches").bold(),
+        style("MRs").bold(),
(You still preserve “strength” via mr_refs and last_touch_at.)
5) Make outputs more actionable: add a canonical ref field (group/project!iid, group/project#iid)
Why
You already do this for Overlap (mr_refs). Doing the same for Workload and Active reduces friction for both humans and agents:
humans can copy/paste a single token
robots don’t need to stitch project_path + iid + prefix
Change (Workload structs + SQL)
diff --git a/plan.md b/plan.md
@@
 pub struct WorkloadIssue {
     pub iid: i64,
+    pub ref_: String,
     pub title: String,
     pub project_path: String,
     pub updated_at: i64,
 }
@@
 pub struct WorkloadMr {
     pub iid: i64,
+    pub ref_: String,
     pub title: String,
     pub draft: bool,
     pub project_path: String,
@@
     let issues_sql =
-        "SELECT i.iid, i.title, p.path_with_namespace, i.updated_at
+        "SELECT i.iid,
+                (p.path_with_namespace || '#' || i.iid) AS ref,
+                i.title, p.path_with_namespace, i.updated_at
@@
                 iid: row.get(0)?,
-                title: row.get(1)?,
-                project_path: row.get(2)?,
-                updated_at: row.get(3)?,
+                ref_: row.get(1)?,
+                title: row.get(2)?,
+                project_path: row.get(3)?,
+                updated_at: row.get(4)?,
             })
@@
     let authored_sql =
-        "SELECT m.iid, m.title, m.draft, p.path_with_namespace, m.updated_at
+        "SELECT m.iid,
+                (p.path_with_namespace || '!' || m.iid) AS ref,
+                m.title, m.draft, p.path_with_namespace, m.updated_at
@@
                 iid: row.get(0)?,
-                title: row.get(1)?,
-                draft: row.get::<_, i32>(2)? != 0,
-                project_path: row.get(3)?,
+                ref_: row.get(1)?,
+                title: row.get(2)?,
+                draft: row.get::<_, i32>(3)? != 0,
+                project_path: row.get(4)?,
                 author_username: None,
-                updated_at: row.get(4)?,
+                updated_at: row.get(5)?,
             })
Then use ref_ in human output + robot JSON.
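If any output path needs to build the same token outside SQL, a small helper keeps the two sigils in one place. This is a sketch only; `RefKind` and `canonical_ref` are hypothetical names, not part of the plan.

```rust
/// Kind of entity a canonical reference points at.
pub enum RefKind {
    Issue,
    MergeRequest,
}

/// Builds `group/project#iid` for issues and `group/project!iid` for MRs,
/// matching the string the SQL concatenation produces.
pub fn canonical_ref(project_path: &str, kind: RefKind, iid: i64) -> String {
    let sigil = match kind {
        RefKind::Issue => '#',
        RefKind::MergeRequest => '!',
    };
    format!("{project_path}{sigil}{iid}")
}
```

For example, `canonical_ref("group/project", RefKind::MergeRequest, 42)` gives `group/project!42`. If the robot JSON is produced with serde (an assumption), a `#[serde(rename = "ref")]` attribute on `ref_` would keep the trailing underscore out of the emitted key.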
6) Reviews mode: tolerate leading whitespace before **prefix**
Why
Many people write " **suggestion**: ...". Current LIKE '**%**%' misses that.
Change
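Use ltrim(n.body) in both the filter and the prefix extraction. Note that SQLite's one-argument ltrim strips only leading spaces, so bodies that start with a tab or newline are still missed: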
diff --git a/plan.md b/plan.md
@@
-           AND n.body LIKE '**%**%'
+           AND ltrim(n.body) LIKE '**%**%'
@@
-            SUBSTR(n.body, 3, INSTR(SUBSTR(n.body, 3), '**') - 1) AS raw_prefix,
+            SUBSTR(ltrim(n.body), 3, INSTR(SUBSTR(ltrim(n.body), 3), '**') - 1) AS raw_prefix,
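The SUBSTR/INSTR arithmetic is easier to check against an equivalent in Rust. The function below is a hypothetical mirror of the revised SQL, useful as a reference when writing test expectations. It strips leading spaces only, matching what one-argument `ltrim` does in SQLite.

```rust
/// Returns the text between a leading `**` and the next `**`,
/// after removing leading spaces (and only spaces) from `body`.
/// Returns `None` when the body does not match `**<prefix>**...`.
pub fn extract_prefix(body: &str) -> Option<&str> {
    // Equivalent of ltrim(n.body) followed by the LIKE '**%**%' filter.
    let rest = body.trim_start_matches(' ').strip_prefix("**")?;
    // Equivalent of INSTR(SUBSTR(body, 3), '**') - 1 as the prefix length.
    let end = rest.find("**")?;
    Some(&rest[..end])
}
```

Under this sketch, `" **suggestion**: rename this"` yields `suggestion`, while a body starting with a tab yields no prefix at all.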
7) Add two small tests that catch the path-query regressions from (1)
Why
These are exactly the kind of issues that slip through without targeted tests.
diff --git a/plan.md b/plan.md
@@
     #[test]
     fn test_escape_like() {
@@
     }
+
+    #[test]
+    fn test_build_path_query_exact_does_not_escape() {
+        // '_' must not be escaped for '='
+        let pq = build_path_query("README_with_underscore.md");
+        assert_eq!(pq.value, "README_with_underscore.md");
+        assert!(!pq.is_prefix);
+    }
+
+    #[test]
+    fn test_path_flag_dotless_root_file_is_exact() {
+        let pq = build_path_query("Makefile");
+        assert_eq!(pq.value, "Makefile");
+        assert!(!pq.is_prefix);
+    }
Summary of net effect
Correctness fixes: exact-path escaping bug; Active.note_count bug.
Perf fixes: global --active index; avoid broad note scans in Active.
Usefulness upgrades: coherent overlap “touch” metric; canonical refs everywhere; reviews prefix more robust.
If you want one extra “stretch” that still isn’t scope creep: add an unscoped warning line in human output when project_id == None (e.g., “Aggregated across projects; use -p to scope”) for Expert/Overlap/Active. That’s pure presentation, but prevents misinterpretation in multi-project DBs.
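A minimal sketch of that stretch item, assuming the modes carry the scope as an `Option<i64>`; the function name is hypothetical and the message text is the one suggested above.

```rust
/// Returns a warning line for human output when no project scope is set.
pub fn scope_warning(project_id: Option<i64>) -> Option<&'static str> {
    match project_id {
        None => Some("Aggregated across projects; use -p to scope"),
        Some(_) => None,
    }
}
```

Each human-output printer for Expert, Overlap, and Active could print the returned line when it is `Some`, leaving robot JSON untouched.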