File: gitlore/docs/who-command-design.md
Taylor Eernisse f267578aab feat: implement lore who — people intelligence commands (5 modes)
Add `lore who` command with 5 query modes answering collaboration questions
using existing DB data (280K notes, 210K discussions, 33K DiffNotes):

- Expert: who knows about a file/directory (DiffNote path analysis + MR breadth scoring)
- Workload: what is a person working on (assigned issues, authored/reviewing MRs, discussions)
- Active: what discussions need attention (unresolved resolvable, global/project-scoped)
- Overlap: who else is touching these files (dual author+reviewer role tracking)
- Reviews: what review patterns does a person have (prefix-based category extraction)

Includes migration 017 (5 composite indexes), CLI skeleton with clap conflicts_with
validation, robot JSON output with input+resolved_input reproducibility, human terminal
output, and 20 unit tests. All quality gates pass.

Closes: bd-1q8z, bd-34rr, bd-2rk9, bd-2ldg, bd-zqpf, bd-s3rc, bd-m7k1, bd-b51e,
bd-2711, bd-1rdi, bd-3mj2, bd-tfh3, bd-zibc, bd-g0d5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 23:11:14 -05:00


| plan | title | status | iteration | target_iterations | beads_revision | related_plans | created | updated |
|------|-------|--------|-----------|-------------------|----------------|---------------|---------|---------|
| true | who-command-design | iterating | 8 | 8 | 1 | | 2026-02-07 | 2026-02-07 |

Plan: lore who — People Intelligence Commands

Context

The current beads roadmap focuses on Gate 4/5 (file-history, code-trace) — archaeology queries requiring mr_file_changes data that doesn't exist yet. Meanwhile, the DB has rich people/activity data (280K notes, 210K discussions, 33K DiffNotes with file positions, 53 active participants) that can answer collaboration questions immediately with zero new tables or API calls.

This plan builds lore who — a pure SQL query layer over existing data answering five questions:

  1. "Who should I talk to about this feature/file?"
  2. "What is person X working on?"
  3. "What review patterns does person X have?"
  4. "What discussions are actively in progress?"
  5. "Who else has MRs touching my files?"

Design Principles

  1. Lean on existing infrastructure. Prefer the (?N IS NULL OR ...) nullable-binding pattern (already used in timeline_seed.rs) over dynamic SQL string assembly, unless dynamic selection materially changes index choice. In those cases, select between two static SQL strings at runtime (no format!()): for example, Active mode uses separate global and project-scoped statements to ensure the intended index is used. All SQL is fully static, with no format!() for query text, including LIMIT (bound as ?N). Use prepare_cached() everywhere to capitalize on static SQL (a cheap perf win, no scope creep). When multiple static SQL variants exist (exact/prefix; scoped/unscoped), always (a) resolve which variant applies, then (b) prepare_cached() exactly one statement per invocation; never prepare both variants.
  2. Precision over breadth. Path-based queries filter to note_type = 'DiffNote' only — the sole note type with reliable position_new_path data. System notes (is_system = 1) are excluded from all DiffNote-based branches consistently (reviewer AND author).
  3. Performance-aware. A new migration adds composite indexes for the hot query paths. With 280K notes, unindexed LIKE on position_new_path plus timestamp filters would be unusable. Index column ordering is designed for query selectivity, not just predicate coverage. Performance is verified post-implementation with EXPLAIN QUERY PLAN.
  4. Fail-fast at the CLI layer. Clap conflicts_with / requires attributes catch invalid flag combinations before reaching application code.
  5. Robot-first reproducibility. Robot JSON output includes both a raw input object (echoing CLI args) and a resolved_input object (computed since_ms, since_iso, since_mode, resolved project_id + project_path, effective mode, limit) so agents can trace exactly what ran and reproduce it precisely. Fuzzy project resolution is non-deterministic over time; resolved values are the stable anchor. Entity IDs (discussion_id) are included in output so agents can dereference entities. since_mode is a tri-state ("default" / "explicit" / "none") that distinguishes "mode default applied" from "user provided --since" from "no window" (Workload), fixing the semantic bug where since_was_default = true was ambiguous for Workload mode (which has no default window).
  6. Multi-project correctness. All entity references are project-qualified in output (human: show project_path for workload/overlap; robot: mr_refs as group/project!123 instead of bare IIDs). Project scoping predicates target the same table as the covering index (n.project_id for DiffNote queries, not m.project_id).
  7. Exact vs prefix path matching. For exact file paths (no wildcard needed), use = instead of LIKE for better query plan signaling. Two static SQL strings are selected at call time via a PathQuery struct — no dynamic SQL.
  8. Self-review exclusion. MR authors commenting on their own diffs (clarifications, replies) must not be counted as reviewers. All reviewer-role branches (Expert, Overlap) filter n.author_username != m.author_username to prevent inflating reviewer counts with self-comments. Reviews mode already does this (m.author_username != ?1).
  9. Deterministic output. All list-valued fields in robot JSON must be deterministically ordered. GROUP_CONCAT results are sorted after parsing. HashSet-derived vectors are sorted before output. All sort comparisons include stable tie-breakers (username ASC as final key). This prevents robot-mode flakes across runs.
  10. Truncation transparency. Result types carry a truncated: bool flag. Queries request LIMIT + 1 rows, set truncated = true if overflow detected, then trim to the requested limit. Both human and robot output surface this so agents know to retry with a higher --limit.
  11. Bounded payloads. Robot JSON must never emit unbounded arrays (participants, MR refs). Large list fields are capped with *_total + *_truncated metadata so agents can detect silent truncation and page/retry. Top-level result set size is also bounded via --limit (1..=500, enforced at the CLI boundary via value_parser) to prevent runaway payloads from misconfigured agents. This is not scope creep — it's defensive output hygiene aligned with robot-first reproducibility.
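The LIMIT + 1 pattern from principle 10 can be sketched as a small helper. This is a hedged illustration only; the real result types and query code live in who.rs, and `trim_with_truncation` is a name invented here:

```rust
/// Sketch of the LIMIT + 1 truncation pattern: the query is bound with
/// limit + 1, so an overflow row signals truncation before trimming.
fn trim_with_truncation<T>(mut rows: Vec<T>, limit: usize) -> (Vec<T>, bool) {
    let truncated = rows.len() > limit;
    rows.truncate(limit);
    (rows, truncated)
}

fn main() {
    // Four rows returned for a requested limit of three: truncated.
    let (rows, truncated) = trim_with_truncation(vec![1, 2, 3, 4], 3);
    assert_eq!(rows, vec![1, 2, 3]);
    assert!(truncated);

    // Two rows for a limit of three: complete result set.
    let (rows, truncated) = trim_with_truncation(vec![1, 2], 3);
    assert_eq!(rows, vec![1, 2]);
    assert!(!truncated);
}
```

Because the overflow row is discarded before output, the caller never emits more than the requested limit while still learning whether more rows exist.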

Files to Create/Modify

| File | Action | Description |
|------|--------|-------------|
| src/cli/commands/who.rs | CREATE | All 5 query modes + human/robot output |
| src/cli/commands/mod.rs | MODIFY | Add pub mod who + re-exports |
| src/cli/mod.rs | MODIFY | Add WhoArgs struct + Commands::Who variant |
| src/main.rs | MODIFY | Add dispatch arm + handle_who fn + VALID_COMMANDS + robot-docs |
| src/core/db.rs | MODIFY | Add migration 017: composite indexes for who query paths |

Reusable Utilities (DO NOT reimplement)

| Utility | Location | Usage |
|---------|----------|-------|
| Config::load(override) | src/core/config.rs | Load config from file |
| get_db_path(override) | src/core/paths.rs | Resolve DB file path |
| create_connection(&path) | src/core/db.rs | Open SQLite connection |
| resolve_project(&conn, &str) | src/core/project.rs | Fuzzy project name resolution (returns local id: i64) |
| parse_since(&str) -> Option<i64> | src/core/time.rs | Parse "7d", "2w", "6m", "2024-01-15" to ms epoch |
| ms_to_iso(i64) -> String | src/core/time.rs | ms epoch to ISO 8601 string |
| now_ms() -> i64 | src/core/time.rs | Current time as ms epoch |
| RobotMeta { elapsed_ms } | src/cli/robot.rs | Standard meta field for JSON output |
| LoreError, Result | src/core/error.rs | Error types and Result alias |

Note: format_relative_time() in src/cli/commands/list.rs:595 is private. Duplicate it in who.rs rather than refactoring list.rs — keep the blast radius small.

Note: lookup_project_path() does not currently exist. Add a small helper in who.rs that does a single SELECT path_with_namespace FROM projects WHERE id = ?1 — it's only called once per invocation and doesn't warrant a shared utility.


Step 0: Migration 017 — Composite Indexes for Who Query Paths

Why this step exists

With 280K notes, the path/timestamp queries will degrade without composite indexes. Existing indexes cover note_type and position_new_path separately (migration 006) but not as composites aligned to the who query patterns. This is a non-breaking, additive-only migration.

src/core/db.rs — Add migration to MIGRATIONS array

Add as entry 17 (index 16, since the array is 0-indexed and currently has 16 entries):

-- Migration 017: Composite indexes for `who` query paths

-- Expert/Overlap: DiffNote path prefix + timestamp filter.
-- Column ordering rationale: The partial index predicate already filters
-- note_type = 'DiffNote', so note_type is NOT a leading column.
-- position_new_path is the most selective prefix filter (LIKE 'path/%'),
-- followed by created_at for range scans, and project_id for optional scoping.
CREATE INDEX IF NOT EXISTS idx_notes_diffnote_path_created
    ON notes(position_new_path, created_at, project_id)
    WHERE note_type = 'DiffNote' AND is_system = 0;

-- Active/Workload: discussion participation lookups.
-- Covers the EXISTS subquery pattern: WHERE n.discussion_id = d.id
-- AND n.author_username = ?1 AND n.is_system = 0.
-- Also covers the "participants" subquery in Active mode.
CREATE INDEX IF NOT EXISTS idx_notes_discussion_author
    ON notes(discussion_id, author_username)
    WHERE is_system = 0;

-- Active (project-scoped): unresolved discussions by recency, scoped by project.
-- Column ordering: project_id first for optional scoping, then last_note_at
-- for ORDER BY DESC. Partial index keeps it tiny (only unresolved resolvable).
CREATE INDEX IF NOT EXISTS idx_discussions_unresolved_recent
    ON discussions(project_id, last_note_at)
    WHERE resolvable = 1 AND resolved = 0;

-- Active (global): unresolved discussions by recency (no project scope).
-- Supports ORDER BY last_note_at DESC LIMIT N when project_id is unconstrained.
-- Without this, the (project_id, last_note_at) index can't satisfy the ORDER BY
-- efficiently because project_id isn't constrained, forcing a sort+scan.
CREATE INDEX IF NOT EXISTS idx_discussions_unresolved_recent_global
    ON discussions(last_note_at)
    WHERE resolvable = 1 AND resolved = 0;

-- Workload: issue assignees by username.
-- Existing indexes are on (issue_id) only, not (username).
CREATE INDEX IF NOT EXISTS idx_issue_assignees_username
    ON issue_assignees(username, issue_id);

After adding, LATEST_SCHEMA_VERSION automatically becomes 17 via MIGRATIONS.len() as i32.

Rationale for each index:

  • idx_notes_diffnote_path_created: Covers Expert and Overlap mode's core query pattern — filter DiffNotes by path prefix + creation timestamp. Partial index (WHERE note_type = 'DiffNote' AND is_system = 0) keeps index small (~33K of 280K notes) and also covers the is_system = 0 predicate that applies to all branches. Leading with position_new_path (not note_type) because the partial index predicate already handles the constant filter — the leading column should be the most selective variable predicate.
  • idx_notes_discussion_author: Covers the repeated EXISTS (SELECT 1 FROM notes n WHERE n.discussion_id = d.id AND n.author_username = ?1 AND n.is_system = 0) pattern in Workload mode, and the participants subquery in Active mode. Without this, every discussion participation check does a full scan of all non-system notes for that discussion.
  • idx_discussions_unresolved_recent: Covers Active mode with project scoping — find unresolved resolvable discussions ordered by recency within a specific project. Partial index keeps it tiny.
  • idx_discussions_unresolved_recent_global: Covers Active mode without project scoping — when project_id is unconstrained, (project_id, last_note_at) can't satisfy ORDER BY last_note_at DESC efficiently. This single-column partial index handles the common default case where users run lore who --active without -p.
  • idx_issue_assignees_username: Covers Workload mode — look up issues assigned to a specific user. Existing indexes are on (issue_id) only, not (username).

Not added (already adequate):

  • merge_requests(author_username) — already indexed by migration 006 as idx_mrs_author
  • mr_reviewers(username) — already indexed by migration 006 as idx_mr_reviewers_username
  • notes(discussion_id) — already indexed by migration 002 as idx_notes_discussion

Step 1: CLI Args + Commands Enum

src/cli/mod.rs — Add WhoArgs struct

Add the WhoArgs struct after the TimelineArgs struct (which ends around line 195):

#[derive(Parser)]
#[command(after_help = "\x1b[1mExamples:\x1b[0m
  lore who src/features/auth/           # Who knows about this area?
  lore who @asmith                      # What is asmith working on?
  lore who @asmith --reviews            # What review patterns does asmith have?
  lore who --active                     # What discussions need attention?
  lore who --overlap src/features/auth/ # Who else is touching these files?
  lore who --path README.md             # Expert lookup for a root file
  lore who --path Makefile              # Expert lookup for a dotless root file")]
pub struct WhoArgs {
    /// Username or file path (path if contains /)
    pub target: Option<String>,

    /// Force expert mode for a file/directory path.
    /// Root files (README.md, LICENSE, Makefile) are treated as exact matches.
    /// Use a trailing `/` to force directory-prefix matching.
    #[arg(long, help_heading = "Mode", conflicts_with_all = ["active", "overlap", "reviews"])]
    pub path: Option<String>,

    /// Show active unresolved discussions
    #[arg(long, help_heading = "Mode", conflicts_with_all = ["target", "overlap", "reviews", "path"])]
    pub active: bool,

    /// Find users with MRs/notes touching this file path
    #[arg(long, help_heading = "Mode", conflicts_with_all = ["target", "active", "reviews", "path"])]
    pub overlap: Option<String>,

    /// Show review pattern analysis (requires username target)
    #[arg(long, help_heading = "Mode", requires = "target", conflicts_with_all = ["active", "overlap", "path"])]
    pub reviews: bool,

    /// Time window (7d, 2w, 6m, YYYY-MM-DD). Default varies by mode.
    #[arg(long, help_heading = "Filters")]
    pub since: Option<String>,

    /// Scope to a project (supports fuzzy matching)
    #[arg(short = 'p', long, help_heading = "Filters")]
    pub project: Option<String>,

    /// Maximum results per section (1..=500, bounded for output safety)
    #[arg(
        short = 'n',
        long = "limit",
        default_value = "20",
        value_parser = clap::value_parser!(u16).range(1..=500),
        help_heading = "Output"
    )]
    pub limit: u16,
}

Key design choice: --path flag for root files. The positional target argument discriminates username vs. file path by checking for / — a heuristic that works for all nested paths but fails for root files like README.md, LICENSE, or Makefile (which have no /). Rather than adding fragile heuristics (e.g., checking for file extensions, which misses extensionless files like Makefile and Dockerfile), we add --path as an explicit escape hatch. This covers the uncommon case cleanly without complicating the common case.

Clap conflicts_with constraints. Invalid combinations like lore who --active --overlap path are rejected at parse time with clap's built-in error messages, before application code ever runs. This removes the need for manual validation in resolve_mode.

src/cli/mod.rs — Add to Commands enum

Insert Who(WhoArgs) in the Commands enum. Place it after Timeline(TimelineArgs) and before the hidden List command block:

    /// Show a chronological timeline of events matching a query
    Timeline(TimelineArgs),

    /// People intelligence: experts, workload, active discussions, overlap
    Who(WhoArgs),

    #[command(hide = true)]
    List {

src/cli/commands/mod.rs — Register module + exports

Add after pub mod timeline;:

pub mod who;

Add re-exports after the timeline pub use line:

pub use who::{
    run_who, print_who_human, print_who_json, WhoRun,
};

src/main.rs — Import additions

Add to the import block from lore::cli::commands:

    run_who, print_who_human, print_who_json,

Add to the import block from lore::cli:

    WhoArgs,

src/main.rs — Dispatch arm

Insert in the match cli.command block, after Timeline:

        Some(Commands::Who(args)) => handle_who(cli.config.as_deref(), args, robot_mode),

src/main.rs — Handler function

Add before handle_list_compat:

fn handle_who(
    config_override: Option<&str>,
    args: WhoArgs,
    robot_mode: bool,
) -> Result<(), Box<dyn std::error::Error>> {
    let start = std::time::Instant::now();
    let config = Config::load(config_override)?;
    let run = run_who(&config, &args)?;
    let elapsed_ms = start.elapsed().as_millis() as u64;

    if robot_mode {
        print_who_json(&run, &args, elapsed_ms);
    } else {
        print_who_human(&run.result, run.resolved_input.project_path.as_deref());
    }
    Ok(())
}

Note: print_who_json takes &run (which contains both the result and resolved inputs) plus &args to echo raw CLI inputs.

src/main.rs — VALID_COMMANDS array

Add "who" to the VALID_COMMANDS const array in suggest_similar_command (around line 471):

    const VALID_COMMANDS: &[&str] = &[
        "issues",
        "mrs",
        // ... existing entries ...
        "timeline",
        "who",    // <-- add this
    ];

src/main.rs — robot-docs manifest

Add a "who" entry to the commands JSON in handle_robot_docs (around line 2014, after "timeline"):

        "who": {
            "description": "People intelligence: experts, workload, active discussions, overlap, review patterns",
            "flags": ["<target>", "--path <path>", "--active", "--overlap <path>", "--reviews", "--since <duration>", "-p/--project", "-n/--limit"],
            "modes": {
                "expert": "lore who <file-path> — Who knows about this area? (also: --path for root files)",
                "workload": "lore who <username> — What is someone working on?",
                "reviews": "lore who <username> --reviews — Review pattern analysis",
                "active": "lore who --active — Active unresolved discussions",
                "overlap": "lore who --overlap <path> — Who else is touching these files?"
            },
            "example": "lore --robot who src/features/auth/",
            "response_schema": {
                "ok": "bool",
                "data": {
                    "mode": "string",
                    "input": {"target": "string|null", "path": "string|null", "project": "string|null", "since": "string|null", "limit": "int"},
                    "resolved_input": {"mode": "string", "project_id": "int|null", "project_path": "string|null", "since_ms": "int", "since_iso": "string", "since_mode": "string (default|explicit|none)", "limit": "int"},
                    "...": "mode-specific fields"
                },
                "meta": {"elapsed_ms": "int"}
            }
        },

Also add to the workflows JSON:

        "people_intelligence": [
            "lore --robot who src/path/to/feature/",
            "lore --robot who @username",
            "lore --robot who @username --reviews",
            "lore --robot who --active --since 7d",
            "lore --robot who --overlap src/path/",
            "lore --robot who --path README.md"
        ]

Step 2: src/cli/commands/who.rs — Complete Implementation

This is the main file. Full code follows.

File Header and Imports

use console::style;
use rusqlite::Connection;
use serde::Serialize;
use std::collections::{HashMap, HashSet};

use crate::Config;
use crate::cli::robot::RobotMeta;
use crate::cli::WhoArgs;
use crate::core::db::create_connection;
use crate::core::error::{LoreError, Result};
use crate::core::paths::get_db_path;
use crate::core::project::resolve_project;
use crate::core::time::{ms_to_iso, now_ms, parse_since};

Mode Discrimination

/// Determines which query mode to run based on args.
/// Path variants own their strings because path normalization produces new `String`s.
/// Username variants borrow from args since no normalization is needed.
enum WhoMode<'a> {
    /// lore who <file-path>  OR  lore who --path <path>
    Expert { path: String },
    /// lore who <username>
    Workload { username: &'a str },
    /// lore who <username> --reviews
    Reviews { username: &'a str },
    /// lore who --active
    Active,
    /// lore who --overlap <path>
    Overlap { path: String },
}

fn resolve_mode<'a>(args: &'a WhoArgs) -> Result<WhoMode<'a>> {
    // Explicit --path flag always wins (handles root files like README.md,
    // LICENSE, Makefile — anything without a / that can't be auto-detected)
    if let Some(p) = &args.path {
        return Ok(WhoMode::Expert { path: normalize_repo_path(p) });
    }
    if args.active {
        return Ok(WhoMode::Active);
    }
    if let Some(path) = &args.overlap {
        return Ok(WhoMode::Overlap { path: normalize_repo_path(path) });
    }
    if let Some(target) = &args.target {
        let clean = target.strip_prefix('@').unwrap_or(target);
        if args.reviews {
            return Ok(WhoMode::Reviews { username: clean });
        }
        // Disambiguation: if target contains '/', it's a file path.
        // GitLab usernames never contain '/'.
        // Root files (no '/') require --path.
        if target.contains('/') {
            return Ok(WhoMode::Expert { path: normalize_repo_path(target) });
        }
        return Ok(WhoMode::Workload { username: clean });
    }
    Err(LoreError::Other(
        "Provide a username, file path, --active, or --overlap <path>.\n\n\
         Examples:\n  \
         lore who src/features/auth/\n  \
         lore who @username\n  \
         lore who --active\n  \
         lore who --overlap src/features/\n  \
         lore who --path README.md\n  \
         lore who --path Makefile"
            .to_string(),
    ))
}

/// Normalize user-supplied repo paths to match stored DiffNote paths.
/// - trims whitespace
/// - strips leading "./" and "/" (repo-relative paths)
/// - converts '\' to '/' when no '/' present (Windows paste)
/// - collapses repeated "//"
fn normalize_repo_path(input: &str) -> String {
    let mut s = input.trim().to_string();
    // Windows backslash normalization (only when no forward slashes present)
    if s.contains('\\') && !s.contains('/') {
        s = s.replace('\\', "/");
    }
    // Strip leading ./
    while s.starts_with("./") {
        s = s[2..].to_string();
    }
    // Strip leading /
    s = s.trim_start_matches('/').to_string();
    // Collapse repeated //
    while s.contains("//") {
        s = s.replace("//", "/");
    }
    s
}
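A few representative inputs exercise the normalization rules above. The function body is copied verbatim from this file so the example is self-contained; the inputs are illustrative:

```rust
/// Copied from who.rs above: normalize user-supplied repo paths.
fn normalize_repo_path(input: &str) -> String {
    let mut s = input.trim().to_string();
    // Windows backslash normalization (only when no forward slashes present)
    if s.contains('\\') && !s.contains('/') {
        s = s.replace('\\', "/");
    }
    // Strip leading ./
    while s.starts_with("./") {
        s = s[2..].to_string();
    }
    // Strip leading /
    s = s.trim_start_matches('/').to_string();
    // Collapse repeated //
    while s.contains("//") {
        s = s.replace("//", "/");
    }
    s
}

fn main() {
    assert_eq!(normalize_repo_path("./src/auth/"), "src/auth/");
    assert_eq!(normalize_repo_path("/src//auth/mod.rs"), "src/auth/mod.rs");
    assert_eq!(normalize_repo_path("src\\auth\\mod.rs"), "src/auth/mod.rs");
    assert_eq!(normalize_repo_path("  README.md "), "README.md");
}
```

Note the backslash rule is conservative: a path already containing `/` is left alone even if it also contains `\`, so mixed-separator inputs are passed through unchanged.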

Result Types

/// Top-level run result: carries resolved inputs + the mode-specific result.
pub struct WhoRun {
    pub resolved_input: WhoResolvedInput,
    pub result: WhoResult,
}

/// Resolved query parameters — computed once, used for robot JSON reproducibility.
pub struct WhoResolvedInput {
    pub mode: String,
    pub project_id: Option<i64>,
    pub project_path: Option<String>,
    pub since_ms: Option<i64>,
    pub since_iso: Option<String>,
    /// "default" (mode default applied), "explicit" (user provided --since), "none" (no window)
    pub since_mode: String,
    pub limit: u16,
}

/// Top-level result enum — one variant per mode.
pub enum WhoResult {
    Expert(ExpertResult),
    Workload(WorkloadResult),
    Reviews(ReviewsResult),
    Active(ActiveResult),
    Overlap(OverlapResult),
}

// --- Expert ---

pub struct ExpertResult {
    pub path_query: String,
    /// "exact" or "prefix" — how the path was matched in SQL.
    /// Helps agents and humans understand zero-result cases.
    pub path_match: String,
    pub experts: Vec<Expert>,
    pub truncated: bool,
}

pub struct Expert {
    pub username: String,
    pub score: i64,
    pub review_mr_count: u32,
    pub review_note_count: u32,
    pub author_mr_count: u32,
    pub last_seen_ms: i64,
}

// --- Workload ---

pub struct WorkloadResult {
    pub username: String,
    pub assigned_issues: Vec<WorkloadIssue>,
    pub authored_mrs: Vec<WorkloadMr>,
    pub reviewing_mrs: Vec<WorkloadMr>,
    pub unresolved_discussions: Vec<WorkloadDiscussion>,
    pub assigned_issues_truncated: bool,
    pub authored_mrs_truncated: bool,
    pub reviewing_mrs_truncated: bool,
    pub unresolved_discussions_truncated: bool,
}

pub struct WorkloadIssue {
    pub iid: i64,
    /// Canonical reference: `group/project#iid`
    pub ref_: String,
    pub title: String,
    pub project_path: String,
    pub updated_at: i64,
}

pub struct WorkloadMr {
    pub iid: i64,
    /// Canonical reference: `group/project!iid`
    pub ref_: String,
    pub title: String,
    pub draft: bool,
    pub project_path: String,
    pub author_username: Option<String>,
    pub updated_at: i64,
}

pub struct WorkloadDiscussion {
    pub entity_type: String,
    pub entity_iid: i64,
    /// Canonical reference: `group/project!iid` or `group/project#iid`
    pub ref_: String,
    pub entity_title: String,
    pub project_path: String,
    pub last_note_at: i64,
}

// --- Reviews ---

pub struct ReviewsResult {
    pub username: String,
    pub total_diffnotes: u32,
    pub categorized_count: u32,
    pub mrs_reviewed: u32,
    pub categories: Vec<ReviewCategory>,
}

pub struct ReviewCategory {
    pub name: String,
    pub count: u32,
    pub percentage: f64,
}

// --- Active ---

pub struct ActiveResult {
    pub discussions: Vec<ActiveDiscussion>,
    /// Count of unresolved discussions *within the time window*, not total across all time.
    pub total_unresolved_in_window: u32,
    pub truncated: bool,
}

pub struct ActiveDiscussion {
    pub discussion_id: i64,
    pub entity_type: String,
    pub entity_iid: i64,
    pub entity_title: String,
    pub project_path: String,
    pub last_note_at: i64,
    pub note_count: u32,
    pub participants: Vec<String>,
    pub participants_total: u32,
    pub participants_truncated: bool,
}

// --- Overlap ---

pub struct OverlapResult {
    pub path_query: String,
    /// "exact" or "prefix" — how the path was matched in SQL.
    pub path_match: String,
    pub users: Vec<OverlapUser>,
    pub truncated: bool,
}

pub struct OverlapUser {
    pub username: String,
    pub author_touch_count: u32,
    pub review_touch_count: u32,
    pub touch_count: u32,
    pub last_seen_at: i64,
    /// Stable MR references like "group/project!123"
    pub mr_refs: Vec<String>,
    pub mr_refs_total: u32,
    pub mr_refs_truncated: bool,
}

Overlap role tracking: OverlapUser tracks author_touch_count and review_touch_count separately instead of collapsing to a single role: String. This preserves valuable information — a user who is both author and reviewer on different MRs at the same path is a stronger signal than either role alone. The human output renders this as A, R, or A+R.
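The A / R / A+R rendering can be sketched as a total function over the two counts (a hedged illustration; `role_label` is a name invented here, and the zero/zero arm is hypothetical since a user with no touches never appears in results):

```rust
/// Render dual role counts as the human-output label described above.
fn role_label(author_touches: u32, review_touches: u32) -> &'static str {
    match (author_touches > 0, review_touches > 0) {
        (true, true) => "A+R",
        (true, false) => "A",
        (false, true) => "R",
        (false, false) => "-", // hypothetical: such rows are never emitted
    }
}

fn main() {
    assert_eq!(role_label(2, 0), "A");
    assert_eq!(role_label(0, 3), "R");
    assert_eq!(role_label(1, 1), "A+R");
}
```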

mr_refs instead of mr_iids: In a multi-project database, bare !123 is ambiguous — iid 123 could exist in multiple projects. mr_refs stores stable project-qualified references like group/project!123 that are globally unique and directly actionable.

discussion_id in ActiveDiscussion: Agents need stable IDs to dereference entities (e.g., follow up on a specific discussion). Without this, the combination of entity_type + entity_iid + project_path is the only way to identify a discussion, and that's fragile.

since_mode in WhoResolvedInput: Agents replaying queries need to distinguish "user explicitly asked for 6 months" ("explicit") from "6 months was the mode default" ("default") from "no time window at all" ("none", Workload mode). The boolean since_was_default was ambiguous for Workload, which has no default window — true incorrectly implied a default was applied. The tri-state resolves this: an explicit --since 6m means "the user cares about this window"; "default" means "use whatever's reasonable"; "none" means "query is unbounded".
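The tri-state can be captured in one total function (a sketch; run_who below computes the same thing with two inline conditionals, and `since_mode` as a free function is a name invented here):

```rust
/// Tri-state since_mode resolution: a user-supplied --since always wins;
/// otherwise the answer depends on whether the mode has a default window
/// (expert/reviews/active/overlap do; workload does not).
fn since_mode(user_supplied_since: bool, mode_has_default: bool) -> &'static str {
    match (user_supplied_since, mode_has_default) {
        (true, _) => "explicit",
        (false, true) => "default",
        (false, false) => "none",
    }
}

fn main() {
    assert_eq!(since_mode(true, false), "explicit"); // --since 6m on workload
    assert_eq!(since_mode(false, true), "default");  // bare `lore who --active`
    assert_eq!(since_mode(false, false), "none");    // bare workload query
}
```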

Entry Point

/// Main entry point. Resolves mode + resolved inputs once, then dispatches.
pub fn run_who(config: &Config, args: &WhoArgs) -> Result<WhoRun> {
    let db_path = get_db_path(config.storage.db_path.as_deref());
    let conn = create_connection(&db_path)?;

    let project_id = args
        .project
        .as_deref()
        .map(|p| resolve_project(&conn, p))
        .transpose()?;

    let project_path = project_id
        .map(|id| lookup_project_path(&conn, id))
        .transpose()?;

    let mode = resolve_mode(args)?;

    // since_mode semantics:
    // - expert/reviews/active/overlap: default window applies if args.since is None -> "default"
    // - workload: no default window; args.since None => "none"
    let since_mode_for_defaulted = if args.since.is_some() { "explicit" } else { "default" };
    let since_mode_for_workload = if args.since.is_some() { "explicit" } else { "none" };

    match mode {
        WhoMode::Expert { path } => {
            let since_ms = resolve_since(args.since.as_deref(), "6m")?;
            let limit = usize::from(args.limit);
            let result = query_expert(&conn, &path, project_id, since_ms, limit)?;
            Ok(WhoRun {
                resolved_input: WhoResolvedInput {
                    mode: "expert".to_string(),
                    project_id,
                    project_path,
                    since_ms: Some(since_ms),
                    since_iso: Some(ms_to_iso(since_ms)),
                    since_mode: since_mode_for_defaulted.to_string(),
                    limit: args.limit,
                },
                result: WhoResult::Expert(result),
            })
        }
        WhoMode::Workload { username } => {
            let since_ms = args
                .since
                .as_deref()
                .map(|s| resolve_since_required(s))
                .transpose()?;
            let limit = usize::from(args.limit);
            let result = query_workload(&conn, username, project_id, since_ms, limit)?;
            Ok(WhoRun {
                resolved_input: WhoResolvedInput {
                    mode: "workload".to_string(),
                    project_id,
                    project_path,
                    since_ms,
                    since_iso: since_ms.map(ms_to_iso),
                    since_mode: since_mode_for_workload.to_string(),
                    limit: args.limit,
                },
                result: WhoResult::Workload(result),
            })
        }
        WhoMode::Reviews { username } => {
            let since_ms = resolve_since(args.since.as_deref(), "6m")?;
            let result = query_reviews(&conn, username, project_id, since_ms)?;
            Ok(WhoRun {
                resolved_input: WhoResolvedInput {
                    mode: "reviews".to_string(),
                    project_id,
                    project_path,
                    since_ms: Some(since_ms),
                    since_iso: Some(ms_to_iso(since_ms)),
                    since_mode: since_mode_for_defaulted.to_string(),
                    limit: args.limit,
                },
                result: WhoResult::Reviews(result),
            })
        }
        WhoMode::Active => {
            let since_ms = resolve_since(args.since.as_deref(), "7d")?;
            let limit = usize::from(args.limit);
            let result = query_active(&conn, project_id, since_ms, limit)?;
            Ok(WhoRun {
                resolved_input: WhoResolvedInput {
                    mode: "active".to_string(),
                    project_id,
                    project_path,
                    since_ms: Some(since_ms),
                    since_iso: Some(ms_to_iso(since_ms)),
                    since_mode: since_mode_for_defaulted.to_string(),
                    limit: args.limit,
                },
                result: WhoResult::Active(result),
            })
        }
        WhoMode::Overlap { path } => {
            let since_ms = resolve_since(args.since.as_deref(), "30d")?;
            let limit = usize::from(args.limit);
            let result = query_overlap(&conn, &path, project_id, since_ms, limit)?;
            Ok(WhoRun {
                resolved_input: WhoResolvedInput {
                    mode: "overlap".to_string(),
                    project_id,
                    project_path,
                    since_ms: Some(since_ms),
                    since_iso: Some(ms_to_iso(since_ms)),
                    since_mode: since_mode_for_defaulted.to_string(),
                    limit: args.limit,
                },
                result: WhoResult::Overlap(result),
            })
        }
    }
}

/// Look up the project path for a resolved project ID.
fn lookup_project_path(conn: &Connection, project_id: i64) -> Result<String> {
    conn.query_row(
        "SELECT path_with_namespace FROM projects WHERE id = ?1",
        rusqlite::params![project_id],
        |row| row.get(0),
    )
    .map_err(|e| LoreError::Other(format!("Failed to look up project path: {e}")))
}

/// Parse --since with a default fallback.
fn resolve_since(input: Option<&str>, default: &str) -> Result<i64> {
    let s = input.unwrap_or(default);
    parse_since(s).ok_or_else(|| {
        LoreError::Other(format!(
            "Invalid --since value: '{s}'. Use a duration (7d, 2w, 6m) or date (2024-01-15)"
        ))
    })
}

/// Parse --since without a default (returns error if invalid).
fn resolve_since_required(input: &str) -> Result<i64> {
    parse_since(input).ok_or_else(|| {
        LoreError::Other(format!(
            "Invalid --since value: '{input}'. Use a duration (7d, 2w, 6m) or date (2024-01-15)"
        ))
    })
}
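The `parse_since` helper used above is defined elsewhere in the codebase. As an illustration only, a minimal duration-only sketch might look like the following — the real `parse_since` also accepts `YYYY-MM-DD` dates and anchors to the wall clock; here the caller supplies `now_ms` so the logic stays deterministic and testable:

```rust
/// Hypothetical sketch of duration parsing for --since values like "7d", "2w", "6m".
/// Calendar months are approximated as 30 days; dates are out of scope here.
fn parse_since_duration(input: &str, now_ms: i64) -> Option<i64> {
    if !input.is_ascii() {
        return None; // split_at below assumes single-byte unit suffix
    }
    let (num, unit) = input.split_at(input.len().checked_sub(1)?);
    let n: i64 = num.parse().ok()?;
    const MS_PER_DAY: i64 = 86_400_000;
    let span_ms = match unit {
        "d" => n.checked_mul(MS_PER_DAY)?,
        "w" => n.checked_mul(MS_PER_DAY * 7)?,
        "m" => n.checked_mul(MS_PER_DAY * 30)?,
        _ => return None,
    };
    now_ms.checked_sub(span_ms)
}
```

The `Option` return maps directly onto the `ok_or_else` error wrapping in `resolve_since` above.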

Helper: Path Query Construction

All path-based queries (Expert, Overlap) need to convert user input into a SQL match. This logic is centralized in one helper to avoid duplication and ensure consistent behavior. The helper produces a PathQuery struct that indicates whether to use = (exact file match) or LIKE (directory prefix match), enabling the caller to select the appropriate static SQL string.

/// Describes how to match a user-supplied path in SQL.
struct PathQuery {
    /// The parameter value to bind.
    value: String,
    /// If true: use `LIKE value ESCAPE '\'`. If false: use `= value`.
    is_prefix: bool,
}

/// Build a path query from a user-supplied path, with project-scoped DB probes.
///
/// Rules:
/// - If the path ends with `/`, it's a directory prefix -> `escaped_path/%` (LIKE)
/// - If the path is a root path (no `/`) and does NOT end with `/`, treat as exact (=)
///   (this makes `--path Makefile` and `--path LICENSE` work as intended)
/// - Else if the last path segment contains `.`, heuristic suggests file (=)
/// - **Two-way DB probe** (project-scoped): when heuristics are ambiguous,
///   probe the DB to resolve. First probe exact path (scoped to project if
///   `-p` was given), then probe prefix (scoped). Only fall back to heuristics
///   if both probes fail. This prevents cross-project misclassification:
///   a path that exists as a file in project A shouldn't force exact-match
///   when the user runs `-p project_B` where it's a directory (or absent).
///   Each probe uses the existing `idx_notes_diffnote_path_created` partial
///   index and costs at most one indexed existence query.
/// - Otherwise, treat as directory prefix -> `escaped_path/%` (LIKE)
///
/// LIKE metacharacters (`%`, `_`, `\`) in the path are escaped with `\`.
/// All LIKE queries using this pattern MUST include `ESCAPE '\'`.
///
/// **IMPORTANT:** `escape_like()` is ONLY called for prefix (LIKE) matches.
/// For exact matches (`=`), the raw trimmed path is used directly — LIKE
/// metacharacters are not special in `=` comparisons, so escaping them
/// would break the match (e.g., `README\_with\_underscore.md` != stored
/// `README_with_underscore.md`).
///
/// Root file paths passed via `--path` (including dotless files like Makefile/LICENSE)
/// are treated as exact matches unless they end with `/`.
fn build_path_query(conn: &Connection, path: &str, project_id: Option<i64>) -> Result<PathQuery> {
    let trimmed = path.trim_end_matches('/');
    let last_segment = trimmed.rsplit('/').next().unwrap_or(trimmed);
    let is_root = !trimmed.contains('/');
    let forced_dir = path.ends_with('/');
    // Heuristic is now only a fallback; probes decide first when ambiguous.
    let looks_like_file = !forced_dir && (is_root || last_segment.contains('.'));

    // Probe 1: exact file exists (project-scoped via nullable binding)
    // Runs for ALL non-forced-dir paths, not just heuristically ambiguous ones.
    // This catches dotless files in subdirectories (src/Dockerfile, infra/Makefile)
    // that heuristics would misclassify as directories.
    let exact_exists = conn.query_row(
        "SELECT 1 FROM notes
         WHERE note_type = 'DiffNote'
           AND is_system = 0
           AND position_new_path = ?1
           AND (?2 IS NULL OR project_id = ?2)
         LIMIT 1",
        rusqlite::params![trimmed, project_id],
        |_| Ok(()),
    ).is_ok();

    // Probe 2: directory prefix exists (project-scoped)
    // Only probed when not forced-dir and exact didn't match — avoids
    // redundant work when we already know the answer.
    let prefix_exists = if !forced_dir && !exact_exists {
        let escaped = escape_like(trimmed);
        let pat = format!("{escaped}/%");
        conn.query_row(
            "SELECT 1 FROM notes
             WHERE note_type = 'DiffNote'
               AND is_system = 0
               AND position_new_path LIKE ?1 ESCAPE '\\'
               AND (?2 IS NULL OR project_id = ?2)
             LIMIT 1",
            rusqlite::params![pat, project_id],
            |_| Ok(()),
        ).is_ok()
    } else {
        false
    };

    // Forced directory always wins; otherwise: exact > prefix > heuristic
    let is_file = if forced_dir {
        false
    } else if exact_exists {
        true
    } else if prefix_exists {
        false
    } else {
        looks_like_file
    };

    if is_file {
        // IMPORTANT: do NOT escape for exact match (=). LIKE metacharacters
        // are not special in `=`, so escaping would produce wrong values.
        Ok(PathQuery {
            value: trimmed.to_string(),
            is_prefix: false,
        })
    } else {
        let escaped = escape_like(trimmed);
        Ok(PathQuery {
            value: format!("{escaped}/%"),
            is_prefix: true,
        })
    }
}

/// Escape LIKE metacharacters. All queries using this must include `ESCAPE '\'`.
fn escape_like(input: &str) -> String {
    input
        .replace('\\', "\\\\")
        .replace('%', "\\%")
        .replace('_', "\\_")
}
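To make the escaping behavior concrete, the helper can be exercised standalone (definition repeated here so the snippet is self-contained). Note that the backslash must be escaped first; otherwise the backslashes introduced for `%` and `_` would themselves get doubled:

```rust
/// Copy of the escape_like helper above, for standalone demonstration.
/// Backslash is replaced first so later replacements aren't re-escaped.
fn escape_like(input: &str) -> String {
    input
        .replace('\\', "\\\\")
        .replace('%', "\\%")
        .replace('_', "\\_")
}
```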

Path classification rationale: The original plan used path.contains('.') to detect files, which misclassifies directories like .github/workflows/ or src/v1.2/auth/. The improved approach checks only the last path segment for a dot, and respects trailing / as an explicit directory signal. Root files without / (like README.md) can't be passed as a positional arg (would be treated as a username); they require --path.

Root path handling (dotless files like LICENSE, Makefile): Root paths — those with no / separator — passed via --path are treated as exact matches by default. This makes --path Makefile and --path LICENSE produce correct results instead of the previous behavior of generating a prefix Makefile/% that would return zero results. Users can append / (e.g., --path Makefile/) to force prefix matching if desired.

Dotless subdirectory file handling (src/Dockerfile, infra/Makefile): Files like src/Dockerfile have no extension in their last segment, making them indistinguishable from directories via heuristics alone. Rather than maintaining a brittle allowlist of known extensionless filenames, build_path_query performs project-scoped two-way DB probes — first exact, then prefix — each a SELECT 1 ... LIMIT 1 against the existing partial index. If the exact path exists in DiffNotes (within the scoped project, when -p is given), it's treated as a file (exact match). If only prefix matches exist, it's treated as a directory. If neither probe finds data, heuristics take over. This costs at most two cheap indexed queries per invocation and prevents cross-project misclassification: a path that exists as a dotless file in project A shouldn't force exact-match behavior when running -p project_B where it doesn't exist.

Exact vs prefix SQL selection: ChatGPT feedback-3 correctly identified that for exact file paths (e.g., src/auth/login.rs), using LIKE ?1 ESCAPE '\' where ?1 has no wildcard is logically correct but gives the query planner a weaker signal than =. The PathQuery.is_prefix flag lets callers select between two static SQL strings at prepare time — no dynamic SQL assembly, just if pq.is_prefix { conn.prepare_cached(sql_prefix)? } else { conn.prepare_cached(sql_exact)? }. This gives SQLite the best possible information for both cases.
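Stripped of the DB probes, the heuristic fallback alone reduces to a small pure function. This is shown for illustration only — the real `build_path_query` consults the probes first and only falls back to this logic when neither probe finds data:

```rust
/// Heuristic-only path classification, mirroring the fallback rules in
/// build_path_query: a trailing '/' forces directory; root paths (no '/')
/// and dotted last segments look like files; everything else is a prefix.
fn looks_like_file(path: &str) -> bool {
    if path.ends_with('/') {
        return false; // explicit directory signal always wins
    }
    let is_root = !path.contains('/');
    let last_segment = path.rsplit('/').next().unwrap_or(path);
    is_root || last_segment.contains('.')
}
```

The `src/Dockerfile` case below is exactly why the probes exist: the heuristic alone misclassifies dotless subdirectory files as directories.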

Query: Expert Mode

fn query_expert(
    conn: &Connection,
    path: &str,
    project_id: Option<i64>,
    since_ms: i64,
    limit: usize,
) -> Result<ExpertResult> {
    let pq = build_path_query(conn, path, project_id)?;
    let limit_plus_one = (limit + 1) as i64;

    // Two static SQL strings: one for prefix (LIKE), one for exact (=).
    // Both produce identical output columns; only the path predicate differs.
    // Scoring is done entirely in SQL via CTE, so no Rust-side merge/sort needed.
    // Reviewer branch: JOINs through discussions -> merge_requests to:
    //   1. COUNT(DISTINCT m.id) — MR breadth, the primary scoring signal
    //   2. COUNT(*) — note intensity, a secondary signal
    //   3. Exclude self-reviews (n.author_username != m.author_username)
    //
    // Author branch: COUNT(DISTINCT m.id) for MR breadth.
    //
    // Scoring uses integer arithmetic:
    //   review_mr_count * 20 + author_mr_count * 12 + review_note_count * 1
    // This weights MR breadth heavily (prevents "comment storm" gaming)
    // while still giving a small bonus for review intensity.
    let sql_prefix = "
        WITH activity AS (
            SELECT
                n.author_username AS username,
                'reviewer' AS role,
                COUNT(DISTINCT m.id) AS mr_cnt,
                COUNT(*) AS note_cnt,
                MAX(n.created_at) AS last_seen_at
            FROM notes n
            JOIN discussions d ON n.discussion_id = d.id
            JOIN merge_requests m ON d.merge_request_id = m.id
            WHERE n.note_type = 'DiffNote'
              AND n.is_system = 0
              AND n.author_username IS NOT NULL
              AND (m.author_username IS NULL OR n.author_username != m.author_username)
              AND m.state IN ('opened','merged')
              AND n.position_new_path LIKE ?1 ESCAPE '\\'
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY n.author_username

            UNION ALL

            SELECT
                m.author_username AS username,
                'author' AS role,
                COUNT(DISTINCT m.id) AS mr_cnt,
                0 AS note_cnt,
                MAX(n.created_at) AS last_seen_at
            FROM merge_requests m
            JOIN discussions d ON d.merge_request_id = m.id
            JOIN notes n ON n.discussion_id = d.id
            WHERE n.note_type = 'DiffNote'
              AND n.is_system = 0
              AND m.author_username IS NOT NULL
              AND n.position_new_path LIKE ?1 ESCAPE '\\'
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY m.author_username
        )
        SELECT
            username,
            SUM(CASE WHEN role = 'reviewer' THEN mr_cnt ELSE 0 END) AS review_mr_count,
            SUM(CASE WHEN role = 'reviewer' THEN note_cnt ELSE 0 END) AS review_note_count,
            SUM(CASE WHEN role = 'author' THEN mr_cnt ELSE 0 END) AS author_mr_count,
            MAX(last_seen_at) AS last_seen_at,
            (
              (SUM(CASE WHEN role = 'reviewer' THEN mr_cnt ELSE 0 END) * 20) +
              (SUM(CASE WHEN role = 'author' THEN mr_cnt ELSE 0 END) * 12) +
              (SUM(CASE WHEN role = 'reviewer' THEN note_cnt ELSE 0 END) * 1)
            ) AS score
        FROM activity
        GROUP BY username
        ORDER BY score DESC, last_seen_at DESC, username ASC
        LIMIT ?4
    ";

    let sql_exact = "
        WITH activity AS (
            SELECT
                n.author_username AS username,
                'reviewer' AS role,
                COUNT(DISTINCT m.id) AS mr_cnt,
                COUNT(*) AS note_cnt,
                MAX(n.created_at) AS last_seen_at
            FROM notes n
            JOIN discussions d ON n.discussion_id = d.id
            JOIN merge_requests m ON d.merge_request_id = m.id
            WHERE n.note_type = 'DiffNote'
              AND n.is_system = 0
              AND n.author_username IS NOT NULL
              AND (m.author_username IS NULL OR n.author_username != m.author_username)
              AND m.state IN ('opened','merged')
              AND n.position_new_path = ?1
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY n.author_username

            UNION ALL

            SELECT
                m.author_username AS username,
                'author' AS role,
                COUNT(DISTINCT m.id) AS mr_cnt,
                0 AS note_cnt,
                MAX(n.created_at) AS last_seen_at
            FROM merge_requests m
            JOIN discussions d ON d.merge_request_id = m.id
            JOIN notes n ON n.discussion_id = d.id
            WHERE n.note_type = 'DiffNote'
              AND n.is_system = 0
              AND m.author_username IS NOT NULL
              AND n.position_new_path = ?1
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY m.author_username
        )
        SELECT
            username,
            SUM(CASE WHEN role = 'reviewer' THEN mr_cnt ELSE 0 END) AS review_mr_count,
            SUM(CASE WHEN role = 'reviewer' THEN note_cnt ELSE 0 END) AS review_note_count,
            SUM(CASE WHEN role = 'author' THEN mr_cnt ELSE 0 END) AS author_mr_count,
            MAX(last_seen_at) AS last_seen_at,
            (
              (SUM(CASE WHEN role = 'reviewer' THEN mr_cnt ELSE 0 END) * 20) +
              (SUM(CASE WHEN role = 'author' THEN mr_cnt ELSE 0 END) * 12) +
              (SUM(CASE WHEN role = 'reviewer' THEN note_cnt ELSE 0 END) * 1)
            ) AS score
        FROM activity
        GROUP BY username
        ORDER BY score DESC, last_seen_at DESC, username ASC
        LIMIT ?4
    ";

    let mut stmt = if pq.is_prefix {
        conn.prepare_cached(sql_prefix)?
    } else {
        conn.prepare_cached(sql_exact)?
    };

    let experts: Vec<Expert> = stmt
        .query_map(
            rusqlite::params![pq.value, since_ms, project_id, limit_plus_one],
            |row| {
                Ok(Expert {
                    username: row.get(0)?,
                    review_mr_count: row.get(1)?,
                    review_note_count: row.get(2)?,
                    author_mr_count: row.get(3)?,
                    last_seen_ms: row.get(4)?,
                    score: row.get(5)?,
                })
            },
        )?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    let truncated = experts.len() > limit;
    let experts: Vec<Expert> = experts.into_iter().take(limit).collect();

    Ok(ExpertResult {
        path_query: path.to_string(),
        path_match: if pq.is_prefix { "prefix" } else { "exact" }.to_string(),
        experts,
        truncated,
    })
}

Key design decisions:

  1. note_type = 'DiffNote' filter — Only DiffNotes have reliable position_new_path. Including other note types skews counts and scans more rows.
  2. Two static SQL strings (prefix vs exact) — PathQuery.is_prefix selects the right one at prepare time. For exact file paths, = gives the planner a stronger signal than LIKE without wildcards. Both strings are &str literals — no format!().
  3. (?3 IS NULL OR n.project_id = ?3) — Static SQL with nullable binding, matching the timeline_seed.rs pattern. Eliminates dynamic SQL assembly and param vector juggling. Critically, scoping is on n.project_id (not m.project_id) so SQLite can use idx_notes_diffnote_path_created which has project_id as a trailing column.
  4. n.is_system = 0 on BOTH branches — Author branch filters system notes too, matching reviewer branch for consistency. System-generated DiffNotes (bot activity, CI annotations) would skew "author touch" counts.
  5. Author branch uses n.created_at instead of m.updated_at — Anchors "last seen" to actual review activity at the path, not just MR update time (which could be a label change, rebase, etc.).
  5b. MR state filtering on reviewer branches — m.state IN ('opened','merged') excludes closed/unmerged noise, matching the existing filter on Overlap author branches. Closed MRs that were never merged are low-signal; if archaeology across abandoned MRs is needed later, a separate flag can enable it.
  6. SQL-level aggregation and scoring via CTE — All aggregation (reviewer + author merge), scoring, sorting, and LIMIT happens in SQL. No Rust-side HashMap merge/sort/truncate needed. This reduces materialized rows, makes LIMIT effective at the DB level, and is deterministic (tiebreaker: last_seen_at DESC, username ASC).
  7. prepare_cached() — Static SQL enables SQLite prepared statement caching for free.
  8. Self-review exclusion — Reviewer branch JOINs through discussions -> merge_requests and filters n.author_username != m.author_username (with IS NULL guard). MR authors replying to DiffNotes on their own diffs (clarifications, followups) are NOT reviewers. Without this, an author who leaves 30 clarification comments on their own MR would appear as the top "reviewer" for that path. Reviews mode already does this (m.author_username != ?1); now Expert and Overlap are consistent.
  9. MR-breadth scoring — Reviewer branch uses COUNT(DISTINCT m.id) as the primary metric (how many MRs reviewed) instead of raw DiffNote count (how many comments). This prevents "comment storm" gaming — someone who writes 30 comments on one MR shouldn't outrank someone who reviewed 5 different MRs. Note count is kept as a secondary intensity signal (review_note_count * 1 in the score formula). Author branch was already counting distinct MRs.
  10. Truncation — Queries request LIMIT + 1 rows. If more than limit rows are returned, truncated: true is set and results are trimmed. Both human and robot output surface this so consumers know to retry with a higher --limit.
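The scoring arithmetic in decisions 8–9 can be sanity-checked in isolation. This illustrative function mirrors the SQL score expression exactly:

```rust
/// Mirror of the SQL score expression:
/// review_mr_count * 20 + author_mr_count * 12 + review_note_count * 1
fn expert_score(review_mr_count: i64, author_mr_count: i64, review_note_count: i64) -> i64 {
    review_mr_count * 20 + author_mr_count * 12 + review_note_count
}
```

A "comment storm" of 30 notes on a single MR scores 20 + 30 = 50, while a quieter reviewer leaving one note on each of 5 MRs scores 100 + 5 = 105 — breadth wins, as intended.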

Query: Workload Mode

fn query_workload(
    conn: &Connection,
    username: &str,
    project_id: Option<i64>,
    since_ms: Option<i64>,
    limit: usize,
) -> Result<WorkloadResult> {
    let limit_plus_one = (limit + 1) as i64;

    // Query 1: Open issues assigned to user
    // SQL computes canonical ref (group/project#iid) for direct copy-paste use
    let issues_sql =
        "SELECT i.iid,
                (p.path_with_namespace || '#' || i.iid) AS ref,
                i.title, p.path_with_namespace, i.updated_at
         FROM issues i
         JOIN issue_assignees ia ON ia.issue_id = i.id
         JOIN projects p ON i.project_id = p.id
         WHERE ia.username = ?1
           AND i.state = 'opened'
           AND (?2 IS NULL OR i.project_id = ?2)
           AND (?3 IS NULL OR i.updated_at >= ?3)
         ORDER BY i.updated_at DESC
         LIMIT ?4";

    let mut stmt = conn.prepare_cached(issues_sql)?;
    let assigned_issues: Vec<WorkloadIssue> = stmt
        .query_map(rusqlite::params![username, project_id, since_ms, limit_plus_one], |row| {
            Ok(WorkloadIssue {
                iid: row.get(0)?,
                ref_: row.get(1)?,
                title: row.get(2)?,
                project_path: row.get(3)?,
                updated_at: row.get(4)?,
            })
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    // Query 2: Open MRs authored
    // SQL computes canonical ref (group/project!iid) for direct copy-paste use
    let authored_sql =
        "SELECT m.iid,
                (p.path_with_namespace || '!' || m.iid) AS ref,
                m.title, m.draft, p.path_with_namespace, m.updated_at
         FROM merge_requests m
         JOIN projects p ON m.project_id = p.id
         WHERE m.author_username = ?1
           AND m.state = 'opened'
           AND (?2 IS NULL OR m.project_id = ?2)
           AND (?3 IS NULL OR m.updated_at >= ?3)
         ORDER BY m.updated_at DESC
         LIMIT ?4";
    let mut stmt = conn.prepare_cached(authored_sql)?;
    let authored_mrs: Vec<WorkloadMr> = stmt
        .query_map(rusqlite::params![username, project_id, since_ms, limit_plus_one], |row| {
            Ok(WorkloadMr {
                iid: row.get(0)?,
                ref_: row.get(1)?,
                title: row.get(2)?,
                draft: row.get::<_, i32>(3)? != 0,
                project_path: row.get(4)?,
                author_username: None,
                updated_at: row.get(5)?,
            })
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    // Query 3: Open MRs where user is reviewer
    let reviewing_sql =
        "SELECT m.iid,
                (p.path_with_namespace || '!' || m.iid) AS ref,
                m.title, m.draft, p.path_with_namespace,
                m.author_username, m.updated_at
         FROM merge_requests m
         JOIN mr_reviewers r ON r.merge_request_id = m.id
         JOIN projects p ON m.project_id = p.id
         WHERE r.username = ?1
           AND m.state = 'opened'
           AND (?2 IS NULL OR m.project_id = ?2)
           AND (?3 IS NULL OR m.updated_at >= ?3)
         ORDER BY m.updated_at DESC
         LIMIT ?4";
    let mut stmt = conn.prepare_cached(reviewing_sql)?;
    let reviewing_mrs: Vec<WorkloadMr> = stmt
        .query_map(rusqlite::params![username, project_id, since_ms, limit_plus_one], |row| {
            Ok(WorkloadMr {
                iid: row.get(0)?,
                ref_: row.get(1)?,
                title: row.get(2)?,
                draft: row.get::<_, i32>(3)? != 0,
                project_path: row.get(4)?,
                author_username: row.get(5)?,
                updated_at: row.get(6)?,
            })
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    // Query 4: Unresolved discussions where user participated
    // Note: --since is intentionally NOT applied by default to discussions.
    // Unresolved threads remain relevant regardless of age. When --since is
    // explicitly provided, it filters on d.last_note_at to show only threads
    // with recent activity.
    // Canonical ref uses CASE to pick '#' for issues, '!' for MRs
    let disc_sql =
        "SELECT d.noteable_type,
                COALESCE(i.iid, m.iid) AS entity_iid,
                (p.path_with_namespace ||
                 CASE WHEN d.noteable_type = 'MergeRequest' THEN '!' ELSE '#' END ||
                 COALESCE(i.iid, m.iid)) AS ref,
                COALESCE(i.title, m.title) AS entity_title,
                p.path_with_namespace,
                d.last_note_at
         FROM discussions d
         JOIN projects p ON d.project_id = p.id
         LEFT JOIN issues i ON d.issue_id = i.id
         LEFT JOIN merge_requests m ON d.merge_request_id = m.id
         WHERE d.resolvable = 1 AND d.resolved = 0
           AND EXISTS (
             SELECT 1 FROM notes n
             WHERE n.discussion_id = d.id
             AND n.author_username = ?1
             AND n.is_system = 0
           )
           AND (?2 IS NULL OR d.project_id = ?2)
           AND (?3 IS NULL OR d.last_note_at >= ?3)
         ORDER BY d.last_note_at DESC
         LIMIT ?4";

    let mut stmt = conn.prepare_cached(disc_sql)?;
    let unresolved_discussions: Vec<WorkloadDiscussion> = stmt
        .query_map(rusqlite::params![username, project_id, since_ms, limit_plus_one], |row| {
            let noteable_type: String = row.get(0)?;
            let entity_type = if noteable_type == "MergeRequest" {
                "MR"
            } else {
                "Issue"
            };
            Ok(WorkloadDiscussion {
                entity_type: entity_type.to_string(),
                entity_iid: row.get(1)?,
                ref_: row.get(2)?,
                entity_title: row.get(3)?,
                project_path: row.get(4)?,
                last_note_at: row.get(5)?,
            })
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    // Truncation detection: each section queried LIMIT+1 rows.
    // Detect overflow, set truncated flags, trim to requested limit.
    let assigned_issues_truncated = assigned_issues.len() > limit;
    let authored_mrs_truncated = authored_mrs.len() > limit;
    let reviewing_mrs_truncated = reviewing_mrs.len() > limit;
    let unresolved_discussions_truncated = unresolved_discussions.len() > limit;

    let assigned_issues: Vec<WorkloadIssue> = assigned_issues.into_iter().take(limit).collect();
    let authored_mrs: Vec<WorkloadMr> = authored_mrs.into_iter().take(limit).collect();
    let reviewing_mrs: Vec<WorkloadMr> = reviewing_mrs.into_iter().take(limit).collect();
    let unresolved_discussions: Vec<WorkloadDiscussion> = unresolved_discussions.into_iter().take(limit).collect();

    Ok(WorkloadResult {
        username: username.to_string(),
        assigned_issues,
        authored_mrs,
        reviewing_mrs,
        unresolved_discussions,
        assigned_issues_truncated,
        authored_mrs_truncated,
        reviewing_mrs_truncated,
        unresolved_discussions_truncated,
    })
}

Key design decisions:

  1. Fully static SQL — All four queries are &str literals with LIMIT ?4 bound as a parameter. No format!() anywhere. This enables SQLite prepared statement caching and eliminates dynamic SQL creep.
  2. prepare_cached() — All four queries use prepare_cached() for statement caching.
  3. Unified parameter binding — All four queries use rusqlite::params![username, project_id, since_ms, limit_plus_one] with (?N IS NULL OR ...). Eliminates the disc_params / disc_param_refs duplication.
  4. Consistent --since behavior — The since_ms param (which is Option<i64>) is bound in all four queries. When None (user didn't pass --since), the (?3 IS NULL OR ...) clause is a no-op. This means --since consistently filters everything when provided, but doesn't artificially restrict when omitted. Unresolved discussions still show regardless of age by default, but respect --since when it's explicitly set.
  5. Canonical ref_ field — SQL computes project-qualified references (group/project#iid for issues, group/project!iid for MRs) directly in the SELECT. This single copy-pasteable token replaces the need to display iid + project_path separately in human output, and reduces agent stitching work in robot JSON. Both ref and project_path are included in robot output for flexibility.
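The ref computation lives in SQL, but the shape is simple enough to state as an equivalent Rust sketch (function name hypothetical):

```rust
/// Build a canonical GitLab reference: '#' for issues, '!' for merge requests.
/// Mirrors the SQL expression (p.path_with_namespace || sigil || iid).
fn canonical_ref(project_path: &str, iid: i64, is_merge_request: bool) -> String {
    let sigil = if is_merge_request { '!' } else { '#' };
    format!("{project_path}{sigil}{iid}")
}
```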

Query: Reviews Mode

fn query_reviews(
    conn: &Connection,
    username: &str,
    project_id: Option<i64>,
    since_ms: i64,
) -> Result<ReviewsResult> {
    // Count total DiffNotes by this user on MRs they didn't author
    let total_sql =
        "SELECT COUNT(*) FROM notes n
         JOIN discussions d ON n.discussion_id = d.id
         JOIN merge_requests m ON d.merge_request_id = m.id
         WHERE n.author_username = ?1
           AND n.note_type = 'DiffNote'
           AND n.is_system = 0
           AND m.author_username != ?1
           AND n.created_at >= ?2
           AND (?3 IS NULL OR n.project_id = ?3)";

    let total_diffnotes: u32 =
        conn.query_row(total_sql, rusqlite::params![username, since_ms, project_id], |row| row.get(0))?;

    // Count distinct MRs reviewed
    let mrs_sql =
        "SELECT COUNT(DISTINCT m.id) FROM notes n
         JOIN discussions d ON n.discussion_id = d.id
         JOIN merge_requests m ON d.merge_request_id = m.id
         WHERE n.author_username = ?1
           AND n.note_type = 'DiffNote'
           AND n.is_system = 0
           AND m.author_username != ?1
           AND n.created_at >= ?2
           AND (?3 IS NULL OR n.project_id = ?3)";

    let mrs_reviewed: u32 =
        conn.query_row(mrs_sql, rusqlite::params![username, since_ms, project_id], |row| row.get(0))?;

    // Extract prefixed categories: body starts with **prefix**
    // ltrim(n.body) tolerates leading whitespace (e.g., " **suggestion**: ...")
    // which is common in practice but would be missed by a strict LIKE '**%**%'.
    let cat_sql =
        "SELECT
            SUBSTR(ltrim(n.body), 3, INSTR(SUBSTR(ltrim(n.body), 3), '**') - 1) AS raw_prefix,
            COUNT(*) AS cnt
         FROM notes n
         JOIN discussions d ON n.discussion_id = d.id
         JOIN merge_requests m ON d.merge_request_id = m.id
         WHERE n.author_username = ?1
           AND n.note_type = 'DiffNote'
           AND n.is_system = 0
           AND m.author_username != ?1
           AND ltrim(n.body) LIKE '**%**%'
           AND n.created_at >= ?2
           AND (?3 IS NULL OR n.project_id = ?3)
         GROUP BY raw_prefix
         ORDER BY cnt DESC";

    let mut stmt = conn.prepare_cached(cat_sql)?;
    let raw_categories: Vec<(String, u32)> = stmt
        .query_map(rusqlite::params![username, since_ms, project_id], |row| {
            Ok((row.get::<_, String>(0)?, row.get(1)?))
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    // Normalize categories: lowercase, strip trailing colon/space,
    // merge nit/nitpick variants, merge (non-blocking) variants
    let mut merged: HashMap<String, u32> = HashMap::new();
    for (raw, count) in &raw_categories {
        let normalized = normalize_review_prefix(raw);
        if !normalized.is_empty() {
            *merged.entry(normalized).or_insert(0) += count;
        }
    }

    let categorized_count: u32 = merged.values().sum();

    let mut categories: Vec<ReviewCategory> = merged
        .into_iter()
        .map(|(name, count)| {
            let percentage = if categorized_count > 0 {
                f64::from(count) / f64::from(categorized_count) * 100.0
            } else {
                0.0
            };
            ReviewCategory {
                name,
                count,
                percentage,
            }
        })
        .collect();

    // Deterministic ordering: count DESC, then name ASC as tiebreaker.
    categories.sort_by(|a, b| b.count.cmp(&a.count).then_with(|| a.name.cmp(&b.name)));

    Ok(ReviewsResult {
        username: username.to_string(),
        total_diffnotes,
        categorized_count,
        mrs_reviewed,
        categories,
    })
}

/// Normalize a raw review prefix like "Suggestion (non-blocking):" into "suggestion".
fn normalize_review_prefix(raw: &str) -> String {
    let s = raw
        .trim()
        .trim_end_matches(':')
        .trim()
        .to_lowercase();

    // Strip "(non-blocking)" and similar parentheticals
    let s = if let Some(idx) = s.find('(') {
        s[..idx].trim().to_string()
    } else {
        s
    };

    // Merge nit/nitpick variants
    match s.as_str() {
        "nitpick" | "nit" => "nit".to_string(),
        other => other.to_string(),
    }
}
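Putting the normalization together with the HashMap merge step from query_reviews (the normalize definition is repeated so the snippet stands alone):

```rust
use std::collections::HashMap;

/// Copy of normalize_review_prefix above, for standalone demonstration.
fn normalize_review_prefix(raw: &str) -> String {
    let s = raw.trim().trim_end_matches(':').trim().to_lowercase();
    // Strip "(non-blocking)" and similar parentheticals
    let s = if let Some(idx) = s.find('(') {
        s[..idx].trim().to_string()
    } else {
        s
    };
    match s.as_str() {
        "nitpick" | "nit" => "nit".to_string(),
        other => other.to_string(),
    }
}

/// Merge raw (prefix, count) rows into normalized category totals,
/// mirroring the fold in query_reviews.
fn merge_categories(raw: &[(&str, u32)]) -> HashMap<String, u32> {
    let mut merged = HashMap::new();
    for &(prefix, count) in raw {
        let norm = normalize_review_prefix(prefix);
        if !norm.is_empty() {
            *merged.entry(norm).or_insert(0) += count;
        }
    }
    merged
}
```

With inputs like `("Nitpick:", 3)`, `("nit", 2)`, and `("Suggestion (non-blocking):", 4)`, the nit variants collapse into one bucket of 5 and the suggestion parenthetical is stripped.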

Query: Active Mode

fn query_active(
    conn: &Connection,
    project_id: Option<i64>,
    since_ms: i64,
    limit: usize,
) -> Result<ActiveResult> {
    let limit_plus_one = (limit + 1) as i64;

    // Total unresolved count — two static variants to avoid nullable-OR planner ambiguity.
    // The (?N IS NULL OR ...) pattern prevents SQLite from knowing at prepare time whether
    // the constraint is active, potentially choosing a suboptimal index. Separate statements
    // ensure the intended index (global vs project-scoped) is used.
    let total_sql_global =
        "SELECT COUNT(*) FROM discussions d
         WHERE d.resolvable = 1 AND d.resolved = 0
           AND d.last_note_at >= ?1";
    let total_sql_scoped =
        "SELECT COUNT(*) FROM discussions d
         WHERE d.resolvable = 1 AND d.resolved = 0
           AND d.last_note_at >= ?1
           AND d.project_id = ?2";

    let total_unresolved_in_window: u32 = match project_id {
        None => conn.query_row(total_sql_global, rusqlite::params![since_ms], |row| row.get(0))?,
        Some(pid) => conn.query_row(total_sql_scoped, rusqlite::params![since_ms, pid], |row| row.get(0))?,
    };

    // Active discussions with context — two static SQL variants (global vs project-scoped).
    //
    // CTE-based approach: first select the limited set of discussions (using
    // idx_discussions_unresolved_recent or idx_discussions_unresolved_recent_global),
    // then join notes ONCE to aggregate counts and participants. This avoids two
    // correlated subqueries per discussion row (note_count + participants), which can
    // create spiky performance if the planner chooses poorly.
    //
    // NOTE on GROUP_CONCAT: SQLite does not support
    //   GROUP_CONCAT(DISTINCT col, separator)
    // with both DISTINCT and a custom separator. The participants CTE uses a
    // subquery with SELECT DISTINCT first, then GROUP_CONCAT with the separator.
    //
    // Why two variants instead of (?2 IS NULL OR d.project_id = ?2):
    // SQLite evaluates `IS NULL` at runtime, not prepare time. With the nullable-OR
    // pattern, the planner can't assume project_id is constrained and may choose
    // idx_discussions_unresolved_recent (which leads with project_id) even for the
    // global case, forcing a full scan + sort. Separate statements let each use
    // the optimal index: global uses idx_discussions_unresolved_recent_global,
    // scoped uses idx_discussions_unresolved_recent.
    let sql_global = "
        WITH picked AS (
            SELECT d.id, d.noteable_type, d.issue_id, d.merge_request_id,
                   d.project_id, d.last_note_at
            FROM discussions d
            WHERE d.resolvable = 1 AND d.resolved = 0
              AND d.last_note_at >= ?1
            ORDER BY d.last_note_at DESC
            LIMIT ?2
        ),
        note_counts AS (
            SELECT
                n.discussion_id,
                COUNT(*) AS note_count
            FROM notes n
            JOIN picked p ON p.id = n.discussion_id
            WHERE n.is_system = 0
            GROUP BY n.discussion_id
        ),
        participants AS (
            SELECT
                x.discussion_id,
                GROUP_CONCAT(x.author_username, X'1F') AS participants
            FROM (
                SELECT DISTINCT n.discussion_id, n.author_username
                FROM notes n
                JOIN picked p ON p.id = n.discussion_id
                WHERE n.is_system = 0 AND n.author_username IS NOT NULL
            ) x
            GROUP BY x.discussion_id
        )
        SELECT
            p.id AS discussion_id,
            p.noteable_type,
            COALESCE(i.iid, m.iid) AS entity_iid,
            COALESCE(i.title, m.title) AS entity_title,
            proj.path_with_namespace,
            p.last_note_at,
            COALESCE(nc.note_count, 0) AS note_count,
            COALESCE(pa.participants, '') AS participants
        FROM picked p
        JOIN projects proj ON p.project_id = proj.id
        LEFT JOIN issues i ON p.issue_id = i.id
        LEFT JOIN merge_requests m ON p.merge_request_id = m.id
        LEFT JOIN note_counts nc ON nc.discussion_id = p.id
        LEFT JOIN participants pa ON pa.discussion_id = p.id
        ORDER BY p.last_note_at DESC
    ";

    let sql_scoped = "
        WITH picked AS (
            SELECT d.id, d.noteable_type, d.issue_id, d.merge_request_id,
                   d.project_id, d.last_note_at
            FROM discussions d
            WHERE d.resolvable = 1 AND d.resolved = 0
              AND d.last_note_at >= ?1
              AND d.project_id = ?2
            ORDER BY d.last_note_at DESC
            LIMIT ?3
        ),
        note_counts AS (
            SELECT
                n.discussion_id,
                COUNT(*) AS note_count
            FROM notes n
            JOIN picked p ON p.id = n.discussion_id
            WHERE n.is_system = 0
            GROUP BY n.discussion_id
        ),
        participants AS (
            SELECT
                x.discussion_id,
                GROUP_CONCAT(x.author_username, X'1F') AS participants
            FROM (
                SELECT DISTINCT n.discussion_id, n.author_username
                FROM notes n
                JOIN picked p ON p.id = n.discussion_id
                WHERE n.is_system = 0 AND n.author_username IS NOT NULL
            ) x
            GROUP BY x.discussion_id
        )
        SELECT
            p.id AS discussion_id,
            p.noteable_type,
            COALESCE(i.iid, m.iid) AS entity_iid,
            COALESCE(i.title, m.title) AS entity_title,
            proj.path_with_namespace,
            p.last_note_at,
            COALESCE(nc.note_count, 0) AS note_count,
            COALESCE(pa.participants, '') AS participants
        FROM picked p
        JOIN projects proj ON p.project_id = proj.id
        LEFT JOIN issues i ON p.issue_id = i.id
        LEFT JOIN merge_requests m ON p.merge_request_id = m.id
        LEFT JOIN note_counts nc ON nc.discussion_id = p.id
        LEFT JOIN participants pa ON pa.discussion_id = p.id
        ORDER BY p.last_note_at DESC
    ";

    // Row-mapping closure shared between both variants
    let map_row = |row: &rusqlite::Row| -> rusqlite::Result<ActiveDiscussion> {
        let noteable_type: String = row.get(1)?;
        let entity_type = if noteable_type == "MergeRequest" {
            "MR"
        } else {
            "Issue"
        };
        let participants_csv: Option<String> = row.get(7)?;
        // Sort participants for deterministic output — GROUP_CONCAT order is undefined
        let mut participants: Vec<String> = participants_csv
            .as_deref()
            .filter(|s| !s.is_empty())
            .map(|csv| csv.split('\x1F').map(String::from).collect())
            .unwrap_or_default();
        participants.sort();

        const MAX_PARTICIPANTS: usize = 50;
        let participants_total = participants.len() as u32;
        let participants_truncated = participants.len() > MAX_PARTICIPANTS;
        if participants_truncated {
            participants.truncate(MAX_PARTICIPANTS);
        }

        Ok(ActiveDiscussion {
            discussion_id: row.get(0)?,
            entity_type: entity_type.to_string(),
            entity_iid: row.get(2)?,
            entity_title: row.get(3)?,
            project_path: row.get(4)?,
            last_note_at: row.get(5)?,
            note_count: row.get(6)?,
            participants,
            participants_total,
            participants_truncated,
        })
    };

    // Select variant first, then prepare exactly one statement
    let discussions: Vec<ActiveDiscussion> = match project_id {
        None => {
            let mut stmt = conn.prepare_cached(sql_global)?;
            stmt.query_map(rusqlite::params![since_ms, limit_plus_one], &map_row)?
                .collect::<std::result::Result<Vec<_>, _>>()?
        }
        Some(pid) => {
            let mut stmt = conn.prepare_cached(sql_scoped)?;
            stmt.query_map(rusqlite::params![since_ms, pid, limit_plus_one], &map_row)?
                .collect::<std::result::Result<Vec<_>, _>>()?
        }
    };

    let truncated = discussions.len() > limit;
    let discussions: Vec<ActiveDiscussion> = discussions.into_iter().take(limit).collect();

    Ok(ActiveResult {
        discussions,
        total_unresolved_in_window,
        truncated,
    })
}

CTE-based refactor (from feedback-3 Change 5, corrected in feedback-5 Change 2): The original plan used two correlated subqueries per discussion row (one for note_count, one for participants). While not catastrophic with LIMIT 20, it creates "spiky" behavior and unnecessary work if the planner chooses poorly. The CTE approach:

  1. picked — selects the limited set of discussions first (hits idx_discussions_unresolved_recent or idx_discussions_unresolved_recent_global)
  2. note_counts — counts all non-system notes per picked discussion (actual note count, not participant count)
  3. participants — distinct usernames per picked discussion, then GROUP_CONCAT
  4. Final SELECT — joins everything together

Why note_counts and participants are separate CTEs: The iteration-4 plan used a single note_agg CTE that first did SELECT DISTINCT discussion_id, author_username then COUNT(*). This produced a participant count, not a note count. A discussion with 5 notes from 2 people would show note_count: 2 instead of note_count: 5. Splitting into two CTEs fixes the semantic mismatch while keeping both scoped to only the picked discussions.
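
A tiny illustration of the semantic difference the split fixes, using the 5-notes-from-2-people scenario above (values hypothetical):

```rust
use std::collections::HashSet;

/// Returns (note_count, participant_count) for one discussion's note authors.
/// note_count is what the note_counts CTE reports; participant_count is what
/// the old single note_agg CTE mistakenly reported as note_count.
fn counts(note_authors: &[&str]) -> (usize, usize) {
    let note_count = note_authors.len();
    let participant_count = note_authors.iter().collect::<HashSet<_>>().len();
    (note_count, participant_count)
}

fn main() {
    // A discussion with 5 non-system notes from 2 authors.
    let notes = ["alice", "alice", "bob", "alice", "bob"];
    assert_eq!(counts(&notes), (5, 2));
}
```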

This is structurally cleaner and more predictable than correlated subqueries.

Two static SQL variants (global vs scoped): The (?N IS NULL OR d.project_id = ?N) nullable-OR pattern undermines index selection for Active mode because SQLite evaluates the IS NULL check at runtime, not prepare time. With both idx_discussions_unresolved_recent (leading with project_id) and idx_discussions_unresolved_recent_global (single-column last_note_at), the planner may choose a "good enough for both" plan that is optimal for neither. Using two static SQL strings — sql_global (no project predicate, uses global index) and sql_scoped (explicit d.project_id = ?2, uses scoped index) — selected at runtime via match project_id ensures the intended index is always used. Both are prepared via prepare_cached() and only one is prepared per invocation.

LIMIT is parameterized — not assembled via format!(). This keeps the SQL fully static and enables prepared statement caching. In the global variant it's ?2, in the scoped variant it's ?3.
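
The LIMIT + 1 idiom shared by the query modes reduces to this small sketch (a hypothetical helper; the plan keeps the logic inline in each mode):

```rust
/// Fetch up to limit + 1 rows, then report whether more than `limit`
/// existed and return only the first `limit`.
fn take_with_truncation<T>(mut rows: Vec<T>, limit: usize) -> (Vec<T>, bool) {
    let truncated = rows.len() > limit;
    rows.truncate(limit);
    (rows, truncated)
}

fn main() {
    // Simulates a query run with limit = 2 that fetched limit + 1 = 3 rows.
    let (rows, truncated) = take_with_truncation(vec![10, 20, 30], 2);
    assert_eq!(rows, vec![10, 20]);
    assert!(truncated);

    // Exactly `limit` rows available: no truncation flag.
    let (rows, truncated) = take_with_truncation(vec![10, 20], 2);
    assert_eq!(rows, vec![10, 20]);
    assert!(!truncated);
}
```

Fetching one extra row is what lets `truncated` be computed without a second COUNT query.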

Query: Overlap Mode

fn query_overlap(
    conn: &Connection,
    path: &str,
    project_id: Option<i64>,
    since_ms: i64,
    limit: usize,
) -> Result<OverlapResult> {
    let pq = build_path_query(conn, path, project_id)?;

    // Two static SQL strings: prefix (LIKE) vs exact (=).
    // Both produce project-qualified MR references (group/project!iid)
    // via JOIN to projects table, eliminating multi-project ambiguity.
    // Reviewer branch excludes self-reviews (n.author_username != m.author_username)
    // to prevent MR authors from inflating reviewer counts.
    let sql_prefix = "SELECT username, role, touch_count, last_seen_at, mr_refs FROM (
            -- Reviewers who left DiffNotes on files matching the path.
            -- Self-reviews excluded: MR authors commenting on their own diffs
            -- are clarifications, not reviews.
            SELECT
                n.author_username AS username,
                'reviewer' AS role,
                COUNT(DISTINCT m.id) AS touch_count,
                MAX(n.created_at) AS last_seen_at,
                GROUP_CONCAT(DISTINCT (p.path_with_namespace || '!' || m.iid)) AS mr_refs
            FROM notes n
            JOIN discussions d ON n.discussion_id = d.id
            JOIN merge_requests m ON d.merge_request_id = m.id
            JOIN projects p ON m.project_id = p.id
            WHERE n.note_type = 'DiffNote'
              AND n.position_new_path LIKE ?1 ESCAPE '\\'
              AND n.is_system = 0
              AND n.author_username IS NOT NULL
              AND (m.author_username IS NULL OR n.author_username != m.author_username)
              AND m.state IN ('opened','merged')
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY n.author_username

            UNION ALL

            -- Authors who wrote MRs with reviews at this path
            SELECT
                m.author_username AS username,
                'author' AS role,
                COUNT(DISTINCT m.id) AS touch_count,
                MAX(n.created_at) AS last_seen_at,
                GROUP_CONCAT(DISTINCT (p.path_with_namespace || '!' || m.iid)) AS mr_refs
            FROM merge_requests m
            JOIN discussions d ON d.merge_request_id = m.id
            JOIN notes n ON n.discussion_id = d.id
            JOIN projects p ON m.project_id = p.id
            WHERE n.note_type = 'DiffNote'
              AND n.position_new_path LIKE ?1 ESCAPE '\\'
              AND n.is_system = 0
              AND m.state IN ('opened', 'merged')
              AND m.author_username IS NOT NULL
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY m.author_username
        )";

    let sql_exact = "SELECT username, role, touch_count, last_seen_at, mr_refs FROM (
            -- Self-reviews excluded for reviewer counts
            SELECT
                n.author_username AS username,
                'reviewer' AS role,
                COUNT(DISTINCT m.id) AS touch_count,
                MAX(n.created_at) AS last_seen_at,
                GROUP_CONCAT(DISTINCT (p.path_with_namespace || '!' || m.iid)) AS mr_refs
            FROM notes n
            JOIN discussions d ON n.discussion_id = d.id
            JOIN merge_requests m ON d.merge_request_id = m.id
            JOIN projects p ON m.project_id = p.id
            WHERE n.note_type = 'DiffNote'
              AND n.position_new_path = ?1
              AND n.is_system = 0
              AND n.author_username IS NOT NULL
              AND (m.author_username IS NULL OR n.author_username != m.author_username)
              AND m.state IN ('opened','merged')
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY n.author_username

            UNION ALL

            SELECT
                m.author_username AS username,
                'author' AS role,
                COUNT(DISTINCT m.id) AS touch_count,
                MAX(n.created_at) AS last_seen_at,
                GROUP_CONCAT(DISTINCT (p.path_with_namespace || '!' || m.iid)) AS mr_refs
            FROM merge_requests m
            JOIN discussions d ON d.merge_request_id = m.id
            JOIN notes n ON n.discussion_id = d.id
            JOIN projects p ON m.project_id = p.id
            WHERE n.note_type = 'DiffNote'
              AND n.position_new_path = ?1
              AND n.is_system = 0
              AND m.state IN ('opened', 'merged')
              AND m.author_username IS NOT NULL
              AND n.created_at >= ?2
              AND (?3 IS NULL OR n.project_id = ?3)
            GROUP BY m.author_username
        )";

    let mut stmt = if pq.is_prefix {
        conn.prepare_cached(sql_prefix)?
    } else {
        conn.prepare_cached(sql_exact)?
    };
    let rows: Vec<(String, String, u32, i64, Option<String>)> = stmt
        .query_map(rusqlite::params![pq.value, since_ms, project_id], |row| {
            Ok((
                row.get(0)?,
                row.get(1)?,
                row.get(2)?,
                row.get(3)?,
                row.get(4)?,
            ))
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    // Internal accumulator uses HashSet for MR refs from the start — avoids
    // expensive drain-rebuild per row that the old Vec + HashSet pattern did.
    struct OverlapAcc {
        username: String,
        author_touch_count: u32,
        review_touch_count: u32,
        touch_count: u32,
        last_seen_at: i64,
        mr_refs: HashSet<String>,
    }

    let mut user_map: HashMap<String, OverlapAcc> = HashMap::new();
    for (username, role, count, last_seen, mr_refs_csv) in &rows {
        let mr_refs: Vec<String> = mr_refs_csv
            .as_deref()
            .map(|csv| csv.split(',').map(|s| s.trim().to_string()).collect())
            .unwrap_or_default();

        let entry = user_map.entry(username.clone()).or_insert_with(|| OverlapAcc {
            username: username.clone(),
            author_touch_count: 0,
            review_touch_count: 0,
            touch_count: 0,
            last_seen_at: 0,
            mr_refs: HashSet::new(),
        });
        entry.touch_count += count;
        if role == "author" {
            entry.author_touch_count += count;
        } else {
            entry.review_touch_count += count;
        }
        if *last_seen > entry.last_seen_at {
            entry.last_seen_at = *last_seen;
        }
        for r in mr_refs {
            entry.mr_refs.insert(r);
        }
    }

    // Convert accumulators to output structs. Sort + bound mr_refs for deterministic,
    // bounded output. Agents get total + truncated metadata per principle #11.
    const MAX_MR_REFS_PER_USER: usize = 50;
    let mut users: Vec<OverlapUser> = user_map
        .into_values()
        .map(|a| {
            let mut mr_refs: Vec<String> = a.mr_refs.into_iter().collect();
            mr_refs.sort();
            let mr_refs_total = mr_refs.len() as u32;
            let mr_refs_truncated = mr_refs.len() > MAX_MR_REFS_PER_USER;
            if mr_refs_truncated {
                mr_refs.truncate(MAX_MR_REFS_PER_USER);
            }
            OverlapUser {
                username: a.username,
                author_touch_count: a.author_touch_count,
                review_touch_count: a.review_touch_count,
                touch_count: a.touch_count,
                last_seen_at: a.last_seen_at,
                mr_refs,
                mr_refs_total,
                mr_refs_truncated,
            }
        })
        .collect();

    // Stable sort with full tie-breakers for deterministic output
    users.sort_by(|a, b| {
        b.touch_count
            .cmp(&a.touch_count)
            .then_with(|| b.last_seen_at.cmp(&a.last_seen_at))
            .then_with(|| a.username.cmp(&b.username))
    });

    let truncated = users.len() > limit;
    users.truncate(limit);

    Ok(OverlapResult {
        path_query: path.to_string(),
        path_match: if pq.is_prefix { "prefix" } else { "exact" }.to_string(),
        users,
        truncated,
    })
}

/// Format overlap role for display: "A", "R", or "A+R".
fn format_overlap_role(user: &OverlapUser) -> &'static str {
    match (user.author_touch_count > 0, user.review_touch_count > 0) {
        (true, true) => "A+R",
        (true, false) => "A",
        (false, true) => "R",
        (false, false) => "-",
    }
}
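
The role matrix in isolation, with OverlapUser reduced to the two counters the function consults (a standalone sketch, not the plan's full struct):

```rust
// Minimal stand-in for the plan's OverlapUser; only the two fields
// format_overlap_role inspects are included here.
struct OverlapUser {
    author_touch_count: u32,
    review_touch_count: u32,
}

fn format_overlap_role(user: &OverlapUser) -> &'static str {
    match (user.author_touch_count > 0, user.review_touch_count > 0) {
        (true, true) => "A+R",
        (true, false) => "A",
        (false, true) => "R",
        (false, false) => "-",
    }
}

fn main() {
    let both = OverlapUser { author_touch_count: 3, review_touch_count: 1 };
    let author_only = OverlapUser { author_touch_count: 2, review_touch_count: 0 };
    let reviewer_only = OverlapUser { author_touch_count: 0, review_touch_count: 4 };
    assert_eq!(format_overlap_role(&both), "A+R");
    assert_eq!(format_overlap_role(&author_only), "A");
    assert_eq!(format_overlap_role(&reviewer_only), "R");
}
```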

Key design decisions:

  1. Dual role tracking — OverlapUser tracks author_touch_count and review_touch_count separately. A user who both authored and reviewed at the same path gets A+R — a stronger signal than either alone.
  2. Coherent touch_count units — Both reviewer and author branches use COUNT(DISTINCT m.id) to count MRs, not DiffNotes. This makes touch_count a consistent, comparable metric across roles — summing author + reviewer MR counts is meaningful, whereas mixing DiffNote counts (reviewer) with MR counts (author) produces misleading totals. The human output header says "MRs" (not "Touches") to reflect this.
  3. n.created_at for author branch — Anchors "last seen" to actual review note timestamps at the path, not MR-level updated_at which can change for unrelated reasons.
  4. n.is_system = 0 on BOTH branches — Consistent system note exclusion across all DiffNote queries. This matches Expert mode and prevents bot/CI annotation noise.
  4b. m.state IN ('opened','merged') on reviewer branches — Consistent with author branches. Closed/unmerged MRs are noise. Matches Expert mode.
  4c. Bounded mr_refs — mr_refs are capped at 50 per user with mr_refs_total + mr_refs_truncated metadata. Prevents unbounded JSON payloads per principle #11.
  5. n.project_id for scoping — Same as Expert mode, scoping on the notes table to maximize index usage.
  6. Project-qualified MR refs (group/project!iid) — The SQL joins the projects table and concatenates p.path_with_namespace || '!' || m.iid to produce globally unique references. In a multi-project database, bare !123 is ambiguous — the same iid can exist in multiple projects. This makes overlap output directly actionable without needing to cross-reference project context.
  7. Accumulator pattern — OverlapAcc uses HashSet<String> for MR refs from the start, avoiding the expensive drain-rebuild per row that the old Vec + HashSet pattern did. Convert once to Vec at the end with deterministic sorting.
  8. Two static SQL strings (prefix vs exact) — Same pattern as Expert mode.
  9. Self-review exclusion — Same as Expert: reviewer branch filters n.author_username != m.author_username to prevent MR authors from inflating reviewer counts.
  10. Deterministic sorting — Users are sorted by touch_count DESC, last_seen_at DESC, username ASC. All three tie-breaker fields ensure identical output across runs. MR refs within each user are also sorted alphabetically.
  11. Truncation — Same LIMIT + 1 pattern as Expert and Active modes. truncated: bool exposed in both human and robot output.
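
Decision 6 and the mr_refs parsing in query_overlap rely on refs never containing commas; a quick sketch of both sides, with hypothetical project paths:

```rust
fn main() {
    // SQL builds refs as p.path_with_namespace || '!' || m.iid.
    // The Rust-side equivalent of that concatenation:
    let mr_ref = format!("{}!{}", "group/project", 123);
    assert_eq!(mr_ref, "group/project!123");

    // GROUP_CONCAT(DISTINCT ...) uses SQLite's default comma separator.
    // Project paths and iids contain no commas, so the split in
    // query_overlap recovers the refs losslessly.
    let concatenated = "group/a!1,group/b!2";
    let refs: Vec<String> = concatenated
        .split(',')
        .map(|s| s.trim().to_string())
        .collect();
    assert_eq!(refs, vec!["group/a!1".to_string(), "group/b!2".to_string()]);
}
```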

Human Output

pub fn print_who_human(result: &WhoResult, project_path: Option<&str>) {
    match result {
        WhoResult::Expert(r) => print_expert_human(r, project_path),
        WhoResult::Workload(r) => print_workload_human(r),
        WhoResult::Reviews(r) => print_reviews_human(r),
        WhoResult::Active(r) => print_active_human(r, project_path),
        WhoResult::Overlap(r) => print_overlap_human(r, project_path),
    }
}

/// Print a dim hint when results aggregate across all projects.
/// Modes that could be ambiguous without project context (Expert, Active, Overlap)
/// call this to prevent misinterpretation in multi-project databases.
fn print_scope_hint(project_path: Option<&str>) {
    if project_path.is_none() {
        println!(
            "  {}",
            style("(aggregated across all projects; use -p to scope)").dim()
        );
    }
}

fn print_expert_human(r: &ExpertResult, project_path: Option<&str>) {
    println!();
    println!(
        "{}",
        style(format!("Experts for {}", r.path_query)).bold()
    );
    println!("{}", "─".repeat(60));
    println!(
        "  {}",
        style(format!("(matching {} {})", r.path_match, if r.path_match == "exact" { "file" } else { "directory prefix" })).dim()
    );
    print_scope_hint(project_path);
    println!();

    if r.experts.is_empty() {
        println!("  {}", style("No experts found for this path.").dim());
        println!();
        return;
    }

    println!(
        "  {:<16} {:>6} {:>12} {:>6} {:>12}  {}",
        style("Username").bold(),
        style("Score").bold(),
        style("Reviewed(MRs)").bold(),
        style("Notes").bold(),
        style("Authored(MRs)").bold(),
        style("Last Seen").bold(),
    );

    for expert in &r.experts {
        let reviews = if expert.review_mr_count > 0 {
            expert.review_mr_count.to_string()
        } else {
            "-".to_string()
        };
        let notes = if expert.review_note_count > 0 {
            expert.review_note_count.to_string()
        } else {
            "-".to_string()
        };
        let authored = if expert.author_mr_count > 0 {
            expert.author_mr_count.to_string()
        } else {
            "-".to_string()
        };
        println!(
            "  {:<16} {:>6} {:>12} {:>6} {:>12}  {}",
            style(format!("@{}", expert.username)).cyan(),
            expert.score,
            reviews,
            notes,
            authored,
            style(format_relative_time(expert.last_seen_ms)).dim(),
        );
    }
    if r.truncated {
        println!("  {}", style("(truncated; rerun with a higher --limit)").dim());
    }
    println!();
}

fn print_workload_human(r: &WorkloadResult) {
    println!();
    println!(
        "{}",
        style(format!("@{} -- Workload Summary", r.username)).bold()
    );
    println!("{}", "─".repeat(60));

    if !r.assigned_issues.is_empty() {
        println!();
        println!(
            "  {} ({})",
            style("Assigned Issues").bold(),
            r.assigned_issues.len()
        );
        for item in &r.assigned_issues {
            println!(
                "    {} {}  {}",
                style(&item.ref_).cyan(),
                truncate_str(&item.title, 40),
                style(format_relative_time(item.updated_at)).dim(),
            );
        }
        if r.assigned_issues_truncated {
            println!("    {}", style("(truncated; rerun with a higher --limit)").dim());
        }
    }

    if !r.authored_mrs.is_empty() {
        println!();
        println!(
            "  {} ({})",
            style("Authored MRs").bold(),
            r.authored_mrs.len()
        );
        for mr in &r.authored_mrs {
            let draft = if mr.draft { " [draft]" } else { "" };
            println!(
                "    {} {}{}  {}",
                style(&mr.ref_).cyan(),
                truncate_str(&mr.title, 35),
                style(draft).dim(),
                style(format_relative_time(mr.updated_at)).dim(),
            );
        }
        if r.authored_mrs_truncated {
            println!("    {}", style("(truncated; rerun with a higher --limit)").dim());
        }
    }

    if !r.reviewing_mrs.is_empty() {
        println!();
        println!(
            "  {} ({})",
            style("Reviewing MRs").bold(),
            r.reviewing_mrs.len()
        );
        for mr in &r.reviewing_mrs {
            let author = mr
                .author_username
                .as_deref()
                .map(|a| format!(" by @{a}"))
                .unwrap_or_default();
            println!(
                "    {} {}{}  {}",
                style(&mr.ref_).cyan(),
                truncate_str(&mr.title, 30),
                style(author).dim(),
                style(format_relative_time(mr.updated_at)).dim(),
            );
        }
        if r.reviewing_mrs_truncated {
            println!("    {}", style("(truncated; rerun with a higher --limit)").dim());
        }
    }

    if !r.unresolved_discussions.is_empty() {
        println!();
        println!(
            "  {} ({})",
            style("Unresolved Discussions").bold(),
            r.unresolved_discussions.len()
        );
        for disc in &r.unresolved_discussions {
            println!(
                "    {} {} {}  {}",
                style(&disc.entity_type).dim(),
                style(&disc.ref_).cyan(),
                truncate_str(&disc.entity_title, 35),
                style(format_relative_time(disc.last_note_at)).dim(),
            );
        }
        if r.unresolved_discussions_truncated {
            println!("    {}", style("(truncated; rerun with a higher --limit)").dim());
        }
    }

    if r.assigned_issues.is_empty()
        && r.authored_mrs.is_empty()
        && r.reviewing_mrs.is_empty()
        && r.unresolved_discussions.is_empty()
    {
        println!();
        println!(
            "  {}",
            style("No open work items found for this user.").dim()
        );
    }

    println!();
}

fn print_reviews_human(r: &ReviewsResult) {
    println!();
    println!(
        "{}",
        style(format!("@{} -- Review Patterns", r.username)).bold()
    );
    println!("{}", "─".repeat(60));
    println!();

    if r.total_diffnotes == 0 {
        println!(
            "  {}",
            style("No review comments found for this user.").dim()
        );
        println!();
        return;
    }

    println!(
        "  {} DiffNotes across {} MRs ({} categorized)",
        style(r.total_diffnotes).bold(),
        style(r.mrs_reviewed).bold(),
        style(r.categorized_count).bold(),
    );
    println!();

    if !r.categories.is_empty() {
        println!(
            "  {:<16} {:>6} {:>6}",
            style("Category").bold(),
            style("Count").bold(),
            style("%").bold(),
        );

        for cat in &r.categories {
            println!(
                "  {:<16} {:>6} {:>5.1}%",
                style(&cat.name).cyan(),
                cat.count,
                cat.percentage,
            );
        }
    }

    let uncategorized = r.total_diffnotes - r.categorized_count;
    if uncategorized > 0 {
        println!();
        println!(
            "  {} {} uncategorized (no **prefix** convention)",
            style("Note:").dim(),
            uncategorized,
        );
    }

    println!();
}

fn print_active_human(r: &ActiveResult, project_path: Option<&str>) {
    println!();
    println!(
        "{}",
        style(format!(
            "Active Discussions ({} unresolved in window)",
            r.total_unresolved_in_window
        ))
        .bold()
    );
    println!("{}", "─".repeat(60));
    print_scope_hint(project_path);
    println!();

    if r.discussions.is_empty() {
        println!(
            "  {}",
            style("No active unresolved discussions in this time window.").dim()
        );
        println!();
        return;
    }

    for disc in &r.discussions {
        let prefix = if disc.entity_type == "MR" { "!" } else { "#" };
        let participants_str = disc
            .participants
            .iter()
            .map(|p| format!("@{p}"))
            .collect::<Vec<_>>()
            .join(", ");

        println!(
            "  {} {} {}  {} notes  {}",
            style(format!("{}{}", prefix, disc.entity_iid)).cyan(),
            truncate_str(&disc.entity_title, 40),
            style(format_relative_time(disc.last_note_at)).dim(),
            disc.note_count,
            style(&disc.project_path).dim(),
        );
        if !participants_str.is_empty() {
            println!("    {}", style(participants_str).dim());
        }
    }
    if r.truncated {
        println!("  {}", style("(truncated; rerun with a higher --limit)").dim());
    }
    println!();
}

fn print_overlap_human(r: &OverlapResult, project_path: Option<&str>) {
    println!();
    println!(
        "{}",
        style(format!("Overlap for {}", r.path_query)).bold()
    );
    println!("{}", "─".repeat(60));
    println!(
        "  {}",
        style(format!("(matching {} {})", r.path_match, if r.path_match == "exact" { "file" } else { "directory prefix" })).dim()
    );
    print_scope_hint(project_path);
    println!();

    if r.users.is_empty() {
        println!(
            "  {}",
            style("No overlapping users found for this path.").dim()
        );
        println!();
        return;
    }

    println!(
        "  {:<16} {:<6} {:>7}  {:<12}  {}",
        style("Username").bold(),
        style("Role").bold(),
        style("MRs").bold(),
        style("Last Seen").bold(),
        style("MR Refs").bold(),
    );

    for user in &r.users {
        let mr_str = user
            .mr_refs
            .iter()
            .take(5)
            .cloned()
            .collect::<Vec<_>>()
            .join(", ");
        let overflow = if user.mr_refs.len() > 5 {
            format!(" +{}", user.mr_refs.len() - 5)
        } else {
            String::new()
        };

        println!(
            "  {:<16} {:<6} {:>7}  {:<12}  {}{}",
            style(format!("@{}", user.username)).cyan(),
            format_overlap_role(user),
            user.touch_count,
            format_relative_time(user.last_seen_at),
            mr_str,
            overflow,
        );
    }
    if r.truncated {
        println!("  {}", style("(truncated; rerun with a higher --limit)").dim());
    }
    println!();
}

Human output multi-project clarity (from feedback-3 Change 7, enhanced in feedback-5 Change 5): When not project-scoped, #42 and !100 aren't unique. The workload output now uses canonical refs (group/project#iid for issues, group/project!iid for MRs) — the same token is copy-pasteable by humans and directly parseable by agents. This eliminates the need for a separate project_path column since the ref already contains the project context. Robot JSON includes both ref and project_path for flexibility. Overlap output uses the project-qualified mr_refs directly.
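
To illustrate why the canonical ref token is directly parseable by agents, here is a hypothetical parser (not part of the plan's code) for refs like group/project#42 and group/project!100:

```rust
/// Hypothetical agent-side parser for canonical refs:
/// "group/project#42" (issue) or "group/project!100" (MR).
/// Returns (project_path, kind_char, iid) on success.
fn parse_ref(r: &str) -> Option<(&str, char, u64)> {
    let sep_idx = r.find(|c: char| c == '#' || c == '!')?;
    let (project, rest) = r.split_at(sep_idx);
    let kind = rest.chars().next()?;
    let iid: u64 = rest[1..].parse().ok()?;
    Some((project, kind, iid))
}

fn main() {
    assert_eq!(parse_ref("group/project!100"), Some(("group/project", '!', 100)));
    assert_eq!(parse_ref("group/project#42"), Some(("group/project", '#', 42)));
    assert_eq!(parse_ref("no-separator"), None);
}
```

A bare !100 would require out-of-band project context; the qualified form does not.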

Robot JSON Output

pub fn print_who_json(run: &WhoRun, args: &WhoArgs, elapsed_ms: u64) {
    let (mode, data) = match &run.result {
        WhoResult::Expert(r) => ("expert", expert_to_json(r)),
        WhoResult::Workload(r) => ("workload", workload_to_json(r)),
        WhoResult::Reviews(r) => ("reviews", reviews_to_json(r)),
        WhoResult::Active(r) => ("active", active_to_json(r)),
        WhoResult::Overlap(r) => ("overlap", overlap_to_json(r)),
    };

    // Raw CLI args — what the user typed
    let input = serde_json::json!({
        "target": args.target,
        "path": args.path,
        "project": args.project,
        "since": args.since,
        "limit": args.limit,
    });

    // Resolved/computed values — what actually ran
    let resolved_input = serde_json::json!({
        "mode": run.resolved_input.mode,
        "project_id": run.resolved_input.project_id,
        "project_path": run.resolved_input.project_path,
        "since_ms": run.resolved_input.since_ms,
        "since_iso": run.resolved_input.since_iso,
        "since_mode": run.resolved_input.since_mode,
        "limit": run.resolved_input.limit,
    });

    let output = WhoJsonEnvelope {
        ok: true,
        data: WhoJsonData {
            mode: mode.to_string(),
            input,
            resolved_input,
            result: data,
        },
        meta: RobotMeta { elapsed_ms },
    };

    println!("{}", serde_json::to_string(&output).unwrap());
}

#[derive(Serialize)]
struct WhoJsonEnvelope {
    ok: bool,
    data: WhoJsonData,
    meta: RobotMeta,
}

#[derive(Serialize)]
struct WhoJsonData {
    mode: String,
    input: serde_json::Value,
    resolved_input: serde_json::Value,
    #[serde(flatten)]
    result: serde_json::Value,
}

fn expert_to_json(r: &ExpertResult) -> serde_json::Value {
    serde_json::json!({
        "path_query": r.path_query,
        "path_match": r.path_match,
        "truncated": r.truncated,
        "experts": r.experts.iter().map(|e| serde_json::json!({
            "username": e.username,
            "score": e.score,
            "review_mr_count": e.review_mr_count,
            "review_note_count": e.review_note_count,
            "author_mr_count": e.author_mr_count,
            "last_seen_at": ms_to_iso(e.last_seen_ms),
        })).collect::<Vec<_>>(),
    })
}

fn workload_to_json(r: &WorkloadResult) -> serde_json::Value {
    serde_json::json!({
        "username": r.username,
        "assigned_issues": r.assigned_issues.iter().map(|i| serde_json::json!({
            "iid": i.iid,
            "ref": i.ref_,
            "title": i.title,
            "project_path": i.project_path,
            "updated_at": ms_to_iso(i.updated_at),
        })).collect::<Vec<_>>(),
        "authored_mrs": r.authored_mrs.iter().map(|m| serde_json::json!({
            "iid": m.iid,
            "ref": m.ref_,
            "title": m.title,
            "draft": m.draft,
            "project_path": m.project_path,
            "updated_at": ms_to_iso(m.updated_at),
        })).collect::<Vec<_>>(),
        "reviewing_mrs": r.reviewing_mrs.iter().map(|m| serde_json::json!({
            "iid": m.iid,
            "ref": m.ref_,
            "title": m.title,
            "draft": m.draft,
            "project_path": m.project_path,
            "author_username": m.author_username,
            "updated_at": ms_to_iso(m.updated_at),
        })).collect::<Vec<_>>(),
        "unresolved_discussions": r.unresolved_discussions.iter().map(|d| serde_json::json!({
            "entity_type": d.entity_type,
            "entity_iid": d.entity_iid,
            "ref": d.ref_,
            "entity_title": d.entity_title,
            "project_path": d.project_path,
            "last_note_at": ms_to_iso(d.last_note_at),
        })).collect::<Vec<_>>(),
        "summary": {
            "assigned_issue_count": r.assigned_issues.len(),
            "authored_mr_count": r.authored_mrs.len(),
            "reviewing_mr_count": r.reviewing_mrs.len(),
            "unresolved_discussion_count": r.unresolved_discussions.len(),
        },
        "truncation": {
            "assigned_issues_truncated": r.assigned_issues_truncated,
            "authored_mrs_truncated": r.authored_mrs_truncated,
            "reviewing_mrs_truncated": r.reviewing_mrs_truncated,
            "unresolved_discussions_truncated": r.unresolved_discussions_truncated,
        }
    })
}

fn reviews_to_json(r: &ReviewsResult) -> serde_json::Value {
    serde_json::json!({
        "username": r.username,
        "total_diffnotes": r.total_diffnotes,
        "categorized_count": r.categorized_count,
        "mrs_reviewed": r.mrs_reviewed,
        "categories": r.categories.iter().map(|c| serde_json::json!({
            "name": c.name,
            "count": c.count,
            "percentage": (c.percentage * 10.0).round() / 10.0,
        })).collect::<Vec<_>>(),
    })
}

fn active_to_json(r: &ActiveResult) -> serde_json::Value {
    serde_json::json!({
        "total_unresolved_in_window": r.total_unresolved_in_window,
        "truncated": r.truncated,
        "discussions": r.discussions.iter().map(|d| serde_json::json!({
            "discussion_id": d.discussion_id,
            "entity_type": d.entity_type,
            "entity_iid": d.entity_iid,
            "entity_title": d.entity_title,
            "project_path": d.project_path,
            "last_note_at": ms_to_iso(d.last_note_at),
            "note_count": d.note_count,
            "participants": d.participants,
            "participants_total": d.participants_total,
            "participants_truncated": d.participants_truncated,
        })).collect::<Vec<_>>(),
    })
}

fn overlap_to_json(r: &OverlapResult) -> serde_json::Value {
    serde_json::json!({
        "path_query": r.path_query,
        "path_match": r.path_match,
        "truncated": r.truncated,
        "users": r.users.iter().map(|u| serde_json::json!({
            "username": u.username,
            "role": format_overlap_role(u),
            "author_touch_count": u.author_touch_count,
            "review_touch_count": u.review_touch_count,
            "touch_count": u.touch_count,
            "last_seen_at": ms_to_iso(u.last_seen_at),
            "mr_refs": u.mr_refs,
            "mr_refs_total": u.mr_refs_total,
            "mr_refs_truncated": u.mr_refs_truncated,
        })).collect::<Vec<_>>(),
    })
}

Robot JSON dual-envelope design: The JSON output now includes BOTH input (raw CLI args) AND resolved_input (computed values). This serves two distinct purposes:

  • input lets agents see what was typed (for error reporting, UI display)
  • resolved_input lets agents reproduce the exact query (fuzzy project resolution -> concrete project_id + project_path, default since -> concrete since_ms + since_iso, since_was_default flag)

This is a genuine improvement over the original plan, which only echoed raw args. An agent comparing runs across time needs the resolved values — fuzzy project matching can resolve differently as projects are added or renamed.

since_mode: Agents replaying intent need to know whether --since 6m was explicit ("explicit"), defaulted ("default"), or absent ("none" — Workload). The original boolean since_was_default was misleading for Workload mode, which has no default window: true incorrectly implied a default was applied. The tri-state eliminates this reproducibility gap.
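
A minimal sketch of the tri-state (the enum name and method are illustrative; the plan only fixes the three serialized string values):

```rust
/// How the time window for a run was determined.
#[derive(Debug)]
enum SinceMode {
    Explicit, // user passed --since on the CLI
    Default,  // mode applied its built-in default window
    None,     // mode has no window at all (e.g. Workload)
}

impl SinceMode {
    /// String emitted in resolved_input.since_mode.
    fn as_str(&self) -> &'static str {
        match self {
            SinceMode::Explicit => "explicit",
            SinceMode::Default => "default",
            SinceMode::None => "none",
        }
    }
}
```

With this shape, Workload reports `"none"` rather than a misleading `since_was_default: true`.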

discussion_id in active output: Agents need stable entity IDs to follow up on specific discussions. Without this, the only way to identify a discussion is entity_type + entity_iid + project_path, which is fragile and requires multi-field matching.

mr_refs instead of mr_iids in overlap output: Project-qualified references like group/project!123 are globally unique and directly actionable, unlike bare iids which are ambiguous in a multi-project database.
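
An agent consuming overlap output can split a canonical ref back into its parts; a sketch (the function is hypothetical, assuming only the `!` separator convention used above):

```rust
/// Split "group/project!123" into (project_path, iid).
/// Returns None for bare or malformed refs like "!123" or "123".
fn parse_mr_ref(r: &str) -> Option<(&str, i64)> {
    let (path, iid) = r.split_once('!')?;
    if path.is_empty() {
        return None; // bare iid with no project context
    }
    iid.parse().ok().map(|n| (path, n))
}
```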

Intentionally excluded from resolved_input: db_path. ChatGPT proposed including the database file path. This leaks internal filesystem structure without adding value — agents already know their config and db_path is invariant within a session.

Helper Functions (duplicated from list.rs)

fn format_relative_time(ms_epoch: i64) -> String {
    let now = now_ms();
    let diff = now - ms_epoch;

    if diff < 0 {
        return "in the future".to_string();
    }

    match diff {
        d if d < 60_000 => "just now".to_string(),
        d if d < 3_600_000 => format!("{} min ago", d / 60_000),
        d if d < 86_400_000 => {
            let n = d / 3_600_000;
            format!("{n} {} ago", if n == 1 { "hour" } else { "hours" })
        }
        d if d < 604_800_000 => {
            let n = d / 86_400_000;
            format!("{n} {} ago", if n == 1 { "day" } else { "days" })
        }
        d if d < 2_592_000_000 => {
            let n = d / 604_800_000;
            format!("{n} {} ago", if n == 1 { "week" } else { "weeks" })
        }
        _ => {
            let n = diff / 2_592_000_000;
            format!("{n} {} ago", if n == 1 { "month" } else { "months" })
        }
    }
}

fn truncate_str(s: &str, max: usize) -> String {
    if s.chars().count() <= max {
        s.to_owned()
    } else {
        let truncated: String = s.chars().take(max.saturating_sub(3)).collect();
        format!("{truncated}...")
    }
}

Tests

#[cfg(test)]
mod tests {
    use super::*;
    use crate::core::db::{create_connection, run_migrations};
    use std::path::Path;

    fn setup_test_db() -> Connection {
        let conn = create_connection(Path::new(":memory:")).unwrap();
        run_migrations(&conn).unwrap();
        conn
    }

    fn insert_project(conn: &Connection, id: i64, path: &str) {
        conn.execute(
            "INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url)
             VALUES (?1, ?2, ?3, ?4)",
            rusqlite::params![id, id * 100, path, format!("https://git.example.com/{}", path)],
        )
        .unwrap();
    }

    fn insert_mr(conn: &Connection, id: i64, project_id: i64, iid: i64, author: &str, state: &str) {
        conn.execute(
            "INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, author_username, state, last_seen_at, updated_at)
             VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9)",
            rusqlite::params![
                id, id * 10, project_id, iid,
                format!("MR {iid}"), author, state,
                now_ms(), now_ms()
            ],
        ).unwrap();
    }

    fn insert_issue(conn: &Connection, id: i64, project_id: i64, iid: i64, author: &str) {
        conn.execute(
            "INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, author_username, created_at, updated_at, last_seen_at)
             VALUES (?1, ?2, ?3, ?4, ?5, 'opened', ?6, ?7, ?8, ?9)",
            rusqlite::params![
                id, id * 10, project_id, iid,
                format!("Issue {iid}"), author,
                now_ms(), now_ms(), now_ms()
            ],
        ).unwrap();
    }

    fn insert_discussion(conn: &Connection, id: i64, project_id: i64, mr_id: Option<i64>, issue_id: Option<i64>, resolvable: bool, resolved: bool) {
        let noteable_type = if mr_id.is_some() { "MergeRequest" } else { "Issue" };
        conn.execute(
            "INSERT INTO discussions (id, gitlab_discussion_id, project_id, merge_request_id, issue_id, noteable_type, resolvable, resolved, last_seen_at, last_note_at)
             VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10)",
            rusqlite::params![
                id, format!("disc-{id}"), project_id, mr_id, issue_id,
                noteable_type,
                i32::from(resolvable), i32::from(resolved),
                now_ms(), now_ms()
            ],
        ).unwrap();
    }

    #[allow(clippy::too_many_arguments)]
    fn insert_diffnote(conn: &Connection, id: i64, discussion_id: i64, project_id: i64, author: &str, file_path: &str, body: &str) {
        conn.execute(
            "INSERT INTO notes (id, gitlab_id, discussion_id, project_id, note_type, is_system, author_username, body, created_at, updated_at, last_seen_at, position_new_path)
             VALUES (?1, ?2, ?3, ?4, 'DiffNote', 0, ?5, ?6, ?7, ?8, ?9, ?10)",
            rusqlite::params![
                id, id * 10, discussion_id, project_id,
                author, body,
                now_ms(), now_ms(), now_ms(), file_path
            ],
        ).unwrap();
    }

    fn insert_assignee(conn: &Connection, issue_id: i64, username: &str) {
        conn.execute(
            "INSERT INTO issue_assignees (issue_id, username) VALUES (?1, ?2)",
            rusqlite::params![issue_id, username],
        )
        .unwrap();
    }

    fn insert_reviewer(conn: &Connection, mr_id: i64, username: &str) {
        conn.execute(
            "INSERT INTO mr_reviewers (merge_request_id, username) VALUES (?1, ?2)",
            rusqlite::params![mr_id, username],
        )
        .unwrap();
    }

    #[test]
    fn test_is_file_path_discrimination() {
        // Contains '/' -> file path
        assert!(matches!(
            resolve_mode(&WhoArgs {
                target: Some("src/auth/".to_string()),
                path: None, active: false, overlap: None, reviews: false,
                since: None, project: None, limit: 20,
            }).unwrap(),
            WhoMode::Expert { .. }
        ));

        // No '/' -> username
        assert!(matches!(
            resolve_mode(&WhoArgs {
                target: Some("asmith".to_string()),
                path: None, active: false, overlap: None, reviews: false,
                since: None, project: None, limit: 20,
            }).unwrap(),
            WhoMode::Workload { .. }
        ));

        // With @ prefix -> username (stripped)
        assert!(matches!(
            resolve_mode(&WhoArgs {
                target: Some("@asmith".to_string()),
                path: None, active: false, overlap: None, reviews: false,
                since: None, project: None, limit: 20,
            }).unwrap(),
            WhoMode::Workload { .. }
        ));

        // --reviews flag -> reviews mode
        assert!(matches!(
            resolve_mode(&WhoArgs {
                target: Some("asmith".to_string()),
                path: None, active: false, overlap: None, reviews: true,
                since: None, project: None, limit: 20,
            }).unwrap(),
            WhoMode::Reviews { .. }
        ));

        // --path flag -> expert mode (handles root files)
        assert!(matches!(
            resolve_mode(&WhoArgs {
                target: None,
                path: Some("README.md".to_string()),
                active: false, overlap: None, reviews: false,
                since: None, project: None, limit: 20,
            }).unwrap(),
            WhoMode::Expert { .. }
        ));

        // --path flag with dotless file -> expert mode
        assert!(matches!(
            resolve_mode(&WhoArgs {
                target: None,
                path: Some("Makefile".to_string()),
                active: false, overlap: None, reviews: false,
                since: None, project: None, limit: 20,
            }).unwrap(),
            WhoMode::Expert { .. }
        ));
    }

    #[test]
    fn test_build_path_query() {
        let conn = setup_test_db();

        // Directory with trailing slash -> prefix
        let pq = build_path_query(&conn, "src/auth/", None).unwrap();
        assert_eq!(pq.value, "src/auth/%");
        assert!(pq.is_prefix);

        // Directory without trailing slash (no dot in last segment) -> prefix
        let pq = build_path_query(&conn, "src/auth", None).unwrap();
        assert_eq!(pq.value, "src/auth/%");
        assert!(pq.is_prefix);

        // File with extension -> exact
        let pq = build_path_query(&conn, "src/auth/login.rs", None).unwrap();
        assert_eq!(pq.value, "src/auth/login.rs");
        assert!(!pq.is_prefix);

        // Root file -> exact
        let pq = build_path_query(&conn, "README.md", None).unwrap();
        assert_eq!(pq.value, "README.md");
        assert!(!pq.is_prefix);

        // Directory with dots in non-leaf segment -> prefix
        let pq = build_path_query(&conn, ".github/workflows/", None).unwrap();
        assert_eq!(pq.value, ".github/workflows/%");
        assert!(pq.is_prefix);

        // Versioned directory path -> prefix
        let pq = build_path_query(&conn, "src/v1.2/auth/", None).unwrap();
        assert_eq!(pq.value, "src/v1.2/auth/%");
        assert!(pq.is_prefix);

        // Path with LIKE metacharacters -> prefix, escaped
        let pq = build_path_query(&conn, "src/test_files/", None).unwrap();
        assert_eq!(pq.value, "src/test\\_files/%");
        assert!(pq.is_prefix);

        // Dotless root file -> exact match (root path without '/')
        let pq = build_path_query(&conn, "Makefile", None).unwrap();
        assert_eq!(pq.value, "Makefile");
        assert!(!pq.is_prefix);

        let pq = build_path_query(&conn, "LICENSE", None).unwrap();
        assert_eq!(pq.value, "LICENSE");
        assert!(!pq.is_prefix);

        // Dotless root path with trailing '/' -> directory prefix (explicit override)
        let pq = build_path_query(&conn, "Makefile/", None).unwrap();
        assert_eq!(pq.value, "Makefile/%");
        assert!(pq.is_prefix);
    }

    #[test]
    fn test_escape_like() {
        assert_eq!(escape_like("normal/path"), "normal/path");
        assert_eq!(escape_like("has_underscore"), "has\\_underscore");
        assert_eq!(escape_like("has%percent"), "has\\%percent");
        assert_eq!(escape_like("has\\backslash"), "has\\\\backslash");
    }

    #[test]
    fn test_build_path_query_exact_does_not_escape() {
        let conn = setup_test_db();
        // '_' must NOT be escaped for exact match (=).
        // escape_like would produce "README\_with\_underscore.md" which
        // wouldn't match stored "README_with_underscore.md" via =.
        let pq = build_path_query(&conn, "README_with_underscore.md", None).unwrap();
        assert_eq!(pq.value, "README_with_underscore.md");
        assert!(!pq.is_prefix);
    }

    #[test]
    fn test_path_flag_dotless_root_file_is_exact() {
        let conn = setup_test_db();
        // --path Makefile must produce an exact match, not Makefile/%
        let pq = build_path_query(&conn, "Makefile", None).unwrap();
        assert_eq!(pq.value, "Makefile");
        assert!(!pq.is_prefix);

        let pq = build_path_query(&conn, "Dockerfile", None).unwrap();
        assert_eq!(pq.value, "Dockerfile");
        assert!(!pq.is_prefix);
    }

    #[test]
    fn test_expert_query() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_mr(&conn, 1, 1, 100, "author_a", "merged");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "reviewer_b", "src/auth/login.rs", "**suggestion**: use const");
        insert_diffnote(&conn, 2, 1, 1, "reviewer_b", "src/auth/login.rs", "**question**: why?");
        insert_diffnote(&conn, 3, 1, 1, "reviewer_c", "src/auth/session.rs", "looks good");

        let result = query_expert(&conn, "src/auth/", None, 0, 20).unwrap();
        assert_eq!(result.experts.len(), 3); // reviewer_b, reviewer_c, author_a
        assert_eq!(result.experts[0].username, "reviewer_b"); // highest score
    }

    #[test]
    fn test_workload_query() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_issue(&conn, 1, 1, 42, "someone_else");
        insert_assignee(&conn, 1, "dev_a");
        insert_mr(&conn, 1, 1, 100, "dev_a", "opened");

        let result = query_workload(&conn, "dev_a", None, None, 20).unwrap();
        assert_eq!(result.assigned_issues.len(), 1);
        assert_eq!(result.authored_mrs.len(), 1);
    }

    #[test]
    fn test_reviews_query() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_mr(&conn, 1, 1, 100, "author_a", "merged");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "reviewer_b", "src/foo.rs", "**suggestion**: refactor");
        insert_diffnote(&conn, 2, 1, 1, "reviewer_b", "src/bar.rs", "**question**: why?");
        insert_diffnote(&conn, 3, 1, 1, "reviewer_b", "src/baz.rs", "looks good");

        let result = query_reviews(&conn, "reviewer_b", None, 0).unwrap();
        assert_eq!(result.total_diffnotes, 3);
        assert_eq!(result.categorized_count, 2);
        assert_eq!(result.categories.len(), 2);
    }

    #[test]
    fn test_active_query() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_mr(&conn, 1, 1, 100, "author_a", "opened");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "reviewer_b", "src/foo.rs", "needs work");
        // Second note by same participant — note_count should be 2, participants still ["reviewer_b"]
        insert_diffnote(&conn, 2, 1, 1, "reviewer_b", "src/foo.rs", "follow-up");

        let result = query_active(&conn, None, 0, 20).unwrap();
        assert_eq!(result.total_unresolved_in_window, 1);
        assert_eq!(result.discussions.len(), 1);
        assert_eq!(result.discussions[0].participants, vec!["reviewer_b"]);
        // This was a regression in iteration 4: note_count was counting participants, not notes
        assert_eq!(result.discussions[0].note_count, 2);
        assert!(result.discussions[0].discussion_id > 0);
    }

    #[test]
    fn test_overlap_dual_roles() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        // User is both author of one MR and reviewer of another at same path
        insert_mr(&conn, 1, 1, 100, "dual_user", "opened");
        insert_mr(&conn, 2, 1, 200, "other_author", "opened");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_discussion(&conn, 2, 1, Some(2), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "someone", "src/auth/login.rs", "review of dual_user's MR");
        insert_diffnote(&conn, 2, 2, 1, "dual_user", "src/auth/login.rs", "dual_user reviewing other MR");

        let result = query_overlap(&conn, "src/auth/", None, 0, 20).unwrap();
        let dual = result.users.iter().find(|u| u.username == "dual_user").unwrap();
        assert!(dual.author_touch_count > 0);
        assert!(dual.review_touch_count > 0);
        assert_eq!(format_overlap_role(dual), "A+R");
        // MR refs should be project-qualified
        assert!(dual.mr_refs.iter().any(|r| r.contains("team/backend!")));
    }

    #[test]
    fn test_overlap_multi_project_mr_refs() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_project(&conn, 2, "team/frontend");
        insert_mr(&conn, 1, 1, 100, "author_a", "opened");
        insert_mr(&conn, 2, 2, 100, "author_a", "opened"); // Same iid, different project
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_discussion(&conn, 2, 2, Some(2), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "reviewer_x", "src/auth/login.rs", "review");
        insert_diffnote(&conn, 2, 2, 2, "reviewer_x", "src/auth/login.rs", "review");

        let result = query_overlap(&conn, "src/auth/", None, 0, 20).unwrap();
        let reviewer = result.users.iter().find(|u| u.username == "reviewer_x").unwrap();
        // Should have two distinct refs despite same iid
        assert!(reviewer.mr_refs.contains(&"team/backend!100".to_string()));
        assert!(reviewer.mr_refs.contains(&"team/frontend!100".to_string()));
    }

    #[test]
    fn test_normalize_review_prefix() {
        assert_eq!(normalize_review_prefix("suggestion"), "suggestion");
        assert_eq!(normalize_review_prefix("Suggestion:"), "suggestion");
        assert_eq!(normalize_review_prefix("suggestion (non-blocking):"), "suggestion");
        assert_eq!(normalize_review_prefix("Nitpick:"), "nit");
        assert_eq!(normalize_review_prefix("nit (non-blocking):"), "nit");
        assert_eq!(normalize_review_prefix("question"), "question");
        assert_eq!(normalize_review_prefix("TODO:"), "todo");
    }

    #[test]
    fn test_normalize_repo_path() {
        // Strips leading ./
        assert_eq!(normalize_repo_path("./src/foo/"), "src/foo/");
        // Strips leading /
        assert_eq!(normalize_repo_path("/src/foo/"), "src/foo/");
        // Strips leading ./ recursively
        assert_eq!(normalize_repo_path("././src/foo"), "src/foo");
        // Converts Windows backslashes when no forward slashes
        assert_eq!(normalize_repo_path("src\\foo\\bar.rs"), "src/foo/bar.rs");
        // Does NOT convert backslashes when forward slashes present (escaped regex etc.)
        assert_eq!(normalize_repo_path("src/foo\\bar"), "src/foo\\bar");
        // Collapses repeated //
        assert_eq!(normalize_repo_path("src//foo//bar/"), "src/foo/bar/");
        // Trims whitespace
        assert_eq!(normalize_repo_path("  src/foo/  "), "src/foo/");
        // Identity for clean paths
        assert_eq!(normalize_repo_path("src/foo/bar.rs"), "src/foo/bar.rs");
    }

    #[test]
    fn test_lookup_project_path() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        assert_eq!(lookup_project_path(&conn, 1).unwrap(), "team/backend");
    }

    #[test]
    fn test_build_path_query_dotless_subdir_file_uses_db_probe() {
        // Dotless file in subdirectory (src/Dockerfile) would normally be
        // treated as a directory. The DB probe detects it's actually a file.
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_mr(&conn, 1, 1, 100, "author_a", "opened");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "reviewer_b", "src/Dockerfile", "note");

        let pq = build_path_query(&conn, "src/Dockerfile", None).unwrap();
        assert_eq!(pq.value, "src/Dockerfile");
        assert!(!pq.is_prefix);

        // Same path without DB data -> falls through to prefix
        let conn2 = setup_test_db();
        let pq2 = build_path_query(&conn2, "src/Dockerfile", None).unwrap();
        assert_eq!(pq2.value, "src/Dockerfile/%");
        assert!(pq2.is_prefix);
    }

    #[test]
    fn test_build_path_query_probe_is_project_scoped() {
        // Path exists as a dotless file in project 1; project 2 should not
        // treat it as an exact file unless it exists there too.
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/a");
        insert_project(&conn, 2, "team/b");
        insert_mr(&conn, 1, 1, 10, "author_a", "opened");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "rev", "infra/Makefile", "note");

        // Unscoped: finds exact match in project 1 -> exact
        let pq_unscoped = build_path_query(&conn, "infra/Makefile", None).unwrap();
        assert!(!pq_unscoped.is_prefix);

        // Scoped to project 2: no data -> falls back to prefix
        let pq_scoped = build_path_query(&conn, "infra/Makefile", Some(2)).unwrap();
        assert!(pq_scoped.is_prefix);

        // Scoped to project 1: finds data -> exact
        let pq_scoped1 = build_path_query(&conn, "infra/Makefile", Some(1)).unwrap();
        assert!(!pq_scoped1.is_prefix);
    }

    #[test]
    fn test_expert_excludes_self_review_notes() {
        // MR author commenting on their own diff should not be counted as reviewer
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_mr(&conn, 1, 1, 100, "author_a", "opened");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        // author_a comments on their own MR diff (clarification)
        insert_diffnote(&conn, 1, 1, 1, "author_a", "src/auth/login.rs", "clarification");
        // reviewer_b also reviews
        insert_diffnote(&conn, 2, 1, 1, "reviewer_b", "src/auth/login.rs", "looks good");

        let result = query_expert(&conn, "src/auth/", None, 0, 20).unwrap();
        // author_a should appear as author only, not as reviewer
        let author = result.experts.iter().find(|e| e.username == "author_a").unwrap();
        assert_eq!(author.review_mr_count, 0);
        assert!(author.author_mr_count > 0);

        // reviewer_b should be a reviewer
        let reviewer = result.experts.iter().find(|e| e.username == "reviewer_b").unwrap();
        assert!(reviewer.review_mr_count > 0);
    }

    #[test]
    fn test_overlap_excludes_self_review_notes() {
        // MR author commenting on their own diff should not inflate reviewer counts
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_mr(&conn, 1, 1, 100, "author_a", "opened");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        // author_a comments on their own MR diff (clarification)
        insert_diffnote(&conn, 1, 1, 1, "author_a", "src/auth/login.rs", "clarification");

        let result = query_overlap(&conn, "src/auth/", None, 0, 20).unwrap();
        let u = result.users.iter().find(|u| u.username == "author_a");
        // Should NOT be credited as reviewer touch
        assert_eq!(u.map(|x| x.review_touch_count).unwrap_or(0), 0);
    }

    #[test]
    fn test_active_participants_sorted() {
        // Participants should be sorted alphabetically for deterministic output
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        insert_mr(&conn, 1, 1, 100, "author_a", "opened");
        insert_discussion(&conn, 1, 1, Some(1), None, true, false);
        insert_diffnote(&conn, 1, 1, 1, "zebra_user", "src/foo.rs", "note 1");
        insert_diffnote(&conn, 2, 1, 1, "alpha_user", "src/foo.rs", "note 2");

        let result = query_active(&conn, None, 0, 20).unwrap();
        assert_eq!(result.discussions[0].participants, vec!["alpha_user", "zebra_user"]);
    }

    #[test]
    fn test_expert_truncation() {
        let conn = setup_test_db();
        insert_project(&conn, 1, "team/backend");
        // Create 3 experts
        for i in 1..=3 {
            insert_mr(&conn, i, 1, 100 + i, &format!("author_{i}"), "opened");
            insert_discussion(&conn, i, 1, Some(i), None, true, false);
            insert_diffnote(&conn, i, i, 1, &format!("reviewer_{i}"), "src/auth/login.rs", "note");
        }

        // limit = 2, should return truncated = true
        let result = query_expert(&conn, "src/auth/", None, 0, 2).unwrap();
        assert!(result.truncated);
        assert_eq!(result.experts.len(), 2);

        // limit = 10, should return truncated = false
        let result = query_expert(&conn, "src/auth/", None, 0, 10).unwrap();
        assert!(!result.truncated);
    }
}

Implementation Order

| Step | What | Files |
| --- | --- | --- |
| 1 | Migration 017: composite indexes for who query paths (5 indexes including global active) | core/db.rs |
| 2 | CLI skeleton: WhoArgs + Commands::Who + dispatch + stub | cli/mod.rs, commands/mod.rs, main.rs |
| 3 | Mode resolution + normalize_repo_path() + path query helpers (project-scoped two-way DB probe, exact-match escaping fix, path_match output) + entry point (run_who with since_mode tri-state) | commands/who.rs |
| 4 | Workload mode (4 SELECT queries, uniform params, canonical ref_ fields, LIMIT+1 truncation on all 4 sections) | commands/who.rs |
| 5 | Active discussions mode (CTE-based: picked + note_counts + participants, sorted participants, truncation, bounded participants, two SQL variants global/scoped for index utilization) | commands/who.rs |
| 6 | Expert mode (CTE + MR-breadth scoring, self-review exclusion, MR state filter, two SQL variants for prefix/exact, truncation) | commands/who.rs |
| 7 | Overlap mode (dual role tracking, self-review exclusion, MR state filter, accumulator pattern, deterministic sort with tie-breakers, project-qualified mr_refs, bounded mr_refs, truncation) | commands/who.rs |
| 8 | Reviews mode (prefix extraction with ltrim() + normalization) | commands/who.rs |
| 9 | Human output for all 5 modes (canonical refs, scope warning, project context, truncation hints) | commands/who.rs |
| 10 | Robot JSON output for all 5 modes (input + resolved_input + ref + since_mode + path_match + discussion_id + mr_refs + bounded metadata + workload truncation) | commands/who.rs |
| 11 | Tests (mode discrimination, path queries incl. project-scoped probe + exact-no-escape + root-exact, LIKE escaping, path normalization, queries, self-review exclusion, note_count correctness, sorted participants, overlap dual roles, multi-project mr_refs, truncation, project path lookup) | commands/who.rs |
| 12 | VALID_COMMANDS + robot-docs manifest | main.rs |
| 13 | cargo check + clippy + fmt | (verification) |
| 14 | EXPLAIN QUERY PLAN verification for each mode (including global AND scoped active index variants) | (verification) |
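The LIMIT+1 truncation pattern used across steps 4–7 can be sketched as follows. This is a minimal illustration with a hypothetical helper name; the real code binds `limit + 1` into the SQL `LIMIT` parameter and the rows come from the mode queries.

```rust
/// Rows were fetched with `LIMIT ?` bound to `limit + 1`. If more than
/// `limit` rows came back, the extra row proves the result set was cut
/// off; trim to the requested limit and report truncation.
fn apply_limit_plus_one<T>(mut rows: Vec<T>, limit: usize) -> (Vec<T>, bool) {
    let truncated = rows.len() > limit;
    rows.truncate(limit);
    (rows, truncated)
}
```

The same helper serves Expert, Active, Overlap, and each Workload section, so the truncation contract is uniform across modes.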

Verification

# Compile + lint
cargo check --all-targets
cargo clippy --all-targets -- -D warnings
cargo fmt --check

# Run tests
cargo test

# Manual verification against real data
cargo run --release -- who src/features/global-search/
cargo run --release -- who @asmith
cargo run --release -- who @asmith --reviews
cargo run --release -- who --active
cargo run --release -- who --active --since 30d
cargo run --release -- who --overlap libs/shared-frontend/src/features/global-search/
cargo run --release -- who --path README.md
cargo run --release -- who --path Makefile
cargo run --release -- -J who src/features/global-search/   # robot mode
cargo run --release -- -J who @asmith                       # robot mode
cargo run --release -- -J who @asmith --reviews             # robot mode
cargo run --release -- who src/features/global-search/ -p typescript  # project scoped

# Performance verification (required before merge):
# Confirm indexes are used as intended. One representative query per mode.
sqlite3 path/to/db.sqlite "
  EXPLAIN QUERY PLAN
  SELECT n.author_username, COUNT(*), MAX(n.created_at)
  FROM notes n
  WHERE n.note_type = 'DiffNote'
    AND n.is_system = 0
    AND n.position_new_path LIKE 'src/features/global-search/%' ESCAPE '\'
    AND n.created_at >= 0
  GROUP BY n.author_username;
"
# Expected: SEARCH notes USING INDEX idx_notes_diffnote_path_created

sqlite3 path/to/db.sqlite "
  EXPLAIN QUERY PLAN
  SELECT d.id, d.last_note_at
  FROM discussions d
  WHERE d.resolvable = 1 AND d.resolved = 0
    AND d.last_note_at >= 0
  ORDER BY d.last_note_at DESC
  LIMIT 20;
"
# Expected: SEARCH discussions USING INDEX idx_discussions_unresolved_recent_global
# (uses the global index because project_id is unconstrained)

sqlite3 path/to/db.sqlite "
  EXPLAIN QUERY PLAN
  SELECT d.id, d.last_note_at
  FROM discussions d
  WHERE d.resolvable = 1 AND d.resolved = 0
    AND d.project_id = 1
    AND d.last_note_at >= 0
  ORDER BY d.last_note_at DESC
  LIMIT 20;
"
# Expected: SEARCH discussions USING INDEX idx_discussions_unresolved_recent
# (uses the project-scoped index because project_id is constrained)
#
# NOTE: The Active mode query now uses two static SQL variants (sql_global
# and sql_scoped) instead of the nullable-OR pattern. Both EXPLAIN QUERY PLAN
# checks above should confirm that each variant uses the intended index.
# If nullable-OR were used instead, the planner might choose a suboptimal
# "good enough for both" plan.
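The variant-selection pattern described in the note above can be sketched as follows. This is a minimal illustration: the SQL strings are abbreviated from the EXPLAIN samples, not the real CTE-based Active query, and `active_sql` is a hypothetical helper name.

```rust
// Two fully static SQL strings: each shows the planner an unambiguous
// predicate, so it can commit to the matching partial index at prepare time.
const SQL_GLOBAL: &str = "SELECT id, last_note_at FROM discussions WHERE resolvable = 1 AND resolved = 0 AND last_note_at >= ?1 ORDER BY last_note_at DESC LIMIT ?2";
const SQL_SCOPED: &str = "SELECT id, last_note_at FROM discussions WHERE resolvable = 1 AND resolved = 0 AND project_id = ?3 AND last_note_at >= ?1 ORDER BY last_note_at DESC LIMIT ?2";

/// Select exactly one variant per invocation (statement-cache discipline:
/// only the statement that will actually run is prepared/cached).
fn active_sql(project_id: Option<i64>) -> &'static str {
    match project_id {
        None => SQL_GLOBAL,
        Some(_) => SQL_SCOPED,
    }
}
```

Both constants are independently cacheable via `prepare_cached()`, and neither contains a predicate the planner has to hedge against.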

Revision Changelog

Iteration 1 (2026-02-07): Initial plan incorporating ChatGPT feedback-1

Changes integrated from first external review:

| # | Change | Origin | Impact |
| --- | --- | --- | --- |
| 1 | Fix GROUP_CONCAT(DISTINCT x, sep) -> subquery pattern | ChatGPT feedback-1 (Change 1) | Bug fix: prevents SQLite runtime error |
| 2 | Segment-aware path classification + --path flag | ChatGPT feedback-1 (Change 2) | Bug fix: .github/, v1.2/, root files now work |
| 3 | (?N IS NULL OR ...) nullable binding pattern | ChatGPT feedback-1 (Change 3) | Simplification: matches existing timeline_seed.rs pattern, eliminates dynamic param vectors |
| 4 | Filter path queries to DiffNote + LIKE escaping | ChatGPT feedback-1 (Change 4) | Correctness + perf: fewer rows scanned, no metachar edge cases |
| 5 | Use n.created_at for author branch timestamps | ChatGPT feedback-1 (Change 5) | Accuracy: anchors "last touch" to review activity, not MR metadata changes |
| 6 | Consistent --since via nullable binding on all workload queries | ChatGPT feedback-1 (Change 6), modified | Consistency: --since applies uniformly when provided, no-op when omitted |
| 7 | Dual role tracking in Overlap (A, R, A+R) | ChatGPT feedback-1 (Change 7) | Information: preserves reviewer+author signal instead of collapsing |
| 8 | Migration 017: composite indexes for hot paths | ChatGPT feedback-1 (Change 8) | Performance: makes queries usable on 280K notes |
| 9 | input object in robot JSON envelope | ChatGPT feedback-1 (Change 9) | Reproducibility: agents can trace exactly what query ran |
| 10 | Clap conflicts_with / requires constraints | ChatGPT feedback-1 (Change 10) | UX: invalid combos caught at parse time with good error messages |
| 11 | escape_like() + build_path_pattern() helpers | Original + ChatGPT synthesis | Correctness: centralized, tested path handling |
| 12 | Additional tests: build_path_pattern, escape_like, overlap dual roles | New | Coverage: validates new path logic and role tracking |

Iteration 1.1 (2026-02-07): Incorporating ChatGPT feedback-2

Changes integrated from second external review:

| # | Change | Origin | Impact |
| --- | --- | --- | --- |
| 13 | resolved_input in robot JSON (mode, project_id, project_path, since_ms, since_iso) | ChatGPT feedback-2 (Change 1) | Reproducibility: agents get computed values, not just raw args; fuzzy project resolution becomes stable |
| 14 | WhoRun wrapper struct carrying resolved inputs + result | ChatGPT feedback-2 (Change 1) | Architecture: resolved values computed once in run_who, propagated cleanly to JSON output |
| 15 | lookup_project_path() helper for resolved project display | ChatGPT feedback-2 (Change 1) | Completeness: project_path in resolved_input needs a DB lookup |
| 16 | Parameterize LIMIT as ?N — eliminate ALL format!() in SQL | ChatGPT feedback-2 (Change 2) | Consistency: aligns with "fully static SQL" principle; enables prepared statement caching |
| 17 | is_system = 0 on Expert/Overlap author branches | ChatGPT feedback-2 (Change 4) | Consistency: system note exclusion now uniform across all DiffNote queries |
| 18 | Migration 017 index reordering: position_new_path leads (not note_type) | ChatGPT feedback-2 (Change 5a) | Performance: partial index predicate already handles note_type; leading column should be most selective variable predicate |
| 19 | New idx_notes_discussion_author composite index | ChatGPT feedback-2 (Change 5b) | Performance: covers EXISTS subquery in Workload and participants subquery in Active |
| 20 | Partial index on DiffNote index includes is_system = 0 | ChatGPT feedback-2 (Change 5a), synthesized | Performance: index covers the full predicate, not just note_type |
| 21 | --path help text updated to mention dotless files (Makefile, LICENSE) | ChatGPT feedback-2 (Change 3), adapted | UX: clear documentation of the limitation and escape hatch |
| 22 | Test for lookup_project_path + dotless file build_path_pattern | New | Coverage: validates new helper and documents known behavior |

Iteration 2 (2026-02-07): Incorporating ChatGPT feedback-3

This iteration integrates the strongest improvements from the third external review, focused on query plan quality, multi-project correctness, reproducibility, and performance.

Honest assessment of what feedback-3 got right:

The third review identified several real deficiencies in the plan. The project scoping bug (m.project_id vs n.project_id) was a genuine correctness issue that would have degraded index usage. The stable MR refs proposal addressed a real ambiguity in multi-project databases. The CTE-based Active query was structurally cleaner. And prepare_cached() was a free win we should have had from the start.

| # | Change | Origin | Impact |
| --- | --- | --- | --- |
| 23 | Fix project scoping to n.project_id in Expert/Overlap author branches | ChatGPT feedback-3 (Change 1) | Correctness + perf: scoping on n.project_id lets SQLite use idx_notes_diffnote_path_created (which has project_id as a trailing column). m.project_id forces a broader join before filtering. |
| 24 | PathQuery struct with is_prefix flag for exact vs prefix matching | ChatGPT feedback-3 (Change 2), adapted | Performance: exact file paths use = (stronger planner signal) instead of LIKE without wildcards. Two static SQL strings selected at prepare time — no dynamic SQL. |
| 25 | SQL-level Expert aggregation via CTE | ChatGPT feedback-3 (Change 3), adapted | Performance + simplicity: scoring, sorting, and LIMIT all happen in SQL. Eliminates Rust-side HashMap merge/sort/truncate. Makes SQL LIMIT effective (previously meaningless due to post-query aggregation). Deterministic tiebreaker (last_seen_at DESC, username ASC). |
| 26 | Project-qualified MR refs (group/project!iid) in Overlap | ChatGPT feedback-3 (Change 4) | Multi-project correctness: bare !123 is ambiguous. JOIN to projects table produces globally unique, actionable references. |
| 27 | HashSet dedup for Overlap MR refs | ChatGPT feedback-3 (Change 4) | Performance: replaces O(n^2) Vec.contains() with O(1) HashSet.insert(). |
| 28 | CTE-based Active query (eliminate correlated subqueries) | ChatGPT feedback-3 (Change 5) | Performance: picked CTE selects limited discussions first, note_agg CTE joins notes once. Avoids per-row correlated subqueries for note_count and participants. |
| 29 | prepare_cached() on all query functions | ChatGPT feedback-3 (Change 6) | Performance: free win since all SQL is static. Statement cache avoids re-parsing. |
| 30 | Project path shown in human output (Workload + Overlap) | ChatGPT feedback-3 (Change 7) | Multi-project UX: #42 and !100 aren't unique without project context. Human output now shows project_path for issues, MRs, and overlap. |
| 31 | since_was_default in resolved_input | ChatGPT feedback-3 (Change 8) | Reproducibility: agents can distinguish explicit --since 6m from defaulted. Important for intent replay. |
| 32 | discussion_id in ActiveDiscussion | ChatGPT feedback-3 (Change 8) | Agent ergonomics: stable entity ID for follow-up. Without this, discussions require fragile multi-field matching. |
| 33 | EXPLAIN QUERY PLAN verification step | ChatGPT feedback-3 (Change 9) | Operational: confirms indexes are actually used. One representative query per mode. |
| 34 | Multi-project overlap test (test_overlap_multi_project_mr_refs) | New | Coverage: validates project-qualified MR refs with same iid in different projects. |

Decisions NOT adopted from feedback-3

| Proposal | Why not |
| --- | --- |
| ?4 = 1 AND ... OR ?4 = 0 AND ... conditional path matching in single SQL | Adds a runtime conditional branch inside SQL that the planner can't optimize statically. Two separate SQL strings selected at prepare time is cleaner — the planner sees = or LIKE unambiguously, and both strings are independently cacheable. |
| Fully SQL-aggregated Overlap mode | Overlap still needs Rust-side merge because the UNION ALL returns rows per-role per-user, and we need to combine author + reviewer counts + deduplicate MR refs. The CTE approach works for Expert (simple numeric aggregation), but Overlap's GROUP_CONCAT of refs across branches requires set operations that are awkward in pure SQL. |

Decisions NOT adopted from feedback-2

| Proposal | Why not |
| --- | --- |
| PathMatch struct with dual exact+prefix SQL matching | Adds branching complexity (?4 = 1 AND ... OR ...) to every path query for a narrow edge case (dotless root files). --path handles this cleanly with zero query complexity. Revisit if real usage shows confusion. |
| db_path in resolved_input | Leaks internal filesystem details. Agents already know their config. No debugging value over project_path. |

Feedback-2 items truncated (file cut off at line 302)

ChatGPT feedback-2 was truncated mid-sentence during index redesign discussion (Change 5c about discussions index column ordering). The surviving feedback was fully evaluated and integrated where appropriate. The truncated portion appeared to be discussing (project_id, last_note_at) ordering for the discussions index, which was already adopted in the migration redesign above.

Iteration 5 (2026-02-07): Incorporating ChatGPT feedback-5

Honest assessment of what feedback-5 got right:

Feedback-5 identified two genuine correctness bugs that would have shipped broken behavior. First, build_path_query applied escape_like() even for exact matches — LIKE metacharacters (_, %, \) are not special in = comparisons, so the escaped value wouldn't match the stored path. Second, the Active mode note_agg CTE used SELECT DISTINCT discussion_id, author_username then COUNT(*), producing a participant count instead of a note count. Both would have been caught in testing, but catching them in the plan is cheaper.

Beyond bugs, feedback-5 correctly identified: the global Active index gap (existing (project_id, last_note_at) can't serve unscoped ordering), incoherent Overlap units (reviewer counting DiffNotes vs author counting MRs), and the value of canonical refs in Workload output. The ltrim() suggestion for Reviews was a practical robustness fix. The scope-warning suggestion was pure presentation but prevents misinterpretation in multi-project databases.
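The exact-vs-prefix escaping rule from Change 1 can be sketched like this. Helper names are hypothetical and the classification is simplified: the real build_path_query also runs the project-scoped DB probe added in iteration 7 for dotless subdirectory files.

```rust
/// Escape LIKE metacharacters so a literal path can appear in a LIKE
/// pattern with `ESCAPE '\'`. Exact (`=`) comparisons must use the raw
/// path — LIKE metacharacters are not special in `=`.
fn escape_like(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if matches!(c, '\\' | '%' | '_') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Returns (pattern, is_prefix). Trailing '/' means directory prefix
/// (LIKE, escaped); anything else is an exact file path (=, unescaped).
fn path_pattern(path: &str) -> (String, bool) {
    if path.ends_with('/') {
        (format!("{}%", escape_like(path)), true)
    } else {
        (path.to_string(), false)
    }
}
```

Note that a path like `a_b.rs` passes through unescaped for the `=` branch; escaping it there was exactly the bug Change 35 fixes.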

| # | Change | Origin | Impact |
| --- | --- | --- | --- |
| 35 | Fix build_path_query: don't escape_like() for exact match (=) | ChatGPT feedback-5 (Change 1) | Bug fix: LIKE metacharacters in filenames (_, %) would cause exact-match misses. Only escape for LIKE, use raw path for =. |
| 36 | Root paths (no /) via --path treated as exact match | ChatGPT feedback-5 (Change 1) | Bug fix: --path Makefile previously produced Makefile/% (prefix), returning zero results. Now produces Makefile (exact). |
| 37 | Active mode: split note_agg into note_counts + participants CTEs | ChatGPT feedback-5 (Change 2) | Bug fix: note_count was counting distinct participants, not notes. A discussion with 5 notes from 2 users showed note_count: 2. Now correctly shows note_count: 5. |
| 38 | Add idx_discussions_unresolved_recent_global index (single-column last_note_at) | ChatGPT feedback-5 (Change 3) | Performance: (project_id, last_note_at) can't satisfy ORDER BY last_note_at DESC when project_id is unconstrained. The global partial index handles the common lore who --active case. |
| 39 | Overlap reviewer branch: COUNT(*) -> COUNT(DISTINCT m.id) | ChatGPT feedback-5 (Change 4) | Correctness: reviewer counted DiffNotes, author counted MRs. Summing into touch_count mixed units. Both now count distinct MRs. Human output header changed from "Touches" to "MRs". |
| 40 | Canonical ref_ field on WorkloadIssue and WorkloadMr | ChatGPT feedback-5 (Change 5) | Actionability: group/project#iid / group/project!iid in both human and robot output. Single copy-pasteable token. Replaces separate iid + project_path display in human output. |
| 41 | ltrim(n.body) in Reviews category extraction SQL | ChatGPT feedback-5 (Change 6) | Robustness: tolerates leading whitespace before **prefix** (e.g. **suggestion**: ...), which is common in practice. |
| 42 | Scope warning in human output for Expert/Active/Overlap | ChatGPT feedback-5 (stretch) | UX: dim hint "(aggregated across all projects; use -p to scope)" when project_id is None. Prevents misinterpretation in multi-project databases. |
| 43 | test_build_path_query_exact_does_not_escape test | ChatGPT feedback-5 (Change 7) | Coverage: catches the exact-match escaping bug. |
| 44 | test_path_flag_dotless_root_file_is_exact test | ChatGPT feedback-5 (Change 7) | Coverage: ensures --path Makefile produces exact match, not prefix. |
| 45 | Active test_active_query extended to verify note_count correctness | ChatGPT feedback-5 (Change 7) | Coverage: second note by same participant must produce note_count: 2, catching the iteration-4 regression. |
| 46 | Updated test_build_path_query expectations for root files | Consequential | Consistency: Makefile/LICENSE now assert exact match, not prefix. Added trailing / override test. |

Decisions NOT adopted from feedback-5

| Proposal | Why not |
| --- | --- |
| (none rejected) | All feedback-5 changes were adopted. The feedback was tightly scoped, correctness-focused, and every change addressed a real deficiency. |

Iteration 6 (2026-02-07): Incorporating ChatGPT feedback-6

Honest assessment of what feedback-6 got right:

Feedback-6 identified six distinct improvements, all with genuine merit. The self-review exclusion (#2) was a data quality issue that already existed in Reviews mode but was inconsistently applied to Expert and Overlap — a real "write the same fix everywhere" oversight. The MR-breadth scoring (#3) addresses a signal quality problem: raw DiffNote counts reward comment volume over breadth of engagement, meaning someone who writes 30 comments on one MR would outrank someone who reviewed 5 different MRs. The DB probe for dotless subdir files (#1) closes the last heuristic gap without adding any user-facing complexity. The deterministic ordering (#4) and truncation signaling (#5) are robot-mode correctness issues that would cause real flakes in automated pipelines. The accumulator pattern (#6) is a clean O(n) improvement over the previous drain-rebuild-per-row approach.

The one area where feedback-6 was mostly right but needed adaptation was the scoring weights. The proposed review_mr_count * 20 + author_mr_count * 12 + review_note_count * 1 is directionally correct (MR breadth dominates, note count is secondary), but the specific coefficients are inherently arbitrary. We adopted the formula as-is since it produces reasonable rankings and can be tuned later with real data.
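The adopted formula can be written down directly. Field names here are illustrative, and the coefficients are the plan's admittedly arbitrary weights:

```rust
/// MR-breadth scoring: breadth of engagement (distinct MRs reviewed or
/// authored) dominates; raw note volume is only a secondary signal.
fn expert_score(review_mr_count: i64, author_mr_count: i64, review_note_count: i64) -> i64 {
    review_mr_count * 20 + author_mr_count * 12 + review_note_count
}
```

Under these weights, 30 comments on a single MR scores 50, while one note on each of 5 different MRs scores 105 — the "comment storm" no longer outranks genuine breadth.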

| # | Change | Origin | Impact |
| --- | --- | --- | --- |
| 47 | DB probe for dotless subdir files (src/Dockerfile, infra/Makefile) in build_path_query | ChatGPT feedback-6 (Change 1) | Correctness: dotless files in subdirectories were misclassified as directories, producing path/% (zero or wrong hits). DB probe uses existing partial index. |
| 48 | build_path_query signature: (path) -> PathQuery -> (conn, path) -> Result<PathQuery> | ChatGPT feedback-6 (Change 1), consequential | Architecture: DB probe requires connection. All callers updated. |
| 49 | Self-review exclusion in Expert reviewer branch | ChatGPT feedback-6 (Change 2) | Data quality: MR authors commenting on their own diffs were counted as reviewers, inflating scores. Now filtered n.author_username != m.author_username. |
| 50 | Self-review exclusion in Overlap reviewer branch | ChatGPT feedback-6 (Change 2) | Data quality: same fix for Overlap mode. Consistent with Reviews mode which already filtered m.author_username != ?1. |
| 51 | Expert reviewer branch: JOINs through discussions -> merge_requests | ChatGPT feedback-6 (Changes 2+3), consequential | Architecture: needed for both self-review exclusion and MR-breadth counting. Reviewer branch now matches the JOIN pattern of the author branch. |
| 52 | Expert scoring: MR-breadth primary (COUNT(DISTINCT m.id)) with note intensity secondary | ChatGPT feedback-6 (Change 3) | Signal quality: prevents "comment storm" gaming. Score formula: review_mr * 20 + author_mr * 12 + review_notes * 1. |
| 53 | Expert struct: review_count -> review_mr_count + review_note_count, author_count -> author_mr_count, score: f64 -> i64 | ChatGPT feedback-6 (Change 3), adapted | API clarity: distinct field names reflect distinct metrics. Integer score avoids floating-point display issues. |
| 54 | Deterministic participants ordering: sort after GROUP_CONCAT parse | ChatGPT feedback-6 (Change 4) | Robot reproducibility: GROUP_CONCAT order is undefined in SQLite. Sorting eliminates run-to-run flakes. |
| 55 | Deterministic mr_refs ordering: sort HashSet-derived vectors | ChatGPT feedback-6 (Change 4) | Robot reproducibility: HashSet iteration order is nondeterministic. Sorting produces stable output. |
| 56 | Overlap sort: full tie-breakers (touch_count DESC, last_seen_at DESC, username ASC) | ChatGPT feedback-6 (Change 4) | Robot reproducibility: users with equal touch_count now have a stable order instead of arbitrary HashMap iteration order. |
| 57 | Truncation signaling: truncated: bool on ExpertResult, ActiveResult, OverlapResult | ChatGPT feedback-6 (Change 5) | Observability: agents and humans know when results were cut off and can retry with higher --limit. |
| 58 | LIMIT + 1 query pattern for truncation detection | ChatGPT feedback-6 (Change 5) | Implementation: query one extra row, detect overflow, trim to requested limit. Applied to Expert, Active, Overlap. |
| 59 | Truncation hints in human output | ChatGPT feedback-6 (Change 5) | UX: dim hint "(showing first -n; rerun with a higher --limit)" when results are truncated. |
| 60 | truncated field in robot JSON for Expert, Active, Overlap | ChatGPT feedback-6 (Change 5) | Robot ergonomics: agents can programmatically detect truncation and adjust limit. |
| 61 | Overlap accumulator pattern: OverlapAcc with HashSet<String> for mr_refs | ChatGPT feedback-6 (Change 6) | Performance: replaces expensive drain -> collect -> insert -> collect per row with HashSet.insert() from the start. Convert to sorted Vec once at end. |
| 62 | Test: test_build_path_query_dotless_subdir_file_uses_db_probe | ChatGPT feedback-6 (Change 7) | Coverage: validates DB probe detects dotless subdir files, and falls through to prefix when no DB data. |
| 63 | Test: test_expert_excludes_self_review_notes | ChatGPT feedback-6 (Change 7) | Coverage: validates MR author DiffNotes on own MR don't inflate reviewer counts. |
| 64 | Test: test_overlap_excludes_self_review_notes | ChatGPT feedback-6 (Change 7) | Coverage: validates self-review exclusion in Overlap mode. |
| 65 | Test: test_active_participants_sorted | ChatGPT feedback-6 (Change 7) | Coverage: validates participants are alphabetically sorted for deterministic output. |
| 66 | Test: test_expert_truncation | ChatGPT feedback-6 (Change 7) | Coverage: validates truncation flag is set correctly when results exceed limit. |
| 67 | Updated human output headers: Reviews -> Reviewed(MRs), added Notes column | Consequential | Clarity: column headers now reflect the underlying metrics (MR count vs note count). |

Decisions NOT adopted from feedback-6

| Proposal | Why not |
| --- | --- |
| (none rejected) | All feedback-6 changes were adopted. Every proposal addressed a real deficiency — correctness (self-review exclusion, dotless files), signal quality (MR-breadth scoring), reproducibility (deterministic ordering, truncation), or performance (accumulator pattern). The feedback was exceptionally well-targeted. |

Iteration 7 (2026-02-07): Incorporating ChatGPT feedback-7

Honest assessment of what feedback-7 got right:

Feedback-7 identified six improvements, all with genuine merit and no rejected proposals. The strongest was the project-scoped path probe (#1): the original build_path_query probed the entire DB, meaning a path that exists as a dotless file in project A would force exact-match behavior even when running -p project_B where it doesn't exist. This is a real cross-project misclassification bug. The two-way probe (exact then prefix, both scoped) resolves it cleanly within the existing static-SQL constraint. The bounded payload proposal (#2, #3) addresses a real operational hazard — unbounded participants and mr_refs arrays in robot JSON can produce enormous payloads that break downstream pipelines. The "Last Seen" rename (#4) is a semantic accuracy fix — for author rows the timestamp comes from review notes on their MR, not their own action. The MR state filter on reviewer branches (#5) was an obvious consistency gap — author branches already filtered to opened|merged, but reviewer branches didn't. The design principle (#6) codifies the bounded payload contract.
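The bounded-payload contract from Changes 2 and 3 can be sketched with a hypothetical helper (the 50-item cap is the plan's; the real code applies this per field, e.g. participants and mr_refs):

```rust
/// Never emit an unbounded array in robot JSON: return the capped list
/// plus (total, truncated) metadata so agents can detect the cut-off.
fn bound_list(mut items: Vec<String>, cap: usize) -> (Vec<String>, usize, bool) {
    let total = items.len();
    let truncated = total > cap;
    items.truncate(cap);
    (items, total, truncated)
}
```

The `*_total` and `*_truncated` fields in the JSON output map directly onto the second and third return values.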

| # | Change | Origin | Impact |
| --- | --- | --- | --- |
| 68 | Project-scoped two-way probe in build_path_query(conn, path, project_id) | ChatGPT feedback-7 (Change 1) | Correctness: prevents cross-project misclassification. Exact probe and prefix probe both scoped via (?2 IS NULL OR project_id = ?2). Costs at most 2 indexed queries. |
| 69 | Bounded participants in ActiveDiscussion: participants_total + participants_truncated, cap at 50 | ChatGPT feedback-7 (Change 2) | Robot safety: prevents unbounded JSON arrays. Agents get metadata to detect truncation. |
| 70 | Bounded mr_refs in OverlapUser: mr_refs_total + mr_refs_truncated, cap at 50 | ChatGPT feedback-7 (Change 2) | Robot safety: same bounded payload pattern for MR refs. |
| 71 | Workload truncation metadata: *_truncated booleans on all 4 sections, LIMIT+1 pattern | ChatGPT feedback-7 (Change 3) | Consistency: Workload now has the same truncation contract as Expert/Active/Overlap. Silent truncation was the only mode without this. |
| 72 | Rename last_active_ms -> last_seen_ms (Expert), last_touch_at -> last_seen_at (Overlap) | ChatGPT feedback-7 (Change 4) | Semantic accuracy: for author rows, the timestamp is derived from review activity on their MR, not their direct action. "Last Seen" is accurate across both reviewer and author branches. |
| 73 | m.state IN ('opened','merged') on Expert/Overlap reviewer branches | ChatGPT feedback-7 (Change 5) | Signal quality: closed/unmerged MRs are noise in reviewer branches. Matches existing author branch filter. Consistent across all DiffNote queries. |
| 74 | Design principle #11: Bounded payloads | ChatGPT feedback-7 (Change 6) | Contract: codifies that robot JSON never emits unbounded arrays, with *_total + *_truncated metadata. |
| 75 | Truncation hints in Workload human output (per-section) | Consequential | UX: each Workload section now shows dim truncation hint when applicable. |
| 76 | Test: test_build_path_query_probe_is_project_scoped | ChatGPT feedback-7 (Change 1) | Coverage: validates cross-project probe isolation — data in project 1 doesn't force exact-match in project 2. |
| 77 | Updated SQL aliases: last_seen_at everywhere (was last_active_at / last_touch_at) | Consequential | Consistency: unified timestamp naming across Expert and Overlap internal SQL and output. |
| 78 | Robot JSON: truncation object in Workload output with per-section booleans | Consequential | Robot ergonomics: agents can detect which Workload sections were truncated. |

Decisions NOT adopted from feedback-7

| Proposal | Why not |
| --- | --- |
| (none rejected) | All feedback-7 changes were adopted. Every proposal addressed a real deficiency — correctness (project-scoped probes), operational safety (bounded payloads), consistency (workload truncation, MR state filtering), and semantic accuracy (timestamp naming). The feedback was well-scoped and practical. |

Iteration 8 (2026-02-07): Incorporating ChatGPT feedback-8

Honest assessment of what feedback-8 got right:

Feedback-8 identified seven improvements, all adopted. The strongest was the Active mode query variant split (#2): the (?N IS NULL OR d.project_id = ?N) nullable-OR pattern prevents SQLite from knowing at prepare time whether project_id is constrained, potentially choosing a suboptimal index. With two carefully crafted indexes (idx_discussions_unresolved_recent for scoped, idx_discussions_unresolved_recent_global for global), using nullable-OR undermines the very indexes we built. Splitting into two static SQL variants — selected at runtime via match project_id — ensures the planner sees unambiguous predicates and picks the intended index. This was the highest-impact correctness fix.

The since_mode tri-state (#1) fixed a genuine semantic bug: since_was_default = true was ambiguous for Workload mode (which has no default window, not a "default was applied" scenario). The path normalization (#3) prevents the most common "silent zero results" footgun (users pasting ./src/foo/ or /src/foo/). The path_match observability (#4) is low-cost debugging aid that helps both agents and humans understand exact-vs-prefix classification. The --limit hard bound (#5) is a safety valve against misconfigured agents producing enormous payloads. The total_unresolved_in_window rename (#6) eliminates an ambiguous field name that could be misread as "total unresolved globally." The statement cache discipline note (#7) codifies an implicit best practice.
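The normalization behavior described for normalize_repo_path can be sketched as follows. This is an assumption-level sketch of the rules named above (trim whitespace, convert `\` to `/`, strip leading `./` and `/`, collapse `//`); the actual implementation may differ in detail.

```rust
/// Defensive input normalization for repo-relative paths.
fn normalize_repo_path(input: &str) -> String {
    // Windows separators and stray whitespace first.
    let mut p = input.trim().replace('\\', "/");
    // Strip any number of leading "./" segments.
    while let Some(rest) = p.strip_prefix("./") {
        p = rest.to_string();
    }
    // Repo paths are relative: drop a leading "/".
    let mut p = p.trim_start_matches('/').to_string();
    // Collapse accidental doubled separators.
    while p.contains("//") {
        p = p.replace("//", "/");
    }
    p
}
```

A trailing `/` is deliberately preserved, since it is the user's exact-vs-prefix signal to the path classifier.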

| # | Change | Origin | Impact |
| --- | --- | --- | --- |
| 79 | since_mode tri-state replaces since_was_default boolean | ChatGPT feedback-8 (Change 1) | Semantic correctness: Workload's "no window" is distinct from "default applied". Tri-state ("default" / "explicit" / "none") eliminates robot-mode ambiguity for intent replay. |
| 80 | Active mode: split nullable-OR into two static SQL variants (global + scoped) | ChatGPT feedback-8 (Change 2) | Index utilization: ensures idx_discussions_unresolved_recent_global is used for global queries and idx_discussions_unresolved_recent for scoped. Nullable-OR prevented the planner from committing to either index at prepare time. Applied to both total count and main CTE query. |
| 81 | normalize_repo_path() input normalization for paths | ChatGPT feedback-8 (Change 3) | Defensive UX: strips ./, leading /, collapses //, converts \ -> / (Windows paste). Prevents silent zero-result footguns from common path copy-paste patterns. |
| 82 | WhoMode path variants own String (not &'a str) | ChatGPT feedback-8 (Change 3), consequential | Architecture: path normalization produces new strings; borrowing from args is no longer possible for path variants. Username variants still borrow. |
| 83 | path_match field in ExpertResult / OverlapResult + robot JSON | ChatGPT feedback-8 (Change 4) | Debuggability: "exact" or "prefix" in both human (dim hint line) and robot output. Helps diagnose zero-result cases caused by path classification. Minimal runtime cost. |
| 84 | Hard upper bound on --limit via value_parser!(u16).range(1..=500) | ChatGPT feedback-8 (Change 5) | Output safety: prevents --limit 50000 from producing enormous payloads, slow queries, or memory spikes. Clamped at the CLI boundary before execution. Type changed to u16. |
| 85 | Rename total_unresolved -> total_unresolved_in_window in ActiveResult | ChatGPT feedback-8 (Change 6) | Semantic clarity: the count is scoped to the time window, not a global total. Prevents misinterpretation in human header and robot JSON. |
| 86 | Design principle #1 amended: nullable-OR caveat + statement cache discipline | ChatGPT feedback-8 (Changes 2, 7) | Plan hygiene: codifies when to split SQL variants vs use nullable-OR, and mandates preparing exactly one statement per invocation when multiple variants exist. |
| 87 | Test: test_normalize_repo_path | ChatGPT feedback-8 (Change 3) | Coverage: validates ./, /, \\, //, whitespace normalization and identity for clean paths. |

Decisions NOT adopted from feedback-8

| Proposal | Why not |
| --- | --- |
| Box::leak(norm.into_boxed_str()) for path ownership in WhoMode | Memory leak (even if small and short-lived in a CLI). Changed WhoMode path variants to own String directly instead, which is clean and zero-cost for a CLI that exits after one invocation. |