Commit Graph

8 Commits

Author SHA1 Message Date
teernisse
acc5e12e3d perf: force partial index for DiffNote queries, batch stats counts
Query optimizer fixes for the `who` and `stats` commands based on
a systematic performance audit of the SQLite query plans.

who command (expert/reviews/detail modes):
- Add INDEXED BY idx_notes_diffnote_path_created hints to all DiffNote
  queries. SQLite's planner was selecting idx_notes_system (38% of rows)
  over the far more selective partial index (9.3% of rows). Measured
  50-133x speedup on expert queries, 26x on reviews queries.
- Reorder JOIN clauses in detail mode's MR-author sub-select to match
  the index scan direction (notes -> discussions -> merge_requests).

stats command:
- Replace 12+ sequential COUNT(*) queries with conditional aggregates
  (COALESCE + SUM + CASE). Documents, dirty_sources, pending_discussion_
  fetches, and pending_dependent_fetches tables each scanned once instead
  of 2-3 times. Measured 1.7x speedup (109ms -> 65ms warm cache).
- Switch FTS document count from COUNT(*) on the virtual table to
  COUNT(*) on documents_fts_docsize shadow table (B-tree scan vs FTS5
  virtual table overhead). Measured 19x speedup for that single query.

Database: 61652 docs, 282K notes, 211K discussions, 1.5GB.
2026-02-12 11:21:00 -05:00
Taylor Eernisse
d9f99ef21d feat(cli): status display/filtering, expanded --fields, and robot-docs --brief
Work item status integration across all CLI output:

Issue listing (lore list issues):
- New Status column appears when any issue has status data, with
  hex-color rendering using ANSI 256-color approximation
- New --status flag for case-insensitive filtering (OR logic for
  multiple values): lore issues --status "In progress" --status "To do"
- Status fields (name, category, color, icon_name, synced_at) in issue
  list query and JSON output with conditional serialization

Issue detail (lore show issue):
- Displays "Status: In progress (in_progress)" with color-coded output
  using ANSI 256-color approximation from hex color values
- Status fields included in robot mode JSON with ISO timestamps
- IssueRow, IssueDetail, IssueDetailJson all carry status columns

Robot mode field selection expanded to new commands:
- search: --fields with "minimal" preset (document_id, title, source_type, score)
- timeline: --fields with "minimal" preset (timestamp, type, entity_iid, detail)
- who: --fields with per-mode presets (expert_minimal, workload_minimal, etc.)
- robot-docs: new --brief flag strips response_schema from output (~60% smaller)
- strip_schemas() utility in robot.rs for --brief mode
- expand_fields_preset() extended for search, timeline, and all who modes

Robot-docs manifest updated with --status flag documentation, --fields
flags for search/timeline/who, fields_presets sections, and corrected
search response schema field names.

Note: replaces empty commit dcfd449 which lost staging during hook execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:13:37 -05:00
Taylor Eernisse
41504b4941 feat(who): configurable scoring weights, MR refs, detail mode, and suffix path resolution
Expert mode now surfaces the specific MR references (project/path!iid) that
contributed to each expert's score, capped at 50 per user. A new --detail flag
adds per-MR breakdowns showing role (Author/Reviewer/both), note count, and
last activity timestamp.

Scoring weights (author_weight, reviewer_weight, note_bonus) are now
configurable via the config file's `scoring` section with validation that
rejects negative values. Defaults shift to author_weight=25, reviewer_weight=10,
note_bonus=1 — better reflecting that code authorship is a stronger expertise
signal than review assignment alone.

Path resolution gains suffix matching: typing "login.rs" auto-resolves to
"src/auth/login.rs" when unambiguous, with clear disambiguation errors when
multiple paths match. Project-scoping (-p) narrows the candidate set.

The MAX_MR_REFS_PER_USER constant is promoted to module scope for reuse
across expert and overlap modes. Human output shows MR refs inline and detail
sub-rows when requested. Robot JSON includes mr_refs, mr_refs_total,
mr_refs_truncated, and optional details array.

Includes comprehensive tests for suffix resolution, scoring weight
configurability, MR ref aggregation across projects, and detail mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 10:15:15 -05:00
Taylor Eernisse
c54a969269 fix(who): exclude self-assigned reviewers from file-change reviewer signal
Signal 4 (mr_reviewers + mr_file_changes) was missing the self-review
exclusion that signal 1 (DiffNote reviewer) already had. An MR author
listed as their own reviewer would be double-counted as both author
and reviewer, inflating their score.

Also removes redundant SELECT DISTINCT from signal 2 (GROUP BY
already ensures uniqueness).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 13:42:40 -05:00
Taylor Eernisse
95b7183add feat(who): expand expert + overlap queries with mr_file_changes and mr_reviewers
Chain: bd-jec (config flag) -> bd-2yo (fetch MR diffs) -> bd-3qn6 (rewrite who queries)

- Add fetch_mr_file_changes config option and --no-file-changes CLI flag
- Add GitLab MR diffs API fetch pipeline with watermark-based sync
- Create migration 020 for diffs_synced_for_updated_at watermark column
- Rewrite query_expert() and query_overlap() to use 4-signal UNION ALL:
  DiffNote reviewers, DiffNote MR authors, file-change authors, file-change reviewers
- Deduplicate across signal types via COUNT(DISTINCT CASE WHEN ... THEN mr_id END)
- Add insert_file_change test helper, 8 new who tests, all 397 tests pass
- Also includes: list performance migration 019, autocorrect module, README updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 13:35:14 -05:00
Taylor Eernisse
cc11d3e5a0 fix: peer review — 5 correctness bugs across who, db, lock, embedding, main
Comprehensive peer code review identified and fixed the following:

1. who.rs: @-prefixed path routing used `target` (with @) instead of
   `clean` (stripped) when checking for '/' and passing to Expert mode,
   causing `lore who @src/auth/` to silently return zero results because
   the SQL LIKE matched against `@src/auth/%` which never exists.

2. db.rs: After ROLLBACK TO savepoint on migration failure, the savepoint
   was never RELEASEd, leaving it active on the connection. Fixed in both
   run_migrations() and run_migrations_from_dir().

3. lock.rs: Multiple acquire() calls (e.g. re-acquiring a stale lock)
   replaced the heartbeat_handle without stopping the old thread, causing
   two concurrent heartbeat writers competing on the same lock row. Now
   signals the old thread to stop and joins it before spawning a new one.

4. chunk_ids.rs: encode_rowid() had no guard for chunk_index >= 1000
   (CHUNK_ROWID_MULTIPLIER), which would cause rowid collisions between
   adjacent documents. Added range assertion [0, 1000).

5. main.rs: Fallback JSON error formatting in handle_auth_test
   interpolated LoreError Display output without escaping quotes or
   backslashes, potentially producing malformed JSON for robot-mode
   consumers. Now escapes both characters before interpolation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 08:07:59 -05:00
Taylor Eernisse
e6b880cbcb fix: prevent panics in robot-mode JSON output and arithmetic paths
Peer code review found multiple panic-reachable paths:

1. serde_json::to_string().unwrap() in 4 robot-mode output functions
   (who.rs, main.rs x3). If serialization ever failed (e.g., NaN from
   edge-case division), the CLI would panic with an unhelpful stack trace.
   Replaced with unwrap_or_else that emits a structured JSON error fallback.

2. encode_rowid() in chunk_ids.rs used unchecked multiplication
   (document_id * 1000). On extreme document IDs this could silently wrap
   in release mode, causing embedding rowid collisions. Now uses
   checked_mul + checked_add with a diagnostic panic message.

3. HTTP response body truncation at byte index 500 in client.rs could
   split a multi-byte UTF-8 character, causing a panic. Now uses
   floor_char_boundary(500) for safe truncation.

4. who.rs reviews mode: SQL used `m.author_username != ?1` which silently
   dropped MRs with NULL author_username (SQL NULL != anything = NULL).
   Changed to `(m.author_username IS NULL OR m.author_username != ?1)`
   to match the pattern already used in expert mode.

5. handle_auth_test hardcoded exit code 5 for all errors regardless of
   type. Config not found (20), token not set (4), and network errors (8)
   all incorrectly returned 5. Now uses e.exit_code() from the actual
   LoreError, with proper suggestion hints in human mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 07:55:20 -05:00
Taylor Eernisse
f267578aab feat: implement lore who — people intelligence commands (5 modes)
Add `lore who` command with 5 query modes answering collaboration questions
using existing DB data (280K notes, 210K discussions, 33K DiffNotes):

- Expert: who knows about a file/directory (DiffNote path analysis + MR breadth scoring)
- Workload: what is a person working on (assigned issues, authored/reviewing MRs, discussions)
- Active: what discussions need attention (unresolved resolvable, global/project-scoped)
- Overlap: who else is touching these files (dual author+reviewer role tracking)
- Reviews: what review patterns does a person have (prefix-based category extraction)

Includes migration 017 (5 composite indexes), CLI skeleton with clap conflicts_with
validation, robot JSON output with input+resolved_input reproducibility, human terminal
output, and 20 unit tests. All quality gates pass.

Closes: bd-1q8z, bd-34rr, bd-2rk9, bd-2ldg, bd-zqpf, bd-s3rc, bd-m7k1, bd-b51e,
bd-2711, bd-1rdi, bd-3mj2, bd-tfh3, bd-zibc, bd-g0d5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 23:11:14 -05:00