Commit Graph

116 Commits

Author SHA1 Message Date
teernisse
e6771709f1 refactor(core): extract path_resolver module, fix old_path matching in who
Extract shared path resolution logic from who.rs into a new
core::path_resolver module for cross-module reuse. Functions moved:
escape_like, normalize_repo_path, PathQuery, SuffixResult,
build_path_query, suffix_probe. Duplicate escape_like copies removed
from list.rs, project.rs, and filters.rs — all now import from
path_resolver.

Additionally fixes two bugs in query_expert_details() and
query_overlap() where only position_new_path was checked (missing
old_path matches for renamed files) and state filter excluded 'closed'
MRs despite the main scoring query including them with a decay
multiplier.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 13:50:14 -05:00
teernisse
6e55b2470d bugfix: DB column and size issues 2026-02-13 11:11:35 -05:00
Taylor Eernisse
7e0e6a91f2 refactor: extract unit tests into separate _tests.rs files
Move inline #[cfg(test)] mod tests { ... } blocks from 22 source files
into dedicated _tests.rs companion files, wired via:

    #[cfg(test)]
    #[path = "module_tests.rs"]
    mod tests;

This keeps implementation-focused source files leaner and more scannable
while preserving full access to private items through `use super::*;`.

Modules extracted:
  core:      db, note_parser, payloads, project, references, sync_run,
             timeline_collect, timeline_expand, timeline_seed
  cli:       list (55 tests), who (75 tests)
  documents: extractor (43 tests), regenerator
  embedding: change_detector, chunking
  gitlab:    graphql (wiremock async tests), transformers/issue
  ingestion: dirty_tracker, discussions, issues, mr_diffs

Also adds conflicts_with("explain_score") to the --detail flag in the
who command to prevent mutually exclusive flags from being combined.

All 629 unit tests pass. No behavior changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 10:54:02 -05:00
teernisse
94c8613420 feat(bd-226s): implement time-decay expert scoring model
Replace flat-weight expertise scoring with exponential half-life decay,
split reviewer signals (participated vs assigned-only), dual-path rename
awareness, and new CLI flags (--as-of, --explain-score, --include-bots,
--all-history).

Changes:
- ScoringConfig: 8 new fields with validation (config.rs)
- half_life_decay() and normalize_query_path() pure functions (who.rs)
- CTE-based SQL with dual-path matching, mr_activity, reviewer_participation (who.rs)
- Rust-side decay aggregation with deterministic f64 ordering (who.rs)
- Path resolution probes check old_path columns (who.rs)
- Migration 026: 5 new indexes for dual-path and reviewer participation
- Default --since changed from 6m to 24m
- 31 new tests (example-based + invariant), 621 total who tests passing
- Autocorrect registry updated with new flags

Closes: bd-226s, bd-2w1p, bd-1soz, bd-18dn, bd-2ao4, bd-2yu5, bd-1b50,
bd-1hoq, bd-1h3f, bd-13q8, bd-11mg, bd-1vti, bd-1j5o
2026-02-12 15:44:55 -05:00
teernisse
83cd16c918 feat: implement per-note search and document pipeline
- Add SourceType::Note with extract_note_document() and ParentMetadataCache
- Migration 022: composite indexes for notes queries + author_id column
- Migration 024: table rebuild adding 'note' to CHECK constraints, defense triggers
- Migration 025: backfill existing non-system notes into dirty queue
- Add lore notes CLI command with 17 filter options (author, path, resolution, etc.)
- Support table/json/jsonl/csv output formats with field selection
- Wire note dirty tracking through discussion and MR discussion ingestion
- Fix test_migration_024_preserves_existing_data off-by-one (tested wrong migration)
- Fix upsert_document_inner returning false for label/path-only changes
2026-02-12 13:31:24 -05:00
teernisse
c8d609ab78 chore: add drift to autocorrect command registry 2026-02-12 12:10:02 -05:00
teernisse
35c828ba73 feat(bd-91j1): enhance robot-docs with quick_start and example_output
Add quick_start section with glab equivalents, lore-exclusive features,
and read/write split guidance. Add example_output to issues, mrs, search,
and who commands. Update strip_schemas to also strip example_output in
brief mode. Update beads tracking state.

Closes: bd-91j1
2026-02-12 12:09:44 -05:00
teernisse
ecbfef537a feat(bd-1ksf): wire hybrid search (FTS5 + vector + RRF) to CLI
Make run_search async, replace hardcoded lexical mode with SearchMode::parse(),
wire search_hybrid() with OllamaClient for semantic/hybrid modes, graceful
degradation when Ollama unavailable.

Closes: bd-1ksf
2026-02-12 12:03:47 -05:00
teernisse
47eecce8e9 feat(bd-1cjx): add lore drift command for discussion divergence detection
Implement drift detection using cosine similarity between issue description
embedding and chronological note embeddings. Sliding window (size 3) identifies
topic drift points. Includes human and robot output formatters.

New files: drift.rs, similarity.rs
Closes: bd-1cjx
2026-02-12 12:02:15 -05:00
teernisse
b29c382583 feat(bd-2g50): fill data gaps in issue detail view
Add references_full, user_notes_count, merge_requests_count computed
fields to show issue. Add closed_at and confidential columns via
migration 023.

Closes: bd-2g50
2026-02-12 11:59:44 -05:00
teernisse
acc5e12e3d perf: force partial index for DiffNote queries, batch stats counts
Query optimizer fixes for the `who` and `stats` commands based on
a systematic performance audit of the SQLite query plans.

who command (expert/reviews/detail modes):
- Add INDEXED BY idx_notes_diffnote_path_created hints to all DiffNote
  queries. SQLite's planner was selecting idx_notes_system (38% of rows)
  over the far more selective partial index (9.3% of rows). Measured
  50-133x speedup on expert queries, 26x on reviews queries.
- Reorder JOIN clauses in detail mode's MR-author sub-select to match
  the index scan direction (notes -> discussions -> merge_requests).

stats command:
- Replace 12+ sequential COUNT(*) queries with conditional aggregates
  (COALESCE + SUM + CASE). Documents, dirty_sources, pending_discussion_
  fetches, and pending_dependent_fetches tables each scanned once instead
  of 2-3 times. Measured 1.7x speedup (109ms -> 65ms warm cache).
- Switch FTS document count from COUNT(*) on the virtual table to
  COUNT(*) on documents_fts_docsize shadow table (B-tree scan vs FTS5
  virtual table overhead). Measured 19x speedup for that single query.

Database: 61652 docs, 282K notes, 211K discussions, 1.5GB.
2026-02-12 11:21:00 -05:00
teernisse
3a1307dcdc feat(cli): wire defaultProject through init and all commands
Integrates the defaultProject config field across the entire CLI
surface so that omitting `-p` now falls back to the configured default.

Init command:
- New `--default-project` flag on `lore init` (and robot-mode variant)
- InitInputs.default_project: Option<String> passed through to run_init
- Validation in run_init ensures the default matches a configured path
- Interactive mode: when multiple projects are configured, prompts
  whether to set a default and which project to use
- Robot mode: InitOutputJson now includes default_project (omitted when
  null) for downstream automation
- Autocorrect dictionary updated with `--default-project`

Command handlers applying effective_project():
- handle_issues: list filters use config default when -p omitted
- handle_mrs: same cascading resolution for MR listing
- handle_ingest: dry-run and full sync respect the default
- handle_timeline: TimelineParams.project resolved via effective_project
- handle_search: SearchCliFilters.project resolved via effective_project
- handle_generate_docs: project filter cascades
- handle_who: falls back to config.default_project when -p omitted
- handle_count: both count subcommands respect the default
- handle_discussions: discussion count filters respect the default

Robot-docs:
- init command schema updated with --default-project flag and
  response_schema showing default_project as string?
- New config_notes section documents the defaultProject field with
  type, description, and example

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 15:09:46 -05:00
Taylor Eernisse
06229ce98b feat(cli): expose available_statuses in robot mode and hide status_category
(Supersedes empty commit f3788eb — jj auto-snapshot race.)

Three related refinements to how work item status is presented:

1. available_statuses in meta (list.rs, main.rs):
   Robot-mode issue list responses now include meta.available_statuses —
   a sorted array of all distinct status_name values in the database.
   Agents can use this to validate --status filter values or display
   valid options without a separate query.

2. Hide status_category from JSON (list.rs, show.rs):
   status_category is a GitLab internal classification that duplicates
   the state field. Switched to skip_serializing so it never appears
   in JSON output while remaining available internally.

3. Simplify human-readable status display (show.rs):
   Removed the "(category)" parenthetical from the Status line.

4. robot-docs schema updates (main.rs):
   Documented --status filter semantics and meta.available_statuses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:24:41 -05:00
Taylor Eernisse
e9af529f6e feat(ingestion): add progress reporting for status enrichment pipeline
Previously the status enrichment phase (GraphQL work item status fetch)
ran silently — users saw no feedback between "syncing issues" and the
final enrichment summary. For projects with hundreds of issues and
adaptive page-size retries, this felt like a hang.

Changes across three layers:

GraphQL (graphql.rs):
  - Extract fetch_issue_statuses_with_progress() accepting an optional
    on_page callback invoked after each paginated fetch with the
    running count of fetched IIDs
  - Original fetch_issue_statuses() preserved as a zero-cost
    delegation wrapper (no callback overhead)

Orchestrator (orchestrator.rs):
  - Three new ProgressEvent variants: StatusEnrichmentStarted,
    StatusEnrichmentPageFetched, StatusEnrichmentWriting
  - Wire the page callback through to the new _with_progress fn

CLI (ingest.rs):
  - Handle all three new events in the progress callback, updating
    both the per-project spinner and the stage bar with live counts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:22:20 -05:00
Taylor Eernisse
d9f99ef21d feat(cli): status display/filtering, expanded --fields, and robot-docs --brief
Work item status integration across all CLI output:

Issue listing (lore list issues):
- New Status column appears when any issue has status data, with
  hex-color rendering using ANSI 256-color approximation
- New --status flag for case-insensitive filtering (OR logic for
  multiple values): lore issues --status "In progress" --status "To do"
- Status fields (name, category, color, icon_name, synced_at) in issue
  list query and JSON output with conditional serialization

Issue detail (lore show issue):
- Displays "Status: In progress (in_progress)" with color-coded output
  using ANSI 256-color approximation from hex color values
- Status fields included in robot mode JSON with ISO timestamps
- IssueRow, IssueDetail, IssueDetailJson all carry status columns

Robot mode field selection expanded to new commands:
- search: --fields with "minimal" preset (document_id, title, source_type, score)
- timeline: --fields with "minimal" preset (timestamp, type, entity_iid, detail)
- who: --fields with per-mode presets (expert_minimal, workload_minimal, etc.)
- robot-docs: new --brief flag strips response_schema from output (~60% smaller)
- strip_schemas() utility in robot.rs for --brief mode
- expand_fields_preset() extended for search, timeline, and all who modes

Robot-docs manifest updated with --status flag documentation, --fields
flags for search/timeline/who, fields_presets sections, and corrected
search response schema field names.

Note: replaces empty commit dcfd449 which lost staging during hook execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:13:37 -05:00
Taylor Eernisse
6b75697638 feat(ingestion): enrich issues with work item status from GraphQL API
Add a "Phase 1.5" status enrichment step to the issue ingestion pipeline
that fetches work item statuses via the GitLab GraphQL API after the
standard REST API ingestion completes.

Schema changes (migration 021):
- Add status_name, status_category, status_color, status_icon_name, and
  status_synced_at columns to the issues table (all nullable)

Ingestion pipeline changes:
- New `enrich_issue_statuses_txn()` function that applies fetched
  statuses in a single transaction with two phases: clear stale statuses
  for issues that no longer have a status widget, then apply new/updated
  statuses from the GraphQL response
- ProgressEvent variants for status enrichment (complete/skipped)
- IngestProjectResult tracks enrichment metrics (seen, enriched, cleared,
  without_widget, partial_error_count, enrichment_mode, errors)
- Robot mode JSON output includes per-project status enrichment details

Configuration:
- New `sync.fetchWorkItemStatus` config option (defaults true) to disable
  GraphQL status enrichment on instances without Premium/Ultimate
- `LoreError::GitLabAuthFailed` now treated as permanent API error so
  status enrichment auth failures don't trigger retries

Also removes the unnecessary nested SAVEPOINT in store_closes_issues_refs
(already runs within the orchestrator's transaction context).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:09:21 -05:00
Taylor Eernisse
dfa44e5bcd fix(ingestion): label upsert reliability, init idempotency, and sync health
Label upsert (issues + merge_requests): Replace INSERT ... ON CONFLICT DO
UPDATE RETURNING with INSERT OR IGNORE + SELECT. The prior RETURNING-based
approach relied on last_insert_rowid() matching the returned id, which is
not guaranteed when ON CONFLICT triggers an update (SQLite may return 0).
The new two-step approach is unambiguous and correctly tracks created_count.

Init: Add ON CONFLICT(gitlab_project_id) DO UPDATE to the project insert
so re-running `lore init` updates path/branch/url instead of failing with
a unique constraint violation.

MR discussions sync: Reset discussions_sync_attempts to 0 when clearing a
sync health error, so previously-failed MRs get a fresh retry budget after
successful sync.

Count: format_number now handles negative numbers correctly by extracting
the sign before inserting thousand-separators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 10:15:53 -05:00
Taylor Eernisse
41504b4941 feat(who): configurable scoring weights, MR refs, detail mode, and suffix path resolution
Expert mode now surfaces the specific MR references (project/path!iid) that
contributed to each expert's score, capped at 50 per user. A new --detail flag
adds per-MR breakdowns showing role (Author/Reviewer/both), note count, and
last activity timestamp.

Scoring weights (author_weight, reviewer_weight, note_bonus) are now
configurable via the config file's `scoring` section with validation that
rejects negative values. Defaults shift to author_weight=25, reviewer_weight=10,
note_bonus=1 — better reflecting that code authorship is a stronger expertise
signal than review assignment alone.

Path resolution gains suffix matching: typing "login.rs" auto-resolves to
"src/auth/login.rs" when unambiguous, with clear disambiguation errors when
multiple paths match. Project-scoping (-p) narrows the candidate set.

The MAX_MR_REFS_PER_USER constant is promoted to module scope for reuse
across expert and overlap modes. Human output shows MR refs inline and detail
sub-rows when requested. Robot JSON includes mr_refs, mr_refs_total,
mr_refs_truncated, and optional details array.

Includes comprehensive tests for suffix resolution, scoring weight
configurability, MR ref aggregation across projects, and detail mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 10:15:15 -05:00
Taylor Eernisse
b704e33188 feat(sync): surface MR diff fetch/fail counters in sync output
Adds mr_diffs_fetched and mr_diffs_failed fields to IngestResult and
SyncResult, threads them through the orchestrator aggregation, includes
them in the structured tracing span and human-readable sync summary.
Previously MR diff failures were silently swallowed — now they appear
alongside resource event counts for full pipeline observability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 14:33:53 -05:00
Taylor Eernisse
940a96375a refactor(search): rename --after/--updated-after to --since/--updated-since
The --since naming is more intuitive (matches git log --since) and
consistent with the list commands which already use --since. Renames
the CLI flags, SearchCliFilters fields, SearchFilters fields,
autocorrect registry, and robot-docs manifest. No behavioral change.

Affected paths:
- cli/mod.rs: SearchArgs field + clap attribute rename
- cli/commands/search.rs: SearchCliFilters + run_search plumbing
- search/filters.rs: SearchFilters struct + apply_filters logic
- main.rs: handle_search + robot-docs JSON
- cli/autocorrect.rs: COMMAND_FLAGS entry for search

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 14:33:24 -05:00
Taylor Eernisse
c54a969269 fix(who): exclude self-assigned reviewers from file-change reviewer signal
Signal 4 (mr_reviewers + mr_file_changes) was missing the self-review
exclusion that signal 1 (DiffNote reviewer) already had. An MR author
listed as their own reviewer would be double-counted as both author
and reviewer, inflating their score.

Also removes redundant SELECT DISTINCT from signal 2 (GROUP BY
already ensures uniqueness).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 13:42:40 -05:00
Taylor Eernisse
95b7183add feat(who): expand expert + overlap queries with mr_file_changes and mr_reviewers
Chain: bd-jec (config flag) -> bd-2yo (fetch MR diffs) -> bd-3qn6 (rewrite who queries)

- Add fetch_mr_file_changes config option and --no-file-changes CLI flag
- Add GitLab MR diffs API fetch pipeline with watermark-based sync
- Create migration 020 for diffs_synced_for_updated_at watermark column
- Rewrite query_expert() and query_overlap() to use 4-signal UNION ALL:
  DiffNote reviewers, DiffNote MR authors, file-change authors, file-change reviewers
- Deduplicate across signal types via COUNT(DISTINCT CASE WHEN ... THEN mr_id END)
- Add insert_file_change test helper, 8 new who tests, all 397 tests pass
- Also includes: list performance migration 019, autocorrect module, README updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 13:35:14 -05:00
Taylor Eernisse
cc11d3e5a0 fix: peer review — 5 correctness bugs across who, db, lock, embedding, main
Comprehensive peer code review identified and fixed the following:

1. who.rs: @-prefixed path routing used `target` (with @) instead of
   `clean` (stripped) when checking for '/' and passing to Expert mode,
   causing `lore who @src/auth/` to silently return zero results because
   the SQL LIKE matched against `@src/auth/%` which never exists.

2. db.rs: After ROLLBACK TO savepoint on migration failure, the savepoint
   was never RELEASEd, leaving it active on the connection. Fixed in both
   run_migrations() and run_migrations_from_dir().

3. lock.rs: Multiple acquire() calls (e.g. re-acquiring a stale lock)
   replaced the heartbeat_handle without stopping the old thread, causing
   two concurrent heartbeat writers competing on the same lock row. Now
   signals the old thread to stop and joins it before spawning a new one.

4. chunk_ids.rs: encode_rowid() had no guard for chunk_index >= 1000
   (CHUNK_ROWID_MULTIPLIER), which would cause rowid collisions between
   adjacent documents. Added range assertion [0, 1000).

5. main.rs: Fallback JSON error formatting in handle_auth_test
   interpolated LoreError Display output without escaping quotes or
   backslashes, potentially producing malformed JSON for robot-mode
   consumers. Now escapes both characters before interpolation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 08:07:59 -05:00
Taylor Eernisse
e6b880cbcb fix: prevent panics in robot-mode JSON output and arithmetic paths
Peer code review found multiple panic-reachable paths:

1. serde_json::to_string().unwrap() in 4 robot-mode output functions
   (who.rs, main.rs x3). If serialization ever failed (e.g., NaN from
   edge-case division), the CLI would panic with an unhelpful stack trace.
   Replaced with unwrap_or_else that emits a structured JSON error fallback.

2. encode_rowid() in chunk_ids.rs used unchecked multiplication
   (document_id * 1000). On extreme document IDs this could silently wrap
   in release mode, causing embedding rowid collisions. Now uses
   checked_mul + checked_add with a diagnostic panic message.

3. HTTP response body truncation at byte index 500 in client.rs could
   split a multi-byte UTF-8 character, causing a panic. Now uses
   floor_char_boundary(500) for safe truncation.

4. who.rs reviews mode: SQL used `m.author_username != ?1` which silently
   dropped MRs with NULL author_username (SQL NULL != anything = NULL).
   Changed to `(m.author_username IS NULL OR m.author_username != ?1)`
   to match the pattern already used in expert mode.

5. handle_auth_test hardcoded exit code 5 for all errors regardless of
   type. Config not found (20), token not set (4), and network errors (8)
   all incorrectly returned 5. Now uses e.exit_code() from the actual
   LoreError, with proper suggestion hints in human mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 07:55:20 -05:00
Taylor Eernisse
f267578aab feat: implement lore who — people intelligence commands (5 modes)
Add `lore who` command with 5 query modes answering collaboration questions
using existing DB data (280K notes, 210K discussions, 33K DiffNotes):

- Expert: who knows about a file/directory (DiffNote path analysis + MR breadth scoring)
- Workload: what is a person working on (assigned issues, authored/reviewing MRs, discussions)
- Active: what discussions need attention (unresolved resolvable, global/project-scoped)
- Overlap: who else is touching these files (dual author+reviewer role tracking)
- Reviews: what review patterns does a person have (prefix-based category extraction)

Includes migration 017 (5 composite indexes), CLI skeleton with clap conflicts_with
validation, robot JSON output with input+resolved_input reproducibility, human terminal
output, and 20 unit tests. All quality gates pass.

Closes: bd-1q8z, bd-34rr, bd-2rk9, bd-2ldg, bd-zqpf, bd-s3rc, bd-m7k1, bd-b51e,
bd-2711, bd-1rdi, bd-3mj2, bd-tfh3, bd-zibc, bd-g0d5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 23:11:14 -05:00
Taylor Eernisse
b5f78e31a8 fix(cli): audit-driven improvements to flags, help, exit codes, and deprecation
Addresses findings from a comprehensive CLI readiness audit:

Flag design (I2):
- Add hidden --no-verbose flag with overrides_with semantics, matching
  the --no-quiet pattern already established for all other boolean flags.

Help text (I3):
- Add after_help examples to issues, mrs, search, sync, and timeline
  subcommands. Each shows 3-4 concrete, runnable commands with comments.

Help headings (I4/P5):
- Move --mode and --fts-mode from "Output" heading to "Mode" heading
  in the search subcommand. These control search strategy, not output
  format — "Output" is reserved for --limit, --explain, --fields.

Exit codes (I5):
- Health check failure now exits 19 (was 1). Exit code 1 is reserved
  for internal errors only. robot-docs updated to document code 19.

Deprecation visibility (P4):
- Deprecated commands (list, show, auth-test, sync-status) now emit
  structured JSON warnings to stderr in robot mode:
  {"warning":{"type":"DEPRECATED","message":"...","successor":"..."}}
  Previously these were silently swallowed in robot mode.

Version string (P1):
- Cli struct uses env!("LORE_VERSION") from build.rs so --version shows
  git hash (see previous commit).

Fields flag (P3):
- --fields help text updated to document the "minimal" preset.

Robot-docs (parallel work):
- response_schema added for every command, documenting the JSON shape
  agents will receive. Agents can now introspect expected fields before
  calling a command.
- error_format documents the new "actions" array.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 23:47:04 -05:00
Taylor Eernisse
cf6d27435a feat(robot): add elapsed_ms timing, --fields support, and actionable error actions
Robot mode consistency improvements across all command output:

Timing:
- Every robot JSON response now includes meta.elapsed_ms measuring
  wall-clock time from command start to serialization. Agents can use
  this to detect slow queries and tune --limit or --project filters.

Field selection (--fields):
- print_list_issues_json and print_list_mrs_json accept an optional
  fields slice that prunes each item in the response array to only
  the requested keys. A "minimal" preset expands to
  [iid, title, state, updated_at_iso] for token-efficient agent scans.
- filter_fields and expand_fields_preset live in the new
  src/cli/robot.rs module alongside RobotMeta.

Actionable error recovery:
- LoreError gains an actions() method returning concrete shell commands
  an agent can execute to recover (e.g. "ollama serve" for
  OllamaUnavailable, "lore init" for ConfigNotFound).
- RobotError now serializes an "actions" array (empty array omitted)
  so agents can parse and offer one-click fixes.

Envelope consistency:
- show issue/MR JSON responses now use the standard
  {"ok":true,"data":...,"meta":...} envelope instead of bare data,
  matching all other commands.

Files: src/cli/robot.rs (new), src/core/error.rs,
       src/cli/commands/{count,embed,generate_docs,ingest,list,show,stats,sync_status}.rs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 23:46:48 -05:00
Taylor Eernisse
a855759bf8 fix: shutdown safety, CLI hardening, exit code collision
Shutdown signal improvements:
- Upgrade ShutdownSignal from Relaxed to Release/Acquire ordering.
  Relaxed was technically sufficient for a single flag but
  Release/Acquire is the textbook correct pattern and ensures
  visibility guarantees across threads without relying on x86 TSO.
- Add double Ctrl+C support to all three signal handlers (ingest,
  embed, sync). First Ctrl+C sets cooperative flag with user message;
  second Ctrl+C force-exits with code 130 (standard SIGINT convention).

CLI hardening:
- LORE_ROBOT env var now checks for truthy values (!empty, !="0",
  !="false") instead of mere existence. Setting LORE_ROBOT=0 or
  LORE_ROBOT=false no longer activates robot mode.
- Replace unreachable!() in color mode match with defensive warning
  and fallback to auto. Clap validates the values but defense in depth
  prevents panics if the value_parser is ever changed.
- Replace unreachable!() in completions shell match with proper error
  return for unsupported shells.

Exit code collision fix:
- ConfigNotFound was mapped to exit code 2 (error.rs:56) which
  collided with handle_clap_error() also using exit code 2 for parse
  errors. Agents calling lore --robot could not distinguish "bad
  arguments" from "missing config file."
- Restore ConfigNotFound to exit code 20 (its original dedicated code).
- Update robot-docs exit code table: code 2 = "Usage error", code 20 =
  "Config not found".

Build script:
- Track .git/refs/heads directory for Cargo rebuild triggers. Ensures
  GIT_HASH env var updates when branch refs change, not just HEAD.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 22:42:59 -05:00
Taylor Eernisse
c2036c64e9 feat(embed): docs_embedded tracking, buffer reuse, retry hardening
Embedding pipeline improvements building on the concurrent batching
foundation:

- Track docs_embedded vs chunks_embedded separately. A document counts
  as embedded only when ALL its chunks succeed, giving accurate
  progress reporting. The sync command reads docs_embedded for its
  document count.

- Reuse a single Vec<u8> buffer (embed_buf) across all store_embedding
  calls instead of allocating per chunk. Eliminates ~3KB allocation per
  768-dim embedding.

- Detect and record errors when Ollama silently returns fewer
  embeddings than inputs (batch mismatch). Previously these dropped
  chunks were invisible.

- Improve retry error messages: distinguish "retry returned unexpected
  result" (wrong dims/count) from "retry request failed" (network
  error) instead of generic "chunk too large" message.

- Convert all hot-path SQL from conn.execute() to prepare_cached() for
  statement cache reuse (clear_document_embeddings, store_embedding,
  record_embedding_error).

- Record embedding_metadata errors for empty documents so they don't
  appear as perpetually pending on subsequent runs.

- Accept concurrency parameter (configurable via config.embedding.concurrency)
  instead of hardcoded EMBED_CONCURRENCY=2.

- Add schema version pre-flight check in embed command to fail fast
  with actionable error instead of cryptic SQL errors.

- Fix --retry-failed to use DELETE instead of UPDATE. UPDATE clears
  last_error but the row still matches config params in the LEFT JOIN,
  making the doc permanently invisible to find_pending_documents.
  DELETE removes the row entirely so the LEFT JOIN returns NULL.
  Regression test added (old_update_approach_leaves_doc_invisible).

- Add chunking forward-progress guard: after floor_char_boundary()
  rounds backward, ensure start advances by at least one full
  character to prevent infinite loops on multi-byte sequences
  (box-drawing chars, smart quotes). Test cases cover the exact
  patterns that caused production hangs on document 18526.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 22:42:08 -05:00
Taylor Eernisse
39cb0cb087 feat(embed): concurrent batching, UTF-8 safe chunking, right-sized chunks
Three fixes to the embedding pipeline:

1. Concurrent HTTP batching: fire EMBED_CONCURRENCY (2) Ollama requests
   in parallel via join_all, then write results serially to SQLite.
   ~2x throughput improvement on GPU-bound workloads.

2. UTF-8 boundary safety: all computed byte offsets in split_into_chunks
   (paragraph/sentence/word break finders + overlap advance) now use
   floor_char_boundary() to prevent panics on multi-byte characters
   like smart quotes and non-breaking spaces.

3. CHUNK_MAX_BYTES reduced from 6000 to 1500 to fit nomic-embed-text's
   actual 2048-token context window, eliminating context-length retry
   storms that were causing 10x slowdowns.

Also threads ShutdownSignal through embed pipeline for graceful Ctrl+C.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 14:48:34 -05:00
Taylor Eernisse
1c45725cba fix(sync): pass options.full through to generate-docs stage
The sync pipeline was hardcoding `false` for the `full` parameter when
calling run_generate_docs, so `lore sync --full` would re-ingest all
entities but then only regenerate documents for newly-dirtied ones.
Entities loaded before migration 007 (which introduced the dirty_sources
system) were never marked dirty and thus never got documents generated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 11:42:11 -05:00
Taylor Eernisse
405e5370dc feat(sync): concurrent drains, atomic watermarks, graceful Ctrl+C shutdown
Three fixes to the sync pipeline:

1. Atomic watermarks: wrap complete_job + update_watermark in a single
   SQLite transaction so crash between them can't leave partial state.

2. Concurrent drain loops: prefetch HTTP requests via join_all (batch
   size = dependent_concurrency), then write serially to DB. Reduces
   ~9K sequential requests from ~19 min to ~2.4 min.

3. Graceful shutdown: install Ctrl+C handler via ShutdownSignal
   (Arc<AtomicBool>), thread through orchestrator/CLI, release locked
   jobs on interrupt, record sync_run as "failed".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 11:22:04 -05:00
Taylor Eernisse
32783080f1 fix(timeline): report true total_events in robot JSON meta
The robot JSON envelope's meta.total_events field was incorrectly
reporting events.len() (the post-limit count), making it identical
to meta.showing. This defeated the purpose of having both fields.

Changes across the pipeline to fix this:

- collect_events now returns (Vec<TimelineEvent>, usize) where the
  second element is the total event count before truncation
- TimelineResult gains a total_events_before_limit field (serde-skipped)
  so the value flows cleanly from collect through to the renderer
- main.rs passes the real total instead of the events.len() workaround

Additional cleanup in this pass:

- Derive PartialEq/Eq/PartialOrd/Ord on TimelineEventType, replacing
  the hand-rolled event_type_discriminant() function. Variant declaration
  order now defines sort tiebreak, documented in a doc comment.
- Validate --since input with a proper LoreError::Other instead of
  silently treating invalid values as None
- Fix ANSI-aware tag column padding with console::pad_str (colored tags
  like "[merged]" were misaligned because ANSI escapes consumed width)
- Remove dead print_timeline_json and infer_max_depth functions that
  were superseded by print_timeline_json_with_meta

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 09:35:02 -05:00
Taylor Eernisse
69df8a5603 feat(timeline): wire up lore timeline command with human + robot renderers
Complete Gate 3 by implementing the final three beads:
- bd-2f2: Human output renderer with colored event tags, entity refs,
  evidence snippets, and expansion summary footer
- bd-dty: Robot JSON output with {ok,data,meta} envelope, ISO timestamps,
  nested via provenance, and per-event-type details objects
- bd-1nf: CLI wiring with TimelineArgs (9 flags), Commands::Timeline
  variant, handle_timeline handler, VALID_COMMANDS entry, and robot-docs
  manifest with temporal_intelligence workflow

All 7 Gate 3 children now closed. Pipeline: SEED -> HYDRATE -> EXPAND ->
COLLECT -> RENDER fully operational.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:49:48 -05:00
Taylor Eernisse
5d1586b88e feat(show): Display full discussion content without truncation
Remove artificial length limits from `lore show` output to display
complete descriptions and discussion threads.

Previously, descriptions were truncated to 500 characters and discussion
notes to 300 characters, which cut off important context when reviewing
issues and MRs. Users often need the full content to understand the
complete discussion history.

Changes:
- Remove truncate() helper function and its 2 unit tests
- Pass description and note bodies directly to wrap_text()
- Affects both print_show_issue() and print_show_mr()

The wrap_text() function continues to handle line wrapping for
readability at the configured widths (76/72/68 chars depending on
nesting level).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:46:29 -05:00
Taylor Eernisse
c730b0ec54 feat(cli): Improve help text, error handling, and add fuzzy command suggestions
CLI help improvements (cli/mod.rs):
- Add descriptive help text to all global flags (-c, --robot, -J, etc.)
- Add descriptions to all subcommands (Issues, Mrs, Sync, etc.)
- Add --no-quiet flag for explicit quiet override
- Shell completions now shows installation instructions for each shell
- Optional subcommand: running bare 'lore' shows help in terminal mode,
  robot-docs in robot mode

Structured clap error handling (main.rs):
- Early robot mode detection before parsing (env + args)
- JSON error output for parse failures in robot mode
- Semantic error codes: UNKNOWN_COMMAND, UNKNOWN_FLAG, MISSING_REQUIRED,
  INVALID_VALUE, ARGUMENT_CONFLICT, etc.
- Fuzzy command suggestion using Jaro-Winkler similarity (>0.7 threshold)
- Help/version requests handled normally (exit 0, not error)

Robot-docs enhancements (main.rs):
- Document deprecated command aliases (list issues -> issues, etc.)
- Document clap error codes for programmatic error handling
- Include completions command in manifest
- Update flag documentation to show short forms (-n, -s, -p, etc.)

Dependencies:
- Add strsim 0.11 for Jaro-Winkler fuzzy matching

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:22:38 -05:00
Taylor Eernisse
ab43bbd2db feat: Add dry-run mode to ingest, sync, and stats commands
Enables preview of operations without making changes, useful for
understanding what would happen before committing to a full sync.

Ingest dry-run (--dry-run flag):
- Shows resource type, sync mode (full vs incremental), project list
- Per-project info: existing count, has_cursor, last_synced timestamp
- No GitLab API calls, no database writes

Sync dry-run (--dry-run flag):
- Preview all four stages: issues ingest, MRs ingest, docs, embed
- Shows which stages would run vs be skipped (--no-docs, --no-embed)
- Per-project breakdown for both entity types

Stats repair dry-run (--dry-run flag):
- Shows what would be repaired without executing repairs
- "would fix" vs "fixed" indicator in terminal output
- dry_run: true field in JSON response

Implementation details:
- DryRunPreview struct captures project-level sync state
- SyncDryRunResult aggregates previews for all sync stages
- Terminal output uses yellow styling for "would" actions
- JSON output includes dry_run: true at top level

Flag handling:
- --dry-run and --no-dry-run pair for explicit control
- Defaults to false (normal operation)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:22:22 -05:00
Taylor Eernisse
784fe79b80 feat(show): Enrich issue detail with assignees, milestones, and closing MRs
Issue detail now includes:
- assignees: List of assigned usernames from issue_assignees table
- due_date: Issue due date when set
- milestone: Milestone title when assigned
- closing_merge_requests: MRs that will close this issue when merged

Closing MR detection:
- Queries entity_references table for 'closes' reference type
- Shows MR iid, title, state (with color coding) in terminal output
- Full MR metadata included in JSON output

Human-readable output:
- "Assignees:" line shows comma-separated @usernames
- "Development:" section lists closing MRs with state indicator
- Green for merged, cyan for opened, red for closed

JSON output:
- New fields: assignees, due_date, milestone, closing_merge_requests
- closing_merge_requests array contains iid, title, state, web_url

Test coverage:
- get_issue_assignees: empty, single, multiple (alphabetical order)
- get_closing_mrs: empty, single, ignores 'mentioned' references

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:22:02 -05:00
Taylor Eernisse
db750e4fc5 fix: Graceful HTTP client fallbacks and overflow protection
HTTP client initialization (embedding/ollama.rs, gitlab/client.rs):
- Replace expect/panic with unwrap_or_else fallback to default Client
- Log warning when configured client fails to build
- Prevents crash on TLS/system configuration issues

Doctor command (cli/commands/doctor.rs):
- Handle reqwest Client::builder() failure in Ollama health check
- Return Warning status with descriptive message instead of panicking
- Ensures doctor command remains operational even with HTTP issues

These changes improve resilience when running in unusual environments
(containers with limited TLS, restrictive network policies, etc.)
without affecting normal operation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:21:40 -05:00
Taylor Eernisse
72f1cafdcf perf: Optimize SQL queries and reduce allocations in hot paths
Change detection queries (embedding/change_detector.rs):
- Replace triple-EXISTS subquery pattern with LEFT JOIN + NULL check
- SQLite now scans embedding_metadata once instead of three times
- Semantically identical: returns docs needing embedding when no
  embedding exists, hash changed, or config mismatch

Count queries (cli/commands/count.rs):
- Consolidate 3 separate COUNT queries for issues into single query
  using conditional aggregation (CASE WHEN state = 'x' THEN 1)
- Same optimization for MRs: 5 queries reduced to 1

Search filter queries (search/filters.rs):
- Replace N separate EXISTS clauses for label filtering with single
  IN() clause with COUNT/GROUP BY HAVING pattern
- For multi-label AND queries, this reduces N subqueries to 1

FTS tokenization (search/fts.rs):
- Replace collect-into-Vec-then-join pattern with direct String building
- Pre-allocate capacity hint for result string

Discussion truncation (documents/truncation.rs):
- Calculate total length without allocating concatenated string first
- Only allocate full string when we know it fits within limit

Embedding pipeline (embedding/pipeline.rs):
- Add Vec::with_capacity hints for chunk work and cleared_docs hashset
- Reduces reallocations during embedding batch processing

Backoff calculation (core/backoff.rs):
- Replace unchecked addition with saturating_add to prevent overflow
- Add test case verifying overflow protection

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:21:28 -05:00
Taylor Eernisse
65583ed5d6 refactor: Remove redundant doc comments throughout codebase
Removes module-level doc comments (//! lines) and excessive inline doc
comments that were duplicating information already evident from:
- Function/struct names (self-documenting code)
- Type signatures (the what is clear from types)
- Implementation context (the how is clear from code)

Affected modules:
- cli/* - Removed command descriptions duplicating clap help text
- core/* - Removed module headers and obvious function docs
- documents/* - Removed extractor/regenerator/truncation docs
- embedding/* - Removed pipeline and chunking docs
- gitlab/* - Removed client and transformer docs (kept type definitions)
- ingestion/* - Removed orchestrator and ingestion docs
- search/* - Removed FTS and vector search docs

Philosophy: Code should be self-documenting. Comments should explain
"why" (business decisions, non-obvious constraints) not "what" (which
the code itself shows). This change reduces noise and maintenance burden
while keeping the codebase just as understandable.

Retains comments for:
- Non-obvious business logic
- Important safety invariants
- Complex algorithm explanations
- Public API boundaries where generated docs matter

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:04:32 -05:00
Taylor Eernisse
1d003aeac2 fix(sync): Replace text-only progress with animated bars for docs/embed stages
Stages 3 (generate-docs) and 4 (embed) reported progress by appending
"(N/M)" text to the stage spinner message, while stages 1-2 (ingest)
used dedicated indicatif progress bars with animated [====> ] rendering
registered with the global MultiProgress. This visual inconsistency
was introduced when progress callbacks were wired through in 266ed78.

Replace the spinner.set_message() callbacks with proper ProgressBar
instances that match the ingest stage pattern:
- Create a bar-style ProgressBar registered via multi().add()
- Use the same template/progress_chars as the ingest discussion bars
- Lazy-init the tick via AtomicBool to avoid showing the bar before
  the first callback fires (matching how ingest enables ticks only
  at DiscussionSyncStarted)
- Update set_length on every callback for the docs stage, since the
  regenerator's estimated_total can grow if new dirty items are
  queued during processing (using .max() internally)
- Clean up both the sub-bar and stage spinner on completion/error

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 15:02:13 -05:00
Taylor Eernisse
925ec9f574 fix: Retry loop safety, doctor model matching, regenerator robustness
Three defensive improvements from peer code review:

Replace unreachable!() in GitLab client retry loops:
Both request() and request_with_headers() had unreachable!() after
their for loops. While the logic was sound (the final iteration always
reaches the return/break), any refactor to the loop condition would
turn this into a runtime panic. Restructured both to store
last_response with explicit break, making the control flow
self-documenting and the .expect() message useful if ever violated.

Doctor model name comparison asymmetry:
Ollama model names were stripped of their tag (:latest, :v1.5) for
comparison, but the configured model name was compared as-is. A config
value like "nomic-embed-text:v1.5" would never match. Now strips the
tag from both sides before comparing.

Regenerator savepoint cleanup and progress accuracy:
- upsert_document's error path did ROLLBACK TO but never RELEASE,
  leaving a dangling savepoint that could nest on the next call. Added
  RELEASE after rollback so the connection is clean.
- estimated_total for progress reporting was computed once at start but
  the dirty queue can grow during processing. Now recounts each loop
  iteration with max() so the progress fraction never goes backwards.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 14:16:54 -05:00
Taylor Eernisse
266ed78e73 feat(sync): Wire progress callbacks through sync pipeline stages
The sync command's stage spinners now show real-time aggregate progress
for each pipeline phase instead of static "syncing..." messages.

- Add `progress_callback` parameter to `run_embed` and
  `run_generate_docs` so callers can receive `(processed, total)` updates
- Add `stage_bar` parameter to `run_ingest` for aggregate progress
  across concurrently-ingested projects using shared AtomicUsize counters
- Update `stage_spinner` to use `{prefix}` for the `[N/M]` label,
  allowing `{msg}` to be updated independently with progress details
- Thread `ProgressBar` clones into each concurrent project task so
  per-entity progress (fetch, discussions, events) is reflected on the
  aggregate spinner
- Pass `None` for progress callbacks at standalone CLI entry points
  (handle_ingest, handle_generate_docs, handle_embed) to preserve
  existing behavior when commands are run outside of sync

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 14:16:21 -05:00
teernisse
f6d19a9467 feat(sync): Instrument pipeline with tracing spans, run_id correlation, and metrics
Add end-to-end observability to the sync and ingest pipelines:

Sync command:
- Generate UUID-based run_id for each sync invocation, propagated through
  all child spans for log correlation across stages
- Accept MetricsLayer reference to extract hierarchical StageTiming data
  after pipeline completion for robot-mode performance output
- Record sync runs in DB via SyncRunRecorder (start/succeed/fail lifecycle)
- Wrap entire sync execution in a root tracing span with run_id field

Ingest command:
- Wrap run_ingest in an instrumented root span with run_id and resource_type
- Add project path prefix to discussion progress bars for multi-project clarity
- Reset resource_events_synced_for_updated_at on --full re-sync

Sync status:
- Expand from single last_run to configurable recent runs list (default 10)
- Parse and expose StageTiming metrics from stored metrics_json
- Add run_id, total_items_processed, total_errors to SyncRunInfo
- Add mr_count to DataSummary for complete entity coverage

Orchestrator:
- Add #[instrument] with structured fields to issue and MR ingestion functions
- Record items_processed, items_skipped, errors on span close for MetricsLayer
- Emit granular progress events (IssuesFetchStarted, IssuesFetchComplete)
- Pass project_id through to drain_resource_events for scoped job claiming

Document regenerator and embedding pipeline:
- Add #[instrument] spans with items_processed, items_skipped, errors fields
- Record final counts on span close for metrics extraction

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 13:39:00 -05:00
teernisse
362503d3bf feat(cli): Add verbosity controls, JSON log format, and triple-layer subscriber
Overhaul the CLI logging infrastructure for production observability:

CLI flags:
- Add -v/-vv/-vvv (--verbose) for progressive stderr verbosity control:
  0=INFO, 1=DEBUG app, 2=DEBUG all, 3+=TRACE
- Add --log-format text|json for structured stderr output in automation
- Existing -q/--quiet overrides verbosity for silent operation

Subscriber architecture (main.rs):
- Replace single-layer subscriber with triple-layer setup:
  1. stderr layer: human-readable or JSON, filtered by -v flags
  2. file layer: always-on JSON to daily-rotated logs (lore.YYYY-MM-DD.log)
  3. MetricsLayer: captures span timing for robot-mode performance payloads
- Parse CLI before subscriber init so verbosity is known at setup time
- Load LoggingConfig early (with graceful fallback for pre-init commands)
- Clean up old log files before subscriber init to avoid holding deleted handles
- Hold WorkerGuard at function scope to ensure flush on exit

Doctor command:
- Add logging health check: validates log directory exists, reports file
  count and total size, warns on missing or inaccessible log directory

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 13:38:43 -05:00
Taylor Eernisse
f5b4a765b7 perf: Configurable rate limit, 429 auto-retry, concurrent project ingestion
The sync pipeline was bottlenecked at 10 req/s (hardcoded) with
sequential project processing and no retry on rate limiting. These
changes target 3-5x throughput improvement.

Rate limit configuration:
- Add requestsPerSecond to SyncConfig (default 30.0, was hardcoded 10)
- Pass configured rate through to GitLabClient::new from ingest
- Floor rate at 0.1 rps in RateLimiter::new to prevent panic on
  Duration::from_secs_f64(1.0 / 0.0) — now reachable via user config

429 auto-retry:
- Both request() and request_with_headers() retry up to 3 times on
  HTTP 429, respecting the retry-after header (default 60s)
- Extract parse_retry_after helper, reused by handle_response fallback
- After exhausting retries, the 429 error propagates as before
- Improved JSON decode errors now include a response body preview

Concurrent project ingestion:
- Derive Clone on GitLabClient (cheap: shares Arc<Mutex<RateLimiter>>
  and reqwest::Client which is already Arc-backed)
- Restructure project loop to use futures::stream::buffer_unordered
  with primary_concurrency (default 4) as the parallelism bound
- Each project gets its own SQLite connection (WAL mode + busy_timeout
  handles concurrent writes)
- Add show_spinner field to IngestDisplay to separate the per-project
  spinner from the sync-level stage spinner
- Error aggregation defers failures: all successful projects get their
  summaries printed and results counted before returning the first error
- Bump dependentConcurrency default from 2 to 8 for discussion prefetch

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:37:06 -05:00
Taylor Eernisse
c35f485e0e refactor(cli): Replace tracing-indicatif with shared MultiProgress
tracing-indicatif pulled in vt100, arrayvec, and its own indicatif
integration layer. Replace it with a minimal SuspendingWriter that
coordinates tracing output with progress bars via a global LazyLock
MultiProgress.

- Add src/cli/progress.rs: shared MultiProgress singleton via LazyLock
  and a SuspendingWriter that suspends bars before writing log lines,
  preventing interleaving/flicker
- Wire all progress bar creation through multi().add() in sync and
  ingest commands
- Replace IndicatifLayer in main.rs with SuspendingWriter for
  tracing-subscriber's fmt layer
- Remove tracing-indicatif from Cargo.toml (drops vt100 and arrayvec
  transitive deps)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:36:31 -05:00
Taylor Eernisse
bb75a9d228 fix(events): Resource events now run on incremental syncs, fix output and progress bar
Three bugs fixed:

1. Early return in orchestrator when no discussions needed sync also
   skipped resource event enqueue+drain. On incremental syncs (the most
   common case), resource events were never fetched. Restructured to use
   if/else instead of early return so Step 4 always executes.

2. Ingest command JSON and human-readable output silently dropped
   resource_events_fetched/failed counts. Added to IngestJsonData and
   print_ingest_summary.

3. Progress bar reuse after finish_and_clear caused indicatif to silently
   ignore subsequent set_position/set_length calls. Added reset() call
   before reconfiguring the bar for resource events.

Also removed stale comment referencing "unsafe" that didn't reflect
the actual unchecked_transaction approach.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 13:06:35 -05:00
Taylor Eernisse
2bcd8db0e9 feat(events): Wire resource event fetching into sync pipeline (bd-1ep)
Integrate resource event fetching as Step 4 of both issue and MR
ingestion, gated behind the fetch_resource_events config flag.

Orchestrator changes:
- Add ProgressEvent variants: ResourceEventsFetchStarted,
  ResourceEventFetched, ResourceEventsFetchComplete
- Add resource_events_fetched/failed fields to IngestProjectResult
  and IngestMrProjectResult
- New enqueue_resource_events_for_entity_type() queries all
  issues/MRs for a project and enqueues resource_events jobs via
  the dependent queue (INSERT OR IGNORE for idempotency)
- New drain_resource_events() claims jobs in batches, fetches
  state/label/milestone events from GitLab API, stores them
  atomically via unchecked_transaction, and handles failures
  with exponential backoff via fail_job()
- Max-iterations guard prevents infinite retry loops within a
  single drain run
- New store_resource_events() + per-type _tx helpers write events
  using prepared statements inside a single transaction
- DrainResult struct tracks fetched/failed counts

CLI ingest changes:
- IngestResult gains resource_events_fetched/failed fields
- Progress bar repurposed for resource event fetch phase
  (reuses discussion bar with updated template)
- Accumulates event counts from both issue and MR ingestion

CLI sync changes:
- SyncResult gains resource_events_fetched/failed fields
- Accumulates counts from both ingest stages
- print_sync() conditionally displays event counts
- Structured logging includes event counts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 13:02:15 -05:00