gitlore

Author	SHA1	Message	Date
Taylor Eernisse	81f049a7fa	refactor(main): wire LoreRenderer init, migrate to Theme, improve UX polish Wire the LoreRenderer singleton initialization into main.rs color mode handling, replacing the console::style import with Theme throughout. Key changes: - Color initialization: LoreRenderer::init() called for all code paths (NO_COLOR, --color never/always/auto, unknown mode fallback) alongside the existing console::set_colors_enabled() calls. Both systems must agree since some transitive code still uses console (e.g. dialoguer). - Tracing: Replace .with_target(false) with .event_format(CompactHumanFormat) for the stderr layer, producing the clean 'HH:MM:SS LEVEL message' format. - Error handling: handle_error() now shows machine-actionable recovery commands from gi_error.actions() below the hint, formatted with dim '$' prefix and bold command text. - Deprecation warnings: All 'lore list', 'lore show', 'lore auth-test', 'lore sync-status' warnings migrated to Theme::warning(). - Init wizard: All success/info/error messages migrated. Unicode check marks use explicit \u{2713} escapes instead of literal symbols. - Embed command: Added progress bar with indicatif for embedding stage, showing position/total with steady tick. Elapsed time shown on completion. - Generate-docs and ingest commands: Added 'Done in Xs' elapsed time and next-step hints (run embed after generate-docs, run generate-docs after ingest) for better workflow guidance. - Sync output: Interrupt message and lock release migrated to Theme. - Health command: Status labels and overall healthy/unhealthy styled. - Robot-docs: Added drift command schema, updated sync flags to include --no-file-changes, updated who flags with new options. - Timeline --expand-mentions -> --no-mentions flag rename wired through params and robot-docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:33:09 -05:00
Taylor Eernisse	dd00a2b840	refactor(cli): migrate all command modules from console::style to Theme Replace all console::style() calls in command modules with the centralized Theme API and render:: utility functions. This ensures consistent color behavior across the entire CLI, proper NO_COLOR/--color never support via the LoreRenderer singleton, and eliminates duplicated formatting code. Changes per module: - count.rs: Theme for table headers, render::format_number replacing local duplicate. Removed local format_number implementation. - doctor.rs: Theme::success/warning/error for check status symbols and messages. Unicode escapes for check/warning/cross symbols. - drift.rs: Theme::bold/error/success for drift detection headers and status messages. - embed.rs: Compact output format — headline with count, zero-suppressed detail lines, 'nothing to embed' short-circuit for no-op runs. - generate_docs.rs: Same compact pattern — headline + detail + hint for next step. No-op short-circuit when regenerated==0. - ingest.rs: Theme for project summaries, sync status, dry-run preview. All console::style -> Theme replacements. - list.rs: Replace comfy-table with render::LoreTable for issue/MR listing. Remove local colored_cell, colored_cell_hex, format_relative_time, truncate_with_ellipsis, and format_labels (all moved to render.rs). - list_tests.rs: Update test assertions to use render:: functions. - search.rs: Add render_snippet() for FTS5 <mark> tag highlighting via Theme::bold().underline(). Compact result layout with type badges. - show.rs: Theme for entity detail views, delegate format_date and wrap_text to render module. - stats.rs: Section-based layout using render::section_divider. Compact middle-dot format for document counts. Color-coded embedding coverage percentage (green >=95%, yellow >=50%, red <50%). - sync.rs: Compact sync summary — headline with counts and elapsed time, zero-suppressed detail lines, visually prominent error-only section. - sync_status.rs: Theme for run history headers, removed local format_number duplicate. - timeline.rs: Theme for headers/footers, render:: for date/truncate, standard format! padding replacing console::pad_str. - who.rs: Theme for all expert/workload/active/overlap/review output modes, render:: for relative time and truncation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:32:35 -05:00
Taylor Eernisse	c6a5461d41	refactor(ingestion): compact log summaries and quieter shutdown messages Migrate all ingestion completion logs to use nonzero_summary() for compact, zero-suppressed output. Before: 8-14 individual key=value structured fields per completion message. After: a single summary field like '42 fetched · 3 labels · 12 notes' that only shows non-zero counters. Also downgrade all 'Shutdown requested...' messages from info! to debug!. These are emitted on every Ctrl+C and add noise to the partial results output that immediately follows. They remain visible at -vv for debugging graceful shutdown behavior. Affected modules: - issues.rs: issue ingestion completion - merge_requests.rs: MR ingestion completion, full-sync cursor reset - mr_discussions.rs: discussion ingestion completion - orchestrator.rs: project-level issue and MR completion summaries, all shutdown-requested checkpoints across discussion sync, resource events drain, closes-issues drain, and MR diffs drain Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:31:57 -05:00
Taylor Eernisse	a7f86b26e4	refactor(core): compact human log format, quieter lock lifecycle, nonzero_summary helper Three quality-of-life improvements to reduce log noise and improve readability: 1. logging.rs: Add CompactHumanFormat for stderr tracing output. Replaces the default format with a minimal 'HH:MM:SS LEVEL message key=value' layout — no span context, no full timestamps, no target module. The JSON file log layer is unaffected. This makes watching 'lore sync' output much cleaner. 2. lock.rs: Downgrade AppLock acquire/release messages from info! to debug!. Lock lifecycle events (acquired new, acquired existing, released) are operational bookkeeping that clutters normal output. They remain visible at -vv verbosity for troubleshooting. 3. ingestion/mod.rs: Add nonzero_summary() utility that formats named counters as a compact middle-dot-separated string, suppressing zero values. Produces output like '42 fetched · 3 labels · 12 notes' instead of verbose key=value structured fields. Returns 'nothing to update' when all values are zero. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:31:30 -05:00
Taylor Eernisse	5ee8b0841c	feat(cli): add centralized render module with semantic Theme and LoreRenderer Introduce src/cli/render.rs as the single source of truth for all terminal output styling and formatting utilities. Key components: - LoreRenderer: global singleton initialized once at startup, resolving color mode (Auto/Always/Never) against TTY state and NO_COLOR env var. This fixes lipgloss's limitation of hardcoded TrueColor rendering by gating all style application through a colors_on() check. - Theme: semantic style constants (success/warning/error/info/accent, entity refs, state colors, structural styles) that return plain Style::new() when colors are disabled. Replaces ad-hoc console::style() calls scattered across 15+ command modules. - Shared formatting utilities consolidated from duplicated implementations: format_relative_time (was in list.rs and who.rs), format_number (was in count.rs and sync_status.rs), truncate (was truncate_with_ellipsis in list.rs and truncate_summary in timeline.rs), format_labels, format_date, wrap_indent, section_divider. - LoreTable: lightweight table renderer replacing comfy-table with simple column alignment (Left/Right/Center), adaptive terminal width, and NO_COLOR-safe output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:31:02 -05:00
teernisse	e0041ed4d9	feat(cli): improve error recovery with alias-aware suggestions and error tolerance manifest Two related improvements to agent ergonomics in main.rs: 1. suggest_similar_command now matches against aliases (issue->issues, mr->mrs, find->search, stat->stats, note->notes, etc.) and provides contextual usage examples via a new command_example() helper, so agents get actionable recovery hints like "Did you mean 'lore mrs'? Example: lore --robot mrs -n 10" instead of just the command name. 2. robot-docs now includes an error_tolerance section documenting every auto-correction the CLI performs: types (single_dash_long_flag, case_normalization, flag_prefix, fuzzy_flag, subcommand_alias, value_normalization, value_fuzzy, prefix_matching), examples, and mode behavior (threshold differences). Also expands the aliases section with command_aliases and pre_clap_aliases maps for complete agent self-discovery. Together these ensure agents can programmatically discover and recover from any CLI input error without human intervention.	2026-02-13 17:27:49 -05:00
teernisse	a34751bd47	feat(autocorrect): expand pre-clap correction to 3-phase pipeline with subcommand aliases, value normalization, and flag prefix matching Three-phase pipeline replacing the single-pass correction: - Phase A: Subcommand alias correction — handles forms clap can't express (merge_requests, mergerequests, robotdocs, generatedocs, gen-docs, etc.) via case-insensitive alias map lookup. - Phase B: Per-arg flag corrections — adds unambiguous prefix expansion (--proj -> --project) alongside existing single-dash, case, and fuzzy rules. New FlagPrefix rule with 0.95 confidence. - Phase C: Enum value normalization — auto-corrects casing, prefixes, and typos for flags with known valid values. Handles both --flag value and --flag=value forms. Respects POSIX -- option terminator. Changes strict/robot mode from disabling fuzzy matching entirely to using a higher threshold (0.9 vs 0.8), still catching obvious typos like --projct while avoiding speculative corrections that mislead agents. New CorrectionRule variants: SubcommandAlias, ValueNormalization, ValueFuzzy, FlagPrefix. Each has a corresponding teaching note. Comprehensive test coverage for all new correction types including subcommand aliases, value normalization (case, prefix, fuzzy, eq-form), flag prefix (ambiguous rejection, eq-value preservation), and updated strict mode behavior.	2026-02-13 17:27:39 -05:00
teernisse	0aecbf33c0	feat(xref): extract cross-references from descriptions, user notes, and fix system note regex - Fix MENTIONED_RE/CLOSED_BY_RE to match real GitLab format ('mentioned in issue #N' / 'mentioned in merge request !N') - Add GITLAB_URL_RE + parse_url_refs() for full URL extraction - Add extract_refs_from_descriptions() -> source_method='description_parse' - Add extract_refs_from_user_notes() -> source_method='note_parse' - Wire both into orchestrator after system note extraction - 36 tests: regex fix, URL parsing, integration, idempotency	2026-02-13 17:19:36 -05:00
teernisse	c10471ddb9	feat(timeline): add entity-direct seeding (issue:N, mr:N syntax) Adds issue:N / i:N / mr:N / m:N query syntax to bypass hybrid search and seed the timeline directly from a known entity. All discussions for the entity are gathered without needing Ollama. - parse_timeline_query() detects entity-direct patterns - resolve_entity_by_iid() resolves IID to EntityRef with ambiguity handling - seed_timeline_direct() gathers all discussions for the entity - 20 new tests (5 resolve, 6 direct seed, 9 parse) - Updated CLI help text and robot-docs manifest	2026-02-13 15:22:45 -05:00
teernisse	94435c37f0	perf(timeline): hoist prepared statement outside discussion thread loop Moves the conn.prepare() call for fetching discussion notes outside the per-discussion loop in collect_discussion_threads(). The SQL is identical for every iteration, so preparing it once and rebinding parameters avoids redundant statement compilation on each matched discussion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:56:40 -05:00
teernisse	59f65b127a	fix(search): pass FTS5 boolean operators through unquoted FTS5 boolean operators (AND, OR, NOT, NEAR) are case-sensitive uppercase keywords that must appear unquoted in the query string. Previously, the user-friendly query builder would double-quote every token, causing queries like "switch AND health" to search for the literal word "AND" instead of using it as a boolean conjunction. Adds a FTS5_OPERATORS constant and checks each token against it before quoting, allowing natural boolean search syntax to work as expected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:56:29 -05:00
teernisse	f36e900570	feat(cli): add pipeline progress spinners to timeline and search Adds numbered stage spinners ([1/3], [2/3], [3/3]) to the timeline pipeline stages (seed, expand, collect) so users see activity during longer queries. TimelineParams gains a robot_mode field to suppress spinners in JSON output mode. Adds a [1/1] spinner to the search command for consistency, using the shared stage_spinner from cli/progress. Also refactors wrap_snippet() to delegate to wrap_text() with a 4-line cap, eliminating the duplicated word-wrapping logic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:56:19 -05:00
teernisse	e2efc61beb	refactor(cli): extract stage_spinner to shared progress module Moves stage_spinner() from a private function in sync.rs to a pub function in cli/progress.rs so it can be reused by the timeline and search commands. The function creates a numbered spinner (e.g. [1/3]) for pipeline stages, returning a hidden no-op bar in robot mode to keep caller code path-uniform. sync.rs now imports from crate::cli::progress::stage_spinner instead of defining its own copy. Adds unit tests for robot mode (hidden bar), human mode (prefix/message properties), and prefix formatting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:56:10 -05:00
teernisse	2da1a228b3	feat(timeline): collect and render full discussion threads Implements the downstream consumption of matched discussions from the seed phase, completing the discussion thread feature across collect, CLI, and integration tests. Collect phase (timeline_collect.rs): - New collect_discussion_threads() function assembles full threads by querying notes for each matched discussion_id, filtering out system notes (is_system = 0), ordering chronologically, and capping at THREAD_MAX_NOTES with a synthetic "[N more notes not shown]" summary note - build_entity_lookup() creates a (type, id) -> (iid, path) map from seed and expanded entities to provide display metadata for thread events - Thread timestamp is set to the first note's created_at for correct chronological interleaving with other timeline events - collect_events() gains a matched_discussions parameter; threads are collected after entity events and before evidence note merging CLI rendering (cli/commands/timeline.rs): - Human mode: threads render with box-drawing borders, bold @author tags, date-stamped notes, and word-wrapped bodies (60 char width) - Robot mode: DiscussionThread serializes as discussion_thread kind with note_count, full notes array (note_id, author, body, ISO created_at) - THREAD tag in yellow for human event tag styling - TimelineMeta gains discussion_threads_included count Tests: - 8 new collect tests: basic thread assembly, system note filtering, empty thread skipping, body truncation to THREAD_NOTE_MAX_CHARS, note cap with synthetic summary, timestamp from first note, chronological sort position, and deduplication of duplicate discussion_ids - Integration tests updated for new collect_events signature Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:18:36 -05:00
teernisse	0e65202778	feat(timeline): add DiscussionThread types and seed-phase discussion matching Introduces the foundation for full discussion thread support in the timeline pipeline. Adds three new domain types to timeline.rs: - ThreadNote: individual note within a thread (id, author, body, timestamp) - MatchedDiscussion: tracks discussions matched during seeding with their parent entity (issue or MR) for downstream collection - DiscussionThread variant on TimelineEventType: carries a full thread of notes, sorted between NoteEvidence and CrossReferenced Moves truncate_to_chars() from timeline_seed.rs to timeline.rs as pub(crate) for reuse by the collect phase. Adds THREAD_NOTE_MAX_CHARS (2000) and THREAD_MAX_NOTES (50) constants. Upgrades the seed SQL in resolve_documents_to_entities() to resolve note documents to their parent discussion via an additional LEFT JOIN chain (notes -> discussions), using COALESCE to unify the entity resolution path for both discussion and note source types. SeedResult gains a matched_discussions field that captures deduplicated discussion matches. Tests cover: discussion matching from discussion docs, note-to-parent resolution, deduplication of same discussion across multiple docs, and correct parent entity type (issue vs MR). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:18:18 -05:00
teernisse	f439c42b3d	chore: add gitignore for mock-seed, roam CI workflow, formatting - Add tools/mock-seed/ to .gitignore - Add .github/workflows/roam.yml CI workflow - Add .roam/fitness.yaml architectural fitness rules - Rustfmt formatting fixes in show.rs and vector.rs - Beads sync Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 13:50:30 -05:00
teernisse	4f3ec72923	feat(timeline): upgrade seed phase to hybrid search Replace FTS-only seed entity discovery with hybrid search (FTS + vector via RRF), using the same search_hybrid infrastructure as the search command. Falls back gracefully to FTS-only when Ollama is unavailable. Changes: - seed_timeline() now accepts OllamaClient, delegates to search_hybrid - New resolve_documents_to_entities() replaces find_seed_entities() - SeedResult gains search_mode field tracking actual mode used - TimelineResult carries search_mode through to JSON renderer - run_timeline wires up OllamaClient from config - handle_timeline made async for the hybrid search await - Tests updated for new function signatures Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 13:50:24 -05:00
teernisse	e6771709f1	refactor(core): extract path_resolver module, fix old_path matching in who Extract shared path resolution logic from who.rs into a new core::path_resolver module for cross-module reuse. Functions moved: escape_like, normalize_repo_path, PathQuery, SuffixResult, build_path_query, suffix_probe. Duplicate escape_like copies removed from list.rs, project.rs, and filters.rs — all now import from path_resolver. Additionally fixes two bugs in query_expert_details() and query_overlap() where only position_new_path was checked (missing old_path matches for renamed files) and state filter excluded 'closed' MRs despite the main scoring query including them with a decay multiplier. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 13:50:14 -05:00
teernisse	6e55b2470d	bugfix: DB column and size issues	2026-02-13 11:11:35 -05:00
Taylor Eernisse	48fbd4bfdb	feat(core): add file rename chain resolver with depth-bounded BFS New module: core::file_history with resolve_rename_chain() that traces a file path through its rename history in mr_file_changes using bidirectional BFS (forward: old_path->new_path, backward: new_path->old_path). Key design decisions: - Depth-bounded BFS: each queue entry carries its distance from the origin, so max_hops correctly limits by graph distance (not by total nodes discovered). This matters for branching rename graphs where a file was renamed differently in parallel MRs. - Cycle-safe: visited set prevents infinite loops from circular renames. - Project-scoped: queries are always scoped to a single project_id. - Deterministic: output is sorted for stable results. Tests cover: linear chains (forward/backward), cycles, max_hops=0, depth-bounded linear chains, branching renames, diamond patterns, and cross-project isolation (9 tests total). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 10:54:41 -05:00
Taylor Eernisse	9786ef27f5	refactor(core/time): extract parse_since_from for deterministic time parsing Factor out parse_since_from(input, reference_ms) so callers can compute relative durations against a fixed reference timestamp instead of always using now(). The existing parse_since() now delegates to it with now_ms(). Enables testable and reproducible time-relative queries for features like timeline --as-of and who --as-of. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 10:54:20 -05:00
Taylor Eernisse	7e0e6a91f2	refactor: extract unit tests into separate _tests.rs files Move inline #[cfg(test)] mod tests { ... } blocks from 22 source files into dedicated _tests.rs companion files, wired via: #[cfg(test)] #[path = "module_tests.rs"] mod tests; This keeps implementation-focused source files leaner and more scannable while preserving full access to private items through `use super::*;`. Modules extracted: core: db, note_parser, payloads, project, references, sync_run, timeline_collect, timeline_expand, timeline_seed cli: list (55 tests), who (75 tests) documents: extractor (43 tests), regenerator embedding: change_detector, chunking gitlab: graphql (wiremock async tests), transformers/issue ingestion: dirty_tracker, discussions, issues, mr_diffs Also adds conflicts_with("explain_score") to the --detail flag in the who command to prevent mutually exclusive flags from being combined. All 629 unit tests pass. No behavior changes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 10:54:02 -05:00
teernisse	94c8613420	feat(bd-226s): implement time-decay expert scoring model Replace flat-weight expertise scoring with exponential half-life decay, split reviewer signals (participated vs assigned-only), dual-path rename awareness, and new CLI flags (--as-of, --explain-score, --include-bots, --all-history). Changes: - ScoringConfig: 8 new fields with validation (config.rs) - half_life_decay() and normalize_query_path() pure functions (who.rs) - CTE-based SQL with dual-path matching, mr_activity, reviewer_participation (who.rs) - Rust-side decay aggregation with deterministic f64 ordering (who.rs) - Path resolution probes check old_path columns (who.rs) - Migration 026: 5 new indexes for dual-path and reviewer participation - Default --since changed from 6m to 24m - 31 new tests (example-based + invariant), 621 total who tests passing - Autocorrect registry updated with new flags Closes: bd-226s, bd-2w1p, bd-1soz, bd-18dn, bd-2ao4, bd-2yu5, bd-1b50, bd-1hoq, bd-1h3f, bd-13q8, bd-11mg, bd-1vti, bd-1j5o	2026-02-12 15:44:55 -05:00
teernisse	83cd16c918	feat: implement per-note search and document pipeline - Add SourceType::Note with extract_note_document() and ParentMetadataCache - Migration 022: composite indexes for notes queries + author_id column - Migration 024: table rebuild adding 'note' to CHECK constraints, defense triggers - Migration 025: backfill existing non-system notes into dirty queue - Add lore notes CLI command with 17 filter options (author, path, resolution, etc.) - Support table/json/jsonl/csv output formats with field selection - Wire note dirty tracking through discussion and MR discussion ingestion - Fix test_migration_024_preserves_existing_data off-by-one (tested wrong migration) - Fix upsert_document_inner returning false for label/path-only changes	2026-02-12 13:31:24 -05:00
teernisse	c8d609ab78	chore: add drift to autocorrect command registry	2026-02-12 12:10:02 -05:00
teernisse	35c828ba73	feat(bd-91j1): enhance robot-docs with quick_start and example_output Add quick_start section with glab equivalents, lore-exclusive features, and read/write split guidance. Add example_output to issues, mrs, search, and who commands. Update strip_schemas to also strip example_output in brief mode. Update beads tracking state. Closes: bd-91j1	2026-02-12 12:09:44 -05:00
teernisse	ecbfef537a	feat(bd-1ksf): wire hybrid search (FTS5 + vector + RRF) to CLI Make run_search async, replace hardcoded lexical mode with SearchMode::parse(), wire search_hybrid() with OllamaClient for semantic/hybrid modes, graceful degradation when Ollama unavailable. Closes: bd-1ksf	2026-02-12 12:03:47 -05:00
teernisse	47eecce8e9	feat(bd-1cjx): add lore drift command for discussion divergence detection Implement drift detection using cosine similarity between issue description embedding and chronological note embeddings. Sliding window (size 3) identifies topic drift points. Includes human and robot output formatters. New files: drift.rs, similarity.rs Closes: bd-1cjx	2026-02-12 12:02:15 -05:00
teernisse	b29c382583	feat(bd-2g50): fill data gaps in issue detail view Add references_full, user_notes_count, merge_requests_count computed fields to show issue. Add closed_at and confidential columns via migration 023. Closes: bd-2g50	2026-02-12 11:59:44 -05:00
teernisse	d9c9f6e541	fix: escape LIKE metacharacters in project resolver User-supplied project names containing `%` or `_` were passed directly into LIKE patterns, causing unintended wildcard matching. For example, `my_project` would match `my-project` because `_` is a single-char wildcard in SQL LIKE. Added escape_like() helper that escapes `\`, `%`, and `_` with backslash, and added ESCAPE '\' clauses to both the suffix-match and substring-match queries in resolve_project(). Includes two regression tests: - test_underscore_not_wildcard: `_` in input must not match `-` - test_percent_not_wildcard: `%` in input must not match arbitrary strings	2026-02-12 11:21:09 -05:00
teernisse	acc5e12e3d	perf: force partial index for DiffNote queries, batch stats counts Query optimizer fixes for the `who` and `stats` commands based on a systematic performance audit of the SQLite query plans. who command (expert/reviews/detail modes): - Add INDEXED BY idx_notes_diffnote_path_created hints to all DiffNote queries. SQLite's planner was selecting idx_notes_system (38% of rows) over the far more selective partial index (9.3% of rows). Measured 50-133x speedup on expert queries, 26x on reviews queries. - Reorder JOIN clauses in detail mode's MR-author sub-select to match the index scan direction (notes -> discussions -> merge_requests). stats command: - Replace 12+ sequential COUNT() queries with conditional aggregates (COALESCE + SUM + CASE). Documents, dirty_sources, pending_discussion_fetches, and pending_dependent_fetches tables each scanned once instead of 2-3 times. Measured 1.7x speedup (109ms -> 65ms warm cache). - Switch FTS document count from COUNT() on the virtual table to COUNT(*) on documents_fts_docsize shadow table (B-tree scan vs FTS5 virtual table overhead). Measured 19x speedup for that single query. Database: 61652 docs, 282K notes, 211K discussions, 1.5GB.	2026-02-12 11:21:00 -05:00
teernisse	3a1307dcdc	feat(cli): wire defaultProject through init and all commands Integrates the defaultProject config field across the entire CLI surface so that omitting `-p` now falls back to the configured default. Init command: - New `--default-project` flag on `lore init` (and robot-mode variant) - InitInputs.default_project: Option<String> passed through to run_init - Validation in run_init ensures the default matches a configured path - Interactive mode: when multiple projects are configured, prompts whether to set a default and which project to use - Robot mode: InitOutputJson now includes default_project (omitted when null) for downstream automation - Autocorrect dictionary updated with `--default-project` Command handlers applying effective_project(): - handle_issues: list filters use config default when -p omitted - handle_mrs: same cascading resolution for MR listing - handle_ingest: dry-run and full sync respect the default - handle_timeline: TimelineParams.project resolved via effective_project - handle_search: SearchCliFilters.project resolved via effective_project - handle_generate_docs: project filter cascades - handle_who: falls back to config.default_project when -p omitted - handle_count: both count subcommands respect the default - handle_discussions: discussion count filters respect the default Robot-docs: - init command schema updated with --default-project flag and response_schema showing default_project as string? - New config_notes section documents the defaultProject field with type, description, and example Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 15:09:46 -05:00
teernisse	6ea3108a20	feat(config): add defaultProject with validation and cascading resolver Introduces a new optional `defaultProject` field on Config (and MinimalConfig for init output) that acts as a fallback when the `-p`/`--project` CLI flag is omitted. Domain-layer changes: - Config.default_project: Option<String> with camelCase serde rename - Config::load validates that defaultProject matches a configured project path (exact or case-insensitive suffix match), returning ConfigInvalid on mismatch - Config::effective_project(cli_flag) -> Option<&str>: cascading resolver that prefers the CLI flag, then the config default, then None - MinimalConfig.default_project with skip_serializing_if for clean JSON output when unset Tests added: - effective_project: CLI overrides default, falls back to default, returns None when both absent - Config::load: accepts valid defaultProject, rejects nonexistent, accepts suffix match - MinimalConfig: omits null defaultProject, includes when set - Helper write_config_with_default_project for parameterized tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 15:09:33 -05:00
Taylor Eernisse	06229ce98b	feat(cli): expose available_statuses in robot mode and hide status_category (Supersedes empty commit `f3788eb` — jj auto-snapshot race.) Three related refinements to how work item status is presented: 1. available_statuses in meta (list.rs, main.rs): Robot-mode issue list responses now include meta.available_statuses — a sorted array of all distinct status_name values in the database. Agents can use this to validate --status filter values or display valid options without a separate query. 2. Hide status_category from JSON (list.rs, show.rs): status_category is a GitLab internal classification that duplicates the state field. Switched to skip_serializing so it never appears in JSON output while remaining available internally. 3. Simplify human-readable status display (show.rs): Removed the "(category)" parenthetical from the Status line. 4. robot-docs schema updates (main.rs): Documented --status filter semantics and meta.available_statuses. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 10:24:41 -05:00
Taylor Eernisse	e9af529f6e	feat(ingestion): add progress reporting for status enrichment pipeline Previously the status enrichment phase (GraphQL work item status fetch) ran silently — users saw no feedback between "syncing issues" and the final enrichment summary. For projects with hundreds of issues and adaptive page-size retries, this felt like a hang. Changes across three layers: GraphQL (graphql.rs): - Extract fetch_issue_statuses_with_progress() accepting an optional on_page callback invoked after each paginated fetch with the running count of fetched IIDs - Original fetch_issue_statuses() preserved as a zero-cost delegation wrapper (no callback overhead) Orchestrator (orchestrator.rs): - Three new ProgressEvent variants: StatusEnrichmentStarted, StatusEnrichmentPageFetched, StatusEnrichmentWriting - Wire the page callback through to the new _with_progress fn CLI (ingest.rs): - Handle all three new events in the progress callback, updating both the per-project spinner and the stage bar with live counts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 10:22:20 -05:00
Taylor Eernisse	70271c14d6	fix(core): ensure migration framework records schema version automatically The migration runner now inserts (OR REPLACE) the schema_version row after each successful migration batch, regardless of whether the migration SQL itself contains a self-registering INSERT. This prevents version tracking gaps when a .sql migration omits the bookkeeping statement, which would leave the schema at an unrecorded version and cause re-execution attempts on next startup. Legacy migrations that already self-register are unaffected thanks to the OR REPLACE conflict resolution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 10:21:49 -05:00
Taylor Eernisse	d9f99ef21d	feat(cli): status display/filtering, expanded --fields, and robot-docs --brief Work item status integration across all CLI output: Issue listing (lore list issues): - New Status column appears when any issue has status data, with hex-color rendering using ANSI 256-color approximation - New --status flag for case-insensitive filtering (OR logic for multiple values): lore issues --status "In progress" --status "To do" - Status fields (name, category, color, icon_name, synced_at) in issue list query and JSON output with conditional serialization Issue detail (lore show issue): - Displays "Status: In progress (in_progress)" with color-coded output using ANSI 256-color approximation from hex color values - Status fields included in robot mode JSON with ISO timestamps - IssueRow, IssueDetail, IssueDetailJson all carry status columns Robot mode field selection expanded to new commands: - search: --fields with "minimal" preset (document_id, title, source_type, score) - timeline: --fields with "minimal" preset (timestamp, type, entity_iid, detail) - who: --fields with per-mode presets (expert_minimal, workload_minimal, etc.) - robot-docs: new --brief flag strips response_schema from output (~60% smaller) - strip_schemas() utility in robot.rs for --brief mode - expand_fields_preset() extended for search, timeline, and all who modes Robot-docs manifest updated with --status flag documentation, --fields flags for search/timeline/who, fields_presets sections, and corrected search response schema field names. Note: replaces empty commit `dcfd449` which lost staging during hook execution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 08:13:37 -05:00
Taylor Eernisse	6b75697638	feat(ingestion): enrich issues with work item status from GraphQL API Add a "Phase 1.5" status enrichment step to the issue ingestion pipeline that fetches work item statuses via the GitLab GraphQL API after the standard REST API ingestion completes. Schema changes (migration 021): - Add status_name, status_category, status_color, status_icon_name, and status_synced_at columns to the issues table (all nullable) Ingestion pipeline changes: - New `enrich_issue_statuses_txn()` function that applies fetched statuses in a single transaction with two phases: clear stale statuses for issues that no longer have a status widget, then apply new/updated statuses from the GraphQL response - ProgressEvent variants for status enrichment (complete/skipped) - IngestProjectResult tracks enrichment metrics (seen, enriched, cleared, without_widget, partial_error_count, enrichment_mode, errors) - Robot mode JSON output includes per-project status enrichment details Configuration: - New `sync.fetchWorkItemStatus` config option (defaults true) to disable GraphQL status enrichment on instances without Premium/Ultimate - `LoreError::GitLabAuthFailed` now treated as permanent API error so status enrichment auth failures don't trigger retries Also removes the unnecessary nested SAVEPOINT in store_closes_issues_refs (already runs within the orchestrator's transaction context). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 08:09:21 -05:00
Taylor Eernisse	dc49f5209e	feat(gitlab): add GraphQL client with adaptive pagination and work item status types Introduce a reusable GraphQL client (`src/gitlab/graphql.rs`) that handles GitLab's GraphQL API with full error handling for auth failures, rate limiting, and partial errors. Key capabilities: - Adaptive page sizing (100 → 50 → 25 → 10) to handle GitLab GraphQL complexity limits without hardcoding a single safe page size - Paginated issue status fetching via the workItems GraphQL query - Graceful detection of unsupported instances (missing GraphQL endpoint or forbidden auth) so ingestion continues without status data - Retry-After header parsing via the `httpdate` crate for rate limit compliance Also adds `WorkItemStatus` type to `gitlab::types` with name, category, color, and icon_name fields (all optional except name) with comprehensive deserialization tests covering all system statuses (TO_DO, IN_PROGRESS, DONE, CANCELED) and edge cases (null category, unknown future values). The `GitLabClient` gains a `graphql_client()` factory method for ergonomic access from the ingestion pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 08:08:53 -05:00
Taylor Eernisse	7d40a81512	fix(ingestion): remove nested transaction in upsert_mr_file_changes drain_mr_diffs in orchestrator.rs already wraps each MR diff store in an unchecked_transaction (alongside job completion and watermark update). upsert_mr_file_changes was also starting its own inner transaction via conn.unchecked_transaction(), causing every call to fail with "cannot start a transaction within a transaction". Remove the inner transaction management from upsert_mr_file_changes so it operates on whatever Connection (or Transaction deref'd to Connection) the caller provides. The caller in drain_mr_diffs owns the transaction boundary. Standalone callers (tests, future direct use) auto-commit each statement, which is correct for their use case. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 11:56:15 -05:00
Taylor Eernisse	45126f04a6	fix: document upsert project_id, truncation budget, and Ollama model matching - regenerator: Include project_id in the ON CONFLICT UPDATE clause for document upserts. Previously, if a document moved between projects (e.g., during re-ingestion), the project_id would remain stale. - truncation: Compute the omission marker ("N notes omitted") before checking whether first+last notes fit in the budget. The old order computed the marker after the budget check, meaning the marker's byte cost was unaccounted for and could cause over-budget output. - ollama: Tighten model name matching to require either an exact match or a colon-delimited tag prefix (model == name or name starts with "model:"). The prior starts_with check would false-positive on "nomic-embed-text-v2" when looking for "nomic-embed-text". Tests updated to cover exact match, tagged, wrong model, and prefix false-positive cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 10:16:14 -05:00
Taylor Eernisse	dfa44e5bcd	fix(ingestion): label upsert reliability, init idempotency, and sync health Label upsert (issues + merge_requests): Replace INSERT ... ON CONFLICT DO UPDATE RETURNING with INSERT OR IGNORE + SELECT. The prior RETURNING-based approach relied on last_insert_rowid() matching the returned id, which is not guaranteed when ON CONFLICT triggers an update (SQLite may return 0). The new two-step approach is unambiguous and correctly tracks created_count. Init: Add ON CONFLICT(gitlab_project_id) DO UPDATE to the project insert so re-running `lore init` updates path/branch/url instead of failing with a unique constraint violation. MR discussions sync: Reset discussions_sync_attempts to 0 when clearing a sync health error, so previously-failed MRs get a fresh retry budget after successful sync. Count: format_number now handles negative numbers correctly by extracting the sign before inserting thousand-separators. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 10:15:53 -05:00
Taylor Eernisse	53ef21d653	fix: propagate DB errors instead of silently swallowing them Replace .unwrap_or(), .ok(), and .filter_map(|r| r.ok()) patterns with proper error propagation using ? and rusqlite::OptionalExtension where the query may legitimately return no rows. Affected areas: - events_db::count_events: three count queries now propagate errors instead of defaulting to (0, 0) on failure - note_parser::extract_refs_from_system_notes: row iteration errors are now propagated instead of silently dropped via filter_map - note_parser::noteable_type_to_entity_type: unknown types now log a debug warning before defaulting to "issue" - payloads::store_payload/read_payload: use .optional()? instead of .ok() to distinguish "no row" from "query failed" - backoff::compute_next_attempt_at: use .clamp(0, 30) to guard against negative attempt_count, not just .min(30) - search::vector::max_chunks_per_document: returns Result<i64> with proper error propagation through .optional()?.flatten() - embedding::chunk_ids::decode_rowid: promote debug_assert to assert since negative rowids indicate data corruption worth failing fast on - ingestion::dirty_tracker::record_dirty_error: use .optional()? to handle missing dirty_sources row gracefully instead of hard error Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 10:15:36 -05:00
Taylor Eernisse	41504b4941	feat(who): configurable scoring weights, MR refs, detail mode, and suffix path resolution Expert mode now surfaces the specific MR references (project/path!iid) that contributed to each expert's score, capped at 50 per user. A new --detail flag adds per-MR breakdowns showing role (Author/Reviewer/both), note count, and last activity timestamp. Scoring weights (author_weight, reviewer_weight, note_bonus) are now configurable via the config file's `scoring` section with validation that rejects negative values. Defaults shift to author_weight=25, reviewer_weight=10, note_bonus=1 — better reflecting that code authorship is a stronger expertise signal than review assignment alone. Path resolution gains suffix matching: typing "login.rs" auto-resolves to "src/auth/login.rs" when unambiguous, with clear disambiguation errors when multiple paths match. Project-scoping (-p) narrows the candidate set. The MAX_MR_REFS_PER_USER constant is promoted to module scope for reuse across expert and overlap modes. Human output shows MR refs inline and detail sub-rows when requested. Robot JSON includes mr_refs, mr_refs_total, mr_refs_truncated, and optional details array. Includes comprehensive tests for suffix resolution, scoring weight configurability, MR ref aggregation across projects, and detail mode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 10:15:15 -05:00
Taylor Eernisse	b168a58134	fix(search): cap vector search k-value and add rowid assertion The vector search multiplier could grow unbounded on documents with many chunks, producing enormous k values that cause SQLite to scan far more rows than necessary. Clamp the multiplier to [8, 200] and cap k at 10,000 to prevent degenerate performance on large corpora. Also adds a debug_assert in decode_rowid to catch negative rowids early — these indicate a bug in the encoding pipeline and should fail fast rather than silently produce garbage document IDs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 14:34:05 -05:00
Taylor Eernisse	b704e33188	feat(sync): surface MR diff fetch/fail counters in sync output Adds mr_diffs_fetched and mr_diffs_failed fields to IngestResult and SyncResult, threads them through the orchestrator aggregation, includes them in the structured tracing span and human-readable sync summary. Previously MR diff failures were silently swallowed — now they appear alongside resource event counts for full pipeline observability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 14:33:53 -05:00
Taylor Eernisse	6e82f723c3	fix(ingestion): unify store + watermark + job-complete in single transaction Previously, drain_resource_events, drain_mr_closes_issues, and drain_mr_diffs each opened a transaction only for the job-complete + watermark update, but the store operation ran outside that transaction. If the process crashed between the store and the watermark update, data would be persisted without the watermark advancing, causing silent duplicates on the next sync. Now each drain function opens the transaction before the store call and commits it only after both the store and the watermark update succeed. On error, the transaction is explicitly dropped so the connection is not left in a half-committed state. Also: - store_resource_events no longer manages its own transaction; the caller passes in a connection (which is actually the transaction) - upsert_mr_file_changes wraps DELETE + INSERT in a transaction internally - reset_discussion_watermarks now also clears diffs_synced_for_updated_at - Orchestrator error span now includes closes_issues_failed + mr_diffs_failed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 14:33:47 -05:00
Taylor Eernisse	940a96375a	refactor(search): rename --after/--updated-after to --since/--updated-since The --since naming is more intuitive (matches git log --since) and consistent with the list commands which already use --since. Renames the CLI flags, SearchCliFilters fields, SearchFilters fields, autocorrect registry, and robot-docs manifest. No behavioral change. Affected paths: - cli/mod.rs: SearchArgs field + clap attribute rename - cli/commands/search.rs: SearchCliFilters + run_search plumbing - search/filters.rs: SearchFilters struct + apply_filters logic - main.rs: handle_search + robot-docs JSON - cli/autocorrect.rs: COMMAND_FLAGS entry for search Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 14:33:24 -05:00
Taylor Eernisse	c54a969269	fix(who): exclude self-assigned reviewers from file-change reviewer signal Signal 4 (mr_reviewers + mr_file_changes) was missing the self-review exclusion that signal 1 (DiffNote reviewer) already had. An MR author listed as their own reviewer would be double-counted as both author and reviewer, inflating their score. Also removes redundant SELECT DISTINCT from signal 2 (GROUP BY already ensures uniqueness). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 13:42:40 -05:00
Taylor Eernisse	95b7183add	feat(who): expand expert + overlap queries with mr_file_changes and mr_reviewers Chain: bd-jec (config flag) -> bd-2yo (fetch MR diffs) -> bd-3qn6 (rewrite who queries) - Add fetch_mr_file_changes config option and --no-file-changes CLI flag - Add GitLab MR diffs API fetch pipeline with watermark-based sync - Create migration 020 for diffs_synced_for_updated_at watermark column - Rewrite query_expert() and query_overlap() to use 4-signal UNION ALL: DiffNote reviewers, DiffNote MR authors, file-change authors, file-change reviewers - Deduplicate across signal types via COUNT(DISTINCT CASE WHEN ... THEN mr_id END) - Add insert_file_change test helper, 8 new who tests, all 397 tests pass - Also includes: list performance migration 019, autocorrect module, README updates Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 13:35:14 -05:00
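
The Ollama model-name rule described in commit 45126f04a6 (exact match, or the model name followed by a colon-delimited tag) can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `model_matches` and its signature are assumptions made for the example.

```rust
/// Hypothetical helper: returns true when `name` (as reported by the server)
/// refers to `model`, either exactly or as `model:<tag>`.
fn model_matches(model: &str, name: &str) -> bool {
    // Exact match, e.g. "nomic-embed-text" == "nomic-embed-text".
    if name == model {
        return true;
    }
    // Tagged match: the remainder after the model name must begin with ':'.
    // A plain starts_with(model) check would wrongly accept
    // "nomic-embed-text-v2" when looking for "nomic-embed-text".
    name.strip_prefix(model)
        .map_or(false, |rest| rest.starts_with(':'))
}
```

With this rule, `model_matches("nomic-embed-text", "nomic-embed-text:latest")` holds, while `model_matches("nomic-embed-text", "nomic-embed-text-v2")` does not.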
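
Commit dfa44e5bcd notes that format_number handles negative numbers by extracting the sign before inserting thousand-separators. A minimal sketch of that idea follows. The `i64` argument type and the comma separator are assumptions; the log does not show the actual signature or separator character.

```rust
/// Sketch of a thousand-separator formatter that treats the sign separately.
/// The comma separator and i64 input are assumptions for illustration.
fn format_number(n: i64) -> String {
    // Extract the sign first so the grouping logic only ever sees digits.
    let sign = if n < 0 { "-" } else { "" };
    // unsigned_abs avoids overflow on i64::MIN.
    let digits = n.unsigned_abs().to_string();

    let mut out = String::with_capacity(digits.len() + digits.len() / 3 + 1);
    out.push_str(sign);
    for (i, ch) in digits.chars().enumerate() {
        // Insert a separator before each group of three digits,
        // but never at the very start of the digit string.
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}
```

Grouping only the digit string keeps a leading minus sign from being counted as a digit, which would otherwise misplace the separators for some negative inputs.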
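
The adaptive page sizing from commit dc49f5209e (100 → 50 → 25 → 10) can be illustrated with a small fallback loop. This is a sketch under stated assumptions: `FetchError`, `fetch_with_adaptive_page_size`, and the closure-based `fetch` parameter are hypothetical names, and the real client may decide when to shrink the page size differently.

```rust
/// Page sizes to try, largest first, as listed in the commit message.
const PAGE_SIZES: [u32; 4] = [100, 50, 25, 10];

/// Hypothetical error type: only `TooComplex` triggers a smaller page size.
#[derive(Debug, PartialEq)]
enum FetchError {
    TooComplex,
    Other(String),
}

/// Calls `fetch` with progressively smaller page sizes until one succeeds.
/// Returns the fetched page together with the page size that worked.
fn fetch_with_adaptive_page_size<T>(
    mut fetch: impl FnMut(u32) -> Result<T, FetchError>,
) -> Result<(T, u32), FetchError> {
    for &size in PAGE_SIZES.iter() {
        match fetch(size) {
            Ok(page) => return Ok((page, size)),
            // Query too complex at this size: retry with the next smaller one.
            Err(FetchError::TooComplex) => continue,
            // Any other failure (auth, network, ...) is returned immediately.
            Err(other) => return Err(other),
        }
    }
    // Even the smallest page size was rejected.
    Err(FetchError::TooComplex)
}
```

Stepping down through a fixed ladder of sizes avoids hardcoding a single conservative page size, while still bounding the number of retries per page to four.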


197 Commits