gitlore

Author	SHA1	Message	Date
teernisse	e46a2fe590	test(core): add lookup-by-gitlab_project_id test for projects table Validates that the projects table schema uses gitlab_project_id (not gitlab_id) and that queries filtering by this column return the correct project. Uses the test helper convention where insert_project sets gitlab_project_id = id * 100.	2026-03-12 10:08:22 -04:00
teernisse	06889ec85a	fix(explain): address review findings — N+1 queries, duplicate decisions, silent errors 1. fetch_open_threads: replace N+1 loop (2 queries per thread) with a single query using correlated subqueries for note_count and started_by. 2. extract_key_decisions: track consumed notes so the same note is not matched to multiple events, preventing duplicate decision entries. 3. build_timeline_excerpt_from_pipeline: log tracing::warn on seed/collect failures instead of silently returning empty timeline.	2026-03-10 16:43:06 -04:00
teernisse	e8d6c5b15f	feat(runtime): replace tokio+reqwest with asupersync async runtime - Add HTTP adapter layer (src/http.rs) wrapping asupersync h1 client - Migrate gitlab client, graphql, and ollama to HTTP adapter - Swap entrypoint from #[tokio::main] to RuntimeBuilder::new().block_on() - Rewrite signal handler for asupersync (RuntimeHandle::spawn + ctrl_c()) - Migrate rate limiter sleeps to asupersync::time::sleep(wall_now(), d) - Add asupersync-native HTTP integration tests - Convert timeline_seed_tests to RuntimeBuilder pattern Phases 1-3 of asupersync migration (atomic: code won't compile without all pieces).	2026-03-06 15:57:20 -05:00
teernisse	bf977eca1a	refactor(structure): reorganize codebase into domain-focused modules	2026-03-06 15:24:09 -05:00
teernisse	4d41d74ea7	refactor(deps): replace tokio Mutex/join!, add NetworkErrorKind enum, remove reqwest from error types	2026-03-06 15:22:42 -05:00
teernisse	3a4fc96558	refactor(shutdown): extract 4 identical Ctrl+C handlers into core/shutdown.rs	2026-03-06 15:22:37 -05:00
teernisse	a45c37c7e4	feat(timeline): add entity-direct seeding and round-robin evidence selection Enhance the timeline command with two major improvements: 1. Entity-direct seeding syntax (bypass search): lore timeline issue:42 # Timeline for specific issue lore timeline i:42 # Short form lore timeline mr:99 # Timeline for specific MR lore timeline m:99 # Short form This directly resolves the entity and gathers ALL its discussions without requiring search/embedding. Useful when you know exactly which entity you want. 2. Round-robin evidence note selection: Previously, evidence notes were taken in FTS rank order, which could result in all notes coming from a single high-traffic discussion. Now we: - Fetch 5x the requested limit (or minimum 50) - Group notes by discussion_id - Select round-robin across discussions - This ensures diverse evidence from multiple conversations API changes: - Renamed total_events_before_limit -> total_filtered_events (clearer semantics) - Added resolve_entity_by_iid() in timeline.rs for IID-based entity resolution - Added seed_timeline_direct() in timeline_seed.rs for search-free seeding - Added round_robin_select_by_discussion() helper function The entity-direct mode uses search_mode: "direct" to distinguish from "hybrid" or "lexical" search modes in the response metadata. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-26 11:06:23 -05:00
teernisse	7fdeafa330	feat(db): add migration 028 for discussions.merge_request_id FK constraint Add foreign key constraint on discussions.merge_request_id to prevent orphaned discussions when MRs are deleted. SQLite doesn't support ALTER TABLE ADD CONSTRAINT, so this migration recreates the table with: 1. New table with FK: REFERENCES merge_requests(id) ON DELETE CASCADE 2. Data copy with FK validation (only copies rows with valid MR references) 3. Table swap (DROP old, RENAME new) 4. Full index recreation (all 10 indexes from migrations 002-022) The migration also includes a CHECK constraint ensuring mutual exclusivity: - Issue discussions have issue_id NOT NULL and merge_request_id NULL - MR discussions have merge_request_id NOT NULL and issue_id NULL Also fixes run_migrations() to properly propagate query errors instead of silently returning unwrap_or defaults, improving error diagnostics. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-26 11:06:01 -05:00
teernisse	eac640225f	feat(core): add cursor persistence module for session-based timestamps Introduces a lightweight file-based cursor system for persisting per-user timestamps across CLI invocations. This enables "since last check" semantics where `lore me` can track what the user has seen. Key design decisions: - Per-user cursor files: ~/.local/share/lore/me_cursor_<username>.json - Atomic writes via temp-file + rename pattern (crash-safe) - Graceful degradation: missing/corrupt files return None - Username sanitization: non-safe chars replaced with underscore The cursor module provides three operations: - read_cursor(username) -> Option<i64>: read last-check timestamp - write_cursor(username, timestamp_ms): atomically persist timestamp - reset_cursor(username): delete cursor file (no-op if missing) Tests cover: missing file, roundtrip, per-user isolation, reset isolation, JSON validity after overwrites, corrupt file handling. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-25 10:02:13 -05:00
teernisse	f9e7913232	fix(error): replace misleading Database error suggestions The Database(rusqlite::Error) catch-all variant was suggesting 'lore reset --yes' for ALL database errors, including transient SQLITE_BUSY lock contention. This was wrong on two counts: 1. `lore reset` is not implemented (prints "not yet implemented") 2. Nuking the database is not the fix for a transient lock Changes: - Detect SQLITE_BUSY specifically via sqlite_error_code() and provide targeted advice: "Another process has the database locked" with common causes (cron sync, concurrent lore command) - Map SQLITE_BUSY to ErrorCode::DatabaseLocked (exit code 9) instead of DatabaseError (exit code 10) — semantically correct - Set BUSY actions to ["lore cron status"] (diagnostic) instead of the useless "lore sync --force" (--force overrides the app-level lock table, but SQLITE_BUSY fires before that table is even reached) - Fix MigrationFailed suggestion: also referenced non-existent 'lore reset', now says "try again" with lore migrate / lore doctor - Non-BUSY database errors get a simpler suggestion pointing to lore doctor (no more phantom reset command) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 10:36:16 -05:00
teernisse	9c1a9bfe5d	feat(me): add lore me personal work dashboard command Implement a personal work dashboard that shows everything relevant to the configured GitLab user: open issues assigned to them, MRs they authored, MRs they are reviewing, and a chronological activity feed. Design decisions: - Attention state computed from GitLab interaction data (comments, reviews) with no local state tracking -- purely derived from existing synced data - Username resolution: --user flag > config.gitlab.username > actionable error - Project scoping: --project (fuzzy) | --all | default_project | all - Section filtering: --issues, --mrs, --activity (combinable, default = all) - Activity feed controlled by --since (default 30d); work item sections always show all open items regardless of --since Architecture (src/cli/commands/me/): - types.rs: MeDashboard, MeSummary, AttentionState data types - queries.rs: 4 SQL queries (open_issues, authored_mrs, reviewing_mrs, activity) using existing issue_assignees, mr_reviewers, notes tables - render_human.rs: colored terminal output with attention state indicators - render_robot.rs: {ok, data, meta} JSON envelope with field selection - mod.rs: orchestration (resolve_username, resolve_project_scope, run_me) - me_tests.rs: comprehensive unit tests covering all query paths Config additions: - New optional gitlab.username field in config.json - Tests for config with/without username - Existing test configs updated with username: None CLI wiring: - MeArgs struct with section filter, since, project, all, user, fields flags - Autocorrect support for me command flags - LoreRenderer::try_get() for safe renderer access in me module - Robot mode field selection presets (me_items, me_activity) - handle_me() in main.rs command dispatch Also fixes duplicate assertions in surgical sync tests (removed 6 duplicate assert! lines that were copy-paste artifacts). Spec: docs/lore-me-spec.md	2026-02-20 14:31:57 -05:00
teernisse	9ec1344945	feat(surgical-sync): add per-IID surgical sync pipeline with preflight validation Add the ability to sync specific issues or merge requests by IID without running a full incremental sync. This enables fast, targeted data refresh for individual entities — useful for agent workflows, debugging, and real-time investigation of specific issues or MRs. Architecture: - New CLI flags: --issue <IID> and --mr <IID> (repeatable, up to 100 total) scoped to a single project via -p/--project - Preflight phase validates all IIDs exist on GitLab before any DB writes, with TOCTOU-aware soft verification at ingest time - 6-stage pipeline: preflight -> fetch -> ingest -> dependents -> docs -> embed - Each stage is cancellation-aware via ShutdownSignal - Dedicated SyncRunRecorder extensions track surgical-specific counters (issues_fetched, mrs_ingested, docs_regenerated, etc.) New modules: - src/ingestion/surgical.rs: Core surgical fetch/ingest/dependent logic with preflight_fetch(), ingest_issue_by_iid(), ingest_mr_by_iid(), and fetch_dependents_for_{issue,mr}() - src/cli/commands/sync_surgical.rs: Full CLI orchestrator with progress spinners, human/robot output, and cancellation handling - src/embedding/pipeline.rs: embed_documents_by_ids() for scoped embedding - src/documents/regenerator.rs: regenerate_dirty_documents_for_sources() for scoped document regeneration Database changes: - Migration 027: Extends sync_runs with mode, phase, surgical_iids_json, per-entity counters, and cancelled_at column - New indexes: idx_sync_runs_mode_started, idx_sync_runs_status_phase_started GitLab client: - get_issue_by_iid() and get_mr_by_iid() single-entity fetch methods Error handling: - New SurgicalPreflightFailed error variant with entity_type, iid, project, and reason fields. Shares exit code 6 with GitLabNotFound. Includes comprehensive test coverage: - 645 lines of surgical ingestion tests (wiremock-based) - 184 lines of scoped embedding tests - 85 lines of scoped regeneration tests - 113 lines of GitLab client single-entity tests - 236 lines of sync_run surgical column/counter tests - Unit tests for SyncOptions, error codes, and CLI validation	2026-02-18 16:28:21 -05:00
teernisse	30ed02c694	feat(token): add stored token support with resolve_token and token_source Introduce a centralized token resolution system that supports both environment variables and config-file-stored tokens with clear priority (env var wins). This enables cron-based sync which runs in minimal shell environments without env vars. Core changes: - GitLabConfig gains optional `token` field and `resolve_token()` method that checks env var first, then config file, returning trimmed values - `token_source()` returns human-readable provenance ("environment variable" or "config file") for diagnostics - `ensure_config_permissions()` enforces 0600 on config files containing tokens (Unix only, no-op on other platforms) New CLI commands: - `lore token set [--token VALUE]` — validates against GitLab API, stores in config, enforces file permissions. Supports flag, stdin pipe, or interactive entry. - `lore token show [--unmask]` — displays masked token with source label Consumers updated to use resolve_token(): - auth_test: removes manual env var lookup - doctor: shows token source in health check output - ingest: uses centralized resolution Includes 10 unit tests for resolve/source logic and 2 for mask_token.	2026-02-18 16:27:48 -05:00
teernisse	53ce20595b	feat(cron): add lore cron command for automated sync scheduling Add lore cron {install,uninstall,status} to manage a crontab entry that runs lore sync on a configurable interval. Supports both human and robot output modes. Core implementation (src/core/cron.rs): - install_cron: appends a tagged crontab entry, detects existing entries - uninstall_cron: removes the tagged entry - cron_status: reads crontab + checks last-sync time from the database - Unix-only (#[cfg(unix)]) — compiles out on Windows CLI wiring: - CronAction enum and CronArgs in cli/mod.rs with after_help examples - Robot JSON envelope with RobotMeta timing for all 3 sub-actions - Dispatch in main.rs Also in this commit: - Add after_help example blocks to Status, Auth, Doctor, Init, Migrate, Health commands for better discoverability - Add LORE_ICONS env var documentation to CLI help text - Simplify notes format dispatch in main.rs (removed csv/jsonl paths) - Update commands/mod.rs re-exports for cron + notes cleanup Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 13:29:20 -05:00
teernisse	8442bcf367	feat(trace,file-history): add tracing instrumentation and diagnostic hints Add structured tracing spans to trace and file-history pipelines so debug logging (-vv) shows path resolution counts, MR match counts, and discussion counts at each stage. This makes empty-result debugging straightforward. Add a hints field to TraceResult and FileHistoryResult that carries machine-readable diagnostic strings explaining why results may be empty (e.g., "Run 'lore sync' to fetch MR file changes"). The CLI renders these as info lines; robot mode includes them in JSON when non-empty. Also: fix filter_map(Result::ok) → collect::<Result> in trace.rs (same pattern fixed in prior commit for file_history/path_resolver), and switch conn.prepare → conn.prepare_cached for the MR query. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 13:28:47 -05:00
teernisse	c0ca501662	fix: replace silent error swallowing with proper error propagation Replace .filter_map(Result::ok).collect() with .collect::<Result<Vec<_>,_>>()? in rename chain resolution and suffix probe queries. The old pattern silently discarded database errors, making failures invisible. Now any rusqlite error propagates to the caller immediately. Affected: resolve_rename_chain (2 queries) and resolve_ambiguity (1 query). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 13:28:37 -05:00
teernisse	714c8c2623	feat(path): rename-aware ambiguity resolution for suffix probe When a bare filename like 'operators.ts' matches multiple full paths, check if they are the same file connected by renames (via BFS on mr_file_changes). If so, auto-resolve to the newest path instead of erroring. Also wires path resolution into file-history and trace commands so bare filenames work everywhere. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:34:28 -05:00
teernisse	171260a772	feat(cli): implement 'lore trace' command (bd-2n4, bd-9dd) Gate 5 Code Trace - Tier 1 (API-only, no git blame). Answers 'Why was this code introduced?' by building file -> MR -> issue -> discussion chains. New files: - src/core/trace.rs: run_trace() query logic with rename-aware path resolution, entity_reference-based issue linking, and DiffNote discussion extraction - src/core/trace_tests.rs: 7 unit tests for query logic - src/cli/commands/trace.rs: CLI command with human output, robot JSON output, and :line suffix parsing (5 tests) Human output shows full content (no truncation). Robot JSON truncates discussion bodies to 500 chars for token efficiency. Wiring: - TraceArgs + Commands::Trace in cli/mod.rs - handle_trace in main.rs - VALID_COMMANDS + robot-docs manifest entry - COMMAND_FLAGS autocorrect registry entry Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-17 14:57:21 -05:00
teernisse	eef73decb5	fix(cli): timeline tag width, test env isolation, and logging verbosity Miscellaneous fixes across CLI and core modules: - Timeline: widen TAG_WIDTH from 10 to 11 to accommodate longer event type labels without truncation - render.rs: save and restore LORE_ICONS env var in glyph_mode test to prevent interference from the test environment leaking into or from other tests that set LORE_ICONS - logging.rs: adjust verbose=1 to info level (was debug), verbose=2 to debug — this reduces noise at -v while keeping -vv as the full debug experience - issues.rs, merge_requests.rs: use infodebug! macro consistently for ingestion summary logging Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-14 11:25:42 -05:00
Taylor Eernisse	a7f86b26e4	refactor(core): compact human log format, quieter lock lifecycle, nonzero_summary helper Three quality-of-life improvements to reduce log noise and improve readability: 1. logging.rs: Add CompactHumanFormat for stderr tracing output. Replaces the default format with a minimal 'HH:MM:SS LEVEL message key=value' layout — no span context, no full timestamps, no target module. The JSON file log layer is unaffected. This makes watching 'lore sync' output much cleaner. 2. lock.rs: Downgrade AppLock acquire/release messages from info! to debug!. Lock lifecycle events (acquired new, acquired existing, released) are operational bookkeeping that clutters normal output. They remain visible at -vv verbosity for troubleshooting. 3. ingestion/mod.rs: Add nonzero_summary() utility that formats named counters as a compact middle-dot-separated string, suppressing zero values. Produces output like '42 fetched · 3 labels · 12 notes' instead of verbose key=value structured fields. Returns 'nothing to update' when all values are zero. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 22:31:30 -05:00
teernisse	0aecbf33c0	feat(xref): extract cross-references from descriptions, user notes, and fix system note regex - Fix MENTIONED_RE/CLOSED_BY_RE to match real GitLab format ('mentioned in issue #N' / 'mentioned in merge request !N') - Add GITLAB_URL_RE + parse_url_refs() for full URL extraction - Add extract_refs_from_descriptions() -> source_method='description_parse' - Add extract_refs_from_user_notes() -> source_method='note_parse' - Wire both into orchestrator after system note extraction - 36 tests: regex fix, URL parsing, integration, idempotency	2026-02-13 17:19:36 -05:00
teernisse	c10471ddb9	feat(timeline): add entity-direct seeding (issue:N, mr:N syntax) Adds issue:N / i:N / mr:N / m:N query syntax to bypass hybrid search and seed the timeline directly from a known entity. All discussions for the entity are gathered without needing Ollama. - parse_timeline_query() detects entity-direct patterns - resolve_entity_by_iid() resolves IID to EntityRef with ambiguity handling - seed_timeline_direct() gathers all discussions for the entity - 20 new tests (5 resolve, 6 direct seed, 9 parse) - Updated CLI help text and robot-docs manifest	2026-02-13 15:22:45 -05:00
teernisse	94435c37f0	perf(timeline): hoist prepared statement outside discussion thread loop Moves the conn.prepare() call for fetching discussion notes outside the per-discussion loop in collect_discussion_threads(). The SQL is identical for every iteration, so preparing it once and rebinding parameters avoids redundant statement compilation on each matched discussion. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:56:40 -05:00
teernisse	2da1a228b3	feat(timeline): collect and render full discussion threads Implements the downstream consumption of matched discussions from the seed phase, completing the discussion thread feature across collect, CLI, and integration tests. Collect phase (timeline_collect.rs): - New collect_discussion_threads() function assembles full threads by querying notes for each matched discussion_id, filtering out system notes (is_system = 0), ordering chronologically, and capping at THREAD_MAX_NOTES with a synthetic "[N more notes not shown]" summary note - build_entity_lookup() creates a (type, id) -> (iid, path) map from seed and expanded entities to provide display metadata for thread events - Thread timestamp is set to the first note's created_at for correct chronological interleaving with other timeline events - collect_events() gains a matched_discussions parameter; threads are collected after entity events and before evidence note merging CLI rendering (cli/commands/timeline.rs): - Human mode: threads render with box-drawing borders, bold @author tags, date-stamped notes, and word-wrapped bodies (60 char width) - Robot mode: DiscussionThread serializes as discussion_thread kind with note_count, full notes array (note_id, author, body, ISO created_at) - THREAD tag in yellow for human event tag styling - TimelineMeta gains discussion_threads_included count Tests: - 8 new collect tests: basic thread assembly, system note filtering, empty thread skipping, body truncation to THREAD_NOTE_MAX_CHARS, note cap with synthetic summary, timestamp from first note, chronological sort position, and deduplication of duplicate discussion_ids - Integration tests updated for new collect_events signature Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:18:36 -05:00
teernisse	0e65202778	feat(timeline): add DiscussionThread types and seed-phase discussion matching Introduces the foundation for full discussion thread support in the timeline pipeline. Adds three new domain types to timeline.rs: - ThreadNote: individual note within a thread (id, author, body, timestamp) - MatchedDiscussion: tracks discussions matched during seeding with their parent entity (issue or MR) for downstream collection - DiscussionThread variant on TimelineEventType: carries a full thread of notes, sorted between NoteEvidence and CrossReferenced Moves truncate_to_chars() from timeline_seed.rs to timeline.rs as pub(crate) for reuse by the collect phase. Adds THREAD_NOTE_MAX_CHARS (2000) and THREAD_MAX_NOTES (50) constants. Upgrades the seed SQL in resolve_documents_to_entities() to resolve note documents to their parent discussion via an additional LEFT JOIN chain (notes -> discussions), using COALESCE to unify the entity resolution path for both discussion and note source types. SeedResult gains a matched_discussions field that captures deduplicated discussion matches. Tests cover: discussion matching from discussion docs, note-to-parent resolution, deduplication of same discussion across multiple docs, and correct parent entity type (issue vs MR). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:18:18 -05:00
teernisse	4f3ec72923	feat(timeline): upgrade seed phase to hybrid search Replace FTS-only seed entity discovery with hybrid search (FTS + vector via RRF), using the same search_hybrid infrastructure as the search command. Falls back gracefully to FTS-only when Ollama is unavailable. Changes: - seed_timeline() now accepts OllamaClient, delegates to search_hybrid - New resolve_documents_to_entities() replaces find_seed_entities() - SeedResult gains search_mode field tracking actual mode used - TimelineResult carries search_mode through to JSON renderer - run_timeline wires up OllamaClient from config - handle_timeline made async for the hybrid search await - Tests updated for new function signatures Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 13:50:24 -05:00
teernisse	e6771709f1	refactor(core): extract path_resolver module, fix old_path matching in who Extract shared path resolution logic from who.rs into a new core::path_resolver module for cross-module reuse. Functions moved: escape_like, normalize_repo_path, PathQuery, SuffixResult, build_path_query, suffix_probe. Duplicate escape_like copies removed from list.rs, project.rs, and filters.rs — all now import from path_resolver. Additionally fixes two bugs in query_expert_details() and query_overlap() where only position_new_path was checked (missing old_path matches for renamed files) and state filter excluded 'closed' MRs despite the main scoring query including them with a decay multiplier. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 13:50:14 -05:00
Taylor Eernisse	48fbd4bfdb	feat(core): add file rename chain resolver with depth-bounded BFS New module: core::file_history with resolve_rename_chain() that traces a file path through its rename history in mr_file_changes using bidirectional BFS (forward: old_path->new_path, backward: new_path->old_path). Key design decisions: - Depth-bounded BFS: each queue entry carries its distance from the origin, so max_hops correctly limits by graph distance (not by total nodes discovered). This matters for branching rename graphs where a file was renamed differently in parallel MRs. - Cycle-safe: visited set prevents infinite loops from circular renames. - Project-scoped: queries are always scoped to a single project_id. - Deterministic: output is sorted for stable results. Tests cover: linear chains (forward/backward), cycles, max_hops=0, depth-bounded linear chains, branching renames, diamond patterns, and cross-project isolation (9 tests total). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 10:54:41 -05:00
Taylor Eernisse	9786ef27f5	refactor(core/time): extract parse_since_from for deterministic time parsing Factor out parse_since_from(input, reference_ms) so callers can compute relative durations against a fixed reference timestamp instead of always using now(). The existing parse_since() now delegates to it with now_ms(). Enables testable and reproducible time-relative queries for features like timeline --as-of and who --as-of. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 10:54:20 -05:00
Taylor Eernisse	7e0e6a91f2	refactor: extract unit tests into separate _tests.rs files Move inline #[cfg(test)] mod tests { ... } blocks from 22 source files into dedicated _tests.rs companion files, wired via: #[cfg(test)] #[path = "module_tests.rs"] mod tests; This keeps implementation-focused source files leaner and more scannable while preserving full access to private items through `use super::*;`. Modules extracted: core: db, note_parser, payloads, project, references, sync_run, timeline_collect, timeline_expand, timeline_seed cli: list (55 tests), who (75 tests) documents: extractor (43 tests), regenerator embedding: change_detector, chunking gitlab: graphql (wiremock async tests), transformers/issue ingestion: dirty_tracker, discussions, issues, mr_diffs Also adds conflicts_with("explain_score") to the --detail flag in the who command to prevent mutually exclusive flags from being combined. All 629 unit tests pass. No behavior changes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 10:54:02 -05:00
teernisse	94c8613420	feat(bd-226s): implement time-decay expert scoring model Replace flat-weight expertise scoring with exponential half-life decay, split reviewer signals (participated vs assigned-only), dual-path rename awareness, and new CLI flags (--as-of, --explain-score, --include-bots, --all-history). Changes: - ScoringConfig: 8 new fields with validation (config.rs) - half_life_decay() and normalize_query_path() pure functions (who.rs) - CTE-based SQL with dual-path matching, mr_activity, reviewer_participation (who.rs) - Rust-side decay aggregation with deterministic f64 ordering (who.rs) - Path resolution probes check old_path columns (who.rs) - Migration 026: 5 new indexes for dual-path and reviewer participation - Default --since changed from 6m to 24m - 31 new tests (example-based + invariant), 621 total who tests passing - Autocorrect registry updated with new flags Closes: bd-226s, bd-2w1p, bd-1soz, bd-18dn, bd-2ao4, bd-2yu5, bd-1b50, bd-1hoq, bd-1h3f, bd-13q8, bd-11mg, bd-1vti, bd-1j5o	2026-02-12 15:44:55 -05:00
teernisse	83cd16c918	feat: implement per-note search and document pipeline - Add SourceType::Note with extract_note_document() and ParentMetadataCache - Migration 022: composite indexes for notes queries + author_id column - Migration 024: table rebuild adding 'note' to CHECK constraints, defense triggers - Migration 025: backfill existing non-system notes into dirty queue - Add lore notes CLI command with 17 filter options (author, path, resolution, etc.) - Support table/json/jsonl/csv output formats with field selection - Wire note dirty tracking through discussion and MR discussion ingestion - Fix test_migration_024_preserves_existing_data off-by-one (tested wrong migration) - Fix upsert_document_inner returning false for label/path-only changes	2026-02-12 13:31:24 -05:00
teernisse	b29c382583	feat(bd-2g50): fill data gaps in issue detail view Add references_full, user_notes_count, merge_requests_count computed fields to show issue. Add closed_at and confidential columns via migration 023. Closes: bd-2g50	2026-02-12 11:59:44 -05:00
teernisse	d9c9f6e541	fix: escape LIKE metacharacters in project resolver User-supplied project names containing `%` or `_` were passed directly into LIKE patterns, causing unintended wildcard matching. For example, `my_project` would match `my-project` because `_` is a single-char wildcard in SQL LIKE. Added escape_like() helper that escapes `\`, `%`, and `_` with backslash, and added ESCAPE '\' clauses to both the suffix-match and substring-match queries in resolve_project(). Includes two regression tests: - test_underscore_not_wildcard: `_` in input must not match `-` - test_percent_not_wildcard: `%` in input must not match arbitrary strings	2026-02-12 11:21:09 -05:00
teernisse	6ea3108a20	feat(config): add defaultProject with validation and cascading resolver Introduces a new optional `defaultProject` field on Config (and MinimalConfig for init output) that acts as a fallback when the `-p`/`--project` CLI flag is omitted. Domain-layer changes: - Config.default_project: Option<String> with camelCase serde rename - Config::load validates that defaultProject matches a configured project path (exact or case-insensitive suffix match), returning ConfigInvalid on mismatch - Config::effective_project(cli_flag) -> Option<&str>: cascading resolver that prefers the CLI flag, then the config default, then None - MinimalConfig.default_project with skip_serializing_if for clean JSON output when unset Tests added: - effective_project: CLI overrides default, falls back to default, returns None when both absent - Config::load: accepts valid defaultProject, rejects nonexistent, accepts suffix match - MinimalConfig: omits null defaultProject, includes when set - Helper write_config_with_default_project for parameterized tests Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 15:09:33 -05:00
Taylor Eernisse	70271c14d6	fix(core): ensure migration framework records schema version automatically The migration runner now inserts (OR REPLACE) the schema_version row after each successful migration batch, regardless of whether the migration SQL itself contains a self-registering INSERT. This prevents version tracking gaps when a .sql migration omits the bookkeeping statement, which would leave the schema at an unrecorded version and cause re-execution attempts on next startup. Legacy migrations that already self-register are unaffected thanks to the OR REPLACE conflict resolution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 10:21:49 -05:00
Taylor Eernisse	6b75697638	feat(ingestion): enrich issues with work item status from GraphQL API Add a "Phase 1.5" status enrichment step to the issue ingestion pipeline that fetches work item statuses via the GitLab GraphQL API after the standard REST API ingestion completes. Schema changes (migration 021): - Add status_name, status_category, status_color, status_icon_name, and status_synced_at columns to the issues table (all nullable) Ingestion pipeline changes: - New `enrich_issue_statuses_txn()` function that applies fetched statuses in a single transaction with two phases: clear stale statuses for issues that no longer have a status widget, then apply new/updated statuses from the GraphQL response - ProgressEvent variants for status enrichment (complete/skipped) - IngestProjectResult tracks enrichment metrics (seen, enriched, cleared, without_widget, partial_error_count, enrichment_mode, errors) - Robot mode JSON output includes per-project status enrichment details Configuration: - New `sync.fetchWorkItemStatus` config option (defaults true) to disable GraphQL status enrichment on instances without Premium/Ultimate - `LoreError::GitLabAuthFailed` now treated as permanent API error so status enrichment auth failures don't trigger retries Also removes the unnecessary nested SAVEPOINT in store_closes_issues_refs (already runs within the orchestrator's transaction context). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 08:09:21 -05:00
Taylor Eernisse	53ef21d653	fix: propagate DB errors instead of silently swallowing them Replace .unwrap_or(), .ok(), and .filter_map(|r| r.ok()) patterns with proper error propagation using ? and rusqlite::OptionalExtension where the query may legitimately return no rows. Affected areas: - events_db::count_events: three count queries now propagate errors instead of defaulting to (0, 0) on failure - note_parser::extract_refs_from_system_notes: row iteration errors are now propagated instead of silently dropped via filter_map - note_parser::noteable_type_to_entity_type: unknown types now log a debug warning before defaulting to "issue" - payloads::store_payload/read_payload: use .optional()? instead of .ok() to distinguish "no row" from "query failed" - backoff::compute_next_attempt_at: use .clamp(0, 30) to guard against negative attempt_count, not just .min(30) - search::vector::max_chunks_per_document: returns Result<i64> with proper error propagation through .optional()?.flatten() - embedding::chunk_ids::decode_rowid: promote debug_assert to assert since negative rowids indicate data corruption worth failing fast on - ingestion::dirty_tracker::record_dirty_error: use .optional()? to handle missing dirty_sources row gracefully instead of hard error Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 10:15:36 -05:00
Taylor Eernisse	41504b4941	feat(who): configurable scoring weights, MR refs, detail mode, and suffix path resolution Expert mode now surfaces the specific MR references (project/path!iid) that contributed to each expert's score, capped at 50 per user. A new --detail flag adds per-MR breakdowns showing role (Author/Reviewer/both), note count, and last activity timestamp. Scoring weights (author_weight, reviewer_weight, note_bonus) are now configurable via the config file's `scoring` section with validation that rejects negative values. Defaults shift to author_weight=25, reviewer_weight=10, note_bonus=1 — better reflecting that code authorship is a stronger expertise signal than review assignment alone. Path resolution gains suffix matching: typing "login.rs" auto-resolves to "src/auth/login.rs" when unambiguous, with clear disambiguation errors when multiple paths match. Project-scoping (-p) narrows the candidate set. The MAX_MR_REFS_PER_USER constant is promoted to module scope for reuse across expert and overlap modes. Human output shows MR refs inline and detail sub-rows when requested. Robot JSON includes mr_refs, mr_refs_total, mr_refs_truncated, and optional details array. Includes comprehensive tests for suffix resolution, scoring weight configurability, MR ref aggregation across projects, and detail mode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 10:15:15 -05:00
Taylor Eernisse	95b7183add	feat(who): expand expert + overlap queries with mr_file_changes and mr_reviewers Chain: bd-jec (config flag) -> bd-2yo (fetch MR diffs) -> bd-3qn6 (rewrite who queries) - Add fetch_mr_file_changes config option and --no-file-changes CLI flag - Add GitLab MR diffs API fetch pipeline with watermark-based sync - Create migration 020 for diffs_synced_for_updated_at watermark column - Rewrite query_expert() and query_overlap() to use 4-signal UNION ALL: DiffNote reviewers, DiffNote MR authors, file-change authors, file-change reviewers - Deduplicate across signal types via COUNT(DISTINCT CASE WHEN ... THEN mr_id END) - Add insert_file_change test helper, 8 new who tests, all 397 tests pass - Also includes: list performance migration 019, autocorrect module, README updates Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 13:35:14 -05:00
Taylor Eernisse	435a208c93	perf: eliminate unnecessary clones and pre-allocate collections Three micro-optimizations with zero behavioral change: 1. timeline_collect.rs: Reorder format!() before enum construction so the owned String moves into the variant directly, eliminating .clone() on state, label, and milestone strings in StateChanged, LabelAdded/Removed, and MilestoneSet/Removed event paths. 2. pipeline.rs: Use Arc<str> for doc_hash shared across a document's chunks instead of cloning the full String per chunk. Also remove redundant embed_buf.reserve() since extend_from_slice already handles growth and the buffer is reused across iterations. 3. rrf.rs: Pre-allocate HashMap with combined vector+fts result count via with_capacity() to avoid rehashing during RRF score accumulation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 08:08:14 -05:00
Taylor Eernisse	cc11d3e5a0	fix: peer review — 5 correctness bugs across who, db, lock, embedding, main Comprehensive peer code review identified and fixed the following: 1. who.rs: @-prefixed path routing used `target` (with @) instead of `clean` (stripped) when checking for '/' and passing to Expert mode, causing `lore who @src/auth/` to silently return zero results because the SQL LIKE matched against `@src/auth/%` which never exists. 2. db.rs: After ROLLBACK TO savepoint on migration failure, the savepoint was never RELEASEd, leaving it active on the connection. Fixed in both run_migrations() and run_migrations_from_dir(). 3. lock.rs: Multiple acquire() calls (e.g. re-acquiring a stale lock) replaced the heartbeat_handle without stopping the old thread, causing two concurrent heartbeat writers competing on the same lock row. Now signals the old thread to stop and joins it before spawning a new one. 4. chunk_ids.rs: encode_rowid() had no guard for chunk_index >= 1000 (CHUNK_ROWID_MULTIPLIER), which would cause rowid collisions between adjacent documents. Added range assertion [0, 1000). 5. main.rs: Fallback JSON error formatting in handle_auth_test interpolated LoreError Display output without escaping quotes or backslashes, potentially producing malformed JSON for robot-mode consumers. Now escapes both characters before interpolation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 08:07:59 -05:00
Taylor Eernisse	5786d7f4b6	fix: defensive hardening — lock release logging, SQLite param guard, vector cast Three defensive improvements found via peer code review: 1. lock.rs: Lock release errors were silently discarded with `let _ =`. If the DELETE failed (disk full, corruption), the lock stayed in the database with no diagnostic. Next sync would require --force with no clue why. Now logs with error!() including the underlying error message. 2. filters.rs: Dynamic SQL label filter construction had no upper bound on bind parameters. With many combined filters, param_idx + labels.len() could exceed SQLite's 999-parameter limit, producing an opaque error. Added a guard that caps labels at 900 - param_idx. 3. vector.rs: max_chunks_per_document returned i64 which was cast to usize. A negative value from a corrupt database would wrap to a huge number, causing overflow in the multiplier calculation. Now clamped to .max(1) and cast via unsigned_abs(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 07:55:54 -05:00
Taylor Eernisse	121a634653	fix: critical data integrity — timeline dedup, discussion atomicity, index collision Three correctness bugs found via peer code review: 1. TimelineEvent PartialEq/Ord omitted entity_type — issue #42 and MR #42 with the same timestamp and event_type were treated as equal. In a BTreeSet or dedup, one would silently be dropped. Added entity_type to both PartialEq and Ord comparisons. 2. discussions.rs: store_payload() was called outside the transaction (on bare conn) while upsert_discussion/notes were inside. A crash between them left orphaned payload rows. Moved store_payload inside the unchecked_transaction block, matching mr_discussions.rs pattern. 3. Migration 017 created idx_issue_assignees_username(username, issue_id) but migration 005 already created the same index name with just (username). SQLite's IF NOT EXISTS silently skipped the composite version on every existing database. New migration 018 drops and recreates the index with correct composite columns. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 07:54:59 -05:00
Taylor Eernisse	f267578aab	feat: implement lore who — people intelligence commands (5 modes) Add `lore who` command with 5 query modes answering collaboration questions using existing DB data (280K notes, 210K discussions, 33K DiffNotes): - Expert: who knows about a file/directory (DiffNote path analysis + MR breadth scoring) - Workload: what is a person working on (assigned issues, authored/reviewing MRs, discussions) - Active: what discussions need attention (unresolved resolvable, global/project-scoped) - Overlap: who else is touching these files (dual author+reviewer role tracking) - Reviews: what review patterns does a person have (prefix-based category extraction) Includes migration 017 (5 composite indexes), CLI skeleton with clap conflicts_with validation, robot JSON output with input+resolved_input reproducibility, human terminal output, and 20 unit tests. All quality gates pass. Closes: bd-1q8z, bd-34rr, bd-2rk9, bd-2ldg, bd-zqpf, bd-s3rc, bd-m7k1, bd-b51e, bd-2711, bd-1rdi, bd-3mj2, bd-tfh3, bd-zibc, bd-g0d5 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 23:11:14 -05:00
Taylor Eernisse	cf6d27435a	feat(robot): add elapsed_ms timing, --fields support, and actionable error actions Robot mode consistency improvements across all command output: Timing: - Every robot JSON response now includes meta.elapsed_ms measuring wall-clock time from command start to serialization. Agents can use this to detect slow queries and tune --limit or --project filters. Field selection (--fields): - print_list_issues_json and print_list_mrs_json accept an optional fields slice that prunes each item in the response array to only the requested keys. A "minimal" preset expands to [iid, title, state, updated_at_iso] for token-efficient agent scans. - filter_fields and expand_fields_preset live in the new src/cli/robot.rs module alongside RobotMeta. Actionable error recovery: - LoreError gains an actions() method returning concrete shell commands an agent can execute to recover (e.g. "ollama serve" for OllamaUnavailable, "lore init" for ConfigNotFound). - RobotError now serializes an "actions" array (empty array omitted) so agents can parse and offer one-click fixes. Envelope consistency: - show issue/MR JSON responses now use the standard {"ok":true,"data":...,"meta":...} envelope instead of bare data, matching all other commands. Files: src/cli/robot.rs (new), src/core/error.rs, src/cli/commands/{count,embed,generate_docs,ingest,list,show,stats,sync_status}.rs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 23:46:48 -05:00
Taylor Eernisse	a855759bf8	fix: shutdown safety, CLI hardening, exit code collision Shutdown signal improvements: - Upgrade ShutdownSignal from Relaxed to Release/Acquire ordering. Relaxed was technically sufficient for a single flag but Release/Acquire is the textbook correct pattern and ensures visibility guarantees across threads without relying on x86 TSO. - Add double Ctrl+C support to all three signal handlers (ingest, embed, sync). First Ctrl+C sets cooperative flag with user message; second Ctrl+C force-exits with code 130 (standard SIGINT convention). CLI hardening: - LORE_ROBOT env var now checks for truthy values (!empty, !="0", !="false") instead of mere existence. Setting LORE_ROBOT=0 or LORE_ROBOT=false no longer activates robot mode. - Replace unreachable!() in color mode match with defensive warning and fallback to auto. Clap validates the values but defense in depth prevents panics if the value_parser is ever changed. - Replace unreachable!() in completions shell match with proper error return for unsupported shells. Exit code collision fix: - ConfigNotFound was mapped to exit code 2 (error.rs:56) which collided with handle_clap_error() also using exit code 2 for parse errors. Agents calling lore --robot could not distinguish "bad arguments" from "missing config file." - Restore ConfigNotFound to exit code 20 (its original dedicated code). - Update robot-docs exit code table: code 2 = "Usage error", code 20 = "Config not found". Build script: - Track .git/refs/heads directory for Cargo rebuild triggers. Ensures GIT_HASH env var updates when branch refs change, not just HEAD. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 22:42:59 -05:00
Taylor Eernisse	405e5370dc	feat(sync): concurrent drains, atomic watermarks, graceful Ctrl+C shutdown Three fixes to the sync pipeline: 1. Atomic watermarks: wrap complete_job + update_watermark in a single SQLite transaction so crash between them can't leave partial state. 2. Concurrent drain loops: prefetch HTTP requests via join_all (batch size = dependent_concurrency), then write serially to DB. Reduces ~9K sequential requests from ~19 min to ~2.4 min. 3. Graceful shutdown: install Ctrl+C handler via ShutdownSignal (Arc<AtomicBool>), thread through orchestrator/CLI, release locked jobs on interrupt, record sync_run as "failed". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 11:22:04 -05:00
Taylor Eernisse	32783080f1	fix(timeline): report true total_events in robot JSON meta The robot JSON envelope's meta.total_events field was incorrectly reporting events.len() (the post-limit count), making it identical to meta.showing. This defeated the purpose of having both fields. Changes across the pipeline to fix this: - collect_events now returns (Vec<TimelineEvent>, usize) where the second element is the total event count before truncation - TimelineResult gains a total_events_before_limit field (serde-skipped) so the value flows cleanly from collect through to the renderer - main.rs passes the real total instead of the events.len() workaround Additional cleanup in this pass: - Derive PartialEq/Eq/PartialOrd/Ord on TimelineEventType, replacing the hand-rolled event_type_discriminant() function. Variant declaration order now defines sort tiebreak, documented in a doc comment. - Validate --since input with a proper LoreError::Other instead of silently treating invalid values as None - Fix ANSI-aware tag column padding with console::pad_str (colored tags like "[merged]" were misaligned because ANSI escapes consumed width) - Remove dead print_timeline_json and infer_max_depth functions that were superseded by print_timeline_json_with_meta Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 09:35:02 -05:00
Taylor Eernisse	03d9f8cce5	docs(db): document safety invariants for sqlite-vec transmute Adds a SAFETY comment explaining why the transmute of sqlite3_vec_init to the sqlite3_auto_extension callback type is sound. The three invariants (stable C-ABI signature, single-call-per-connection contract, idempotency) were previously undocumented, which left the lone unsafe block without justification for future readers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 08:38:41 -05:00
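
The LORE_ROBOT change described in commit a855759bf8 (truthy values only: not empty, not "0", not "false") could be sketched roughly as below. This is an illustration, not the repository's code: the names `is_truthy` and `robot_mode_from_env` are hypothetical, and the exact, case-sensitive comparison is an assumption, since the commit message does not say how casing or whitespace are handled.

```rust
use std::env;

/// Hypothetical helper: a value is "truthy" when it is non-empty and is
/// neither "0" nor "false" (the rule stated in the commit message).
/// Assumption: comparison is exact and case-sensitive.
fn is_truthy(value: &str) -> bool {
    !value.is_empty() && value != "0" && value != "false"
}

/// Hypothetical wrapper: reads LORE_ROBOT and treats an unset (or
/// non-Unicode) variable as "robot mode off".
fn robot_mode_from_env() -> bool {
    env::var("LORE_ROBOT")
        .map(|v| is_truthy(&v))
        .unwrap_or(false)
}
```

Under this sketch, LORE_ROBOT=1 or LORE_ROBOT=true would enable robot mode, while LORE_ROBOT=0, LORE_ROBOT=false, an empty value, or an unset variable would not, which matches the behavior the commit message describes.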

79 Commits