Commit Graph

11 Commits

Author SHA1 Message Date
teernisse
c1b1300675 refactor: extract who result types to core::who_types for TUI reuse
Move the 14 result structs and enums (WhoResult, ExpertResult, Expert,
ScoreComponents, ExpertMrDetail, WorkloadResult, WorkloadIssue, WorkloadMr,
WorkloadDiscussion, ReviewsResult, ReviewCategory, ActiveResult,
ActiveDiscussion, OverlapResult, OverlapUser) from cli::commands::who into
a new core::who_types module.

The TUI Who screen needs these types to render results, but importing from
the CLI layer would create a circular dependency (TUI -> CLI -> core). By
placing them in core, both the CLI and TUI can depend on them cleanly.

The CLI module re-exports all types via `pub use crate::core::who_types::*`
so existing consumers are unaffected.
2026-02-18 22:56:16 -05:00
teernisse
171260a772 feat(cli): implement 'lore trace' command (bd-2n4, bd-9dd)
Gate 5 Code Trace - Tier 1 (API-only, no git blame).
Answers 'Why was this code introduced?' by building
file -> MR -> issue -> discussion chains.

New files:
- src/core/trace.rs: run_trace() query logic with rename-aware
  path resolution, entity_reference-based issue linking, and
  DiffNote discussion extraction
- src/core/trace_tests.rs: 7 unit tests for query logic
- src/cli/commands/trace.rs: CLI command with human output,
  robot JSON output, and :line suffix parsing (5 tests)

Human output shows full content (no truncation).
Robot JSON truncates discussion bodies to 500 chars for token efficiency.

Wiring:
- TraceArgs + Commands::Trace in cli/mod.rs
- handle_trace in main.rs
- VALID_COMMANDS + robot-docs manifest entry
- COMMAND_FLAGS autocorrect registry entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 14:57:21 -05:00
teernisse
e6771709f1 refactor(core): extract path_resolver module, fix old_path matching in who
Extract shared path resolution logic from who.rs into a new
core::path_resolver module for cross-module reuse. Functions moved:
escape_like, normalize_repo_path, PathQuery, SuffixResult,
build_path_query, suffix_probe. Duplicate escape_like copies removed
from list.rs, project.rs, and filters.rs — all now import from
path_resolver.

Additionally fixes two bugs in query_expert_details() and
query_overlap() where only position_new_path was checked (missing
old_path matches for renamed files) and state filter excluded 'closed'
MRs despite the main scoring query including them with a decay
multiplier.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 13:50:14 -05:00
Taylor Eernisse
48fbd4bfdb feat(core): add file rename chain resolver with depth-bounded BFS
New module: core::file_history with resolve_rename_chain() that traces
a file path through its rename history in mr_file_changes using
bidirectional BFS (forward: old_path->new_path, backward: new_path->old_path).

Key design decisions:
- Depth-bounded BFS: each queue entry carries its distance from the
  origin, so max_hops correctly limits by graph distance (not by total
  nodes discovered). This matters for branching rename graphs where a
  file was renamed differently in parallel MRs.
- Cycle-safe: visited set prevents infinite loops from circular renames.
- Project-scoped: queries are always scoped to a single project_id.
- Deterministic: output is sorted for stable results.

Tests cover: linear chains (forward/backward), cycles, max_hops=0,
depth-bounded linear chains, branching renames, diamond patterns,
and cross-project isolation (9 tests total).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 10:54:41 -05:00
Taylor Eernisse
405e5370dc feat(sync): concurrent drains, atomic watermarks, graceful Ctrl+C shutdown
Three fixes to the sync pipeline:

1. Atomic watermarks: wrap complete_job + update_watermark in a single
   SQLite transaction so crash between them can't leave partial state.

2. Concurrent drain loops: prefetch HTTP requests via join_all (batch
   size = dependent_concurrency), then write serially to DB. Reduces
   ~9K sequential requests from ~19 min to ~2.4 min.

3. Graceful shutdown: install Ctrl+C handler via ShutdownSignal
   (Arc<AtomicBool>), thread through orchestrator/CLI, release locked
   jobs on interrupt, record sync_run as "failed".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 11:22:04 -05:00
Taylor Eernisse
3767c33c28 feat: Implement Gate 3 timeline pipeline and Gate 4 migration scaffolding
Complete 5 beads for the Phase B temporal intelligence feature:

- bd-1oo: Register migration 015 (commit SHAs, closes watermark) and
  create migration 016 (mr_file_changes table with 4 indexes for
  Gate 4 file-history)

- bd-20e: Define TimelineEvent model with 9 event type variants,
  EntityRef, ExpandedEntityRef, UnresolvedRef, and TimelineResult
  types. Ord impl for chronological sorting with stable tiebreak.

- bd-32q: Implement timeline seed phase - FTS5 keyword search to
  entity IDs with discussion-to-parent resolution, entity dedup,
  and evidence note extraction with snippet truncation.

- bd-ypa: Implement timeline expand phase - BFS cross-reference
  expansion over entity_references with bidirectional traversal,
  depth limiting, mention filtering, provenance tracking, and
  unresolved reference collection.

- bd-3as: Implement timeline event collection - gathers Created,
  StateChanged, LabelAdded/Removed, MilestoneSet/Removed, Merged,
  and NoteEvidence events. Merged dedup (state=merged -> Merged
  variant only). NULL label/milestone fallbacks. Chronological
  interleaving with since filter and limit.

38 new tests, all 445 tests pass. All quality gates clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 16:54:28 -05:00
Taylor Eernisse
f748570d4d feat(core): Add cross-reference extraction infrastructure
Introduces two new modules for extracting and storing entity cross-references
from GitLab data:

note_parser.rs:
- Parses system notes for "mentioned in" and "closed by" patterns
- Extracts cross-project references (group/project#42, group/project!123)
- Uses lazy-compiled regexes for performance
- Handles both issue (#) and MR (!) sigils
- Provides extract_refs_from_system_notes() for batch processing

references.rs:
- Extracts refs from resource_state_events table (API-sourced closes links)
- Provides insert_entity_reference() for storing discovered references
- Includes resolution helpers: resolve_issue_local_id, resolve_mr_local_id,
  resolve_project_path for converting iids to internal IDs
- Enables cross-project reference resolution

These modules power the entity_references table, enabling features like
"find all MRs that close this issue" and "find all issues mentioned in this MR".

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:03:13 -05:00
teernisse
329c8f4539 feat(observability): Add metrics, logging, and sync-run core modules
Introduce the foundational observability layer for the sync pipeline:

- MetricsLayer: Custom tracing subscriber layer that captures span timing
  and structured fields, materializing them into a hierarchical
  Vec<StageTiming> tree for robot-mode performance data output
- logging: Dual-layer subscriber infrastructure with configurable stderr
  verbosity (-v/-vv/-vvv) and always-on JSON file logging with daily
  rotation and configurable retention (default 30 days)
- SyncRunRecorder: Compile-time enforced lifecycle recorder for sync_runs
  table (start -> succeed|fail), with correlation IDs and aggregate counts
- LoggingConfig: New config section for log_dir, retention_days, and
  file_logging toggle
- get_log_dir(): Path helper for log directory resolution
- is_permanent_api_error(): Distinguish retryable vs permanent API failures
  (only 404 is truly permanent; 403/auth errors may be environmental)

Database changes:
- Migration 013: Add resource_events_synced_for_updated_at watermark columns
  to issues and merge_requests tables for incremental resource event sync
- Migration 014: Enrich sync_runs with run_id correlation ID, aggregate
  counts (total_items_processed, total_errors), and run_id index
- Wrap file-based migrations in savepoints for rollback safety

Dependencies: Add uuid (run_id generation), tracing-appender (file logging)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 13:38:29 -05:00
Taylor Eernisse
724be4d265 feat(queue): Add generic dependent fetch queue with exponential backoff
New module src/core/dependent_queue.rs provides job queue operations
against the pending_dependent_fetches table. Designed for second-pass
fetches that depend on primary entity ingestion (resource events,
MR close references, MR file diffs).

Queue operations:
- enqueue_job: Idempotent INSERT OR IGNORE keyed on the UNIQUE
  (project_id, entity_type, entity_iid, job_type) constraint.
  Returns bool indicating whether the row was actually inserted.

- claim_jobs: Two-phase claim — SELECT available jobs (unlocked,
  past retry window) then UPDATE locked_at in batch. Orders by
  enqueued_at ASC for FIFO processing within a job type.

- complete_job: DELETE the row on successful processing.

- fail_job: Increments attempts, calculates exponential backoff
  (30s * 2^(attempts-1), capped at 480s), sets next_retry_at,
  clears locked_at, and records the error message. Reads current
  attempts via query with unwrap_or(0) fallback for robustness.

- reclaim_stale_locks: Clears locked_at on jobs locked longer than
  a configurable threshold, recovering from worker crashes.

- count_pending_jobs: GROUP BY job_type aggregation for progress
  reporting and stats display.

Registers both events_db and dependent_queue in src/core/mod.rs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 12:07:48 -05:00
Taylor Eernisse
6e22f120d0 refactor(core): Rename GiError to LoreError and add search infrastructure
Mechanical rename of GiError -> LoreError across the core module to
match the project's rebranding from gitlab-inbox to gitlore/lore.
Updates the error enum name, all From impls, and the Result type alias.

Additionally introduces:

- New error variants for embedding pipeline: OllamaUnavailable,
  OllamaModelNotFound, EmbeddingFailed, EmbeddingsNotBuilt. Each
  includes actionable suggestions (e.g., "ollama serve", "ollama pull
  nomic-embed-text") to guide users through recovery.

- New error codes 14-16 for programmatic handling of Ollama failures.

- Savepoint-based migration execution in db.rs: each migration now
  runs inside a SQLite SAVEPOINT so a failed migration rolls back
  cleanly without corrupting the schema_version tracking. Previously
  a partial migration could leave the database in an inconsistent
  state.

- core::backoff module: exponential backoff with jitter utility for
  retry loops in the embedding pipeline and discussion queues.

- core::project module: helper for resolving project IDs and paths
  from the local database, used by the document regenerator and
  search filters.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:45:54 -05:00
Taylor Eernisse
7aaa51f645 feat(core): Implement infrastructure layer for CLI operations
Establishes foundational modules that all other components depend on.

src/core/config.rs - Configuration management:
- JSON-based config file with Zod-like validation via serde
- GitLab settings: base URL, token environment variable
- Project list with paths to track
- Sync settings: backfill days, stale lock timeout, cursor rewind
- Storage settings: database path, payload compression toggle
- XDG-compliant config path resolution via dirs crate
- Loads GITLAB_TOKEN from configured environment variable

src/core/db.rs - Database connection and migrations:
- Opens or creates SQLite database with WAL mode for concurrency
- Embeds migration SQL as const strings (001-005)
- Runs migrations idempotently with checksum verification
- Provides thread-safe connection management

src/core/error.rs - Unified error handling:
- GiError enum with variants for all failure modes
- Config, Database, GitLab, Ingestion, Lock, IO, Parse errors
- thiserror derive for automatic Display/Error impls
- Result type alias for ergonomic error propagation

src/core/lock.rs - Distributed sync locking:
- File-based locks to prevent concurrent syncs
- Stale lock detection with configurable timeout
- Force override for recovery scenarios
- Lock file contains PID and timestamp for debugging

src/core/paths.rs - Path resolution:
- XDG Base Directory Specification compliance
- Config: ~/.config/gi/config.json
- Data: ~/.local/share/gi/gi.db
- Creates parent directories on first access

src/core/payloads.rs - Raw payload storage:
- Optional gzip compression for storage efficiency
- SHA-256 content addressing for deduplication
- Type-prefixed keys (issue:, discussion:, note:)
- Batch insert with UPSERT for idempotent ingestion

src/core/time.rs - Timestamp utilities:
- Relative time parsing (7d, 2w, 1m) for --since flag
- ISO 8601 date parsing for absolute dates
- Human-friendly relative time formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:28:07 -05:00