Commit Graph

260 Commits

Author SHA1 Message Date
Taylor Eernisse
549a0646d7 chore: Add test-runner agent, agent-swarm-launcher skill, review artifacts, and beads updates
- .claude/agents/test-runner.md: New Claude Code agent definition for
  running cargo test suites and analyzing results, configured with
  haiku model for fast execution.

- skills/agent-swarm-launcher/: New skill for bootstrapping coordinated
  multi-agent workflows with AGENTS.md reconnaissance, Agent Mail
  coordination, and beads task tracking.

- api-review.html, phase-a-review.html: Self-contained HTML review
  artifacts for API audit and Phase A search pipeline review.

- .beads/issues.jsonl, .beads/last-touched: Updated issue tracker
  state reflecting current project work items.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:36:05 -05:00
Taylor Eernisse
a417640faa docs: Overhaul AGENTS.md, update README, add pipeline spec and Phase B plan
AGENTS.md: Comprehensive rewrite adding file deletion safeguards,
destructive git command protocol, Rust toolchain conventions, code
editing discipline rules, compiler check requirements, TDD mandate,
MCP Agent Mail coordination protocol, beads/bv/ubs/ast-grep/cass
tool documentation, and session completion workflow.

README.md: Document NO_COLOR/CLICOLOR env vars, --since 1m duration,
project resolution cascading match logic, lore health and robot-docs
commands, exit codes 17 (not found) and 18 (ambiguous match),
--color/--quiet global flags, dirty_sources and
pending_discussion_fetches tables, and version command git hash output.

docs/embedding-pipeline-hardening.md: Detailed spec covering the three
problems from the chunk size reduction (broken --full wiring, mixed
chunk sizes in vector space, static dedup multiplier) with decision
records, implementation plan, and acceptance criteria.

docs/phase-b-temporal-intelligence.md: Draft planning document for
transforming gitlore from a search engine into a temporal code
intelligence system by ingesting structured event data from GitLab.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:35:51 -05:00
Taylor Eernisse
f560e6bc00 test(embedding): Add regression tests for pipeline hardening bugs
Three targeted regression tests covering bugs fixed in the embedding
pipeline hardening:

- overflow_doc_with_error_sentinel_not_re_detected_as_pending: verifies
  that documents skipped for producing too many chunks have their
  sentinel error recorded in embedding_metadata and are NOT returned by
  find_pending_documents or count_pending_documents on subsequent runs
  (prevents infinite re-processing loop).

- count_and_find_pending_agree: exercises four states (empty DB, new
  document, fully-embedded document, config-drifted document) and
  asserts that count_pending_documents and find_pending_documents
  produce consistent results across all of them.

- full_embed_delete_is_atomic: confirms the --full flag's two DELETE
  statements (embedding_metadata + embeddings) execute atomically
  within a transaction.

Also updates test DB creation to apply migration 010.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:35:34 -05:00
Taylor Eernisse
aebbe6b795 feat(cli): Wire --full flag for embed, add sync stage spinners
- Add --full / --no-full flag pair to EmbedArgs with overrides_with
  semantics matching the existing flag pattern. When active, atomically
  DELETEs all embedding_metadata and embeddings before re-embedding.

- Thread the full flag through run_embed -> run_sync so that
  'lore sync --full' triggers a complete re-embed alongside the full
  re-ingest it already performed.

- Add indicatif spinners to sync stages with dynamic stage numbering
  that adjusts when --no-docs or --no-embed skip stages. Spinners are
  hidden in robot mode.

- Update robot-docs manifest to advertise the new --full flag on the
  embed command.

- Replace hardcoded schema version 9 in health check with the
  LATEST_SCHEMA_VERSION constant from db.rs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:35:22 -05:00
Taylor Eernisse
7d07f95d4c fix(embedding): Harden pipeline against chunk overflow, config drift, and partial failures
Reduces CHUNK_MAX_BYTES from 32KB to 6KB and CHUNK_OVERLAP_CHARS from
500 to 200 to stay within nomic-embed-text's 8,192-token context
window. This commit addresses all downstream consequences of that
reduction:

- Config drift detection: find_pending_documents and
  count_pending_documents now take model_name and compare
  chunk_max_bytes, model, and dims against stored metadata. Documents
  embedded with stale config are automatically re-queued.

- Overflow guard: documents producing >= CHUNK_ROWID_MULTIPLIER chunks
  are skipped with a sentinel error recorded in embedding_metadata,
  preventing both rowid collision and infinite re-processing loops.

- Deferred clearing: old embeddings are no longer cleared before
  attempting new ones. clear_document_embeddings is deferred until the
  first successful chunk embedding, so if all chunks fail the document
  retains its previous embeddings rather than losing all data.

- Savepoints: each page of DB writes is wrapped in a SQLite savepoint
  so a crash mid-page rolls back atomically instead of leaving partial
  state (cleared embeddings with no replacements).

- Per-chunk retry on context overflow: when a batch fails with a
  context-length error, each chunk is retried individually so one
  oversized chunk doesn't poison the entire batch.

- Adaptive dedup in vector search: replaces the static 3x over-fetch
  multiplier with a dynamic one based on actual max chunks per document
  (using the new chunk_count column with a fallback COUNT query for
  pre-migration data). Also replaces partial_cmp with total_cmp for
  f64 distance sorting.

- Stores chunk_max_bytes and chunk_count (on sentinel rows) in
  embedding_metadata to support config drift detection and adaptive
  dedup without runtime queries.
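
The adaptive dedup item above can be pictured with a small sketch. It assumes chunk rowids are encoded as doc_id * multiplier + chunk_index and that the caller already knows the largest chunk count per document; the function names and the multiplier parameter are illustrative, not taken from the codebase.

```rust
use std::collections::HashSet;

/// How many raw chunk hits to request so that `limit` distinct documents
/// can survive deduplication. `max_chunks_per_doc` stands in for the value
/// read from the chunk_count column (hypothetical parameter).
fn over_fetch_limit(limit: usize, max_chunks_per_doc: usize) -> usize {
    limit.saturating_mul(max_chunks_per_doc.max(1))
}

/// Sort (rowid, distance) hits by distance and keep the best hit per document.
/// `multiplier` is the assumed rowid encoding factor.
fn dedup_by_doc(mut hits: Vec<(i64, f64)>, multiplier: i64, limit: usize) -> Vec<(i64, f64)> {
    // total_cmp is a total order over f64, so no unwrap on partial_cmp is needed.
    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for (rowid, dist) in hits {
        if out.len() >= limit {
            break;
        }
        let doc_id = rowid / multiplier;
        // insert() returns false when this document was already emitted.
        if seen.insert(doc_id) {
            out.push((doc_id, dist));
        }
    }
    out
}
```

With a static 3x multiplier, a document that produces more than three chunks could crowd other documents out of the raw hit list; scaling the fetch size by the observed maximum avoids that at the cost of a larger query.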

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:35:08 -05:00
Taylor Eernisse
2a52594a60 feat(db): Add migration 010 for chunk config tracking columns
Add chunk_max_bytes and chunk_count columns to embedding_metadata to
support config drift detection and adaptive dedup sizing. Includes a
partial index on sentinel rows (chunk_index=0) to accelerate the drift
detection and max-chunk queries.

Also exports LATEST_SCHEMA_VERSION as a public constant derived from
the MIGRATIONS array length, replacing the previously hardcoded magic
number in the health check.
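
A minimal sketch of deriving the version constant from the array length. The migration entries and the schema_is_current helper are placeholders invented for illustration; only the MIGRATIONS and LATEST_SCHEMA_VERSION names come from the message above.

```rust
/// Hypothetical stand-in for the real migration list: (name, SQL) pairs.
const MIGRATIONS: &[(&str, &str)] = &[
    ("001_initial", "CREATE TABLE schema_version (version INTEGER NOT NULL);"),
    ("002_example", "CREATE TABLE example (id INTEGER PRIMARY KEY);"),
];

/// Derived from the array length, so adding a migration bumps it automatically.
pub const LATEST_SCHEMA_VERSION: i32 = MIGRATIONS.len() as i32;

/// A health check can compare against the constant instead of a literal.
fn schema_is_current(db_version: i32) -> bool {
    db_version == LATEST_SCHEMA_VERSION
}
```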

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:34:48 -05:00
Taylor Eernisse
51c370fac2 feat(project): Add substring matching and use Ambiguous error for resolution
Extend resolve_project() with a 4th cascade step: case-insensitive
substring match when exact, case-insensitive, and suffix matches all
fail. This allows shorthand like "typescript" to match
"vs/typescript-code" when unambiguous. Multi-match still returns an
error with all candidates listed.

Also change ambiguity errors from LoreError::Other to LoreError::Ambiguous
so they get the proper AMBIGUOUS error code (exit 18) instead of
INTERNAL_ERROR.

Includes tests for unambiguous substring, case-insensitive substring,
ambiguous substring, and suffix-preferred-over-substring ordering.
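
The four-step cascade might look roughly like the sketch below. It works on an in-memory list of project paths rather than the database, and the ResolveError type, the pick helper, and the case-insensitive suffix step are assumptions made for the example.

```rust
#[derive(Debug, PartialEq)]
enum ResolveError {
    NotFound,
    Ambiguous(Vec<String>),
}

/// Turn the matches of one cascade step into a final answer, or None
/// when the step matched nothing and the next step should run.
fn pick(matches: Vec<&str>) -> Option<Result<String, ResolveError>> {
    match matches.len() {
        0 => None,
        1 => Some(Ok(matches[0].to_string())),
        _ => Some(Err(ResolveError::Ambiguous(
            matches.iter().map(|m| m.to_string()).collect(),
        ))),
    }
}

fn resolve_project(query: &str, projects: &[&str]) -> Result<String, ResolveError> {
    let q = query.to_lowercase();
    let suffix = format!("/{q}");
    // Steps: 0 exact, 1 case-insensitive, 2 suffix after '/', 3 substring.
    for step in 0..4 {
        let matches: Vec<&str> = projects
            .iter()
            .copied()
            .filter(|p| {
                let lower = p.to_lowercase();
                match step {
                    0 => *p == query,
                    1 => lower == q,
                    2 => lower.ends_with(&suffix),
                    _ => lower.contains(&q),
                }
            })
            .collect();
        if let Some(result) = pick(matches) {
            return result;
        }
    }
    Err(ResolveError::NotFound)
}
```

Because each step returns as soon as it finds anything, a suffix hit such as "group/foo" wins over a looser substring hit such as "other/foobar", which is the ordering the last test in the list checks.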

Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
2026-01-30 16:55:23 -05:00
Taylor Eernisse
7b7d781a19 docs: Update exit codes, add config precedence and shell completions
Exit code tables (README + AGENTS.md):
- Add codes 14-16 (Ollama unavailable, model not found, embedding failed)
- Add code 20 (Config not found, remapped from 2)
- Clarify code 1 (now includes health check failed + not implemented)
- Clarify code 2 (now exclusively usage/parsing errors from clap)

New sections:
- Configuration Precedence: CLI flags > env vars > config file > defaults
- Shell Completions: bash, zsh, fish, powershell installation instructions
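
The precedence order can be expressed as a chain of Option fallbacks. This is a generic sketch; resolve_setting and its parameters are hypothetical and do not mirror the actual config loader.

```rust
/// Resolve one setting using the documented order:
/// CLI flag > environment variable > config file > built-in default.
fn resolve_setting(
    cli: Option<String>,
    env: Option<String>,
    file: Option<String>,
    default: &str,
) -> String {
    // or() keeps the first Some value, so earlier sources win.
    cli.or(env).or(file).unwrap_or_else(|| default.to_string())
}
```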

Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
2026-01-30 16:55:02 -05:00
Taylor Eernisse
03ea51513d feat(main): Wire SIGPIPE, color, quiet, completions, and negation flag handling
Runtime setup:
- Reset SIGPIPE to SIG_DFL on Unix at the very start of main() so
  piping to head/grep doesn't cause a panic.
- Apply --color flag to console::set_colors_enabled() after CLI parse.
- Extract quiet flag and thread it to handle_ingest.

Command dispatch:
- Add Completions match arm using clap_complete::generate().
- Resolve all --no-X negation flags in handlers: asc, has_due, open
  (issues/mrs), force/full (ingest/sync), check (stats), explain
  (search), retry_failed (embed).
- Auto-enable --check when --repair is used in handle_stats.
- Suppress deprecation warnings in robot mode for List, Show, AuthTest,
  and SyncStatus deprecated aliases.

Stubs:
- Change handle_backup/handle_reset from ok:true to structured error
  JSON on stderr with exit code 1. Remove unused NotImplementedOutput
  and NotImplementedData structs.

Version:
- Include GIT_HASH env var in handle_version output (human and robot).
- Add git_hash field to VersionData with skip_serializing_if for None.

Robot-docs:
- Update exit code table with codes 14-18 (Ollama, NotFound, Ambiguous)
  and code 20 (ConfigNotFound). Clarify code 1 and 2 descriptions.

Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
2026-01-30 16:54:53 -05:00
Taylor Eernisse
667f70e177 refactor(commands): Add IngestDisplay, resolve_project, and color-aware tables
Ingest:
- Introduce IngestDisplay struct with show_progress/show_text booleans
  to decouple progress bars from text output. Replaces the robot_mode
  bool parameter with explicit display control, enabling sync to show
  progress without duplicating summary text (progress_only mode).
- Use resolve_project() for --project filtering instead of LIKE queries,
  providing proper error messages for ambiguous or missing projects.
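
A sketch of how such a display struct could replace a single robot_mode bool. The field names and progress_only() appear in this message; the interactive() and silent() constructors and the display_for_sync helper are assumed names for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct IngestDisplay {
    show_progress: bool,
    show_text: bool,
}

impl IngestDisplay {
    /// Standalone human-facing ingest: progress bars plus summary text.
    fn interactive() -> Self {
        Self { show_progress: true, show_text: true }
    }

    /// Robot mode: neither bars nor text, only structured output.
    fn silent() -> Self {
        Self { show_progress: false, show_text: false }
    }

    /// Used by sync: bars stay visible, but the summary text is left
    /// to the caller so it is not printed twice.
    fn progress_only() -> Self {
        Self { show_progress: true, show_text: false }
    }
}

/// How a sync command might choose a display from its robot_mode option.
fn display_for_sync(robot_mode: bool) -> IngestDisplay {
    if robot_mode {
        IngestDisplay::silent()
    } else {
        IngestDisplay::progress_only()
    }
}
```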

List:
- Add colored_cell() helper that checks console::colors_enabled() before
  applying comfy-table foreground colors, bridging the gap between the
  console and comfy-table crates for --color flag support.
- Use resolve_project() for project filtering (exact ID match).
- Improve since filter to return explicit errors instead of silently
  ignoring invalid values.
- Improve format_relative_time for proper singular/plural forms.

Search:
- Validate --after/--updated-after with explicit error messages.
- Handle optional title field (Option<String>) in HydratedRow.

Show:
- Use resolve_project() for project disambiguation.

Sync:
- Thread robot_mode via SyncOptions for IngestDisplay selection.
- Use IngestDisplay::progress_only() in interactive sync mode.

GenerateDocs:
- Use resolve_project() for --project filtering.

Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
2026-01-30 16:54:36 -05:00
Taylor Eernisse
585b746461 feat(cli): Add --color, --quiet, --no-X negations, completions, and help headings
Global flags:
- --color (auto|always|never) for explicit color control
- --quiet/-q to suppress non-essential output
- Hidden Completions subcommand for bash/zsh/fish/powershell

Flag negation (--no-X) with overrides_with for: has-due, asc, open
(issues/mrs), force/full (ingest/sync), check (stats), explain (search),
retry-failed (embed). Enables scripted flag composition where later flags
override earlier ones.

Validation:
- value_parser on search --mode, --type, --fts-mode for early rejection
- Remove requires="check" from --repair (auto-enabled in handler)

Polish:
- help_heading groups (Filters, Sorting, Output, Actions) on issues,
  mrs, and search args for cleaner --help output
- Hide Backup, Reset, and Completions from --help

Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
2026-01-30 16:54:18 -05:00
Taylor Eernisse
730ddef339 fix(error): Remap ConfigNotFound to exit 20 and add NotFound/Ambiguous codes
ConfigNotFound previously used exit code 2 which collides with clap's
usage error code. Remap it to exit 20 to avoid ambiguity. Also add
dedicated NotFound (exit 17) and Ambiguous (exit 18) error codes with
proper ErrorCode variants and Display implementations, replacing the
previous incorrect mapping of these errors to GitLabNotFound.
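
The remapping can be summarized as a small lookup. The variant names follow this message; the enum shape and the exit_code method are a sketch, not the project's actual ErrorCode definition.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ErrorCode {
    NotFound,
    Ambiguous,
    ConfigNotFound,
}

impl ErrorCode {
    /// Process exit code for each variant. Code 2 is deliberately
    /// unused here because clap reserves it for usage errors.
    fn exit_code(self) -> i32 {
        match self {
            ErrorCode::NotFound => 17,
            ErrorCode::Ambiguous => 18,
            ErrorCode::ConfigNotFound => 20,
        }
    }
}
```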

Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
2026-01-30 16:54:02 -05:00
Taylor Eernisse
5508d8464a build: Add clap_complete, libc dependencies and git hash build script
Add clap_complete for shell completion generation and libc (unix-only)
for SIGPIPE handling. Create build.rs to embed the git commit hash at
compile time via cargo:rustc-env=GIT_HASH, enabling `lore version` to
display the short hash alongside the version number.

Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
2026-01-30 16:53:51 -05:00
Taylor Eernisse
41d20f1374 chore(beads): Update issue tracker with search pipeline beads
Add new beads for the checkpoint-3 search pipeline work including
document generation, FTS5 indexing, embedding pipeline, hybrid search,
and CLI command implementations. Update status on completed beads.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:47:39 -05:00
Taylor Eernisse
9b63671df9 docs: Update documentation for search pipeline and Phase A spec
- README.md: Add hybrid search and robot mode to feature list. Update
  quick start to use new noun-first CLI syntax (lore issues, lore mrs,
  lore search). Add embedding configuration section. Update command
  examples throughout.

- AGENTS.md: Update robot mode examples to new CLI syntax. Add search,
  sync, stats, and generate-docs commands to the robot mode reference.
  Update flag conventions (-n for limit, -s for state, -J for JSON).

- docs/prd/checkpoint-3.md: Major expansion with gated milestone
  structure (Gate A: lexical, Gate B: hybrid, Gate C: sync). Add
  prerequisite rename note, code sample conventions, chunking strategy
  details, and sqlite-vec rowid encoding scheme. Clarify that Gate A
  requires only SQLite + FTS5 with no sqlite-vec dependency.

- docs/phase-a-spec.md: New detailed specification for Gate A (lexical
  search MVP) covering document schema, FTS5 configuration, dirty
  queue mechanics, CLI interface, and acceptance criteria.

- docs/api-efficiency-findings.md: Analysis of GitLab API pagination
  behavior and efficiency observations from production sync runs.
  Documents the missing x-next-page header issue and heuristic fix.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:47:33 -05:00
Taylor Eernisse
d235f2b4dd test: Add test suites for embedding, FTS, hybrid search, and golden queries
Four new test modules covering the search infrastructure:

- tests/embedding.rs: Unit tests for the embedding pipeline including
  chunk ID encoding/decoding, change detection, and document chunking
  with overlap verification.

- tests/fts_search.rs: Integration tests for FTS5 search including
  safe query sanitization, multi-term queries, prefix matching, and
  the raw FTS mode for power users.

- tests/hybrid_search.rs: End-to-end tests for hybrid search mode
  including RRF fusion correctness, graceful degradation when
  embeddings are unavailable, and filter application.

- tests/golden_query_tests.rs: Golden query tests using fixtures
  from tests/fixtures/golden_queries.json to verify search quality
  against known-good query/result pairs. Ensures ranking stability
  across implementation changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:47:19 -05:00
Taylor Eernisse
daf5a73019 feat(cli): Add search, stats, embed, sync, health, and robot-docs commands
Extends the CLI with six new commands that complete the search pipeline:

- lore search <QUERY>: Hybrid search with mode selection (lexical,
  hybrid, semantic), rich filtering (--type, --author, --project,
  --label, --path, --after, --updated-after), result limits, and
  optional explain mode showing RRF score breakdowns. Safe FTS mode
  sanitizes user input; raw mode passes through for power users.

- lore stats: Document and index statistics with optional --check
  for integrity verification and --repair to fix inconsistencies
  (orphaned documents, missing FTS entries, stale dirty queue items).

- lore embed: Generate vector embeddings via Ollama. Supports
  --retry-failed to re-attempt previously failed embeddings.

- lore generate-docs: Drain the dirty queue to regenerate documents.
  --full seeds all entities for complete rebuild. --project scopes
  to a single project.

- lore sync: Full pipeline orchestration (ingest issues + MRs,
  generate-docs, embed) with --no-embed and --no-docs flags for
  partial runs. Reports per-stage results and total elapsed time.

- lore health: Quick pre-flight check (config exists, DB exists,
  schema current). Returns exit code 1 if unhealthy. Designed for
  agent pre-flight scripts.

- lore robot-docs: Machine-readable command manifest for agent
  self-discovery. Returns all commands, flags, examples, exit codes,
  and recommended workflows as structured JSON.

Also enhances lore init with --gitlab-url, --token-env-var, and
--projects flags for fully non-interactive robot-mode initialization.
Fixes init's force/non-interactive precedence logic and adds JSON
output for robot mode.

Updates all command files for the GiError -> LoreError rename.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:47:10 -05:00
Taylor Eernisse
559f0702ad feat(ingestion): Mark entities dirty on ingest for document regeneration
Integrates the dirty tracking system into all four ingestion paths
(issues, MRs, issue discussions, MR discussions). After each entity
is upserted within its transaction, a corresponding dirty_queue entry
is inserted so the document regenerator knows which documents need
rebuilding.

This ensures that document generation stays transactionally consistent
with data changes: if the ingest transaction rolls back, the dirty
marker rolls back too, preventing stale document regeneration attempts.

Also updates GiError references to LoreError in these files as part
of the codebase-wide rename, and adjusts issue discussion logging
from info to debug level to reduce noise during normal sync runs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:46:51 -05:00
Taylor Eernisse
d5bdb24b0f feat(search): Add hybrid search engine with FTS5, vector, and RRF fusion
Implements the search module providing three search modes:

- Lexical (FTS5): Full-text search using SQLite FTS5 with safe query
  sanitization. User queries are automatically tokenized and wrapped
  in proper FTS5 syntax. Supports a "raw" mode for power users who
  want direct FTS5 query syntax (NEAR, column filters, etc.).

- Semantic (vector): Embeds the search query via Ollama, then performs
  cosine similarity search against stored document embeddings. Results
  are deduplicated by doc_id since documents may have multiple chunks.

- Hybrid (default): Executes both lexical and semantic searches in
  parallel, then fuses results using Reciprocal Rank Fusion (RRF) with
  k=60. This avoids the complexity of score normalization while
  producing high-quality merged rankings. Gracefully degrades to
  lexical-only when embeddings are unavailable.

Additional components:

- search::filters: Post-retrieval filtering by source_type, author,
  project, labels (AND logic), file path prefix, created_after, and
  updated_after. Date filters accept relative formats (7d, 2w) and
  ISO dates.

- search::rrf: Reciprocal Rank Fusion implementation with configurable
  k parameter and optional explain mode that annotates each result
  with its component ranks and fusion score breakdown.
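
For reference, Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the lists it appears in. The sketch below uses 1-based ranks, which is the common formulation; whether search::rrf counts from 0 or 1 is not stated here, and the rrf_fuse name and signature are illustrative.

```rust
use std::collections::HashMap;

/// Fuse ranked lists of document ids with Reciprocal Rank Fusion.
/// Each list contributes 1 / (k + rank) per document, rank starting at 1.
fn rrf_fuse(lists: &[Vec<i64>], k: f64) -> Vec<(i64, f64)> {
    let mut scores: HashMap<i64, f64> = HashMap::new();
    for list in lists {
        for (i, doc_id) in list.iter().enumerate() {
            let rank = i as f64 + 1.0;
            *scores.entry(*doc_id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(i64, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by doc id for deterministic output.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Only ranks enter the formula, so the FTS5 relevance scores and the vector distances never need to be put on a common scale.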

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:46:42 -05:00
Taylor Eernisse
723703bed9 feat(embedding): Add Ollama-powered vector embedding pipeline
Implements the embedding module that generates vector representations
of documents using a local Ollama instance with the nomic-embed-text
model. These embeddings enable semantic (vector) search and the hybrid
search mode that fuses lexical and semantic results via RRF.

Key components:

- embedding::ollama: HTTP client for the Ollama /api/embeddings
  endpoint. Handles connection errors with actionable error messages
  (OllamaUnavailable, OllamaModelNotFound) and validates response
  dimensions.

- embedding::chunking: Splits long documents into overlapping
  paragraph-aware chunks for embedding. Uses a configurable max token
  estimate (8192 default for nomic-embed-text) with 10% overlap to
  preserve cross-chunk context.

- embedding::chunk_ids: Encodes chunk identity as
  doc_id * 1000 + chunk_index for the embeddings table rowid. This
  allows vector search to map results back to documents and
  deduplicate by doc_id efficiently.

- embedding::change_detector: Compares document content_hash against
  stored embedding hashes to skip re-embedding unchanged documents,
  making incremental embedding runs fast.

- embedding::pipeline: Orchestrates the full embedding flow: detect
  changed documents, chunk them, call Ollama in configurable
  concurrency (default 4), store results. Supports --retry-failed
  to re-attempt previously failed embeddings.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:46:30 -05:00
Taylor Eernisse
20edff4ab1 feat(documents): Add document generation pipeline with dirty tracking
Implements the documents module that transforms raw ingested entities
(issues, MRs, discussions) into searchable document blobs stored in
the documents table. This is the foundation for both FTS5 lexical
search and vector embedding.

Key components:

- documents::extractor: Renders entities into structured text documents.
  Issues include title, description, labels, milestone, assignees, and
  threaded discussion summaries. MRs additionally include source/target
  branches, reviewers, and approval status. Discussions are rendered
  with full note threading.

- documents::regenerator: Drains the dirty_queue table to regenerate
  only documents whose source entities changed since last sync. Supports
  full rebuild mode (seeds all entities into dirty queue first) and
  project-scoped regeneration.

- documents::truncation: Safety cap at 2MB per document to prevent
  pathological outliers from degrading FTS or embedding performance.

- ingestion::dirty_tracker: Marks entities as dirty inside the
  ingestion transaction so document regeneration stays consistent
  with data changes. Uses INSERT OR IGNORE to deduplicate.

- ingestion::discussion_queue: Queue-based discussion fetching that
  isolates individual discussion failures from the broader ingestion
  pipeline, preventing a single corrupt discussion from blocking
  an entire project sync.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:46:18 -05:00
Taylor Eernisse
d31d5292f2 fix(gitlab): Improve pagination heuristics and fix rate limiter lock contention
Two targeted fixes to the GitLab API client:

1. Pagination: When the x-next-page header is missing but the current
   page returned a full page of results, heuristically advance to the
   next page instead of stopping. This fixes silent data truncation
   observed with certain GitLab instances that omit pagination headers
   on intermediate pages. The existing early-exit on empty or partial
   pages remains as the termination condition.

2. Rate limiter: Refactor the async acquire() method into a synchronous
   check_delay() that computes the required sleep duration and updates
   last_request time while holding the mutex, then releases the lock
   before sleeping. This eliminates holding the Mutex<RateLimiter>
   across an await point, which previously could block other request
   tasks unnecessarily during the sleep interval.
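
Both fixes can be sketched with the standard library alone. The real client is async; this version uses a blocking sleep to stay self-contained, and the next_page helper, the RateLimiter fields, and the choice to reserve the slot at now + delay are assumptions.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Decide which page to request next. `header_next` is the parsed
/// x-next-page header, if the server sent one.
fn next_page(header_next: Option<u32>, current: u32, returned: usize, per_page: usize) -> Option<u32> {
    if returned == 0 || returned < per_page {
        return None; // empty or partial page: this was the last one
    }
    // Full page: trust the header when present, otherwise guess current + 1.
    Some(header_next.unwrap_or(current + 1))
}

struct RateLimiter {
    min_interval: Duration,
    last_request: Option<Instant>,
}

impl RateLimiter {
    /// Compute the required wait and reserve the slot by updating
    /// last_request. Purely synchronous, so it is cheap to run under the lock.
    fn check_delay(&mut self, now: Instant) -> Duration {
        let delay = match self.last_request {
            Some(last) => (last + self.min_interval).saturating_duration_since(now),
            None => Duration::ZERO,
        };
        self.last_request = Some(now + delay);
        delay
    }
}

/// The guard is a temporary dropped at the end of the `let` statement,
/// so the mutex is already released when the sleep starts.
fn acquire(limiter: &Mutex<RateLimiter>) {
    let delay = limiter.lock().unwrap().check_delay(Instant::now());
    if !delay.is_zero() {
        std::thread::sleep(delay); // an async client would await a timer here
    }
}
```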

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:46:05 -05:00
Taylor Eernisse
6e22f120d0 refactor(core): Rename GiError to LoreError and add search infrastructure
Mechanical rename of GiError -> LoreError across the core module to
match the project's rebranding from gitlab-inbox to gitlore/lore.
Updates the error enum name, all From impls, and the Result type alias.

Additionally introduces:

- New error variants for embedding pipeline: OllamaUnavailable,
  OllamaModelNotFound, EmbeddingFailed, EmbeddingsNotBuilt. Each
  includes actionable suggestions (e.g., "ollama serve", "ollama pull
  nomic-embed-text") to guide users through recovery.

- New error codes 14-16 for programmatic handling of Ollama failures.

- Savepoint-based migration execution in db.rs: each migration now
  runs inside a SQLite SAVEPOINT so a failed migration rolls back
  cleanly without corrupting the schema_version tracking. Previously
  a partial migration could leave the database in an inconsistent
  state.

- core::backoff module: exponential backoff with jitter utility for
  retry loops in the embedding pipeline and discussion queues.

- core::project module: helper for resolving project IDs and paths
  from the local database, used by the document regenerator and
  search filters.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:45:54 -05:00
Taylor Eernisse
4270603da4 feat(db): Add migrations for documents, FTS5, and embeddings
Three new migrations establish the search infrastructure:

- 007_documents: Creates the `documents` table as the central search
  unit. Each document is a rendered text blob derived from an issue,
  MR, or discussion. Includes `dirty_queue` table for tracking which
  entities need document regeneration after ingestion changes.

- 008_fts5: Creates FTS5 virtual table `documents_fts` with content
  sync triggers. Uses `unicode61` tokenizer with `remove_diacritics=2`
  for broad language support. Automatic insert/update/delete triggers
  keep the FTS index synchronized with the documents table.

- 009_embeddings: Creates `embeddings` table for storing vector
  chunks produced by Ollama. Uses `doc_id * 1000 + chunk_index`
  rowid encoding to support multi-chunk documents while enabling
  efficient doc-level deduplication in vector search results.
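
The rowid encoding is simple arithmetic, sketched below. The function names are hypothetical, and the bounds check is an addition for the example: an index of 1000 or more would spill into the next document's rowid range.

```rust
/// Multiplier from the message above: rowid = doc_id * 1000 + chunk_index.
const CHUNK_ROWID_MULTIPLIER: i64 = 1000;

/// Encode a chunk's rowid, or None if the index is out of range or
/// the multiplication would overflow.
fn encode_rowid(doc_id: i64, chunk_index: i64) -> Option<i64> {
    if !(0..CHUNK_ROWID_MULTIPLIER).contains(&chunk_index) {
        return None;
    }
    doc_id
        .checked_mul(CHUNK_ROWID_MULTIPLIER)?
        .checked_add(chunk_index)
}

/// Recover (doc_id, chunk_index) from a rowid.
fn decode_rowid(rowid: i64) -> (i64, i64) {
    (rowid / CHUNK_ROWID_MULTIPLIER, rowid % CHUNK_ROWID_MULTIPLIER)
}
```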

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:45:41 -05:00
Taylor Eernisse
aca4773327 deps: Add rand crate for randomized backoff and jitter
The embedding pipeline and retry queues need randomized exponential
backoff to prevent thundering herd effects when Ollama or GitLab
recover from transient failures. The rand crate (0.8) provides the
thread-safe RNG needed for jitter computation.
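
A sketch of exponential backoff with jitter, using the "full jitter" variant as an assumed strategy. To stay free of external crates the random sample is passed in as a parameter; in the project it would presumably come from rand. The backoff_delay_ms name and its parameters are hypothetical.

```rust
/// Delay in milliseconds for a retry attempt (0-based).
/// `jitter` is a random sample in [0, 1] supplied by the caller.
fn backoff_delay_ms(attempt: u32, base_ms: u64, cap_ms: u64, jitter: f64) -> u64 {
    // 2^attempt, saturating so large attempt counts cannot overflow.
    let factor = 1u64.checked_shl(attempt).unwrap_or(u64::MAX);
    let ceiling = base_ms.saturating_mul(factor).min(cap_ms);
    // Full jitter: pick a point between 0 and the capped exponential ceiling.
    (ceiling as f64 * jitter.clamp(0.0, 1.0)) as u64
}
```

Spreading retries across the whole interval keeps many clients that failed at the same moment from retrying at the same moment.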

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:45:30 -05:00
Taylor Eernisse
f4dba386c9 docs: Restructure checkpoint-3 PRD with gated milestones
Reorganizes the Search & Sync MVP plan into three independently
verifiable gates (A: Lexical MVP, B: Hybrid MVP, C: Sync MVP)
to reduce integration risk. Each gate has explicit deliverables,
acceptance criteria, and can ship on its own.

Expands the specification with additional detail on document
generation, search API surface, sync orchestration, and
integrity repair paths. Removes the outdated rename note since
the project is now fully migrated to gitlore/lore naming.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 08:42:39 -05:00
Taylor Eernisse
856aad1641 feat(cli): Redesign CLI with noun-first subcommands
Replaces the verb-first pattern ('lore list issues', 'lore show
issue 42') with noun-first subcommands that feel more natural:

  lore issues          # list issues
  lore issues 42       # show issue #42
  lore mrs             # list merge requests
  lore mrs 99          # show MR #99
  lore ingest          # ingest everything
  lore ingest issues   # ingest only issues
  lore count issues    # count issues
  lore status          # sync status
  lore auth            # verify auth
  lore doctor          # health check

Key changes:
- New IssuesArgs, MrsArgs, IngestArgs, CountArgs structs with
  short flags (-n, -s, -p, -a, -l, -o, -f, -J, etc.)
- Global -J/--json flag as shorthand for --robot
- 'lore ingest' with no argument ingests both issues and MRs,
  emitting combined JSON summary in robot mode
- --asc flag replaces --order=asc/desc for brevity
- Renamed flags: --has-due-date -> --has-due, --type -> --for,
  --confirm -> --yes, target_branch -> --target, etc.

Old commands (list, show, auth-test, sync-status) are preserved
as hidden backward-compat aliases that emit deprecation warnings
to stderr before delegating to the new handlers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 08:42:26 -05:00
Taylor Eernisse
8fe5feda7e fix(ingestion): Move counter increments after transaction commit
Ingestion counters (discussions_upserted, notes_upserted,
discussions_fetched, diffnotes_count) were incremented before
tx.commit(), meaning a failed commit would report inflated
metrics. Counters now increment only after successful commit
so reported numbers accurately reflect persisted state.

Also simplifies the stale-removal guard in issue discussions:
the received_first_response flag was unnecessary since an empty
seen_discussion_ids list is safe to pass to remove_stale -- if
there were no discussions, stale removal correctly sweeps all
previously-stored discussions. The two separate code paths
(empty vs populated) are collapsed into a single branch.

Derives Default on IngestResult to eliminate verbose zero-init.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 08:42:11 -05:00
Taylor Eernisse
753ff46bb4 fix(cli): Correct project filtering and GROUP_CONCAT delimiter
Two SQL correctness issues fixed:

1. Project filter used LIKE '%term%' which caused partial matches
   (e.g. filtering for "foo" matched "group/foobar"). Now uses
   exact match OR suffix match after '/' so "foo" matches
   "group/foo" but not "group/foobar".

2. GROUP_CONCAT used comma as delimiter for labels and assignees,
   which broke parsing when label names themselves contained commas.
   Switched to ASCII unit separator (0x1F) which cannot appear in
   GitLab entity names.
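
Both fixes are easy to show outside SQL. The sketch below mirrors the matching rule and the delimiter choice in plain Rust; project_matches and split_concat are illustrative helpers, not functions from the codebase.

```rust
/// ASCII unit separator used as the GROUP_CONCAT delimiter.
const UNIT_SEP: char = '\u{1F}';

/// Exact match, or suffix match directly after a '/'.
fn project_matches(path: &str, term: &str) -> bool {
    path == term || path.ends_with(&format!("/{term}"))
}

/// Split a GROUP_CONCAT result back into its values.
fn split_concat(raw: &str) -> Vec<&str> {
    if raw.is_empty() {
        Vec::new()
    } else {
        raw.split(UNIT_SEP).collect()
    }
}
```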

Also adds a guard for negative time deltas in format_relative_time
to handle clock skew gracefully instead of panicking.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 08:41:56 -05:00
Taylor Eernisse
d3a05cfb87 fix(error): Improve error suggestions with inline examples
Error suggestions now include concrete CLI examples so users
(and robot-mode consumers) can act immediately without consulting
docs. For instance, ConfigNotFound now shows the expected path
and the exact command to run, TokenNotSet shows the export syntax,
and Ambiguous shows the -p flag with example project paths.

Also fixes the error code for Ambiguous errors: it now maps to
GitLabNotFound instead of InternalError, since the entity exists
but the user needs to disambiguate -- not an internal failure.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 08:41:45 -05:00
Taylor Eernisse
390f8a9288 refactor(core): Centralize timestamp parsing in core::time
Duplicate ISO 8601 timestamp parsing functions existed in both
discussion.rs and merge_request.rs transformers. This extracts
iso_to_ms_strict() and iso_to_ms_opt_strict() into core::time
as the single source of truth, and updates both transformer
modules to use the shared implementations.

Also removes the private now_ms() from merge_request.rs in
favor of the existing core::time::now_ms(), and replaces the
local parse_timestamp_opt() in discussion.rs with the public
iso_to_ms() from core::time.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 08:41:34 -05:00
teernisse
55b895a2eb Update name to gitlore instead of gitlab-inbox
2026-01-28 15:49:14 -05:00
teernisse
9a6357c353 Begin planning phase 3-5 implementation
2026-01-27 22:40:49 -05:00
Taylor Eernisse
96ef60fa05 docs: Update documentation for CP2 merge request support
Updates project documentation to reflect the complete CP2 feature set
with merge request ingestion and robot mode capabilities.

README.md:
- Add MR-related CLI examples (gi list mrs, gi show mr, gi ingest)
- Document robot mode (--robot flag, GI_ROBOT env, auto-detect)
- Update feature list with MR support and DiffNote positions
- Add configuration section with all config file options
- Expand CLI reference with new commands and flags

AGENTS.md:
- Add MR ingestion patterns for AI agent consumption
- Document robot mode JSON schemas for parsing
- Include error handling patterns with exit codes
- Add discussion/note querying examples for code review context

Cargo.toml:
- Bump version to 0.2.0 reflecting major feature addition

The documentation emphasizes the robot mode design which enables
AI agents like Claude Code to reliably parse gi output for automated
GitLab workflow integration.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:47:34 -05:00
Taylor Eernisse
d338d68191 test: Add comprehensive test suite for MR ingestion
Introduces thorough test coverage for merge request functionality,
following the established testing patterns from issue ingestion.

New test files:
- mr_transformer_tests.rs: NormalizedMergeRequest transformation tests
  covering full MR with all fields, minimal MR, draft detection via
  title prefix and work_in_progress field, label/assignee/reviewer
  extraction, and timestamp conversion

- mr_discussion_tests.rs: MR discussion normalization tests including
  polymorphic noteable binding, DiffNote position extraction with
  line ranges and SHA triplet, and resolvable note handling

- diffnote_position_tests.rs: Exhaustive DiffNote position scenarios
  covering text/image/file types, single-line vs multi-line comments,
  added/removed/modified lines, and missing position handling

New fixtures:
- fixtures/gitlab_merge_request.json: Representative MR API response
  with nested structures for integration testing

Updated tests:
- gitlab_types_tests.rs: Add MR type deserialization tests
- migration_tests.rs: Update expected schema version to 6

Test design follows property-based patterns where feasible, with
explicit edge case coverage for nullable fields and API variants
across different GitLab versions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:47:17 -05:00
Taylor Eernisse
8ddc974b89 feat(cli): Add MR support to list/show/count/ingest commands
Extends all data commands to support merge requests alongside issues,
with consistent patterns and JSON output for robot mode.

List command (gi list mrs):
- MR-specific columns: branches, draft status, reviewers
- Filters: --state (opened|merged|closed|locked|all), --draft,
  --no-draft, --reviewer, --target-branch, --source-branch
- Discussion count with unresolved indicator (e.g., "5/2!")
- JSON output includes full MR metadata

Show command (gi show mr <iid>):
- MR details with branches, assignees, reviewers, merge status
- DiffNote positions showing file:line for code review comments
- Full description and discussion bodies (no truncation in JSON)
- --json flag for structured output with ISO timestamps

Count command (gi count mrs):
- MR counting with optional --type filter for discussions/notes
- JSON output with breakdown by state

Ingest command (gi ingest --type mrs):
- Full MR sync with discussion prefetch
- Progress output shows MR-specific metrics (diffnotes count)
- JSON summary with comprehensive sync statistics

All commands respect global --robot mode for auto-JSON output.
The pattern "gi list mrs --json | jq '.mrs[] | .iid'" now works
for scripted MR processing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:46:59 -05:00
Taylor Eernisse
7d0d586932 feat(cli): Add global robot mode for machine-readable output
Introduces a unified robot mode that enables JSON output across all
commands, designed for AI agent and script consumption.

Robot mode activation (any of):
- --robot flag: Explicit opt-in
- GI_ROBOT=1 env var: For persistent configuration
- Non-TTY stdout: Auto-detect when piped (e.g., gi list issues | jq)
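
The three activation paths can be sketched as a pure function plus a thin wrapper that reads the environment. This is an illustration under assumptions: the function names are hypothetical, and treating only the exact value "1" as enabling GI_ROBOT is a guess about the real behavior.

```rust
use std::io::IsTerminal;

/// Pure decision logic, kept separate so it can be tested without a terminal.
/// Assumption: only GI_ROBOT=1 enables robot mode via the environment.
pub fn is_robot_mode(robot_flag: bool, gi_robot_env: Option<&str>, stdout_is_tty: bool) -> bool {
    robot_flag || gi_robot_env == Some("1") || !stdout_is_tty
}

/// Hypothetical wrapper that gathers the real inputs from the process.
pub fn detect_robot_mode(robot_flag: bool) -> bool {
    let env_value = std::env::var("GI_ROBOT").ok();
    is_robot_mode(
        robot_flag,
        env_value.as_deref(),
        std::io::stdout().is_terminal(),
    )
}
```

Keeping the decision in a pure function means the TTY check does not need to be faked in tests.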

Implementation:
- Cli::is_robot_mode(): Centralized detection logic
- All command handlers receive robot_mode boolean
- Errors emit structured JSON to stderr with exit codes
- Success responses emit JSON to stdout

Behavior changes in robot mode:
- No color/emoji output (no ANSI escapes)
- No progress spinners or interactive prompts
- Timestamps as ISO 8601 strings (not relative "2 hours ago")
- Full content (no truncation of descriptions/notes)
- Structured error objects with code, message, suggestion

This enables reliable parsing by Claude Code, shell scripts, and
automation pipelines. The auto-detect on non-TTY means simple piping
"just works" without explicit flags.

Per-command --json flags remain for explicit control and take
precedence over robot-mode detection, for example when a human-friendly
terminal session also writes JSON output to a file.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:46:27 -05:00
Taylor Eernisse
5fe76e46a3 fix(core): Add structured error handling and responsive lock release
Improves core infrastructure with robot-friendly error output and
faster lock release for better sync behavior.

Error handling improvements (error.rs):
- ErrorCode::exit_code(): Unique exit codes per error type (1-13)
  for programmatic error handling in scripts/agents
- GiError::suggestion(): Helpful hints for common error recovery
- GiError::to_robot_error(): Structured JSON error conversion
- RobotError/RobotErrorOutput: Serializable error types with code,
  message, and optional suggestion fields

Lock improvements (lock.rs):
- Heartbeat thread now polls every 100ms for release flag, only
  updating database heartbeat at full interval (5s default)
- Eliminates 5-10s delay after sync completion when waiting for
  heartbeat thread to notice release
- Reduces lock hold time after operation completes
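
The polling pattern described above can be sketched with the standard library alone. The function name, the closure-based `beat` callback, and the use of an AtomicBool are assumptions for illustration; the actual lock.rs may be structured differently.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Spawns a heartbeat thread that checks `release` every `poll_interval`
/// but only calls `beat` (e.g. a database heartbeat update) once per
/// `heartbeat_interval`. Hypothetical helper, not the project's API.
pub fn spawn_heartbeat<F>(
    release: Arc<AtomicBool>,
    heartbeat_interval: Duration,
    poll_interval: Duration,
    mut beat: F,
) -> JoinHandle<()>
where
    F: FnMut() + Send + 'static,
{
    thread::spawn(move || {
        let mut last_beat = Instant::now();
        while !release.load(Ordering::Acquire) {
            // Short sleep so a release request is noticed quickly.
            thread::sleep(poll_interval);
            if release.load(Ordering::Acquire) {
                break;
            }
            // The expensive heartbeat write still happens at the full interval.
            if last_beat.elapsed() >= heartbeat_interval {
                beat();
                last_beat = Instant::now();
            }
        }
    })
}
```

With a 100ms poll and a 5s heartbeat interval, joining the thread after setting the flag takes roughly one poll period instead of up to a full heartbeat interval.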

Database (db.rs):
- Bump expected schema version to 6 for MR migration

The exit code mapping enables shell scripts and CI/CD pipelines to
distinguish between configuration errors (2-4), GitLab API errors
(5-8), and database errors (9-11) for appropriate retry/alert logic.
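
A consumer of these exit codes could group them by the ranges stated above. The enum and function names below are hypothetical, and codes outside the three documented ranges are lumped into a catch-all because the message does not describe them.

```rust
/// Coarse error classes a script or agent might branch on.
#[derive(Debug, PartialEq, Eq)]
pub enum ErrorClass {
    Success,
    Config,
    GitLabApi,
    Database,
    Other,
}

/// Maps a process exit code to a class using the ranges from the commit
/// message: 2-4 configuration, 5-8 GitLab API, 9-11 database.
pub fn classify_exit_code(code: i32) -> ErrorClass {
    match code {
        0 => ErrorClass::Success,
        2..=4 => ErrorClass::Config,
        5..=8 => ErrorClass::GitLabApi,
        9..=11 => ErrorClass::Database,
        _ => ErrorClass::Other,
    }
}
```

A retry policy might, for instance, retry only on `GitLabApi` and alert on `Config` or `Database`.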

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:46:08 -05:00
Taylor Eernisse
cd44e516e3 feat(ingestion): Implement MR sync with parallel discussion prefetch
Adds complete merge request ingestion pipeline with a novel two-phase
discussion sync strategy optimized for throughput.

New modules:
- merge_requests.rs: MR upsert with labels/assignees/reviewers handling,
  stale MR cleanup, and watermark-based incremental sync
- mr_discussions.rs: Parallel prefetch strategy for MR discussions

Two-phase MR discussion sync:
1. PREFETCH PHASE: Spawn concurrent tasks to fetch discussions for
   multiple MRs simultaneously (configurable concurrency, default 8).
   Transform and validate in parallel, storing results in memory.
2. WRITE PHASE: Serial database writes to avoid lock contention.
   Each MR's discussions written in a single transaction, with
   proper stale discussion cleanup.
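
The two phases can be sketched with scoped threads from the standard library. This is a simplified stand-in, not the code in mr_discussions.rs: it uses OS threads in fixed-size batches rather than async tasks, and `sync_discussions`, `fetch`, and `write` are hypothetical names.

```rust
use std::thread;

/// Phase 1 fetches in parallel (at most `concurrency` at a time) and keeps
/// results in memory; phase 2 writes serially. Returns the ids whose fetch or
/// transform failed, so the caller can leave their watermarks untouched.
pub fn sync_discussions<T, E, F, W>(
    mr_ids: &[u64],
    concurrency: usize,
    fetch: F,
    mut write: W,
) -> Vec<u64>
where
    T: Send,
    E: Send,
    F: Fn(u64) -> Result<T, E> + Sync,
    W: FnMut(u64, T),
{
    // PREFETCH PHASE: one scoped thread per MR within each batch.
    let mut prefetched: Vec<(u64, Result<T, E>)> = Vec::with_capacity(mr_ids.len());
    for batch in mr_ids.chunks(concurrency.max(1)) {
        let batch_results: Vec<(u64, Result<T, E>)> = thread::scope(|scope| {
            let handles: Vec<_> = batch
                .iter()
                .map(|&id| {
                    let fetch = &fetch;
                    scope.spawn(move || (id, fetch(id)))
                })
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("prefetch thread panicked"))
                .collect()
        });
        prefetched.extend(batch_results);
    }

    // WRITE PHASE: strictly serial, so there is a single database writer.
    let mut failed = Vec::new();
    for (id, result) in prefetched {
        match result {
            Ok(discussions) => write(id, discussions),
            Err(_) => failed.push(id),
        }
    }
    failed
}
```

In this sketch a failed MR is skipped entirely in the write phase, which mirrors the stated goal of avoiding partial writes.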

This approach achieves ~4-8x throughput vs serial fetching while
maintaining database consistency. Transform errors are tracked per-MR
to prevent partial writes from corrupting watermarks.

Orchestrator updates:
- ingest_merge_requests(): Coordinates MR fetch -> discussion sync flow
- Progress callbacks emit MR-specific events for UI feedback
- Respects --full flag to reset discussion watermarks for full resync

The prefetch strategy is critical for MRs which typically have more
discussions than issues, and where API latency dominates sync time.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:45:48 -05:00
Taylor Eernisse
d33f24c91b feat(transformers): Add MR transformer and polymorphic discussion support
Introduces NormalizedMergeRequest transformer and updates discussion
normalization to handle both issue and MR discussions polymorphically.

New transformers:
- NormalizedMergeRequest: Transforms API MergeRequest to database row,
  extracting labels/assignees/reviewers into separate collections for
  junction table insertion. Handles draft detection, detailed_merge_status
  preference over deprecated merge_status, and merge_user over merged_by.
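
The field-preference rule can be expressed with `Option::or`. The helper names below are hypothetical and the fields are reduced to plain string slices for brevity; the real transformer works on the deserialized API types.

```rust
/// Prefers the modern detailed_merge_status, falling back to the
/// deprecated merge_status when the modern field is absent.
pub fn effective_merge_status<'a>(
    detailed_merge_status: Option<&'a str>,
    merge_status: Option<&'a str>,
) -> Option<&'a str> {
    detailed_merge_status.or(merge_status)
}

/// Same rule for the user who merged: merge_user over merged_by.
pub fn effective_merge_user<'a>(
    merge_user: Option<&'a str>,
    merged_by: Option<&'a str>,
) -> Option<&'a str> {
    merge_user.or(merged_by)
}
```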

Discussion transformer updates:
- NormalizedDiscussion now takes noteable_type ("Issue" | "MergeRequest")
  and noteable_id for polymorphic FK binding
- normalize_discussions_for_issue(): Convenience wrapper for issues
- normalize_discussions_for_mr(): Convenience wrapper for MRs
- DiffNote position fields (type, line_range, SHA triplet) now extracted
  from API position object for code review context

Design decisions:
- Transformer returns (normalized_item, labels, assignees, reviewers)
  tuple for efficient batch insertion without re-querying
- Timestamps converted to ms epoch for SQLite storage consistency
- Optional fields use map() chains for clean null handling

The polymorphic discussion approach allows reusing the same discussions
and notes tables for both issues and MRs, with noteable_type + FK
determining the parent relationship.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:45:29 -05:00
Taylor Eernisse
cc8c489fd2 feat(gitlab): Add MR and MR discussion API endpoints to client
Extends GitLabClient with endpoints for fetching merge requests and
their discussions, following the same patterns established for issues.

New methods:
- fetch_merge_requests(): Paginated MR listing with cursor support,
  using updated_after filter for incremental sync. Uses 'all' scope
  to include MRs where user is author/assignee/reviewer.
- fetch_merge_requests_single_page(): Single page variant for callers
  managing their own pagination (used by parallel prefetch)
- fetch_mr_discussions(): Paginated discussion listing for a single MR,
  returns full discussion trees with notes

API design notes:
- Uses keyset pagination (order_by=updated_at, keyset=true) for
  consistent results during sync operations
- MR endpoint uses /merge_requests (not /mrs) per GitLab API naming
- Discussion endpoint matches issue pattern for consistency
- per_page defaults to 100 (GitLab max) for efficiency

The fetch_merge_requests_single_page method enables the parallel
prefetch strategy used in mr_discussions.rs, where multiple MRs'
discussions are fetched concurrently during the prefetch phase.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:45:13 -05:00
Taylor Eernisse
a18908c377 feat(gitlab): Add MergeRequest and related types for API deserialization
Extends GitLab type definitions with comprehensive merge request support,
matching the API response structure for /projects/:id/merge_requests.

New types:
- MergeRequest: Full MR metadata including draft status, branch info,
  detailed_merge_status, merge_user (modern API fields replacing
  deprecated alternatives), and references for cross-project support
- MrReviewer: Reviewer user info (MR-specific, distinct from assignees)
- MrAssignee: Assignee user info with consistent structure
- MrDiscussion: MR discussion wrapper for polymorphic handling
- DiffNotePosition: Rich position data for code review comments with
  line ranges and SHA triplet for commit context

Design decisions:
- Use Option<T> for all nullable API fields to handle partial responses
- Include deprecated fields (merged_by, merge_status) alongside modern
  alternatives for backward compatibility with older GitLab instances
- DiffNotePosition uses Option for all fields since different position
  types (text/image/file) populate different subsets

These types enable type-safe deserialization of GitLab MR API responses
with full coverage of the fields needed for CP2 ingestion.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:44:58 -05:00
Taylor Eernisse
39a71d8b85 feat(db): Add schema migration v6 for merge request support
Introduces comprehensive database schema for merge request ingestion
(CP2), designed with forward compatibility for future features.

New tables:
- merge_requests: Core MR metadata with draft status, branch info,
  detailed_merge_status (modern API field), and sync health telemetry
  columns for debuggability
- mr_labels: Junction table linking MRs to shared labels table
- mr_assignees: MR assignee usernames (same pattern as issues)
- mr_reviewers: MR-specific reviewer tracking (not applicable to issues)

Additional indexes:
- discussions: Add merge_request_id and resolved status indexes
- notes: Add composite indexes for DiffNote file/line queries

DiffNote position enhancements:
- position_type: 'text' | 'image' | 'file' for diff comment semantics
- position_line_range_start/end: Multi-line comment range support
- position_base_sha/start_sha/head_sha: Commit context for diff notes

The schema captures CP3-ready fields (head_sha, references_short/full,
SHA triplet) at zero additional API cost, preparing for file-context
and cross-project reference features.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:44:37 -05:00
Taylor Eernisse
8afb2c2e75 docs: Expand README with comprehensive CLI and config documentation
Significantly expand the README to serve as complete user documentation
for the CLI tool, reflecting the full CP1 implementation.

Configuration section:
- Add missing config options: heartbeatIntervalSeconds, primaryConcurrency,
  dependentConcurrency, backupDir, embedding provider settings
- Document config file resolution order (CLI flag, env var, XDG, local)
- Add environment variables table with GITLAB_TOKEN, GI_CONFIG_PATH,
  XDG_CONFIG_HOME, XDG_DATA_HOME, RUST_LOG

Commands section:
- Document --full flag for complete re-sync (resets cursors and watermarks)
- Add output descriptions for list, show, and count commands
- Document assignee filter with @ prefix normalization
- Add gi doctor checks explanation (config, db, GitLab auth, Ollama)
- Add gi sync-status output description
- Add placeholder documentation for backup and reset commands

Database schema section:
- Reformat as table with descriptions
- Add sync_runs, sync_cursors, app_locks, schema_version tables
- Note WAL mode and foreign keys enabled

Development section:
- Add RUST_LOG=gi=trace example for detailed logging

Current status section:
- Document CP1 scope (issues, discussions, incremental sync)
- List not-yet-implemented features (MRs, embeddings, backup/reset)
- Reference SPEC.md for full roadmap

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:01:37 -05:00
Taylor Eernisse
0952d21a90 docs(prd): Add CP2 PRD and CP1-CP2 alignment audit
Add comprehensive planning documentation for Checkpoint 2 (Merge Request
support) and document the results of the CP1 implementation audit.

checkpoint-2.md (2093 lines):
- Complete PRD for adding merge request ingestion, querying, and display
- Detailed user stories with acceptance criteria
- ASCII wireframes for CLI output formats
- Database schema extensions (migrations 006-007)
- API integration specifications for MR endpoints
- State transition diagrams for MR lifecycle
- Performance requirements and test specifications
- Risk assessment and mitigation strategies

cp1-cp2-alignment-audit.md (344 lines):
- Gap analysis between CP1 PRD and actual implementation
- Identified issues prioritized by severity (P0/P1/P2/P3)
- P0: NormalizedDiscussion struct incompatible with MR discussions
- P1: --full flag not resetting discussion watermarks
- P2: Missing Link header pagination fallback
- P3: Missing sync health telemetry and selective payload storage
- Each issue includes root cause, recommended fix, and affected files

The P0 and P1 issues have been fixed in accompanying commits. P2 and P3
items are deferred to CP2 implementation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:01:20 -05:00
Taylor Eernisse
4abbe2a226 fix(ingest): Reset discussion watermarks when --full flag is used
This is a P1 fix from the CP1-CP2 alignment audit. The --full flag was
designed to enable complete data re-synchronization, but it only reset
sync_cursors for issues; it failed to reset the per-issue
discussions_synced_for_updated_at watermark.

The result was an inconsistent state: issues would be re-fetched from
GitLab (because sync_cursors were cleared), but their discussions would
NOT be re-synced (because the watermark comparison prevented it). This
was a subtle bug because the watermark check uses:

  WHERE updated_at > COALESCE(discussions_synced_for_updated_at, 0)

When discussions_synced_for_updated_at is already set to the issue's
updated_at, the comparison fails and discussions are skipped.

Fix: Before clearing sync_cursors, set discussions_synced_for_updated_at
to NULL for all issues in the project. This makes COALESCE return 0,
ensuring all issues become eligible for discussion sync.
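
The watermark comparison and the effect of resetting it to NULL can be mirrored in a few lines. The function name is hypothetical; `None` stands in for a NULL discussions_synced_for_updated_at.

```rust
/// Mirrors: WHERE updated_at > COALESCE(discussions_synced_for_updated_at, 0)
pub fn needs_discussion_sync(updated_at_ms: i64, synced_for_updated_at_ms: Option<i64>) -> bool {
    updated_at_ms > synced_for_updated_at_ms.unwrap_or(0)
}
```

Before the fix, an issue whose watermark equals its updated_at is skipped; once the watermark is NULL, any issue with a positive updated_at qualifies again.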

The ordering is important: watermarks must be reset BEFORE cursors to
ensure the full sync behaves consistently.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:01:04 -05:00
Taylor Eernisse
d9d749ac57 fix(discussion): Make NormalizedDiscussion polymorphic for MR support
This is a P0 fix from the CP1-CP2 alignment audit. The original
NormalizedDiscussion struct had issue_id as a non-optional i64 and
hardcoded noteable_type to "Issue", making it incompatible with merge
request discussions even though the database schema already supports
both via nullable columns and a CHECK constraint.

Changes:
- Add NoteableRef enum with Issue(i64) and MergeRequest(i64) variants
  to provide compile-time safety against mixing up issue vs MR IDs
- Change NormalizedDiscussion.issue_id from i64 to Option<i64>
- Add NormalizedDiscussion.merge_request_id: Option<i64>
- Update transform_discussion() signature to take NoteableRef instead
  of local_issue_id, deriving issue_id/merge_request_id/noteable_type
  from the enum variant
- Update upsert_discussion() SQL to include merge_request_id column
  (now 12 parameters instead of 11)
- Export NoteableRef from transformers module
- Add test for MergeRequest discussion transformation
- Update all existing tests to use NoteableRef::Issue(id)
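
The enum and the derivation of the three fields might look like the sketch below. The variants come from the change list above; the accessor methods are hypothetical, since the message only says the values are derived from the enum variant.

```rust
/// Parent of a discussion: exactly one of an issue or a merge request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NoteableRef {
    Issue(i64),
    MergeRequest(i64),
}

impl NoteableRef {
    /// Value stored in the noteable_type column.
    pub fn noteable_type(self) -> &'static str {
        match self {
            NoteableRef::Issue(_) => "Issue",
            NoteableRef::MergeRequest(_) => "MergeRequest",
        }
    }

    /// Local issue id, set only for issue discussions.
    pub fn issue_id(self) -> Option<i64> {
        match self {
            NoteableRef::Issue(id) => Some(id),
            NoteableRef::MergeRequest(_) => None,
        }
    }

    /// Local merge request id, set only for MR discussions.
    pub fn merge_request_id(self) -> Option<i64> {
        match self {
            NoteableRef::Issue(_) => None,
            NoteableRef::MergeRequest(id) => Some(id),
        }
    }
}
```

Because the id travels inside the variant, a caller cannot pass an issue id where an MR id is expected without the compiler noticing.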

The database schema (migration 002) was forward-thinking and already
supports both issue_id and merge_request_id as nullable columns with
a CHECK constraint. This change prepares the application layer for
CP2 merge request support without requiring any migrations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:00:49 -05:00
teernisse
fbdfd8f4cb beads 2026-01-26 11:34:04 -05:00
Taylor Eernisse
f53645790a test: Add comprehensive test suite with fixtures
Establishes testing infrastructure for reliable development.

tests/fixtures/ - GitLab API response samples:
- gitlab_issue.json: Single issue with full metadata
- gitlab_issues_page.json: Paginated issue list response
- gitlab_discussion.json: Discussion thread with notes
- gitlab_discussions_page.json: Paginated discussions response
All fixtures captured from real GitLab API responses with
sensitive data redacted, ensuring tests match actual behavior.

tests/gitlab_types_tests.rs - Type deserialization tests:
- Validates serde parsing of all GitLab API types
- Tests edge cases: null fields, empty arrays, nested objects
- Ensures GitLabIssue, GitLabDiscussion, GitLabNote parse correctly
- Verifies optional fields handle missing data gracefully
- Tests author/assignee extraction from various formats

tests/fixture_tests.rs - Integration with fixtures:
- Loads fixture files and validates parsing
- Tests transformer functions produce correct database rows
- Verifies IssueWithMetadata extracts labels and assignees
- Tests NormalizedDiscussion/NormalizedNote structure
- Validates raw payload preservation logic

tests/migration_tests.rs - Database schema tests:
- Creates in-memory SQLite for isolation
- Runs all migrations and verifies schema
- Tests table creation with expected columns
- Validates foreign key constraints
- Tests index creation for query performance
- Verifies idempotent migration behavior

Test infrastructure uses:
- tempfile for isolated database instances
- wiremock for HTTP mocking (available for future API tests)
- Standard Rust #[test] attributes

Run with: cargo test
Run single: cargo test test_name
Run with output: cargo test -- --nocapture

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:29:06 -05:00
Taylor Eernisse
8fb890c528 feat(cli): Implement complete command-line interface
Provides a user-friendly CLI for all GitLab Inbox operations.

src/cli/mod.rs - Clap command definitions:
- Global --config flag for alternate config path
- Subcommands: init, auth-test, doctor, version, backup, reset,
  migrate, sync-status, ingest, list, count, show
- Ingest supports --type (issues/merge_requests), --project filter,
  --force lock override, --full resync
- List supports rich filtering: --state, --author, --assignee,
  --label, --milestone, --since, --due-before, --has-due-date
- List supports --sort (updated/created/iid), --order (asc/desc)
- List supports --open to launch browser, --json for scripting

src/cli/commands/ - Command implementations:

init.rs: Interactive configuration wizard
- Prompts for GitLab URL, token env var, projects to track
- Creates config file and initializes database
- Supports --force overwrite and --non-interactive mode

auth_test.rs: Verify GitLab authentication
- Calls /api/v4/user to validate token
- Displays username and GitLab instance URL

doctor.rs: Environment health check
- Validates config file exists and parses correctly
- Checks database connectivity and migration state
- Verifies GitLab authentication
- Reports token environment variable status
- Supports --json output for CI integration

ingest.rs: Data synchronization from GitLab
- Acquires sync lock with stale detection
- Shows progress bars for issues and discussions
- Reports sync statistics on completion
- Supports --full flag to reset cursors and refetch all data

list.rs: Query local database
- Formatted table output with comfy-table
- Filters build dynamic SQL with parameterized queries
- Username filters normalize @ prefix automatically
- --open flag uses 'open' crate for cross-platform browser launch
- --json outputs array of issue objects
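
The filter handling in list.rs can be illustrated roughly as follows. Everything here is an assumption made for the example: the function names, the table and column names, and the restriction to two filters. The point is only that user input goes into bound parameters, never into the SQL text.

```rust
/// Strips a single leading '@' so "@alice" and "alice" match the same user.
pub fn normalize_username(raw: &str) -> &str {
    raw.strip_prefix('@').unwrap_or(raw)
}

/// Builds a query string with '?' placeholders plus the values to bind.
/// Table and column names are hypothetical.
pub fn build_issue_query(state: Option<&str>, author: Option<&str>) -> (String, Vec<String>) {
    let mut sql = String::from("SELECT iid, title FROM issues");
    let mut clauses: Vec<&str> = Vec::new();
    let mut params: Vec<String> = Vec::new();

    if let Some(s) = state {
        clauses.push("state = ?");
        params.push(s.to_string());
    }
    if let Some(a) = author {
        clauses.push("author_username = ?");
        params.push(normalize_username(a).to_string());
    }
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    (sql, params)
}
```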

show.rs: Detailed entity view
- Displays issue metadata in structured format
- Shows full description with markdown
- Lists labels, assignees, milestone
- Shows discussion threads with notes

count.rs: Entity statistics
- Counts issues, discussions, or notes
- Supports --type filter for discussions/notes

sync_status.rs: Display sync watermarks
- Shows last sync time per project
- Displays cursor positions for debugging

src/main.rs - Application entry point:
- Initializes tracing subscriber with env-filter
- Parses CLI arguments via clap
- Dispatches to appropriate command handler
- Consistent error formatting for all failure modes

src/lib.rs - Library entry point:
- Exports cli, core, gitlab, ingestion modules
- Re-exports Config, GiError, Result for convenience

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:28:52 -05:00