Gitlore
Local GitLab data management with semantic search and temporal intelligence. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, hybrid search, and chronological event reconstruction.
Features
- Local-first: All data stored in SQLite for instant queries
- Incremental sync: Cursor-based sync only fetches changes since last sync
- Full re-sync: Reset cursors and fetch all data from scratch when needed
- Multi-project: Track issues and MRs across multiple GitLab projects
- Rich filtering: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches
- Hybrid search: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
- Timeline pipeline: Reconstructs chronological event histories by combining search, graph traversal, and event aggregation across related entities
- Git history linking: Tracks merge and squash commit SHAs to connect MRs with git history
- File change tracking: Records which files each MR touches, enabling file-level history queries
- Raw payload storage: Preserves original GitLab API responses for debugging
- Discussion threading: Full support for issue and MR discussions including inline code review comments
- Cross-reference tracking: Automatic extraction of "closes", "mentioned" relationships between MRs and issues
- Resource event history: Tracks state changes, label events, and milestone events for issues and MRs
- Robot mode: Machine-readable JSON output with structured errors and meaningful exit codes
- Observability: Verbosity controls, JSON log format, structured metrics, and stage timing
Installation
cargo install --path .
Or build from source:
cargo build --release
./target/release/lore --help
Quick Start
# Initialize configuration (interactive)
lore init
# Verify authentication
lore auth
# Sync everything from GitLab (issues + MRs + docs + embeddings)
lore sync
# List recent issues
lore issues -n 10
# List open merge requests
lore mrs -s opened
# Show issue details
lore issues 123
# Show MR details with discussions
lore mrs 456
# Search across all indexed data
lore search "authentication bug"
# Robot mode (machine-readable JSON)
lore -J issues -n 5 | jq .
Configuration
Configuration is stored in ~/.config/lore/config.json (or $XDG_CONFIG_HOME/lore/config.json).
Example Configuration
{
"gitlab": {
"baseUrl": "https://gitlab.com",
"tokenEnvVar": "GITLAB_TOKEN"
},
"projects": [
{ "path": "group/project" },
{ "path": "other-group/other-project" }
],
"sync": {
"backfillDays": 14,
"staleLockMinutes": 10,
"heartbeatIntervalSeconds": 30,
"cursorRewindSeconds": 2,
"primaryConcurrency": 4,
"dependentConcurrency": 2
},
"storage": {
"compressRawPayloads": true
},
"embedding": {
"provider": "ollama",
"model": "nomic-embed-text",
"baseUrl": "http://localhost:11434",
"concurrency": 4
}
}
Configuration Options
| Section | Field | Default | Description |
|---|---|---|---|
| gitlab | baseUrl | -- | GitLab instance URL (required) |
| gitlab | tokenEnvVar | GITLAB_TOKEN | Environment variable containing API token |
| projects | path | -- | Project path (e.g., group/project) |
| sync | backfillDays | 14 | Days to backfill on initial sync |
| sync | staleLockMinutes | 10 | Minutes before sync lock considered stale |
| sync | heartbeatIntervalSeconds | 30 | Frequency of lock heartbeat updates |
| sync | cursorRewindSeconds | 2 | Seconds to rewind cursor for overlap safety |
| sync | primaryConcurrency | 4 | Concurrent GitLab requests for primary resources |
| sync | dependentConcurrency | 2 | Concurrent requests for dependent resources |
| storage | dbPath | ~/.local/share/lore/lore.db | Database file path |
| storage | backupDir | ~/.local/share/lore/backups | Backup directory |
| storage | compressRawPayloads | true | Compress stored API responses with gzip |
| embedding | provider | ollama | Embedding provider |
| embedding | model | nomic-embed-text | Model name for embeddings |
| embedding | baseUrl | http://localhost:11434 | Ollama server URL |
| embedding | concurrency | 4 | Concurrent embedding requests |
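The two timing fields under sync can be read as simple arithmetic. The sketch below is an assumption about how cursorRewindSeconds and staleLockMinutes might be applied, not the project's actual code; the function names and the use of Unix-second timestamps are hypothetical.

```rust
/// Hypothetical helper: where an incremental fetch would start, given the
/// stored cursor (Unix seconds) and the configured rewind for overlap safety.
fn fetch_start(cursor_secs: i64, cursor_rewind_seconds: i64) -> i64 {
    cursor_secs.saturating_sub(cursor_rewind_seconds).max(0)
}

/// Hypothetical helper: a lock counts as stale once its last heartbeat is
/// older than `stale_lock_minutes`.
fn lock_is_stale(last_heartbeat_secs: i64, now_secs: i64, stale_lock_minutes: i64) -> bool {
    now_secs.saturating_sub(last_heartbeat_secs) > stale_lock_minutes * 60
}
```

With the defaults, a cursor at second 1000 would refetch from second 998, and a lock whose heartbeat is more than 600 seconds old would be considered stale.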
Config File Resolution
The config file is resolved in this order:
1. --config / -c CLI flag
2. LORE_CONFIG_PATH environment variable
3. ~/.config/lore/config.json (XDG default)
4. ./lore.config.json (local fallback for development)
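The resolution order can be sketched as a small function. This is an illustration only: the function name and the injected `exists` check are hypothetical, and it assumes that an explicit flag or environment value is used as given, without checking that the file exists.

```rust
use std::path::PathBuf;

/// Resolve the config path in the documented order. `exists` is injected so
/// the logic can be exercised without touching the filesystem.
fn resolve_config_path(
    cli_flag: Option<&str>,
    env_path: Option<&str>,
    xdg_default: &str,
    exists: impl Fn(&str) -> bool,
) -> Option<PathBuf> {
    // 1. --config / -c CLI flag
    if let Some(p) = cli_flag {
        return Some(PathBuf::from(p));
    }
    // 2. LORE_CONFIG_PATH environment variable
    if let Some(p) = env_path {
        return Some(PathBuf::from(p));
    }
    // 3. XDG default, if the file is present
    if exists(xdg_default) {
        return Some(PathBuf::from(xdg_default));
    }
    // 4. Local fallback for development
    let local = "./lore.config.json";
    if exists(local) {
        return Some(PathBuf::from(local));
    }
    None
}
```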
GitLab Token
Create a personal access token with read_api scope:
- Go to GitLab > Settings > Access Tokens
- Create token with read_api scope
- Export it:
export GITLAB_TOKEN=glpat-xxxxxxxxxxxx
Environment Variables
| Variable | Purpose | Required |
|---|---|---|
| GITLAB_TOKEN | GitLab API authentication token (name configurable via gitlab.tokenEnvVar) | Yes |
| LORE_CONFIG_PATH | Override config file location | No |
| LORE_ROBOT | Enable robot mode globally (set to true or 1) | No |
| XDG_CONFIG_HOME | XDG Base Directory for config (fallback: ~/.config) | No |
| XDG_DATA_HOME | XDG Base Directory for data (fallback: ~/.local/share) | No |
| NO_COLOR | Disable color output when set (any value) | No |
| CLICOLOR | Standard color control (0 to disable) | No |
| RUST_LOG | Logging level filter (e.g., lore=debug) | No |
Commands
lore issues
Query issues from local database, or show a specific issue.
lore issues # Recent issues (default 50)
lore issues 123 # Show issue #123 with discussions
lore issues 123 -p group/repo # Disambiguate by project
lore issues -n 100 # More results
lore issues -s opened # Only open issues
lore issues -s closed # Only closed issues
lore issues -a username # By author (@ prefix optional)
lore issues -A username # By assignee (@ prefix optional)
lore issues -l bug # By label (AND logic)
lore issues -l bug -l urgent # Multiple labels
lore issues -m "v1.0" # By milestone title
lore issues --since 7d # Updated in last 7 days
lore issues --since 2w # Updated in last 2 weeks
lore issues --since 1m # Updated in last month
lore issues --since 2024-01-01 # Updated since date
lore issues --due-before 2024-12-31 # Due before date
lore issues --has-due # Only issues with due dates
lore issues -p group/repo # Filter by project
lore issues --sort created --asc # Sort by created date, ascending
lore issues -o # Open first result in browser
# Field selection (robot mode)
lore -J issues --fields minimal # Compact: iid, title, state, updated_at_iso
lore -J issues --fields iid,title,labels,state # Custom fields
When listing, output includes: IID, title, state, author, assignee, labels, and update time. In robot mode, the --fields flag controls which fields appear in the JSON response.
When showing a single issue (e.g., lore issues 123), output includes: title, description, state, author, assignees, labels, milestone, due date, web URL, and threaded discussions.
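The relative values accepted by --since (7d, 2w, 1m) can be thought of as a count plus a unit. The sketch below is one possible parser, not the tool's implementation; the function name is hypothetical, and treating a month as 30 days is an assumption. Inputs that are not relative, such as 2024-01-01, return None so a caller could fall back to date parsing.

```rust
/// Parse a relative window such as "7d", "2w", or "1m" into a number of days.
/// Assumption: a month counts as 30 days; the real tool may differ.
fn relative_days(input: &str) -> Option<u32> {
    let unit = input.chars().last()?;
    let digits = &input[..input.len() - unit.len_utf8()];
    let n: u32 = digits.parse().ok()?;
    let per_unit = match unit {
        'd' => 1,
        'w' => 7,
        'm' => 30,
        _ => return None,
    };
    n.checked_mul(per_unit)
}
```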
Project Resolution
The -p / --project flag uses cascading match logic across all commands:
- Exact match: group/project
- Case-insensitive: Group/Project
- Suffix match: project matches group/project (if unambiguous)
- Substring match: typescript matches vs/typescript-code (if unambiguous)
If multiple projects match, an error lists the candidates with a hint to use the full path.
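The cascade described above can be sketched as a loop over match tiers that stops at the first tier producing any hit. The type and function names are hypothetical, and applying case-insensitivity to the suffix and substring tiers is an assumption rather than documented behavior.

```rust
/// Result of resolving a `-p` value against the tracked project paths.
#[derive(Debug, PartialEq)]
enum Resolution {
    Found(String),
    Ambiguous(Vec<String>),
    NotFound,
}

fn resolve_project(query: &str, projects: &[&str]) -> Resolution {
    let q = query.to_lowercase();
    let suffix = format!("/{q}");
    // Tiers: 0 exact, 1 case-insensitive, 2 suffix, 3 substring.
    for tier in 0..4 {
        let hits: Vec<String> = projects
            .iter()
            .filter(|p| {
                let lower = p.to_lowercase();
                match tier {
                    0 => **p == query,
                    1 => lower == q,
                    2 => lower.ends_with(&suffix),
                    _ => lower.contains(&q),
                }
            })
            .map(|p| p.to_string())
            .collect();
        match hits.len() {
            0 => continue, // nothing at this tier, try a looser one
            1 => return Resolution::Found(hits.into_iter().next().unwrap()),
            _ => return Resolution::Ambiguous(hits), // caller lists candidates
        }
    }
    Resolution::NotFound
}
```

Stopping at the first tier with hits means a looser tier can never override an unambiguous stricter match.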
lore mrs
Query merge requests from local database, or show a specific MR.
lore mrs # Recent MRs (default 50)
lore mrs 456 # Show MR !456 with discussions
lore mrs 456 -p group/repo # Disambiguate by project
lore mrs -n 100 # More results
lore mrs -s opened # Only open MRs
lore mrs -s merged # Only merged MRs
lore mrs -s closed # Only closed MRs
lore mrs -s locked # Only locked MRs
lore mrs -s all # All states
lore mrs -a username # By author (@ prefix optional)
lore mrs -A username # By assignee (@ prefix optional)
lore mrs -r username # By reviewer (@ prefix optional)
lore mrs -d # Only draft/WIP MRs
lore mrs -D # Exclude draft MRs
lore mrs --target main # By target branch
lore mrs --source feature/foo # By source branch
lore mrs -l needs-review # By label (AND logic)
lore mrs --since 7d # Updated in last 7 days
lore mrs -p group/repo # Filter by project
lore mrs --sort created --asc # Sort by created date, ascending
lore mrs -o # Open first result in browser
# Field selection (robot mode)
lore -J mrs --fields minimal # Compact: iid, title, state, updated_at_iso
lore -J mrs --fields iid,title,draft,target_branch # Custom fields
When listing, output includes: IID, title (with [DRAFT] prefix if applicable), state, author, assignee, labels, and update time.
When showing a single MR (e.g., lore mrs 456), output includes: title, description, state, draft status, author, assignees, reviewers, labels, source/target branches, merge status, web URL, and threaded discussions. Inline code review comments (DiffNotes) display file context in the format [src/file.ts:45].
lore search
Search across indexed documents using hybrid (lexical + semantic), lexical-only, or semantic-only modes.
lore search "authentication bug" # Hybrid search (default)
lore search "login flow" --mode lexical # FTS5 lexical only
lore search "login flow" --mode semantic # Vector similarity only
lore search "auth" --type issue # Filter by source type
lore search "auth" --type mr # MR documents only
lore search "auth" --type discussion # Discussion documents only
lore search "deploy" --author username # Filter by author
lore search "deploy" -p group/repo # Filter by project
lore search "deploy" --label backend # Filter by label (AND logic)
lore search "deploy" --path src/ # Filter by file path (trailing / for prefix)
lore search "deploy" --after 7d # Created after (7d, 2w, 1m, or YYYY-MM-DD)
lore search "deploy" --updated-after 2w # Updated after
lore search "deploy" -n 50 # Limit results (default 20, max 100)
lore search "deploy" --explain # Show ranking explanation per result
lore search "deploy" --fts-mode raw # Raw FTS5 query syntax (advanced)
Requires lore generate-docs (or lore sync) to have been run at least once. Semantic and hybrid modes require lore embed (or lore sync) to have generated vector embeddings via Ollama.
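Hybrid mode is described as combining the lexical and vector rankings via Reciprocal Rank Fusion. A minimal sketch of that technique follows; the function name, the use of numeric document IDs, and the choice of k are assumptions (k = 60 is a common convention, not a value confirmed by this project).

```rust
use std::collections::HashMap;

/// Fuse ranked lists of document IDs with Reciprocal Rank Fusion.
/// Each document scores the sum of 1 / (k + rank) over the lists it appears
/// in, with rank starting at 1.
fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc_id) in ranking.iter().enumerate() {
            *scores.entry(*doc_id).or_insert(0.0) += 1.0 / (k + (i as f64) + 1.0);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by document ID for a deterministic order.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

A document that appears in both lists accumulates two terms, so it tends to outrank a document that is first in only one list.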
lore sync
Run the full sync pipeline: ingest from GitLab, generate searchable documents, and compute embeddings.
lore sync # Full pipeline
lore sync --full # Reset cursors, fetch everything
lore sync --force # Override stale lock
lore sync --no-embed # Skip embedding step
lore sync --no-docs # Skip document regeneration
lore sync --no-events # Skip resource event fetching
The sync command displays animated progress bars for each stage and outputs timing metrics on completion. In robot mode (-J), detailed stage timing is included in the JSON response.
lore ingest
Sync data from GitLab to local database. Runs only the ingestion step (no doc generation or embeddings).
lore ingest # Ingest everything (issues + MRs)
lore ingest issues # Issues only
lore ingest mrs # MRs only
lore ingest issues -p group/repo # Single project
lore ingest --force # Override stale lock
lore ingest --full # Full re-sync (reset cursors)
The --full flag resets sync cursors and discussion watermarks, then fetches all data from scratch. Useful when:
- Assignee data or other fields were missing from earlier syncs
- You want to ensure complete data after schema changes
- Troubleshooting sync issues
lore generate-docs
Extract searchable documents from ingested issues, MRs, and discussions for the FTS5 index.
lore generate-docs # Incremental (dirty items only)
lore generate-docs --full # Full rebuild
lore generate-docs -p group/repo # Single project
lore embed
Generate vector embeddings for documents via Ollama. Requires Ollama running with the configured embedding model.
lore embed # Embed new/changed documents
lore embed --retry-failed # Retry previously failed embeddings
lore count
Count entities in local database.
lore count issues # Total issues
lore count mrs # Total MRs (with state breakdown)
lore count discussions # Total discussions
lore count discussions --for issue # Issue discussions only
lore count discussions --for mr # MR discussions only
lore count notes # Total notes (system vs user breakdown)
lore count notes --for issue # Issue notes only
lore stats
Show document and index statistics, with optional integrity checks.
lore stats # Document and index statistics
lore stats --check # Run integrity checks
lore stats --check --repair # Repair integrity issues
lore status
Show current sync state and watermarks.
lore status
Displays:
- Last sync run details (status, timing)
- Cursor positions per project and resource type (issues and MRs)
- Data summary counts
lore init
Initialize configuration and database interactively.
lore init # Interactive setup
lore init --force # Overwrite existing config
lore init --non-interactive # Fail if prompts needed
lore auth
Verify GitLab authentication is working.
lore auth
# Authenticated as @username (Full Name)
# GitLab: https://gitlab.com
lore doctor
Check environment health and configuration.
lore doctor
Checks performed:
- Config file existence and validity
- Database existence and pragmas (WAL mode, foreign keys)
- GitLab authentication
- Project accessibility
- Ollama connectivity (optional)
lore migrate
Run pending database migrations.
lore migrate
lore health
Quick pre-flight check for config, database, and schema version. Exits 0 if healthy, 1 if unhealthy.
lore health
Useful as a fast gate before running queries or syncs. For a more thorough check including authentication and project access, use lore doctor.
lore robot-docs
Machine-readable command manifest for agent self-discovery. Returns a JSON schema of all commands, flags, exit codes, and example workflows.
lore robot-docs # Pretty-printed JSON
lore --robot robot-docs # Compact JSON for parsing
lore version
Show version information including the git commit hash.
lore version
# lore version 0.1.0 (abc1234)
Robot Mode
Machine-readable JSON output for scripting and AI agent consumption. All responses use compact (single-line) JSON with a uniform envelope and timing metadata.
Activation
# Global flag
lore --robot issues -n 5
# JSON shorthand (-J)
lore -J issues -n 5
# Environment variable
LORE_ROBOT=1 lore issues -n 5
# Auto-detection (when stdout is not a TTY)
lore issues -n 5 | jq .
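Taken together, the four activation paths reduce to a single boolean decision. The sketch below is an illustration under that reading; the function name is hypothetical, and the caller is assumed to supply whether stdout is a terminal (for example via std::io::IsTerminal).

```rust
/// Decide whether robot mode is active from the documented triggers:
/// --robot / -J, LORE_ROBOT set to "true" or "1", or stdout not being a TTY.
fn robot_mode(cli_flag: bool, lore_robot_env: Option<&str>, stdout_is_tty: bool) -> bool {
    let env_enabled = matches!(lore_robot_env, Some("true") | Some("1"));
    cli_flag || env_enabled || !stdout_is_tty
}
```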
Response Format
All commands return a consistent JSON envelope to stdout:
{"ok":true,"data":{...},"meta":{"elapsed_ms":42}}
Every response includes meta.elapsed_ms (wall-clock milliseconds for the command).
Errors return structured JSON to stderr with machine-actionable recovery steps:
{"error":{"code":"CONFIG_NOT_FOUND","message":"...","suggestion":"Run 'lore init'","actions":["lore init"]}}
The actions array contains executable shell commands an agent can run to recover from the error. It is omitted when empty (e.g., for generic I/O errors).
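To make the error shape concrete, here is a hand-rolled sketch that builds the envelope and omits actions when the list is empty. It is not the project's code: the function names are hypothetical, and a real implementation would more likely serialize with serde. The escaping step matters because an unescaped quote or backslash in a message would produce malformed JSON.

```rust
/// Escape the characters that would break a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the compact error envelope; `actions` is left out when empty.
fn error_envelope(code: &str, message: &str, suggestion: &str, actions: &[&str]) -> String {
    let mut body = format!(
        "\"code\":\"{}\",\"message\":\"{}\",\"suggestion\":\"{}\"",
        json_escape(code),
        json_escape(message),
        json_escape(suggestion)
    );
    if !actions.is_empty() {
        let items: Vec<String> = actions
            .iter()
            .map(|a| format!("\"{}\"", json_escape(a)))
            .collect();
        body.push_str(&format!(",\"actions\":[{}]", items.join(",")));
    }
    format!("{{\"error\":{{{}}}}}", body)
}
```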
Field Selection
The --fields flag on issues and mrs list commands controls which fields appear in the JSON response, reducing token usage for AI agent workflows:
# Minimal preset (~60% fewer tokens)
lore -J issues --fields minimal
# Custom field list
lore -J issues --fields iid,title,state,labels,updated_at_iso
# Available presets
# minimal: iid, title, state, updated_at_iso
Valid fields for issues: iid, title, state, author_username, labels, assignees, discussion_count, unresolved_count, created_at_iso, updated_at_iso, web_url, project_path
Valid fields for MRs: iid, title, state, author_username, labels, draft, target_branch, source_branch, discussion_count, unresolved_count, created_at_iso, updated_at_iso, web_url, project_path, reviewers
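Field selection amounts to expanding the --fields argument into a list of names and then projecting each record onto that list. The sketch below illustrates the idea with string-valued records; the function names and the record representation are hypothetical, and it does not validate names against the lists above.

```rust
use std::collections::BTreeMap;

/// Expand a `--fields` argument into field names. Only the `minimal` preset
/// is documented; anything else is treated as a comma-separated custom list.
fn expand_fields(arg: &str) -> Vec<String> {
    if arg == "minimal" {
        return ["iid", "title", "state", "updated_at_iso"]
            .iter()
            .map(|s| s.to_string())
            .collect();
    }
    arg.split(',')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

/// Keep only the requested fields of one record (values are plain strings
/// here for simplicity).
fn project_fields(
    record: &BTreeMap<String, String>,
    fields: &[String],
) -> BTreeMap<String, String> {
    record
        .iter()
        .filter(|(k, _)| fields.contains(*k))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}
```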
Agent Self-Discovery
The robot-docs command provides a complete machine-readable manifest including response schemas for every command:
lore robot-docs | jq '.data.commands.issues.response_schema'
Each command entry includes response_schema describing the shape of its JSON response, fields_presets for commands supporting --fields, and copy-paste example invocations.
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Internal error / health check failed / not implemented |
| 2 | Usage error (invalid flags or arguments) |
| 3 | Config invalid |
| 4 | Token not set |
| 5 | GitLab auth failed |
| 6 | Resource not found |
| 7 | Rate limited |
| 8 | Network error |
| 9 | Database locked |
| 10 | Database error |
| 11 | Migration failed |
| 12 | I/O error |
| 13 | Transform error |
| 14 | Ollama unavailable |
| 15 | Ollama model not found |
| 16 | Embedding failed |
| 17 | Not found (entity does not exist) |
| 18 | Ambiguous match (use -p to specify project) |
| 19 | Health check failed |
| 20 | Config not found |
Configuration Precedence
Settings are resolved in this order (highest to lowest priority):
- CLI flags (--robot, --config, --color)
- Environment variables (LORE_ROBOT, GITLAB_TOKEN, LORE_CONFIG_PATH)
- Config file (~/.config/lore/config.json)
- Built-in defaults
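The precedence order maps naturally onto chained Option fallbacks. The helper below is a hypothetical illustration of that idea, not code from the project.

```rust
/// Pick the first value that is set, in priority order: CLI flag, environment
/// variable, config file, then the built-in default.
fn resolve_setting<T>(cli: Option<T>, env: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(env).or(file).unwrap_or(default)
}
```

For example, a value given on the command line wins even when the environment and the config file both supply one.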
Global Options
lore -c /path/to/config.json <command> # Use alternate config
lore --robot <command> # Machine-readable JSON
lore -J <command> # JSON shorthand
lore --color never <command> # Disable color output
lore --color always <command> # Force color output
lore -q <command> # Suppress non-essential output
lore -v <command> # Debug logging
lore -vv <command> # More verbose debug logging
lore -vvv <command> # Trace-level logging
lore --log-format json <command> # JSON-formatted log output to stderr
Color output respects NO_COLOR and CLICOLOR environment variables in auto mode (the default).
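One way to read the color rules is sketched below. The enum and function names are hypothetical, and falling back to a TTY check in auto mode is an assumption; only the handling of --color, NO_COLOR, and CLICOLOR follows the text above.

```rust
#[derive(Clone, Copy)]
enum ColorMode {
    Auto,
    Always,
    Never,
}

/// Decide whether to emit color. The environment is consulted only in auto mode.
fn use_color(
    mode: ColorMode,
    no_color: Option<&str>,
    clicolor: Option<&str>,
    stdout_is_tty: bool,
) -> bool {
    match mode {
        ColorMode::Always => true,
        ColorMode::Never => false,
        ColorMode::Auto => {
            // NO_COLOR disables color when set to any value.
            if no_color.is_some() {
                return false;
            }
            // CLICOLOR=0 disables color.
            if clicolor == Some("0") {
                return false;
            }
            // Assumption: otherwise color only when writing to a terminal.
            stdout_is_tty
        }
    }
}
```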
Shell Completions
Generate shell completions for tab-completion support:
# Bash (add to ~/.bashrc)
lore completions bash > ~/.local/share/bash-completion/completions/lore
# Zsh (add to ~/.zshrc: fpath=(~/.zfunc $fpath))
lore completions zsh > ~/.zfunc/_lore
# Fish
lore completions fish > ~/.config/fish/completions/lore.fish
# PowerShell (add to $PROFILE)
lore completions powershell >> $PROFILE
Database Schema
Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
| Table | Purpose |
|---|---|
| projects | Tracked GitLab projects with metadata |
| issues | Issue metadata (title, state, author, due date, milestone) |
| merge_requests | MR metadata (title, state, draft, branches, merge status, commit SHAs) |
| milestones | Project milestones with state and due dates |
| labels | Project labels with colors |
| issue_labels | Many-to-many issue-label relationships |
| issue_assignees | Many-to-many issue-assignee relationships |
| mr_labels | Many-to-many MR-label relationships |
| mr_assignees | Many-to-many MR-assignee relationships |
| mr_reviewers | Many-to-many MR-reviewer relationships |
| mr_file_changes | Files touched by each MR (path, change type, renames) |
| discussions | Issue/MR discussion threads |
| notes | Individual notes within discussions (with system note flag and DiffNote position data) |
| resource_state_events | Issue/MR state change history (opened, closed, merged, reopened) |
| resource_label_events | Label add/remove events with actor and timestamp |
| resource_milestone_events | Milestone add/remove events with actor and timestamp |
| entity_references | Cross-references between entities (MR closes issue, mentioned in, etc.) |
| documents | Extracted searchable text for FTS and embedding |
| documents_fts | FTS5 full-text search index |
| embeddings | Vector embeddings for semantic search |
| dirty_sources | Entities needing document regeneration after ingest |
| pending_discussion_fetches | Queue for discussion fetch operations |
| sync_runs | Audit trail of sync operations |
| sync_cursors | Cursor positions for incremental sync |
| app_locks | Crash-safe single-flight lock |
| raw_payloads | Compressed original API responses |
| schema_version | Migration version tracking |
The database is stored at ~/.local/share/lore/lore.db by default (XDG compliant).
Timeline Pipeline
The timeline pipeline reconstructs chronological event histories for GitLab entities by combining full-text search, cross-reference graph traversal, and resource event aggregation. Given a search query, it identifies relevant issues and MRs, discovers related entities through their reference graph, and assembles a unified, time-ordered event stream.
Stages
The pipeline executes in five stages:
- SEED -- Full-text search identifies the most relevant issues and MRs matching the query. Documents (issue bodies, MR descriptions, discussion notes) are ranked by BM25 relevance.
- HYDRATE -- Evidence notes are extracted from the seed results: the top FTS-matched discussion notes with 200-character snippets that explain why each entity was surfaced.
- EXPAND -- Breadth-first traversal over the entity_references graph discovers related entities. Starting from seed entities, the pipeline follows "closes", "related", and optionally "mentioned" references up to a configurable depth, tracking provenance (which entity referenced which, via what method).
- COLLECT -- Events are gathered for all discovered entities (seeds + expanded). Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, and evidence notes. Events are sorted chronologically with stable tiebreaking (timestamp, then entity ID, then event type).
- RENDER -- Events are formatted for output as human-readable text or structured JSON.
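The EXPAND stage can be sketched as a depth-limited breadth-first search that records provenance for each discovered entity. Everything named here is hypothetical: entities are plain integer IDs, only outgoing references are followed (the real traversal may also follow incoming ones), and the struct layout is for illustration.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// One edge of the reference graph: `from` references `to` via `kind`
/// ("closes", "related", or "mentioned").
struct Reference {
    from: u32,
    to: u32,
    kind: &'static str,
}

/// A discovered entity plus its provenance: which entity led to it, by what
/// kind of reference, and at what depth.
#[derive(Debug, PartialEq)]
struct Expanded {
    entity: u32,
    via: u32,
    kind: &'static str,
    depth: usize,
}

fn expand(
    seeds: &[u32],
    refs: &[Reference],
    max_depth: usize,
    include_mentioned: bool,
) -> Vec<Expanded> {
    // Index outgoing edges, skipping "mentioned" unless requested.
    let mut adjacency: HashMap<u32, Vec<&Reference>> = HashMap::new();
    for r in refs {
        if r.kind == "mentioned" && !include_mentioned {
            continue;
        }
        adjacency.entry(r.from).or_default().push(r);
    }
    let mut seen: HashSet<u32> = seeds.iter().copied().collect();
    let mut queue: VecDeque<(u32, usize)> = seeds.iter().map(|s| (*s, 0)).collect();
    let mut found = Vec::new();
    while let Some((entity, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // depth limit reached, do not expand further
        }
        for r in adjacency.get(&entity).into_iter().flatten() {
            // `insert` returns false for entities that were already visited.
            if seen.insert(r.to) {
                found.push(Expanded { entity: r.to, via: entity, kind: r.kind, depth: depth + 1 });
                queue.push_back((r.to, depth + 1));
            }
        }
    }
    found
}
```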
Event Types
| Event | Description |
|---|---|
| Created | Entity creation |
| StateChanged | State transitions (opened, closed, reopened) |
| LabelAdded | Label applied to entity |
| LabelRemoved | Label removed from entity |
| MilestoneSet | Milestone assigned |
| MilestoneRemoved | Milestone removed |
| Merged | MR merged (deduplicated against state events) |
| NoteEvidence | Discussion note matched by FTS, with snippet |
| CrossReferenced | Reference to another entity |
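The COLLECT stage's ordering (timestamp, then entity ID, then event type) can be expressed as a chained comparison. The sketch below is illustrative: the struct fields are hypothetical, and ordering event types by their declaration order in the enum is an assumption.

```rust
/// Event kinds from the table above; the derived `Ord` follows declaration order.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum EventType {
    Created,
    StateChanged,
    LabelAdded,
    LabelRemoved,
    MilestoneSet,
    MilestoneRemoved,
    Merged,
    NoteEvidence,
    CrossReferenced,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Event {
    timestamp_ms: i64,
    entity_id: u64,
    event_type: EventType,
}

/// Sort chronologically, breaking ties by entity ID and then by event type so
/// the output order is deterministic.
fn sort_events(events: &mut [Event]) {
    events.sort_by(|a, b| {
        a.timestamp_ms
            .cmp(&b.timestamp_ms)
            .then(a.entity_id.cmp(&b.entity_id))
            .then(a.event_type.cmp(&b.event_type))
    });
}
```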
Unresolved References
When the graph expansion encounters cross-project references to entities not yet synced locally, these are collected as unresolved references in the pipeline output. This enables discovery of external dependencies and can inform future sync targets.
Development
# Run tests
cargo test
# Run with debug logging
RUST_LOG=lore=debug lore issues
# Run with trace logging
RUST_LOG=lore=trace lore ingest issues
# Check formatting
cargo fmt --check
# Lint
cargo clippy
Tech Stack
- Rust (2024 edition)
- SQLite via rusqlite (bundled) with FTS5 and sqlite-vec
- Ollama for vector embeddings (nomic-embed-text)
- clap for CLI parsing
- reqwest for HTTP
- tokio for async runtime
- serde for serialization
- tracing for logging
- indicatif for progress bars
License
MIT