# Gitlore

Local GitLab data management with semantic search. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, and hybrid search.

## Features

- **Local-first**: All data stored in SQLite for instant queries
- **Incremental sync**: Cursor-based sync only fetches changes since last sync
- **Full re-sync**: Reset cursors and fetch all data from scratch when needed
- **Multi-project**: Track issues and MRs across multiple GitLab projects
- **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches
- **Hybrid search**: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
- **Raw payload storage**: Preserves original GitLab API responses for debugging
- **Discussion threading**: Full support for issue and MR discussions including inline code review comments
- **Cross-reference tracking**: Automatic extraction of "closes" and "mentioned" relationships between MRs and issues
- **Resource event history**: Tracks state changes, label events, and milestone events for issues and MRs
- **Robot mode**: Machine-readable JSON output with structured errors and meaningful exit codes
- **Observability**: Verbosity controls, JSON log format, structured metrics, and stage timing

## Installation

```bash
cargo install --path .
```

Or build a release binary without installing it:

```bash
cargo build --release
./target/release/lore --help
```

## Quick Start

```bash
# Initialize configuration (interactive)
lore init

# Verify authentication
lore auth

# Sync everything from GitLab (issues + MRs + docs + embeddings)
lore sync

# List recent issues
lore issues -n 10

# List open merge requests
lore mrs -s opened

# Show issue details
lore issues 123

# Show MR details with discussions
lore mrs 456

# Search across all indexed data
lore search "authentication bug"

# Robot mode (machine-readable JSON)
lore -J issues -n 5 | jq .
```

## Configuration

Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lore/config.json`).

### Example Configuration

```json
{
  "gitlab": {
    "baseUrl": "https://gitlab.com",
    "tokenEnvVar": "GITLAB_TOKEN"
  },
  "projects": [
    { "path": "group/project" },
    { "path": "other-group/other-project" }
  ],
  "sync": {
    "backfillDays": 14,
    "staleLockMinutes": 10,
    "heartbeatIntervalSeconds": 30,
    "cursorRewindSeconds": 2,
    "primaryConcurrency": 4,
    "dependentConcurrency": 2
  },
  "storage": {
    "compressRawPayloads": true
  },
  "embedding": {
    "provider": "ollama",
    "model": "nomic-embed-text",
    "baseUrl": "http://localhost:11434",
    "concurrency": 4
  }
}
```

### Configuration Options

| Section | Field | Default | Description |
|---------|-------|---------|-------------|
| `gitlab` | `baseUrl` | -- | GitLab instance URL (required) |
| `gitlab` | `tokenEnvVar` | `GITLAB_TOKEN` | Environment variable containing the API token |
| `projects` | `path` | -- | Project path (e.g., `group/project`) |
| `sync` | `backfillDays` | `14` | Days to backfill on initial sync |
| `sync` | `staleLockMinutes` | `10` | Minutes before a sync lock is considered stale |
| `sync` | `heartbeatIntervalSeconds` | `30` | Interval between lock heartbeat updates |
| `sync` | `cursorRewindSeconds` | `2` | Seconds to rewind cursor for overlap safety |
| `sync` | `primaryConcurrency` | `4` | Concurrent GitLab requests for primary resources |
| `sync` | `dependentConcurrency` | `2` | Concurrent requests for dependent resources |
| `storage` | `dbPath` | `~/.local/share/lore/lore.db` | Database file path |
| `storage` | `backupDir` | `~/.local/share/lore/backups` | Backup directory |
| `storage` | `compressRawPayloads` | `true` | Compress stored API responses with gzip |
| `embedding` | `provider` | `ollama` | Embedding provider |
| `embedding` | `model` | `nomic-embed-text` | Model name for embeddings |
| `embedding` | `baseUrl` | `http://localhost:11434` | Ollama server URL |
| `embedding` | `concurrency` | `4` | Concurrent embedding requests |

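To illustrate how the defaults in the table combine with a partial config file, here is a small Python sketch that overlays user-supplied values on the documented defaults. `DEFAULTS` and `apply_defaults` are hypothetical names for this example; this is not lore's own loader, and the section-by-section merge is an assumption.

```python
import copy
import json

# Defaults copied from the table above; "~" paths are left unexpanded.
DEFAULTS = {
    "gitlab": {"tokenEnvVar": "GITLAB_TOKEN"},
    "sync": {
        "backfillDays": 14,
        "staleLockMinutes": 10,
        "heartbeatIntervalSeconds": 30,
        "cursorRewindSeconds": 2,
        "primaryConcurrency": 4,
        "dependentConcurrency": 2,
    },
    "storage": {
        "dbPath": "~/.local/share/lore/lore.db",
        "backupDir": "~/.local/share/lore/backups",
        "compressRawPayloads": True,
    },
    "embedding": {
        "provider": "ollama",
        "model": "nomic-embed-text",
        "baseUrl": "http://localhost:11434",
        "concurrency": 4,
    },
}


def apply_defaults(user_config: dict) -> dict:
    """Overlay user values on DEFAULTS, one section at a time (hypothetical helper)."""
    merged = copy.deepcopy(DEFAULTS)
    for section, value in user_config.items():
        if isinstance(value, dict):
            merged.setdefault(section, {}).update(value)
        else:
            merged[section] = value  # e.g. the "projects" list
    if "baseUrl" not in merged["gitlab"]:
        raise ValueError("gitlab.baseUrl is required")
    return merged


minimal = json.loads("""
{
  "gitlab": {"baseUrl": "https://gitlab.com"},
  "projects": [{"path": "group/project"}],
  "sync": {"backfillDays": 30}
}
""")
config = apply_defaults(minimal)
print(config["sync"]["backfillDays"], config["sync"]["primaryConcurrency"])  # 30 4
```

Only `gitlab.baseUrl` and at least one project path have no default, so a minimal config can omit every other field.
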
### Config File Resolution

The config file is resolved in this order:
1. `--config` / `-c` CLI flag
2. `LORE_CONFIG_PATH` environment variable
3. `~/.config/lore/config.json` (XDG default)
4. `./lore.config.json` (local fallback for development)

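The resolution order above can be sketched in Python as follows. This is an illustration, not lore's code: `resolve_config_path` is a hypothetical function, and it assumes that explicit paths (steps 1 and 2) are returned without checking that they exist, while the two default locations are used only if the file is present.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_config_path(
    cli_flag: Optional[str],
    env: Mapping[str, str],
    cwd: Path,
    home: Path,
) -> Optional[Path]:
    """Return the first config location that applies, or None if none does."""
    # 1. --config / -c CLI flag
    if cli_flag:
        return Path(cli_flag)
    # 2. LORE_CONFIG_PATH environment variable
    if env.get("LORE_CONFIG_PATH"):
        return Path(env["LORE_CONFIG_PATH"])
    # 3. XDG default: $XDG_CONFIG_HOME/lore/config.json, else ~/.config/lore/config.json
    xdg_root = Path(env["XDG_CONFIG_HOME"]) if env.get("XDG_CONFIG_HOME") else home / ".config"
    xdg_path = xdg_root / "lore" / "config.json"
    if xdg_path.is_file():
        return xdg_path
    # 4. Local fallback for development
    local_path = cwd / "lore.config.json"
    if local_path.is_file():
        return local_path
    return None


print(resolve_config_path("/tmp/custom.json", {}, Path.cwd(), Path.home()))
```

Passing `env`, `cwd`, and `home` as arguments instead of reading them from the process keeps the function easy to test.
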
### GitLab Token

Create a personal access token with `read_api` scope:

1. Go to GitLab > Settings > Access Tokens
2. Create token with `read_api` scope
3. Export it: `export GITLAB_TOKEN=glpat-xxxxxxxxxxxx`

## Environment Variables

| Variable | Purpose | Required |
|----------|---------|----------|
| `GITLAB_TOKEN` | GitLab API authentication token (name configurable via `gitlab.tokenEnvVar`) | Yes |
| `LORE_CONFIG_PATH` | Override config file location | No |
| `LORE_ROBOT` | Enable robot mode globally (set to `true` or `1`) | No |
| `XDG_CONFIG_HOME` | XDG Base Directory for config (fallback: `~/.config`) | No |
| `XDG_DATA_HOME` | XDG Base Directory for data (fallback: `~/.local/share`) | No |
| `NO_COLOR` | Disable color output when set (any value) | No |
| `CLICOLOR` | Standard color control (0 to disable) | No |
| `RUST_LOG` | Logging level filter (e.g., `lore=debug`) | No |

## Commands

### `lore issues`

Query issues from the local database, or show a specific issue.

```bash
lore issues                           # Recent issues (default 50)
lore issues 123                       # Show issue #123 with discussions
lore issues 123 -p group/repo         # Disambiguate by project
lore issues -n 100                    # More results
lore issues -s opened                 # Only open issues
lore issues -s closed                 # Only closed issues
lore issues -a username               # By author (@ prefix optional)
lore issues -A username               # By assignee (@ prefix optional)
lore issues -l bug                    # By label (AND logic)
lore issues -l bug -l urgent          # Multiple labels
lore issues -m "v1.0"                 # By milestone title
lore issues --since 7d                # Updated in last 7 days
lore issues --since 2w                # Updated in last 2 weeks
lore issues --since 1m                # Updated in last month
lore issues --since 2024-01-01        # Updated since date
lore issues --due-before 2024-12-31   # Due before date
lore issues --has-due                 # Only issues with due dates
lore issues -p group/repo             # Filter by project
lore issues --sort created --asc      # Sort by created date, ascending
lore issues -o                        # Open first result in browser
```

When listing, output includes: IID, title, state, author, assignee, labels, and update time.

When showing a single issue (e.g., `lore issues 123`), output includes: title, description, state, author, assignees, labels, milestone, due date, web URL, and threaded discussions.

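The relative `--since` values shown above (`7d`, `2w`, `1m`) and the absolute `YYYY-MM-DD` form can be thought of as resolving to a cutoff timestamp. A minimal Python sketch follows, assuming a month means 30 days and that dates are interpreted as UTC midnight. Neither assumption is confirmed by this README, and `parse_since` is a hypothetical helper, not lore's parser.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: "m" is treated as a fixed 30-day month.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30}


def parse_since(value: str, now: datetime) -> datetime:
    """Turn '7d', '2w', '1m', or 'YYYY-MM-DD' into a cutoff datetime."""
    match = re.fullmatch(r"(\d+)([dwm])", value)
    if match:
        count, unit = int(match.group(1)), match.group(2)
        return now - timedelta(days=count * _UNIT_DAYS[unit])
    # Otherwise expect an absolute date; strptime raises ValueError if malformed.
    return datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)


now = datetime(2024, 6, 15, tzinfo=timezone.utc)
print(parse_since("7d", now).date())          # 2024-06-08
print(parse_since("2024-01-01", now).date())  # 2024-01-01
```
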
#### Project Resolution

The `-p` / `--project` flag uses cascading match logic across all commands:

1. **Exact match**: `group/project`
2. **Case-insensitive**: `Group/Project`
3. **Suffix match**: `project` matches `group/project` (if unambiguous)
4. **Substring match**: `typescript` matches `vs/typescript-code` (if unambiguous)

If multiple projects match, an error lists the candidates with a hint to use the full path.

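The cascade above can be sketched as a sequence of progressively looser predicates, stopping at the first stage that yields a match. This Python example is an illustration under assumptions: `resolve_project` and `AmbiguousProject` are hypothetical names, suffix and substring matching are treated as case-insensitive, and an ambiguous stage raises immediately instead of falling through to the next stage.

```python
from typing import Callable, List


class AmbiguousProject(Exception):
    """Raised when a query matches more than one tracked project."""


def resolve_project(query: str, known: List[str]) -> str:
    q = query.lower()
    stages: List[Callable[[str], bool]] = [
        lambda p: p == query,                   # 1. exact match
        lambda p: p.lower() == q,               # 2. case-insensitive match
        lambda p: p.lower().endswith("/" + q),  # 3. suffix match on the last path segment(s)
        lambda p: q in p.lower(),               # 4. substring match
    ]
    for stage in stages:
        hits = [p for p in known if stage(p)]
        if len(hits) == 1:
            return hits[0]
        if len(hits) > 1:
            raise AmbiguousProject(
                f"'{query}' matches {', '.join(hits)}; use the full path"
            )
    raise LookupError(f"no project matches '{query}'")


projects = ["group/project", "vs/typescript-code", "other-group/other-project"]
print(resolve_project("project", projects))        # group/project
print(resolve_project("Group/Project", projects))  # group/project
print(resolve_project("typescript", projects))     # vs/typescript-code
```

Note that `project` resolves to `group/project` at the suffix stage because `other-group/other-project` does not end in `/project`; a query such as `group` would be ambiguous at the substring stage.
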
### `lore mrs`

Query merge requests from the local database, or show a specific MR.

```bash
lore mrs                        # Recent MRs (default 50)
lore mrs 456                    # Show MR !456 with discussions
lore mrs 456 -p group/repo      # Disambiguate by project
lore mrs -n 100                 # More results
lore mrs -s opened              # Only open MRs
lore mrs -s merged              # Only merged MRs
lore mrs -s closed              # Only closed MRs
lore mrs -s locked              # Only locked MRs
lore mrs -s all                 # All states
lore mrs -a username            # By author (@ prefix optional)
lore mrs -A username            # By assignee (@ prefix optional)
lore mrs -r username            # By reviewer (@ prefix optional)
lore mrs -d                     # Only draft/WIP MRs
lore mrs -D                     # Exclude draft MRs
lore mrs --target main          # By target branch
lore mrs --source feature/foo   # By source branch
lore mrs -l needs-review        # By label (AND logic)
lore mrs --since 7d             # Updated in last 7 days
lore mrs -p group/repo          # Filter by project
lore mrs --sort created --asc   # Sort by created date, ascending
lore mrs -o                     # Open first result in browser
```

When listing, output includes: IID, title (with [DRAFT] prefix if applicable), state, author, assignee, labels, and update time.

When showing a single MR (e.g., `lore mrs 456`), output includes: title, description, state, draft status, author, assignees, reviewers, labels, source/target branches, merge status, web URL, and threaded discussions. Inline code review comments (DiffNotes) display file context in the format `[src/file.ts:45]`.

### `lore search`

Search across indexed documents using hybrid (lexical + semantic), lexical-only, or semantic-only modes.

```bash
lore search "authentication bug"           # Hybrid search (default)
lore search "login flow" --mode lexical    # FTS5 lexical only
lore search "login flow" --mode semantic   # Vector similarity only
lore search "auth" --type issue            # Filter by source type
lore search "auth" --type mr               # MR documents only
lore search "auth" --type discussion       # Discussion documents only
lore search "deploy" --author username     # Filter by author
lore search "deploy" -p group/repo         # Filter by project
lore search "deploy" --label backend       # Filter by label (AND logic)
lore search "deploy" --path src/           # Filter by file path (trailing / for prefix)
lore search "deploy" --after 7d            # Created after (7d, 2w, 1m, or YYYY-MM-DD)
lore search "deploy" --updated-after 2w    # Updated after
lore search "deploy" -n 50                 # Limit results (default 20, max 100)
lore search "deploy" --explain             # Show ranking explanation per result
lore search "deploy" --fts-mode raw        # Raw FTS5 query syntax (advanced)
```

Requires `lore generate-docs` (or `lore sync`) to have been run at least once. Semantic and hybrid modes require `lore embed` (or `lore sync`) to have generated vector embeddings via Ollama.

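Hybrid mode merges the lexical and semantic result lists with Reciprocal Rank Fusion. As a reference for how that technique works in general, the Python sketch below scores each document as the sum of `1 / (k + rank)` over every list it appears in. The constant `k = 60` is the value commonly used in the literature; the constant lore uses is not stated here, and the document IDs are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


lexical = ["issue-12", "mr-7", "issue-3"]    # hypothetical FTS5 ranking
semantic = ["mr-7", "issue-40", "issue-12"]  # hypothetical vector ranking
print(reciprocal_rank_fusion([lexical, semantic]))
# ['mr-7', 'issue-12', 'issue-40', 'issue-3']
```

Because only ranks are used, the lexical and vector scores never need to be normalized against each other, which is the main reason RRF is a popular choice for combining heterogeneous retrievers.
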
### `lore sync`

Run the full sync pipeline: ingest from GitLab, generate searchable documents, and compute embeddings.

```bash
lore sync               # Full pipeline
lore sync --full        # Reset cursors, fetch everything
lore sync --force       # Override stale lock
lore sync --no-embed    # Skip embedding step
lore sync --no-docs     # Skip document regeneration
lore sync --no-events   # Skip resource event fetching
```

The sync command displays animated progress bars for each stage and outputs timing metrics on completion. In robot mode (`-J`), detailed stage timing is included in the JSON response.

### `lore ingest`

Sync data from GitLab to the local database. Runs only the ingestion step (no doc generation or embeddings).

```bash
lore ingest                        # Ingest everything (issues + MRs)
lore ingest issues                 # Issues only
lore ingest mrs                    # MRs only
lore ingest issues -p group/repo   # Single project
lore ingest --force                # Override stale lock
lore ingest --full                 # Full re-sync (reset cursors)
```

The `--full` flag resets sync cursors and discussion watermarks, then fetches all data from scratch. Useful when:
- Assignee data or other fields were missing from earlier syncs
- You want to ensure complete data after schema changes
- Troubleshooting sync issues

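To make the relationship between cursors, `--full`, `backfillDays`, and `cursorRewindSeconds` concrete, here is a hedged Python sketch of how a lower time bound for the next fetch could be chosen. It is a reading of the option descriptions in this README, not lore's implementation: `fetch_lower_bound` is a hypothetical function, and it assumes that `--full` removes the bound entirely while a missing cursor falls back to the backfill window.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def fetch_lower_bound(
    cursor: Optional[datetime],
    now: datetime,
    *,
    full: bool = False,
    backfill_days: int = 14,
    rewind_seconds: int = 2,
) -> Optional[datetime]:
    """Return the 'updated after' bound for the next fetch, or None for no bound."""
    if full:
        return None  # assumption: --full ignores cursors and fetches everything
    if cursor is None:
        return now - timedelta(days=backfill_days)  # first sync: backfill window
    # Rewind slightly so items updated around the cursor time are not missed.
    return cursor - timedelta(seconds=rewind_seconds)


now = datetime(2024, 6, 15, 12, 0, 0, tzinfo=timezone.utc)
last_cursor = datetime(2024, 6, 15, 11, 0, 0, tzinfo=timezone.utc)
print(fetch_lower_bound(last_cursor, now))  # 2024-06-15 10:59:58+00:00
```

Rewinding means a few items are fetched twice, so this scheme only works if writes to the local database are idempotent (upserts keyed on the GitLab ID).
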
### `lore generate-docs`

Extract searchable documents from ingested issues, MRs, and discussions for the FTS5 index.

```bash
lore generate-docs                 # Incremental (dirty items only)
lore generate-docs --full          # Full rebuild
lore generate-docs -p group/repo   # Single project
```

### `lore embed`

Generate vector embeddings for documents via Ollama. Requires Ollama running with the configured embedding model.

```bash
lore embed                  # Embed new/changed documents
lore embed --retry-failed   # Retry previously failed embeddings
```

### `lore count`

Count entities in the local database.

```bash
lore count issues                    # Total issues
lore count mrs                       # Total MRs (with state breakdown)
lore count discussions               # Total discussions
lore count discussions --for issue   # Issue discussions only
lore count discussions --for mr      # MR discussions only
lore count notes                     # Total notes (system vs user breakdown)
lore count notes --for issue         # Issue notes only
```

### `lore stats`

Show document and index statistics, with optional integrity checks.

```bash
lore stats                    # Document and index statistics
lore stats --check            # Run integrity checks
lore stats --check --repair   # Repair integrity issues
```

### `lore status`

Show current sync state and watermarks.

```bash
lore status
```

Displays:
- Last sync run details (status, timing)
- Cursor positions per project and resource type (issues and MRs)
- Data summary counts

### `lore init`

Initialize configuration and database interactively.

```bash
lore init                     # Interactive setup
lore init --force             # Overwrite existing config
lore init --non-interactive   # Fail if prompts needed
```

### `lore auth`

Verify GitLab authentication is working.

```bash
lore auth
# Authenticated as @username (Full Name)
# GitLab: https://gitlab.com
```

### `lore doctor`

Check environment health and configuration.

```bash
lore doctor
```

Checks performed:
- Config file existence and validity
- Database existence and pragmas (WAL mode, foreign keys)
- GitLab authentication
- Project accessibility
- Ollama connectivity (optional)

### `lore migrate`

Run pending database migrations.

```bash
lore migrate
```

### `lore health`

Quick pre-flight check for config, database, and schema version. Exits 0 if healthy, 1 if unhealthy.

```bash
lore health
```

Useful as a fast gate before running queries or syncs. For a more thorough check including authentication and project access, use `lore doctor`.

### `lore robot-docs`

Machine-readable command manifest for agent self-discovery. Returns a JSON schema of all commands, flags, exit codes, and example workflows.

```bash
lore robot-docs           # Pretty-printed JSON
lore --robot robot-docs   # Compact JSON for parsing
```

### `lore version`

Show version information including the git commit hash.

```bash
lore version
# lore version 0.1.0 (abc1234)
```

## Robot Mode

Machine-readable JSON output for scripting and AI agent consumption.

### Activation

```bash
# Global flag
lore --robot issues -n 5

# JSON shorthand (-J)
lore -J issues -n 5

# Environment variable
LORE_ROBOT=1 lore issues -n 5

# Auto-detection (when stdout is not a TTY)
lore issues -n 5 | jq .
```

### Response Format

All commands return consistent JSON:

```json
{"ok": true, "data": {...}, "meta": {...}}
```

Errors return structured JSON to stderr:

```json
{"error": {"code": "CONFIG_NOT_FOUND", "message": "...", "suggestion": "Run 'lore init'"}}
```

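A script consuming robot mode needs to handle both shapes: the success envelope on stdout and the error object on stderr. The Python sketch below shows one way to do that. `parse_robot_output` and `LoreError` are hypothetical names, the sample `data` payload is invented for the example, and the code assumes stderr contains nothing but the JSON error object (which may not hold if logging is also enabled).

```python
import json
from typing import Any, Optional


class LoreError(Exception):
    """Structured error decoded from robot-mode stderr (hypothetical wrapper)."""

    def __init__(self, code: str, message: str, suggestion: Optional[str] = None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.suggestion = suggestion


def parse_robot_output(stdout: str, stderr: str) -> Any:
    """Return the `data` field on success, raise LoreError otherwise."""
    if stdout.strip():
        envelope = json.loads(stdout)
        if envelope.get("ok"):
            return envelope.get("data")
    if stderr.strip():
        err = json.loads(stderr).get("error", {})
        raise LoreError(
            err.get("code", "UNKNOWN"),
            err.get("message", ""),
            err.get("suggestion"),
        )
    raise LoreError("UNKNOWN", "no parseable output")


# Made-up payloads in the documented envelope shapes.
ok_stdout = '{"ok": true, "data": {"count": 2}, "meta": {}}'
err_stderr = '{"error": {"code": "CONFIG_NOT_FOUND", "message": "no config", "suggestion": "Run \'lore init\'"}}'

print(parse_robot_output(ok_stdout, ""))
try:
    parse_robot_output("", err_stderr)
except LoreError as exc:
    print(exc.code, "->", exc.suggestion)
```

In a real script, `stdout` and `stderr` would come from something like `subprocess.run(["lore", "-J", "issues"], capture_output=True, text=True)`.
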
### Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Internal error / health check failed / not implemented |
| 2 | Usage error (invalid flags or arguments) |
| 3 | Config invalid |
| 4 | Token not set |
| 5 | GitLab auth failed |
| 6 | Resource not found |
| 7 | Rate limited |
| 8 | Network error |
| 9 | Database locked |
| 10 | Database error |
| 11 | Migration failed |
| 12 | I/O error |
| 13 | Transform error |
| 14 | Ollama unavailable |
| 15 | Ollama model not found |
| 16 | Embedding failed |
| 17 | Not found (entity does not exist) |
| 18 | Ambiguous match (use `-p` to specify project) |
| 20 | Config not found |

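Callers can branch on these codes without parsing any output. The Python sketch below mirrors the table as an enum and classifies a return code as success, worth retrying, or a hard failure. The enum member names and the choice of which codes count as retryable (rate limiting, network errors, a locked database) are assumptions made for this example, not something lore defines.

```python
from enum import IntEnum


class LoreExit(IntEnum):
    """Exit codes from the table above; member names are illustrative."""

    SUCCESS = 0
    INTERNAL_ERROR = 1
    USAGE_ERROR = 2
    CONFIG_INVALID = 3
    TOKEN_NOT_SET = 4
    GITLAB_AUTH_FAILED = 5
    RESOURCE_NOT_FOUND = 6
    RATE_LIMITED = 7
    NETWORK_ERROR = 8
    DATABASE_LOCKED = 9
    DATABASE_ERROR = 10
    MIGRATION_FAILED = 11
    IO_ERROR = 12
    TRANSFORM_ERROR = 13
    OLLAMA_UNAVAILABLE = 14
    OLLAMA_MODEL_NOT_FOUND = 15
    EMBEDDING_FAILED = 16
    NOT_FOUND = 17
    AMBIGUOUS_MATCH = 18
    CONFIG_NOT_FOUND = 20


# Assumption: these conditions are transient and may succeed on a later attempt.
RETRYABLE = {LoreExit.RATE_LIMITED, LoreExit.NETWORK_ERROR, LoreExit.DATABASE_LOCKED}


def classify(returncode: int) -> str:
    """Map a process return code to 'ok', 'retry', 'fail', or 'unknown'."""
    try:
        code = LoreExit(returncode)
    except ValueError:
        return "unknown"
    if code is LoreExit.SUCCESS:
        return "ok"
    return "retry" if code in RETRYABLE else "fail"


for rc in (0, 7, 20, 19):
    print(rc, classify(rc))
```
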
## Configuration Precedence

Settings are resolved in this order (highest to lowest priority):

1. CLI flags (`--robot`, `--config`, `--color`)
2. Environment variables (`LORE_ROBOT`, `GITLAB_TOKEN`, `LORE_CONFIG_PATH`)
3. Config file (`~/.config/lore/config.json`)
4. Built-in defaults

## Global Options

```bash
lore -c /path/to/config.json <command>   # Use alternate config
lore --robot <command>                   # Machine-readable JSON
lore -J <command>                        # JSON shorthand
lore --color never <command>             # Disable color output
lore --color always <command>            # Force color output
lore -q <command>                        # Suppress non-essential output
lore -v <command>                        # Debug logging
lore -vv <command>                       # More verbose debug logging
lore -vvv <command>                      # Trace-level logging
lore --log-format json <command>         # JSON-formatted log output to stderr
```

Color output respects `NO_COLOR` and `CLICOLOR` environment variables in `auto` mode (the default).

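The interaction between `--color` and the two environment variables can be summarized in a few lines. The following Python sketch is an interpretation of the behavior described in this README, not lore's code: `use_color` is a hypothetical function, and the final fallback to "color only when stdout is a terminal" is an assumption about what `auto` does once neither variable applies.

```python
from typing import Mapping


def use_color(mode: str, env: Mapping[str, str], stdout_is_tty: bool) -> bool:
    """Decide whether to emit ANSI colors for a given --color mode."""
    if mode == "always":
        return True
    if mode == "never":
        return False
    # mode == "auto": environment variables are consulted only here.
    if "NO_COLOR" in env:          # disabled when set, whatever the value
        return False
    if env.get("CLICOLOR") == "0":  # standard opt-out
        return False
    return stdout_is_tty           # assumption: color only on a terminal


print(use_color("auto", {"NO_COLOR": "1"}, stdout_is_tty=True))    # False
print(use_color("always", {"NO_COLOR": "1"}, stdout_is_tty=False)) # True
```

The second call shows why the explicit flag is checked first: `--color always` wins over `NO_COLOR`, consistent with CLI flags having the highest precedence.
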
## Shell Completions

Generate shell completions for tab-completion support:

```bash
# Bash (add to ~/.bashrc)
lore completions bash > ~/.local/share/bash-completion/completions/lore

# Zsh (add to ~/.zshrc: fpath=(~/.zfunc $fpath))
lore completions zsh > ~/.zfunc/_lore

# Fish
lore completions fish > ~/.config/fish/completions/lore.fish

# PowerShell (add to $PROFILE)
lore completions powershell >> $PROFILE
```

## Database Schema

Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:

| Table | Purpose |
|-------|---------|
| `projects` | Tracked GitLab projects with metadata |
| `issues` | Issue metadata (title, state, author, due date, milestone) |
| `merge_requests` | MR metadata (title, state, draft, branches, merge status) |
| `milestones` | Project milestones with state and due dates |
| `labels` | Project labels with colors |
| `issue_labels` | Many-to-many issue-label relationships |
| `issue_assignees` | Many-to-many issue-assignee relationships |
| `mr_labels` | Many-to-many MR-label relationships |
| `mr_assignees` | Many-to-many MR-assignee relationships |
| `mr_reviewers` | Many-to-many MR-reviewer relationships |
| `discussions` | Issue/MR discussion threads |
| `notes` | Individual notes within discussions (with system note flag and DiffNote position data) |
| `resource_state_events` | Issue/MR state change history (opened, closed, merged, reopened) |
| `resource_label_events` | Label add/remove events with actor and timestamp |
| `resource_milestone_events` | Milestone add/remove events with actor and timestamp |
| `entity_references` | Cross-references between entities (MR closes issue, mentioned in, etc.) |
| `documents` | Extracted searchable text for FTS and embedding |
| `documents_fts` | FTS5 full-text search index |
| `embeddings` | Vector embeddings for semantic search |
| `dirty_sources` | Entities needing document regeneration after ingest |
| `pending_discussion_fetches` | Queue for discussion fetch operations |
| `sync_runs` | Audit trail of sync operations |
| `sync_cursors` | Cursor positions for incremental sync |
| `app_locks` | Crash-safe single-flight lock |
| `raw_payloads` | Compressed original API responses |
| `schema_version` | Migration version tracking |

The database is stored at `~/.local/share/lore/lore.db` by default (XDG compliant).

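Since the database is plain SQLite, it can be inspected with any SQLite client. The Python sketch below lists table names via `sqlite_master`. Column layouts are not documented in this README, so the example deliberately avoids querying them and uses an in-memory stand-in database with placeholder columns so that it runs anywhere.

```python
import sqlite3
from typing import List


def list_tables(conn: sqlite3.Connection) -> List[str]:
    """Return the names of all tables in the connected database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


# Stand-in database: table names come from the list above, but the single
# `id` column is a placeholder, not the real schema.
conn = sqlite3.connect(":memory:")
for table in ("issues", "merge_requests", "sync_cursors"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")

print(list_tables(conn))  # ['issues', 'merge_requests', 'sync_cursors']
```

Against a real database, opening it read-only avoids interfering with a running sync, for example `sqlite3.connect("file:/path/to/lore.db?mode=ro", uri=True)` with the path adjusted to your `storage.dbPath`.
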
## Development

```bash
# Run tests
cargo test

# Run with debug logging
RUST_LOG=lore=debug lore issues

# Run with trace logging
RUST_LOG=lore=trace lore ingest issues

# Check formatting
cargo fmt --check

# Lint
cargo clippy
```

## Tech Stack

- **Rust** (2024 edition)
- **SQLite** via rusqlite (bundled) with FTS5 and sqlite-vec
- **Ollama** for vector embeddings (nomic-embed-text)
- **clap** for CLI parsing
- **reqwest** for HTTP
- **tokio** for async runtime
- **serde** for serialization
- **tracing** for logging
- **indicatif** for progress bars

## License

MIT