# Gitlore

Local GitLab data management with semantic search. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, and hybrid search.

## Features

- **Local-first**: All data stored in SQLite for instant queries
- **Incremental sync**: Cursor-based sync only fetches changes since last sync
- **Full re-sync**: Reset cursors and fetch all data from scratch when needed
- **Multi-project**: Track issues and MRs across multiple GitLab projects
- **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches
- **Hybrid search**: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
- **Raw payload storage**: Preserves original GitLab API responses for debugging
- **Discussion threading**: Full support for issue and MR discussions including inline code review comments
- **Robot mode**: Machine-readable JSON output with structured errors and meaningful exit codes

## Installation

```bash
cargo install --path .
```

Or build from source:

```bash
cargo build --release
./target/release/lore --help
```

## Quick Start

```bash
# Initialize configuration (interactive)
lore init

# Verify authentication
lore auth

# Sync everything from GitLab (issues + MRs + docs + embeddings)
lore sync

# List recent issues
lore issues -n 10

# List open merge requests
lore mrs -s opened

# Show issue details
lore issues 123

# Show MR details with discussions
lore mrs 456

# Search across all indexed data
lore search "authentication bug"

# Robot mode (machine-readable JSON)
lore -J issues -n 5 | jq .
```

## Configuration

Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lore/config.json`).

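
As a minimal sketch of how the default location above could be derived, the following Rust function prefers `$XDG_CONFIG_HOME` and falls back to `~/.config`. The function name `default_config_path` and its parameters are illustrative assumptions, not lore's actual internals. Treating an empty `XDG_CONFIG_HOME` as unset follows the XDG Base Directory convention rather than anything stated in this README.

```rust
use std::path::PathBuf;

/// Hypothetical helper: compute the default config file path.
/// `xdg_config_home` is the value of $XDG_CONFIG_HOME (if set);
/// `home` is the user's home directory.
fn default_config_path(xdg_config_home: Option<&str>, home: &str) -> PathBuf {
    let base = match xdg_config_home {
        // Assumption: an empty value counts as unset (XDG convention).
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Fallback documented above: ~/.config
        _ => PathBuf::from(home).join(".config"),
    };
    base.join("lore").join("config.json")
}
```

In a real program the inputs would typically come from `std::env::var("XDG_CONFIG_HOME")` and the `HOME` variable. Passing them as arguments keeps the function easy to test.
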
### Example Configuration

```json
{
  "gitlab": {
    "baseUrl": "https://gitlab.com",
    "tokenEnvVar": "GITLAB_TOKEN"
  },
  "projects": [
    { "path": "group/project" },
    { "path": "other-group/other-project" }
  ],
  "sync": {
    "backfillDays": 14,
    "staleLockMinutes": 10,
    "heartbeatIntervalSeconds": 30,
    "cursorRewindSeconds": 2,
    "primaryConcurrency": 4,
    "dependentConcurrency": 2
  },
  "storage": {
    "compressRawPayloads": true
  },
  "embedding": {
    "provider": "ollama",
    "model": "nomic-embed-text",
    "baseUrl": "http://localhost:11434",
    "concurrency": 4
  }
}
```

### Configuration Options

| Section | Field | Default | Description |
|---------|-------|---------|-------------|
| `gitlab` | `baseUrl` | -- | GitLab instance URL (required) |
| `gitlab` | `tokenEnvVar` | `GITLAB_TOKEN` | Environment variable containing API token |
| `projects` | `path` | -- | Project path (e.g., `group/project`) |
| `sync` | `backfillDays` | `14` | Days to backfill on initial sync |
| `sync` | `staleLockMinutes` | `10` | Minutes before sync lock considered stale |
| `sync` | `heartbeatIntervalSeconds` | `30` | Frequency of lock heartbeat updates |
| `sync` | `cursorRewindSeconds` | `2` | Seconds to rewind cursor for overlap safety |
| `sync` | `primaryConcurrency` | `4` | Concurrent GitLab requests for primary resources |
| `sync` | `dependentConcurrency` | `2` | Concurrent requests for dependent resources |
| `storage` | `dbPath` | `~/.local/share/lore/lore.db` | Database file path |
| `storage` | `backupDir` | `~/.local/share/lore/backups` | Backup directory |
| `storage` | `compressRawPayloads` | `true` | Compress stored API responses with gzip |
| `embedding` | `provider` | `ollama` | Embedding provider |
| `embedding` | `model` | `nomic-embed-text` | Model name for embeddings |
| `embedding` | `baseUrl` | `http://localhost:11434` | Ollama server URL |
| `embedding` | `concurrency` | `4` | Concurrent embedding requests |

### Config File Resolution

The config file is resolved in this order:

1. `--config` / `-c` CLI flag
2. `LORE_CONFIG_PATH` environment variable
3. `~/.config/lore/config.json` (XDG default)
4. `./lore.config.json` (local fallback for development)

### GitLab Token

Create a personal access token with `read_api` scope:

1. Go to GitLab > Settings > Access Tokens
2. Create token with `read_api` scope
3. Export it: `export GITLAB_TOKEN=glpat-xxxxxxxxxxxx`

## Environment Variables

| Variable | Purpose | Required |
|----------|---------|----------|
| `GITLAB_TOKEN` | GitLab API authentication token (name configurable via `gitlab.tokenEnvVar`) | Yes |
| `LORE_CONFIG_PATH` | Override config file location | No |
| `LORE_ROBOT` | Enable robot mode globally (set to `true` or `1`) | No |
| `XDG_CONFIG_HOME` | XDG Base Directory for config (fallback: `~/.config`) | No |
| `XDG_DATA_HOME` | XDG Base Directory for data (fallback: `~/.local/share`) | No |
| `NO_COLOR` | Disable color output when set (any value) | No |
| `CLICOLOR` | Standard color control (0 to disable) | No |
| `RUST_LOG` | Logging level filter (e.g., `lore=debug`) | No |

## Commands

### `lore issues`

Query issues from local database, or show a specific issue.

```bash
lore issues                           # Recent issues (default 50)
lore issues 123                       # Show issue #123 with discussions
lore issues 123 -p group/repo         # Disambiguate by project
lore issues -n 100                    # More results
lore issues -s opened                 # Only open issues
lore issues -s closed                 # Only closed issues
lore issues -a username               # By author (@ prefix optional)
lore issues -A username               # By assignee (@ prefix optional)
lore issues -l bug                    # By label (AND logic)
lore issues -l bug -l urgent          # Multiple labels
lore issues -m "v1.0"                 # By milestone title
lore issues --since 7d                # Updated in last 7 days
lore issues --since 2w                # Updated in last 2 weeks
lore issues --since 1m                # Updated in last month
lore issues --since 2024-01-01        # Updated since date
lore issues --due-before 2024-12-31   # Due before date
lore issues --has-due                 # Only issues with due dates
lore issues -p group/repo             # Filter by project
lore issues --sort created --asc      # Sort by created date, ascending
lore issues -o                        # Open first result in browser
```

When listing, output includes: IID, title, state, author, assignee, labels, and update time.

When showing a single issue (e.g., `lore issues 123`), output includes: title, description, state, author, assignees, labels, milestone, due date, web URL, and threaded discussions.

#### Project Resolution

The `-p` / `--project` flag uses cascading match logic across all commands:

1. **Exact match**: `group/project`
2. **Case-insensitive**: `Group/Project`
3. **Suffix match**: `project` matches `group/project` (if unambiguous)
4. **Substring match**: `typescript` matches `vs/typescript-code` (if unambiguous)

If multiple projects match, an error lists the candidates with a hint to use the full path.

### `lore mrs`

Query merge requests from local database, or show a specific MR.

```bash
lore mrs                              # Recent MRs (default 50)
lore mrs 456                          # Show MR !456 with discussions
lore mrs 456 -p group/repo            # Disambiguate by project
lore mrs -n 100                       # More results
lore mrs -s opened                    # Only open MRs
lore mrs -s merged                    # Only merged MRs
lore mrs -s closed                    # Only closed MRs
lore mrs -s locked                    # Only locked MRs
lore mrs -s all                       # All states
lore mrs -a username                  # By author (@ prefix optional)
lore mrs -A username                  # By assignee (@ prefix optional)
lore mrs -r username                  # By reviewer (@ prefix optional)
lore mrs -d                           # Only draft/WIP MRs
lore mrs -D                           # Exclude draft MRs
lore mrs --target main                # By target branch
lore mrs --source feature/foo         # By source branch
lore mrs -l needs-review              # By label (AND logic)
lore mrs --since 7d                   # Updated in last 7 days
lore mrs -p group/repo                # Filter by project
lore mrs --sort created --asc         # Sort by created date, ascending
lore mrs -o                           # Open first result in browser
```

When listing, output includes: IID, title (with [DRAFT] prefix if applicable), state, author, assignee, labels, and update time.

When showing a single MR (e.g., `lore mrs 456`), output includes: title, description, state, draft status, author, assignees, reviewers, labels, source/target branches, merge status, web URL, and threaded discussions. Inline code review comments (DiffNotes) display file context in the format `[src/file.ts:45]`.

### `lore search`

Search across indexed documents using hybrid (lexical + semantic), lexical-only, or semantic-only modes.

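
Hybrid mode merges the lexical and semantic rankings with Reciprocal Rank Fusion (RRF). The sketch below shows the general technique, not lore's actual code: each document scores the sum of `1 / (k + rank)` over the lists it appears in. The function name `rrf_fuse`, the `u64` document IDs, and the commonly used constant `k = 60` are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Fuse ranked lists of document IDs with Reciprocal Rank Fusion.
/// Each inner list is ordered best-first; `k` dampens the weight of top ranks.
/// Returns (document ID, fused score) pairs, best first.
fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, doc_id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit contributes 1 / (k + 1).
            *scores.entry(*doc_id).or_insert(0.0) += 1.0 / (k + idx as f64 + 1.0);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by document ID for determinism.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

For example, fusing a lexical ranking `[1, 2, 3]` with a semantic ranking `[3, 1, 4]` at `k = 60.0` orders the documents 1, 3, 2, 4: documents found by both searches rise above those found by only one.
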
```bash
lore search "authentication bug"              # Hybrid search (default)
lore search "login flow" --mode lexical       # FTS5 lexical only
lore search "login flow" --mode semantic      # Vector similarity only
lore search "auth" --type issue               # Filter by source type
lore search "auth" --type mr                  # MR documents only
lore search "auth" --type discussion          # Discussion documents only
lore search "deploy" --author username        # Filter by author
lore search "deploy" -p group/repo            # Filter by project
lore search "deploy" --label backend          # Filter by label (AND logic)
lore search "deploy" --path src/              # Filter by file path (trailing / for prefix)
lore search "deploy" --after 7d               # Created after (7d, 2w, 1m, or YYYY-MM-DD)
lore search "deploy" --updated-after 2w       # Updated after
lore search "deploy" -n 50                    # Limit results (default 20, max 100)
lore search "deploy" --explain                # Show ranking explanation per result
lore search "deploy" --fts-mode raw           # Raw FTS5 query syntax (advanced)
```

Requires `lore generate-docs` (or `lore sync`) to have been run at least once. Semantic and hybrid modes require `lore embed` (or `lore sync`) to have generated vector embeddings via Ollama.

### `lore sync`

Run the full sync pipeline: ingest from GitLab, generate searchable documents, and compute embeddings.

```bash
lore sync                 # Full pipeline
lore sync --full          # Reset cursors, fetch everything
lore sync --force         # Override stale lock
lore sync --no-embed      # Skip embedding step
lore sync --no-docs       # Skip document regeneration
```

### `lore ingest`

Sync data from GitLab to local database. Runs only the ingestion step (no doc generation or embeddings).

```bash
lore ingest                        # Ingest everything (issues + MRs)
lore ingest issues                 # Issues only
lore ingest mrs                    # MRs only
lore ingest issues -p group/repo   # Single project
lore ingest --force                # Override stale lock
lore ingest --full                 # Full re-sync (reset cursors)
```

The `--full` flag resets sync cursors and discussion watermarks, then fetches all data from scratch.
Useful when:

- Assignee data or other fields were missing from earlier syncs
- You want to ensure complete data after schema changes
- Troubleshooting sync issues

### `lore generate-docs`

Extract searchable documents from ingested issues, MRs, and discussions for the FTS5 index.

```bash
lore generate-docs                 # Incremental (dirty items only)
lore generate-docs --full          # Full rebuild
lore generate-docs -p group/repo   # Single project
```

### `lore embed`

Generate vector embeddings for documents via Ollama. Requires Ollama running with the configured embedding model.

```bash
lore embed                  # Embed new/changed documents
lore embed --retry-failed   # Retry previously failed embeddings
```

### `lore count`

Count entities in local database.

```bash
lore count issues                     # Total issues
lore count mrs                        # Total MRs (with state breakdown)
lore count discussions                # Total discussions
lore count discussions --for issue    # Issue discussions only
lore count discussions --for mr       # MR discussions only
lore count notes                      # Total notes (system vs user breakdown)
lore count notes --for issue          # Issue notes only
```

### `lore stats`

Show document and index statistics, with optional integrity checks.

```bash
lore stats                    # Document and index statistics
lore stats --check            # Run integrity checks
lore stats --check --repair   # Repair integrity issues
```

### `lore status`

Show current sync state and watermarks.

```bash
lore status
```

Displays:

- Last sync run details (status, timing)
- Cursor positions per project and resource type (issues and MRs)
- Data summary counts

### `lore init`

Initialize configuration and database interactively.

```bash
lore init                     # Interactive setup
lore init --force             # Overwrite existing config
lore init --non-interactive   # Fail if prompts needed
```

### `lore auth`

Verify GitLab authentication is working.

```bash
lore auth
# Authenticated as @username (Full Name)
# GitLab: https://gitlab.com
```

### `lore doctor`

Check environment health and configuration.

```bash
lore doctor
```

Checks performed:

- Config file existence and validity
- Database existence and pragmas (WAL mode, foreign keys)
- GitLab authentication
- Project accessibility
- Ollama connectivity (optional)

### `lore migrate`

Run pending database migrations.

```bash
lore migrate
```

### `lore health`

Quick pre-flight check for config, database, and schema version. Exits 0 if healthy, 1 if unhealthy.

```bash
lore health
```

Useful as a fast gate before running queries or syncs. For a more thorough check including authentication and project access, use `lore doctor`.

### `lore robot-docs`

Machine-readable command manifest for agent self-discovery. Returns a JSON schema of all commands, flags, exit codes, and example workflows.

```bash
lore robot-docs           # Pretty-printed JSON
lore --robot robot-docs   # Compact JSON for parsing
```

### `lore version`

Show version information including the git commit hash.

```bash
lore version
# lore version 0.1.0 (abc1234)
```

## Robot Mode

Machine-readable JSON output for scripting and AI agent consumption.

### Activation

```bash
# Global flag
lore --robot issues -n 5

# JSON shorthand (-J)
lore -J issues -n 5

# Environment variable
LORE_ROBOT=1 lore issues -n 5

# Auto-detection (when stdout is not a TTY)
lore issues -n 5 | jq .
```

### Response Format

All commands return consistent JSON:

```json
{"ok": true, "data": {...}, "meta": {...}}
```

Errors return structured JSON to stderr:

```json
{"error": {"code": "CONFIG_NOT_FOUND", "message": "...", "suggestion": "Run 'lore init'"}}
```

### Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Internal error / health check failed / not implemented |
| 2 | Usage error (invalid flags or arguments) |
| 3 | Config invalid |
| 4 | Token not set |
| 5 | GitLab auth failed |
| 6 | Resource not found |
| 7 | Rate limited |
| 8 | Network error |
| 9 | Database locked |
| 10 | Database error |
| 11 | Migration failed |
| 12 | I/O error |
| 13 | Transform error |
| 14 | Ollama unavailable |
| 15 | Ollama model not found |
| 16 | Embedding failed |
| 17 | Not found (entity does not exist) |
| 18 | Ambiguous match (use `-p` to specify project) |
| 20 | Config not found |

## Configuration Precedence

Settings are resolved in this order (highest to lowest priority):

1. CLI flags (`--robot`, `--config`, `--color`)
2. Environment variables (`LORE_ROBOT`, `GITLAB_TOKEN`, `LORE_CONFIG_PATH`)
3. Config file (`~/.config/lore/config.json`)
4. Built-in defaults

## Global Options

```bash
lore -c /path/to/config.json   # Use alternate config
lore --robot                   # Machine-readable JSON
lore -J                        # JSON shorthand
lore --color never             # Disable color output
lore --color always            # Force color output
lore -q                        # Suppress non-essential output
```

Color output respects `NO_COLOR` and `CLICOLOR` environment variables in `auto` mode (the default).

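
The color rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not lore's implementation: the names `ColorMode` and `use_color` are hypothetical, and falling back to a TTY check in `auto` mode is a common convention that this README does not confirm for color output.

```rust
/// Hypothetical mirror of the `--color` flag values.
#[derive(Clone, Copy)]
enum ColorMode {
    Auto,
    Always,
    Never,
}

/// Decide whether to emit ANSI colors.
/// `no_color` and `clicolor` are the values of the NO_COLOR and CLICOLOR
/// environment variables, or `None` when unset.
fn use_color(
    mode: ColorMode,
    no_color: Option<&str>,
    clicolor: Option<&str>,
    stdout_is_tty: bool,
) -> bool {
    match mode {
        // An explicit flag wins over the environment.
        ColorMode::Always => true,
        ColorMode::Never => false,
        ColorMode::Auto => {
            // NO_COLOR disables color when set to any value.
            if no_color.is_some() {
                return false;
            }
            // CLICOLOR=0 disables color.
            if clicolor == Some("0") {
                return false;
            }
            // Assumption: otherwise color only when writing to a terminal.
            stdout_is_tty
        }
    }
}
```

A caller could obtain the inputs from `std::env::var` and from `std::io::IsTerminal::is_terminal` on `std::io::stdout()`. Keeping the decision a pure function makes the precedence easy to unit-test.
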
## Shell Completions

Generate shell completions for tab-completion support:

```bash
# Bash (add to ~/.bashrc)
lore completions bash > ~/.local/share/bash-completion/completions/lore

# Zsh (add to ~/.zshrc: fpath=(~/.zfunc $fpath))
lore completions zsh > ~/.zfunc/_lore

# Fish
lore completions fish > ~/.config/fish/completions/lore.fish

# PowerShell (add to $PROFILE)
lore completions powershell >> $PROFILE
```

## Database Schema

Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:

| Table | Purpose |
|-------|---------|
| `projects` | Tracked GitLab projects with metadata |
| `issues` | Issue metadata (title, state, author, due date, milestone) |
| `merge_requests` | MR metadata (title, state, draft, branches, merge status) |
| `milestones` | Project milestones with state and due dates |
| `labels` | Project labels with colors |
| `issue_labels` | Many-to-many issue-label relationships |
| `issue_assignees` | Many-to-many issue-assignee relationships |
| `mr_labels` | Many-to-many MR-label relationships |
| `mr_assignees` | Many-to-many MR-assignee relationships |
| `mr_reviewers` | Many-to-many MR-reviewer relationships |
| `discussions` | Issue/MR discussion threads |
| `notes` | Individual notes within discussions (with system note flag and DiffNote position data) |
| `documents` | Extracted searchable text for FTS and embedding |
| `documents_fts` | FTS5 full-text search index |
| `embeddings` | Vector embeddings for semantic search |
| `dirty_sources` | Entities needing document regeneration after ingest |
| `pending_discussion_fetches` | Queue for discussion fetch operations |
| `sync_runs` | Audit trail of sync operations |
| `sync_cursors` | Cursor positions for incremental sync |
| `app_locks` | Crash-safe single-flight lock |
| `raw_payloads` | Compressed original API responses |
| `schema_version` | Migration version tracking |

The database is stored at `~/.local/share/lore/lore.db` by default (XDG compliant).

## Development

```bash
# Run tests
cargo test

# Run with debug logging
RUST_LOG=lore=debug lore issues

# Run with trace logging
RUST_LOG=lore=trace lore ingest issues

# Check formatting
cargo fmt --check

# Lint
cargo clippy
```

## Tech Stack

- **Rust** (2024 edition)
- **SQLite** via rusqlite (bundled) with FTS5 and sqlite-vec
- **Ollama** for vector embeddings (nomic-embed-text)
- **clap** for CLI parsing
- **reqwest** for HTTP
- **tokio** for async runtime
- **serde** for serialization
- **tracing** for logging
- **indicatif** for progress bars

## License

MIT