teernisse 86a51cddef fix: Project-scoped job claiming, structured rate-limit logging, RRF total_cmp

Targeted fixes across multiple subsystems:

dependent_queue:
- Add project_id parameter to claim_jobs() for project-scoped job claiming,
  preventing cross-project job theft during concurrent multi-project ingestion
- Add project_id parameter to count_pending_jobs() with optional scoping
  (None returns global counts, Some(pid) returns per-project counts)

gitlab/client:
- Downgrade rate-limit log from warn to info (429s are expected operational
  behavior, not warnings) and add structured fields (path, status_code)
  for better log filtering and aggregation

gitlab/transformers/discussion:
- Add tracing::warn on invalid timestamp parse instead of silent fallback
  to epoch 0, making data quality issues visible in logs

ingestion/merge_requests:
- Remove duplicate doc comment on upsert_label_tx

search/rrf:
- Replace partial_cmp().unwrap_or() with total_cmp() for f64 sorting,
  eliminating the NaN edge case entirely (total_cmp treats NaN consistently)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-04 13:39:13 -05:00

README.md

Gitlore

Local GitLab data management with semantic search. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, and hybrid search.

Features

Local-first: All data stored in SQLite for instant queries
Incremental sync: Cursor-based sync only fetches changes since last sync
Full re-sync: Reset cursors and fetch all data from scratch when needed
Multi-project: Track issues and MRs across multiple GitLab projects
Rich filtering: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches
Hybrid search: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
Raw payload storage: Preserves original GitLab API responses for debugging
Discussion threading: Full support for issue and MR discussions including inline code review comments
Robot mode: Machine-readable JSON output with structured errors and meaningful exit codes

Installation

cargo install --path .

Or build from source:

cargo build --release
./target/release/lore --help

Quick Start

# Initialize configuration (interactive)
lore init

# Verify authentication
lore auth

# Sync everything from GitLab (issues + MRs + docs + embeddings)
lore sync

# List recent issues
lore issues -n 10

# List open merge requests
lore mrs -s opened

# Show issue details
lore issues 123

# Show MR details with discussions
lore mrs 456

# Search across all indexed data
lore search "authentication bug"

# Robot mode (machine-readable JSON)
lore -J issues -n 5 | jq .

Configuration

Configuration is stored in ~/.config/lore/config.json, or in $XDG_CONFIG_HOME/lore/config.json when XDG_CONFIG_HOME is set.

Example Configuration

{
  "gitlab": {
    "baseUrl": "https://gitlab.com",
    "tokenEnvVar": "GITLAB_TOKEN"
  },
  "projects": [
    { "path": "group/project" },
    { "path": "other-group/other-project" }
  ],
  "sync": {
    "backfillDays": 14,
    "staleLockMinutes": 10,
    "heartbeatIntervalSeconds": 30,
    "cursorRewindSeconds": 2,
    "primaryConcurrency": 4,
    "dependentConcurrency": 2
  },
  "storage": {
    "compressRawPayloads": true
  },
  "embedding": {
    "provider": "ollama",
    "model": "nomic-embed-text",
    "baseUrl": "http://localhost:11434",
    "concurrency": 4
  }
}

Configuration Options

Section	Field	Default	Description
`gitlab`	`baseUrl`	--	GitLab instance URL (required)
`gitlab`	`tokenEnvVar`	`GITLAB_TOKEN`	Environment variable containing API token
`projects`	`path`	--	Project path (e.g., `group/project`)
`sync`	`backfillDays`	`14`	Days to backfill on initial sync
`sync`	`staleLockMinutes`	`10`	Minutes before a sync lock is considered stale
`sync`	`heartbeatIntervalSeconds`	`30`	Frequency of lock heartbeat updates
`sync`	`cursorRewindSeconds`	`2`	Seconds to rewind cursor for overlap safety
`sync`	`primaryConcurrency`	`4`	Concurrent GitLab requests for primary resources
`sync`	`dependentConcurrency`	`2`	Concurrent requests for dependent resources
`storage`	`dbPath`	`~/.local/share/lore/lore.db`	Database file path
`storage`	`backupDir`	`~/.local/share/lore/backups`	Backup directory
`storage`	`compressRawPayloads`	`true`	Compress stored API responses with gzip
`embedding`	`provider`	`ollama`	Embedding provider
`embedding`	`model`	`nomic-embed-text`	Model name for embeddings
`embedding`	`baseUrl`	`http://localhost:11434`	Ollama server URL
`embedding`	`concurrency`	`4`	Concurrent embedding requests
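
The two timing options in the `sync` section can be read as simple arithmetic on timestamps. The sketch below is an illustration only: the struct, the function names, and the exact comparisons are assumptions made for this example, not the code gitlore ships.

```rust
/// Two of the `sync` options from the table above, with the documented defaults
/// supplied by the caller. The struct itself is hypothetical.
struct SyncSettings {
    stale_lock_minutes: u64,
    cursor_rewind_seconds: u64,
}

/// Timestamp (seconds since the Unix epoch) to request changes from.
/// Rewinding the stored cursor re-fetches a small overlap so that items
/// updated right at the cursor boundary are not missed.
fn fetch_from(cursor_epoch_secs: u64, settings: &SyncSettings) -> u64 {
    cursor_epoch_secs.saturating_sub(settings.cursor_rewind_seconds)
}

/// A lock counts as stale once its last heartbeat is older than the
/// configured number of minutes (assumed here to be a strict comparison).
fn lock_is_stale(last_heartbeat_secs: u64, now_secs: u64, settings: &SyncSettings) -> bool {
    let age_secs = now_secs.saturating_sub(last_heartbeat_secs);
    age_secs > settings.stale_lock_minutes * 60
}
```

With the defaults (`cursorRewindSeconds` 2, `staleLockMinutes` 10), a cursor at second 1000 would be queried from second 998, and a lock would be treated as stale after more than 600 seconds without a heartbeat.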

Config File Resolution

The config file is resolved in this order:

--config / -c CLI flag
LORE_CONFIG_PATH environment variable
~/.config/lore/config.json (XDG default)
./lore.config.json (local fallback for development)
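
The resolution order above can be sketched as a small function. This is a minimal illustration, not gitlore's actual code: the function name, its parameters, and the `exists` probe are hypothetical, and it assumes that an explicit flag or environment variable is taken as given while the two default locations are only used if the file exists.

```rust
use std::path::{Path, PathBuf};

/// Resolve the config path in the documented order. Inputs are passed in
/// (rather than read from the process) so the logic is easy to test.
/// `exists` is a hypothetical file-existence probe, e.g. `|p| p.exists()`.
fn resolve_config_path(
    cli_flag: Option<&str>,        // value of --config / -c
    env_override: Option<&str>,    // value of LORE_CONFIG_PATH
    xdg_config_home: Option<&str>, // value of XDG_CONFIG_HOME
    home: &str,                    // the user's home directory
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // 1. The CLI flag wins outright.
    if let Some(path) = cli_flag {
        return Some(PathBuf::from(path));
    }
    // 2. Then the environment variable.
    if let Some(path) = env_override {
        return Some(PathBuf::from(path));
    }
    // 3. XDG default: $XDG_CONFIG_HOME/lore/config.json, falling back to ~/.config.
    let base = match xdg_config_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => Path::new(home).join(".config"),
    };
    let xdg_path = base.join("lore").join("config.json");
    if exists(&xdg_path) {
        return Some(xdg_path);
    }
    // 4. Local fallback for development.
    let local = PathBuf::from("./lore.config.json");
    if exists(&local) {
        return Some(local);
    }
    None
}
```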

GitLab Token

Create a personal access token with read_api scope:

Go to GitLab > Settings > Access Tokens
Create token with read_api scope
Export it: export GITLAB_TOKEN=glpat-xxxxxxxxxxxx

Environment Variables

Variable	Purpose	Required
`GITLAB_TOKEN`	GitLab API authentication token (name configurable via `gitlab.tokenEnvVar`)	Yes
`LORE_CONFIG_PATH`	Override config file location	No
`LORE_ROBOT`	Enable robot mode globally (set to `true` or `1`)	No
`XDG_CONFIG_HOME`	XDG Base Directory for config (fallback: `~/.config`)	No
`XDG_DATA_HOME`	XDG Base Directory for data (fallback: `~/.local/share`)	No
`NO_COLOR`	Disable color output when set (any value)	No
`CLICOLOR`	Standard color control (0 to disable)	No
`RUST_LOG`	Logging level filter (e.g., `lore=debug`)	No

Commands

`lore issues`

Query issues from the local database, or show a specific issue.

lore issues                           # Recent issues (default 50)
lore issues 123                       # Show issue #123 with discussions
lore issues 123 -p group/repo        # Disambiguate by project
lore issues -n 100                    # More results
lore issues -s opened                 # Only open issues
lore issues -s closed                 # Only closed issues
lore issues -a username               # By author (@ prefix optional)
lore issues -A username               # By assignee (@ prefix optional)
lore issues -l bug                    # By label (AND logic)
lore issues -l bug -l urgent          # Multiple labels
lore issues -m "v1.0"                 # By milestone title
lore issues --since 7d               # Updated in last 7 days
lore issues --since 2w               # Updated in last 2 weeks
lore issues --since 1m               # Updated in last month
lore issues --since 2024-01-01       # Updated since date
lore issues --due-before 2024-12-31  # Due before date
lore issues --has-due                 # Only issues with due dates
lore issues -p group/repo            # Filter by project
lore issues --sort created --asc     # Sort by created date, ascending
lore issues -o                        # Open first result in browser
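
The `--since` values shown above come in two shapes: a relative span (`7d`, `2w`, `1m`) or an absolute `YYYY-MM-DD` date. A minimal parsing sketch follows. The `Since` type and `parse_since` function are hypothetical names, and treating `1m` as 30 days is an assumption, since the text above only says "last month".

```rust
/// Parsed form of a `--since` argument (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Since {
    /// Relative span, normalised to whole days.
    DaysAgo(u32),
    /// Absolute calendar date.
    Date { year: u32, month: u32, day: u32 },
}

fn parse_since(input: &str) -> Option<Since> {
    // Absolute form: YYYY-MM-DD. Only coarse range checks are done here;
    // days-per-month are not validated.
    let parts: Vec<&str> = input.split('-').collect();
    if parts.len() == 3 && parts[0].len() == 4 {
        let year: u32 = parts[0].parse().ok()?;
        let month: u32 = parts[1].parse().ok()?;
        let day: u32 = parts[2].parse().ok()?;
        if (1..=12).contains(&month) && (1..=31).contains(&day) {
            return Some(Since::Date { year, month, day });
        }
        return None;
    }

    // Relative form: a number followed by a one-letter unit.
    let unit = input.chars().last()?;
    let number = &input[..input.len() - unit.len_utf8()];
    let count: u32 = number.parse().ok()?;
    let days_per_unit = match unit {
        'd' => 1,
        'w' => 7,
        'm' => 30, // assumption: a month is approximated as 30 days
        _ => return None,
    };
    count.checked_mul(days_per_unit).map(Since::DaysAgo)
}
```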

When listing, output includes: IID, title, state, author, assignee, labels, and update time.

When showing a single issue (e.g., lore issues 123), output includes: title, description, state, author, assignees, labels, milestone, due date, web URL, and threaded discussions.

Project Resolution

The -p / --project flag uses cascading match logic across all commands:

Exact match: group/project
Case-insensitive: Group/Project
Suffix match: project matches group/project (if unambiguous)
Substring match: typescript matches vs/typescript-code (if unambiguous)

If multiple projects match, an error lists the candidates with a hint to use the full path.
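
The cascade can be pictured as trying each tier in turn and stopping at the first tier that produces any hits. The sketch below is one way to express that under stated assumptions: the `Resolution` enum and both function names are hypothetical, and it assumes an ambiguous tier reports an error rather than falling through to the next tier.

```rust
/// Outcome of resolving a `-p` argument against the tracked project paths.
#[derive(Debug, PartialEq)]
enum Resolution {
    Found(String),
    Ambiguous(Vec<String>),
    NotFound,
}

/// Tiers 0..=2 correspond to case-insensitive, suffix, and substring matching.
fn tier_matches(tier: u8, candidate: &str, query_lower: &str) -> bool {
    let candidate_lower = candidate.to_lowercase();
    match tier {
        0 => candidate_lower == query_lower,
        1 => candidate_lower.ends_with(&format!("/{query_lower}")),
        _ => candidate_lower.contains(query_lower),
    }
}

fn resolve_project(query: &str, projects: &[&str]) -> Resolution {
    // Exact match always wins.
    if let Some(project) = projects.iter().find(|p| **p == query) {
        return Resolution::Found((*project).to_string());
    }

    let query_lower = query.to_lowercase();
    for tier in 0..3u8 {
        let hits: Vec<String> = projects
            .iter()
            .filter(|p| tier_matches(tier, p, &query_lower))
            .map(|p| (*p).to_string())
            .collect();
        match hits.len() {
            0 => continue, // nothing at this tier, try a looser one
            1 => return Resolution::Found(hits.into_iter().next().unwrap()),
            _ => return Resolution::Ambiguous(hits), // caller lists the candidates
        }
    }
    Resolution::NotFound
}
```

With tracked projects `group/project` and `vs/typescript-code`, the query `project` resolves by suffix and `typescript` resolves by substring, matching the examples above.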

`lore mrs`

Query merge requests from the local database, or show a specific MR.

lore mrs                              # Recent MRs (default 50)
lore mrs 456                          # Show MR !456 with discussions
lore mrs 456 -p group/repo           # Disambiguate by project
lore mrs -n 100                       # More results
lore mrs -s opened                    # Only open MRs
lore mrs -s merged                    # Only merged MRs
lore mrs -s closed                    # Only closed MRs
lore mrs -s locked                    # Only locked MRs
lore mrs -s all                       # All states
lore mrs -a username                  # By author (@ prefix optional)
lore mrs -A username                  # By assignee (@ prefix optional)
lore mrs -r username                  # By reviewer (@ prefix optional)
lore mrs -d                           # Only draft/WIP MRs
lore mrs -D                           # Exclude draft MRs
lore mrs --target main               # By target branch
lore mrs --source feature/foo        # By source branch
lore mrs -l needs-review              # By label (AND logic)
lore mrs --since 7d                  # Updated in last 7 days
lore mrs -p group/repo               # Filter by project
lore mrs --sort created --asc        # Sort by created date, ascending
lore mrs -o                           # Open first result in browser

When listing, output includes: IID, title (with [DRAFT] prefix if applicable), state, author, assignee, labels, and update time.

When showing a single MR (e.g., lore mrs 456), output includes: title, description, state, draft status, author, assignees, reviewers, labels, source/target branches, merge status, web URL, and threaded discussions. Inline code review comments (DiffNotes) display file context in the format [src/file.ts:45].

`lore search`

Search across indexed documents using hybrid (lexical + semantic), lexical-only, or semantic-only modes.

lore search "authentication bug"              # Hybrid search (default)
lore search "login flow" --mode lexical       # FTS5 lexical only
lore search "login flow" --mode semantic      # Vector similarity only
lore search "auth" --type issue               # Filter by source type
lore search "auth" --type mr                  # MR documents only
lore search "auth" --type discussion          # Discussion documents only
lore search "deploy" --author username        # Filter by author
lore search "deploy" -p group/repo           # Filter by project
lore search "deploy" --label backend          # Filter by label (AND logic)
lore search "deploy" --path src/             # Filter by file path (trailing / for prefix)
lore search "deploy" --after 7d              # Created after (7d, 2w, 1m, or YYYY-MM-DD)
lore search "deploy" --updated-after 2w      # Updated after
lore search "deploy" -n 50                    # Limit results (default 20, max 100)
lore search "deploy" --explain               # Show ranking explanation per result
lore search "deploy" --fts-mode raw          # Raw FTS5 query syntax (advanced)

Search requires lore generate-docs (or lore sync) to have been run at least once. Semantic and hybrid modes also require lore embed (or lore sync) to have generated vector embeddings via Ollama.
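
Hybrid mode merges the lexical and semantic result lists with Reciprocal Rank Fusion. For readers unfamiliar with RRF, here is a generic, self-contained sketch of the technique. The function name, the use of `u64` document IDs, and the constant `k = 60` in the usage note are illustrative assumptions and are not taken from gitlore's source. Sorting uses `f64::total_cmp`, which defines a total order over floats and so needs no special case for NaN.

```rust
use std::collections::HashMap;

/// Fuse several best-first rankings of document IDs into one scored list.
/// Each document scores the sum of 1 / (k + rank) over the lists it appears in,
/// where rank is 1-based and `k` dampens the influence of top positions.
fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (index, doc_id) in ranking.iter().enumerate() {
            let rank = (index + 1) as f64;
            *scores.entry(*doc_id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }

    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by document ID so output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

For example, fusing a lexical ranking `[1, 2, 3]` with a semantic ranking `[2, 3, 4]` at `k = 60.0` puts document 2 first, because it sits near the top of both lists, followed by 3, then 1, then 4.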

`lore sync`

Run the full sync pipeline: ingest from GitLab, generate searchable documents, and compute embeddings.

lore sync                    # Full pipeline
lore sync --full             # Reset cursors, fetch everything
lore sync --force            # Override stale lock
lore sync --no-embed         # Skip embedding step
lore sync --no-docs          # Skip document regeneration

`lore ingest`

Sync data from GitLab to the local database. Runs only the ingestion step (no doc generation or embeddings).

lore ingest                                    # Ingest everything (issues + MRs)
lore ingest issues                             # Issues only
lore ingest mrs                                # MRs only
lore ingest issues -p group/repo              # Single project
lore ingest --force                            # Override stale lock
lore ingest --full                             # Full re-sync (reset cursors)

The --full flag resets sync cursors and discussion watermarks, then fetches all data from scratch. Useful when:

Assignee data or other fields were missing from earlier syncs
You want to ensure complete data after schema changes
You are troubleshooting sync issues

`lore generate-docs`

Extract searchable documents from ingested issues, MRs, and discussions for the FTS5 index.

lore generate-docs                    # Incremental (dirty items only)
lore generate-docs --full             # Full rebuild
lore generate-docs -p group/repo     # Single project

`lore embed`

Generate vector embeddings for documents via Ollama. Requires Ollama running with the configured embedding model.

lore embed                    # Embed new/changed documents
lore embed --retry-failed     # Retry previously failed embeddings

`lore count`

Count entities in the local database.

lore count issues                     # Total issues
lore count mrs                        # Total MRs (with state breakdown)
lore count discussions                # Total discussions
lore count discussions --for issue   # Issue discussions only
lore count discussions --for mr      # MR discussions only
lore count notes                      # Total notes (system vs user breakdown)
lore count notes --for issue         # Issue notes only

`lore stats`

Show document and index statistics, with optional integrity checks.

lore stats                    # Document and index statistics
lore stats --check            # Run integrity checks
lore stats --check --repair   # Repair integrity issues

`lore status`

Show current sync state and watermarks.

lore status

Displays:

Last sync run details (status, timing)
Cursor positions per project and resource type (issues and MRs)
Data summary counts

`lore init`

Initialize configuration and database interactively.

lore init                    # Interactive setup
lore init --force            # Overwrite existing config
lore init --non-interactive  # Fail if prompts needed

`lore auth`

Verify GitLab authentication is working.

lore auth
# Authenticated as @username (Full Name)
# GitLab: https://gitlab.com

`lore doctor`

Check environment health and configuration.

lore doctor

Checks performed:

Config file existence and validity
Database existence and pragmas (WAL mode, foreign keys)
GitLab authentication
Project accessibility
Ollama connectivity (optional)

`lore migrate`

Run pending database migrations.

lore migrate

`lore health`

Quick pre-flight check for config, database, and schema version. Exits 0 if healthy, 1 if unhealthy.

lore health

Useful as a fast gate before running queries or syncs. For a more thorough check including authentication and project access, use lore doctor.

`lore robot-docs`

Machine-readable command manifest for agent self-discovery. Returns a JSON schema of all commands, flags, exit codes, and example workflows.

lore robot-docs                   # Pretty-printed JSON
lore --robot robot-docs           # Compact JSON for parsing

`lore version`

Show version information including the git commit hash.

lore version
# lore version 0.1.0 (abc1234)

Robot Mode

Machine-readable JSON output for scripting and AI agent consumption.

Activation

# Global flag
lore --robot issues -n 5

# JSON shorthand (-J)
lore -J issues -n 5

# Environment variable
LORE_ROBOT=1 lore issues -n 5

# Auto-detection (when stdout is not a TTY)
lore issues -n 5 | jq .
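
The four activation routes above reduce to one boolean decision. A minimal sketch, assuming the checks are independent and any one of them is enough. The function name and parameters are hypothetical; a caller could obtain `stdout_is_tty` from `std::io::IsTerminal` in the standard library.

```rust
/// Decide whether robot mode is active.
/// - `robot_flag`: true when `--robot` or `-J` was passed.
/// - `lore_robot_env`: value of the LORE_ROBOT environment variable, if set.
/// - `stdout_is_tty`: whether stdout is attached to a terminal.
fn robot_mode_enabled(robot_flag: bool, lore_robot_env: Option<&str>, stdout_is_tty: bool) -> bool {
    // Explicit flag.
    if robot_flag {
        return true;
    }
    // Environment variable: documented values are `true` or `1`.
    if matches!(lore_robot_env, Some("true") | Some("1")) {
        return true;
    }
    // Auto-detection: piped or redirected output implies a machine consumer.
    !stdout_is_tty
}
```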

Response Format

All commands return consistent JSON:

{"ok": true, "data": {...}, "meta": {...}}

Errors are written to stderr as structured JSON:

{"error": {"code": "CONFIG_NOT_FOUND", "message": "...", "suggestion": "Run 'lore init'"}}

Exit Codes

Code	Meaning
0	Success
1	Internal error / health check failed / not implemented
2	Usage error (invalid flags or arguments)
3	Config invalid
4	Token not set
5	GitLab auth failed
6	Resource not found
7	Rate limited
8	Network error
9	Database locked
10	Database error
11	Migration failed
12	I/O error
13	Transform error
14	Ollama unavailable
15	Ollama model not found
16	Embedding failed
17	Not found (entity does not exist)
18	Ambiguous match (use `-p` to specify project)
20	Config not found
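
A wrapper that shells out to `lore` can translate these codes back into messages. The sketch below copies the meanings from the table. The function names are hypothetical, and the choice of which codes to retry (rate limiting, network errors, a locked database) is an assumption about what a caller might reasonably do, not documented behavior.

```rust
/// Meaning of a `lore` exit code, as listed in the table above.
/// Returns `None` for codes the table does not list (for example 19).
fn exit_code_meaning(code: i32) -> Option<&'static str> {
    let meaning = match code {
        0 => "Success",
        1 => "Internal error / health check failed / not implemented",
        2 => "Usage error (invalid flags or arguments)",
        3 => "Config invalid",
        4 => "Token not set",
        5 => "GitLab auth failed",
        6 => "Resource not found",
        7 => "Rate limited",
        8 => "Network error",
        9 => "Database locked",
        10 => "Database error",
        11 => "Migration failed",
        12 => "I/O error",
        13 => "Transform error",
        14 => "Ollama unavailable",
        15 => "Ollama model not found",
        16 => "Embedding failed",
        17 => "Not found (entity does not exist)",
        18 => "Ambiguous match (use `-p` to specify project)",
        20 => "Config not found",
        _ => return None,
    };
    Some(meaning)
}

/// Hypothetical policy: failures that may clear up after a pause.
fn worth_retrying(code: i32) -> bool {
    matches!(code, 7 | 8 | 9)
}
```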

Configuration Precedence

Settings are resolved in this order (highest to lowest priority):

CLI flags (--robot, --config, --color)
Environment variables (LORE_ROBOT, GITLAB_TOKEN, LORE_CONFIG_PATH)
Config file (~/.config/lore/config.json)
Built-in defaults
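
This kind of layered precedence maps naturally onto chained `Option` values. A minimal sketch with a hypothetical `resolve_setting` helper, where each layer is `Some` only if that source supplied a value:

```rust
/// Pick the highest-priority value that is present, falling back to the default.
/// Order: CLI flag, then environment variable, then config file.
fn resolve_setting<T>(cli: Option<T>, env: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(env).or(file).unwrap_or(default)
}
```

For instance, with no CLI flag, an environment value of `"/env/config.json"`, and a config-file value of `"/file/config.json"`, the environment value is chosen.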

Global Options

lore -c /path/to/config.json <command>   # Use alternate config
lore --robot <command>                    # Machine-readable JSON
lore -J <command>                         # JSON shorthand
lore --color never <command>              # Disable color output
lore --color always <command>             # Force color output
lore -q <command>                         # Suppress non-essential output

Color output respects NO_COLOR and CLICOLOR environment variables in auto mode (the default).
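
The interaction between `--color` and the environment can be sketched as follows. The enum and function names are hypothetical, and falling back to a terminal check when neither variable applies is an assumption about what auto mode does.

```rust
/// Value of the `--color` flag; `Auto` is the default.
#[derive(Clone, Copy)]
enum ColorChoice {
    Auto,
    Always,
    Never,
}

/// Decide whether to emit color.
/// `no_color` and `clicolor` are the values of NO_COLOR and CLICOLOR, if set.
fn use_color(
    choice: ColorChoice,
    no_color: Option<&str>,
    clicolor: Option<&str>,
    stdout_is_tty: bool,
) -> bool {
    match choice {
        // Explicit flags override the environment.
        ColorChoice::Always => true,
        ColorChoice::Never => false,
        ColorChoice::Auto => {
            // NO_COLOR disables color when set to any value.
            if no_color.is_some() {
                return false;
            }
            // CLICOLOR=0 disables color.
            if clicolor == Some("0") {
                return false;
            }
            // Assumed fallback: color only when writing to a terminal.
            stdout_is_tty
        }
    }
}
```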

Shell Completions

Generate shell completions for tab-completion support:

# Bash (add to ~/.bashrc)
lore completions bash > ~/.local/share/bash-completion/completions/lore

# Zsh (add to ~/.zshrc: fpath=(~/.zfunc $fpath))
lore completions zsh > ~/.zfunc/_lore

# Fish
lore completions fish > ~/.config/fish/completions/lore.fish

# PowerShell (add to $PROFILE)
lore completions powershell >> $PROFILE

Database Schema

Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:

Table	Purpose
`projects`	Tracked GitLab projects with metadata
`issues`	Issue metadata (title, state, author, due date, milestone)
`merge_requests`	MR metadata (title, state, draft, branches, merge status)
`milestones`	Project milestones with state and due dates
`labels`	Project labels with colors
`issue_labels`	Many-to-many issue-label relationships
`issue_assignees`	Many-to-many issue-assignee relationships
`mr_labels`	Many-to-many MR-label relationships
`mr_assignees`	Many-to-many MR-assignee relationships
`mr_reviewers`	Many-to-many MR-reviewer relationships
`discussions`	Issue/MR discussion threads
`notes`	Individual notes within discussions (with system note flag and DiffNote position data)
`documents`	Extracted searchable text for FTS and embedding
`documents_fts`	FTS5 full-text search index
`embeddings`	Vector embeddings for semantic search
`dirty_sources`	Entities needing document regeneration after ingest
`pending_discussion_fetches`	Queue for discussion fetch operations
`sync_runs`	Audit trail of sync operations
`sync_cursors`	Cursor positions for incremental sync
`app_locks`	Crash-safe single-flight lock
`raw_payloads`	Compressed original API responses
`schema_version`	Migration version tracking

The database is stored at ~/.local/share/lore/lore.db by default (XDG compliant).

Development

# Run tests
cargo test

# Run with debug logging
RUST_LOG=lore=debug lore issues

# Run with trace logging
RUST_LOG=lore=trace lore ingest issues

# Check formatting
cargo fmt --check

# Lint
cargo clippy

Tech Stack

Rust (2024 edition)
SQLite via rusqlite (bundled) with FTS5 and sqlite-vec
Ollama for vector embeddings (nomic-embed-text)
clap for CLI parsing
reqwest for HTTP
tokio for async runtime
serde for serialization
tracing for logging
indicatif for progress bars

License

MIT