What is Gitlore?
Gitlore (lore) is a read-only local sync engine for GitLab data. It fetches issues, merge requests, and discussions from the GitLab REST API v4 and stores them in a local SQLite database for fast offline querying.
~11,148 lines of Rust. Noun-first CLI. Robot mode for automation. Cursor-based incremental sync.
API Surface Coverage
Gitlore uses a small, focused subset of the GitLab API. It is read-only: it never creates, updates, or deletes anything on GitLab.
Data Flow
10 req/s + jitter → Paginated fetch → Normalize data → Local DB
Key Design Decisions
Only GET requests. Never mutates GitLab state. Safe to run repeatedly.
Uses updated_after parameter to only fetch changed data. 2-second rewind overlap for safety.
Stores original JSON responses with SHA-256 dedup and optional gzip compression.
Discussions use DELETE+INSERT strategy per parent (no incremental). Parallel prefetch, serial write.
CLI Architecture
Command Structure (Noun-First)

    lore <noun> [verb/arg]      # Primary pattern
    lore issues                 # List all issues
    lore issues 42              # Show issue #42
    lore mrs                    # List all merge requests
    lore mrs 17                 # Show MR #17
    lore ingest issues          # Fetch issues from GitLab
    lore ingest mrs             # Fetch MRs from GitLab
    lore count issues           # Count local issues
    lore count discussions      # Count local discussions
    lore status                 # Show sync state
    lore auth                   # Verify GitLab auth
    lore doctor                 # Health check
    lore init                   # Initialize config + DB
    lore migrate                # Run DB migrations
    lore version                # Show version

Global Flags
| Flag | Description |
|---|---|
| -c, --config | Path to config file |
| --robot | Machine-readable JSON output |
| -J, --json | JSON shorthand (same as --robot) |
Robot Mode Detection
Three ways to activate:

    lore --robot list issues        # Explicit flag
    lore list issues | jq .         # Auto: stdout not a TTY
    LORE_ROBOT=1 lore list issues   # Environment variable

Robot mode returns JSON: {"ok":true,"data":{...},"meta":{...}}
Errors go to stderr: {"error":{"code":"...","message":"...","suggestion":"..."}}
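
The three activation paths above can be sketched as a single predicate. This is a minimal illustration, not Gitlore's actual code: the function names are hypothetical, and the rule that any non-empty `LORE_ROBOT` value other than `0` counts as enabled is an assumption.

```rust
use std::io::IsTerminal;

/// Decide whether to emit machine-readable JSON.
/// `flag` is the parsed --robot / --json flag, `env_value` is the value of
/// LORE_ROBOT if set, and `stdout_is_tty` says whether stdout is a terminal.
pub fn robot_mode(flag: bool, env_value: Option<&str>, stdout_is_tty: bool) -> bool {
    // Assumption: any non-empty value other than "0" enables robot mode.
    let env_enabled = matches!(env_value, Some(v) if !v.is_empty() && v != "0");
    flag || env_enabled || !stdout_is_tty
}

/// Convenience wrapper that reads the real process environment and stdout.
pub fn robot_mode_from_env(flag: bool) -> bool {
    let env_value = std::env::var("LORE_ROBOT").ok();
    robot_mode(flag, env_value.as_deref(), std::io::stdout().is_terminal())
}
```

Keeping the decision in a pure function makes the precedence easy to test without a real terminal; only the thin wrapper touches the environment.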
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Internal error |
| 2 | Config not found |
| 3 | Config invalid |
| 4 | Token not set |
| 5 | GitLab auth failed |
| 6 | Resource not found |
| 7 | Rate limited |
| 8 | Network error |
| 9 | Database locked |
| 10 | Database error |
| 11 | Migration failed |
| 12 | I/O error |
| 13 | Transform error |
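
For scripts that wrap the CLI, the table above could be mirrored as a small enum. The sketch below is illustrative only: the type name `LoreExit` and the `is_retryable` policy are assumptions, not part of Gitlore.

```rust
/// Exit codes from the table above, as a C-like enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum LoreExit {
    Success = 0,
    InternalError = 1,
    ConfigNotFound = 2,
    ConfigInvalid = 3,
    TokenNotSet = 4,
    GitLabAuthFailed = 5,
    ResourceNotFound = 6,
    RateLimited = 7,
    NetworkError = 8,
    DatabaseLocked = 9,
    DatabaseError = 10,
    MigrationFailed = 11,
    IoError = 12,
    TransformError = 13,
}

impl LoreExit {
    /// Numeric process exit code.
    pub fn code(self) -> u8 {
        self as u8
    }

    /// Hypothetical policy: transient conditions a caller might retry.
    pub fn is_retryable(self) -> bool {
        matches!(
            self,
            LoreExit::RateLimited | LoreExit::NetworkError | LoreExit::DatabaseLocked
        )
    }
}
```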
Configuration

    {
      "gitlab": {
        "baseUrl": "https://gitlab.com",
        "tokenEnvVar": "GITLAB_TOKEN"
      },
      "projects": [
        { "path": "group/project" }
      ],
      "sync": {
        "backfillDays": 14,
        "staleLockMinutes": 10,
        "heartbeatIntervalSeconds": 30,
        "cursorRewindSeconds": 2,
        "primaryConcurrency": 4,
        "dependentConcurrency": 2
      },
      "storage": {
        "dbPath": "~/.local/share/lore/lore.db",
        "compressRawPayloads": true
      }
    }

Deprecated Commands (Hidden)
| Old | New | Notes |
|---|---|---|
| lore list | lore issues / lore mrs | Shows deprecation warning |
| lore show | lore issues <iid> | Shows deprecation warning |
| lore auth-test | lore auth | Alias |
| lore sync-status | lore status | Alias |
GitLab API Endpoints Used by Gitlore
All requests use PRIVATE-TOKEN header authentication. Rate limited at 10 req/s with 0-50ms jitter.
Full GitLab REST API v4 Reference
Complete endpoint inventory for the resources relevant to Gitlore. USED = consumed by Gitlore.
Issues API
| Method | Endpoint | Description | Status |
|---|---|---|---|
| GET | /issues | List all issues (global) | -- |
| GET | /groups/:id/issues | List group issues | -- |
| GET | /projects/:id/issues | List project issues | USED |
| GET | /projects/:id/issues/:iid | Get single issue | -- |
| POST | /projects/:id/issues | Create issue | -- |
| PUT | /projects/:id/issues/:iid | Update issue | -- |
| DEL | /projects/:id/issues/:iid | Delete issue | -- |
| PUT | /projects/:id/issues/:iid/reorder | Reorder issue | -- |
| POST | /projects/:id/issues/:iid/move | Move issue | -- |
| POST | /projects/:id/issues/:iid/clone | Clone issue | -- |
| POST | /projects/:id/issues/:iid/subscribe | Subscribe to issue | -- |
| POST | /projects/:id/issues/:iid/unsubscribe | Unsubscribe | -- |
| POST | /projects/:id/issues/:iid/todo | Create to-do | -- |
| POST | /projects/:id/issues/:iid/time_estimate | Set time estimate | -- |
| POST | /projects/:id/issues/:iid/add_spent_time | Add spent time | -- |
| GET | /projects/:id/issues/:iid/time_stats | Get time stats | -- |
| GET | /projects/:id/issues/:iid/related_merge_requests | Related MRs | -- |
| GET | /projects/:id/issues/:iid/closed_by | MRs that close issue | -- |
| GET | /projects/:id/issues/:iid/participants | List participants | -- |
Merge Requests API
| Method | Endpoint | Description | Status |
|---|---|---|---|
| GET | /merge_requests | List all MRs (global) | -- |
| GET | /groups/:id/merge_requests | List group MRs | -- |
| GET | /projects/:id/merge_requests | List project MRs | USED |
| GET | /projects/:id/merge_requests/:iid | Get single MR | -- |
| POST | /projects/:id/merge_requests | Create MR | -- |
| PUT | /projects/:id/merge_requests/:iid | Update MR | -- |
| DEL | /projects/:id/merge_requests/:iid | Delete MR | -- |
| PUT | /projects/:id/merge_requests/:iid/merge | Merge an MR | -- |
| POST | /projects/:id/merge_requests/:iid/cancel_merge | Cancel merge | -- |
| PUT | /projects/:id/merge_requests/:iid/rebase | Rebase MR | -- |
| GET | /projects/:id/merge_requests/:iid/commits | List MR commits | -- |
| GET | /projects/:id/merge_requests/:iid/changes | List MR diffs | -- |
| GET | /projects/:id/merge_requests/:iid/pipelines | MR pipelines | -- |
| GET | /projects/:id/merge_requests/:iid/participants | MR participants | -- |
| GET | /projects/:id/merge_requests/:iid/approvals | MR approvals | -- |
| POST | /projects/:id/merge_requests/:iid/approve | Approve MR | -- |
Discussions API
| Method | Endpoint | Description | Status |
|---|---|---|---|
| GET | /projects/:id/issues/:iid/discussions | List issue discussions | USED |
| GET | /projects/:id/issues/:iid/discussions/:did | Get single discussion | -- |
| POST | /projects/:id/issues/:iid/discussions | Create issue thread | -- |
| POST | /projects/:id/issues/:iid/discussions/:did/notes | Add note to thread | -- |
| PUT | /projects/:id/issues/:iid/discussions/:did/notes/:nid | Modify note | -- |
| DEL | /projects/:id/issues/:iid/discussions/:did/notes/:nid | Delete note | -- |
| GET | /projects/:id/merge_requests/:iid/discussions | List MR discussions | USED |
| GET | /projects/:id/merge_requests/:iid/discussions/:did | Get single MR discussion | -- |
| POST | /projects/:id/merge_requests/:iid/discussions | Create MR thread | -- |
| PUT | /projects/:id/merge_requests/:iid/discussions/:did | Resolve/unresolve thread | -- |
| POST | /projects/:id/merge_requests/:iid/discussions/:did/notes | Add note to MR thread | -- |
| PUT | /projects/:id/merge_requests/:iid/discussions/:did/notes/:nid | Modify MR note | -- |
| DEL | /projects/:id/merge_requests/:iid/discussions/:did/notes/:nid | Delete MR note | -- |
| GET | /projects/:id/snippets/:sid/discussions | List snippet discussions | -- |
| GET | /groups/:id/epics/:eid/discussions | List epic discussions | -- |
| GET | /projects/:id/repository/commits/:sha/discussions | List commit discussions | -- |
Notes API (Flat, non-threaded)
| Method | Endpoint | Description | Status |
|---|---|---|---|
| GET | /projects/:id/issues/:iid/notes | List issue notes | -- |
| POST | /projects/:id/issues/:iid/notes | Create issue note | -- |
| GET | /projects/:id/merge_requests/:iid/notes | List MR notes | -- |
| POST | /projects/:id/merge_requests/:iid/notes | Create MR note | -- |
| GET | /projects/:id/snippets/:sid/notes | List snippet notes | -- |
Gitlore uses the Discussions API (threaded) instead of the flat Notes API. Notes are extracted from discussion responses.
Other APIs Used
| Method | Endpoint | Description | Status |
|---|---|---|---|
| GET | /user | Current authenticated user | USED |
| GET | /projects/:path | Get project by path | USED |
| GET | /version | GitLab instance version | USED |
CLI Command ↔ API Endpoint Mapping
How each CLI command maps to GitLab API calls and local database operations.

Issue ingestion (lore ingest issues):

    1. GET /projects/:id/issues (paginated, cursor-based)
    2. WHERE updated_at > discussions_synced_for_updated_at
    3. GET /projects/:id/issues/:iid/discussions (parallel prefetch)
    Tables: issues, labels, issue_labels, discussions, notes, raw_payloads

MR ingestion (lore ingest mrs):

    1. GET /projects/:id/merge_requests (paginated, cursor-based)
    2. WHERE updated_at > discussions_synced_for_updated_at
    3. GET /projects/:id/merge_requests/:iid/discussions (parallel prefetch)
    Tables: merge_requests, labels, mr_labels, mr_assignees, mr_reviewers, discussions, notes, raw_payloads

Local queries:

    SELECT ... FROM issues/merge_requests with filters (no API call)
    SELECT ... WHERE iid = ? + join discussions/notes (no API call)

Other API calls:

    GET /api/v4/user
    GET /api/v4/user + GET /api/v4/version
    GET /api/v4/projects/:path

API Capabilities NOT Used by Gitlore
- Create/update/delete issues
- Create/update/delete MRs
- Merge MRs
- Create/reply to discussions
- Resolve/unresolve threads
- Approve MRs

- Single issue/MR fetch (uses list with filters instead)
- MR commits, diffs, pipelines
- Issue/MR participants
- Time tracking stats
- Related MRs / closed-by
- Labels API (extracted from issue/MR responses)
- Milestones API (extracted from issue responses)
- Flat Notes API (uses threaded Discussions API)
- Snippets, Epics, Commits discussions
- Webhooks, CI/CD, Pipelines, Deployments
Database Schema
SQLite with WAL mode. 12 tables across 6 migrations.
Entity Relationship

    projects
     │
     ├──< issues ──< issue_labels >── labels
     │      │
     │      └──< discussions ──< notes
     │
     ├──< merge_requests ──< mr_labels >── labels
     │      │
     │      ├──< mr_reviewers
     │      ├──< mr_assignees
     │      └──< discussions ──< notes
     │
     ├──< raw_payloads
     ├──< sync_cursors
     └── sync_runs, app_locks, schema_version

Table Details
Ingestion Pipeline
Three-Phase Architecture
1. Paginated API fetch with cursor-based sync. Stores raw payloads + normalized rows.
2. SQL query: which issues/MRs need their discussions refreshed?
3. Parallel prefetch + serial write. Full-refresh per parent entity.
Cursor-Based Incremental Sync

    Cursor State: (updated_at_cursor: i64, tie_breaker_id: i64)

    First sync:
      updated_after = (now - backfillDays)

    Subsequent syncs:
      updated_after = cursor.updated_at_cursor - cursorRewindSeconds

      For each fetched resource:
        if (updated_at, gitlab_id) <= (cursor.updated_at_cursor, cursor.tie_breaker_id):
          SKIP (already processed in overlap zone)
        else:
          UPSERT into database

      After each page boundary:
        UPDATE sync_cursors (crash recovery safe)

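
The overlap-zone check above amounts to a lexicographic comparison on (updated_at, id). A minimal sketch follows, assuming timestamps are Unix seconds (the source only says `i64`); the struct and function names are hypothetical.

```rust
/// Sync cursor: the last processed (updated_at, id) pair.
/// Field order matters: the derived ordering compares `updated_at` first,
/// then `tie_breaker_id`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Cursor {
    pub updated_at: i64,
    pub tie_breaker_id: i64,
}

/// Value for the `updated_after` query parameter.
/// Assumption: all timestamps are Unix seconds.
pub fn updated_after(cursor: Option<Cursor>, now: i64, backfill_days: i64, rewind_secs: i64) -> i64 {
    match cursor {
        // First sync: go back `backfill_days` from now.
        None => now - backfill_days * 86_400,
        // Later syncs: rewind slightly so boundary records are not missed.
        Some(c) => c.updated_at - rewind_secs,
    }
}

/// True when a fetched record falls at or before the cursor, which means it
/// was already processed and only reappeared because of the rewind overlap.
pub fn should_skip(cursor: Option<Cursor>, updated_at: i64, gitlab_id: i64) -> bool {
    match cursor {
        None => false,
        Some(c) => Cursor { updated_at, tie_breaker_id: gitlab_id } <= c,
    }
}
```

The tie-breaker only matters when several records share the same `updated_at`; without it, a crash mid-page could cause records with an equal timestamp to be skipped on resume.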
Discussion Sync Strategy

    For each issue/MR where updated_at > discussions_synced_for_updated_at:

      1. PREFETCH (parallel, configurable concurrency):
           GET /projects/:id/issues/:iid/discussions (all pages)

      2. WRITE (serial, inside transaction):
           DELETE FROM notes WHERE discussion_id IN (...)
           DELETE FROM discussions WHERE issue_id = ?
           INSERT discussions + notes (fresh data)
           UPDATE issues SET discussions_synced_for_updated_at = updated_at

Full-refresh avoids complexity of detecting deleted/edited notes. Trade-off: more API calls for heavily-discussed items.
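
The watermark test that selects parents for refresh can be sketched as a simple filter. The `Parent` struct is hypothetical, and treating a missing watermark as "never synced" is an assumption not stated in the source.

```rust
/// Minimal view of an issue or MR row for discussion scheduling.
pub struct Parent {
    pub iid: i64,
    pub updated_at: i64,
    /// Watermark: the `updated_at` value at the time discussions were last synced.
    pub discussions_synced_for_updated_at: Option<i64>,
}

/// Return the iids whose discussions need a full refresh.
pub fn needing_discussion_sync(parents: &[Parent]) -> Vec<i64> {
    parents
        .iter()
        .filter(|p| match p.discussions_synced_for_updated_at {
            // Assumption: no watermark means discussions were never synced.
            None => true,
            // Refresh only when the parent changed after the last sync.
            Some(watermark) => p.updated_at > watermark,
        })
        .map(|p| p.iid)
        .collect()
}
```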
Rate Limiting

    RateLimiter {
      min_interval: 100ms (= 1s / 10 req/s)
      jitter: 0-50ms random

      acquire():
        elapsed = now - last_request
        if elapsed < min_interval:
          sleep(min_interval - elapsed + random_jitter)
        last_request = now
    }

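
The delay computation in `acquire()` can be isolated as a pure function, which keeps it testable without sleeping. This is a sketch under assumptions: the function names are hypothetical, and the jitter is passed in by the caller rather than drawn from a random source.

```rust
use std::time::Duration;

/// Minimum spacing between requests for a given rate.
/// Panics if `requests_per_second` is zero.
pub fn min_interval_for(requests_per_second: u32) -> Duration {
    Duration::from_secs(1) / requests_per_second
}

/// How long to sleep before the next request.
/// `elapsed` is the time since the previous request; `jitter` is a value the
/// caller has already sampled from the 0-50ms range.
pub fn wait_before_request(elapsed: Duration, min_interval: Duration, jitter: Duration) -> Duration {
    if elapsed < min_interval {
        // Too soon: wait out the remainder of the interval, plus jitter.
        min_interval - elapsed + jitter
    } else {
        // Enough time has passed: no wait, and no jitter is applied.
        Duration::ZERO
    }
}
```

As in the pseudocode, jitter is only added when a sleep is needed at all, so a caller that is already slower than 10 req/s pays no extra latency.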
Pagination
Async stream-based. Fallback chain for next-page detection:
1. Link header (RFC 8288): parse rel="next"
2. x-next-page header: direct page number
3. Full-page heuristic: if response has 100 items, assume more pages
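
The fallback chain above can be sketched as one function that tries each signal in order. All names here are hypothetical, and the Link parser is deliberately simplified: it splits on commas and semicolons, so it would mis-parse a URL that itself contains a comma.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum NextPage {
    /// Absolute URL taken from the Link header.
    Url(String),
    /// Page number to request next.
    Number(u32),
}

/// Extract the rel="next" target from a Link header value, if present.
fn link_next(link_header: &str) -> Option<String> {
    // Each comma-separated entry looks like: <https://...>; rel="next"
    for entry in link_header.split(',') {
        let mut parts = entry.split(';');
        let target = parts.next()?.trim();
        let is_next = parts.any(|p| p.trim() == "rel=\"next\"");
        if is_next && target.starts_with('<') && target.ends_with('>') {
            return Some(target[1..target.len() - 1].to_string());
        }
    }
    None
}

/// Apply the three-step fallback chain in order.
pub fn next_page(
    link: Option<&str>,
    x_next_page: Option<&str>,
    items: usize,
    current_page: u32,
    per_page: usize,
) -> Option<NextPage> {
    // 1. Link header with rel="next".
    if let Some(url) = link.and_then(link_next) {
        return Some(NextPage::Url(url));
    }
    // 2. x-next-page header; an empty or non-numeric value falls through.
    if let Some(n) = x_next_page.and_then(|v| v.trim().parse::<u32>().ok()) {
        return Some(NextPage::Number(n));
    }
    // 3. Full-page heuristic: a full page suggests there may be more.
    if per_page > 0 && items >= per_page {
        return Some(NextPage::Number(current_page + 1));
    }
    None
}
```

The heuristic in step 3 can cost one extra request when the final page happens to be exactly full; the follow-up request then returns an empty page and the stream ends.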
Raw Payload Storage
JSON bytes → SHA-256 dedup check → gzip compression (if enabled) → BLOB storage
UNIQUE constraint on (project_id, resource_type, gitlab_id, payload_hash) prevents storing identical payloads.
Concurrency Model
Primary fetch: Single-threaded async stream. Rate-limited. Each page written in a transaction. Cursor updated at page boundaries.
Discussion sync: Parallel prefetch (configurable, default 2 concurrent). Serial write phase to avoid DB contention. Each parent entity is one transaction.
Single-Flight Lock

    AppLock (database-enforced mutex):
      name: 'sync' (PK)
      owner: UUIDv4 (unique per process)
      heartbeat_at: updated every 30s

      Acquire:
        INSERT OR fail if row exists
        Check stale: if heartbeat age > staleLockMinutes, force-acquire

      Release:
        DELETE WHERE owner = my_uuid

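
The acquire decision above reduces to a comparison between heartbeat age and the stale threshold. A minimal sketch, assuming `heartbeat_at` and `now` are Unix seconds; the enum and function names are hypothetical.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum LockDecision {
    /// No lock row exists: insert one.
    Acquire,
    /// A row exists but its heartbeat is too old: force-acquire.
    TakeOverStale,
    /// A live process holds the lock: give up.
    Busy,
}

/// True when the last heartbeat is older than the stale threshold.
/// Assumption: both timestamps are Unix seconds.
pub fn is_stale(heartbeat_at: i64, now: i64, stale_lock_minutes: i64) -> bool {
    now - heartbeat_at > stale_lock_minutes * 60
}

/// Decide what to do given the existing lock row's heartbeat, if any.
pub fn decide(existing_heartbeat: Option<i64>, now: i64, stale_lock_minutes: i64) -> LockDecision {
    match existing_heartbeat {
        None => LockDecision::Acquire,
        Some(hb) if is_stale(hb, now, stale_lock_minutes) => LockDecision::TakeOverStale,
        Some(_) => LockDecision::Busy,
    }
}
```

With the defaults shown in the configuration (30-second heartbeat, 10-minute stale threshold), a holder would have to miss about 20 consecutive heartbeats before another process takes over.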
Progress Events
| Event | Description |
|---|---|
| IssuesFetchStarted | Beginning primary issue fetch |
| IssueFetched | Each issue processed (for progress bars) |
| IssuesFetchComplete | All pages consumed |
| DiscussionSyncStarted | Beginning discussion phase |
| DiscussionSynced | Each parent's discussions written |
| DiscussionSyncComplete | All discussions updated |

The same events exist for MRs (MrsFetchStarted, etc.).
Field-Level Coverage: API Response vs Gitlore Storage
Every field in every GitLab API response, mapped to what Gitlore does with it. Serde silently drops fields not in the Rust structs.
Field Coverage Summary
Although many fields are dropped during transformation, the raw_payloads table stores the complete original JSON response (with SHA-256 dedup and optional gzip). This means all "dropped" data is still recoverable from the blob storage without re-fetching from GitLab. The normalized tables are optimized for query patterns, not completeness.
Efficiency Analysis & Opportunities
Observations on how Gitlore could leverage the GitLab API more efficiently, and data it currently leaves on the table.
Current Efficiency Wins
-Uses updated_after + order_by=updated_at&sort=asc to only fetch changed records. Avoids full re-fetch on every sync. This is the single biggest efficiency feature.
SHA-256 hashing prevents storing identical payloads. If an issue's updated_at changes but the actual content is identical, the raw blob is deduplicated.
Only re-syncs discussions for issues/MRs whose updated_at has advanced past their discussions_synced_for_updated_at watermark. Skips unchanged entities.
Fetches discussions for multiple issues/MRs concurrently (configurable, default 2). Dramatically reduces wall-clock time for discussion sync.
Potential Inefficiencies
Every time an issue/MR is updated, ALL its discussions are re-fetched and replaced (DELETE + INSERT). For heavily-discussed items (50+ comments), this is expensive.
| Scenario | Current | Alternative |
|---|---|---|
| Issue with 100 notes gets 1 new comment | Re-fetch all 100 notes (multiple pages) | Could use GET .../notes?order_by=updated_at&updated_after=... for incremental note sync |
| MR label change (no new comments) | Re-fetch all discussions anyway | Could check user_notes_count delta or use Notes API with updated_after |
Trade-off: Full-refresh is simpler and guarantees consistency (catches edits, deletes). Incremental would miss deleted notes.
Gitlore uses page=N&per_page=100 offset pagination. GitLab supports keyset pagination for some endpoints (Issues, MRs), which is more efficient for large datasets and recommended by GitLab.

    Current: GET /projects/:id/issues?page=5&per_page=100
    Keyset:  GET /projects/:id/issues?pagination=keyset&per_page=100
             (uses Link header rel="next" with cursor)

Benefit: Keyset pagination is O(1) per page (vs O(N) for offset). GitLab recommends it for >10,000 records. Gitlore already parses Link headers, so the client-side support partially exists.
GitLab returns ETag headers on API responses. Sending If-None-Match on subsequent requests would return 304 Not Modified without consuming rate limit quota on some endpoints. Currently all requests are unconditional.
Impact: Moderate. The cursor-based sync already avoids re-fetching unchanged data, so ETag would mainly help with the discussions full-refresh scenario where nothing changed.
Gitlore extracts labels from the labels[] string array embedded in issue/MR responses. The dedicated GET /projects/:id/labels endpoint returns richer data:
| From issues response | From Labels API |
|---|---|
| Label name (string only) | name, color, description, text_color, priority, is_project_label, subscribed, open_issues_count, closed_issues_count, open_merge_requests_count |
Impact: The labels table has color and description columns but they may not be populated from the embedded string array. A single Labels API call (one request, non-paginated for most projects) would enrich the local label catalog.
Dropped Data Worth Capturing
Fields currently silently dropped that could add value to local queries:
| Field | Source | Value Proposition | Effort |
|---|---|---|---|
| user_notes_count | Issues, MRs | Could skip discussion re-sync when count hasn't changed. Quick "activity" sort without joining notes table. | Low |
| upvotes / downvotes | Issues, MRs | Engagement metrics for triage. "Most upvoted issues" is a common query. | Low |
| confidential | Issues | Security-sensitive filtering. Avoid exposing confidential issues in outputs. | Low |
| weight | Issues | Effort estimation for sprint planning (Premium/Ultimate only). | Low |
| time_stats | Issues, MRs | Time tracking data for project reporting. Already in the response, free to capture. | Low |
| has_conflicts | MRs | Identify MRs needing rebase. Useful for "stale MR" alerts. | Low |
| blocking_discussions_resolved | MRs | MR readiness indicator without joining discussions table. | Low |
| merge_commit_sha | MRs | Trace merged MRs to specific commits. Useful for git correlation. | Low |
| suggestions[] | Discussion notes | Code review suggestions with from/to content. Rich data for code review analysis. | Medium |
| task_completion_status | Issues, MRs | Track task-list checkbox progress without parsing description markdown. | Low |
| issue_type | Issues | Distinguish issues vs incidents vs test cases. | Low |
| discussion_locked | Issues, MRs | Know if new comments can be added. | Low |
Structural Optimization Opportunities
Currently stores only username for authors, assignees, reviewers. The API returns name, avatar_url, web_url, and state for every user reference. A users table could deduplicate this data and provide richer displays.

    -- Potential schema
    CREATE TABLE users (
      username TEXT PRIMARY KEY,
      name TEXT,
      gitlab_id INTEGER,
      avatar_url TEXT,
      state TEXT,            -- "active", "blocked", etc.
      last_seen_at INTEGER   -- auto-updated on encounter
    );

Cost: No additional API calls. Data is already in every issue/MR/note response. Just needs extraction during transform.
Milestones are stored for issues but the MR transformer does not extract the milestone object from MR responses, even though GitLab returns it. The merge_requests table has no milestone_id column.
Impact: Cannot query "which MRs are in milestone X?" locally. The data is in the raw payload but not indexed.
MRs store references.short and references.full, but the issue transformer drops the references object entirely. This means issues lack the cross-project reference format (e.g., group/project#42).
API Strategies Not Yet Used
Instead of polling, GitLab can push events via POST /projects/:id/hooks. Would enable near-real-time sync without rate-limit cost. Requires a listener endpoint.
GET /projects/:id/events returns a stream of all project activity. Could be used as a fast "has anything changed?" check before running expensive issue/MR sync. Much lighter than fetching full issue lists.
GitLab's GraphQL API allows requesting exactly the fields needed. Would eliminate bandwidth waste from ~50% of response fields being silently dropped. Trade-off: different pagination model, potentially less stable API surface.
Summary Verdict
Gitlore is well-optimized for its core use case (read-only local sync). The cursor-based incremental sync and raw payload archival are sophisticated. The main opportunities are:
- Capture more "free" data: Fields like user_notes_count, upvotes, and has_conflicts are already in API responses. Storing them costs zero API calls and enables richer queries.
- Discussion sync efficiency: The full-refresh strategy is the biggest source of redundant API calls. Even a simple user_notes_count comparison could skip unchanged discussions.
- Keyset pagination: A meaningful improvement for large projects (>10K issues), and Gitlore already has partial infrastructure for it.
- MR milestone parity: Low-effort gap to close with issue milestone support.