# Gitlore

Local GitLab data management with semantic search, people intelligence, and temporal analysis. Syncs issues, MRs, discussions, notes, and work item statuses from GitLab to a local SQLite database for fast, offline-capable querying, filtering, hybrid search, chronological event reconstruction, and expert discovery.

## Features

- **Local-first**: All data stored in SQLite for instant queries
- **Incremental sync**: Cursor-based sync only fetches changes since last sync
- **Full re-sync**: Reset cursors and fetch all data from scratch when needed
- **Multi-project**: Track issues and MRs across multiple GitLab projects
- **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches, work item status
- **Hybrid search**: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
- **People intelligence**: Expert discovery, workload analysis, review patterns, active discussions, and code ownership overlap
- **Timeline pipeline**: Reconstructs chronological event histories by combining search, graph traversal, and event aggregation across related entities
- **Git history linking**: Tracks merge and squash commit SHAs to connect MRs with git history
- **File change tracking**: Records which files each MR touches, enabling file-level history queries
- **Raw payload storage**: Preserves original GitLab API responses for debugging
- **Discussion threading**: Full support for issue and MR discussions including inline code review comments
- **Cross-reference tracking**: Automatic extraction of "closes" and "mentioned" relationships between MRs and issues
- **Work item status enrichment**: Fetches issue statuses (e.g., "To do", "In progress", "Done") from GitLab's GraphQL API with adaptive page sizing, color-coded display, and case-insensitive filtering
- **Resource event history**: Tracks state changes, label events, and milestone events for issues and MRs
- **Note querying**: Rich filtering over discussion notes by author, type, path, resolution status, time range, and body content
- **Discussion drift detection**: Semantic analysis of how discussions diverge from original issue intent
- **Robot mode**: Machine-readable JSON output with structured errors, meaningful exit codes, and actionable recovery steps
- **Error tolerance**: Auto-corrects common CLI mistakes (case, typos, single-dash flags, value casing) with teaching feedback
- **Observability**: Verbosity controls, JSON log format, structured metrics, and stage timing

## Installation

```bash
cargo install --path .
```

Or build a release binary without installing it:

```bash
cargo build --release
./target/release/lore --help
```

## Quick Start

```bash
# Initialize configuration (interactive)
lore init

# Verify authentication
lore auth

# Sync everything from GitLab (issues + MRs + docs + embeddings)
lore sync

# List recent issues
lore issues -n 10

# List open merge requests
lore mrs -s opened

# Show issue details
lore issues 123

# Show MR details with discussions
lore mrs 456

# Search across all indexed data
lore search "authentication bug"

# Who knows about this code area?
lore who src/features/auth/

# What is @asmith working on?
lore who @asmith

# Timeline of events related to deployments
lore timeline "deployment"

# Timeline for a specific issue
lore timeline issue:42

# Query notes by author
lore notes --author alice --since 7d

# Robot mode (machine-readable JSON)
lore -J issues -n 5 | jq .
```

## Configuration

Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lore/config.json`).

### Example Configuration

```json
{
  "gitlab": {
    "baseUrl": "https://gitlab.com",
    "tokenEnvVar": "GITLAB_TOKEN"
  },
  "projects": [
    { "path": "group/project" },
    { "path": "other-group/other-project" }
  ],
  "defaultProject": "group/project",
  "sync": {
    "backfillDays": 14,
    "staleLockMinutes": 10,
    "heartbeatIntervalSeconds": 30,
    "cursorRewindSeconds": 2,
    "primaryConcurrency": 4,
    "dependentConcurrency": 2,
    "fetchWorkItemStatus": true
  },
  "storage": {
    "compressRawPayloads": true
  },
  "embedding": {
    "provider": "ollama",
    "model": "nomic-embed-text",
    "baseUrl": "http://localhost:11434",
    "concurrency": 4
  },
  "scoring": {
    "authorWeight": 25,
    "reviewerWeight": 10,
    "noteBonus": 1,
    "authorHalfLifeDays": 180,
    "reviewerHalfLifeDays": 90,
    "noteHalfLifeDays": 45,
    "excludedUsernames": ["bot-user"]
  }
}
```

### Configuration Options

| Section | Field | Default | Description |
|---------|-------|---------|-------------|
| `gitlab` | `baseUrl` | -- | GitLab instance URL (required) |
| `gitlab` | `tokenEnvVar` | `GITLAB_TOKEN` | Environment variable containing API token |
| `projects` | `path` | -- | Project path (e.g., `group/project`) |
| *(top-level)* | `defaultProject` | none | Fallback project path used when `-p` is omitted. Must match a configured project path (exact or suffix). CLI `-p` always overrides. |
| `sync` | `backfillDays` | `14` | Days to backfill on initial sync |
| `sync` | `staleLockMinutes` | `10` | Minutes before a sync lock is considered stale |
| `sync` | `heartbeatIntervalSeconds` | `30` | Frequency of lock heartbeat updates |
| `sync` | `cursorRewindSeconds` | `2` | Seconds to rewind cursor for overlap safety |
| `sync` | `primaryConcurrency` | `4` | Concurrent GitLab requests for primary resources |
| `sync` | `dependentConcurrency` | `2` | Concurrent requests for dependent resources |
| `sync` | `fetchWorkItemStatus` | `true` | Enrich issues with work item status via GraphQL (requires GitLab Premium/Ultimate) |
| `storage` | `dbPath` | `~/.local/share/lore/lore.db` | Database file path |
| `storage` | `backupDir` | `~/.local/share/lore/backups` | Backup directory |
| `storage` | `compressRawPayloads` | `true` | Compress stored API responses with gzip |
| `embedding` | `provider` | `ollama` | Embedding provider |
| `embedding` | `model` | `nomic-embed-text` | Model name for embeddings |
| `embedding` | `baseUrl` | `http://localhost:11434` | Ollama server URL |
| `embedding` | `concurrency` | `4` | Concurrent embedding requests |
| `scoring` | `authorWeight` | `25` | Points per MR where the user authored code touching the path |
| `scoring` | `reviewerWeight` | `10` | Points per MR where the user reviewed code touching the path |
| `scoring` | `noteBonus` | `1` | Bonus per inline review comment (DiffNote) |
| `scoring` | `reviewerAssignmentWeight` | `3` | Points per MR where the user was assigned as reviewer |
| `scoring` | `authorHalfLifeDays` | `180` | Half-life in days for author contribution decay |
| `scoring` | `reviewerHalfLifeDays` | `90` | Half-life in days for reviewer contribution decay |
| `scoring` | `noteHalfLifeDays` | `45` | Half-life in days for note/comment decay |
| `scoring` | `closedMrMultiplier` | `0.5` | Score multiplier for closed (not merged) MRs |
| `scoring` | `excludedUsernames` | `[]` | Usernames excluded from expert results (e.g., bots) |

### Config File Resolution

The config file is resolved in this order:
1. `--config` / `-c` CLI flag
2. `LORE_CONFIG_PATH` environment variable
3. `~/.config/lore/config.json` (XDG default)
4. `./lore.config.json` (local fallback for development)
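
The resolution order above can be sketched as a small pure function. This is an illustration in Python, not Gitlore's actual code: the name `resolve_config_path` and the injectable `exists` parameter are hypothetical. The sketch also assumes that an explicit flag or environment variable wins without an existence check, while the two default locations are only used when the file is present.

```python
import os
from pathlib import Path


def resolve_config_path(cli_flag=None, env=None, cwd=None, exists=os.path.exists):
    """Return the config path to load, or None if no candidate applies."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # 1. --config / -c takes priority over everything else.
    if cli_flag:
        return Path(cli_flag)

    # 2. LORE_CONFIG_PATH environment variable.
    if env.get("LORE_CONFIG_PATH"):
        return Path(env["LORE_CONFIG_PATH"])

    # 3. XDG default: $XDG_CONFIG_HOME/lore/config.json, else ~/.config/lore/config.json.
    if env.get("XDG_CONFIG_HOME"):
        config_home = Path(env["XDG_CONFIG_HOME"])
    else:
        config_home = Path.home() / ".config"
    xdg_default = config_home / "lore" / "config.json"
    if exists(xdg_default):  # assumption: defaults are used only if present
        return xdg_default

    # 4. Local fallback for development.
    local = cwd / "lore.config.json"
    if exists(local):
        return local

    return None
```

Passing `exists` as a parameter keeps the sketch testable without touching the real filesystem.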

### GitLab Token

Create a personal access token with `read_api` scope:

1. Go to GitLab > Settings > Access Tokens
2. Create token with `read_api` scope
3. Export it: `export GITLAB_TOKEN=glpat-xxxxxxxxxxxx`

## Environment Variables

| Variable | Purpose | Required |
|----------|---------|----------|
| `GITLAB_TOKEN` | GitLab API authentication token (name configurable via `gitlab.tokenEnvVar`) | Yes |
| `LORE_CONFIG_PATH` | Override config file location | No |
| `LORE_ROBOT` | Enable robot mode globally (set to `true` or `1`) | No |
| `XDG_CONFIG_HOME` | XDG Base Directory for config (fallback: `~/.config`) | No |
| `XDG_DATA_HOME` | XDG Base Directory for data (fallback: `~/.local/share`) | No |
| `NO_COLOR` | Disable color output when set (any value) | No |
| `CLICOLOR` | Standard color control (0 to disable) | No |
| `RUST_LOG` | Logging level filter (e.g., `lore=debug`) | No |

## Commands

### `lore issues`

Query issues from the local database, or show a specific issue.

```bash
lore issues                           # Recent issues (default 50)
lore issues 123                       # Show issue #123 with discussions
lore issues 123 -p group/repo        # Disambiguate by project
lore issues -n 100                    # More results
lore issues -s opened                 # Only open issues
lore issues -s closed                 # Only closed issues
lore issues -a username               # By author (@ prefix optional)
lore issues -A username               # By assignee (@ prefix optional)
lore issues -l bug                    # By label (AND logic)
lore issues -l bug -l urgent          # Multiple labels
lore issues -m "v1.0"                 # By milestone title
lore issues --since 7d               # Updated in last 7 days
lore issues --since 2w               # Updated in last 2 weeks
lore issues --since 1m               # Updated in last month
lore issues --since 2024-01-01       # Updated since date
lore issues --due-before 2024-12-31  # Due before date
lore issues --has-due                 # Only issues with due dates
lore issues --status "In progress"   # By work item status (case-insensitive)
lore issues --status "To do" --status "In progress"  # Multiple statuses (OR)
lore issues -p group/repo            # Filter by project
lore issues --sort created --asc     # Sort by created date, ascending
lore issues -o                        # Open first result in browser

# Field selection (robot mode)
lore -J issues --fields minimal       # Compact: iid, title, state, updated_at_iso
lore -J issues --fields iid,title,labels,state  # Custom fields
```

When listing, output includes: IID, title, state, status (when any issue has one), assignee, labels, and update time. Status values display with their configured color. In robot mode, the `--fields` flag controls which fields appear in the JSON response.

When showing a single issue (e.g., `lore issues 123`), output includes: title, description, state, work item status (with color and category), author, assignees, labels, milestone, due date, web URL, and threaded discussions.

#### Project Resolution

When `-p` / `--project` is omitted, the `defaultProject` from config is used as a fallback. If neither is set, results span all configured projects. A specified project (via `-p` or the config default) is resolved with the same cascading match logic in every command:

1. **Exact match**: `group/project`
2. **Case-insensitive**: `Group/Project`
3. **Suffix match**: `project` matches `group/project` (if unambiguous)
4. **Substring match**: `typescript` matches `vs/typescript-code` (if unambiguous)

If multiple projects match, an error lists the candidates with a hint to use the full path.
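
A minimal Python sketch of this cascade follows. It is not Gitlore's implementation: `resolve_project` and `AmbiguousProject` are hypothetical names, and the sketch assumes that suffix matching happens on a `/` boundary, that suffix and substring matching are case-insensitive, and that ambiguity at any tier raises immediately instead of falling through to the next tier.

```python
class AmbiguousProject(Exception):
    """Raised when a query matches more than one configured project."""

    def __init__(self, query, candidates):
        hint = ", ".join(candidates)
        super().__init__(f"'{query}' matches multiple projects: {hint}; use the full path")
        self.candidates = candidates


def resolve_project(query, projects):
    """Resolve a user-supplied project string against configured paths."""
    q = query.lower()
    tiers = [
        lambda p: p == query,                   # 1. exact match
        lambda p: p.lower() == q,               # 2. case-insensitive match
        lambda p: p.lower().endswith("/" + q),  # 3. suffix match (assumed on "/" boundary)
        lambda p: q in p.lower(),               # 4. substring match
    ]
    for matches_tier in tiers:
        hits = [p for p in projects if matches_tier(p)]
        if len(hits) == 1:
            return hits[0]
        if len(hits) > 1:
            raise AmbiguousProject(query, hits)
    return None  # nothing matched at any tier
```

With projects `group/project` and `vs/typescript-code` configured, `resolve_project("typescript", ...)` falls through the first three tiers and resolves at the substring tier.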

### `lore mrs`

Query merge requests from the local database, or show a specific MR.

```bash
lore mrs                              # Recent MRs (default 50)
lore mrs 456                          # Show MR !456 with discussions
lore mrs 456 -p group/repo           # Disambiguate by project
lore mrs -n 100                       # More results
lore mrs -s opened                    # Only open MRs
lore mrs -s merged                    # Only merged MRs
lore mrs -s closed                    # Only closed MRs
lore mrs -s locked                    # Only locked MRs
lore mrs -s all                       # All states
lore mrs -a username                  # By author (@ prefix optional)
lore mrs -A username                  # By assignee (@ prefix optional)
lore mrs -r username                  # By reviewer (@ prefix optional)
lore mrs -d                           # Only draft/WIP MRs
lore mrs -D                           # Exclude draft MRs
lore mrs --target main               # By target branch
lore mrs --source feature/foo        # By source branch
lore mrs -l needs-review              # By label (AND logic)
lore mrs --since 7d                  # Updated in last 7 days
lore mrs -p group/repo               # Filter by project
lore mrs --sort created --asc        # Sort by created date, ascending
lore mrs -o                           # Open first result in browser

# Field selection (robot mode)
lore -J mrs --fields minimal          # Compact: iid, title, state, updated_at_iso
lore -J mrs --fields iid,title,draft,target_branch  # Custom fields
```

When listing, output includes: IID, title (with [DRAFT] prefix if applicable), state, author, assignee, labels, and update time.

When showing a single MR (e.g., `lore mrs 456`), output includes: title, description, state, draft status, author, assignees, reviewers, labels, source/target branches, merge status, web URL, and threaded discussions. Inline code review comments (DiffNotes) display file context in the format `[src/file.ts:45]`.

### `lore search`

Search across indexed documents using hybrid (lexical + semantic), lexical-only, or semantic-only modes.

```bash
lore search "authentication bug"              # Hybrid search (default)
lore search "login flow" --mode lexical       # FTS5 lexical only
lore search "login flow" --mode semantic      # Vector similarity only
lore search "auth" --type issue               # Filter by source type
lore search "auth" --type mr                  # MR documents only
lore search "auth" --type discussion          # Discussion documents only
lore search "auth" --type note               # Individual notes only
lore search "deploy" --author username        # Filter by author
lore search "deploy" -p group/repo           # Filter by project
lore search "deploy" --label backend          # Filter by label (AND logic)
lore search "deploy" --path src/             # Filter by file path (trailing / for prefix)
lore search "deploy" --since 7d              # Created since (7d, 2w, 1m, or YYYY-MM-DD)
lore search "deploy" --updated-since 2w      # Updated since
lore search "deploy" -n 50                    # Limit results (default 20, max 100)
lore search "deploy" --explain               # Show ranking explanation per result
lore search "deploy" --fts-mode raw          # Raw FTS5 query syntax (advanced)
```

The `--fts-mode` flag defaults to `safe`, which sanitizes user input into valid FTS5 queries with automatic fallback. FTS5 boolean operators (`AND`, `OR`, `NOT`, `NEAR`) are passed through in safe mode, so queries like `"switch AND health"` work without switching to raw mode. Use `raw` for advanced FTS5 query syntax (phrase matching, column filters, prefix queries).

A progress spinner displays during search, showing the active mode (e.g., `Searching (hybrid)...`). In robot mode, spinners are suppressed for clean JSON output.

Requires `lore generate-docs` (or `lore sync`) to have been run at least once. Semantic and hybrid modes require `lore embed` (or `lore sync`) to have generated vector embeddings via Ollama.
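
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the standard form of the technique in Python: each document scores `1 / (k + rank)` in every ranked list it appears in, and the scores are summed. The function name `rrf_fuse` is hypothetical, and `k = 60` is a commonly used constant assumed here; this README does not state which constant or weighting Gitlore uses.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is an assumption, not Gitlore's documented value).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


lexical = ["a", "b", "c"]    # e.g. FTS5 results, best first
semantic = ["b", "d", "a"]   # e.g. vector similarity results, best first
fused = rrf_fuse([lexical, semantic])
```

Documents that rank well in both lists (`b` and `a` here) rise above documents that appear in only one, which is the property that makes the fusion useful for hybrid search.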

### `lore who`

People intelligence: discover experts, analyze workloads and review patterns, surface active discussions, and find code overlap.

#### Expert Mode

Find who has expertise in a code area based on authoring and reviewing history (DiffNote analysis). Scores use exponential half-life decay so recent contributions count more than older ones. Scoring weights and half-life periods are configurable via the `scoring` config section.

```bash
lore who src/features/auth/           # Who knows about this directory?
lore who src/features/auth/login.ts   # Who knows about this file?
lore who --path README.md             # Root files need --path flag
lore who --path Makefile              # Dotless root files too
lore who src/ --since 3m              # Limit to recent 3 months
lore who src/ -p group/repo           # Scope to project
lore who src/ --explain-score         # Show per-component score breakdown
lore who src/ --as-of 30d            # Score as if "now" was 30 days ago
lore who src/ --include-bots          # Include bot users in results
```

The target is auto-detected as a path when it contains `/`. For root files without `/` (e.g., `README.md`), use the `--path` flag. Default time window: 6 months.
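
To illustrate how half-life decay affects a score, here is a sketch using the standard exponential form `weight * 0.5 ** (age_days / half_life_days)`. The exact formula Gitlore applies is not given in this README, so treat this as an assumption; the function names are hypothetical, and the weights and half-lives are the documented defaults from the `scoring` config section.

```python
DEFAULT_WEIGHTS = {"author": 25, "reviewer": 10, "note": 1}
DEFAULT_HALF_LIVES = {"author": 180, "reviewer": 90, "note": 45}


def decayed_points(weight, age_days, half_life_days):
    """Points for one contribution, halved every `half_life_days`."""
    return weight * 0.5 ** (age_days / half_life_days)


def expert_score(contributions, weights=DEFAULT_WEIGHTS, half_lives=DEFAULT_HALF_LIVES):
    """Sum decayed points over (kind, age_days) pairs.

    kind is one of "author", "reviewer", "note" (illustrative labels).
    """
    return sum(
        decayed_points(weights[kind], age_days, half_lives[kind])
        for kind, age_days in contributions
    )


# An authored MR from 180 days ago counts half as much as one merged today.
recent = decayed_points(25, 0, 180)
older = decayed_points(25, 180, 180)
```

Under these assumptions, a shorter half-life for notes than for authorship means inline comments lose influence faster than authored changes.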

#### Workload Mode

See what someone is currently working on.

```bash
lore who @asmith                      # Full workload summary
lore who @asmith -p group/repo       # Scoped to one project
```

Shows: assigned open issues, authored MRs, MRs under review, and unresolved discussions.

#### Reviews Mode

Analyze someone's code review patterns by area.

```bash
lore who @asmith --reviews            # Review activity breakdown
lore who @asmith --reviews --since 3m # Recent review patterns
```

Shows: total DiffNotes, categorized by code area with percentage breakdown.

#### Active Mode

Surface unresolved discussions needing attention.

```bash
lore who --active                     # Unresolved discussions (last 7 days)
lore who --active --since 30d        # Wider time window
lore who --active -p group/repo      # Scoped to project
```

Shows: discussion threads with participants and last activity timestamps.

#### Overlap Mode

Find who else is touching a file or directory.

```bash
lore who --overlap src/features/auth/ # Who else works here?
lore who --overlap src/lib.rs        # Single file overlap
```

Shows: users with touch counts (author vs. review), linked MR references. Default time window: 6 months.

#### Common Flags

| Flag | Description |
|------|-------------|
| `-p` / `--project` | Scope to a project (fuzzy match) |
| `--since` | Time window (7d, 2w, 6m, YYYY-MM-DD). Default varies by mode. |
| `-n` / `--limit` | Max results per section (1-500, default 20) |
| `--all-history` | Remove the default time window, query all history |
| `--detail` | Show per-MR detail breakdown (expert mode only) |
| `--explain-score` | Show per-component score breakdown (expert mode only) |
| `--as-of` | Score as if "now" is a past date (ISO 8601 or duration like 30d, expert mode only) |
| `--include-bots` | Include bot users normally excluded via `scoring.excludedUsernames` |
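
The time window formats accepted by `--since` (`7d`, `2w`, `6m`, `YYYY-MM-DD`) can be read as follows. This Python sketch is only an interpretation: `parse_since` is a hypothetical helper, and treating `m` as 30 days is an assumption, since the README does not say how Gitlore measures a month.

```python
import re
from datetime import date, datetime, timedelta

# Days per unit; "m" = 30 days is an assumption, not documented behavior.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30}


def parse_since(spec, today=None):
    """Turn a relative duration or ISO date into a cutoff date."""
    today = today or date.today()
    match = re.fullmatch(r"(\d+)([dwm])", spec)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return today - timedelta(days=amount * _UNIT_DAYS[unit])
    # Otherwise expect an absolute date; raises ValueError on bad input.
    return datetime.strptime(spec, "%Y-%m-%d").date()
```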

### `lore timeline`

Reconstruct a chronological timeline of events for a keyword query or a specific issue or MR. The pipeline discovers related entities through cross-reference graph traversal and assembles a unified, time-ordered event stream.

```bash
lore timeline "deployment"                    # Search-based seeding (hybrid search)
lore timeline issue:42                        # Direct entity seeding by issue IID
lore timeline i:42                            # Shorthand for issue:42
lore timeline mr:99                           # Direct entity seeding by MR IID
lore timeline m:99                            # Shorthand for mr:99
lore timeline "auth" -p group/repo           # Scoped to a project
lore timeline "auth" --since 30d             # Only recent events
lore timeline "migration" --depth 2          # Deeper cross-reference expansion
lore timeline "migration" --no-mentions      # Skip 'mentioned' edges (reduces fan-out)
lore timeline "deploy" -n 50                 # Limit event count
lore timeline "auth" --max-seeds 5           # Fewer seed entities
```

The query can be either a search string (hybrid search finds matching entities) or an entity reference (`issue:N`, `i:N`, `mr:N`, `m:N`), which directly seeds the timeline from a specific entity and its cross-references.

#### Flags

| Flag | Default | Description |
|------|---------|-------------|
| `-p` / `--project` | all | Scope to a specific project (fuzzy match) |
| `--since` | none | Only events after this date (7d, 2w, 6m, YYYY-MM-DD) |
| `--depth` | `1` | Cross-reference expansion depth (0 = seeds only) |
| `--no-mentions` | off | Skip "mentioned" edges during expansion (reduces fan-out) |
| `-n` / `--limit` | `100` | Maximum events to display |
| `--max-seeds` | `10` | Maximum seed entities from search |
| `--max-entities` | `50` | Maximum entities discovered via cross-references |
| `--max-evidence` | `10` | Maximum evidence notes included |
| `--fields` | all | Select output fields (comma-separated, or 'minimal' preset) |

#### Pipeline Stages

Each stage displays a numbered progress spinner (e.g., `[1/3] Seeding timeline...`). In robot mode, spinners are suppressed for clean JSON output.

1. **SEED** -- Hybrid search (FTS5 lexical + Ollama vector similarity via Reciprocal Rank Fusion) identifies the most relevant issues and MRs. Falls back to lexical-only if Ollama is unavailable. Discussion notes matching the query are also discovered and attached to their parent entities.
2. **HYDRATE** -- Evidence notes are extracted: the top search-matched discussion notes with 200-character snippets explaining *why* each entity was surfaced. Matched discussions are collected as full thread candidates.
3. **EXPAND** -- Breadth-first traversal over the `entity_references` graph discovers related entities via "closes", "related", and "mentioned" references up to the configured depth. Use `--no-mentions` to exclude "mentioned" edges and reduce fan-out.
4. **COLLECT** -- Events are gathered for all discovered entities. Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, evidence notes, and full discussion threads. Events are sorted chronologically with stable tiebreaking.
5. **RENDER** -- Events are formatted as human-readable text or structured JSON (robot mode).
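
The EXPAND stage can be pictured as a depth-limited breadth-first search. The Python sketch below is a simplified model, not the actual implementation: `expand` is a hypothetical function, the `edges` dictionary stands in for the `entity_references` table, and the adjacency lists are assumed to already contain whichever edge directions should be followed.

```python
from collections import deque


def expand(seeds, edges, depth=1, include_mentions=True):
    """Discover entities reachable from the seeds within `depth` hops.

    edges: {entity: [(target_entity, reference_kind), ...]}
    Returns a list of (entity, hop_count) in discovery order, seeds excluded.
    """
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    discovered = []
    while queue:
        entity, hops = queue.popleft()
        if hops >= depth:  # depth 0 means seeds only
            continue
        for target, kind in edges.get(entity, []):
            if kind == "mentioned" and not include_mentions:
                continue  # mirrors --no-mentions
            if target in seen:
                continue
            seen.add(target)
            discovered.append((target, hops + 1))
            queue.append((target, hops + 1))
    return discovered


edges = {
    "issue:42": [("mr:99", "closes"), ("issue:7", "mentioned")],
    "mr:99": [("issue:50", "related")],
}
one_hop = expand(["issue:42"], edges, depth=1)
```

Skipping "mentioned" edges prunes whole subtrees early, which is why `--no-mentions` reduces fan-out more than a lower event limit would.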

#### Event Types

| Event | Description |
|-------|-------------|
| `Created` | Entity creation |
| `StateChanged` | State transitions (opened, closed, reopened) |
| `LabelAdded` | Label applied to entity |
| `LabelRemoved` | Label removed from entity |
| `MilestoneSet` | Milestone assigned |
| `MilestoneRemoved` | Milestone removed |
| `Merged` | MR merged (deduplicated against state events) |
| `NoteEvidence` | Discussion note matched by search, with snippet |
| `DiscussionThread` | Full discussion thread with all non-system notes |
| `CrossReferenced` | Reference to another entity |

#### Unresolved References

When graph expansion encounters cross-project references to entities not yet synced locally, these are collected as unresolved references in the output. This enables discovery of external dependencies and can inform future sync targets.

### `lore notes`

Query individual notes from discussions with rich filtering options.

```bash
lore notes                                    # List 50 most recent notes
lore notes --author alice --since 7d         # Notes by alice in last 7 days
lore notes --for-issue 42 -p group/repo      # Notes on issue #42
lore notes --for-mr 99 -p group/repo         # Notes on MR !99
lore notes --path src/ --resolution unresolved  # Unresolved diff notes in src/
lore notes --note-type DiffNote              # Only inline code review comments
lore notes --contains "TODO"                 # Substring search in note body
lore notes --include-system                  # Include system-generated notes
lore notes --since 2w --until 2024-12-31     # Time-bounded range
lore notes --sort updated --asc              # Sort by update time, ascending
lore notes --format csv                      # CSV output
lore notes --format jsonl                    # Line-delimited JSON
lore notes -o                                # Open first result in browser

# Field selection (robot mode)
lore -J notes --fields minimal               # Compact: id, author_username, body, created_at_iso
```

#### Filters

| Flag | Description |
|------|-------------|
| `-a` / `--author` | Filter by note author username |
| `--note-type` | Filter by note type (DiffNote, DiscussionNote) |
| `--contains` | Substring search in note body |
| `--note-id` | Filter by internal note ID |
| `--gitlab-note-id` | Filter by GitLab note ID |
| `--discussion-id` | Filter by discussion ID |
| `--include-system` | Include system notes (excluded by default) |
| `--for-issue` | Notes on a specific issue IID (requires `-p`) |
| `--for-mr` | Notes on a specific MR IID (requires `-p`) |
| `-p` / `--project` | Scope to a project (fuzzy match) |
| `--since` | Notes created since (7d, 2w, 1m, or YYYY-MM-DD) |
| `--until` | Notes created until (YYYY-MM-DD, inclusive end-of-day) |
| `--path` | Filter by file path (DiffNotes only; trailing `/` for prefix match) |
| `--resolution` | Filter by resolution status (`any`, `unresolved`, `resolved`) |
| `--sort` | Sort by `created` (default) or `updated` |
| `--asc` | Sort ascending (default: descending) |
| `--format` | Output format: `table` (default), `json`, `jsonl`, `csv` |
| `-o` / `--open` | Open first result in browser |

### `lore drift`

Detect discussion divergence from the original intent of an issue by comparing the semantic similarity of discussion content against the issue description.

```bash
lore drift issues 42                         # Check divergence on issue #42
lore drift issues 42 --threshold 0.6        # Higher threshold (stricter)
lore drift issues 42 -p group/repo          # Scope to project
```

### `lore sync`

Run the full sync pipeline: ingest from GitLab (including work item status enrichment via GraphQL), generate searchable documents, and compute embeddings.

```bash
lore sync                    # Full pipeline
lore sync --full             # Reset cursors, fetch everything
lore sync --force            # Override stale lock
lore sync --no-embed         # Skip embedding step
lore sync --no-docs          # Skip document regeneration
lore sync --no-events        # Skip resource event fetching
lore sync --no-file-changes  # Skip MR file change fetching
lore sync --dry-run          # Preview what would be synced
```

The sync command displays animated progress bars for each stage and outputs timing metrics on completion. In robot mode (`-J`), detailed stage timing is included in the JSON response.

### `lore ingest`

Sync data from GitLab to the local database. Only the ingestion step runs (no doc generation or embeddings). Issue ingestion includes a status enrichment phase that fetches work item statuses via the GitLab GraphQL API.

```bash
lore ingest                                    # Ingest everything (issues + MRs)
lore ingest issues                             # Issues only (includes status enrichment)
lore ingest mrs                                # MRs only
lore ingest issues -p group/repo              # Single project
lore ingest --force                            # Override stale lock
lore ingest --full                             # Full re-sync (reset cursors)
lore ingest --dry-run                          # Preview what would change
```

The `--full` flag resets sync cursors and discussion watermarks, then fetches all data from scratch. This is useful when:
- Assignee data or other fields were missing from earlier syncs
- You want to ensure complete data after schema changes
- You are troubleshooting sync issues

Status enrichment uses adaptive page sizing (100 → 50 → 25 → 10) to handle GitLab GraphQL complexity limits. It gracefully handles instances without GraphQL support or Premium/Ultimate licensing. Disable via `sync.fetchWorkItemStatus: false` in config.
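
The adaptive page sizing described above amounts to retrying with progressively smaller pages. A rough Python sketch, with hypothetical names throughout (`fetch_with_adaptive_page_size`, `ComplexityLimitExceeded`, and the `fetch_page` callable are illustrative and not part of Gitlore or any GitLab client library):

```python
class ComplexityLimitExceeded(Exception):
    """Hypothetical error for a GraphQL query rejected as too complex."""


def fetch_with_adaptive_page_size(fetch_page, page_sizes=(100, 50, 25, 10)):
    """Try each page size in order until the server accepts one.

    fetch_page: callable taking a page size and returning results,
                raising ComplexityLimitExceeded when the query is rejected.
    Returns (accepted_page_size, results).
    """
    last_error = None
    for size in page_sizes:
        try:
            return size, fetch_page(size)
        except ComplexityLimitExceeded as err:
            last_error = err  # step down to the next smaller page size
    raise ComplexityLimitExceeded("all page sizes were rejected") from last_error
```

Starting large and stepping down keeps the request count low on instances with generous limits while still making progress on stricter ones.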

### `lore generate-docs`

Extract searchable documents from ingested issues, MRs, and discussions for the FTS5 index.

```bash
lore generate-docs                    # Incremental (dirty items only)
lore generate-docs --full             # Full rebuild
lore generate-docs -p group/repo     # Single project
```

### `lore embed`

Generate vector embeddings for documents via Ollama. Requires Ollama running with the configured embedding model.

```bash
lore embed                    # Embed new/changed documents
lore embed --full             # Re-embed all documents (clears existing)
lore embed --retry-failed     # Retry previously failed embeddings
```

### `lore count`

Count entities in the local database.

```bash
lore count issues                     # Total issues
lore count mrs                        # Total MRs (with state breakdown)
lore count discussions                # Total discussions
lore count discussions --for issue   # Issue discussions only
lore count discussions --for mr      # MR discussions only
lore count notes                      # Total notes (system vs user breakdown)
lore count notes --for issue         # Issue notes only
lore count events                     # Total resource events
lore count events --for issue        # Issue events only
lore count events --for mr           # MR events only
```

### `lore stats`

Show document and index statistics, with optional integrity checks.

```bash
lore stats                    # Document and index statistics
lore stats --check            # Run integrity checks
lore stats --check --repair   # Repair integrity issues
lore stats --dry-run          # Preview repairs without saving
```

### `lore status`

Show current sync state and watermarks.

```bash
lore status
```

Displays:
- Last sync run details (status, timing)
- Cursor positions per project and resource type (issues and MRs)
- Data summary counts

### `lore init`

Initialize configuration and database interactively.

```bash
lore init                    # Interactive setup
lore init --force            # Overwrite existing config
lore init --non-interactive  # Fail if prompts needed
```

When multiple projects are configured, `init` prompts whether to set a default project (used when `-p` is omitted). This can also be set via the `--default-project` flag.

In robot mode, `init` supports non-interactive setup via flags:

```bash
lore -J init --gitlab-url https://gitlab.com \
  --token-env-var GITLAB_TOKEN \
  --projects "group/project,other/project" \
  --default-project group/project
```

### `lore auth`

Verify GitLab authentication is working.

```bash
lore auth
# Authenticated as @username (Full Name)
# GitLab: https://gitlab.com
```

### `lore doctor`

Check environment health and configuration.

```bash
lore doctor
```

Checks performed:
- Config file existence and validity
- Database existence and pragmas (WAL mode, foreign keys)
- GitLab authentication
- Project accessibility
- Ollama connectivity (optional)

### `lore migrate`

Run pending database migrations.

```bash
lore migrate
```

### `lore health`

Quick pre-flight check for config, database, and schema version. Exits 0 if healthy, 19 if unhealthy.

```bash
lore health
```

Useful as a fast gate before running queries or syncs. For a more thorough check including authentication and project access, use `lore doctor`.

### `lore robot-docs`

Machine-readable command manifest for agent self-discovery. Returns a JSON schema of all commands, flags, exit codes, and example workflows.

```bash
lore robot-docs                   # Pretty-printed JSON
lore --robot robot-docs           # Compact JSON for parsing
lore robot-docs --brief           # Omit response_schema (~60% smaller)
```

### `lore version`

Show version information including the git commit hash.

```bash
lore version
# lore version 0.1.0 (abc1234)
```

## Robot Mode

Machine-readable JSON output for scripting and AI agent consumption. All responses use compact (single-line) JSON with a uniform envelope and timing metadata.

### Activation

```bash
# Global flag
lore --robot issues -n 5

# JSON shorthand (-J)
lore -J issues -n 5

# Environment variable
LORE_ROBOT=1 lore issues -n 5

# Auto-detection (when stdout is not a TTY)
lore issues -n 5 | jq .
```

### Response Format

All commands return a consistent JSON envelope to stdout:

```json
{"ok":true,"data":{...},"meta":{"elapsed_ms":42}}
```

Every response includes `meta.elapsed_ms` (wall-clock milliseconds for the command).

Errors return structured JSON to stderr with machine-actionable recovery steps:

```json
{"error":{"code":"CONFIG_NOT_FOUND","message":"...","suggestion":"Run 'lore init'","actions":["lore init"]}}
```

The `actions` array contains executable shell commands an agent can run to recover from the error. It is omitted when empty (e.g., for generic I/O errors).
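
A consumer of this format might handle both envelopes along the following lines. This is a sketch under stated assumptions: `parse_robot_output` and `LoreError` are hypothetical names, and the code assumes stderr carries one JSON object per line, so that warnings and errors can be told apart by their top-level key.

```python
import json


class LoreError(Exception):
    """Structured error decoded from robot-mode stderr."""

    def __init__(self, code, message, suggestion=None, actions=()):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.suggestion = suggestion
        self.actions = list(actions)  # empty when the key is omitted


def parse_robot_output(stdout_text, stderr_text=""):
    """Return (data, elapsed_ms) on success, raise LoreError on failure."""
    for line in stderr_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON log lines
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue
        err = payload.get("error") if isinstance(payload, dict) else None
        if err:
            raise LoreError(
                err.get("code", "UNKNOWN"),
                err.get("message", ""),
                err.get("suggestion"),
                err.get("actions", []),
            )
    envelope = json.loads(stdout_text)
    if not envelope.get("ok"):
        raise LoreError("UNKNOWN", "response envelope did not report ok")
    return envelope["data"], envelope["meta"]["elapsed_ms"]
```

An agent can then iterate over `error.actions` and decide whether to run the suggested recovery commands.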

### Field Selection

The `--fields` flag controls which fields appear in the JSON response, reducing token usage for AI agent workflows. Supported on `issues`, `mrs`, `notes`, `search`, `timeline`, and `who` list commands:

```bash
# Minimal preset (~60% fewer tokens)
lore -J issues --fields minimal

# Custom field list
lore -J issues --fields iid,title,state,labels,updated_at_iso

# Available presets
#   minimal: iid, title, state, updated_at_iso
```

Valid fields for issues: `iid`, `title`, `state`, `author_username`, `labels`, `assignees`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at_iso`

Valid fields for MRs: `iid`, `title`, `state`, `author_username`, `labels`, `draft`, `target_branch`, `source_branch`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `reviewers`
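
When an agent assembles the flag programmatically, it can help to check field names before invoking the CLI. The sketch below uses the issue field list above; `fields_args` is a hypothetical helper, and treating `minimal` as the only preset is an assumption based on the example shown.

```python
ISSUE_FIELDS = {
    "iid", "title", "state", "author_username", "labels", "assignees",
    "discussion_count", "unresolved_count", "created_at_iso",
    "updated_at_iso", "web_url", "project_path", "status_name",
    "status_category", "status_color", "status_icon_name",
    "status_synced_at_iso",
}
PRESETS = {"minimal"}  # assumption: only the preset documented above

def fields_args(fields, valid=ISSUE_FIELDS):
    """Build ['--fields', 'a,b,c'] after checking every name is valid."""
    if isinstance(fields, str):
        # A bare string is treated as a preset name and passed through.
        if fields not in PRESETS:
            raise ValueError(f"unknown preset: {fields}")
        return ["--fields", fields]
    unknown = [name for name in fields if name not in valid]
    if unknown:
        raise ValueError(f"unknown field(s): {', '.join(unknown)}")
    return ["--fields", ",".join(fields)]
```

For example, `["lore", "-J", "issues", *fields_args(["iid", "title"])]` yields an argument list suitable for `subprocess.run`.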

### Error Tolerance

The CLI auto-corrects common mistakes before parsing, emitting a teaching note to stderr. Corrections work in both human and robot modes:

| Correction | Example | Mode |
|-----------|---------|------|
| Single-dash long flag | `-robot` -> `--robot` | All |
| Case normalization | `--Robot` -> `--robot` | All |
| Flag prefix expansion | `--proj` -> `--project` (unambiguous only) | All |
| Fuzzy flag match | `--projct` -> `--project` | All (threshold 0.9 in robot, 0.8 in human) |
| Subcommand alias | `merge_requests` -> `mrs`, `robotdocs` -> `robot-docs` | All |
| Value normalization | `--state Opened` -> `--state opened` | All |
| Value fuzzy match | `--state opend` -> `--state opened` | All |
| Subcommand prefix | `lore iss` -> `lore issues` (unambiguous only, via clap) | All |

In robot mode, corrections emit structured JSON to stderr:

```json
{"warning":{"type":"ARG_CORRECTED","corrections":[...],"teaching":["Use double-dash for long flags: --robot (not -robot)"]}}
```

When a command or flag is still unrecognized after corrections, the error response includes a fuzzy suggestion and, for enum-like flags, lists valid values:

```json
{"error":{"code":"UNKNOWN_COMMAND","message":"...","suggestion":"Did you mean 'lore issues'? Example: lore --robot issues -n 10. Run 'lore robot-docs' for all commands"}}
```
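
To illustrate how a stricter robot-mode threshold changes which typos get corrected, here is a sketch of threshold-based fuzzy flag matching. The similarity metric used by `lore` is not stated in this document; Python's `difflib.SequenceMatcher` ratio is a stand-in, and the flag list is only the subset mentioned in this README.

```python
from difflib import SequenceMatcher

# Flags that appear in this README; the real CLI has more.
KNOWN_FLAGS = ["--project", "--robot", "--state", "--fields", "--color"]

def fuzzy_flag(token, robot, known=KNOWN_FLAGS):
    """Return the corrected flag, or None if nothing is close enough."""
    token = token.lower()  # case normalization, e.g. --Robot -> --robot
    if token in known:
        return token
    # Score every known flag and keep the most similar one.
    scored = [(SequenceMatcher(None, token, flag).ratio(), flag) for flag in known]
    score, best = max(scored)
    threshold = 0.9 if robot else 0.8  # thresholds from the table above
    return best if score >= threshold else None
```

With this stand-in metric, a near miss such as `--projct` clears both thresholds, while a transposition such as `--porject` is only accepted under the more lenient human-mode threshold.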

### Command Aliases

Commands accept aliases for common variations:

| Primary | Aliases |
|---------|---------|
| `issues` | `issue` |
| `mrs` | `mr`, `merge-requests`, `merge-request` |
| `notes` | `note` |
| `search` | `find`, `query` |
| `stats` | `stat` |
| `status` | `st` |

Unambiguous prefixes also work via subcommand inference (e.g., `lore iss` -> `lore issues`, `lore time` -> `lore timeline`).
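
The lookup can be pictured as an alias table followed by unambiguous-prefix matching. This is a sketch, not the CLI's actual implementation (which relies on clap): the command list is a subset drawn from this README, and checking aliases before prefixes is an assumption.

```python
ALIASES = {
    "issue": "issues",
    "mr": "mrs", "merge-requests": "mrs", "merge-request": "mrs",
    "note": "notes",
    "find": "search", "query": "search",
    "stat": "stats",
    "st": "status",
}
# Subset of primary commands mentioned in this README.
COMMANDS = ["issues", "mrs", "notes", "search", "stats", "status", "timeline", "who"]

def resolve_command(token):
    """Map a token to a primary command, or None if unknown or ambiguous."""
    if token in COMMANDS:
        return token
    if token in ALIASES:
        return ALIASES[token]
    # Prefix inference only applies when exactly one command matches.
    matches = [cmd for cmd in COMMANDS if cmd.startswith(token)]
    return matches[0] if len(matches) == 1 else None
```

Note that `st` resolves to `status` through the alias table even though it is a prefix of both `stats` and `status`, whereas `sta` has no alias and stays ambiguous.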

### Agent Self-Discovery

The `robot-docs` command provides a complete machine-readable manifest including response schemas for every command:

```bash
lore robot-docs | jq '.data.commands.issues.response_schema'
```

Each command entry includes `response_schema` describing the shape of its JSON response, `fields_presets` for commands supporting `--fields`, and copy-paste `example` invocations.
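
The same lookup as the `jq` path above could be done from a script. This sketch assumes the manifest layout implied by that path (`data.commands.<name>.response_schema`); the function name is hypothetical.

```python
import json

def response_schema(manifest_json, command):
    """Mirror the jq path .data.commands.<command>.response_schema."""
    manifest = json.loads(manifest_json)
    entry = manifest["data"]["commands"].get(command)
    if entry is None:
        raise KeyError(f"unknown command: {command}")
    # Returns None when the manifest was produced with --brief,
    # which omits response_schema.
    return entry.get("response_schema")
```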

### Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Internal error / not implemented |
| 2 | Usage error (invalid flags or arguments) |
| 3 | Config invalid |
| 4 | Token not set |
| 5 | GitLab auth failed |
| 6 | Resource not found |
| 7 | Rate limited |
| 8 | Network error |
| 9 | Database locked |
| 10 | Database error |
| 11 | Migration failed |
| 12 | I/O error |
| 13 | Transform error |
| 14 | Ollama unavailable |
| 15 | Ollama model not found |
| 16 | Embedding failed |
| 17 | Not found (entity does not exist) |
| 18 | Ambiguous match (use `-p` to specify project) |
| 19 | Health check failed |
| 20 | Config not found |
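
For scripts that branch on exit status, the table can be mirrored as an enum. The member names below are illustrative, and the choice of which codes count as transient (and therefore worth retrying) is an assumption, not documented behavior.

```python
from enum import IntEnum

class LoreExit(IntEnum):
    SUCCESS = 0
    INTERNAL_ERROR = 1
    USAGE_ERROR = 2
    CONFIG_INVALID = 3
    TOKEN_NOT_SET = 4
    GITLAB_AUTH_FAILED = 5
    RESOURCE_NOT_FOUND = 6
    RATE_LIMITED = 7
    NETWORK_ERROR = 8
    DATABASE_LOCKED = 9
    DATABASE_ERROR = 10
    MIGRATION_FAILED = 11
    IO_ERROR = 12
    TRANSFORM_ERROR = 13
    OLLAMA_UNAVAILABLE = 14
    OLLAMA_MODEL_NOT_FOUND = 15
    EMBEDDING_FAILED = 16
    NOT_FOUND = 17
    AMBIGUOUS_MATCH = 18
    HEALTH_CHECK_FAILED = 19
    CONFIG_NOT_FOUND = 20

# Assumption: these conditions look temporary, so a caller may retry them.
TRANSIENT = {LoreExit.RATE_LIMITED, LoreExit.NETWORK_ERROR, LoreExit.DATABASE_LOCKED}

def describe(code):
    """Return a symbolic name for an exit code, tolerating unknown values."""
    try:
        return LoreExit(code).name
    except ValueError:
        return f"UNKNOWN_{code}"

def should_retry(code):
    """True when the exit code is in the assumed transient set."""
    return code in TRANSIENT
```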

## Configuration Precedence

Settings are resolved in this order (highest to lowest priority):

1. CLI flags (`--robot`, `--config`, `--color`)
2. Environment variables (`LORE_ROBOT`, `GITLAB_TOKEN`, `LORE_CONFIG_PATH`)
3. Config file (`~/.config/lore/config.json`)
4. Built-in defaults
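
The ordering above amounts to "first value that is set wins". A minimal sketch, applied to the config path, which has no config-file layer because the file cannot name itself. The helper names are hypothetical.

```python
import os

DEFAULT_CONFIG = "~/.config/lore/config.json"

def first_set(*candidates):
    """Return the first candidate that is not None, highest priority first."""
    for value in candidates:
        if value is not None:
            return value
    return None

def resolve_config_path(cli_path=None, env=None):
    """CLI flag, then LORE_CONFIG_PATH, then the built-in default."""
    env = os.environ if env is None else env
    path = first_set(cli_path, env.get("LORE_CONFIG_PATH"), DEFAULT_CONFIG)
    return os.path.expanduser(path)
```

For a setting that can also live in the config file, the same helper would take four candidates: `first_set(cli_value, env_value, file_value, default)`.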

## Global Options

```bash
lore -c /path/to/config.json <command>   # Use alternate config
lore --robot <command>                    # Machine-readable JSON
lore -J <command>                         # JSON shorthand
lore --color never <command>              # Disable color output
lore --color always <command>             # Force color output
lore -q <command>                         # Suppress non-essential output
lore -v <command>                         # Debug logging
lore -vv <command>                        # More verbose debug logging
lore -vvv <command>                       # Trace-level logging
lore --log-format json <command>          # JSON-formatted log output to stderr
```

Color output respects `NO_COLOR` and `CLICOLOR` environment variables in `auto` mode (the default).
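
The exact rules `lore` applies are not spelled out here. The sketch below follows the common conventions for these variables (a non-empty `NO_COLOR` disables color, `CLICOLOR=0` disables color) and should be read as an assumption about `auto` mode, not a description of the implementation.

```python
def use_color(mode, env, is_tty):
    """Decide whether to emit color for --color always|never|auto."""
    if mode == "always":
        return True
    if mode == "never":
        return False
    # auto mode: environment variables can veto color.
    if env.get("NO_COLOR"):  # set and non-empty
        return False
    if env.get("CLICOLOR") == "0":
        return False
    # Otherwise color only when writing to a terminal.
    return is_tty
```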

## Shell Completions

Generate shell completions for tab-completion support:

```bash
# Bash (add to ~/.bashrc)
lore completions bash > ~/.local/share/bash-completion/completions/lore

# Zsh (add to ~/.zshrc: fpath=(~/.zfunc $fpath))
lore completions zsh > ~/.zfunc/_lore

# Fish
lore completions fish > ~/.config/fish/completions/lore.fish

# PowerShell (add to $PROFILE)
lore completions powershell >> $PROFILE
```

## Database Schema

Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:

| Table | Purpose |
|-------|---------|
| `projects` | Tracked GitLab projects with metadata |
| `issues` | Issue metadata (title, state, author, due date, milestone, work item status) |
| `merge_requests` | MR metadata (title, state, draft, branches, merge status, commit SHAs) |
| `milestones` | Project milestones with state and due dates |
| `labels` | Project labels with colors |
| `issue_labels` | Many-to-many issue-label relationships |
| `issue_assignees` | Many-to-many issue-assignee relationships |
| `mr_labels` | Many-to-many MR-label relationships |
| `mr_assignees` | Many-to-many MR-assignee relationships |
| `mr_reviewers` | Many-to-many MR-reviewer relationships |
| `mr_file_changes` | Files touched by each MR (path, change type, renames) |
| `discussions` | Issue/MR discussion threads |
| `notes` | Individual notes within discussions (with system note flag and DiffNote position data) |
| `resource_state_events` | Issue/MR state change history (opened, closed, merged, reopened) |
| `resource_label_events` | Label add/remove events with actor and timestamp |
| `resource_milestone_events` | Milestone add/remove events with actor and timestamp |
| `entity_references` | Cross-references between entities (MR closes issue, mentioned in, etc.) |
| `documents` | Extracted searchable text for FTS and embedding |
| `documents_fts` | FTS5 full-text search index |
| `embeddings` | Vector embeddings for semantic search |
| `dirty_sources` | Entities needing document regeneration after ingest |
| `pending_discussion_fetches` | Queue for discussion fetch operations |
| `sync_runs` | Audit trail of sync operations |
| `sync_cursors` | Cursor positions for incremental sync |
| `app_locks` | Crash-safe single-flight lock |
| `raw_payloads` | Compressed original API responses |
| `schema_version` | Migration version tracking |

The database is stored at `~/.local/share/lore/lore.db` by default (XDG-compliant).
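
Because the store is plain SQLite, it can be inspected with any SQLite client. The sketch below opens the default path read-only and checks that expected tables from the list above exist. It deliberately avoids column names, which are not documented in this section, and the helper names are hypothetical.

```python
import sqlite3
from pathlib import Path

DEFAULT_DB = Path("~/.local/share/lore/lore.db")

def open_readonly(path=DEFAULT_DB):
    """Open the database read-only so inspection cannot modify it."""
    uri = Path(path).expanduser().resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)

def missing_tables(conn, expected):
    """Return the subset of expected table names absent from the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    present = {name for (name,) in rows}
    return set(expected) - present
```

For instance, `missing_tables(open_readonly(), ["projects", "issues", "merge_requests"])` returning an empty set suggests the core tables are in place.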

## Development

```bash
# Run tests
cargo test

# Run with debug logging
RUST_LOG=lore=debug lore issues

# Run with trace logging
RUST_LOG=lore=trace lore ingest issues

# Check formatting
cargo fmt --check

# Lint
cargo clippy
```

## Tech Stack

- **Rust** (2024 edition)
- **SQLite** via rusqlite (bundled) with FTS5 and sqlite-vec
- **Ollama** for vector embeddings (nomic-embed-text)
- **clap** for CLI parsing
- **reqwest** for HTTP
- **tokio** for async runtime
- **serde** for serialization
- **tracing** for logging
- **indicatif** for progress bars

## License

MIT