docs: add comprehensive command surface analysis

Deep analysis of the full `lore` CLI command surface (34 commands across
6 categories) covering command inventory, data flow, overlap analysis,
and optimization proposals.

Document structure:
- Main consolidated doc: docs/command-surface-analysis.md (1251 lines)
- Split sections in docs/command-surface-analysis/ for navigation:
  00-overview.md      - Summary, inventory, priorities
  01-entity-commands.md   - issues, mrs, notes, search, count
  02-intelligence-commands.md - who, timeline, me, file-history, trace, related, drift
  03-pipeline-and-infra.md    - sync, ingest, generate-docs, embed, diagnostics
  04-data-flow.md     - Shared data source map, command network graph
  05-overlap-analysis.md  - Quantified overlap percentages for every command pair
  06-agent-workflows.md   - Common agent flows, round-trip costs, token profiles
  07-consolidation-proposals.md  - 5 proposals to reduce 34 commands to 29
  08-robot-optimization-proposals.md - 6 proposals for --include, --batch, --depth
  09-appendices.md    - Robot output envelope, field presets, exit codes

Key findings:
- High overlap pairs: who-workload/me (~85%), health/doctor (~90%)
- 5 consolidation proposals to reduce command count by 15%
- 6 robot-mode optimization proposals targeting agent round-trip reduction
- Full DB table mapping and data flow documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
teernisse
2026-02-27 07:31:36 -05:00
parent 439c20e713
commit 3f38b3fda7
11 changed files with 3604 additions and 0 deletions

File diff suppressed because it is too large


@@ -0,0 +1,92 @@
# Lore Command Surface Analysis — Overview
**Date:** 2026-02-26
**Version:** v0.9.1 (439c20e)

---
## Purpose
Deep analysis of the full `lore` CLI command surface: what each command does, how commands overlap, how they connect in agent workflows, and where consolidation and robot-mode optimization can reduce round trips and token waste.
## Document Map
| File | Contents | When to Read |
|---|---|---|
| **00-overview.md** | This file. Summary, inventory, priorities. | Always read first. |
| [01-entity-commands.md](01-entity-commands.md) | `issues`, `mrs`, `notes`, `search`, `count` — flags, DB tables, robot schemas | Need command reference for entity queries |
| [02-intelligence-commands.md](02-intelligence-commands.md) | `who`, `timeline`, `me`, `file-history`, `trace`, `related`, `drift` | Need command reference for intelligence/analysis |
| [03-pipeline-and-infra.md](03-pipeline-and-infra.md) | `sync`, `ingest`, `generate-docs`, `embed`, diagnostics, setup | Need command reference for data management |
| [04-data-flow.md](04-data-flow.md) | Shared data source map, command network graph, clusters | Understanding how commands interconnect |
| [05-overlap-analysis.md](05-overlap-analysis.md) | Quantified overlap percentages for every command pair | Evaluating what to consolidate |
| [06-agent-workflows.md](06-agent-workflows.md) | Common agent flows, round-trip costs, token profiles | Understanding inefficiency pain points |
| [07-consolidation-proposals.md](07-consolidation-proposals.md) | 5 proposals to reduce 34 commands to 29 | Planning command surface changes |
| [08-robot-optimization-proposals.md](08-robot-optimization-proposals.md) | 6 proposals for `--include`, `--batch`, `--depth`, etc. | Planning robot-mode improvements |
| [09-appendices.md](09-appendices.md) | Robot output envelope, field presets, exit codes | Reference material |
---
## Command Inventory (34 commands)
| Category | Commands | Count |
|---|---|---|
| Entity Query | `issues`, `mrs`, `notes`, `search`, `count` | 5 |
| Intelligence | `who` (5 modes), `timeline`, `related`, `drift`, `me`, `file-history`, `trace` | 7 (11 with who sub-modes) |
| Data Pipeline | `sync`, `ingest`, `generate-docs`, `embed` | 4 |
| Diagnostics | `health`, `auth`, `doctor`, `status`, `stats` | 5 |
| Setup | `init`, `token`, `cron`, `migrate` | 4 |
| Meta | `version`, `completions`, `robot-docs` | 3 |
---
## Key Findings
### High-Overlap Pairs
| Pair | Overlap | Recommendation |
|---|---|---|
| `who workload` vs `me` | ~85% | Workload is a strict subset of me |
| `health` vs `doctor` | ~90% | Health is a strict subset of doctor |
| `file-history` vs `trace` | ~75% | Trace is a superset minus `--merged` |
| `related` query-mode vs `search --mode semantic` | ~80% | Related query-mode is search without filters |
| `auth` vs `doctor` | ~100% of auth | Auth is fully contained within doctor |
### Agent Workflow Pain Points
| Workflow | Current Round Trips | With Optimizations |
|---|---|---|
| "Understand this issue" | 4 calls | 1 call (`--include`) |
| "Why was code changed?" | 3 calls | 1 call (`--include`) |
| "What should I work on?" | 4 calls | 2 calls |
| "Find and understand" | 4 calls | 2 calls |
| "Is system healthy?" | 2-4 calls | 1 call |
---
## Priority Ranking
| Pri | Proposal | Category | Effort | Impact |
|---|---|---|---|---|
| **P0** | `--include` flag on detail commands | Robot optimization | High | Eliminates 2-3 round trips per workflow |
| **P0** | `--depth` on `me` command | Robot optimization | Low | 60-80% token reduction on most-used command |
| **P1** | `--batch` for detail views | Robot optimization | Medium | Eliminates N+1 after search/timeline |
| **P1** | Absorb `file-history` into `trace` | Consolidation | Low | Cleaner surface, shared code |
| **P1** | Merge `who overlap` into `who expert` | Consolidation | Low | -1 round trip in review flows |
| **P2** | `context` composite command | Robot optimization | Medium | Single entry point for entity understanding |
| **P2** | Merge `count`+`status` into `stats` | Consolidation | Medium | -2 commands, progressive disclosure |
| **P2** | Absorb `auth` into `doctor` | Consolidation | Low | -1 command |
| **P2** | Remove `related` query-mode | Consolidation | Low | -1 confusing choice |
| **P3** | `--max-tokens` budget | Robot optimization | High | Flexible but complex to implement |
| **P3** | `--format tsv` | Robot optimization | Medium | High savings, limited applicability |
### Consolidation Summary
| Before | After | Removed |
|---|---|---|
| `file-history` + `trace` | `trace` (+ `--shallow`) | -1 |
| `auth` + `doctor` | `doctor` (+ `--auth`) | -1 |
| `related` query-mode | `search --mode semantic` | -1 mode |
| `who overlap` + `who expert` | `who expert` (+ touch_count) | -1 sub-mode |
| `count` + `status` + `stats` | `stats` (+ `--entities`, `--sync`) | -2 |
**Total: 34 commands -> 29 commands**


@@ -0,0 +1,308 @@
# Entity Query Commands
Reference for: `issues`, `mrs`, `notes`, `search`, `count`

---
## `issues` (alias: `issue`)
List or show issues from the local database.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `[IID]` | positional | — | Omit to list, provide to show detail |
| `-n, --limit` | int | 50 | Max results |
| `--fields` | string | — | Select output columns (preset: `minimal`) |
| `-s, --state` | enum | — | `opened\|closed\|all` |
| `-p, --project` | string | — | Filter by project (fuzzy) |
| `-a, --author` | string | — | Filter by author username |
| `-A, --assignee` | string | — | Filter by assignee username |
| `-l, --label` | string[] | — | Filter by labels (AND logic, repeatable) |
| `-m, --milestone` | string | — | Filter by milestone title |
| `--status` | string[] | — | Filter by work-item status (COLLATE NOCASE, OR logic) |
| `--since` | duration/date | — | Filter by created date (`7d`, `2w`, `YYYY-MM-DD`) |
| `--due-before` | date | — | Filter by due date |
| `--has-due` | flag | — | Show only issues with due dates |
| `--sort` | enum | `updated` | `updated\|created\|iid` |
| `--asc` | flag | — | Sort ascending |
| `-o, --open` | flag | — | Open first match in browser |
**DB tables:** `issues`, `projects`, `issue_assignees`, `issue_labels`, `labels`
**Detail mode adds:** `discussions`, `notes`, `entity_references` (closing MRs)
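
The `--since` flag above accepts either a relative duration (`7d`, `2w`) or an absolute `YYYY-MM-DD` date. The sketch below shows one way such a value could be resolved to a cutoff date. `parse_since` is a hypothetical helper, not lore's actual parser, and it assumes that only the `d` and `w` units shown in the table exist.

```python
import re
from datetime import date, datetime, timedelta

# Assumption: only the two units that appear in the flag table.
_DAYS_PER_UNIT = {"d": 1, "w": 7}


def parse_since(value: str, today: date) -> date:
    """Resolve a --since style value to the earliest date that should match."""
    match = re.fullmatch(r"(\d+)([dw])", value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return today - timedelta(days=amount * _DAYS_PER_UNIT[unit])
    # Otherwise expect an absolute date; strptime raises ValueError if malformed.
    return datetime.strptime(value, "%Y-%m-%d").date()


cutoff = parse_since("2w", today=date(2026, 2, 26))
```

Passing `today` explicitly keeps the helper deterministic, which makes it easy to check against fixed dates.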
### Robot Output (list mode)
```json
{
  "ok": true,
  "data": {
    "issues": [
      {
        "iid": 42, "title": "Fix auth", "state": "opened",
        "author_username": "jdoe", "labels": ["backend"],
        "assignees": ["jdoe"], "discussion_count": 3,
        "unresolved_count": 1, "created_at_iso": "...",
        "updated_at_iso": "...", "web_url": "...",
        "project_path": "group/repo",
        "status_name": "In progress"
      }
    ],
    "total_count": 150, "showing": 50
  },
  "meta": { "elapsed_ms": 40, "available_statuses": ["Open", "In progress", "Closed"] }
}
```
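A consumer of the list-mode envelope typically checks `ok`, unwraps `data`, and compares `showing` against `total_count` to see whether the page was truncated. The following is a minimal sketch of that handling; `unwrap` and `is_truncated` are hypothetical helper names, and the shape of a failure envelope is not documented in this section, so the error branch only assumes that `ok` is not `true`.

```python
import json


def unwrap(raw: str) -> dict:
    """Return the `data` payload of a robot-mode envelope, or raise if `ok` is not true."""
    envelope = json.loads(raw)
    if envelope.get("ok") is not True:
        raise RuntimeError(f"lore reported failure: {envelope}")
    return envelope["data"]


def is_truncated(data: dict) -> bool:
    """True when the list was cut off by --limit and more rows exist."""
    return data["showing"] < data["total_count"]


sample = (
    '{"ok": true, "data": {"issues": [], "total_count": 150, "showing": 50},'
    ' "meta": {"elapsed_ms": 40}}'
)
data = unwrap(sample)
needs_more = is_truncated(data)
```

When `needs_more` is true, an agent can either raise `--limit` or narrow the filters before re-querying.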
### Robot Output (detail mode — `issues <IID>`)
```json
{
"ok": true,
"data": {
"id": 12345, "iid": 42, "title": "Fix auth",
"description": "Full markdown body...",
"state": "opened", "author_username": "jdoe",
"created_at": "...", "updated_at": "...", "closed_at": null,
"confidential": false, "web_url": "...", "project_path": "group/repo",
"references_full": "group/repo#42",
"labels": ["backend"], "assignees": ["jdoe"],
"due_date": null, "milestone": null,
"user_notes_count": 5, "merge_requests_count": 1,
"closing_merge_requests": [
{ "iid": 99, "title": "Refactor auth", "state": "merged", "web_url": "..." }
],
"discussions": [
{
"notes": [
{ "author_username": "jdoe", "body": "...", "created_at": "...", "is_system": false }
],
"individual_note": false
}
],
"status_name": "In progress", "status_color": "#1068bf"
}
}
```
**Minimal preset:** `iid`, `title`, `state`, `updated_at_iso`

---
## `mrs` (aliases: `mr`, `merge-request`, `merge-requests`)
List or show merge requests.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `[IID]` | positional | — | Omit to list, provide to show detail |
| `-n, --limit` | int | 50 | Max results |
| `--fields` | string | — | Select output columns (preset: `minimal`) |
| `-s, --state` | enum | — | `opened\|merged\|closed\|locked\|all` |
| `-p, --project` | string | — | Filter by project |
| `-a, --author` | string | — | Filter by author |
| `-A, --assignee` | string | — | Filter by assignee |
| `-r, --reviewer` | string | — | Filter by reviewer |
| `-l, --label` | string[] | — | Filter by labels (AND) |
| `--since` | duration/date | — | Filter by created date |
| `-d, --draft` | flag | — | Draft MRs only |
| `-D, --no-draft` | flag | — | Exclude drafts |
| `--target` | string | — | Filter by target branch |
| `--source` | string | — | Filter by source branch |
| `--sort` | enum | `updated` | `updated\|created\|iid` |
| `--asc` | flag | — | Sort ascending |
| `-o, --open` | flag | — | Open in browser |
**DB tables:** `merge_requests`, `projects`, `mr_reviewers`, `mr_labels`, `labels`, `mr_assignees`
**Detail mode adds:** `discussions`, `notes`, `mr_diffs`
### Robot Output (list mode)
```json
{
"ok": true,
"data": {
"mrs": [
{
"iid": 99, "title": "Refactor auth", "state": "merged",
"draft": false, "author_username": "jdoe",
"source_branch": "feat/auth", "target_branch": "main",
"labels": ["backend"], "assignees": ["jdoe"], "reviewers": ["reviewer"],
"discussion_count": 5, "unresolved_count": 0,
"created_at_iso": "...", "updated_at_iso": "...",
"web_url": "...", "project_path": "group/repo"
}
],
"total_count": 500, "showing": 50
}
}
```
### Robot Output (detail mode — `mrs <IID>`)
```json
{
"ok": true,
"data": {
"id": 67890, "iid": 99, "title": "Refactor auth",
"description": "Full markdown body...",
"state": "merged", "draft": false, "author_username": "jdoe",
"source_branch": "feat/auth", "target_branch": "main",
"created_at": "...", "updated_at": "...",
"merged_at": "...", "closed_at": null,
"web_url": "...", "project_path": "group/repo",
"labels": ["backend"], "assignees": ["jdoe"], "reviewers": ["reviewer"],
"discussions": [
{
"notes": [
{
"author_username": "reviewer", "body": "...",
"created_at": "...", "is_system": false,
"position": { "new_path": "src/auth.rs", "new_line": 42 }
}
],
"individual_note": false
}
]
}
}
```
**Minimal preset:** `iid`, `title`, `state`, `updated_at_iso`

---
## `notes` (alias: `note`)
List discussion notes/comments with fine-grained filters.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `-n, --limit` | int | 50 | Max results |
| `--fields` | string | — | Preset: `minimal` |
| `-a, --author` | string | — | Filter by author |
| `--note-type` | enum | — | `DiffNote\|DiscussionNote` |
| `--contains` | string | — | Body text substring filter |
| `--note-id` | int | — | Internal note ID |
| `--gitlab-note-id` | int | — | GitLab note ID |
| `--discussion-id` | string | — | Discussion ID filter |
| `--include-system` | flag | — | Include system notes |
| `--for-issue` | int | — | Notes on specific issue (requires `-p`) |
| `--for-mr` | int | — | Notes on specific MR (requires `-p`) |
| `-p, --project` | string | — | Scope to project |
| `--since` | duration/date | — | Created after |
| `--until` | date | — | Created before (inclusive) |
| `--path` | string | — | File path filter (exact or prefix with `/`) |
| `--resolution` | enum | — | `any\|unresolved\|resolved` |
| `--sort` | enum | `created` | `created\|updated` |
| `--asc` | flag | — | Sort ascending |
| `--open` | flag | — | Open in browser |
**DB tables:** `notes`, `discussions`, `projects`, `issues`, `merge_requests`
### Robot Output
```json
{
"ok": true,
"data": {
"notes": [
{
"id": 1234, "gitlab_id": 56789,
"author_username": "reviewer", "body": "...",
"note_type": "DiffNote", "is_system": false,
"created_at_iso": "...", "updated_at_iso": "...",
"position_new_path": "src/auth.rs", "position_new_line": 42,
"resolvable": true, "resolved": false,
"noteable_type": "MergeRequest", "parent_iid": 99,
"parent_title": "Refactor auth", "project_path": "group/repo"
}
],
"total_count": 1000, "showing": 50
}
}
```
**Minimal preset:** `id`, `author_username`, `body`, `created_at_iso`

---
## `search` (aliases: `find`, `query`)
Semantic + full-text search across indexed documents.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `<QUERY>` | positional | required | Search query string |
| `--mode` | enum | `hybrid` | `lexical\|hybrid\|semantic` |
| `--type` | enum | — | `issue\|mr\|discussion\|note` |
| `--author` | string | — | Filter by author |
| `-p, --project` | string | — | Scope to project |
| `--label` | string[] | — | Filter by labels (AND) |
| `--path` | string | — | File path filter |
| `--since` | duration/date | — | Created after |
| `--updated-since` | duration/date | — | Updated after |
| `-n, --limit` | int | 20 | Max results (max: 100) |
| `--fields` | string | — | Preset: `minimal` |
| `--explain` | flag | — | Show ranking breakdown |
| `--fts-mode` | enum | `safe` | `safe\|raw` |
**DB tables:** `documents`, `documents_fts` (FTS5), `embeddings` (vec0), `document_labels`, `document_paths`, `projects`
**Search modes:**
- **lexical** — FTS5 with BM25 ranking (fastest, no Ollama needed)
- **hybrid** — RRF combination of lexical + semantic (default)
- **semantic** — Vector similarity only (requires Ollama)
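
Hybrid mode is described above as an RRF (reciprocal rank fusion) combination of the lexical and semantic rankings. The sketch below shows the textbook form of RRF, in which each document scores the sum of `1 / (k + rank)` over the lists it appears in. The constant `k = 60` and the function name `rrf_fuse` are assumptions; lore's actual constant, weighting, and any score normalization are not stated here.

```python
def rrf_fuse(lexical: list[int], semantic: list[int], k: int = 60) -> list[tuple[int, float]]:
    """Fuse two ranked lists of document ids with reciprocal rank fusion."""
    scores: dict[int, float] = {}
    for ranking in (lexical, semantic):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = rrf_fuse(lexical=[10, 20, 30], semantic=[20, 40, 10])
ranked_ids = [doc_id for doc_id, _ in fused]
```

Documents that rank well in both lists rise to the top, while a document that appears in only one list still receives a nonzero score.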
### Robot Output
```json
{
"ok": true,
"data": {
"query": "authentication bug",
"mode": "hybrid",
"total_results": 15,
"results": [
{
"document_id": 1234, "source_type": "issue",
"title": "Fix SSO auth", "url": "...",
"author": "jdoe", "project_path": "group/repo",
"labels": ["auth"], "paths": ["src/auth/"],
"snippet": "...matching text...",
"score": 0.85,
"explain": { "vector_rank": 2, "fts_rank": 1, "rrf_score": 0.85 }
}
],
"warnings": []
}
}
```
**Minimal preset:** `document_id`, `title`, `source_type`, `score`

---
## `count`
Count entities in the local database.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `<ENTITY>` | positional | required | `issues\|mrs\|discussions\|notes\|events\|references` |
| `-f, --for` | enum | — | Parent type: `issue\|mr` |
**DB tables:** Conditional aggregation on entity tables
### Robot Output
```json
{
  "ok": true,
  "data": {
    "entity": "merge_requests",
    "count": 1234,
    "system_excluded": 5000,
    "breakdown": { "opened": 100, "closed": 50, "merged": 1084 }
  }
}
```
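In the sample above the per-state `breakdown` sums to `count` (100 + 50 + 1084 = 1234). A consumer can use that as a sanity check, as in this small sketch. Treating the breakdown as a partition of `count` is an assumption drawn from the sample only, and `breakdown_is_consistent` is a hypothetical helper.

```python
def breakdown_is_consistent(data: dict) -> bool:
    """Check that the per-state breakdown adds up to the reported count."""
    return sum(data["breakdown"].values()) == data["count"]


sample = {
    "entity": "merge_requests",
    "count": 1234,
    "system_excluded": 5000,
    "breakdown": {"opened": 100, "closed": 50, "merged": 1084},
}
consistent = breakdown_is_consistent(sample)
```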


@@ -0,0 +1,452 @@
# Intelligence Commands
Reference for: `who`, `timeline`, `me`, `file-history`, `trace`, `related`, `drift`

---
## `who` (People Intelligence)
Five sub-modes, dispatched by argument shape.
| Mode | Trigger | Purpose |
|---|---|---|
| **expert** | `who <path>` or `who --path <path>` | Who knows about a code area? |
| **workload** | `who @username` | What is this person working on? |
| **reviews** | `who @username --reviews` | Review pattern analysis |
| **active** | `who --active` | Unresolved discussions needing attention |
| **overlap** | `who --overlap <path>` | Who else touches these files? |
### Shared Flags
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `-p, --project` | string | — | Scope to project |
| `-n, --limit` | int | varies | Max results (1-500) |
| `--fields` | string | — | Preset: `minimal` |
| `--since` | duration/date | — | Time window |
| `--include-bots` | flag | — | Include bot users |
| `--include-closed` | flag | — | Include closed issues/MRs |
| `--all-history` | flag | — | Query all history |
### Expert-Only Flags
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--detail` | flag | — | Per-MR breakdown |
| `--as-of` | date/duration | — | Score at point in time |
| `--explain-score` | flag | — | Score breakdown |
### DB Tables by Mode
| Mode | Primary Tables |
|---|---|
| expert | `notes` (INDEXED BY idx_notes_diffnote_path_created), `merge_requests`, `mr_reviewers` |
| workload | `issues`, `merge_requests`, `mr_reviewers` |
| reviews | `merge_requests`, `discussions`, `notes` |
| active | `discussions`, `notes`, `issues`, `merge_requests` |
| overlap | `notes`, `mr_file_changes`, `merge_requests` |
### Robot Output (expert)
```json
{
"ok": true,
"data": {
"mode": "expert",
"input": { "target": "src/auth/", "path": "src/auth/" },
"resolved_input": { "mode": "expert", "project_id": 1, "project_path": "group/repo" },
"result": {
"experts": [
{
"username": "jdoe", "score": 42.5,
"detail": { "mr_ids_author": [99, 101], "mr_ids_reviewer": [88] }
}
]
}
}
}
```
### Robot Output (workload)
```json
{
  "data": {
    "mode": "workload",
    "result": {
      "assigned_issues": [{ "iid": 42, "title": "Fix auth", "state": "opened" }],
      "authored_mrs": [{ "iid": 99, "title": "Refactor auth", "state": "merged" }],
      "review_mrs": [{ "iid": 88, "title": "Add SSO", "state": "opened" }]
    }
  }
}
```
### Robot Output (reviews)
```json
{
"data": {
"mode": "reviews",
"result": {
"categories": [
{
"category": "approval_rate",
"reviewers": [{ "name": "jdoe", "count": 15, "percentage": 85.0 }]
}
]
}
}
}
```
### Robot Output (active)
```json
{
"data": {
"mode": "active",
"result": {
"discussions": [
{ "entity_type": "mr", "iid": 99, "title": "Refactor auth", "participants": ["jdoe", "reviewer"] }
]
}
}
}
```
### Robot Output (overlap)
```json
{
  "data": {
    "mode": "overlap",
    "result": {
      "users": [{ "username": "jdoe", "touch_count": 15 }]
    }
  }
}
```
### Minimal Presets
| Mode | Fields |
|---|---|
| expert | `username`, `score` |
| workload | `iid`, `title`, `state` |
| reviews | `name`, `count`, `percentage` |
| active | `entity_type`, `iid`, `title`, `participants` |
| overlap | `username`, `touch_count` |
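
A field preset amounts to projecting each result record onto a fixed list of keys. The sketch below applies the presets from the table above on the client side; `project` is a hypothetical helper, and whether lore performs the projection in the same way internally is an assumption.

```python
# Field lists copied from the Minimal Presets table.
MINIMAL_PRESETS = {
    "expert": ("username", "score"),
    "workload": ("iid", "title", "state"),
    "reviews": ("name", "count", "percentage"),
    "active": ("entity_type", "iid", "title", "participants"),
    "overlap": ("username", "touch_count"),
}


def project(records: list[dict], mode: str) -> list[dict]:
    """Keep only the preset fields of each record, skipping any that are absent."""
    fields = MINIMAL_PRESETS[mode]
    return [{name: rec[name] for name in fields if name in rec} for rec in records]


experts = [{"username": "jdoe", "score": 42.5, "detail": {"mr_ids_author": [99, 101]}}]
slim = project(experts, "expert")
```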
---
## `timeline`
Reconstruct chronological event history for a topic/entity with cross-reference expansion.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `<QUERY>` | positional | required | Search text or entity ref (`issue:42`, `mr:99`) |
| `-p, --project` | string | — | Scope to project |
| `--since` | duration/date | — | Filter events after |
| `--depth` | int | 1 | Cross-ref expansion depth (0=none) |
| `--no-mentions` | flag | — | Skip "mentioned" edges, keep "closes"/"related" |
| `-n, --limit` | int | 100 | Max events |
| `--fields` | string | — | Preset: `minimal` |
| `--max-seeds` | int | 10 | Max seed entities from search |
| `--max-entities` | int | 50 | Max expanded entities |
| `--max-evidence` | int | 10 | Max evidence notes |
**Pipeline:** SEED -> HYDRATE -> EXPAND -> COLLECT -> RENDER
**DB tables:** `issues`, `merge_requests`, `discussions`, `notes`, `entity_references`, `resource_state_events`, `resource_label_events`, `resource_milestone_events`, `documents` (for search seeding)
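
The EXPAND stage, as governed by `--depth` and `--no-mentions`, can be pictured as a bounded breadth-first walk over cross-references. The sketch below is illustrative only: `expand` and the tuple layout are hypothetical, references are followed in the source-to-target direction only, and the `--max-entities` cap is ignored.

```python
from collections import deque


def expand(seeds, references, depth=1, include_mentions=True):
    """Breadth-first expansion over cross-references, up to `depth` hops from the seeds.

    `references` is a list of (source, target, reference_type) tuples.
    Returns (entity, depth, via_entity, reference_type) for each newly reached entity.
    """
    seen = set(seeds)
    expanded = []
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        entity, hops = queue.popleft()
        if hops >= depth:
            continue  # depth budget exhausted for this branch
        for source, target, ref_type in references:
            if source != entity or target in seen:
                continue
            if ref_type == "mentioned" and not include_mentions:
                continue  # mirrors --no-mentions: keep "closes"/"related" edges only
            seen.add(target)
            expanded.append((target, hops + 1, entity, ref_type))
            queue.append((target, hops + 1))
    return expanded


refs = [
    (("issue", 42), ("mr", 99), "closes"),
    (("issue", 42), ("mr", 200), "mentioned"),
    (("mr", 99), ("issue", 7), "related"),
]
one_hop = expand([("issue", 42)], refs, depth=1, include_mentions=False)
```

With `depth=0` nothing is expanded, which matches the "0=none" note in the flag table.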
### Robot Output
```json
{
"ok": true,
"data": {
"query": "authentication", "event_count": 25,
"seed_entities": [{ "type": "issue", "iid": 42, "project": "group/repo" }],
"expanded_entities": [
{
"type": "mr", "iid": 99, "project": "group/repo", "depth": 1,
"via": {
"from": { "type": "issue", "iid": 42 },
"reference_type": "closes"
}
}
],
"unresolved_references": [
{
"source": { "type": "issue", "iid": 42, "project": "group/repo" },
"target_type": "mr", "target_iid": 200, "reference_type": "mentioned"
}
],
"events": [
{
"timestamp": "2026-01-15T10:30:00Z",
"entity_type": "issue", "entity_iid": 42, "project": "group/repo",
"event_type": "state_changed", "summary": "Reopened",
"actor": "jdoe", "is_seed": true,
"evidence_notes": [{ "author": "jdoe", "snippet": "..." }]
}
]
},
"meta": {
"elapsed_ms": 150, "search_mode": "fts",
"expansion_depth": 1, "include_mentions": true,
"total_entities": 5, "total_events": 25,
"evidence_notes_included": 8, "discussion_threads_included": 3,
"unresolved_references": 1, "showing": 25
}
}
```
**Minimal preset:** `timestamp`, `type`, `entity_iid`, `detail`

---
## `me` (Personal Dashboard)
Personal work dashboard with issues, MRs, activity, and since-last-check inbox.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--issues` | flag | — | Open issues section only |
| `--mrs` | flag | — | MRs section only |
| `--activity` | flag | — | Activity feed only |
| `--since` | duration/date | `30d` | Activity window |
| `-p, --project` | string | — | Scope to one project |
| `--all` | flag | — | All synced projects |
| `--user` | string | — | Override configured username |
| `--fields` | string | — | Preset: `minimal` |
| `--reset-cursor` | flag | — | Clear since-last-check cursor |
**Sections (no flags = all):** Issues, MRs authored, MRs reviewing, Activity, Inbox
**DB tables:** `issues`, `merge_requests`, `resource_state_events`, `projects`, `issue_labels`, `mr_labels`
### Robot Output
```json
{
"ok": true,
"data": {
"username": "jdoe",
"summary": {
"project_count": 3, "open_issue_count": 5,
"authored_mr_count": 2, "reviewing_mr_count": 1,
"needs_attention_count": 3
},
"since_last_check": {
"cursor_iso": "2026-02-25T18:00:00Z",
"total_event_count": 8,
"groups": [
{
"entity_type": "issue", "entity_iid": 42,
"entity_title": "Fix auth", "project": "group/repo",
"events": [
{ "timestamp_iso": "...", "event_type": "comment",
"actor": "reviewer", "summary": "New comment" }
]
}
]
},
"open_issues": [
{
"project": "group/repo", "iid": 42, "title": "Fix auth",
"state": "opened", "attention_state": "needs_attention",
"status_name": "In progress", "labels": ["auth"],
"updated_at_iso": "..."
}
],
"open_mrs_authored": [
{
"project": "group/repo", "iid": 99, "title": "Refactor auth",
"state": "opened", "attention_state": "needs_attention",
"draft": false, "labels": ["backend"], "updated_at_iso": "..."
}
],
"reviewing_mrs": [],
"activity": [
{
"timestamp_iso": "...", "event_type": "state_changed",
"entity_type": "issue", "entity_iid": 42, "project": "group/repo",
"actor": "jdoe", "is_own": true, "summary": "Closed"
}
]
}
}
```
**Minimal presets:** Items: `iid, title, attention_state, updated_at_iso` | Activity: `timestamp_iso, event_type, entity_iid, actor`

---
## `file-history`
Show which MRs touched a file, with linked discussions.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `<PATH>` | positional | required | File path to trace |
| `-p, --project` | string | — | Scope to project |
| `--discussions` | flag | — | Include DiffNote snippets |
| `--no-follow-renames` | flag | — | Skip rename chain resolution |
| `--merged` | flag | — | Only merged MRs |
| `-n, --limit` | int | 50 | Max MRs |
**DB tables:** `mr_file_changes`, `merge_requests`, `notes` (DiffNotes), `projects`
### Robot Output
```json
{
"ok": true,
"data": {
"path": "src/auth/middleware.rs",
"rename_chain": [
{ "previous_path": "src/auth.rs", "mr_iid": 55, "merged_at": "..." }
],
"merge_requests": [
{
"iid": 99, "title": "Refactor auth", "state": "merged",
"author": "jdoe", "merged_at": "...", "change_type": "modified"
}
],
"discussions": [
{
"discussion_id": 123, "mr_iid": 99, "author": "reviewer",
"body_snippet": "...", "path": "src/auth/middleware.rs"
}
]
},
"meta": { "elapsed_ms": 30, "total_mrs": 5, "renames_followed": true }
}
```
---
## `trace`
File -> MR -> issue -> discussion chain to understand why code was introduced.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `<PATH>` | positional | required | File path (future: `:line` suffix) |
| `-p, --project` | string | — | Scope to project |
| `--discussions` | flag | — | Include DiffNote snippets |
| `--no-follow-renames` | flag | — | Skip rename chain |
| `-n, --limit` | int | 20 | Max chains |
**DB tables:** `mr_file_changes`, `merge_requests`, `issues`, `discussions`, `notes`, `entity_references`
### Robot Output
```json
{
"ok": true,
"data": {
"path": "src/auth/middleware.rs",
"resolved_paths": ["src/auth/middleware.rs", "src/auth.rs"],
"trace_chains": [
{
"mr_iid": 99, "mr_title": "Refactor auth", "mr_state": "merged",
"mr_author": "jdoe", "change_type": "modified",
"merged_at_iso": "...", "web_url": "...",
"issues": [42],
"discussions": [
{
"discussion_id": 123, "author_username": "reviewer",
"body_snippet": "...", "path": "src/auth/middleware.rs"
}
]
}
]
},
"meta": { "tier": "api_only", "total_chains": 3, "renames_followed": 1 }
}
```
---
## `related`
Find semantically related entities via vector search.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `<QUERY_OR_TYPE>` | positional | required | Entity type (`issues`, `mrs`) or free text |
| `[IID]` | positional | — | Entity IID (required with entity type) |
| `-n, --limit` | int | 10 | Max results |
| `-p, --project` | string | — | Scope to project |
**Two modes:**
- **Entity mode:** `related issues 42` — find entities similar to issue #42
- **Query mode:** `related "auth flow"` — find entities matching free text
**DB tables:** `documents`, `embeddings` (vec0), `projects`
**Requires:** Ollama running (for query mode embedding)
### Robot Output (entity mode)
```json
{
"ok": true,
"data": {
"query_entity_type": "issue",
"query_entity_iid": 42,
"query_entity_title": "Fix SSO authentication",
"similar_entities": [
{
"entity_type": "mr", "entity_iid": 99,
"entity_title": "Refactor auth module",
"project_path": "group/repo", "state": "merged",
"similarity_score": 0.87,
"shared_labels": ["auth"], "shared_authors": ["jdoe"]
}
]
}
}
```
---
## `drift`
Detect discussion divergence from original intent.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `<ENTITY_TYPE>` | positional | required | Currently only `issues` |
| `<IID>` | positional | required | Entity IID |
| `--threshold` | f32 | 0.4 | Similarity threshold (0.0-1.0) |
| `-p, --project` | string | — | Scope to project |
**DB tables:** `issues`, `discussions`, `notes`, `embeddings`
**Requires:** Ollama running
### Robot Output
```json
{
  "ok": true,
  "data": {
    "entity_type": "issue", "entity_iid": 42,
    "total_notes": 15,
    "detected_drift": true,
    "drift_point": {
      "note_index": 8, "similarity": 0.32,
      "author": "someone", "created_at": "..."
    },
    "similarity_curve": [
      { "note_index": 0, "similarity": 0.95, "author": "jdoe", "created_at": "..." },
      { "note_index": 1, "similarity": 0.88, "author": "reviewer", "created_at": "..." }
    ]
  }
}
```
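The relationship between `--threshold`, `similarity_curve`, and `drift_point` can be sketched as follows, assuming the drift point is simply the first note whose similarity falls below the threshold. That rule is an assumption consistent with the sample (0.32 < 0.4); the real command may smooth the curve or apply other criteria, and `find_drift_point` is a hypothetical helper.

```python
def find_drift_point(similarity_curve, threshold=0.4):
    """Return the first curve point below `threshold`, or None if none is."""
    for point in similarity_curve:
        if point["similarity"] < threshold:
            return point
    return None


curve = [
    {"note_index": 0, "similarity": 0.95},
    {"note_index": 1, "similarity": 0.88},
    {"note_index": 8, "similarity": 0.32},
]
drift_point = find_drift_point(curve)
detected_drift = drift_point is not None
```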


@@ -0,0 +1,210 @@
# Pipeline & Infrastructure Commands
Reference for: `sync`, `ingest`, `generate-docs`, `embed`, `health`, `auth`, `doctor`, `status`, `stats`, `init`, `token`, `cron`, `migrate`, `version`, `completions`, `robot-docs`

---
## Data Pipeline
### `sync` (Full Pipeline)
Complete sync: ingest -> generate-docs -> embed.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--full` | flag | — | Full re-sync (reset cursors) |
| `-f, --force` | flag | — | Override stale lock |
| `--no-embed` | flag | — | Skip embedding |
| `--no-docs` | flag | — | Skip doc generation |
| `--no-events` | flag | — | Skip resource events |
| `--no-file-changes` | flag | — | Skip MR file changes |
| `--no-status` | flag | — | Skip work-item status enrichment |
| `--dry-run` | flag | — | Preview without changes |
| `-t, --timings` | flag | — | Show timing breakdown |
| `--lock` | flag | — | Acquire file lock |
| `--issue` | int[] | — | Surgically sync specific issues (repeatable) |
| `--mr` | int[] | — | Surgically sync specific MRs (repeatable) |
| `-p, --project` | string | — | Required with `--issue`/`--mr` |
| `--preflight-only` | flag | — | Validate without DB writes |
**Stages:** GitLab REST ingest -> GraphQL status enrichment -> Document generation -> Ollama embedding
**Surgical sync:** `lore sync --issue 42 --mr 99 -p group/repo` fetches only specific entities.
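
For agents that assemble the surgical sync invocation programmatically, the sketch below builds the argument vector from the flags documented above and enforces the rule that `-p` is required with `--issue`/`--mr`. `surgical_sync_argv` is a hypothetical helper; only flags that appear in the table are used.

```python
def surgical_sync_argv(project, issues=(), mrs=()):
    """Build the argv for a surgical `lore sync` of specific issues and MRs."""
    if not issues and not mrs:
        raise ValueError("surgical sync needs at least one --issue or --mr")
    if not project:
        raise ValueError("-p/--project is required with --issue/--mr")
    argv = ["lore", "sync"]
    for iid in issues:
        argv += ["--issue", str(iid)]  # repeatable flag, one per issue
    for iid in mrs:
        argv += ["--mr", str(iid)]  # repeatable flag, one per MR
    argv += ["-p", project]
    return argv


argv = surgical_sync_argv("group/repo", issues=[42], mrs=[99])
```

The resulting list can be handed to `subprocess.run(argv)` without shell quoting concerns.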
### `ingest`
Fetch data from GitLab API only (no docs, no embeddings).
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `[ENTITY]` | positional | — | `issues` or `mrs` (omit for all) |
| `-p, --project` | string | — | Single project |
| `-f, --force` | flag | — | Override stale lock |
| `--full` | flag | — | Full re-sync |
| `--dry-run` | flag | — | Preview |
**Fetches from GitLab:**
- Issues + discussions + notes
- MRs + discussions + notes
- Resource events (state, label, milestone)
- MR file changes (for DiffNote tracking)
- Work-item statuses (via GraphQL)
### `generate-docs`
Create searchable documents from ingested data.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--full` | flag | — | Full rebuild |
| `-p, --project` | string | — | Single project rebuild |
**Writes:** `documents`, `document_labels`, `document_paths`
### `embed`
Generate vector embeddings via Ollama.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--full` | flag | — | Re-embed all |
| `--retry-failed` | flag | — | Retry failed embeddings |
**Requires:** Ollama running with `nomic-embed-text`
**Writes:** `embeddings`, `embedding_metadata`

---
## Diagnostics
### `health`
Quick pre-flight check (~50ms). Exit 0 = healthy, exit 19 = unhealthy.
**Checks:** config found, DB found, schema version current.
```json
{
  "ok": true,
  "data": {
    "healthy": true,
    "config_found": true, "db_found": true,
    "schema_current": true, "schema_version": 28
  }
}
```
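Because `health` signals its result through the exit code, a caller can branch on it without parsing JSON. The sketch below maps the two documented codes to labels; `interpret_health` is a hypothetical helper, and any other exit code is treated as unknown rather than guessed at.

```python
HEALTHY_EXIT = 0     # documented: exit 0 = healthy
UNHEALTHY_EXIT = 19  # documented: exit 19 = unhealthy


def interpret_health(returncode: int) -> str:
    """Map the exit code of `lore health` to a coarse status label."""
    if returncode == HEALTHY_EXIT:
        return "healthy"
    if returncode == UNHEALTHY_EXIT:
        return "unhealthy"
    return "unknown"  # e.g. the binary failed for an unrelated reason


status = interpret_health(19)
```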
### `auth`
Verify GitLab authentication.
**Checks:** token set, GitLab reachable, user identity.
### `doctor`
Comprehensive environment check.
**Checks:** config validity, token, GitLab connectivity, DB health, migration status, Ollama availability + model status.
```json
{
  "ok": true,
  "data": {
    "config": { "valid": true, "path": "~/.config/lore/config.json" },
    "token": { "set": true, "gitlab": { "reachable": true, "user": "jdoe" } },
    "database": { "exists": true, "version": 28, "tables": 25 },
    "ollama": { "available": true, "model_ready": true }
  }
}
```
### `status` (alias: `st`)
Show sync state per project.
```json
{
  "ok": true,
  "data": {
    "projects": [
      {
        "project_path": "group/repo",
        "last_synced_at": "2026-02-26T10:00:00Z",
        "document_count": 5000, "discussion_count": 2000, "notes_count": 15000
      }
    ]
  }
}
```
### `stats` (alias: `stat`)
Document and index statistics with optional integrity checks.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--check` | flag | — | Run integrity checks |
| `--repair` | flag | — | Fix issues (implies `--check`) |
| `--dry-run` | flag | — | Preview repairs |
```json
{
"ok": true,
"data": {
"documents": { "total": 61652, "issues": 5000, "mrs": 2000, "notes": 50000 },
"embeddings": { "total": 80000, "synced": 79500, "pending": 500, "failed": 0 },
"fts": { "total_docs": 61652 },
"queues": { "pending": 0, "in_progress": 0, "failed": 0, "max_attempts": 0 },
"integrity": {
"ok": true, "fts_doc_mismatch": 0, "orphan_embeddings": 0,
"stale_metadata": 0, "orphan_state_events": 0
}
}
}
```
---
## Setup
### `init`
Initialize configuration and database.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `-f, --force` | flag | — | Skip overwrite confirmation |
| `--non-interactive` | flag | — | Fail if prompts needed |
| `--gitlab-url` | string | — | GitLab base URL (required in robot mode) |
| `--token-env-var` | string | — | Env var holding token (required in robot mode) |
| `--projects` | string | — | Comma-separated project paths (required in robot mode) |
| `--default-project` | string | — | Default project path |
### `token`
| Subcommand | Flags | Purpose |
|---|---|---|
| `token set` | `--token <TOKEN>` | Store token (reads stdin if omitted) |
| `token show` | `--unmask` | Display token (masked by default) |
### `cron`
| Subcommand | Flags | Purpose |
|---|---|---|
| `cron install` | `--interval <MINUTES>` (default: 8) | Schedule auto-sync |
| `cron uninstall` | — | Remove cron job |
| `cron status` | — | Check installation |
### `migrate`
Run pending database migrations. No flags.
---
## Meta
| Command | Purpose |
|---|---|
| `version` | Show version string |
| `completions <shell>` | Generate shell completions (bash/zsh/fish/powershell) |
| `robot-docs` | Machine-readable command manifest (`--brief` for ~60% smaller) |


@@ -0,0 +1,179 @@
# Data Flow & Command Network
How commands interconnect through shared data sources and output-to-input dependencies.
---
## 1. Command Network Graph
Arrows mean "output of A feeds as input to B":
```
      ┌─────────┐
      │ search  │──────────────────────────────┐
      └────┬────┘                              │
           │ iid                               │ topic
      ┌────▼────┐                         ┌────▼─────┐
┌─────│ issues  │◄────────────────────────│ timeline │
│     │ mrs     │        (detail)         └──────────┘
│     └────┬────┘                              ▲
│          │ iid                               │ entity ref
│     ┌────▼────┐     ┌──────────────┐         │
│     │ related │     │ file-history │─────────┘
│     │ drift   │     └──────┬───────┘
│     └─────────┘            │ MR iids
│                       ┌────▼────┐
│                       │  trace  │──── issues (linked)
│                       └────┬────┘
│                            │ paths
│                       ┌────▼────┐
│                       │   who   │
│                       │ (expert)│
│                       └─────────┘
file paths              ┌─────────┐
│                       │   me    │──── issues, mrs (dashboard)
▼                       └─────────┘
┌──────────┐                 ▲
│  notes   │                 │ (~same data)
└──────────┘            ┌────┴───────┐
                        │who workload│
                        └────────────┘
```
### Feed Chains (output of A -> input of B)
| From | To | What Flows |
|---|---|---|
| `search` | `issues`, `mrs` | IIDs from search results -> detail lookup |
| `search` | `timeline` | Topic/query -> chronological history |
| `search` | `related` | Entity IID -> semantic similarity |
| `me` | `issues`, `mrs` | IIDs from dashboard -> detail lookup |
| `trace` | `issues` | Linked issue IIDs -> detail lookup |
| `trace` | `who` | File paths -> expert lookup |
| `file-history` | `mrs` | MR IIDs -> detail lookup |
| `file-history` | `timeline` | Entity refs -> chronological events |
| `timeline` | `issues`, `mrs` | Referenced IIDs -> detail lookup |
| `who expert` | `who reviews` | Username -> review patterns |
| `who expert` | `mrs` | MR IIDs from expert detail -> MR detail |
---
## 2. Shared Data Source Map
Which DB tables power which commands. Higher overlap = stronger consolidation signal.
### Primary Entity Tables
| Table | Read By |
|---|---|
| `issues` | issues, me, who-workload, search, timeline, trace, count, stats |
| `merge_requests` | mrs, me, who-workload, search, timeline, trace, file-history, count, stats |
| `notes` | notes, issues-detail, mrs-detail, who-expert, who-active, search, timeline, trace, file-history |
| `discussions` | notes, issues-detail, mrs-detail, who-active, who-reviews, timeline, trace |
### Relationship Tables
| Table | Read By |
|---|---|
| `entity_references` | trace, timeline |
| `mr_file_changes` | trace, file-history, who-overlap |
| `issue_labels` | issues, me |
| `mr_labels` | mrs, me |
| `issue_assignees` | issues, me |
| `mr_reviewers` | mrs, who-expert, who-workload |
### Event Tables
| Table | Read By |
|---|---|
| `resource_state_events` | timeline, me-activity |
| `resource_label_events` | timeline |
| `resource_milestone_events` | timeline |
### Document/Search Tables
| Table | Read By |
|---|---|
| `documents` + `documents_fts` | search, stats |
| `embeddings` | search, related, drift |
| `document_labels` | search |
| `document_paths` | search |
### Infrastructure Tables
| Table | Read By |
|---|---|
| `sync_cursors` | status |
| `dirty_sources` | stats |
| `embedding_metadata` | stats, embed |
---
## 3. Shared-Data Clusters
Commands that read from the same primary tables form natural clusters:
### Cluster A: Issue/MR Entities
`issues`, `mrs`, `me`, `who workload`, `count`
All read `issues` + `merge_requests` with similar filter patterns (state, author, labels, project). These commands share the same underlying WHERE-clause builder logic.
### Cluster B: Notes/Discussions
`notes`, `issues detail`, `mrs detail`, `who expert`, `who active`, `timeline`
All traverse the `discussions` -> `notes` join path. The `notes` command does it with independent filters; the others embed notes within parent context.
### Cluster C: File Genealogy
`trace`, `file-history`, `who overlap`
All use `mr_file_changes` with rename chain BFS (forward: old_path -> new_path, backward: new_path -> old_path). Shared `resolve_rename_chain()` function.
### Cluster D: Semantic/Vector
`search`, `related`, `drift`
All use `documents` + `embeddings` via Ollama. `search` adds FTS component; `related` is pure vector; `drift` uses vector for divergence scoring.
### Cluster E: Diagnostics
`health`, `auth`, `doctor`, `status`, `stats`
All check system state. `health` is a strict subset of `doctor`. `status` checks sync cursors. `stats` checks document/index health. `auth` checks token and connectivity, which `doctor` also covers.
---
## 4. Query Pattern Sharing
### Dynamic Filter Builder (used by issues, mrs, notes)
All three list commands use the same pattern: build a WHERE clause dynamically from filter flags, binding every value through a parameterized placeholder. Label filters use an `EXISTS` subquery against the junction table.
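The pattern can be sketched as follows. This is an illustration in Python rather than the project's own code. The `issues` and `issue_labels` table names come from this document, while the `labels` table, every column name, and the function `build_issue_query` are assumptions.

```python
def build_issue_query(state=None, author=None, labels=()):
    """Build a parameterized SELECT; column names are assumed for illustration."""
    clauses, params = [], []
    if state is not None:
        clauses.append("i.state = ?")
        params.append(state)
    if author is not None:
        clauses.append("i.author_username = ?")
        params.append(author)
    for label in labels:
        # One EXISTS per label, so every requested label must be present.
        clauses.append(
            "EXISTS (SELECT 1 FROM issue_labels il "
            "JOIN labels l ON l.id = il.label_id "
            "WHERE il.issue_id = i.id AND l.name = ?)"
        )
        params.append(label)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return "SELECT i.iid, i.title FROM issues i" + where, params


sql, params = build_issue_query(state="opened", labels=["bug", "auth"])
```

User input only ever reaches the `params` list, never the SQL text, which is what keeps a dynamically assembled clause safe from injection.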
### Rename Chain BFS (used by trace, file-history, who overlap)
Forward query:
```sql
SELECT DISTINCT new_path FROM mr_file_changes
WHERE project_id = ?1 AND old_path = ?2 AND change_type = 'renamed'
```
Backward query:
```sql
SELECT DISTINCT old_path FROM mr_file_changes
WHERE project_id = ?1 AND new_path = ?2 AND change_type = 'renamed'
```
Cycle detection via `HashSet` of visited paths, `MAX_RENAME_HOPS = 10`.
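A compact sketch of the traversal, written in Python for illustration. An in-memory list of `(old_path, new_path)` pairs stands in for the two SQL queries, and the exact hop-cap semantics (a path found at hop 10 is kept but not expanded) are an assumption, not a reading of the real `resolve_rename_chain()`.

```python
from collections import deque

MAX_RENAME_HOPS = 10  # same cap as stated above


def resolve_rename_chain(renames, start):
    """Return every path reachable from `start` via renames, in both directions.

    `renames` holds (old_path, new_path) pairs, standing in for the
    mr_file_changes rows with change_type = 'renamed'.
    """
    visited = {start}  # doubles as cycle detection
    queue = deque([(start, 0)])
    while queue:
        path, hops = queue.popleft()
        if hops >= MAX_RENAME_HOPS:
            continue  # do not expand past the hop cap
        for old, new in renames:
            # Forward: old_path -> new_path. Backward: new_path -> old_path.
            neighbor = new if old == path else old if new == path else None
            if neighbor is not None and neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return sorted(visited)


cycle = [("a.rs", "b.rs"), ("b.rs", "c.rs"), ("c.rs", "a.rs")]
chain = resolve_rename_chain(cycle, "b.rs")
```

The `visited` set guarantees termination even when renames form a loop, as in the `cycle` example.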
### Hybrid Search (used by search, timeline seeding)
RRF ranking: `score = (60 / fts_rank) + (60 / vector_rank)`
FTS5 queries go through `to_fts_query()` which sanitizes input and builds MATCH expressions. Vector search calls Ollama to embed the query, then does cosine similarity against `embeddings` vec0 table.
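For reference, Reciprocal Rank Fusion as commonly published sums `1 / (k + rank)` over each ranked list, with `k = 60` as the usual constant. The sketch below implements that standard form in Python. Whether `lore` uses exactly this formula is not confirmed here, and `rrf_scores` is a hypothetical name.

```python
def rrf_scores(fts_ids, vector_ids, k=60):
    """Fuse two ranked ID lists (best first) with standard RRF."""
    scores = {}
    for ranked in (fts_ids, vector_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


fused = rrf_scores(["a", "b"], ["b", "c"])
```

A document that appears in both lists accumulates two terms, so agreement between FTS and vector search outranks a single top hit.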
### Project Resolution (used by most commands)
`resolve_project(conn, project_filter)` does fuzzy matching on `path_with_namespace` — suffix and substring matching. Returns `(project_id, path_with_namespace)`.
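One way to picture suffix and substring matching is the Python sketch below. It takes a plain list of `(project_id, path_with_namespace)` pairs instead of a DB connection. The precedence (exact, then suffix, then substring), the case-insensitivity, and the error on ambiguous input are all assumptions made for illustration.

```python
def resolve_project(projects, project_filter):
    """Match `project_filter` against (project_id, path_with_namespace) pairs."""
    needle = project_filter.lower()
    tiers = (
        lambda p: p == needle,                # exact match
        lambda p: p.endswith("/" + needle),   # suffix match on a path segment
        lambda p: needle in p,                # substring match
    )
    for matches in tiers:
        hits = [(pid, path) for pid, path in projects if matches(path.lower())]
        if len(hits) == 1:
            return hits[0]
        if len(hits) > 1:
            raise ValueError(f"ambiguous project filter: {project_filter!r}")
    raise LookupError(f"no project matches {project_filter!r}")


projects = [(1, "group/repo"), (2, "group/other"), (3, "team/repo-tools")]
resolved = resolve_project(projects, "repo")
```

Trying the stricter tiers first lets a short filter such as `repo` resolve to `group/repo` even though it is also a substring of `team/repo-tools`.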


@@ -0,0 +1,170 @@
# Overlap Analysis
Quantified functional duplication between commands.
---
## 1. High Overlap (>70%)
### `who workload` vs `me` — 85% overlap
| Dimension | `who @user` (workload) | `me --user @user` |
|---|---|---|
| Assigned issues | Yes | Yes |
| Authored MRs | Yes | Yes |
| Reviewing MRs | Yes | Yes |
| Attention state | No | **Yes** |
| Activity feed | No | **Yes** |
| Since-last-check inbox | No | **Yes** |
| Cross-project | Yes | Yes |
**Verdict:** `who workload` is a strict subset of `me`. The only reason to prefer `who workload` is to avoid the attention-state, activity, and inbox sections, and `me --issues --mrs --fields minimal` already achieves that.
### `health` vs `doctor` — 90% overlap
| Check | `health` | `doctor` |
|---|---|---|
| Config found | Yes | Yes |
| DB exists | Yes | Yes |
| Schema current | Yes | Yes |
| Token valid | No | **Yes** |
| GitLab reachable | No | **Yes** |
| Ollama available | No | **Yes** |
**Verdict:** `health` is a strict subset of `doctor`. However, `health` has unique value as a ~50ms pre-flight with clean exit 0/19 semantics for scripting.
### `file-history` vs `trace` — 75% overlap
| Feature | `file-history` | `trace` |
|---|---|---|
| Find MRs for file | Yes | Yes |
| Rename chain BFS | Yes | Yes |
| DiffNote discussions | `--discussions` | `--discussions` |
| Follow to linked issues | No | **Yes** |
| `--merged` filter | **Yes** | No |
**Verdict:** `trace` is a superset of `file-history` minus the `--merged` filter. Both use the same `resolve_rename_chain()` function and query `mr_file_changes`.
### `related` query-mode vs `search --mode semantic` — 80% overlap
| Feature | `related "text"` | `search "text" --mode semantic` |
|---|---|---|
| Vector similarity | Yes | Yes |
| FTS component | No | No (semantic mode skips FTS) |
| Filters (labels, author, since) | No | **Yes** |
| Explain ranking | No | **Yes** |
| Field selection | No | **Yes** |
| Requires Ollama | Yes | Yes |
**Verdict:** `related "text"` is `search --mode semantic` without any filter capabilities. The entity-seeded mode (`related issues 42`) is NOT duplicated — it seeds from an existing entity's embedding.
---
## 2. Medium Overlap (40-70%)
### `who expert` vs `who overlap` — 50%
Both answer "who works on this file" but with different scoring:
| Aspect | `who expert` | `who overlap` |
|---|---|---|
| Scoring | Half-life decay, signal types (diffnote_author, reviewer, etc.) | Raw touch count |
| Output | Ranked experts with scores | Users with touch counts |
| Use case | "Who should review this?" | "Who else touches this?" |
**Verdict:** `who overlap` is a simplified version of `who expert`. Expert output could include `touch_count` as a field.
### `timeline` vs `trace` — 45%
Both follow `entity_references` to discover connected entities, but from different entry points:
| Aspect | `timeline` | `trace` |
|---|---|---|
| Entry point | Entity (issue/MR) or search query | File path |
| Direction | Entity -> cross-refs -> events | File -> MRs -> issues -> discussions |
| Output | Chronological events | Causal chains (why code changed) |
| Expansion | Depth-controlled cross-ref following | MR -> issue via entity_references |
**Verdict:** Complementary, not duplicative. Different questions, shared plumbing.
### `auth` vs `doctor` — 100% of auth
`auth` checks: token set + GitLab reachable + user identity.
`doctor` checks: all of the above + DB + schema + Ollama.
**Verdict:** `auth` is completely contained within `doctor`.
### `count` vs `stats` — 40%
Both answer "how much data?":
| Aspect | `count` | `stats` |
|---|---|---|
| Layer | Entity (issues, MRs, notes) | Document index |
| State breakdown | Yes (opened/closed/merged) | No |
| Integrity checks | No | Yes |
| Queue status | No | Yes |
**Verdict:** Different layers. Could be unified under `stats --entities`.
### `notes` vs `issues/mrs detail` — 50%
Both return note content:
| Aspect | `notes` command | Detail view discussions |
|---|---|---|
| Independent filtering | **Yes** (author, path, resolution, contains, type) | No |
| Parent context | Minimal (parent_iid, parent_title) | **Full** (complete entity + all discussions) |
| Cross-entity queries | **Yes** (all notes matching criteria) | No (one entity only) |
**Verdict:** `notes` is for filtered queries across entities. Detail views are for complete context on one entity. Different use cases.
---
## 3. No Significant Overlap
| Command | Why It's Unique |
|---|---|
| `drift` | Only command doing semantic divergence detection |
| `timeline` | Only command doing multi-entity chronological reconstruction with expansion |
| `search` (hybrid) | Only command combining FTS + vector with RRF ranking |
| `me` (inbox) | Only command with cursor-based since-last-check tracking |
| `who expert` | Only command with half-life decay scoring by signal type |
| `who reviews` | Only command analyzing review patterns (approval rate, latency) |
| `who active` | Only command surfacing unresolved discussions needing attention |
---
## 4. Overlap Adjacency Matrix
Rows/columns are commands. Values are estimated functional overlap percentage.
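| | issues | mrs | notes | search | who-e | who-w | who-r | who-a | who-o | timeline | me | fh | trace | related | drift | count | status | stats | health | doctor |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| issues | - | 30 | 50 | 20 | 5 | 40 | 0 | 5 | 0 | 15 | 40 | 0 | 10 | 10 | 0 | 20 | 0 | 10 | 0 | 0 |
| mrs | 30 | - | 50 | 20 | 5 | 40 | 0 | 5 | 0 | 15 | 40 | 5 | 10 | 10 | 0 | 20 | 0 | 10 | 0 | 0 |
| notes | 50 | 50 | - | 15 | 15 | 0 | 5 | 10 | 0 | 10 | 0 | 5 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| search | 20 | 20 | 15 | - | 0 | 0 | 0 | 0 | 0 | 15 | 0 | 0 | 0 | 80 | 0 | 0 | 0 | 5 | 0 | 0 |
| who-expert | 5 | 5 | 15 | 0 | - | 0 | 10 | 0 | 50 | 0 | 0 | 10 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| who-workload | 40 | 40 | 0 | 0 | 0 | - | 0 | 0 | 0 | 0 | 85 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| who-reviews | 0 | 0 | 5 | 0 | 10 | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| who-active | 5 | 5 | 10 | 0 | 0 | 0 | 0 | - | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| who-overlap | 0 | 0 | 0 | 0 | 50 | 0 | 0 | 0 | - | 0 | 0 | 10 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| timeline | 15 | 15 | 10 | 15 | 0 | 0 | 0 | 5 | 0 | - | 5 | 5 | 45 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| me | 40 | 40 | 0 | 0 | 0 | 85 | 0 | 0 | 0 | 5 | - | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 5 | 5 |
| file-history | 0 | 5 | 5 | 0 | 10 | 0 | 0 | 0 | 10 | 5 | 0 | - | 75 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| trace | 10 | 10 | 5 | 0 | 10 | 0 | 0 | 0 | 5 | 45 | 0 | 75 | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| related | 10 | 10 | 0 | 80 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 |
| drift | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 0 | 0 | 0 | 0 | 0 |
| count | 20 | 20 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | 0 | 40 | 0 | 0 |
| status | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | - | 20 | 30 | 40 |
| stats | 10 | 10 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 40 | 20 | - | 0 | 15 |
| health | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 30 | 0 | - | 90 |
| doctor | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 0 | 0 | 40 | 15 | 90 | - |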
**Highest overlap pairs (>= 75%):**
1. `health` / `doctor` — 90%
2. `who workload` / `me` — 85%
3. `related` query-mode / `search semantic` — 80%
4. `file-history` / `trace` — 75%


@@ -0,0 +1,216 @@
# Agent Workflow Analysis
Common agent workflows, round-trip costs, and token profiles.
---
## 1. Common Workflows
### Flow 1: "What should I work on?" — 4 round trips
```
me → dashboard overview (which items need attention?)
issues <iid> -p proj → detail on picked issue (full context + discussions)
trace src/relevant/file.rs → understand code context (why was it written?)
who src/relevant/file.rs → find domain experts (who can help?)
```
**Total tokens (minimal):** ~800 + ~2000 + ~1000 + ~400 = ~4200
**Total tokens (full):** ~3000 + ~6000 + ~1500 + ~800 = ~11300
**Latency:** 4 serial round trips
### Flow 2: "What happened with this feature?" — 3 round trips
```
search "feature name" → find relevant entities
timeline "feature name" → reconstruct chronological history
related issues 42 → discover connected work
```
**Total tokens (minimal):** ~600 + ~1500 + ~400 = ~2500
**Total tokens (full):** ~2000 + ~5000 + ~1000 = ~8000
**Latency:** 3 serial round trips
### Flow 3: "Why was this code changed?" — 3 round trips
```
trace src/file.rs → file -> MR -> issue chain
issues <iid> -p proj → full issue detail
timeline "issue:42" → full history with cross-refs
```
**Total tokens (minimal):** ~800 + ~2000 + ~1500 = ~4300
**Total tokens (full):** ~1500 + ~6000 + ~5000 = ~12500
**Latency:** 3 serial round trips
### Flow 4: "Is the system healthy?" — 2-4 round trips
```
health → quick pre-flight (pass/fail)
doctor → detailed diagnostics (if health fails)
status → sync state per project
stats → document/index health
```
**Total tokens:** ~100 + ~300 + ~200 + ~400 = ~1000
**Latency:** 2-4 serial round trips when `health` fails (1 if it passes)
### Flow 5: "Who can review this?" — 2-3 round trips
```
who src/auth/ → find file experts
who @jdoe --reviews → check reviewer's patterns
```
**Total tokens (minimal):** ~300 + ~300 = ~600
**Latency:** 2 serial round trips
### Flow 6: "Find and understand an issue" — 4 round trips
```
search "query" → discover entities (get IIDs)
issues <iid> → full detail with discussions
timeline "issue:42" → chronological context
related issues 42 → connected entities
```
**Total tokens (minimal):** ~600 + ~2000 + ~1500 + ~400 = ~4500
**Total tokens (full):** ~2000 + ~6000 + ~5000 + ~1000 = ~14000
**Latency:** 4 serial round trips
---
## 2. Token Cost Profiles
Measured typical response sizes in robot mode with default settings:
| Command | Typical Tokens (full) | With `--fields minimal` | Dominant Cost Driver |
|---|---|---|---|
| `me` (all sections) | 2000-5000 | 500-1500 | Open items count |
| `issues` (list, n=50) | 1500-3000 | 400-800 | Labels arrays |
| `issues <iid>` (detail) | 1000-8000 | N/A (no minimal for detail) | Discussion depth |
| `mrs <iid>` (detail) | 1000-8000 | N/A | Discussion depth, DiffNote positions |
| `timeline` (limit=100) | 2000-6000 | 800-1500 | Event count + evidence |
| `search` (n=20) | 1000-3000 | 300-600 | Snippet length |
| `who expert` | 300-800 | 150-300 | Expert count |
| `who workload` | 500-1500 | 200-500 | Open items count |
| `trace` | 500-2000 | 300-800 | Chain depth |
| `file-history` | 300-1500 | 200-500 | MR count |
| `related` | 300-1000 | 200-400 | Result count |
| `drift` | 200-800 | N/A | Similarity curve length |
| `notes` (n=50) | 1500-5000 | 500-1000 | Body length |
| `count` | ~100 | N/A | Fixed structure |
| `stats` | ~500 | N/A | Fixed structure |
| `health` | ~100 | N/A | Fixed structure |
| `doctor` | ~300 | N/A | Fixed structure |
| `status` | ~200 | N/A | Project count |
### Key Observations
1. **Detail commands are expensive.** `issues <iid>` and `mrs <iid>` can hit 8000 tokens due to discussions. This is the content agents actually need, but most of it is discussion body text.
2. **`me` is the most-called command** and ranges 2000-5000 tokens. Agents often just need "do I have work?" which is ~100 tokens (summary counts only).
3. **Lists with labels are wasteful.** Every issue/MR in a list carries its full label array. With 50 items x 5 labels each, that's 250 strings of overhead.
4. **`--fields minimal` helps a lot** — 50-70% reduction on list commands. But it's not available on detail views.
5. **Timeline scales linearly** with event count and evidence notes. The `--max-evidence` flag helps cap the expensive part.
---
## 3. Round-Trip Inefficiency Patterns
### Pattern A: Discovery -> Detail (N+1)
Agent searches, gets 5 results, then needs detail on each:
```
search "auth bug" → 5 results
issues 42 -p proj → detail
issues 55 -p proj → detail
issues 71 -p proj → detail
issues 88 -p proj → detail
issues 95 -p proj → detail
```
**6 round trips** for what should be 2 (search + batch detail).
### Pattern B: Detail -> Context Gathering
Agent gets issue detail, then needs timeline + related + trace:
```
issues 42 -p proj → detail
timeline "issue:42" -p proj → events
related issues 42 -p proj → similar
trace src/file.rs -p proj → code provenance
```
**4 round trips** for what should be 1 (detail with embedded context).
### Pattern C: Health Check Cascade
Agent checks health, discovers issue, drills down:
```
health → unhealthy (exit 19)
doctor → token OK, Ollama missing
stats --check → 5 orphan embeddings
stats --repair → fixed
```
**4 round trips**, but only 2 are needed: `doctor` covers `health`, and `stats --repair` implies `--check`.
### Pattern D: Dashboard -> Action
Agent checks dashboard, picks item, needs full context:
```
me → 5 open issues, 2 MRs
issues 42 -p proj → picked issue detail
who src/auth/ -p proj → expert for help
timeline "issue:42" -p proj → history
```
**4 round trips.** With `--include`, could be 2 (me with inline detail + who).
---
## 4. Optimized Workflow Vision
What the same workflows look like with proposed optimizations:
### Flow 1 Optimized: "What should I work on?" — 2 round trips
```
me --depth titles → 400 tokens: counts + item titles with attention_state
issues 42 --include timeline,trace → 1 call: detail + events + code provenance
```
### Flow 2 Optimized: "What happened with this feature?" — 1-2 round trips
```
search "feature" -n 5 → find entities
issues 42 --include timeline,related → everything in one call
```
### Flow 3 Optimized: "Why was this code changed?" — 1 round trip
```
trace src/file.rs --include experts,timeline → full chain + experts + events
```
### Flow 4 Optimized: "Is the system healthy?" — 1 round trip
```
doctor → covers health + auth + connectivity
# status + stats only if doctor reveals issues
```
### Flow 6 Optimized: "Find and understand" — 2 round trips
```
search "query" -n 5 → discover entities
issues --batch 42,55,71 --include timeline → batch detail with events
```


@@ -0,0 +1,198 @@
# Consolidation Proposals
5 proposals to reduce 34 commands to 29 by merging high-overlap commands.
---
## A. Absorb `file-history` into `trace --shallow`
**Overlap:** 75%. Both do rename chain BFS on `mr_file_changes`, both optionally include DiffNote discussions. `trace` follows `entity_references` to linked issues; `file-history` stops at MRs.
**Current state:**
```bash
# These do nearly the same thing:
lore file-history src/auth/ -p proj --discussions
lore trace src/auth/ -p proj --discussions
# trace just adds: issues linked via entity_references
```
**Proposed change:**
- `trace <path>` — full chain: file -> MR -> issue -> discussions (existing behavior)
- `trace <path> --shallow` — MR-only, no issue following (replaces `file-history`)
- Move `--merged` flag from `file-history` to `trace`
- Deprecate `file-history` as an alias that maps to `trace --shallow`
**Migration path:**
1. Add `--shallow` and `--merged` flags to `trace`
2. Make `file-history` an alias with deprecation warning
3. Update robot-docs to point to `trace`
4. Remove alias after 2 releases
**Breaking changes:** Robot output shape differs slightly (`trace_chains` vs `merge_requests` key name). The `--shallow` variant should match `file-history`'s output shape for compatibility.
**Effort:** Low. Most code is already shared via `resolve_rename_chain()`.
---
## B. Absorb `auth` into `doctor`
**Overlap:** 100% of `auth` is contained within `doctor`.
**Current state:**
```bash
lore auth # checks: token set, GitLab reachable, user identity
lore doctor # checks: all of above + DB + schema + Ollama
```
**Proposed change:**
- `doctor` — full check (existing behavior)
- `doctor --auth` — token + GitLab only (replaces `auth`)
- Keep `health` separate (fast pre-flight, different exit code contract: 0/19)
- Deprecate `auth` as alias for `doctor --auth`
**Migration path:**
1. Add `--auth` flag to `doctor`
2. Make `auth` an alias with deprecation warning
3. Remove alias after 2 releases
**Breaking changes:** None for robot mode (same JSON shape). Exit code mapping needs verification.
**Effort:** Low. Doctor already has the auth check logic.
---
## C. Remove `related` query-mode
**Overlap:** 80% with `search --mode semantic`.
**Current state:**
```bash
# These are functionally equivalent:
lore related "authentication flow"
lore search "authentication flow" --mode semantic
# This is UNIQUE (no overlap):
lore related issues 42
```
**Proposed change:**
- Keep entity-seeded mode: `related issues 42` (seeds from existing entity embedding)
- Remove free-text mode: `related "text"` -> error with suggestion: "Use `search --mode semantic`"
- Alternatively: keep as sugar but document it as equivalent to search
**Migration path:**
1. Add deprecation warning when query-mode is used
2. After 2 releases, remove query-mode parsing
3. Entity-mode stays unchanged
**Breaking changes:** Agents using `related "text"` must switch to `search --mode semantic`. This is a strict improvement since search has filters.
**Effort:** Low. Just argument validation change.
---
## D. Merge `who overlap` into `who expert`
**Overlap:** 50% functional, but overlap is a strict simplification of expert.
**Current state:**
```bash
lore who src/auth/ # expert mode: scored rankings
lore who --overlap src/auth/ # overlap mode: raw touch counts
```
**Proposed change:**
- `who <path>` (expert) adds `touch_count` and `last_touch_at` fields to each expert row
- `who --overlap <path>` becomes an alias for `who <path> --fields username,touch_count`
- Eventually remove `--overlap` flag
**New expert output:**
```json
{
  "experts": [
    {
      "username": "jdoe", "score": 42.5,
      "touch_count": 15, "last_touch_at": "2026-02-20",
      "detail": { "mr_ids_author": [99, 101] }
    }
  ]
}
```
**Migration path:**
1. Add `touch_count` and `last_touch_at` to expert output
2. Make `--overlap` an alias with deprecation warning
3. Remove `--overlap` after 2 releases
**Breaking changes:** Expert output gains new fields (non-breaking for JSON consumers). Overlap output shape changes if agents were parsing `{ "users": [...] }` vs `{ "experts": [...] }`.
**Effort:** Low. Expert query already touches the same tables; just need to add a COUNT aggregation.
---
## E. Merge `count` and `status` into `stats`
**Overlap:** `count` and `stats` both answer "how much data?"; `status` and `stats` both report system state.
**Current state:**
```bash
lore count issues # entity count + state breakdown
lore count mrs # entity count + state breakdown
lore status # sync cursors per project
lore stats # document/index counts + integrity
```
**Proposed change:**
- `stats` — document/index health (existing behavior, default)
- `stats --entities` — adds entity counts (replaces `count`)
- `stats --sync` — adds sync cursor positions (replaces `status`)
- `stats --all` — everything: entities + sync + documents + integrity
- `stats --check` / `--repair` — unchanged
**New `--all` output:**
```json
{
  "data": {
    "entities": {
      "issues": { "total": 5000, "opened": 200, "closed": 4800 },
      "merge_requests": { "total": 1234, "opened": 100, "closed": 50, "merged": 1084 },
      "discussions": { "total": 8000 },
      "notes": { "total": 282000, "system_excluded": 50000 }
    },
    "sync": {
      "projects": [
        { "project_path": "group/repo", "last_synced_at": "...", "document_count": 5000 }
      ]
    },
    "documents": { "total": 61652, "issues": 5000, "mrs": 2000, "notes": 50000 },
    "embeddings": { "total": 80000, "synced": 79500, "pending": 500 },
    "fts": { "total_docs": 61652 },
    "queues": { "pending": 0, "in_progress": 0, "failed": 0 },
    "integrity": { "ok": true }
  }
}
```
**Migration path:**
1. Add `--entities`, `--sync`, `--all` flags to `stats`
2. Make `count` an alias for `stats --entities` with deprecation warning
3. Make `status` an alias for `stats --sync` with deprecation warning
4. Remove aliases after 2 releases
**Breaking changes:** `count` output currently has `{ "entity": "issues", "count": N, "breakdown": {...} }`. Under `stats --entities`, this becomes nested under `data.entities`. Alias can preserve old shape during deprecation period.
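A sketch of how the alias could preserve the old shape during the deprecation period, written in Python for illustration. It assumes the old `breakdown` carries the same per-state keys as the new `data.entities` entry, and `legacy_count_shape` is a hypothetical helper name.

```python
def legacy_count_shape(stats_data, entity):
    """Map a `data.entities` entry back to the old `count` output shape."""
    counts = dict(stats_data["entities"][entity])  # copy; leave the input intact
    total = counts.pop("total")
    return {"entity": entity, "count": total, "breakdown": counts}


stats_data = {
    "entities": {
        "issues": {"total": 5000, "opened": 200, "closed": 4800},
        "merge_requests": {"total": 1234, "opened": 100, "closed": 50, "merged": 1084},
    }
}
legacy = legacy_count_shape(stats_data, "issues")
```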
**Effort:** Medium. Need to compose three query paths into one response builder.
---
## Summary
| Consolidation | Removes | Effort | Breaking? |
|---|---|---|---|
| `file-history` -> `trace --shallow` | -1 command | Low | Alias redirect, output shape compat |
| `auth` -> `doctor --auth` | -1 command | Low | Alias redirect |
| `related` query-mode removal | -1 mode | Low | Must switch to `search --mode semantic` |
| `who overlap` -> `who expert` | -1 sub-mode | Low | Output gains fields |
| `count` + `status` -> `stats` | -2 commands | Medium | Output nesting changes |
**Total: 34 commands -> 29 commands.** All changes use deprecation-with-alias pattern for gradual migration.


@@ -0,0 +1,347 @@
# Robot-Mode Optimization Proposals
6 proposals to reduce round trips and token waste for agent consumers.
---
## A. `--include` flag for embedded sub-queries (P0)
**Problem:** The #1 agent inefficiency. Every "understand this entity" workflow requires 3-4 serial round trips: detail + timeline + related + trace.
**Proposal:** Add `--include` flag to detail commands that embeds sub-query results in the response.
```bash
# Before: 4 round trips, ~12000 tokens
lore -J issues 42 -p proj
lore -J timeline "issue:42" -p proj --limit 20
lore -J related issues 42 -p proj -n 5
lore -J trace src/auth/ -p proj
# After: 1 round trip, ~5000 tokens (sub-queries use reduced limits)
lore -J issues 42 -p proj --include timeline,related,trace
```
### Include Matrix
| Base Command | Valid Includes | Default Limits |
|---|---|---|
| `issues <iid>` | `timeline`, `related`, `trace` | 20 events, 5 related, 5 chains |
| `mrs <iid>` | `timeline`, `related`, `file-changes` | 20 events, 5 related |
| `trace <path>` | `experts`, `timeline` | 5 experts, 20 events |
| `me` | `detail` (inline top-N item details) | 3 items detailed |
| `search` | `detail` (inline top-N result details) | 3 results detailed |
### Response Shape
Included data uses `_` prefix to distinguish from base fields:
```json
{
  "ok": true,
  "data": {
    "iid": 42, "title": "Fix auth", "state": "opened",
    "discussions": [...],
    "_timeline": {
      "event_count": 15,
      "events": [...]
    },
    "_related": {
      "similar_entities": [...]
    }
  },
  "meta": {
    "elapsed_ms": 200,
    "_timeline_ms": 45,
    "_related_ms": 120
  }
}
```
### Error Handling
Sub-query errors are non-fatal. If Ollama is down, the response carries a `_related_error` field in place of `_related` instead of failing the whole request:
```json
{
  "_related_error": "Ollama unavailable — related results skipped"
}
```
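On the consumer side, an agent could separate base fields, included sections, and per-include errors in one pass. A minimal sketch in Python, assuming the `_name` and `_name_error` keys both live inside `data` as proposed above; `split_includes` is a hypothetical helper.

```python
def split_includes(data):
    """Separate base fields, included sections, and per-include errors."""
    base, included, errors = {}, {}, {}
    for key, value in data.items():
        if key.startswith("_") and key.endswith("_error"):
            # "_related_error" -> "related"
            errors[key[1:-len("_error")]] = value
        elif key.startswith("_"):
            included[key[1:]] = value
        else:
            base[key] = value
    return base, included, errors


data = {
    "iid": 42,
    "_timeline": {"event_count": 15},
    "_related_error": "Ollama unavailable",
}
base, included, errors = split_includes(data)
```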
### Limit Control
```bash
# Custom limits for included data
lore -J issues 42 --include timeline:50,related:10
```
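Parsing the `name:limit` syntax is straightforward. The Python sketch below uses the default limits from the Include Matrix row for `issues <iid>`. The function name `parse_include` and the validation rules are assumptions.

```python
# Defaults taken from the Include Matrix row for `issues <iid>`.
DEFAULT_LIMITS = {"timeline": 20, "related": 5, "trace": 5}


def parse_include(spec):
    """Parse 'timeline:50,related' into {'timeline': 50, 'related': 5}."""
    result = {}
    for part in spec.split(","):
        name, _, limit = part.strip().partition(":")
        if name not in DEFAULT_LIMITS:
            raise ValueError(f"unknown include: {name!r}")
        result[name] = int(limit) if limit else DEFAULT_LIMITS[name]
        if result[name] <= 0:
            raise ValueError(f"limit for {name!r} must be positive")
    return result


custom = parse_include("timeline:50,related:10")
```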
### Round-Trip Savings
| Workflow | Before | After | Savings |
|---|---|---|---|
| Understand an issue | 4 calls | 1 call | **75%** |
| Why was code changed | 3 calls | 1 call | **67%** |
| Find and understand | 4 calls | 2 calls | **50%** |
**Effort:** High. Each include needs its own sub-query executor, error isolation, and limit enforcement. The payoff is large: in the workflows above, this single feature cuts agent round trips by 50-75%.
---
## B. `--depth` control on `me` (P0)
**Problem:** `me` returns 2000-5000 tokens. Agents checking "do I have work?" only need ~100 tokens.
**Proposal:** Add `--depth` flag with three levels.
```bash
# Counts only (~100 tokens) — "do I have work?"
lore -J me --depth counts
# Titles (~400 tokens) — "what work do I have?"
lore -J me --depth titles
# Full (current behavior, 2000+ tokens) — "give me everything"
lore -J me --depth full
lore -J me # same as --depth full
```
### Depth Levels
| Level | Includes | Typical Tokens |
|---|---|---|
| `counts` | `summary` block only (counts, no items) | ~100 |
| `titles` | summary + item lists with minimal fields (iid, title, attention_state) | ~400 |
| `full` | Everything: items, activity, inbox, discussions | ~2000-5000 |
### Response at `--depth counts`
```json
{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": {
      "project_count": 3,
      "open_issue_count": 5,
      "authored_mr_count": 2,
      "reviewing_mr_count": 1,
      "needs_attention_count": 3
    }
  }
}
```
### Response at `--depth titles`
```json
{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": { ... },
    "open_issues": [
      { "iid": 42, "title": "Fix auth", "attention_state": "needs_attention" }
    ],
    "open_mrs_authored": [
      { "iid": 99, "title": "Refactor auth", "attention_state": "needs_attention" }
    ],
    "reviewing_mrs": []
  }
}
```
**Effort:** Low. The data is already available; just need to gate serialization by depth level.
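Gating serialization by depth could look like the Python sketch below. The key names follow the response examples above, while `apply_depth` and the two module-level constants are hypothetical.

```python
ITEM_LISTS = ("open_issues", "open_mrs_authored", "reviewing_mrs")
TITLE_FIELDS = ("iid", "title", "attention_state")


def apply_depth(data, depth="full"):
    """Trim a full `me` payload down to the requested depth level."""
    if depth == "full":
        return data
    if depth not in ("counts", "titles"):
        raise ValueError(f"unknown depth: {depth!r}")
    out = {"username": data["username"], "summary": data["summary"]}
    if depth == "titles":
        for key in ITEM_LISTS:
            # Keep only the minimal fields for each item.
            out[key] = [
                {f: item[f] for f in TITLE_FIELDS if f in item}
                for item in data.get(key, [])
            ]
    return out


full = {
    "username": "jdoe",
    "summary": {"open_issue_count": 1},
    "open_issues": [
        {"iid": 42, "title": "Fix auth",
         "attention_state": "needs_attention", "labels": ["bug"]}
    ],
    "activity": [{"event": "note"}],
}
counts_view = apply_depth(full, "counts")
titles_view = apply_depth(full, "titles")
```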
---
## C. `--batch` flag for multi-entity detail (P1)
**Problem:** After search/timeline, agents discover N entity IIDs and need detail on each. Currently N round trips.
**Proposal:** Add `--batch` flag to `issues` and `mrs` detail mode.
```bash
# Before: 3 round trips
lore -J issues 42 -p proj
lore -J issues 55 -p proj
lore -J issues 71 -p proj
# After: 1 round trip
lore -J issues --batch 42,55,71 -p proj
```
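### Response
Shape when one requested IID cannot be found (IID 99 is illustrative and is not part of the request above):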
```json
{
  "ok": true,
  "data": {
    "results": [
      { "iid": 42, "title": "Fix auth", "state": "opened", ... },
      { "iid": 55, "title": "Add SSO", "state": "opened", ... },
      { "iid": 71, "title": "Token refresh", "state": "closed", ... }
    ],
    "errors": [
      { "iid": 99, "error": "Not found" }
    ]
  }
}
```
### Constraints
- Max 20 IIDs per batch
- Individual errors don't fail the batch (partial results returned)
- Works with `--include` for maximum efficiency: `--batch 42,55 --include timeline`
- Works with `--fields minimal` for token control
**Effort:** Medium. Need to loop the existing detail handler and compose results.
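The loop-and-compose step might be sketched as below, in Python for illustration. `fetch_detail` stands in for the existing detail handler, and signalling a missing entity with `LookupError` is an assumption.

```python
MAX_BATCH = 20  # cap from the constraints above


def batch_detail(iids, fetch_detail):
    """Call `fetch_detail(iid)` per IID and return partial results plus errors."""
    if len(iids) > MAX_BATCH:
        raise ValueError(f"at most {MAX_BATCH} IIDs per batch")
    results, errors = [], []
    for iid in iids:
        try:
            results.append(fetch_detail(iid))
        except LookupError as exc:
            # An individual failure does not fail the batch.
            errors.append({"iid": iid, "error": str(exc)})
    return {"ok": True, "data": {"results": results, "errors": errors}}


store = {42: {"iid": 42, "title": "Fix auth"}, 55: {"iid": 55, "title": "Add SSO"}}


def fetch_from_store(iid):
    """Toy detail handler backed by an in-memory dict."""
    if iid not in store:
        raise LookupError("Not found")
    return store[iid]


response = batch_detail([42, 99, 55], fetch_from_store)
```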
---
## D. Composite `context` command (P2)
**Problem:** Agents need full context on an entity but must learn `--include` syntax. A purpose-built command is more discoverable.
**Proposal:** Add `context` command that returns detail + timeline + related in one call.
```bash
lore -J context issues 42 -p proj
lore -J context mrs 99 -p proj
```
### Equivalent To
```bash
lore -J issues 42 -p proj --include timeline,related
```
But with optimized defaults:
- Timeline: 20 most recent events, max 3 evidence notes
- Related: top 5 entities
- Discussions: truncated after 5 threads
- Non-fatal: Ollama-dependent parts gracefully degrade
### Response Shape
Same as `issues <iid> --include timeline,related` but with the reduced defaults applied.
### Relationship to `--include`
`context` is sugar for the most common `--include` pattern. Both mechanisms can coexist:
- `context` for the 80% case (agents wanting full entity understanding)
- `--include` for custom combinations
**Effort:** Medium. Thin wrapper around detail + include pipeline.
---
## E. `--max-tokens` response budget (P3)
**Problem:** Response sizes vary wildly (100 to 8000 tokens). Agents can't predict cost in advance.
**Proposal:** Let agents cap response size; the CLI truncates the response to fit the budget.
```bash
lore -J me --max-tokens 500
lore -J timeline "feature" --max-tokens 1000
lore -J context issues 42 --max-tokens 2000
```
### Truncation Strategy (priority order)
1. Apply `--fields minimal` if not already set
2. Reduce array lengths (newest/highest-score items survive)
3. Truncate string fields (descriptions, snippets) to 200 chars
4. Omit null/empty fields
5. Drop included sub-queries (if using `--include`)
### Meta Notice
```json
{
"meta": {
"elapsed_ms": 50,
"truncated": true,
"original_tokens": 3500,
"budget_tokens": 1000,
"dropped": ["_related", "discussions[5:]", "activity[10:]"]
}
}
```
### Implementation Notes
Token estimation: a rough heuristic of JSON character count / 4. It doesn't need to be exact; the goal is "roughly this size", not "exactly N tokens".
**Effort:** High. Requires token estimation, progressive truncation logic, and tracking what was dropped.
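A rough sketch of the estimation heuristic and one truncation step is shown below. It covers only step 2 of the strategy (reducing array lengths) and assumes list items are already ordered newest or highest-score first; the function names are illustrative, not part of `lore`.

```python
import json


def estimate_tokens(payload: dict) -> int:
    """Rough size estimate: serialized JSON character count divided by 4."""
    return len(json.dumps(payload, separators=(",", ":"))) // 4


def fit_to_budget(data: dict, budget: int) -> tuple[dict, dict]:
    """Shrink top-level lists until the estimate fits, recording what was cut."""
    data = dict(data)  # shallow copy so the caller's dict is not mutated
    original = estimate_tokens(data)
    cuts: dict[str, int] = {}  # key -> number of items kept

    # Repeatedly halve the longest non-empty list; the head of each list survives.
    while estimate_tokens(data) > budget:
        lists = {k: v for k, v in data.items() if isinstance(v, list) and v}
        if not lists:
            break  # nothing left to trim; return the best effort
        key = max(lists, key=lambda k: len(lists[k]))
        keep = len(data[key]) // 2
        data[key] = data[key][:keep]
        cuts[key] = keep

    meta = {
        "truncated": bool(cuts),
        "original_tokens": original,
        "budget_tokens": budget,
        "dropped": [f"{key}[{keep}:]" for key, keep in cuts.items()],
    }
    return data, meta
```

The returned `meta` mirrors the notice shown above, so an agent can see which slices were removed and re-request them if needed.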
---
## F. `--format tsv` for list commands (P3)
**Problem:** JSON is verbose for tabular data. List commands return arrays of objects with repeated key names.
**Proposal:** Add `--format tsv` for list commands.
```bash
lore -J issues --format tsv --fields iid,title,state -n 10
```
### Output
```
iid title state
42 Fix auth opened
55 Add SSO opened
71 Token refresh closed
```
### Token Savings
| Command | JSON tokens | TSV tokens | Savings |
|---|---|---|---|
| `issues -n 50 --fields minimal` | ~800 | ~250 | **69%** |
| `mrs -n 50 --fields minimal` | ~800 | ~250 | **69%** |
| `who expert -n 10` | ~300 | ~100 | **67%** |
| `notes -n 50 --fields minimal` | ~1000 | ~350 | **65%** |
### Applicable Commands
TSV works well for flat, tabular data:
- `issues` (list), `mrs` (list), `notes` (list)
- `who expert`, `who overlap`, `who reviews`
- `count`
TSV does NOT work for nested/complex data:
- Detail views (discussions are nested)
- Timeline (events have nested evidence)
- Search (nested explain, labels arrays)
- `me` (multiple sections)
### Agent Parsing
Most LLMs parse TSV naturally. Agents that need structured data can still use JSON.
**Effort:** Medium. Tab-separated serialization for flat structs is straightforward, but body text containing tabs or newlines needs escaping.
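The escaping concern can be illustrated with a short sketch. The backslash-escape convention used here is an assumption (the proposal does not specify one), and `to_tsv` is a hypothetical helper name.

```python
def to_tsv(rows: list[dict], fields: list[str]) -> str:
    """Serialize flat rows as TSV with a header line.

    Assumed escaping convention: backslash, tab, newline and carriage return
    are written as \\\\, \\t, \\n and \\r so every record stays on one line.
    """

    def cell(value) -> str:
        text = "" if value is None else str(value)
        # Escape the backslash first so the other escapes stay unambiguous.
        return (
            text.replace("\\", "\\\\")
            .replace("\t", "\\t")
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )

    lines = ["\t".join(fields)]
    for row in rows:
        lines.append("\t".join(cell(row.get(name)) for name in fields))
    return "\n".join(lines) + "\n"
```

With this convention a consumer can always split on literal tabs and newlines first and unescape each cell afterwards.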
---
## Impact Summary
| Optimization | Priority | Effort | Round-Trip Savings | Token Savings |
|---|---|---|---|---|
| `--include` | P0 | High | **50-75%** | Moderate |
| `--depth` on `me` | P0 | Low | None | **60-80%** |
| `--batch` | P1 | Medium | **N-1 per batch** | Moderate |
| `context` command | P2 | Medium | **67-75%** | Moderate |
| `--max-tokens` | P3 | High | None | **Variable** |
| `--format tsv` | P3 | Medium | None | **65-69% on lists** |
### Implementation Order
1. **`--depth` on `me`** — lowest effort, high value, no risk
2. **`--include` on `issues`/`mrs` detail** — highest impact, start with `timeline` include only
3. **`--batch`** — eliminates N+1 pattern
4. **`context` command** — sugar on top of `--include`
5. **`--format tsv`** — nice-to-have, easy to add incrementally
6. **`--max-tokens`** — complex, defer until demand is clear

@@ -0,0 +1,181 @@
# Appendices
---
## A. Robot Output Envelope
All robot-mode responses follow this structure:
```json
{
"ok": true,
"data": { /* command-specific */ },
"meta": { "elapsed_ms": 42 }
}
```
Errors (to stderr):
```json
{
"error": {
"code": "CONFIG_NOT_FOUND",
"message": "Configuration file not found",
"suggestion": "Run 'lore init'",
"actions": ["lore init"]
}
}
```
The `actions` array contains copy-paste shell commands for automated recovery. Omitted when empty.
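On the consuming side, an agent might normalize both shapes into one structure. The sketch below assumes that a non-zero exit code always comes with the JSON error object on stderr; `parse_robot_output` is a hypothetical helper, not part of `lore`.

```python
import json


def parse_robot_output(stdout: str, stderr: str, exit_code: int) -> dict:
    """Normalize a robot-mode result into one dict an agent can branch on."""
    if exit_code == 0:
        envelope = json.loads(stdout)
        return {"ok": True, "data": envelope["data"], "meta": envelope.get("meta", {})}

    # Assumption: failures carry the error object shown above on stderr.
    error = json.loads(stderr)["error"]
    return {
        "ok": False,
        "code": error["code"],
        "message": error["message"],
        # `actions` is omitted when empty, so default to an empty list.
        "actions": error.get("actions", []),
    }
```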
---
## B. Exit Codes
| Code | Meaning | Retryable |
|---|---|---|
| 0 | Success | N/A |
| 1 | Internal error / not implemented | Maybe |
| 2 | Usage error (invalid flags or arguments) | No (fix syntax) |
| 3 | Config invalid | No (fix config) |
| 4 | Token not set | No (set token) |
| 5 | GitLab auth failed | Maybe (token expired?) |
| 6 | Resource not found (HTTP 404) | No |
| 7 | Rate limited | Yes (wait) |
| 8 | Network error | Yes (retry) |
| 9 | Database locked | Yes (wait) |
| 10 | Database error | Maybe |
| 11 | Migration failed | No (investigate) |
| 12 | I/O error | Maybe |
| 13 | Transform error | No (bug) |
| 14 | Ollama unavailable | Yes (start Ollama) |
| 15 | Ollama model not found | No (pull model) |
| 16 | Embedding failed | Yes (retry) |
| 17 | Not found (entity does not exist) | No |
| 18 | Ambiguous match (use `-p` to specify project) | No (be specific) |
| 19 | Health check failed | Yes (fix issues first) |
| 20 | Config not found | No (run init) |
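An agent could turn the Retryable column into a simple decision helper. The grouping below is copied from the table; the function name and the returned labels are illustrative assumptions.

```python
# Exit codes grouped by the "Retryable" column of the table above.
RETRYABLE = {7, 8, 9, 14, 16, 19}  # "Yes" rows
MAYBE_RETRYABLE = {1, 5, 10, 12}   # "Maybe" rows


def retry_decision(exit_code: int) -> str:
    """Classify an exit code as 'ok', 'retry', 'maybe' or 'fail'."""
    if exit_code == 0:
        return "ok"
    if exit_code in RETRYABLE:
        return "retry"
    if exit_code in MAYBE_RETRYABLE:
        return "maybe"
    return "fail"
```

Codes 14 and 19 are listed as retryable only after an intervention (starting Ollama, fixing the reported issues), so a blind retry loop would not help for those two.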
---
## C. Field Selection Presets
The `--fields` flag supports both presets and custom field lists:
```bash
lore -J issues --fields minimal # Preset
lore -J mrs --fields iid,title,state,draft # Custom comma-separated
```
| Command | Minimal Preset Fields |
|---|---|
| `issues` (list) | `iid`, `title`, `state`, `updated_at_iso` |
| `mrs` (list) | `iid`, `title`, `state`, `updated_at_iso` |
| `notes` (list) | `id`, `author_username`, `body`, `created_at_iso` |
| `search` | `document_id`, `title`, `source_type`, `score` |
| `timeline` | `timestamp`, `type`, `entity_iid`, `detail` |
| `who expert` | `username`, `score` |
| `who workload` | `iid`, `title`, `state` |
| `who reviews` | `name`, `count`, `percentage` |
| `who active` | `entity_type`, `iid`, `title`, `participants` |
| `who overlap` | `username`, `touch_count` |
| `me` (items) | `iid`, `title`, `attention_state`, `updated_at_iso` |
| `me` (activity) | `timestamp_iso`, `event_type`, `entity_iid`, `actor` |
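Conceptually, `--fields` resolves either a preset name or a comma-separated list and then projects each row onto those keys. The sketch below shows that idea with three presets copied from the table; the helper names are hypothetical and the real implementation may differ.

```python
# A few presets copied from the table above (not the full set).
MINIMAL_PRESETS = {
    "issues": ["iid", "title", "state", "updated_at_iso"],
    "mrs": ["iid", "title", "state", "updated_at_iso"],
    "who expert": ["username", "score"],
}


def resolve_fields(command: str, spec: str) -> list[str]:
    """Turn a --fields value into a concrete list of field names."""
    if spec == "minimal":
        return MINIMAL_PRESETS[command]
    return [name.strip() for name in spec.split(",") if name.strip()]


def project(rows: list[dict], fields: list[str]) -> list[dict]:
    """Keep only the requested keys, silently skipping ones a row lacks."""
    return [{name: row[name] for name in fields if name in row} for row in rows]
```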
---
## D. Configuration Precedence
1. CLI flags (highest priority)
2. Environment variables (`LORE_ROBOT`, `GITLAB_TOKEN`, `LORE_CONFIG_PATH`)
3. Config file (`~/.config/lore/config.json`)
4. Built-in defaults (lowest priority)
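The precedence order amounts to a first-match lookup across four sources. A minimal sketch follows, assuming a flat config mapping; the `file_key` names are hypothetical since the config file layout is not described here.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    file_config: Mapping[str, str],
    file_key: str,
    default: str,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first value found: CLI flag, env var, config file, default."""
    if cli_value is not None:
        return cli_value
    if env_name in env:
        return env[env_name]
    if file_key in file_config:
        return file_config[file_key]
    return default
```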
---
## E. Time Parsing
All commands accepting `--since`, `--until`, `--as-of` support:
| Format | Example | Meaning |
|---|---|---|
| Relative days | `7d` | 7 days ago |
| Relative weeks | `2w` | 2 weeks ago |
| Relative months | `1m`, `6m` | 1 month ago, 6 months ago |
| Absolute date | `2026-01-15` | Specific date |
Internally converted to Unix milliseconds for DB queries.
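The conversion can be sketched as below. Two details are assumptions not stated above: a month is treated as 30 days, and an absolute date is interpreted as midnight UTC. `parse_since` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: a relative month is approximated as 30 days.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30}


def parse_since(value: str, now: datetime) -> int:
    """Convert '7d', '2w', '1m' or 'YYYY-MM-DD' to Unix milliseconds."""
    match = re.fullmatch(r"(\d+)([dwm])", value)
    if match:
        days = int(match.group(1)) * _UNIT_DAYS[match.group(2)]
        moment = now - timedelta(days=days)
    else:
        # Assumption: absolute dates mean midnight UTC on that day.
        moment = datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)
```

Passing `now` explicitly keeps the function deterministic, which makes relative inputs easy to test.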
---
## F. Database Schema (28 migrations)
### Primary Entity Tables
| Table | Key Columns | Notes |
|---|---|---|
| `projects` | `gitlab_project_id`, `path_with_namespace`, `web_url` | No `name` or `last_seen_at` |
| `issues` | `iid`, `title`, `state`, `author_username`, 5 status columns | Status columns nullable (migration 021) |
| `merge_requests` | `iid`, `title`, `state`, `draft`, `source_branch`, `target_branch` | `last_seen_at INTEGER NOT NULL` |
| `discussions` | `gitlab_discussion_id` (text), `issue_id`/`merge_request_id` | One FK must be set |
| `notes` | `gitlab_id`, `author_username`, `body`, DiffNote position columns | `type` column for DiffNote/DiscussionNote |
### Relationship Tables
| Table | Purpose |
|---|---|
| `issue_labels`, `mr_labels` | Label junction (DELETE+INSERT for stale removal) |
| `issue_assignees`, `mr_assignees` | Assignee junction |
| `mr_reviewers` | Reviewer junction |
| `entity_references` | Cross-refs: closes, mentioned, related (with `source_method`) |
| `mr_file_changes` | File diffs: old_path, new_path, change_type |
### Event Tables
| Table | Constraint |
|---|---|
| `resource_state_events` | CHECK: exactly one of issue_id/merge_request_id NOT NULL |
| `resource_label_events` | Same CHECK constraint; `label_name` nullable (migration 012) |
| `resource_milestone_events` | Same CHECK constraint; `milestone_title` nullable |
### Document/Search Pipeline
| Table | Purpose |
|---|---|
| `documents` | Unified searchable content (source_type: issue/merge_request/discussion) |
| `documents_fts` | FTS5 virtual table for text search |
| `documents_fts_docsize` | FTS5 shadow B-tree (19x faster for COUNT) |
| `document_labels` | Fast label filtering (indexed exact-match) |
| `document_paths` | File path association for DiffNote filtering |
| `embeddings` | vec0 virtual table; rowid = document_id * 1000 + chunk_index |
| `embedding_metadata` | Chunk provenance + staleness tracking (document_hash) |
| `dirty_sources` | Documents needing regeneration (with backoff via next_attempt_at) |
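The `embeddings` rowid formula packs a document ID and a chunk index into one integer. The sketch below shows the encoding and its inverse; the bound of fewer than 1000 chunks per document is inferred from the formula (it is what keeps decoding unambiguous), not stated in the schema notes.

```python
CHUNK_STRIDE = 1000  # multiplier from the formula rowid = document_id * 1000 + chunk_index


def encode_rowid(document_id: int, chunk_index: int) -> int:
    """Pack (document_id, chunk_index) into a single rowid."""
    if not 0 <= chunk_index < CHUNK_STRIDE:
        # Inferred constraint: a larger index would collide with the next document.
        raise ValueError(f"chunk_index must be in [0, {CHUNK_STRIDE}), got {chunk_index}")
    return document_id * CHUNK_STRIDE + chunk_index


def decode_rowid(rowid: int) -> tuple[int, int]:
    """Recover (document_id, chunk_index) from a rowid."""
    return divmod(rowid, CHUNK_STRIDE)
```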
### Infrastructure
| Table | Purpose |
|---|---|
| `sync_runs` | Sync history with metrics |
| `sync_cursors` | Per-resource sync position (updated_at cursor + tie_breaker_id) |
| `app_locks` | Crash-safe single-flight lock |
| `raw_payloads` | Raw JSON storage for debugging |
| `pending_discussion_fetches` | Dependent discussion fetch queue |
| `pending_dependent_fetches` | Job queue for resource_events, mr_closes, mr_diffs |
| `schema_version` | Migration tracking |
---
## G. Glossary
| Term | Definition |
|---|---|
| **IID** | Issue/MR number within a project (not globally unique) |
| **FTS5** | SQLite full-text search extension (BM25 ranking) |
| **vec0** | SQLite extension for vector similarity search |
| **RRF** | Reciprocal Rank Fusion — combines FTS and vector rankings |
| **DiffNote** | Comment attached to a specific line in a merge request diff |
| **Entity reference** | Cross-reference between issues/MRs (closes, mentioned, related) |
| **Rename chain** | BFS traversal of mr_file_changes to follow file renames |
| **Attention state** | Computed field on `me` items: needs_attention, not_started, stale, etc. |
| **Surgical sync** | Fetching specific entities by IID instead of full incremental sync |
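For readers unfamiliar with RRF, the standard formulation sums `1 / (k + rank)` for each document across all rankings. The sketch below uses `k = 60`, a commonly used default; the constant and the tie-breaking rule `lore` actually uses are not documented here, so treat both as assumptions.

```python
def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists into one list sorted by RRF score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by ID for a deterministic order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

A document that appears in both the FTS and the vector ranking accumulates two terms, which is why it tends to outrank documents found by only one of them.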