gitlore/docs/command-surface-analysis/03-pipeline-and-infra.md
teernisse 3f38b3fda7 docs: add comprehensive command surface analysis
Deep analysis of the full `lore` CLI command surface (34 commands across
6 categories) covering command inventory, data flow, overlap analysis,
and optimization proposals.

Document structure:
- Main consolidated doc: docs/command-surface-analysis.md (1251 lines)
- Split sections in docs/command-surface-analysis/ for navigation:
  00-overview.md      - Summary, inventory, priorities
  01-entity-commands.md   - issues, mrs, notes, search, count
  02-intelligence-commands.md - who, timeline, me, file-history, trace, related, drift
  03-pipeline-and-infra.md    - sync, ingest, generate-docs, embed, diagnostics
  04-data-flow.md     - Shared data source map, command network graph
  05-overlap-analysis.md  - Quantified overlap percentages for every command pair
  06-agent-workflows.md   - Common agent flows, round-trip costs, token profiles
  07-consolidation-proposals.md  - 5 proposals to reduce 34 commands to 29
  08-robot-optimization-proposals.md - 6 proposals for --include, --batch, --depth
  09-appendices.md    - Robot output envelope, field presets, exit codes

Key findings:
- High overlap pairs: who-workload/me (~85%), health/doctor (~90%)
- 5 consolidation proposals to reduce command count by 15%
- 6 robot-mode optimization proposals targeting agent round-trip reduction
- Full DB table mapping and data flow documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 00:08:31 -05:00


# Pipeline & Infrastructure Commands
Reference for: `sync`, `ingest`, `generate-docs`, `embed`, `health`, `auth`, `doctor`, `status`, `stats`, `init`, `token`, `cron`, `migrate`, `version`, `completions`, `robot-docs`

---
## Data Pipeline
### `sync` (Full Pipeline)
Complete sync: ingest -> generate-docs -> embed.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--full` | flag | — | Full re-sync (reset cursors) |
| `-f, --force` | flag | — | Override stale lock |
| `--no-embed` | flag | — | Skip embedding |
| `--no-docs` | flag | — | Skip doc generation |
| `--no-events` | flag | — | Skip resource events |
| `--no-file-changes` | flag | — | Skip MR file changes |
| `--no-status` | flag | — | Skip work-item status enrichment |
| `--dry-run` | flag | — | Preview without changes |
| `-t, --timings` | flag | — | Show timing breakdown |
| `--lock` | flag | — | Acquire file lock |
| `--issue` | int[] | — | Surgically sync specific issues (repeatable) |
| `--mr` | int[] | — | Surgically sync specific MRs (repeatable) |
| `-p, --project` | string | — | Required with `--issue`/`--mr` |
| `--preflight-only` | flag | — | Validate without DB writes |
**Stages:** GitLab REST ingest -> GraphQL status enrichment -> document generation -> Ollama embedding

**Surgical sync:** `lore sync --issue 42 --mr 99 -p group/repo` fetches only the listed issues and MRs from the given project.
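
For scripted callers, the flag rules above can be sketched as a small argv builder. This is a minimal illustration, not part of `lore`: `build_sync_argv` and its parameter names are hypothetical, and only flags listed in the table are emitted.

```python
from typing import List, Optional, Sequence


def build_sync_argv(
    issues: Sequence[int] = (),
    mrs: Sequence[int] = (),
    project: Optional[str] = None,
    no_embed: bool = False,
    no_docs: bool = False,
    dry_run: bool = False,
) -> List[str]:
    """Hypothetical helper: assemble an argv list for `lore sync`."""
    if (issues or mrs) and not project:
        # The flag table marks -p/--project as required with --issue/--mr.
        raise ValueError("-p/--project is required with --issue/--mr")

    argv = ["lore", "sync"]
    for iid in issues:
        argv += ["--issue", str(iid)]  # repeatable flag, one per issue
    for iid in mrs:
        argv += ["--mr", str(iid)]  # repeatable flag, one per MR
    if project:
        argv += ["-p", project]
    if no_docs:
        argv.append("--no-docs")
    if no_embed:
        argv.append("--no-embed")
    if dry_run:
        argv.append("--dry-run")
    return argv


# Mirrors the surgical sync example from the text.
SURGICAL = build_sync_argv(issues=[42], mrs=[99], project="group/repo")
```

The resulting list can be handed to `subprocess.run` as-is. Rejecting a surgical request without a project before spawning the process saves a round trip for the caller.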
### `ingest`
Fetch data from GitLab API only (no docs, no embeddings).
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `[ENTITY]` | positional | — | `issues` or `mrs` (omit for all) |
| `-p, --project` | string | — | Single project |
| `-f, --force` | flag | — | Override stale lock |
| `--full` | flag | — | Full re-sync |
| `--dry-run` | flag | — | Preview without changes |
**Fetches from GitLab:**
- Issues + discussions + notes
- MRs + discussions + notes
- Resource events (state, label, milestone)
- MR file changes (for DiffNote tracking)
- Work-item statuses (via GraphQL)
### `generate-docs`
Create searchable documents from ingested data.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--full` | flag | — | Full rebuild |
| `-p, --project` | string | — | Single project rebuild |
**Writes:** `documents`, `document_labels`, `document_paths`
### `embed`
Generate vector embeddings via Ollama.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--full` | flag | — | Re-embed all |
| `--retry-failed` | flag | — | Retry failed embeddings |
**Requires:** Ollama running with `nomic-embed-text`

**Writes:** `embeddings`, `embedding_metadata`

---
## Diagnostics
### `health`
Quick pre-flight check (~50ms). Exit 0 = healthy, exit 19 = unhealthy.
**Checks:** config found, DB found, schema version current.
```json
{
  "ok": true,
  "data": {
    "healthy": true,
    "config_found": true, "db_found": true,
    "schema_current": true, "schema_version": 28
  }
}
```
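
A caller that gates other work on `health` might interpret the exit code and payload roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the sample payload is hand-written rather than captured from a real run, and only the `data` fields shown above are read.

```python
import json
from typing import List

# Field names taken from the example payload above.
HEALTH_CHECKS = ("config_found", "db_found", "schema_current")


def describe_exit(code: int) -> str:
    """Map the documented exit codes to a label (hypothetical helper)."""
    if code == 0:
        return "healthy"
    if code == 19:
        return "unhealthy"
    return "unexpected"  # other codes are not described in this section


def failed_health_checks(payload: str) -> List[str]:
    """Return the names of the checks reported as false."""
    data = json.loads(payload)["data"]
    return [name for name in HEALTH_CHECKS if not data.get(name, False)]


# Hand-written sample; envelope fields other than "data" are omitted.
SAMPLE_UNHEALTHY = json.dumps({
    "data": {
        "healthy": False,
        "config_found": True, "db_found": False,
        "schema_current": False, "schema_version": 27,
    }
})
FAILED = failed_health_checks(SAMPLE_UNHEALTHY)
```

Here `FAILED` lists `db_found` and `schema_current`, which tells the caller what to fix before retrying.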
### `auth`
Verify GitLab authentication.
**Checks:** token set, GitLab reachable, user identity.
### `doctor`
Comprehensive environment check.
**Checks:** config validity, token, GitLab connectivity, DB health, migration status, Ollama availability + model status.
```json
{
  "ok": true,
  "data": {
    "config": { "valid": true, "path": "~/.config/lore/config.json" },
    "token": { "set": true, "gitlab": { "reachable": true, "user": "jdoe" } },
    "database": { "exists": true, "version": 28, "tables": 25 },
    "ollama": { "available": true, "model_ready": true }
  }
}
```
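
The `doctor` payload nests one object per subsystem, so a consumer can reduce it to a flat list of problems. The sketch below assumes the payload shape shown above. The function name and the problem labels are hypothetical, and missing nested keys are treated as failures.

```python
import json
from typing import List


def doctor_problems(payload: str) -> List[str]:
    """Collect human-readable problems from a doctor payload (hypothetical helper)."""
    data = json.loads(payload)["data"]
    problems = []

    if not data.get("config", {}).get("valid", False):
        problems.append("config invalid")

    token = data.get("token", {})
    if not token.get("set", False):
        problems.append("token not set")
    elif not token.get("gitlab", {}).get("reachable", False):
        # Reachability only matters once a token is present.
        problems.append("GitLab unreachable")

    if not data.get("database", {}).get("exists", False):
        problems.append("database missing")

    ollama = data.get("ollama", {})
    if not ollama.get("available", False):
        problems.append("Ollama unavailable")
    elif not ollama.get("model_ready", False):
        problems.append("embedding model not ready")

    return problems


# Hand-written sample: everything passes except the Ollama model.
SAMPLE_DOCTOR = json.dumps({
    "ok": True,
    "data": {
        "config": {"valid": True, "path": "~/.config/lore/config.json"},
        "token": {"set": True, "gitlab": {"reachable": True, "user": "jdoe"}},
        "database": {"exists": True, "version": 28, "tables": 25},
        "ollama": {"available": True, "model_ready": False},
    },
})
PROBLEMS = doctor_problems(SAMPLE_DOCTOR)
```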
### `status` (alias: `st`)
Show sync state per project.
```json
{
  "ok": true,
  "data": {
    "projects": [
      {
        "project_path": "group/repo",
        "last_synced_at": "2026-02-26T10:00:00Z",
        "document_count": 5000, "discussion_count": 2000, "notes_count": 15000
      }
    ]
  }
}
```
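
One use of this payload is deciding which projects need a fresh sync. The sketch below flags projects whose `last_synced_at` is older than a caller-chosen threshold. `stale_projects` is a hypothetical helper, and treating a missing or null timestamp as stale is an assumption, not documented behavior.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import List


def stale_projects(payload: str, now: datetime, max_age: timedelta) -> List[str]:
    """Return project paths last synced more than max_age before now."""
    stale = []
    for project in json.loads(payload)["data"]["projects"]:
        stamp = project.get("last_synced_at")
        if stamp is None:
            # Assumption: a project without a timestamp has never synced.
            stale.append(project["project_path"])
            continue
        # Normalise the trailing "Z" so older Python versions can parse it.
        synced = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if now - synced > max_age:
            stale.append(project["project_path"])
    return stale


SAMPLE_STATUS = json.dumps({
    "ok": True,
    "data": {"projects": [{
        "project_path": "group/repo",
        "last_synced_at": "2026-02-26T10:00:00Z",
        "document_count": 5000, "discussion_count": 2000, "notes_count": 15000,
    }]},
})
NOON = datetime(2026, 2, 26, 12, 0, tzinfo=timezone.utc)
STALE = stale_projects(SAMPLE_STATUS, NOON, timedelta(hours=1))
```

With the sample synced at 10:00 UTC and `NOON` as the reference time, a one-hour threshold marks `group/repo` as stale, while a three-hour threshold does not.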
### `stats` (alias: `stat`)
Document and index statistics with optional integrity checks.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--check` | flag | — | Run integrity checks |
| `--repair` | flag | — | Fix issues (implies `--check`) |
| `--dry-run` | flag | — | Preview repairs |
```json
{
  "ok": true,
  "data": {
    "documents": { "total": 61652, "issues": 5000, "mrs": 2000, "notes": 50000 },
    "embeddings": { "total": 80000, "synced": 79500, "pending": 500, "failed": 0 },
    "fts": { "total_docs": 61652 },
    "queues": { "pending": 0, "in_progress": 0, "failed": 0, "max_attempts": 0 },
    "integrity": {
      "ok": true, "fts_doc_mismatch": 0, "orphan_embeddings": 0,
      "stale_metadata": 0, "orphan_state_events": 0
    }
  }
}
```
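
The counters in this payload lend themselves to two quick derived checks: embedding coverage and a list of non-zero integrity counters. The sketch below assumes `synced / total` is a meaningful coverage ratio and that `integrity` may be absent when `--check` is not passed. Both are assumptions, and the helper names are hypothetical.

```python
import json
from typing import Dict, Optional


def embedding_coverage(payload: str) -> Optional[float]:
    """Fraction of embeddings marked synced, or None when there are none."""
    emb = json.loads(payload)["data"]["embeddings"]
    if not emb["total"]:
        return None
    return emb["synced"] / emb["total"]


def integrity_issues(payload: str) -> Dict[str, int]:
    """Return only the integrity counters that are non-zero."""
    integrity = json.loads(payload)["data"].get("integrity") or {}
    return {key: value for key, value in integrity.items() if key != "ok" and value}


# Values copied from the example payload above (unused sections omitted).
SAMPLE_STATS = json.dumps({
    "ok": True,
    "data": {
        "embeddings": {"total": 80000, "synced": 79500, "pending": 500, "failed": 0},
        "integrity": {
            "ok": True, "fts_doc_mismatch": 0, "orphan_embeddings": 0,
            "stale_metadata": 0, "orphan_state_events": 0,
        },
    },
})
COVERAGE = embedding_coverage(SAMPLE_STATS)  # 79500 / 80000
ISSUES = integrity_issues(SAMPLE_STATS)
```

For the example numbers, coverage is 79500 / 80000 = 0.99375 and no integrity counters are reported.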
---
## Setup
### `init`
Initialize configuration and database.
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `-f, --force` | flag | — | Skip overwrite confirmation |
| `--non-interactive` | flag | — | Fail if prompts needed |
| `--gitlab-url` | string | — | GitLab base URL (required in robot mode) |
| `--token-env-var` | string | — | Env var holding token (required in robot mode) |
| `--projects` | string | — | Comma-separated project paths (required in robot mode) |
| `--default-project` | string | — | Default project path |
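
Because three of these flags are required in robot mode, an automated setup step can validate its inputs before invoking `init`. The builder below is a hypothetical sketch that uses only flags from the table. How robot mode itself is enabled is not covered in this section, so no such switch appears in the output.

```python
from typing import List, Optional, Sequence


def build_init_argv(
    gitlab_url: str,
    token_env_var: str,
    projects: Sequence[str],
    default_project: Optional[str] = None,
    force: bool = False,
) -> List[str]:
    """Hypothetical helper: assemble a prompt-free `lore init` invocation."""
    if not gitlab_url or not token_env_var or not projects:
        raise ValueError("gitlab_url, token_env_var and projects are all required")

    argv = [
        "lore", "init", "--non-interactive",
        "--gitlab-url", gitlab_url,
        "--token-env-var", token_env_var,
        "--projects", ",".join(projects),  # comma-separated per the table
    ]
    if default_project:
        argv += ["--default-project", default_project]
    if force:
        argv.append("--force")
    return argv


# Placeholder values for illustration only.
INIT_ARGV = build_init_argv(
    "https://gitlab.example.com", "GITLAB_TOKEN", ["group/repo", "group/other"]
)
```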
### `token`
| Subcommand | Flags | Purpose |
|---|---|---|
| `token set` | `--token <TOKEN>` | Store token (reads from stdin if `--token` is omitted) |
| `token show` | `--unmask` | Display token (masked by default) |
### `cron`
| Subcommand | Flags | Purpose |
|---|---|---|
| `cron install` | `--interval <MINUTES>` (default: 8) | Schedule auto-sync |
| `cron uninstall` | — | Remove cron job |
| `cron status` | — | Check installation |
### `migrate`
Run pending database migrations. No flags.

---
## Meta
| Command | Purpose |
|---|---|
| `version` | Show version string |
| `completions <shell>` | Generate shell completions (bash/zsh/fish/powershell) |
| `robot-docs` | Machine-readable command manifest (`--brief` gives ~60% smaller output) |