gitlore/docs/command-surface-analysis/06-agent-workflows.md
teernisse 3f38b3fda7 docs: add comprehensive command surface analysis
Deep analysis of the full `lore` CLI command surface (34 commands across
6 categories) covering command inventory, data flow, overlap analysis,
and optimization proposals.

Document structure:
- Main consolidated doc: docs/command-surface-analysis.md (1251 lines)
- Split sections in docs/command-surface-analysis/ for navigation:
  00-overview.md      - Summary, inventory, priorities
  01-entity-commands.md   - issues, mrs, notes, search, count
  02-intelligence-commands.md - who, timeline, me, file-history, trace, related, drift
  03-pipeline-and-infra.md    - sync, ingest, generate-docs, embed, diagnostics
  04-data-flow.md     - Shared data source map, command network graph
  05-overlap-analysis.md  - Quantified overlap percentages for every command pair
  06-agent-workflows.md   - Common agent flows, round-trip costs, token profiles
  07-consolidation-proposals.md  - 5 proposals to reduce 34 commands to 29
  08-robot-optimization-proposals.md - 6 proposals for --include, --batch, --depth
  09-appendices.md    - Robot output envelope, field presets, exit codes

Key findings:
- High overlap pairs: who-workload/me (~85%), health/doctor (~90%)
- 5 consolidation proposals to reduce command count by 15%
- 6 robot-mode optimization proposals targeting agent round-trip reduction
- Full DB table mapping and data flow documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 00:08:31 -05:00


Agent Workflow Analysis

Common agent workflows, round-trip costs, and token profiles.


1. Common Workflows

Flow 1: "What should I work on?" — 4 round trips

me                            → dashboard overview (which items need attention?)
issues <iid> -p proj          → detail on picked issue (full context + discussions)
trace src/relevant/file.rs    → understand code context (why was it written?)
who src/relevant/file.rs      → find domain experts (who can help?)

Total tokens (minimal): ~800 + ~2000 + ~1000 + ~400 = ~4200
Total tokens (full): ~3000 + ~6000 + ~1500 + ~800 = ~11300
Latency: 4 serial round trips

Flow 2: "What happened with this feature?" — 3 round trips

search "feature name"         → find relevant entities
timeline "feature name"       → reconstruct chronological history
related issues 42             → discover connected work

Total tokens (minimal): ~600 + ~1500 + ~400 = ~2500
Total tokens (full): ~2000 + ~5000 + ~1000 = ~8000
Latency: 3 serial round trips

Flow 3: "Why was this code changed?" — 3 round trips

trace src/file.rs             → file -> MR -> issue chain
issues <iid> -p proj          → full issue detail
timeline "issue:42"           → full history with cross-refs

Total tokens (minimal): ~800 + ~2000 + ~1500 = ~4300
Total tokens (full): ~1500 + ~6000 + ~5000 = ~12500
Latency: 3 serial round trips

Flow 4: "Is the system healthy?" — 2-4 round trips

health                        → quick pre-flight (pass/fail)
doctor                        → detailed diagnostics (if health fails)
status                        → sync state per project
stats                         → document/index health

Total tokens: ~100 + ~300 + ~200 + ~400 = ~1000
Latency: 2-4 serial round trips (often 1 if health passes)

Flow 5: "Who can review this?" — 2-3 round trips

who src/auth/                 → find file experts
who @jdoe --reviews           → check reviewer's patterns

Total tokens (minimal): ~300 + ~300 = ~600
Latency: 2 serial round trips

Flow 6: "Find and understand an issue" — 4 round trips

search "query"                → discover entities (get IIDs)
issues <iid>                  → full detail with discussions
timeline "issue:42"           → chronological context
related issues 42             → connected entities

Total tokens (minimal): ~600 + ~2000 + ~1500 + ~400 = ~4500
Total tokens (full): ~2000 + ~6000 + ~5000 + ~1000 = ~14000
Latency: 4 serial round trips


2. Token Cost Profiles

Measured typical response sizes in robot mode with default settings:

| Command | Typical tokens (full) | With --fields minimal | Dominant cost driver |
| --- | --- | --- | --- |
| me (all sections) | 2000-5000 | 500-1500 | Open items count |
| issues (list, n=50) | 1500-3000 | 400-800 | Labels arrays |
| issues <iid> (detail) | 1000-8000 | N/A (no minimal for detail) | Discussion depth |
| mrs <iid> (detail) | 1000-8000 | N/A | Discussion depth, DiffNote positions |
| timeline (limit=100) | 2000-6000 | 800-1500 | Event count + evidence |
| search (n=20) | 1000-3000 | 300-600 | Snippet length |
| who expert | 300-800 | 150-300 | Expert count |
| who workload | 500-1500 | 200-500 | Open items count |
| trace | 500-2000 | 300-800 | Chain depth |
| file-history | 300-1500 | 200-500 | MR count |
| related | 300-1000 | 200-400 | Result count |
| drift | 200-800 | N/A | Similarity curve length |
| notes (n=50) | 1500-5000 | 500-1000 | Body length |
| count | ~100 | N/A | Fixed structure |
| stats | ~500 | N/A | Fixed structure |
| health | ~100 | N/A | Fixed structure |
| doctor | ~300 | N/A | Fixed structure |
| status | ~200 | N/A | Project count |

Key Observations

  1. Detail commands are expensive. issues <iid> and mrs <iid> can hit 8000 tokens due to discussions. This is the content agents actually need, but most of it is discussion body text.

  2. me is the most-called command and ranges from 2000 to 5000 tokens. Agents often just need the answer to "do I have work?", which is ~100 tokens of summary counts.

  3. Lists with labels are wasteful. Every issue/MR in a list carries its full label array. With 50 items x 5 labels each, that's 250 strings of overhead.

  4. --fields minimal helps a lot — 50-70% reduction on list commands. But it's not available on detail views.

  5. Timeline scales linearly with event count and evidence notes. The --max-evidence flag helps cap the expensive part.
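The per-flow budgets in section 1 follow directly from these profiles. As a rough illustration (a sketch, not part of the CLI; the token figures below are midpoints of the ranges in the table above, not new measurements), a flow's cost is just the sum over its serial calls:

```python
# Back-of-envelope flow-cost calculator. Figures are midpoints of the
# ranges in the token-profile table; real costs vary with data volume.
TOKENS = {
    # command: (full output, --fields minimal)
    # detail views have no minimal preset, so both entries match
    "me": (3500, 1000),
    "issues-detail": (4500, 4500),
    "trace": (1250, 550),
    "who-expert": (550, 225),
}

def flow_tokens(commands, minimal=False):
    """Total token cost of a serial flow; bool indexes the tuple."""
    return sum(TOKENS[cmd][minimal] for cmd in commands)

# Flow 1 ("What should I work on?") at full depth vs minimal fields.
flow1 = ["me", "issues-detail", "trace", "who-expert"]
full_cost = flow_tokens(flow1)
lean_cost = flow_tokens(flow1, minimal=True)
```

Because detail views lack a minimal preset, they dominate the lean budget, which is exactly observation 4 above.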


3. Round-Trip Inefficiency Patterns

Pattern A: Discovery -> Detail (N+1)

Agent searches, gets 5 results, then needs detail on each:

search "auth bug"              → 5 results
issues 42 -p proj             → detail
issues 55 -p proj             → detail
issues 71 -p proj             → detail
issues 88 -p proj             → detail
issues 95 -p proj             → detail

6 round trips for what should be 2 (search + batch detail).
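The saving generalizes: for n discovery hits the current pattern costs 1 + n trips, while a batched detail call (a proposal, not a shipped flag) flattens it to a constant 2. A trivial sketch of that arithmetic:

```python
# Round-trip count for the N+1 pattern versus a batched alternative.
def n_plus_one_trips(n_results: int) -> int:
    # one search call, then one detail call per result
    return 1 + n_results

def batched_trips(n_results: int) -> int:
    # one search call, then a single batched detail call (if any results)
    return 1 + (1 if n_results else 0)
```

With the 5 results above, that is 6 trips today versus 2 with batching, and the gap widens linearly with result count.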

Pattern B: Detail -> Context Gathering

Agent gets issue detail, then needs timeline + related + trace:

issues 42 -p proj             → detail
timeline "issue:42" -p proj   → events
related issues 42 -p proj     → similar
trace src/file.rs -p proj     → code provenance

4 round trips for what should be 1 (detail with embedded context).

Pattern C: Health Check Cascade

Agent checks health, discovers issue, drills down:

health                         → unhealthy (exit 19)
doctor                         → token OK, Ollama missing
stats --check                  → 5 orphan embeddings
stats --repair                 → fixed

4 round trips but only 2 are actually needed (doctor covers health).

Pattern D: Dashboard -> Action

Agent checks dashboard, picks item, needs full context:

me                             → 5 open issues, 2 MRs
issues 42 -p proj             → picked issue detail
who src/auth/ -p proj         → expert for help
timeline "issue:42" -p proj   → history

4 round trips. With --include, could be 2 (me with inline detail + who).
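Serial trips dominate wall-clock time the way discussion bodies dominate context budget. The sketch below models a flow's latency under assumed, not measured, figures of 1.5 s per round trip and 2000 tokens/s of processing; only the trip counts come from the patterns above:

```python
def flow_seconds(trips: int, total_tokens: int,
                 rtt_s: float = 1.5, tok_per_s: float = 2000.0) -> float:
    """Wall-clock estimate: fixed per-trip latency plus token processing."""
    return trips * rtt_s + total_tokens / tok_per_s

# Pattern D today (4 trips) versus a 2-trip --include variant,
# holding total output tokens constant at Flow 1's full budget.
today = flow_seconds(trips=4, total_tokens=11300)
merged = flow_seconds(trips=2, total_tokens=11300)
```

Under these assumptions halving the trip count saves a flat 3 s regardless of payload, which is why the consolidation proposals target round trips first and token volume second.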


4. Optimized Workflow Vision

What the same workflows look like with proposed optimizations:

Flow 1 Optimized: "What should I work on?" — 2 round trips

me --depth titles              → 400 tokens: counts + item titles with attention_state
issues 42 --include timeline,trace  → 1 call: detail + events + code provenance

Flow 2 Optimized: "What happened with this feature?" — 1-2 round trips

search "feature" -n 5          → find entities
issues 42 --include timeline,related → everything in one call

Flow 3 Optimized: "Why was this code changed?" — 1 round trip

trace src/file.rs --include experts,timeline → full chain + experts + events

Flow 4 Optimized: "Is the system healthy?" — 1 round trip

doctor                         → covers health + auth + connectivity
# status + stats only if doctor reveals issues

Flow 6 Optimized: "Find and understand" — 2 round trips

search "query" -n 5            → discover entities
issues --batch 42,55,71 --include timeline → batch detail with events
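A client-side sketch of the Flow 6 invocation above. Note that --batch and --include are proposals from the companion docs (07/08), not shipped flags, so a real agent client should detect rejection and fall back to one issues <iid> call per result:

```python
def batch_issue_cmd(iids, project):
    """Build argv for the proposed batched detail call.

    The --batch and --include flags are consolidation proposals,
    not guaranteed to exist on any given build of the CLI.
    """
    return ["lore", "issues", "--batch", ",".join(str(i) for i in iids),
            "--include", "timeline", "-p", project]

argv = batch_issue_cmd([42, 55, 71], "proj")
```

Building argv as a list (rather than a shell string) keeps IIDs and project names safe from shell interpolation when the command is eventually executed.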