docs: add comprehensive command surface analysis
Deep analysis of the full `lore` CLI command surface (34 commands across 6 categories) covering command inventory, data flow, overlap analysis, and optimization proposals.

Document structure:

- Main consolidated doc: docs/command-surface-analysis.md (1251 lines)
- Split sections in docs/command-surface-analysis/ for navigation:
  - 00-overview.md - Summary, inventory, priorities
  - 01-entity-commands.md - issues, mrs, notes, search, count
  - 02-intelligence-commands.md - who, timeline, me, file-history, trace, related, drift
  - 03-pipeline-and-infra.md - sync, ingest, generate-docs, embed, diagnostics
  - 04-data-flow.md - Shared data source map, command network graph
  - 05-overlap-analysis.md - Quantified overlap percentages for every command pair
  - 06-agent-workflows.md - Common agent flows, round-trip costs, token profiles
  - 07-consolidation-proposals.md - 5 proposals to reduce 34 commands to 29
  - 08-robot-optimization-proposals.md - 6 proposals for --include, --batch, --depth
  - 09-appendices.md - Robot output envelope, field presets, exit codes

Key findings:

- High overlap pairs: who-workload/me (~85%), health/doctor (~90%)
- 5 consolidation proposals to reduce command count by 15%
- 6 robot-mode optimization proposals targeting agent round-trip reduction
- Full DB table mapping and data flow documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
docs/command-surface-analysis/08-robot-optimization-proposals.md (new file, 347 lines)
# Robot-Mode Optimization Proposals

6 proposals to reduce round trips and token waste for agent consumers.

---

## A. `--include` flag for embedded sub-queries (P0)

**Problem:** The #1 agent inefficiency. Every "understand this entity" workflow requires 3-4 serial round trips: detail + timeline + related + trace.

**Proposal:** Add `--include` flag to detail commands that embeds sub-query results in the response.

```bash
# Before: 4 round trips, ~12000 tokens
lore -J issues 42 -p proj
lore -J timeline "issue:42" -p proj --limit 20
lore -J related issues 42 -p proj -n 5
lore -J trace src/auth/ -p proj

# After: 1 round trip, ~5000 tokens (sub-queries use reduced limits)
lore -J issues 42 -p proj --include timeline,related
```
### Include Matrix

| Base Command | Valid Includes | Default Limits |
|---|---|---|
| `issues <iid>` | `timeline`, `related`, `trace` | 20 events, 5 related, 5 chains |
| `mrs <iid>` | `timeline`, `related`, `file-changes` | 20 events, 5 related |
| `trace <path>` | `experts`, `timeline` | 5 experts, 20 events |
| `me` | `detail` (inline top-N item details) | 3 items detailed |
| `search` | `detail` (inline top-N result details) | 3 results detailed |
### Response Shape

Included data uses `_` prefix to distinguish from base fields:

```json
{
  "ok": true,
  "data": {
    "iid": 42, "title": "Fix auth", "state": "opened",
    "discussions": [...],
    "_timeline": {
      "event_count": 15,
      "events": [...]
    },
    "_related": {
      "similar_entities": [...]
    }
  },
  "meta": {
    "elapsed_ms": 200,
    "_timeline_ms": 45,
    "_related_ms": 120
  }
}
```
### Error Handling

Sub-query errors are non-fatal. If Ollama is down, `_related` returns an error instead of failing the whole request:

```json
{
  "_related_error": "Ollama unavailable — related results skipped"
}
```
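The isolation pattern can be sketched as follows. This is a hypothetical illustration in Python (lore's implementation language isn't shown here), not the actual handler; `run_includes` and the fake sub-queries are stand-in names.

```python
def run_includes(includes, base_data):
    """Run each included sub-query; attach results under a `_` prefix.

    A failing sub-query attaches `_<name>_error` instead of failing
    the whole request.
    """
    for name, run in includes.items():
        try:
            base_data[f"_{name}"] = run()
        except Exception as exc:  # non-fatal by design
            base_data[f"_{name}_error"] = str(exc)
    return base_data

# Example: `timeline` succeeds; `related` depends on Ollama and fails.
def fake_timeline():
    return {"event_count": 2, "events": []}

def fake_related():
    raise RuntimeError("Ollama unavailable — related results skipped")

data = run_includes(
    {"timeline": fake_timeline, "related": fake_related},
    {"iid": 42, "title": "Fix auth"},
)
# data carries "_timeline" plus "_related_error", never a hard failure
```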
### Limit Control

```bash
# Custom limits for included data
lore -J issues 42 --include timeline:50,related:10
```
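Parsing the `name` / `name:limit` spec could look like this sketch (hypothetical helper; `DEFAULT_LIMITS` is an assumed name whose values mirror the Include Matrix for `issues`):

```python
# Default sub-query limits, mirroring the Include Matrix (assumed values).
DEFAULT_LIMITS = {"timeline": 20, "related": 5, "trace": 5}

def parse_include(spec: str) -> dict:
    """Parse '--include timeline:50,related:10' into {name: limit}."""
    includes = {}
    for part in filter(None, spec.split(",")):
        name, _, limit = part.partition(":")
        if name not in DEFAULT_LIMITS:
            raise ValueError(f"unknown include: {name!r}")
        # Bare names fall back to the default limit for that sub-query.
        includes[name] = int(limit) if limit else DEFAULT_LIMITS[name]
    return includes

# parse_include("timeline:50,related:10") -> {"timeline": 50, "related": 10}
# parse_include("timeline") -> {"timeline": 20}
```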
### Round-Trip Savings

| Workflow | Before | After | Savings |
|---|---|---|---|
| Understand an issue | 4 calls | 1 call | **75%** |
| Why was code changed | 3 calls | 1 call | **67%** |
| Find and understand | 4 calls | 2 calls | **50%** |

**Effort:** High. Each include needs its own sub-query executor, error isolation, and limit enforcement. But the payoff is massive — this single feature halves agent round trips.

---
## B. `--depth` control on `me` (P0)

**Problem:** `me` returns 2000-5000 tokens. Agents checking "do I have work?" only need ~100 tokens.

**Proposal:** Add `--depth` flag with three levels.

```bash
# Counts only (~100 tokens) — "do I have work?"
lore -J me --depth counts

# Titles (~400 tokens) — "what work do I have?"
lore -J me --depth titles

# Full (current behavior, 2000+ tokens) — "give me everything"
lore -J me --depth full
lore -J me # same as --depth full
```
### Depth Levels

| Level | Includes | Typical Tokens |
|---|---|---|
| `counts` | `summary` block only (counts, no items) | ~100 |
| `titles` | summary + item lists with minimal fields (iid, title, attention_state) | ~400 |
| `full` | Everything: items, activity, inbox, discussions | ~2000-5000 |
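Gating serialization by depth level might look like this sketch (hypothetical; the field names follow the Depth Levels table, but `serialize_me` and its exact list keys are assumptions):

```python
# Minimal fields surfaced at --depth titles, per the table above.
TITLE_FIELDS = ("iid", "title", "attention_state")

def serialize_me(full: dict, depth: str = "full") -> dict:
    """Reduce a full `me` payload according to --depth."""
    if depth == "full":
        return full
    # counts: summary block only, no item lists.
    out = {"username": full["username"], "summary": full["summary"]}
    if depth == "titles":
        # titles: keep the item lists but strip each item to minimal fields.
        for key in ("open_issues", "open_mrs_authored", "reviewing_mrs"):
            out[key] = [
                {f: item[f] for f in TITLE_FIELDS} for item in full.get(key, [])
            ]
    return out
```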
### Response at `--depth counts`

```json
{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": {
      "project_count": 3,
      "open_issue_count": 5,
      "authored_mr_count": 2,
      "reviewing_mr_count": 1,
      "needs_attention_count": 3
    }
  }
}
```
### Response at `--depth titles`

```json
{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": { ... },
    "open_issues": [
      { "iid": 42, "title": "Fix auth", "attention_state": "needs_attention" }
    ],
    "open_mrs_authored": [
      { "iid": 99, "title": "Refactor auth", "attention_state": "needs_attention" }
    ],
    "reviewing_mrs": []
  }
}
```

**Effort:** Low. The data is already available; just need to gate serialization by depth level.

---
## C. `--batch` flag for multi-entity detail (P1)

**Problem:** After search/timeline, agents discover N entity IIDs and need detail on each. Currently N round trips.

**Proposal:** Add `--batch` flag to `issues` and `mrs` detail mode.

```bash
# Before: 3 round trips
lore -J issues 42 -p proj
lore -J issues 55 -p proj
lore -J issues 71 -p proj

# After: 1 round trip
lore -J issues --batch 42,55,71 -p proj
```
### Response

```json
{
  "ok": true,
  "data": {
    "results": [
      { "iid": 42, "title": "Fix auth", "state": "opened", ... },
      { "iid": 55, "title": "Add SSO", "state": "opened", ... },
      { "iid": 71, "title": "Token refresh", "state": "closed", ... }
    ],
    "errors": [
      { "iid": 99, "error": "Not found" }
    ]
  }
}
```
### Constraints

- Max 20 IIDs per batch
- Individual errors don't fail the batch (partial results returned)
- Works with `--include` for maximum efficiency: `--batch 42,55 --include timeline`
- Works with `--fields minimal` for token control

**Effort:** Medium. Need to loop the existing detail handler and compose results.
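Composing the batch from the existing single-entity handler could be sketched like this (hypothetical; `fetch_detail` stands in for the real detail handler, and the cap matches the 20-IID constraint above):

```python
MAX_BATCH = 20  # matches the "Max 20 IIDs per batch" constraint

def batch_detail(iids, fetch_detail):
    """Fetch each IID via the single-entity handler; collect partial results.

    Individual lookup failures land in `errors` instead of failing the batch.
    """
    if len(iids) > MAX_BATCH:
        raise ValueError(f"--batch accepts at most {MAX_BATCH} IIDs")
    results, errors = [], []
    for iid in iids:
        try:
            results.append(fetch_detail(iid))
        except LookupError as exc:
            errors.append({"iid": iid, "error": str(exc)})
    return {"results": results, "errors": errors}
```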
---
## D. Composite `context` command (P2)

**Problem:** Agents need full context on an entity but must learn `--include` syntax. A purpose-built command is more discoverable.

**Proposal:** Add `context` command that returns detail + timeline + related in one call.

```bash
lore -J context issues 42 -p proj
lore -J context mrs 99 -p proj
```
### Equivalent To

```bash
lore -J issues 42 -p proj --include timeline,related
```

But with optimized defaults:

- Timeline: 20 most recent events, max 3 evidence notes
- Related: top 5 entities
- Discussions: truncated after 5 threads
- Non-fatal: Ollama-dependent parts gracefully degrade

### Response Shape

Same as `issues <iid> --include timeline,related` but with the reduced defaults applied.

### Relationship to `--include`

`context` is sugar for the most common `--include` pattern. Both mechanisms can coexist:

- `context` for the 80% case (agents wanting full entity understanding)
- `--include` for custom combinations

**Effort:** Medium. Thin wrapper around detail + include pipeline.

---
## E. `--max-tokens` response budget (P3)

**Problem:** Response sizes vary wildly (100 to 8000 tokens). Agents can't predict cost in advance.

**Proposal:** Let agents cap response size; output is truncated to fit.

```bash
lore -J me --max-tokens 500
lore -J timeline "feature" --max-tokens 1000
lore -J context issues 42 --max-tokens 2000
```
### Truncation Strategy (priority order)

1. Apply `--fields minimal` if not already set
2. Reduce array lengths (newest/highest-score items survive)
3. Truncate string fields (descriptions, snippets) to 200 chars
4. Omit null/empty fields
5. Drop included sub-queries (if using `--include`)
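The loop could be structured as a pipeline of progressively more aggressive passes. The sketch below implements only two of the five passes (string truncation and include-dropping) to show the shape; `fit_to_budget` and its pass labels are hypothetical names, and `estimate_tokens` uses the chars/4 heuristic.

```python
import json

def estimate_tokens(payload: dict) -> int:
    # Rough heuristic: ~4 JSON characters per token.
    return len(json.dumps(payload)) // 4

def truncate_strings(data: dict, limit: int = 200) -> dict:
    # Pass 3: cap string fields (descriptions, snippets) at `limit` chars.
    return {k: (v[:limit] if isinstance(v, str) else v) for k, v in data.items()}

def drop_includes(data: dict) -> dict:
    # Pass 5: drop embedded sub-queries (keys with the `_` prefix).
    return {k: v for k, v in data.items() if not k.startswith("_")}

def fit_to_budget(response: dict, budget: int) -> dict:
    """Apply truncation passes in priority order until under budget."""
    passes = [("strings[200]", truncate_strings), ("_includes", drop_includes)]
    dropped = []
    original = estimate_tokens(response)
    for label, fn in passes:
        if estimate_tokens(response) <= budget:
            break
        response = {**response, "data": fn(response["data"])}
        dropped.append(label)
    response.setdefault("meta", {}).update(
        truncated=bool(dropped), original_tokens=original, budget_tokens=budget
    )
    response["meta"]["dropped"] = dropped
    return response
```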
### Meta Notice

```json
{
  "meta": {
    "elapsed_ms": 50,
    "truncated": true,
    "original_tokens": 3500,
    "budget_tokens": 1000,
    "dropped": ["_related", "discussions[5:]", "activity[10:]"]
  }
}
```
### Implementation Notes

Token estimation: rough heuristic based on JSON character count / 4. Doesn't need to be exact — the goal is "roughly this size" not "exactly N tokens."

**Effort:** High. Requires token estimation, progressive truncation logic, and tracking what was dropped.

---
## F. `--format tsv` for list commands (P3)

**Problem:** JSON is verbose for tabular data. List commands return arrays of objects with repeated key names.

**Proposal:** Add `--format tsv` for list commands.

```bash
lore -J issues --format tsv --fields iid,title,state -n 10
```
### Output

```
iid	title	state
42	Fix auth	opened
55	Add SSO	opened
71	Token refresh	closed
```
### Token Savings

| Command | JSON tokens | TSV tokens | Savings |
|---|---|---|---|
| `issues -n 50 --fields minimal` | ~800 | ~250 | **69%** |
| `mrs -n 50 --fields minimal` | ~800 | ~250 | **69%** |
| `who expert -n 10` | ~300 | ~100 | **67%** |
| `notes -n 50 --fields minimal` | ~1000 | ~350 | **65%** |
### Applicable Commands

TSV works well for flat, tabular data:

- `issues` (list), `mrs` (list), `notes` (list)
- `who expert`, `who overlap`, `who reviews`
- `count`

TSV does NOT work for nested/complex data:

- Detail views (discussions are nested)
- Timeline (events have nested evidence)
- Search (nested explain, labels arrays)
- `me` (multiple sections)
### Agent Parsing

Most LLMs parse TSV naturally. Agents that need structured data can still use JSON.

**Effort:** Medium. Tab-separated serialization for flat structs is straightforward. Need to handle escaping for body text containing tabs/newlines.
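The escaping concern can be handled by replacing literal tabs and newlines inside field values, as in this sketch (hypothetical helpers; whether lore would escape, strip, or quote such characters is an open design choice):

```python
def escape_field(value) -> str:
    # Tabs and newlines are column/row delimiters in TSV,
    # so escape them inside field values.
    return str(value).replace("\t", "\\t").replace("\n", "\\n")

def to_tsv(rows: list, fields: list) -> str:
    """Serialize a flat list of dicts as a TSV table with a header row."""
    lines = ["\t".join(fields)]
    for row in rows:
        lines.append("\t".join(escape_field(row.get(f, "")) for f in fields))
    return "\n".join(lines)

# to_tsv([{"iid": 42, "title": "Fix auth", "state": "opened"}],
#        ["iid", "title", "state"])
# -> "iid\ttitle\tstate\n42\tFix auth\topened"
```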
---
## Impact Summary

| Optimization | Priority | Effort | Round-Trip Savings | Token Savings |
|---|---|---|---|---|
| `--include` | P0 | High | **50-75%** | Moderate |
| `--depth` on `me` | P0 | Low | None | **60-80%** |
| `--batch` | P1 | Medium | **N-1 per batch** | Moderate |
| `context` command | P2 | Medium | **67-75%** | Moderate |
| `--max-tokens` | P3 | High | None | **Variable** |
| `--format tsv` | P3 | Medium | None | **65-69% on lists** |
### Implementation Order

1. **`--depth` on `me`** — lowest effort, high value, no risk
2. **`--include` on `issues`/`mrs` detail** — highest impact, start with `timeline` include only
3. **`--batch`** — eliminates N+1 pattern
4. **`context` command** — sugar on top of `--include`
5. **`--format tsv`** — nice-to-have, easy to add incrementally
6. **`--max-tokens`** — complex, defer until demand is clear