gitlore/docs/command-surface-analysis/08-robot-optimization-proposals.md
teernisse 3f38b3fda7 docs: add comprehensive command surface analysis
Deep analysis of the full `lore` CLI command surface (34 commands across
6 categories) covering command inventory, data flow, overlap analysis,
and optimization proposals.

Document structure:
- Main consolidated doc: docs/command-surface-analysis.md (1251 lines)
- Split sections in docs/command-surface-analysis/ for navigation:
  00-overview.md      - Summary, inventory, priorities
  01-entity-commands.md   - issues, mrs, notes, search, count
  02-intelligence-commands.md - who, timeline, me, file-history, trace, related, drift
  03-pipeline-and-infra.md    - sync, ingest, generate-docs, embed, diagnostics
  04-data-flow.md     - Shared data source map, command network graph
  05-overlap-analysis.md  - Quantified overlap percentages for every command pair
  06-agent-workflows.md   - Common agent flows, round-trip costs, token profiles
  07-consolidation-proposals.md  - 5 proposals to reduce 34 commands to 29
  08-robot-optimization-proposals.md - 6 proposals for --include, --batch, --depth
  09-appendices.md    - Robot output envelope, field presets, exit codes

Key findings:
- High overlap pairs: who-workload/me (~85%), health/doctor (~90%)
- 5 consolidation proposals to reduce command count by 15%
- 6 robot-mode optimization proposals targeting agent round-trip reduction
- Full DB table mapping and data flow documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 00:08:31 -05:00


Robot-Mode Optimization Proposals

Six proposals to reduce round trips and token waste for agent consumers.


A. --include flag for embedded sub-queries (P0)

Problem: The #1 agent inefficiency. Every "understand this entity" workflow requires 3-4 serial round trips: detail + timeline + related + trace.

Proposal: Add --include flag to detail commands that embeds sub-query results in the response.

# Before: 4 round trips, ~12000 tokens
lore -J issues 42 -p proj
lore -J timeline "issue:42" -p proj --limit 20
lore -J related issues 42 -p proj -n 5
lore -J trace src/auth/ -p proj

# After: 1 round trip, ~5000 tokens (sub-queries use reduced limits)
lore -J issues 42 -p proj --include timeline,related

Include Matrix

| Base Command | Valid Includes | Default Limits |
| --- | --- | --- |
| issues <iid> | timeline, related, trace | 20 events, 5 related, 5 chains |
| mrs <iid> | timeline, related, file-changes | 20 events, 5 related |
| trace <path> | experts, timeline | 5 experts, 20 events |
| me | detail (inline top-N item details) | 3 items detailed |
| search | detail (inline top-N result details) | 3 results detailed |

Response Shape

Included data uses a _ prefix to distinguish it from base fields:

{
  "ok": true,
  "data": {
    "iid": 42, "title": "Fix auth", "state": "opened",
    "discussions": [...],
    "_timeline": {
      "event_count": 15,
      "events": [...]
    },
    "_related": {
      "similar_entities": [...]
    }
  },
  "meta": {
    "elapsed_ms": 200,
    "_timeline_ms": 45,
    "_related_ms": 120
  }
}

Error Handling

Sub-query errors are non-fatal. If Ollama is down, _related returns an error instead of failing the whole request:

{
  "_related_error": "Ollama unavailable — related results skipped"
}
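A minimal sketch of this error-isolation contract in Python. The names (run_includes, the query callables) are illustrative, not actual lore internals: each sub-query result lands under a _ key, and a failure becomes a _<name>_error field instead of propagating.

```python
from typing import Any, Callable

def run_includes(base: dict, includes: dict[str, Callable[[], Any]]) -> dict:
    """Embed sub-query results under '_'-prefixed keys.

    A failing sub-query records '_<name>_error' rather than
    failing the whole response.
    """
    data = dict(base)
    for name, query in includes.items():
        try:
            data[f"_{name}"] = query()
        except Exception as exc:  # isolate: never let a sub-query abort the request
            data[f"_{name}_error"] = str(exc)
    return data
```

The base detail query still fails hard; only the optional includes degrade.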

Limit Control

# Custom limits for included data
lore -J issues 42 --include timeline:50,related:10
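The name:limit syntax is cheap to parse. A hypothetical sketch, assuming the default limits from the Include Matrix above:

```python
# Defaults taken from the Include Matrix in this proposal (illustrative).
DEFAULT_LIMITS = {"timeline": 20, "related": 5, "trace": 5}

def parse_include(spec: str) -> dict[str, int]:
    """Parse 'name[:limit]' pairs into {name: limit}, filling in defaults."""
    limits: dict[str, int] = {}
    for part in spec.split(","):
        name, _, raw = part.strip().partition(":")
        if name not in DEFAULT_LIMITS:
            raise ValueError(f"unknown include: {name!r}")
        limits[name] = int(raw) if raw else DEFAULT_LIMITS[name]
    return limits
```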

Round-Trip Savings

| Workflow | Before | After | Savings |
| --- | --- | --- | --- |
| Understand an issue | 4 calls | 1 call | 75% |
| Why was code changed | 3 calls | 1 call | 67% |
| Find and understand | 4 calls | 2 calls | 50% |

Effort: High. Each include needs its own sub-query executor, error isolation, and limit enforcement. But the payoff is massive — this single feature halves agent round trips.


B. --depth control on me (P0)

Problem: me returns 2000-5000 tokens. Agents checking "do I have work?" only need ~100 tokens.

Proposal: Add --depth flag with three levels.

# Counts only (~100 tokens) — "do I have work?"
lore -J me --depth counts

# Titles (~400 tokens) — "what work do I have?"
lore -J me --depth titles

# Full (current behavior, 2000+ tokens) — "give me everything"
lore -J me --depth full
lore -J me  # same as --depth full

Depth Levels

| Level | Includes | Typical Tokens |
| --- | --- | --- |
| counts | summary block only (counts, no items) | ~100 |
| titles | summary + item lists with minimal fields (iid, title, attention_state) | ~400 |
| full | everything: items, activity, inbox, discussions | ~2000-5000 |

Response at --depth counts

{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": {
      "project_count": 3,
      "open_issue_count": 5,
      "authored_mr_count": 2,
      "reviewing_mr_count": 1,
      "needs_attention_count": 3
    }
  }
}

Response at --depth titles

{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": { ... },
    "open_issues": [
      { "iid": 42, "title": "Fix auth", "attention_state": "needs_attention" }
    ],
    "open_mrs_authored": [
      { "iid": 99, "title": "Refactor auth", "attention_state": "needs_attention" }
    ],
    "reviewing_mrs": []
  }
}

Effort: Low. The data is already available; just need to gate serialization by depth level.
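Gating serialization by depth amounts to filtering an already-assembled payload. A sketch (serialize_me is hypothetical; field names follow the response examples above):

```python
def serialize_me(full: dict, depth: str = "full") -> dict:
    """Return a depth-limited view of a fully-assembled `me` payload."""
    if depth == "full":
        return full
    out = {"username": full["username"], "summary": full["summary"]}
    if depth == "counts":
        return out
    if depth == "titles":
        minimal = ("iid", "title", "attention_state")
        for key in ("open_issues", "open_mrs_authored", "reviewing_mrs"):
            out[key] = [{f: item[f] for f in minimal} for item in full.get(key, [])]
        return out
    raise ValueError(f"unknown depth: {depth!r}")
```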


C. --batch flag for multi-entity detail (P1)

Problem: After search/timeline, agents discover N entity IIDs and need detail on each. Currently N round trips.

Proposal: Add --batch flag to issues and mrs detail mode.

# Before: 3 round trips
lore -J issues 42 -p proj
lore -J issues 55 -p proj
lore -J issues 71 -p proj

# After: 1 round trip
lore -J issues --batch 42,55,71 -p proj

Response

{
  "ok": true,
  "data": {
    "results": [
      { "iid": 42, "title": "Fix auth", "state": "opened", ... },
      { "iid": 55, "title": "Add SSO", "state": "opened", ... },
      { "iid": 71, "title": "Token refresh", "state": "closed", ... }
    ],
    "errors": [
      { "iid": 99, "error": "Not found" }
    ]
  }
}

Constraints

  • Max 20 IIDs per batch
  • Individual errors don't fail the batch (partial results returned)
  • Works with --include for maximum efficiency: --batch 42,55 --include timeline
  • Works with --fields minimal for token control
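The constraints above reduce to a small loop. A sketch, with fetch_detail standing in for the existing single-entity handler (names are illustrative, not shipped code):

```python
MAX_BATCH = 20  # cap from this proposal

def batch_detail(iids, fetch_detail):
    """Fetch each IID; collect per-item errors instead of aborting."""
    if len(iids) > MAX_BATCH:
        raise ValueError(f"batch limited to {MAX_BATCH} IIDs")
    results, errors = [], []
    for iid in iids:
        try:
            results.append(fetch_detail(iid))
        except Exception as exc:  # partial results: record and continue
            errors.append({"iid": iid, "error": str(exc)})
    return {"results": results, "errors": errors}
```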

Effort: Medium. Need to loop the existing detail handler and compose results.


D. Composite context command (P2)

Problem: Agents need full context on an entity but must learn --include syntax. A purpose-built command is more discoverable.

Proposal: Add context command that returns detail + timeline + related in one call.

lore -J context issues 42 -p proj
lore -J context mrs 99 -p proj

Equivalent To

lore -J issues 42 -p proj --include timeline,related

But with optimized defaults:

  • Timeline: 20 most recent events, max 3 evidence notes
  • Related: top 5 entities
  • Discussions: truncated after 5 threads
  • Non-fatal: Ollama-dependent parts gracefully degrade

Response Shape

Same as issues <iid> --include timeline,related but with the reduced defaults applied.

Relationship to --include

context is sugar for the most common --include pattern. Both mechanisms can coexist:

  • context for the 80% case (agents wanting full entity understanding)
  • --include for custom combinations

Effort: Medium. Thin wrapper around detail + include pipeline.
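As a sketch, the wrapper just delegates to the detail-plus-include pipeline with the tuned limits baked in (all names here are hypothetical):

```python
# Reduced defaults listed above, baked into the sugar command (illustrative).
CONTEXT_DEFAULTS = {"timeline": 20, "related": 5}

def context_command(entity: str, iid: int, detail_with_includes):
    """`context <entity> <iid>` == detail + includes with tuned limits."""
    return detail_with_includes(entity, iid, includes=CONTEXT_DEFAULTS)
```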


E. --max-tokens response budget (P3)

Problem: Response sizes vary wildly (100 to 8000 tokens). Agents can't predict cost in advance.

Proposal: Let agents cap response size. Server truncates to fit.

lore -J me --max-tokens 500
lore -J timeline "feature" --max-tokens 1000
lore -J context issues 42 --max-tokens 2000

Truncation Strategy (priority order)

  1. Apply --fields minimal if not already set
  2. Reduce array lengths (newest/highest-score items survive)
  3. Truncate string fields (descriptions, snippets) to 200 chars
  4. Omit null/empty fields
  5. Drop included sub-queries (if using --include)

Meta Notice

{
  "meta": {
    "elapsed_ms": 50,
    "truncated": true,
    "original_tokens": 3500,
    "budget_tokens": 1000,
    "dropped": ["_related", "discussions[5:]", "activity[10:]"]
  }
}

Implementation Notes

Token estimation: rough heuristic based on JSON character count / 4. Doesn't need to be exact — the goal is "roughly this size" not "exactly N tokens."
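A sketch of the heuristic plus one truncation step (dropping included sub-queries); a real implementation would also shrink arrays and truncate strings per the priority order above. All names are illustrative:

```python
import json

def estimate_tokens(payload: dict) -> int:
    """Rough heuristic: one token per ~4 JSON characters."""
    return len(json.dumps(payload)) // 4

def fit_budget(data: dict, budget: int) -> tuple[dict, list[str]]:
    """Drop '_'-prefixed includes until the estimate fits the budget."""
    data = dict(data)
    dropped = []
    for key in [k for k in data if k.startswith("_")]:
        if estimate_tokens(data) <= budget:
            break
        dropped.append(key)
        del data[key]
    return data, dropped
```

The dropped list feeds the meta.dropped notice shown above.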

Effort: High. Requires token estimation, progressive truncation logic, and tracking what was dropped.


F. --format tsv for list commands (P3)

Problem: JSON is verbose for tabular data. List commands return arrays of objects with repeated key names.

Proposal: Add --format tsv for list commands.

lore -J issues --format tsv --fields iid,title,state -n 10

Output

iid	title	state
42	Fix auth	opened
55	Add SSO	opened
71	Token refresh	closed

Token Savings

| Command | JSON tokens | TSV tokens | Savings |
| --- | --- | --- | --- |
| issues -n 50 --fields minimal | ~800 | ~250 | 69% |
| mrs -n 50 --fields minimal | ~800 | ~250 | 69% |
| who expert -n 10 | ~300 | ~100 | 67% |
| notes -n 50 --fields minimal | ~1000 | ~350 | 65% |

Applicable Commands

TSV works well for flat, tabular data:

  • issues (list), mrs (list), notes (list)
  • who expert, who overlap, who reviews
  • count

TSV does NOT work for nested/complex data:

  • Detail views (discussions are nested)
  • Timeline (events have nested evidence)
  • Search (nested explain, labels arrays)
  • me (multiple sections)

Agent Parsing

Most LLMs parse TSV naturally. Agents that need structured data can still use JSON.

Effort: Medium. Tab-separated serialization for flat structs is straightforward. Need to handle escaping for body text containing tabs/newlines.
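A sketch of that serialization, assuming tabs and newlines in field values are replaced with spaces (the escaping choice is an assumption, not specified behavior):

```python
def to_tsv(rows: list[dict], fields: list[str]) -> str:
    """Serialize flat dicts as TSV: header row, then one row per item."""
    def esc(value) -> str:
        # tabs/newlines inside values would corrupt the table
        return str(value).replace("\t", " ").replace("\n", " ")
    lines = ["\t".join(fields)]
    for row in rows:
        lines.append("\t".join(esc(row.get(f, "")) for f in fields))
    return "\n".join(lines)
```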


Impact Summary

| Optimization | Priority | Effort | Round-Trip Savings | Token Savings |
| --- | --- | --- | --- | --- |
| --include | P0 | High | 50-75% | Moderate |
| --depth on me | P0 | Low | None | 60-80% |
| --batch | P1 | Medium | N-1 per batch | Moderate |
| context command | P2 | Medium | 67-75% | Moderate |
| --max-tokens | P3 | High | None | Variable |
| --format tsv | P3 | Medium | None | 65-69% on lists |

Implementation Order

  1. --depth on me — lowest effort, high value, no risk
  2. --include on issues/mrs detail — highest impact, start with timeline include only
  3. --batch — eliminates N+1 pattern
  4. context command — sugar on top of --include
  5. --format tsv — nice-to-have, easy to add incrementally
  6. --max-tokens — complex, defer until demand is clear