gitlore/docs/command-surface-analysis/08-robot-optimization-proposals.md
teernisse 3f38b3fda7 docs: add comprehensive command surface analysis
Deep analysis of the full `lore` CLI command surface (34 commands across
6 categories) covering command inventory, data flow, overlap analysis,
and optimization proposals.

Document structure:
- Main consolidated doc: docs/command-surface-analysis.md (1251 lines)
- Split sections in docs/command-surface-analysis/ for navigation:
  00-overview.md      - Summary, inventory, priorities
  01-entity-commands.md   - issues, mrs, notes, search, count
  02-intelligence-commands.md - who, timeline, me, file-history, trace, related, drift
  03-pipeline-and-infra.md    - sync, ingest, generate-docs, embed, diagnostics
  04-data-flow.md     - Shared data source map, command network graph
  05-overlap-analysis.md  - Quantified overlap percentages for every command pair
  06-agent-workflows.md   - Common agent flows, round-trip costs, token profiles
  07-consolidation-proposals.md  - 5 proposals to reduce 34 commands to 29
  08-robot-optimization-proposals.md - 6 proposals for --include, --batch, --depth
  09-appendices.md    - Robot output envelope, field presets, exit codes

Key findings:
- High overlap pairs: who-workload/me (~85%), health/doctor (~90%)
- 5 consolidation proposals to reduce command count by 15%
- 6 robot-mode optimization proposals targeting agent round-trip reduction
- Full DB table mapping and data flow documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 00:08:31 -05:00


Robot-Mode Optimization Proposals

Six proposals to reduce round trips and token waste for agent consumers.


A. --include flag for embedded sub-queries (P0)

Problem: The #1 agent inefficiency. Every "understand this entity" workflow requires 3-4 serial round trips: detail + timeline + related + trace.

Proposal: Add --include flag to detail commands that embeds sub-query results in the response.

# Before: 4 round trips, ~12000 tokens
lore -J issues 42 -p proj
lore -J timeline "issue:42" -p proj --limit 20
lore -J related issues 42 -p proj -n 5
lore -J trace src/auth/ -p proj

# After: 1 round trip, ~5000 tokens (sub-queries use reduced limits)
lore -J issues 42 -p proj --include timeline,related

Include Matrix

| Base Command | Valid Includes | Default Limits |
| --- | --- | --- |
| issues <iid> | timeline, related, trace | 20 events, 5 related, 5 chains |
| mrs <iid> | timeline, related, file-changes | 20 events, 5 related |
| trace <path> | experts, timeline | 5 experts, 20 events |
| me | detail (inline top-N item details) | 3 items detailed |
| search | detail (inline top-N result details) | 3 results detailed |

Response Shape

Included data uses a _ prefix to distinguish it from base fields:

{
  "ok": true,
  "data": {
    "iid": 42, "title": "Fix auth", "state": "opened",
    "discussions": [...],
    "_timeline": {
      "event_count": 15,
      "events": [...]
    },
    "_related": {
      "similar_entities": [...]
    }
  },
  "meta": {
    "elapsed_ms": 200,
    "_timeline_ms": 45,
    "_related_ms": 120
  }
}

Error Handling

Sub-query errors are non-fatal. If Ollama is down, _related returns an error instead of failing the whole request:

{
  "_related_error": "Ollama unavailable — related results skipped"
}
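A minimal sketch of this error-isolation contract in Python. The names (run_includes, the query callables) are illustrative, not actual lore internals: each sub-query result lands under a _ key, and a failure becomes a _<name>_error field instead of propagating.

```python
from typing import Any, Callable

def run_includes(base: dict, includes: dict[str, Callable[[], Any]]) -> dict:
    """Embed sub-query results under '_'-prefixed keys.

    A failing sub-query records '_<name>_error' rather than
    failing the whole response.
    """
    data = dict(base)
    for name, query in includes.items():
        try:
            data[f"_{name}"] = query()
        except Exception as exc:  # isolate: never let a sub-query abort the request
            data[f"_{name}_error"] = str(exc)
    return data
```

The base detail query still fails hard; only the optional includes degrade.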

Limit Control

# Custom limits for included data
lore -J issues 42 --include timeline:50,related:10
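The name:limit syntax is cheap to parse. A hypothetical sketch, assuming the default limits from the Include Matrix above:

```python
# Defaults taken from the Include Matrix in this proposal (illustrative).
DEFAULT_LIMITS = {"timeline": 20, "related": 5, "trace": 5}

def parse_include(spec: str) -> dict[str, int]:
    """Parse 'name[:limit]' pairs into {name: limit}, filling in defaults."""
    limits: dict[str, int] = {}
    for part in spec.split(","):
        name, _, raw = part.strip().partition(":")
        if name not in DEFAULT_LIMITS:
            raise ValueError(f"unknown include: {name!r}")
        limits[name] = int(raw) if raw else DEFAULT_LIMITS[name]
    return limits
```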

Round-Trip Savings

| Workflow | Before | After | Savings |
| --- | --- | --- | --- |
| Understand an issue | 4 calls | 1 call | 75% |
| Why was code changed | 3 calls | 1 call | 67% |
| Find and understand | 4 calls | 2 calls | 50% |

Effort: High. Each include needs its own sub-query executor, error isolation, and limit enforcement. But the payoff is massive — this single feature halves agent round trips.


B. --depth control on me (P0)

Problem: me returns 2000-5000 tokens. Agents checking "do I have work?" only need ~100 tokens.

Proposal: Add --depth flag with three levels.

# Counts only (~100 tokens) — "do I have work?"
lore -J me --depth counts

# Titles (~400 tokens) — "what work do I have?"
lore -J me --depth titles

# Full (current behavior, 2000+ tokens) — "give me everything"
lore -J me --depth full
lore -J me  # same as --depth full

Depth Levels

| Level | Includes | Typical Tokens |
| --- | --- | --- |
| counts | summary block only (counts, no items) | ~100 |
| titles | summary + item lists with minimal fields (iid, title, attention_state) | ~400 |
| full | everything: items, activity, inbox, discussions | ~2000-5000 |

Response at --depth counts

{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": {
      "project_count": 3,
      "open_issue_count": 5,
      "authored_mr_count": 2,
      "reviewing_mr_count": 1,
      "needs_attention_count": 3
    }
  }
}

Response at --depth titles

{
  "ok": true,
  "data": {
    "username": "jdoe",
    "summary": { ... },
    "open_issues": [
      { "iid": 42, "title": "Fix auth", "attention_state": "needs_attention" }
    ],
    "open_mrs_authored": [
      { "iid": 99, "title": "Refactor auth", "attention_state": "needs_attention" }
    ],
    "reviewing_mrs": []
  }
}

Effort: Low. The data is already available; just need to gate serialization by depth level.
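Gating serialization by depth amounts to filtering an already-assembled payload. A sketch (serialize_me is hypothetical; field names follow the response examples above):

```python
def serialize_me(full: dict, depth: str = "full") -> dict:
    """Return a depth-limited view of a fully-assembled `me` payload."""
    if depth == "full":
        return full
    out = {"username": full["username"], "summary": full["summary"]}
    if depth == "counts":
        return out
    if depth == "titles":
        minimal = ("iid", "title", "attention_state")
        for key in ("open_issues", "open_mrs_authored", "reviewing_mrs"):
            out[key] = [{f: item[f] for f in minimal} for item in full.get(key, [])]
        return out
    raise ValueError(f"unknown depth: {depth!r}")
```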


C. --batch flag for multi-entity detail (P1)

Problem: After search/timeline, agents discover N entity IIDs and need detail on each. Currently N round trips.

Proposal: Add --batch flag to issues and mrs detail mode.

# Before: 3 round trips
lore -J issues 42 -p proj
lore -J issues 55 -p proj
lore -J issues 71 -p proj

# After: 1 round trip
lore -J issues --batch 42,55,71 -p proj

Response

{
  "ok": true,
  "data": {
    "results": [
      { "iid": 42, "title": "Fix auth", "state": "opened", ... },
      { "iid": 55, "title": "Add SSO", "state": "opened", ... },
      { "iid": 71, "title": "Token refresh", "state": "closed", ... }
    ],
    "errors": [
      { "iid": 99, "error": "Not found" }
    ]
  }
}

Constraints

  • Max 20 IIDs per batch
  • Individual errors don't fail the batch (partial results returned)
  • Works with --include for maximum efficiency: --batch 42,55 --include timeline
  • Works with --fields minimal for token control
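The constraints above reduce to a small loop. A sketch, with fetch_detail standing in for the existing single-entity handler (names are illustrative, not shipped code):

```python
MAX_BATCH = 20  # cap from this proposal

def batch_detail(iids, fetch_detail):
    """Fetch each IID; collect per-item errors instead of aborting."""
    if len(iids) > MAX_BATCH:
        raise ValueError(f"batch limited to {MAX_BATCH} IIDs")
    results, errors = [], []
    for iid in iids:
        try:
            results.append(fetch_detail(iid))
        except Exception as exc:  # partial results: record and continue
            errors.append({"iid": iid, "error": str(exc)})
    return {"results": results, "errors": errors}
```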

Effort: Medium. Need to loop the existing detail handler and compose results.


D. Composite context command (P2)

Problem: Agents need full context on an entity but must learn --include syntax. A purpose-built command is more discoverable.

Proposal: Add context command that returns detail + timeline + related in one call.

lore -J context issues 42 -p proj
lore -J context mrs 99 -p proj

Equivalent To

lore -J issues 42 -p proj --include timeline,related

But with optimized defaults:

  • Timeline: 20 most recent events, max 3 evidence notes
  • Related: top 5 entities
  • Discussions: truncated after 5 threads
  • Non-fatal: Ollama-dependent parts gracefully degrade

Response Shape

Same as issues <iid> --include timeline,related but with the reduced defaults applied.

Relationship to --include

context is sugar for the most common --include pattern. Both mechanisms can coexist:

  • context for the 80% case (agents wanting full entity understanding)
  • --include for custom combinations

Effort: Medium. Thin wrapper around detail + include pipeline.
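As a sketch, the wrapper just delegates to the detail-plus-include pipeline with the tuned limits baked in (all names here are hypothetical):

```python
# Reduced defaults listed above, baked into the sugar command (illustrative).
CONTEXT_DEFAULTS = {"timeline": 20, "related": 5}

def context_command(entity: str, iid: int, detail_with_includes):
    """`context <entity> <iid>` == detail + includes with tuned limits."""
    return detail_with_includes(entity, iid, includes=CONTEXT_DEFAULTS)
```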


E. --max-tokens response budget (P3)

Problem: Response sizes vary wildly (100 to 8000 tokens). Agents can't predict cost in advance.

Proposal: Let agents cap response size. Server truncates to fit.

lore -J me --max-tokens 500
lore -J timeline "feature" --max-tokens 1000
lore -J context issues 42 --max-tokens 2000

Truncation Strategy (priority order)

  1. Apply --fields minimal if not already set
  2. Reduce array lengths (newest/highest-score items survive)
  3. Truncate string fields (descriptions, snippets) to 200 chars
  4. Omit null/empty fields
  5. Drop included sub-queries (if using --include)

Meta Notice

{
  "meta": {
    "elapsed_ms": 50,
    "truncated": true,
    "original_tokens": 3500,
    "budget_tokens": 1000,
    "dropped": ["_related", "discussions[5:]", "activity[10:]"]
  }
}

Implementation Notes

Token estimation: rough heuristic based on JSON character count / 4. Doesn't need to be exact — the goal is "roughly this size" not "exactly N tokens."
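A sketch of the heuristic plus one truncation step (dropping included sub-queries); a real implementation would also shrink arrays and truncate strings per the priority order above. All names are illustrative:

```python
import json

def estimate_tokens(payload: dict) -> int:
    """Rough heuristic: one token per ~4 JSON characters."""
    return len(json.dumps(payload)) // 4

def fit_budget(data: dict, budget: int) -> tuple[dict, list[str]]:
    """Drop '_'-prefixed includes until the estimate fits the budget."""
    data = dict(data)
    dropped = []
    for key in [k for k in data if k.startswith("_")]:
        if estimate_tokens(data) <= budget:
            break
        dropped.append(key)
        del data[key]
    return data, dropped
```

The dropped list feeds the meta.dropped notice shown above.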

Effort: High. Requires token estimation, progressive truncation logic, and tracking what was dropped.


F. --format tsv for list commands (P3)

Problem: JSON is verbose for tabular data. List commands return arrays of objects with repeated key names.

Proposal: Add --format tsv for list commands.

lore -J issues --format tsv --fields iid,title,state -n 10

Output

iid	title	state
42	Fix auth	opened
55	Add SSO	opened
71	Token refresh	closed

Token Savings

| Command | JSON tokens | TSV tokens | Savings |
| --- | --- | --- | --- |
| issues -n 50 --fields minimal | ~800 | ~250 | 69% |
| mrs -n 50 --fields minimal | ~800 | ~250 | 69% |
| who expert -n 10 | ~300 | ~100 | 67% |
| notes -n 50 --fields minimal | ~1000 | ~350 | 65% |

Applicable Commands

TSV works well for flat, tabular data:

  • issues (list), mrs (list), notes (list)
  • who expert, who overlap, who reviews
  • count

TSV does NOT work for nested/complex data:

  • Detail views (discussions are nested)
  • Timeline (events have nested evidence)
  • Search (nested explain, labels arrays)
  • me (multiple sections)

Agent Parsing

Most LLMs parse TSV naturally. Agents that need structured data can still use JSON.

Effort: Medium. Tab-separated serialization for flat structs is straightforward. Need to handle escaping for body text containing tabs/newlines.
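A sketch of that serialization, assuming tabs and newlines in field values are replaced with spaces (the escaping choice is an assumption, not specified behavior):

```python
def to_tsv(rows: list[dict], fields: list[str]) -> str:
    """Serialize flat dicts as TSV: header row, then one row per item."""
    def esc(value) -> str:
        # tabs/newlines inside values would corrupt the table
        return str(value).replace("\t", " ").replace("\n", " ")
    lines = ["\t".join(fields)]
    for row in rows:
        lines.append("\t".join(esc(row.get(f, "")) for f in fields))
    return "\n".join(lines)
```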


Impact Summary

| Optimization | Priority | Effort | Round-Trip Savings | Token Savings |
| --- | --- | --- | --- | --- |
| --include | P0 | High | 50-75% | Moderate |
| --depth on me | P0 | Low | None | 60-80% |
| --batch | P1 | Medium | N-1 per batch | Moderate |
| context command | P2 | Medium | 67-75% | Moderate |
| --max-tokens | P3 | High | None | Variable |
| --format tsv | P3 | Medium | None | 65-69% on lists |

Implementation Order

  1. --depth on me — lowest effort, high value, no risk
  2. --include on issues/mrs detail — highest impact, start with timeline include only
  3. --batch — eliminates N+1 pattern
  4. context command — sugar on top of --include
  5. --format tsv — nice-to-have, easy to add incrementally
  6. --max-tokens — complex, defer until demand is clear