Deep analysis of the full `lore` CLI command surface (34 commands across 6 categories) covering command inventory, data flow, overlap analysis, and optimization proposals.

Document structure:
- Main consolidated doc: docs/command-surface-analysis.md (1251 lines)
- Split sections in docs/command-surface-analysis/ for navigation:
  - 00-overview.md - Summary, inventory, priorities
  - 01-entity-commands.md - issues, mrs, notes, search, count
  - 02-intelligence-commands.md - who, timeline, me, file-history, trace, related, drift
  - 03-pipeline-and-infra.md - sync, ingest, generate-docs, embed, diagnostics
  - 04-data-flow.md - Shared data source map, command network graph
  - 05-overlap-analysis.md - Quantified overlap percentages for every command pair
  - 06-agent-workflows.md - Common agent flows, round-trip costs, token profiles
  - 07-consolidation-proposals.md - 5 proposals to reduce 34 commands to 29
  - 08-robot-optimization-proposals.md - 6 proposals for --include, --batch, --depth
  - 09-appendices.md - Robot output envelope, field presets, exit codes

Key findings:
- High overlap pairs: who-workload/me (~85%), health/doctor (~90%)
- 5 consolidation proposals to reduce command count by 15%
- 6 robot-mode optimization proposals targeting agent round-trip reduction
- Full DB table mapping and data flow documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Data Flow & Command Network
How commands interconnect through shared data sources and output-to-input dependencies.
1. Command Network Graph
Arrows mean "output of A feeds as input to B":
               ┌─────────┐
               │ search  │─────────────────────────────┐
               └────┬────┘                             │
                    │ iid                              │ topic
               ┌────▼────┐                        ┌────▼─────┐
         ┌─────│ issues  │◄───────────────────────│ timeline │
         │     │ mrs     │ (detail)               └──────────┘
         │     └────┬────┘                             ▲
         │          │ iid                              │ entity ref
         │     ┌────▼────┐      ┌──────────────┐       │
         │     │ related │      │ file-history │───────┘
         │     │ drift   │      └──────┬───────┘
         │     └─────────┘             │ MR iids
         │                        ┌────▼────┐
         │                        │  trace  │──── issues (linked)
         │                        └────┬────┘
         │                             │ paths
         │                        ┌────▼────┐
         │                        │   who   │
         │                        │ (expert)│
         │                        └─────────┘
         │
    file paths                    ┌─────────┐
         │                        │   me    │──── issues, mrs (dashboard)
         ▼                        └─────────┘
    ┌──────────┐                       ▲
    │  notes   │                       │ (~same data)
    └──────────┘                  ┌────┴───────┐
                                  │who workload│
                                  └────────────┘
Feed Chains (output of A -> input of B)
| From | To | What Flows |
|---|---|---|
| search | issues, mrs | IIDs from search results -> detail lookup |
| search | timeline | Topic/query -> chronological history |
| search | related | Entity IID -> semantic similarity |
| me | issues, mrs | IIDs from dashboard -> detail lookup |
| trace | issues | Linked issue IIDs -> detail lookup |
| trace | who | File paths -> expert lookup |
| file-history | mrs | MR IIDs -> detail lookup |
| file-history | timeline | Entity refs -> chronological events |
| timeline | issues, mrs | Referenced IIDs -> detail lookup |
| who expert | who reviews | Username -> review patterns |
| who expert | mrs | MR IIDs from expert detail -> MR detail |
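The feed chains can also be read as a directed graph. The sketch below is only an illustration: it copies the edges from the table, and the helper names `build_graph` and `downstream` are hypothetical, not part of `lore`. It computes which commands are reachable from a given starting command by following feed chains.

```python
from collections import deque

# Edges copied from the feed-chain table as (from, to) pairs.
# Rows listing "issues, mrs" are split into two edges.
FEED_CHAINS = [
    ("search", "issues"), ("search", "mrs"),
    ("search", "timeline"),
    ("search", "related"),
    ("me", "issues"), ("me", "mrs"),
    ("trace", "issues"),
    ("trace", "who"),
    ("file-history", "mrs"),
    ("file-history", "timeline"),
    ("timeline", "issues"), ("timeline", "mrs"),
    ("who expert", "who reviews"),
    ("who expert", "mrs"),
]


def build_graph(edges):
    """Turn (from, to) pairs into an adjacency map."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    return graph


def downstream(graph, start):
    """Return every command reachable from `start` via feed chains (BFS)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


GRAPH = build_graph(FEED_CHAINS)
print(sorted(downstream(GRAPH, "file-history")))
```

Commands with a large downstream set are the natural entry points for multi-step agent flows; commands with an empty set are terminal detail views.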
2. Shared Data Source Map
Which DB tables power which commands. Higher overlap = stronger consolidation signal.
Primary Entity Tables
| Table | Read By |
|---|---|
| issues | issues, me, who-workload, search, timeline, trace, count, stats |
| merge_requests | mrs, me, who-workload, search, timeline, trace, file-history, count, stats |
| notes | notes, issues-detail, mrs-detail, who-expert, who-active, search, timeline, trace, file-history |
| discussions | notes, issues-detail, mrs-detail, who-active, who-reviews, timeline, trace |
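One way to turn "higher overlap" into a number is to invert the table and compare the sets of tables each command reads. The sketch below uses Jaccard similarity over the four primary entity tables only. That metric is an assumption made for illustration and is not necessarily how the overlap percentages elsewhere in this analysis were derived; the helper names are hypothetical.

```python
# Table -> commands, copied from the Primary Entity Tables listing.
READ_BY = {
    "issues": ["issues", "me", "who-workload", "search", "timeline",
               "trace", "count", "stats"],
    "merge_requests": ["mrs", "me", "who-workload", "search", "timeline",
                       "trace", "file-history", "count", "stats"],
    "notes": ["notes", "issues-detail", "mrs-detail", "who-expert",
              "who-active", "search", "timeline", "trace", "file-history"],
    "discussions": ["notes", "issues-detail", "mrs-detail", "who-active",
                    "who-reviews", "timeline", "trace"],
}


def tables_by_command(read_by):
    """Invert table -> commands into command -> set of tables."""
    out = {}
    for table, commands in read_by.items():
        for cmd in commands:
            out.setdefault(cmd, set()).add(table)
    return out


def jaccard(a, b):
    """Shared tables divided by all tables touched by either command."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


BY_COMMAND = tables_by_command(READ_BY)
print(jaccard(BY_COMMAND["me"], BY_COMMAND["who-workload"]))
print(jaccard(BY_COMMAND["search"], BY_COMMAND["timeline"]))
```

A table-level score like this only says that two commands read the same sources. It says nothing about whether they apply the same filters or produce the same output, so it is a coarse first signal at best.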
Relationship Tables
| Table | Read By |
|---|---|
| entity_references | trace, timeline |
| mr_file_changes | trace, file-history, who-overlap |
| issue_labels | issues, me |
| mr_labels | mrs, me |
| issue_assignees | issues, me |
| mr_reviewers | mrs, who-expert, who-workload |
Event Tables
| Table | Read By |
|---|---|
| resource_state_events | timeline, me-activity |
| resource_label_events | timeline |
| resource_milestone_events | timeline |
Document/Search Tables
| Table | Read By |
|---|---|
| documents + documents_fts | search, stats |
| embeddings | search, related, drift |
| document_labels | search |
| document_paths | search |
Infrastructure Tables
| Table | Read By |
|---|---|
| sync_cursors | status |
| dirty_sources | stats |
| embedding_metadata | stats, embed |
3. Shared-Data Clusters
Commands that read from the same primary tables form natural clusters:
Cluster A: Issue/MR Entities
issues, mrs, me, who workload, count
All read issues + merge_requests with similar filter patterns (state, author, labels, project). These commands share the same underlying WHERE-clause builder logic.
Cluster B: Notes/Discussions
notes, issues detail, mrs detail, who expert, who active, timeline
All traverse the discussions -> notes join path. The notes command does it with independent filters; the others embed notes within parent context.
Cluster C: File Genealogy
trace, file-history, who overlap
All use mr_file_changes with rename chain BFS (forward: old_path -> new_path, backward: new_path -> old_path). They share the resolve_rename_chain() function.
Cluster D: Semantic/Vector
search, related, drift
All use documents + embeddings via Ollama. search adds an FTS component; related is pure vector; drift uses vectors for divergence scoring.
Cluster E: Diagnostics
health, auth, doctor, status, stats
All check system state. health is a strict subset of doctor. status checks sync cursors, stats checks document/index health, and auth checks token/connectivity.
4. Query Pattern Sharing
Dynamic Filter Builder (used by issues, mrs, notes)
All three list commands use the same pattern: build a WHERE clause dynamically from filter flags, binding each value as a query parameter. Label filters use an EXISTS subquery against the junction table.
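A minimal sketch of that pattern follows. The function name `build_issue_query`, the column names (`state`, `author_username`), and the simplified `issue_labels(issue_id, label)` layout are all assumptions made for illustration, and the real schema may well differ. The sketch shows the shape of the technique: one clause per supplied flag, one bound parameter per value, and one EXISTS subquery per requested label.

```python
import sqlite3


def build_issue_query(state=None, author=None, labels=()):
    """Build (sql, params) from optional filters; schema is hypothetical."""
    clauses, params = [], []
    if state is not None:
        clauses.append("i.state = ?")
        params.append(state)
    if author is not None:
        clauses.append("i.author_username = ?")
        params.append(author)
    # One EXISTS per label, so an issue must carry every requested label.
    for label in labels:
        clauses.append(
            "EXISTS (SELECT 1 FROM issue_labels il "
            "WHERE il.issue_id = i.id AND il.label = ?)"
        )
        params.append(label)
    where = " AND ".join(clauses) if clauses else "1=1"
    return f"SELECT i.iid FROM issues i WHERE {where} ORDER BY i.iid", params


# Tiny in-memory fixture to exercise the builder.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE issues (id INTEGER PRIMARY KEY, iid INTEGER,
                     state TEXT, author_username TEXT);
CREATE TABLE issue_labels (issue_id INTEGER, label TEXT);
INSERT INTO issues VALUES (1, 101, 'opened', 'alice');
INSERT INTO issues VALUES (2, 102, 'opened', 'bob');
INSERT INTO issues VALUES (3, 103, 'closed', 'alice');
INSERT INTO issue_labels VALUES (1, 'bug');
INSERT INTO issue_labels VALUES (1, 'p1');
INSERT INTO issue_labels VALUES (2, 'bug');
INSERT INTO issue_labels VALUES (3, 'bug');
""")

sql, params = build_issue_query(state="opened", labels=["bug", "p1"])
rows = [r[0] for r in conn.execute(sql, params)]
print(rows)
```

Only placeholder text is ever interpolated into the SQL string; user-supplied values always travel through `params`, which is what keeps a dynamically built clause safe from injection.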
Rename Chain BFS (used by trace, file-history, who overlap)
Forward query:
    SELECT DISTINCT new_path FROM mr_file_changes
    WHERE project_id = ?1 AND old_path = ?2 AND change_type = 'renamed'
Backward query:
    SELECT DISTINCT old_path FROM mr_file_changes
    WHERE project_id = ?1 AND new_path = ?2 AND change_type = 'renamed'
Cycles are detected with a HashSet of visited paths, and traversal is capped at MAX_RENAME_HOPS = 10.
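The traversal can be sketched in Python as follows. This is not the project's resolve_rename_chain() implementation: it reuses the two queries (with plain `?` placeholders for Python's `sqlite3`), a visited set, and the hop limit, while the minimal table layout and the exact interpretation of the hop limit (paths found at hop 10 are kept but not expanded further) are assumptions.

```python
import sqlite3
from collections import deque

MAX_RENAME_HOPS = 10

# Same queries as above, written with plain `?` placeholders.
FORWARD_SQL = (
    "SELECT DISTINCT new_path FROM mr_file_changes "
    "WHERE project_id = ? AND old_path = ? AND change_type = 'renamed'"
)
BACKWARD_SQL = (
    "SELECT DISTINCT old_path FROM mr_file_changes "
    "WHERE project_id = ? AND new_path = ? AND change_type = 'renamed'"
)


def resolve_rename_chain(conn, project_id, path):
    """Return every path connected to `path` by renames, in either direction."""
    visited = {path}              # doubles as cycle detection
    queue = deque([(path, 0)])    # (path, hops from the starting path)
    while queue:
        current, hops = queue.popleft()
        if hops >= MAX_RENAME_HOPS:
            continue              # hop budget spent; do not expand further
        for sql in (FORWARD_SQL, BACKWARD_SQL):
            for (neighbor,) in conn.execute(sql, (project_id, current)):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append((neighbor, hops + 1))
    return visited


# Hypothetical fixture: a -> b -> c -> a (a cycle) in project 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mr_file_changes "
    "(project_id INTEGER, old_path TEXT, new_path TEXT, change_type TEXT)"
)
conn.executemany(
    "INSERT INTO mr_file_changes VALUES (?, ?, ?, ?)",
    [
        (1, "src/a.rs", "src/b.rs", "renamed"),
        (1, "src/b.rs", "src/c.rs", "renamed"),
        (1, "src/c.rs", "src/a.rs", "renamed"),
        (1, "src/b.rs", "src/b.rs", "modified"),
        (2, "x.rs", "y.rs", "renamed"),
    ],
)
print(sorted(resolve_rename_chain(conn, 1, "src/b.rs")))
```

Because the visited set is checked before enqueueing, the a -> b -> c -> a cycle terminates after each path has been seen once, regardless of the hop limit.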
Hybrid Search (used by search, timeline seeding)
RRF ranking: score = (60 / fts_rank) + (60 / vector_rank)
FTS5 queries go through to_fts_query(), which sanitizes input and builds MATCH expressions. Vector search calls Ollama to embed the query, then computes cosine similarity against the embeddings vec0 table.
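The fusion step can be illustrated with a few lines of Python. The sketch follows the score expression exactly as written in this section, summing 60 / rank over the two ranked lists; note that the commonly published form of reciprocal rank fusion is 1 / (k + rank) with k = 60, which is a different expression. The function name `rrf_scores` and the handling of documents that appear in only one list (they receive only that list's term) are assumptions.

```python
def rrf_scores(fts_ids, vector_ids, k=60.0):
    """Fuse two ranked ID lists using score = k/fts_rank + k/vector_rank.

    Ranks are 1-based positions in each list. A document missing from one
    list simply gets no contribution from it (an assumption of this sketch).
    """
    scores = {}
    for ranked in (fts_ids, vector_ids):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + k / rank
    # Highest score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


fts_hits = ["d1", "d2", "d3"]      # best FTS match first
vector_hits = ["d3", "d1"]         # best vector match first
fused = rrf_scores(fts_hits, vector_hits)
print(fused)
```

Here `d1` ranks first (60/1 + 60/2 = 90), ahead of `d3` (60/3 + 60/1 = 80) and `d2` (60/2 = 30), so a document that places well in both lists beats one that tops only a single list.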
Project Resolution (used by most commands)
resolve_project(conn, project_filter) does fuzzy matching on path_with_namespace, using suffix and substring matching. Returns (project_id, path_with_namespace).
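A rough sketch of suffix-then-substring resolution is shown below. It is illustrative only: the real function takes a database connection, whereas this version takes an in-memory list of (project_id, path_with_namespace) pairs, and the tier order (exact, then suffix, then substring), the case-insensitive comparison, and the error on ambiguous matches are all assumptions.

```python
def resolve_project(projects, project_filter):
    """Resolve a loose filter to one (project_id, path_with_namespace) pair.

    `projects` is a list of (project_id, path_with_namespace) tuples.
    Tiers are tried from strictest to loosest; the first tier that yields
    any match decides the result.
    """
    needle = project_filter.lower()
    tiers = (
        lambda path: path == needle,               # exact path
        lambda path: path.endswith("/" + needle),  # suffix on a path boundary
        lambda path: needle in path,               # substring anywhere
    )
    for matches_tier in tiers:
        matches = [p for p in projects if matches_tier(p[1].lower())]
        if len(matches) == 1:
            return matches[0]
        if len(matches) > 1:
            candidates = ", ".join(path for _, path in matches)
            raise LookupError(f"ambiguous project '{project_filter}': {candidates}")
    raise LookupError(f"no project matches '{project_filter}'")


PROJECTS = [
    (1, "acme/backend/api"),
    (2, "acme/frontend/web"),
    (3, "tools/api-gateway"),
]
print(resolve_project(PROJECTS, "api"))      # suffix tier
print(resolve_project(PROJECTS, "gateway"))  # substring tier
```

Trying the suffix tier before the substring tier lets a short filter such as `api` resolve to `acme/backend/api` even though `tools/api-gateway` also contains the text.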