Gitlore CLI Command Audit
1. Full Command Inventory
29 visible + 3 hidden + 2 stub = 34 total command surface
| # | Command | Aliases | Args | Flags | Purpose |
|---|---|---|---|---|---|
| 1 | `issues` | `issue` | `[IID]` | 15 | List/show issues |
| 2 | `mrs` | `mr`, `merge-requests` | `[IID]` | 16 | List/show MRs |
| 3 | `notes` | `note` | -- | 16 | List notes |
| 4 | `search` | `find`, `query` | `<QUERY>` | 13 | Hybrid FTS+vector search |
| 5 | `timeline` | -- | `<QUERY>` | 11 | Chronological event reconstruction |
| 6 | `who` | -- | `[TARGET]` | 16 | People intelligence (5 modes) |
| 7 | `me` | -- | -- | 10 | Personal dashboard |
| 8 | `file-history` | -- | `<PATH>` | 6 | MRs that touched a file |
| 9 | `trace` | -- | `<PATH>` | 5 | file->MR->issue->discussion chain |
| 10 | `drift` | -- | `<TYPE> <IID>` | 3 | Discussion divergence detection |
| 11 | `related` | -- | `<QUERY_OR_TYPE> [IID]` | 3 | Semantic similarity |
| 12 | `count` | -- | `<ENTITY>` | 2 | Count entities |
| 13 | `sync` | -- | -- | 14 | Full pipeline: ingest+docs+embed |
| 14 | `ingest` | -- | `[ENTITY]` | 5 | Fetch from GitLab API |
| 15 | `generate-docs` | -- | -- | 2 | Build searchable documents |
| 16 | `embed` | -- | -- | 2 | Generate vector embeddings |
| 17 | `status` | `st` | -- | 0 | Last sync times per project |
| 18 | `health` | -- | -- | 0 | Quick pre-flight (exit code only) |
| 19 | `doctor` | -- | -- | 0 | Full environment diagnostic |
| 20 | `stats` | `stat` | -- | 3 | Document/index statistics |
| 21 | `init` | -- | -- | 6 | Setup config + database |
| 22 | `auth` | -- | -- | 0 | Verify GitLab token |
| 23 | `token` | -- | subcommand | 1-2 | Token CRUD (set/show) |
| 24 | `cron` | -- | subcommand | 0-1 | Auto-sync scheduling |
| 25 | `migrate` | -- | -- | 0 | Apply DB migrations |
| 26 | `robot-docs` | -- | -- | 1 | Agent self-discovery manifest |
| 27 | `completions` | -- | `<SHELL>` | 0 | Shell completions |
| 28 | `version` | -- | -- | 0 | Version info |
| 29 | help | -- | -- | -- | (clap built-in) |
| Hidden/deprecated: | | | | | |
| 30 | `list` | -- | `<ENTITY>` | 14 | deprecated, use issues/mrs |
| 31 | `auth-test` | -- | -- | 0 | deprecated, use auth |
| 32 | `sync-status` | -- | -- | 0 | deprecated, use status |
| 33 | `backup` | -- | -- | 0 | Stub (not implemented) |
| 34 | `reset` | -- | -- | 1 | Stub (not implemented) |
2. Semantic Overlap Analysis
Cluster A: "Is the system working?" (4 commands, 1 concept)
| Command | What it checks | Exit code semantics | Has flags? |
|---|---|---|---|
| `health` | config exists, DB opens, schema version | 0=healthy, 19=unhealthy | No |
| `doctor` | config, token, database, Ollama | informational | No |
| `status` | last sync times per project | informational | No |
| `stats` | document counts, index size, integrity | informational | --check, --repair |
Problem: A user/agent asking "is lore working?" must choose among four commands. `health` is a strict subset of `doctor`. `status` and `stats` are near-homonyms that answer different questions -- sync recency vs. index health. `count` (Cluster E) also overlaps with what `stats` reports.
Cognitive cost: High. The CLI literature (Clig.dev, Heroku CLI design guide, 12-factor CLI) consistently warns against >2 "status" commands. Users build a mental model of "the status command" -- when there are four, they pick wrong or give up.
Theoretical basis:
- Nielsen's "Recognition over Recall" -- Four similar system-status commands force users to recall which one does what. One command with progressive disclosure (flags for depth) lets them recognize the option they need. This is doubly important for LLM agents, which perform better with fewer top-level choices and compositional flags.
- Hick's Law for CLIs -- Decision time grows with the number of choices. Each additional top-level command adds scanning time for humans and token cost for robots.
Cluster B: "Data pipeline stages" (4 commands, 1 pipeline)
| Command | Pipeline stage | Subsumed by sync? |
|---|---|---|
| `sync` | ingest -> generate-docs -> embed | -- (is the parent) |
| `ingest` | GitLab API fetch | `sync --no-docs --no-embed` |
| `generate-docs` | Build FTS documents | `sync --no-embed` (after ingest) |
| `embed` | Vector embeddings via Ollama | (final stage) |
Problem: sync already has skip flags (--no-embed, --no-docs, --no-events, --no-status, --no-file-changes). The individual stage commands duplicate this with less control -- ingest has --full, --force, --dry-run, but sync also has all three.
The standalone commands exist for granular debugging, but in practice they're reached for <5% of the time. They inflate the help screen while sync handles 95% of use cases.
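How the skip flags map onto pipeline stages can be sketched in a few lines. This is a hypothetical model, not the actual lore implementation: the `Stage` enum and `stages_for_sync` function are invented for illustration, and only the `--no-docs` and `--no-embed` flags from the table are modeled.

```rust
// Hypothetical model of how `sync` skip flags select pipeline stages.
// Flag names come from the audit; the types and function are illustrative only.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Stage {
    Ingest,
    GenerateDocs,
    Embed,
}

/// Returns the stages a `sync` run would execute for the given skip flags.
fn stages_for_sync(no_docs: bool, no_embed: bool) -> Vec<Stage> {
    // Assumption: ingest always runs as the first stage of `sync`.
    let mut stages = vec![Stage::Ingest];
    if !no_docs {
        stages.push(Stage::GenerateDocs);
    }
    if !no_embed {
        stages.push(Stage::Embed);
    }
    stages
}

fn main() {
    // `sync --no-docs --no-embed` covers the same stage as standalone `ingest`.
    println!("{:?}", stages_for_sync(true, true));
    // `sync --no-embed` stops after document generation.
    println!("{:?}", stages_for_sync(false, true));
    // Plain `sync` runs the whole pipeline.
    println!("{:?}", stages_for_sync(false, false));
}
```

Under this model every standalone stage command has an equivalent `sync` invocation except a docs-only or embed-only run, which is the remaining case the standalone commands serve.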
Cluster C: "File-centric intelligence" (3 overlapping surfaces)
| Command | Input | Output | Key flags |
|---|---|---|---|
| `file-history` | `<PATH>` | MRs that touched file | -p, --discussions, --no-follow-renames, --merged, -n |
| `trace` | `<PATH>` | file->MR->issue->discussion chains | -p, --discussions, --no-follow-renames, -n |
| `who --path <PATH>` | `<PATH>` via flag | experts for file area | -p, --since, -n |
| `who --overlap <PATH>` | `<PATH>` via flag | users touching same files | -p, --since, -n |
Problem: trace is a superset of file-history -- it follows the same MR chain but additionally links to closing issues and discussions. They share 4 of 5 filter flags. A user who wants "what happened to this file?" has to choose between two commands that sound nearly identical.
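The flag overlap can be checked mechanically. The sketch below uses only the key flags listed in the Cluster C table (not the full flag sets from the inventory); the constant and function names are illustrative.

```rust
use std::collections::BTreeSet;

// Key flags as listed in the Cluster C table.
const FILE_HISTORY_FLAGS: [&str; 5] =
    ["-p", "--discussions", "--no-follow-renames", "--merged", "-n"];
const TRACE_FLAGS: [&str; 4] = ["-p", "--discussions", "--no-follow-renames", "-n"];

/// Flags present in both lists, in sorted order.
fn shared_flags(a: &[&'static str], b: &[&'static str]) -> Vec<&'static str> {
    let a: BTreeSet<&'static str> = a.iter().copied().collect();
    let b: BTreeSet<&'static str> = b.iter().copied().collect();
    a.intersection(&b).copied().collect()
}

/// Flags present in the first list but missing from the second, in sorted order.
fn only_in_first(a: &[&'static str], b: &[&'static str]) -> Vec<&'static str> {
    let a: BTreeSet<&'static str> = a.iter().copied().collect();
    let b: BTreeSet<&'static str> = b.iter().copied().collect();
    a.difference(&b).copied().collect()
}

fn main() {
    let shared = shared_flags(&FILE_HISTORY_FLAGS, &TRACE_FLAGS);
    let extra = only_in_first(&FILE_HISTORY_FLAGS, &TRACE_FLAGS);
    println!("shared ({}): {:?}", shared.len(), shared);
    println!("file-history only: {:?}", extra);
}
```

The only key flag that `trace` lacks is `--merged`, so any consolidation of the two commands has to account for that one filter.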
Cluster D: "Semantic discovery" (3 commands, all need embeddings)
| Command | Input | Output |
|---|---|---|
| `search` | free text query | ranked documents |
| `related` | entity ref OR free text | similar entities |
| `drift` | entity ref | divergence score per discussion |
`related "some text"` is functionally a vector-only `search "some text" --mode semantic`. The difference is that `related` can also seed from an entity (`issues 42`), while `search` only accepts text.
`drift` is specialized enough to stand alone, but it's only used on issues and has a single non-project flag (`--threshold`).
Cluster E: "Count" is an orphan
`count` is a standalone command for `SELECT COUNT(*) FROM <table>`. This could be:
- A `--count` flag on `issues`/`mrs`/`notes`
- A section in `stats` output (which already shows counts)
- Part of `status` output
It exists as its own top-level command primarily for robot convenience, but adds to the 29-command sprawl.
3. Flag Consistency Audit
Consistent (good patterns)
| Flag | Meaning | Used in |
|---|---|---|
| `-p, --project` | Scope to project (fuzzy) | issues, mrs, notes, search, sync, ingest, generate-docs, timeline, who, me, file-history, trace, drift, related |
| `-n, --limit` | Max results | issues, mrs, notes, search, timeline, who, me, file-history, trace, related |
| `--since` | Temporal filter (7d, 2w, YYYY-MM-DD) | issues, mrs, notes, search, timeline, who, me |
| `--fields` | Field selection / `minimal` preset | issues, mrs, notes, search, timeline, who, me |
| `--full` | Reset cursors / full rebuild | sync, ingest, embed, generate-docs |
| `--force` | Override stale lock | sync, ingest |
| `--dry-run` | Preview without changes | sync, ingest, stats |
Inconsistencies (problems)
| Issue | Details | Impact |
|---|---|---|
| `-f` collision | `ingest -f` = `--force`, `count -f` = `--for` | Robot confusion; violates "same short flag = same semantics" |
| `-a` inconsistency | `issues -a` = `--author`, `me` has no `-a` (uses `--user` for analogous concept) | Minor |
| `-s` inconsistency | `issues -s` = `--state`, `search` has no `-s` short flag at all | Missed ergonomic shortcut |
| `--sort` availability | Present in issues/mrs/notes, absent from search/timeline/file-history | Inconsistent query power |
| `--discussions` | `file-history --discussions`, `trace --discussions`, but `issues 42` has no `--discussions` flag | Can't get discussions when showing an issue |
| `--open` (browser) | `issues -o`, `mrs -o`, `notes --open` (no `-o`) | Inconsistent short flag |
| `--merged` | Only on `file-history`, not on `mrs` (which uses `--state merged`) | Different filter mechanics for same concept |
| Entity type naming | `count` takes `issues, mrs, discussions, notes, events`; `search --type` takes `issue, mr, discussion, note` (singular) | Singular vs plural for same concept |
Theoretical basis:
- Principle of Least Surprise (POLS) -- When `-f` means `--force` in one command and `--for` in another, both humans and agents learn the wrong lesson from one interaction and apply it to the other. CLI design guides (GNU standards, POSIX conventions, clig.dev) are unanimous: short flags should have consistent semantics across all subcommands.
- Singular/plural inconsistency (`issues` vs `issue` as entity type values) is particularly harmful for LLM agents, which use pattern matching on prior successful invocations. If `lore count issues` works, the agent will try `lore search --type issues` -- and get a parse error.
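A check for the "same short flag = same semantics" rule is easy to automate. The following is a minimal sketch, assuming the flag table is available as plain data: the `SHORT_FLAGS` list contains only the pairs named in the table above, and the `collisions` function is hypothetical rather than part of lore.

```rust
use std::collections::BTreeMap;

/// (command, short flag, long flag) triples taken from the audit table.
const SHORT_FLAGS: [(&str, char, &str); 4] = [
    ("ingest", 'f', "--force"),
    ("count", 'f', "--for"),
    ("issues", 'a', "--author"),
    ("issues", 's', "--state"),
];

/// Returns every short flag that maps to more than one long flag across commands.
fn collisions(flags: &[(&str, char, &str)]) -> Vec<(char, Vec<String>)> {
    let mut by_short: BTreeMap<char, Vec<String>> = BTreeMap::new();
    for (_cmd, short, long) in flags {
        let longs = by_short.entry(*short).or_default();
        // Record each distinct long form once per short flag.
        if !longs.iter().any(|l| l.as_str() == *long) {
            longs.push((*long).to_string());
        }
    }
    by_short
        .into_iter()
        .filter(|(_, longs)| longs.len() > 1)
        .collect()
}

fn main() {
    for (short, longs) in collisions(&SHORT_FLAGS) {
        println!("-{short} is ambiguous: {longs:?}");
    }
}
```

Run against a complete flag inventory, a check like this could sit in a unit test so that new collisions fail the build instead of reaching users.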
4. Robot Ergonomics Assessment
Strengths (well above average for a CLI)
| Feature | Rating | Notes |
|---|---|---|
| Structured output | Excellent | Consistent {ok, data, meta} envelope |
| Auto-detection | Excellent | Non-TTY -> robot mode, LORE_ROBOT env var |
| Error output | Excellent | Structured JSON to stderr with actions array for recovery |
| Exit codes | Excellent | 20 distinct, well-documented codes |
| Self-discovery | Excellent | robot-docs manifest, --brief for token savings |
| Typo tolerance | Excellent | Autocorrect with confidence scores + structured warnings |
| Field selection | Good | --fields minimal saves ~60% tokens |
| No-args behavior | Good | Robot mode auto-outputs robot-docs |
Weaknesses
| Issue | Severity | Recommendation |
|---|---|---|
| 29 commands in robot-docs manifest | High | Agents spend tokens evaluating which command to use. Grouping would reduce decision space. |
| `status`/`stats`/`stat` near-homonyms | High | LLMs are particularly susceptible to surface-level lexical confusion. `stat` is an alias for `stats` while `status` is a different command -- this guarantees agent errors. |
| Singular vs plural entity types | Medium | `count issues` works but `search --type issues` fails. Agents learn from one and apply to the other. |
| Overlapping file commands | Medium | Agent must decide between `trace`, `file-history`, and `who --path`. The decision tree isn't obvious from names alone. |
| `count` as separate command | Low | Could be a flag; standalone command inflates the decision space |
5. Human Ergonomics Assessment
Strengths
| Feature | Rating | Notes |
|---|---|---|
| Help text quality | Excellent | Every command has examples, help headings organize flags |
| Short flags | Good | -p, -n, -s, -a, -J cover 80% of common use |
| Alias coverage | Good | issue/issues, mr/mrs, st/status, find/search |
| Subcommand inference | Good | lore issu -> issues via clap infer |
| Color/icon system | Good | Auto, with overrides |
Weaknesses
| Issue | Severity | Recommendation |
|---|---|---|
| 29 commands in flat help | High | Doesn't fit one terminal screen. No grouping -> overwhelming |
| `status` vs `stats` naming | High | Humans will type wrong one repeatedly |
| `health` vs `doctor` distinction | Medium | "Which one do I run?" -- unclear from names |
| `who` 5-mode overload | Medium | Help text is long; mode exclusions are complex |
| Pipeline stages as top-level | Low | ingest/generate-docs/embed rarely used directly but clutter help |
| `generate-docs` is 13 chars | Low | Longest command name; `gen-docs` or `gendocs` would help |
6. Proposals (Ranked by Impact x Feasibility)
P1: Help Grouping (HIGH impact, TRIVIAL effort)
Problem: 29 flat commands -> information overload.
Fix: Use clap's `help_heading` on subcommands to group them:

    Query:
      issues          List or show issues [aliases: issue]
      mrs             List or show merge requests [aliases: mr]
      notes           List notes from discussions [aliases: note]
      search          Search indexed documents [aliases: find]
      count           Count entities in local database
    Intelligence:
      timeline        Chronological timeline of events
      who             People intelligence: experts, workload, overlap
      me              Personal work dashboard
    File Analysis:
      trace           Trace why code was introduced
      file-history    Show MRs that touched a file
      related         Find semantically related entities
      drift           Detect discussion divergence
    Data Pipeline:
      sync            Run full sync pipeline
      ingest          Ingest data from GitLab
      generate-docs   Generate searchable documents
      embed           Generate vector embeddings
    System:
      init            Initialize configuration and database
      status          Show sync state [aliases: st]
      health          Quick health check
      doctor          Check environment health
      stats           Document and index statistics [aliases: stat]
      auth            Verify GitLab authentication
      token           Manage stored GitLab token
      migrate         Run pending database migrations
      cron            Manage automatic syncing
      completions     Generate shell completions
      robot-docs      Agent self-discovery manifest
      version         Show version information
Effort: ~20 lines of #[command(help_heading = "...")] annotations. No behavior changes.
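Whether clap accepts a heading attribute on subcommands may depend on the clap version in use, so it is worth confirming before committing to the annotation approach. As a fallback, the grouped layout itself is simple to produce by hand. The sketch below is a std-only illustration with a hypothetical `render_grouped_help` function and a small subset of the proposed groups; it does not use clap.

```rust
/// (heading, command, description) rows; a subset of the proposed grouping.
/// Rows are assumed to be pre-sorted so that each heading is contiguous.
const COMMANDS: [(&str, &str, &str); 5] = [
    ("Query", "issues", "List or show issues"),
    ("Query", "mrs", "List or show merge requests"),
    ("Data Pipeline", "sync", "Run full sync pipeline"),
    ("System", "status", "Show sync state"),
    ("System", "doctor", "Check environment health"),
];

/// Renders rows as help text, emitting a heading line whenever the group changes.
fn render_grouped_help(rows: &[(&str, &str, &str)]) -> String {
    let mut out = String::new();
    let mut current: Option<&str> = None;
    for (heading, name, about) in rows {
        if current != Some(*heading) {
            out.push_str(&format!("{heading}:\n"));
            current = Some(*heading);
        }
        // Left-align command names in a 16-character column.
        out.push_str(&format!("  {name:<16}{about}\n"));
    }
    out
}

fn main() {
    print!("{}", render_grouped_help(&COMMANDS));
}
```

The same table of rows could also feed the robot-docs manifest, so humans and agents see one grouping.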
P2: Resolve status/stats Confusion (HIGH impact, LOW effort)
Option A (recommended): Rename `stats` -> `index`.
- `lore status` = when did I last sync? (pipeline state)
- `lore index` = how big is my index? (data inventory)
- The alias `stat` goes away (it was causing confusion anyway)
Option B: Rename status -> sync-state and stats -> db-stats. More descriptive but longer.
Option C: Merge both under check (see P4).
P3: Fix Singular/Plural Entity Type Inconsistency (MEDIUM impact, TRIVIAL effort)
Accept both singular and plural forms everywhere:
- `count` already takes `issues` (plural) -- also accept `issue`
- `search --type` already takes `issue` (singular) -- also accept `issues`
- `drift` takes `issues` -- also accept `issue`
This is a ~10 line change in the value parsers and eliminates an entire class of agent errors.
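A parser that accepts both forms might look like the sketch below. The `EntityType` enum and `parse_entity_type` function are hypothetical names; the accepted spellings are the ones listed in the flag audit. A function of this shape (`&str` in, `Result` out) is the kind of thing that could be plugged into the existing value parsers, though the actual wiring is not shown here.

```rust
/// Entity kinds named in the audit; the enum itself is illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EntityType {
    Issue,
    Mr,
    Discussion,
    Note,
    Event,
}

/// Accepts singular and plural spellings for every entity type.
fn parse_entity_type(raw: &str) -> Result<EntityType, String> {
    match raw {
        "issue" | "issues" => Ok(EntityType::Issue),
        "mr" | "mrs" => Ok(EntityType::Mr),
        "discussion" | "discussions" => Ok(EntityType::Discussion),
        "note" | "notes" => Ok(EntityType::Note),
        "event" | "events" => Ok(EntityType::Event),
        other => Err(format!(
            "unknown entity type '{other}' (expected issue, mr, discussion, note, or event)"
        )),
    }
}

fn main() {
    // Both spellings resolve to the same variant.
    println!("{:?}", parse_entity_type("issues"));
    println!("{:?}", parse_entity_type("issue"));
    // Unknown values still produce an error the caller can surface.
    println!("{:?}", parse_entity_type("epic"));
}
```

Sharing one parser across `count`, `search --type`, and `drift` keeps the accepted spellings from drifting apart again.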
P4: Merge health + doctor (MEDIUM impact, LOW effort)
health is a fast subset of doctor. Merge:
- `lore doctor` = full diagnostic (current behavior)
- `lore doctor --quick` = fast pre-flight, exit-code-only (current `health`)
- Drop `health` as a separate command, add a hidden alias for backward compat
P5: Fix -f Short Flag Collision (MEDIUM impact, TRIVIAL effort)
Change count's -f, --for to just --for (no short flag). -f should mean --force project-wide, or nowhere.
P6: Consolidate trace + file-history (MEDIUM impact, MEDIUM effort)
trace already does everything file-history does plus more. Options:
Option A: Make file-history an alias for trace --flat (shows MR list without issue/discussion linking).
Option B: Add --mrs-only to trace that produces file-history output. Deprecate file-history with a hidden alias.
Either way, one fewer top-level command and no lost functionality.
P7: Hide Pipeline Sub-stages (LOW impact, TRIVIAL effort)
Move ingest, generate-docs, embed to #[command(hide = true)]. They remain usable but don't clutter --help. Direct users to sync with stage-skip flags.
For power users who need individual stages, document in `sync --help`:

    To run individual stages:
      lore ingest          # Fetch from GitLab only
      lore generate-docs   # Rebuild documents only
      lore embed           # Re-embed only
P8: Make count a Flag, Not a Command (LOW impact, MEDIUM effort)
Add `--count` to `issues`, `mrs`, and `notes`:

    lore issues --count   # replaces: lore count issues
    lore mrs --count      # replaces: lore count mrs
    lore notes --count    # replaces: lore count notes
Keep count as a hidden alias for backward compatibility. Removes one top-level command.
P9: Consistent --open Short Flag (LOW impact, TRIVIAL effort)
notes --open lacks the -o shorthand that issues and mrs have. Add it.
P10: Add --sort to search (LOW impact, LOW effort)
search returns ranked results but offers no --sort override. Adding --sort=score,created,updated would bring it in line with issues/mrs/notes.
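The sort override could be sketched as follows. The key names come from the proposal (`score`, `created`, `updated`); the `Hit` struct, its fields, and the sample data are assumptions made for illustration and do not reflect lore's real result type.

```rust
/// Sort keys from the proposal.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SortKey {
    Score,
    Created,
    Updated,
}

fn parse_sort_key(raw: &str) -> Result<SortKey, String> {
    match raw {
        "score" => Ok(SortKey::Score),
        "created" => Ok(SortKey::Created),
        "updated" => Ok(SortKey::Updated),
        other => Err(format!("unknown sort key '{other}' (expected score, created, or updated)")),
    }
}

/// Hypothetical search hit; timestamps are plain integers for brevity.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    title: &'static str,
    score: f64,
    created: u64,
    updated: u64,
}

/// Sorts hits in descending order of the chosen key.
fn sort_hits(hits: &mut [Hit], key: SortKey) {
    match key {
        SortKey::Score => hits.sort_by(|a, b| b.score.total_cmp(&a.score)),
        SortKey::Created => hits.sort_by(|a, b| b.created.cmp(&a.created)),
        SortKey::Updated => hits.sort_by(|a, b| b.updated.cmp(&a.updated)),
    }
}

fn titles(hits: &[Hit]) -> Vec<&'static str> {
    hits.iter().map(|h| h.title).collect()
}

fn sample_hits() -> Vec<Hit> {
    vec![
        Hit { title: "a", score: 0.40, created: 300, updated: 310 },
        Hit { title: "b", score: 0.90, created: 100, updated: 500 },
        Hit { title: "c", score: 0.70, created: 200, updated: 205 },
    ]
}

fn main() {
    for raw in ["score", "created", "updated"] {
        let mut hits = sample_hits();
        sort_hits(&mut hits, parse_sort_key(raw).expect("known key"));
        println!("--sort={raw}: {:?}", titles(&hits));
    }
}
```

Keeping `score` as the default preserves the current ranked behavior, so the flag would be purely additive.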
7. Summary: Proposed Command Tree (After All Changes)
If all proposals were adopted, the visible top level shrinks from 29 -> 23. The table omits the built-in `help`, which is unchanged:
| Before (29) | After (23) | Change |
|---|---|---|
| `issues` | `issues` | -- |
| `mrs` | `mrs` | -- |
| `notes` | `notes` | -- |
| `search` | `search` | -- |
| `timeline` | `timeline` | -- |
| `who` | `who` | -- |
| `me` | `me` | -- |
| `file-history` | (hidden, alias for `trace --flat`) | merged into trace |
| `trace` | `trace` | absorbs file-history |
| `drift` | `drift` | -- |
| `related` | `related` | -- |
| `count` | (hidden, `issues --count` replaces) | absorbed |
| `sync` | `sync` | -- |
| `ingest` | (hidden) | hidden |
| `generate-docs` | (hidden) | hidden |
| `embed` | (hidden) | hidden |
| `status` | `status` | -- |
| `health` | (merged into `doctor`) | merged |
| `doctor` | `doctor` | absorbs health |
| `stats` | `index` | renamed |
| `init` | `init` | -- |
| `auth` | `auth` | -- |
| `token` | `token` | -- |
| `migrate` | `migrate` | -- |
| `cron` | `cron` | -- |
| `robot-docs` | `robot-docs` | -- |
| `completions` | `completions` | -- |
| `version` | `version` | -- |
Net reduction: 29 -> 23 visible (-21%). Six commands leave the visible surface (`file-history`, `count`, `ingest`, `generate-docs`, `embed`, `health`); `stats` is renamed, not removed. The hidden commands remain fully functional and documented in robot-docs for agents that already use them.
Theoretical basis:
- Miller's Law -- Humans can hold 7+/-2 items in working memory. 29 commands far exceed this. Even with help grouping (P1), the sheer count creates decision fatigue. The literature on CLI design (Heroku's "12-Factor CLI", clig.dev's "Command Line Interface Guidelines") recommends 10-15 top-level commands maximum, with grouping or nesting for anything beyond.
- For LLM agents specifically: Research on tool use with large tool sets (Schick et al. 2023, Qin et al. 2023) suggests that agent accuracy degrades as the tool count increases. Reducing from 29 to 23 commands in the robot-docs manifest should improve agent command selection accuracy.
- Backward compatibility is free: Since AGENTS.md says "we don't care about backward compatibility," hidden aliases cost nothing and prevent breakage for agents with cached robot-docs.
8. Priority Matrix
| Proposal | Impact | Effort | Risk | Recommended Order |
|---|---|---|---|---|
| P1: Help grouping | High | Trivial | None | Do first |
| P3: Singular/plural fix | Medium | Trivial | None | Do first |
| P5: Fix `-f` collision | Medium | Trivial | None | Do first |
| P9: `notes -o` shorthand | Low | Trivial | None | Do first |
| P2: Rename `stats`->`index` | High | Low | Alias needed | Do second |
| P4: Merge health->doctor | Medium | Low | Alias needed | Do second |
| P7: Hide pipeline stages | Low | Trivial | Needs docs update | Do second |
| P6: Merge file-history->trace | Medium | Medium | Flag design | Plan carefully |
| P8: count -> --count flag | Low | Medium | Compat shim | Plan carefully |
| P10: `--sort` on `search` | Low | Low | None | When convenient |
The "do first" tier is 4 changes that could ship in a single commit with zero risk and immediate ergonomic improvement for both humans and agents.
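One way to derive an ordering from the matrix is to sort by effort first and impact second. The sketch below does exactly that, using the Impact and Effort columns above; the `Level` and `Proposal` types are illustrative, and the Risk column is deliberately ignored, so the result is close to, but not identical with, the Recommended Order column (P7 and P10 land in different tiers).

```rust
/// Ordinal scale shared by the Impact and Effort columns.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Trivial,
    Low,
    Medium,
    High,
}

#[derive(Debug, Clone, Copy)]
struct Proposal {
    id: &'static str,
    impact: Level,
    effort: Level,
}

/// Rows copied from the priority matrix, in table order.
fn proposals() -> Vec<Proposal> {
    use Level::*;
    vec![
        Proposal { id: "P1", impact: High, effort: Trivial },
        Proposal { id: "P3", impact: Medium, effort: Trivial },
        Proposal { id: "P5", impact: Medium, effort: Trivial },
        Proposal { id: "P9", impact: Low, effort: Trivial },
        Proposal { id: "P2", impact: High, effort: Low },
        Proposal { id: "P4", impact: Medium, effort: Low },
        Proposal { id: "P7", impact: Low, effort: Trivial },
        Proposal { id: "P6", impact: Medium, effort: Medium },
        Proposal { id: "P8", impact: Low, effort: Medium },
        Proposal { id: "P10", impact: Low, effort: Low },
    ]
}

/// Cheapest first; within the same effort, highest impact first.
/// The sort is stable, so ties keep their table order.
fn ranked(mut ps: Vec<Proposal>) -> Vec<&'static str> {
    ps.sort_by(|a, b| a.effort.cmp(&b.effort).then(b.impact.cmp(&a.impact)));
    ps.iter().map(|p| p.id).collect()
}

fn main() {
    println!("{:?}", ranked(proposals()));
}
```

The two differences from the table show where the Risk column matters: P7 needs a docs update and P10 has no urgency, which a two-key sort cannot express.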