docs: update README with notes, drift, error tolerance, scoring config, and expanded command reference

Major additions:
- lore notes command: full documentation of rich note querying with filters (author, type, path, resolution, time range, body substring), sort/format options, field selection, and browser opening
- lore drift command: discussion divergence detection documentation
- Error Tolerance section: table of all 8 auto-correction types with examples and mode behavior, stderr JSON warning format, fuzzy suggestion format for unrecognized commands
- Command Aliases table: primary commands and their accepted aliases
- scoring config section: all weight/half-life/decay parameters for the who-expert scoring engine (authorWeight, reviewerWeight, noteBonus, half-life periods, closedMrMultiplier, excludedUsernames)

Updates to existing sections:
- Timeline: entity-direct seeding syntax (issue:N, i:N, mr:N, m:N), hybrid search pipeline description replacing pure FTS5, discussion thread collection, --fields flag, numbered progress spinners
- Search: --after/--updated-after renamed to --since/--updated-since, progress spinner behavior, note type filter
- Who: --explain-score, --as-of, --include-bots, --all-history, --detail
- Sync: --no-file-changes flag
- Robot-docs: --brief flag
- Field selection: expanded to note which commands support --fields
2026-02-13 17:27:49 -05:00
parent e0041ed4d9
commit 159c490ad7
1 changed file with 158 additions and 10 deletions
--- a/README.md
+++ b/README.md
@@ -19,7 +19,10 @@ Local GitLab data management with semantic search, people intelligence, and temp
 - **Cross-reference tracking**: Automatic extraction of "closes", "mentioned" relationships between MRs and issues
 - **Work item status enrichment**: Fetches issue statuses (e.g., "To do", "In progress", "Done") from GitLab's GraphQL API with adaptive page sizing, color-coded display, and case-insensitive filtering
 - **Resource event history**: Tracks state changes, label events, and milestone events for issues and MRs
+- **Note querying**: Rich filtering over discussion notes by author, type, path, resolution status, time range, and body content
+- **Discussion drift detection**: Semantic analysis of how discussions diverge from original issue intent
 - **Robot mode**: Machine-readable JSON output with structured errors, meaningful exit codes, and actionable recovery steps
+- **Error tolerance**: Auto-corrects common CLI mistakes (case, typos, single-dash flags, value casing) with teaching feedback
 - **Observability**: Verbosity controls, JSON log format, structured metrics, and stage timing

 ## Installation
@@ -71,6 +74,12 @@ lore who @asmith
 # Timeline of events related to deployments
 lore timeline "deployment"

+# Timeline for a specific issue
+lore timeline issue:42
+
+# Query notes by author
+lore notes --author alice --since 7d
+
 # Robot mode (machine-readable JSON)
 lore -J issues -n 5 | jq .
 ```
@@ -109,6 +118,15 @@ Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lo
    "model": "nomic-embed-text",
    "baseUrl": "http://localhost:11434",
    "concurrency": 4
+  },
+  "scoring": {
+    "authorWeight": 25,
+    "reviewerWeight": 10,
+    "noteBonus": 1,
+    "authorHalfLifeDays": 180,
+    "reviewerHalfLifeDays": 90,
+    "noteHalfLifeDays": 45,
+    "excludedUsernames": ["bot-user"]
  }
 }
 ```
@@ -135,6 +153,15 @@ Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lo
 | `embedding` | `model` | `nomic-embed-text` | Model name for embeddings |
 | `embedding` | `baseUrl` | `http://localhost:11434` | Ollama server URL |
 | `embedding` | `concurrency` | `4` | Concurrent embedding requests |
+| `scoring` | `authorWeight` | `25` | Points per MR where the user authored code touching the path |
+| `scoring` | `reviewerWeight` | `10` | Points per MR where the user reviewed code touching the path |
+| `scoring` | `noteBonus` | `1` | Bonus per inline review comment (DiffNote) |
+| `scoring` | `reviewerAssignmentWeight` | `3` | Points per MR where the user was assigned as reviewer |
+| `scoring` | `authorHalfLifeDays` | `180` | Half-life in days for author contribution decay |
+| `scoring` | `reviewerHalfLifeDays` | `90` | Half-life in days for reviewer contribution decay |
+| `scoring` | `noteHalfLifeDays` | `45` | Half-life in days for note/comment decay |
+| `scoring` | `closedMrMultiplier` | `0.5` | Score multiplier for closed (not merged) MRs |
+| `scoring` | `excludedUsernames` | `[]` | Usernames excluded from expert results (e.g., bots) |
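
To make the scoring parameters concrete, here is a minimal Python sketch of exponential half-life decay using the defaults from the table above. How the engine actually combines the components is not shown in this diff, so the additive formula, the `ScoringConfig` class, and the `mr_score` helper are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ScoringConfig:
    # Defaults copied from the configuration table above.
    author_weight: float = 25
    reviewer_weight: float = 10
    note_bonus: float = 1
    author_half_life_days: float = 180
    reviewer_half_life_days: float = 90
    note_half_life_days: float = 45
    closed_mr_multiplier: float = 0.5


def decay(age_days: float, half_life_days: float) -> float:
    """Exponential half-life decay: a contribution halves every half_life_days."""
    return 0.5 ** (max(age_days, 0) / half_life_days)


def mr_score(cfg: ScoringConfig, role: str, age_days: float,
             note_count: int = 0, merged: bool = True) -> float:
    """Hypothetical per-MR score; the additive combination is an assumption."""
    if role == "author":
        base = cfg.author_weight * decay(age_days, cfg.author_half_life_days)
    else:
        base = cfg.reviewer_weight * decay(age_days, cfg.reviewer_half_life_days)
    notes = note_count * cfg.note_bonus * decay(age_days, cfg.note_half_life_days)
    total = base + notes
    # Closed (not merged) MRs are scaled down by closedMrMultiplier.
    return total if merged else total * cfg.closed_mr_multiplier
```

Under these assumptions, an MR authored 180 days ago contributes half of `authorWeight`, and the same MR closed without merging contributes a quarter.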

 ### Config File Resolution

@@ -262,18 +289,21 @@ lore search "login flow" --mode semantic      # Vector similarity only
 lore search "auth" --type issue               # Filter by source type
 lore search "auth" --type mr                  # MR documents only
 lore search "auth" --type discussion          # Discussion documents only
+lore search "auth" --type note               # Individual notes only
 lore search "deploy" --author username        # Filter by author
 lore search "deploy" -p group/repo           # Filter by project
 lore search "deploy" --label backend          # Filter by label (AND logic)
 lore search "deploy" --path src/             # Filter by file path (trailing / for prefix)
-lore search "deploy" --after 7d              # Created after (7d, 2w, 1m, or YYYY-MM-DD)
-lore search "deploy" --updated-after 2w      # Updated after
+lore search "deploy" --since 7d              # Created since (7d, 2w, 1m, or YYYY-MM-DD)
+lore search "deploy" --updated-since 2w      # Updated since
 lore search "deploy" -n 50                    # Limit results (default 20, max 100)
 lore search "deploy" --explain               # Show ranking explanation per result
 lore search "deploy" --fts-mode raw          # Raw FTS5 query syntax (advanced)
 ```

-The `--fts-mode` flag defaults to `safe`, which sanitizes user input into valid FTS5 queries with automatic fallback. Use `raw` for advanced FTS5 query syntax (AND, OR, NOT, phrase matching, prefix queries).
+The `--fts-mode` flag defaults to `safe`, which sanitizes user input into valid FTS5 queries with automatic fallback. FTS5 boolean operators (`AND`, `OR`, `NOT`, `NEAR`) are passed through in safe mode, so queries like `"switch AND health"` work without switching to raw mode. Use `raw` for advanced FTS5 query syntax (phrase matching, column filters, prefix queries).
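
One way to picture safe mode is as a token-level quoting pass that leaves the listed operators alone. The Python sketch below is an assumption about the general approach, not the tool's actual sanitizer; `to_safe_fts5` is a hypothetical name, and the operator list is taken from the paragraph above.

```python
# Operators that safe mode passes through, per the paragraph above.
FTS5_OPERATORS = {"AND", "OR", "NOT", "NEAR"}


def to_safe_fts5(user_input: str) -> str:
    """Quote every non-operator token so punctuation cannot break the query."""
    parts = []
    for token in user_input.split():
        if token in FTS5_OPERATORS:
            parts.append(token)
        else:
            # FTS5 string literals escape embedded double quotes by doubling them.
            parts.append('"' + token.replace('"', '""') + '"')
    return " ".join(parts)
```

A sketch like this still produces an invalid query for input such as a trailing `AND`, which is presumably where the automatic fallback mentioned above comes in.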
+
+A progress spinner displays during search, showing the active mode (e.g., `Searching (hybrid)...`). In robot mode, spinners are suppressed for clean JSON output.

 Requires `lore generate-docs` (or `lore sync`) to have been run at least once. Semantic and hybrid modes require `lore embed` (or `lore sync`) to have generated vector embeddings via Ollama.

@@ -283,7 +313,7 @@ People intelligence: discover experts, analyze workloads, review patterns, activ

 #### Expert Mode

-Find who has expertise in a code area based on authoring and reviewing history (DiffNote analysis).
+Find who has expertise in a code area based on authoring and reviewing history (DiffNote analysis). Scores use exponential half-life decay so recent contributions count more than older ones. Scoring weights and half-life periods are configurable via the `scoring` config section.

 ```bash
 lore who src/features/auth/           # Who knows about this directory?
@@ -292,6 +322,9 @@ lore who --path README.md             # Root files need --path flag
 lore who --path Makefile              # Dotless root files too
 lore who src/ --since 3m              # Limit to recent 3 months
 lore who src/ -p group/repo           # Scope to project
+lore who src/ --explain-score         # Show per-component score breakdown
+lore who src/ --as-of 30d            # Score as if "now" was 30 days ago
+lore who src/ --include-bots          # Include bot users in results
 ```

 The target is auto-detected as a path when it contains `/`. For root files without `/` (e.g., `README.md`), use the `--path` flag. Default time window: 6 months.
@@ -348,13 +381,22 @@ Shows: users with touch counts (author vs. review), linked MR references. Defaul
 | `-p` / `--project` | Scope to a project (fuzzy match) |
 | `--since` | Time window (7d, 2w, 6m, YYYY-MM-DD). Default varies by mode. |
 | `-n` / `--limit` | Max results per section (1-500, default 20) |
+| `--all-history` | Remove the default time window, query all history |
+| `--detail` | Show per-MR detail breakdown (expert mode only) |
+| `--explain-score` | Show per-component score breakdown (expert mode only) |
+| `--as-of` | Score as if "now" is a past date (ISO 8601 or duration like 30d, expert mode only) |
+| `--include-bots` | Include bot users normally excluded via `scoring.excludedUsernames` |

 ### `lore timeline`

 Reconstruct a chronological timeline of events matching a keyword query. The pipeline discovers related entities through cross-reference graph traversal and assembles a unified, time-ordered event stream.

 ```bash
-lore timeline "deployment"                    # Events related to deployments
+lore timeline "deployment"                    # Search-based seeding (hybrid search)
+lore timeline issue:42                        # Direct entity seeding by issue IID
+lore timeline i:42                            # Shorthand for issue:42
+lore timeline mr:99                           # Direct entity seeding by MR IID
+lore timeline m:99                            # Shorthand for mr:99
 lore timeline "auth" -p group/repo           # Scoped to a project
 lore timeline "auth" --since 30d             # Only recent events
 lore timeline "migration" --depth 2          # Deeper cross-reference expansion
@@ -363,6 +405,8 @@ lore timeline "deploy" -n 50                 # Limit event count
 lore timeline "auth" --max-seeds 5           # Fewer seed entities
 ```

+The query can be either a search string (hybrid search finds matching entities) or an entity reference (`issue:N`, `i:N`, `mr:N`, `m:N`) which directly seeds the timeline from a specific entity and its cross-references.
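
The distinction between the two query forms can be sketched as a small parser. This is an illustrative Python sketch, not the tool's implementation; `parse_seed` is a hypothetical name, and treating the prefixes as case-sensitive is an assumption.

```python
import re
from typing import Tuple, Union

# Accepts issue:N, i:N, mr:N, and m:N, as listed above.
_ENTITY_REF = re.compile(r"^(issue|i|mr|m):(\d+)$")


def parse_seed(query: str) -> Tuple[str, Union[int, str]]:
    """Return ("issue", iid), ("mr", iid), or ("search", query)."""
    match = _ENTITY_REF.match(query.strip())
    if match is None:
        # Anything that is not an entity reference is a search string.
        return ("search", query)
    kind = "issue" if match.group(1) in ("issue", "i") else "mr"
    return (kind, int(match.group(2)))
```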
+
 #### Flags

 | Flag | Default | Description |
@@ -375,13 +419,16 @@ lore timeline "auth" --max-seeds 5           # Fewer seed entities
 | `--max-seeds` | `10` | Maximum seed entities from search |
 | `--max-entities` | `50` | Maximum entities discovered via cross-references |
 | `--max-evidence` | `10` | Maximum evidence notes included |
+| `--fields` | all | Select output fields (comma-separated, or 'minimal' preset) |

 #### Pipeline Stages

-1. **SEED** -- Full-text search identifies the most relevant issues and MRs matching the query. Documents are ranked by BM25 relevance.
-2. **HYDRATE** -- Evidence notes are extracted: the top FTS-matched discussion notes with 200-character snippets explaining *why* each entity was surfaced.
+Each stage displays a numbered progress spinner (e.g., `[1/3] Seeding timeline...`). In robot mode, spinners are suppressed for clean JSON output.
+
+1. **SEED** -- Hybrid search (FTS5 lexical + Ollama vector similarity via Reciprocal Rank Fusion) identifies the most relevant issues and MRs. Falls back to lexical-only if Ollama is unavailable. Discussion notes matching the query are also discovered and attached to their parent entities.
+2. **HYDRATE** -- Evidence notes are extracted: the top search-matched discussion notes with 200-character snippets explaining *why* each entity was surfaced. Matched discussions are collected as full thread candidates.
 3. **EXPAND** -- Breadth-first traversal over the `entity_references` graph discovers related entities via "closes", "related", and optionally "mentioned" references up to the configured depth.
-4. **COLLECT** -- Events are gathered for all discovered entities. Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, and evidence notes. Events are sorted chronologically with stable tiebreaking.
+4. **COLLECT** -- Events are gathered for all discovered entities. Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, evidence notes, and full discussion threads. Events are sorted chronologically with stable tiebreaking.
 5. **RENDER** -- Events are formatted as human-readable text or structured JSON (robot mode).
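
The SEED stage's fusion step can be illustrated with a generic Reciprocal Rank Fusion sketch in Python. The constant `k=60` is the conventional choice in the RRF literature and is an assumption here, as are the function name and the string entity IDs; the diff does not state the values the tool uses.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each list contributes 1 / (k + rank) per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Passing a single ranking returns it unchanged, which mirrors the lexical-only fallback described for when Ollama is unavailable.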

 #### Event Types
@@ -395,13 +442,70 @@ lore timeline "auth" --max-seeds 5           # Fewer seed entities
 | `MilestoneSet` | Milestone assigned |
 | `MilestoneRemoved` | Milestone removed |
 | `Merged` | MR merged (deduplicated against state events) |
-| `NoteEvidence` | Discussion note matched by FTS, with snippet |
+| `NoteEvidence` | Discussion note matched by search, with snippet |
+| `DiscussionThread` | Full discussion thread with all non-system notes |
 | `CrossReferenced` | Reference to another entity |

 #### Unresolved References

 When graph expansion encounters cross-project references to entities not yet synced locally, these are collected as unresolved references in the output. This enables discovery of external dependencies and can inform future sync targets.
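
The EXPAND stage and its unresolved-reference handling can be sketched as a depth-limited breadth-first search. The Python below is a simplified illustration; the data shapes, the `expand` function, and the `include_mentioned` parameter are assumptions, not the tool's internal API.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def expand(seeds: List[str],
           references: Dict[str, List[Tuple[str, str]]],
           local_entities: Set[str],
           max_depth: int = 1,
           max_entities: int = 50,
           include_mentioned: bool = False) -> Tuple[List[str], List[str]]:
    """Return (discovered entities, unresolved references) in BFS order."""
    allowed = {"closes", "related"}
    if include_mentioned:
        allowed.add("mentioned")
    seen = set(seeds)
    discovered: List[str] = []
    unresolved: List[str] = []
    queue = deque((seed, 0) for seed in seeds)
    while queue and len(discovered) < max_entities:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for target, ref_type in references.get(entity, []):
            if ref_type not in allowed or target in seen:
                continue
            seen.add(target)
            if target not in local_entities:
                # Not synced locally: record it instead of traversing further.
                unresolved.append(target)
                continue
            if len(discovered) >= max_entities:
                break
            discovered.append(target)
            queue.append((target, depth + 1))
    return discovered, unresolved
```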

+### `lore notes`
+
+Query individual notes from discussions with rich filtering options.
+
+```bash
+lore notes                                    # List 50 most recent notes
+lore notes --author alice --since 7d         # Notes by alice in last 7 days
+lore notes --for-issue 42 -p group/repo      # Notes on issue #42
+lore notes --for-mr 99 -p group/repo         # Notes on MR !99
+lore notes --path src/ --resolution unresolved  # Unresolved diff notes in src/
+lore notes --note-type DiffNote              # Only inline code review comments
+lore notes --contains "TODO"                 # Substring search in note body
+lore notes --include-system                  # Include system-generated notes
+lore notes --since 2w --until 2024-12-31     # Time-bounded range
+lore notes --sort updated --asc              # Sort by update time, ascending
+lore notes --format csv                      # CSV output
+lore notes --format jsonl                    # Line-delimited JSON
+lore notes -o                                # Open first result in browser
+
+# Field selection (robot mode)
+lore -J notes --fields minimal               # Compact: id, author_username, body, created_at_iso
+```
+
+#### Filters
+
+| Flag | Description |
+|------|-------------|
+| `-a` / `--author` | Filter by note author username |
+| `--note-type` | Filter by note type (DiffNote, DiscussionNote) |
+| `--contains` | Substring search in note body |
+| `--note-id` | Filter by internal note ID |
+| `--gitlab-note-id` | Filter by GitLab note ID |
+| `--discussion-id` | Filter by discussion ID |
+| `--include-system` | Include system notes (excluded by default) |
+| `--for-issue` | Notes on a specific issue IID (requires `-p`) |
+| `--for-mr` | Notes on a specific MR IID (requires `-p`) |
+| `-p` / `--project` | Scope to a project (fuzzy match) |
+| `--since` | Notes created since (7d, 2w, 1m, or YYYY-MM-DD) |
+| `--until` | Notes created until (YYYY-MM-DD, inclusive end-of-day) |
+| `--path` | Filter by file path (DiffNotes only; trailing `/` for prefix match) |
+| `--resolution` | Filter by resolution status (`any`, `unresolved`, `resolved`) |
+| `--sort` | Sort by `created` (default) or `updated` |
+| `--asc` | Sort ascending (default: descending) |
+| `--format` | Output format: `table` (default), `json`, `jsonl`, `csv` |
+| `-o` / `--open` | Open first result in browser |
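
The `--since` and `--until` value formats can be illustrated with a short Python sketch. Treating `1m` as 30 days is an assumption (the diff does not define the month length), and `parse_since` and `parse_until` are hypothetical helper names.

```python
import re
from datetime import date, datetime, time, timedelta

# Assumption: a "month" is approximated as 30 days.
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30}


def parse_since(value: str, now: datetime) -> datetime:
    """Accept a relative duration (7d, 2w, 1m) or an absolute YYYY-MM-DD date."""
    match = re.fullmatch(r"(\d+)([dwm])", value)
    if match:
        days = int(match.group(1)) * _UNIT_DAYS[match.group(2)]
        return now - timedelta(days=days)
    # Absolute dates start at midnight.
    return datetime.combine(date.fromisoformat(value), time.min)


def parse_until(value: str) -> datetime:
    """Inclusive end-of-day: the last representable instant of the given date."""
    return datetime.combine(date.fromisoformat(value), time.max)
```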
+
+### `lore drift`
+
+Detect when a discussion has diverged from the original intent of an issue by measuring the semantic similarity between the discussion content and the issue description.
+
+```bash
+lore drift issues 42                         # Check divergence on issue #42
+lore drift issues 42 --threshold 0.6        # Higher threshold (stricter)
+lore drift issues 42 -p group/repo          # Scope to project
+```
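
As a rough illustration of threshold-based divergence detection, the Python sketch below flags notes whose embedding has a cosine similarity to the issue description below the threshold. The exact comparison the command performs is not documented in this diff, so the rule, the function names, and the toy vectors in the usage note are all assumptions.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def drifted_notes(description_vec: Sequence[float],
                  note_vecs: List[Sequence[float]],
                  threshold: float) -> List[int]:
    """Indexes of notes whose similarity to the description is below threshold."""
    return [
        index
        for index, vec in enumerate(note_vecs)
        if cosine_similarity(description_vec, vec) < threshold
    ]
```

With a description vector `[1, 0]` and notes `[[1, 0], [1, 1], [0, 1]]`, a threshold of 0.6 flags only the last note, while 0.8 also flags the second, which matches the "higher threshold is stricter" reading of the example above.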
+
 ### `lore sync`

 Run the full sync pipeline: ingest from GitLab (including work item status enrichment via GraphQL), generate searchable documents, and compute embeddings.
@@ -413,6 +517,7 @@ lore sync --force            # Override stale lock
 lore sync --no-embed         # Skip embedding step
 lore sync --no-docs          # Skip document regeneration
 lore sync --no-events        # Skip resource event fetching
+lore sync --no-file-changes  # Skip MR file change fetching
 lore sync --dry-run          # Preview what would be synced
 ```

@@ -571,6 +676,7 @@ Machine-readable command manifest for agent self-discovery. Returns a JSON schem
 ```bash
 lore robot-docs                   # Pretty-printed JSON
 lore --robot robot-docs           # Compact JSON for parsing
+lore robot-docs --brief           # Omit response_schema (~60% smaller)
 ```

 ### `lore version`
@@ -622,7 +728,7 @@ The `actions` array contains executable shell commands an agent can run to recov

 ### Field Selection

-The `--fields` flag on `issues` and `mrs` list commands controls which fields appear in the JSON response, reducing token usage for AI agent workflows:
+The `--fields` flag controls which fields appear in the JSON response, reducing token usage for AI agent workflows. Supported on `issues`, `mrs`, `notes`, `search`, `timeline`, and `who` list commands:

 ```bash
 # Minimal preset (~60% fewer tokens)
@@ -639,6 +745,48 @@ Valid fields for issues: `iid`, `title`, `state`, `author_username`, `labels`, `

 Valid fields for MRs: `iid`, `title`, `state`, `author_username`, `labels`, `draft`, `target_branch`, `source_branch`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `reviewers`
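
Field selection amounts to projecting each record onto a list of keys. The Python sketch below illustrates the idea with the `notes` minimal preset (`id`, `author_username`, `body`, `created_at_iso`) documented for `lore -J notes --fields minimal`; silently ignoring unknown field names is an assumption, and `select_fields` is a hypothetical helper.

```python
from typing import Dict, List

# The notes "minimal" preset as documented; other commands have their own presets.
PRESETS = {"minimal": ["id", "author_username", "body", "created_at_iso"]}


def select_fields(records: List[Dict], fields: str) -> List[Dict]:
    """Keep only the requested keys; `fields` is a preset name or a comma list."""
    wanted = PRESETS.get(fields)
    if wanted is None:
        wanted = [name.strip() for name in fields.split(",") if name.strip()]
    return [{key: record[key] for key in wanted if key in record}
            for record in records]
```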

+### Error Tolerance
+
+The CLI auto-corrects common mistakes before parsing, emitting a teaching note to stderr. Corrections work in both human and robot modes:
+
+| Correction | Example | Mode |
+|-----------|---------|------|
+| Single-dash long flag | `-robot` -> `--robot` | All |
+| Case normalization | `--Robot` -> `--robot` | All |
+| Flag prefix expansion | `--proj` -> `--project` (unambiguous only) | All |
+| Fuzzy flag match | `--projct` -> `--project` | All (threshold 0.9 in robot, 0.8 in human) |
+| Subcommand alias | `merge_requests` -> `mrs`, `robotdocs` -> `robot-docs` | All |
+| Value normalization | `--state Opened` -> `--state opened` | All |
+| Value fuzzy match | `--state opend` -> `--state opened` | All |
+| Subcommand prefix | `lore iss` -> `lore issues` (unambiguous only, via clap) | All |
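
The flag corrections in the table can be approximated with a short Python sketch. It uses `difflib.SequenceMatcher` as a stand-in similarity metric, which is an assumption: the diff gives the thresholds (0.9 robot, 0.8 human) but not the metric, so real scores may differ. `correct_flag` and the order of the checks are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def correct_flag(arg: str, known_flags: List[str], robot_mode: bool) -> Optional[str]:
    """Return the corrected long flag, or None if no safe correction exists."""
    # Genuine short flags such as -p are left for the normal parser.
    if len(arg) == 2 and arg[0] == "-" and arg[1] != "-":
        return arg
    # Single-dash and case corrections: compare lowercase names without dashes.
    name = arg.lstrip("-").lower()
    names = {flag.lstrip("-"): flag for flag in known_flags}
    if name in names:
        return names[name]
    # Prefix expansion, only when exactly one flag matches.
    prefixed = [flag for known, flag in names.items() if known.startswith(name)]
    if len(prefixed) == 1:
        return prefixed[0]
    # Fuzzy match with a stricter threshold in robot mode.
    threshold = 0.9 if robot_mode else 0.8
    best = max(names, key=lambda known: SequenceMatcher(None, name, known).ratio())
    if SequenceMatcher(None, name, best).ratio() >= threshold:
        return names[best]
    return None
```

With this stand-in metric, `--projct` is corrected in both modes, while a more heavily mangled `--prjct` is corrected only in human mode, which shows why robot mode uses the stricter threshold.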
+
+In robot mode, corrections emit structured JSON to stderr:
+
+```json
+{"warning":{"type":"ARG_CORRECTED","corrections":[...],"teaching":["Use double-dash for long flags: --robot (not -robot)"]}}
+```
+
+When a command or flag is still unrecognized after corrections, the error response includes a fuzzy suggestion and, for enum-like flags, lists valid values:
+
+```json
+{"error":{"code":"UNKNOWN_COMMAND","message":"...","suggestion":"Did you mean 'lore issues'? Example: lore --robot issues -n 10. Run 'lore robot-docs' for all commands"}}
+```
+
+### Command Aliases
+
+Commands accept aliases for common variations:
+
+| Primary | Aliases |
+|---------|---------|
+| `issues` | `issue` |
+| `mrs` | `mr`, `merge-requests`, `merge-request` |
+| `notes` | `note` |
+| `search` | `find`, `query` |
+| `stats` | `stat` |
+| `status` | `st` |
+
+Unambiguous prefixes also work via subcommand inference (e.g., `lore iss` -> `lore issues`, `lore time` -> `lore timeline`).
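
Alias lookup and prefix inference can be sketched together in a few lines of Python. The alias table is copied from above; the command list is only the subset visible in this diff, and checking aliases before prefixes is an assumption about ordering. `resolve_command` is a hypothetical name.

```python
from typing import Optional

# Aliases from the table above.
ALIASES = {
    "issue": "issues",
    "mr": "mrs", "merge-requests": "mrs", "merge-request": "mrs",
    "note": "notes",
    "find": "search", "query": "search",
    "stat": "stats",
    "st": "status",
}

# Subset of commands visible in this diff; the real CLI has more.
COMMANDS = ["issues", "mrs", "notes", "search", "stats", "status",
            "timeline", "who", "sync", "drift", "robot-docs", "version"]


def resolve_command(word: str) -> Optional[str]:
    """Resolve exact names, then aliases, then unambiguous prefixes."""
    if word in COMMANDS:
        return word
    if word in ALIASES:
        return ALIASES[word]
    matches = [command for command in COMMANDS if command.startswith(word)]
    return matches[0] if len(matches) == 1 else None
```

Note that `stat` and `st` are both ambiguous as prefixes (`stats` and `status`), so in this sketch they resolve only because the alias table is consulted first.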
+
 ### Agent Self-Discovery

 The `robot-docs` command provides a complete machine-readable manifest including response schemas for every command: