6 Commits

Author SHA1 Message Date
teernisse
159c490ad7 docs: update README with notes, drift, error tolerance, scoring config, and expanded command reference
Major additions:
- lore notes command: full documentation of rich note querying with
  filters (author, type, path, resolution, time range, body substring),
  sort/format options, field selection, and browser opening
- lore drift command: discussion divergence detection documentation
- Error Tolerance section: table of all 8 auto-correction types with
  examples and mode behavior, stderr JSON warning format, fuzzy
  suggestion format for unrecognized commands
- Command Aliases table: primary commands and their accepted aliases
- scoring config section: all weight/half-life/decay parameters for
  the who-expert scoring engine (authorWeight, reviewerWeight, noteBonus,
  half-life periods, closedMrMultiplier, excludedUsernames)

Updates to existing sections:
- Timeline: entity-direct seeding syntax (issue:N, i:N, mr:N, m:N),
  hybrid search pipeline description replacing pure FTS5, discussion
  thread collection, --fields flag, numbered progress spinners
- Search: --after/--updated-after renamed to --since/--updated-since,
  progress spinner behavior, note type filter
- Who: --explain-score, --as-of, --include-bots, --all-history, --detail
- Sync: --no-file-changes flag
- Robot-docs: --brief flag
- Field selection: expanded to note which commands support --fields
2026-02-13 17:27:59 -05:00
teernisse
e0041ed4d9 feat(cli): improve error recovery with alias-aware suggestions and error tolerance manifest
Two related improvements to agent ergonomics in main.rs:

1. suggest_similar_command now matches against aliases (issue->issues,
   mr->mrs, find->search, stat->stats, note->notes, etc.) and provides
   contextual usage examples via a new command_example() helper, so
   agents get actionable recovery hints like "Did you mean 'lore mrs'?
   Example: lore --robot mrs -n 10" instead of just the command name.

2. robot-docs now includes an error_tolerance section documenting every
   auto-correction the CLI performs: types (single_dash_long_flag,
   case_normalization, flag_prefix, fuzzy_flag, subcommand_alias,
   value_normalization, value_fuzzy, prefix_matching), examples, and
   mode behavior (threshold differences). Also expands the aliases
   section with command_aliases and pre_clap_aliases maps for complete
   agent self-discovery.

Together these ensure agents can programmatically discover and recover
from any CLI input error without human intervention.
2026-02-13 17:27:49 -05:00
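The alias-aware suggestion logic above can be sketched as follows (a hypothetical, simplified reconstruction: the command table is abbreviated, the similarity metric and threshold are illustrative, and the real `suggest_similar_command` in main.rs also attaches a `command_example()` usage hint):

```rust
// Classic dynamic-programming edit distance, used to score candidate commands.
fn levenshtein(a: &str, b: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (a.chars().collect(), b.chars().collect());
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Compare the unknown input against canonical commands *and* their aliases,
/// but always report the canonical name so the hint is actionable.
fn suggest_similar_command(input: &str) -> Option<&'static str> {
    const COMMANDS: &[(&str, &[&str])] = &[
        ("issues", &["issue"]),
        ("mrs", &["mr", "merge-requests"]),
        ("search", &["find", "query"]),
        ("notes", &["note"]),
        ("stats", &["stat"]),
    ];
    let mut best: Option<(&'static str, f64)> = None;
    for (canonical, aliases) in COMMANDS {
        for cand in std::iter::once(canonical).chain(aliases.iter()) {
            let len = cand.len().max(input.len()) as f64;
            let sim = 1.0 - levenshtein(input, cand) as f64 / len;
            if best.map_or(true, |(_, s)| sim > s) {
                best = Some((*canonical, sim));
            }
        }
    }
    // Illustrative cutoff: below this, say nothing rather than mislead.
    best.filter(|&(_, s)| s >= 0.5).map(|(c, _)| c)
}
```

Matching `find` hits the alias exactly but still yields `search`, which is what makes the recovery hint actionable.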
teernisse
a34751bd47 feat(autocorrect): expand pre-clap correction to 3-phase pipeline with subcommand aliases, value normalization, and flag prefix matching
Three-phase pipeline replacing the single-pass correction:

- Phase A: Subcommand alias correction — handles forms clap can't
  express (merge_requests, mergerequests, robotdocs, generatedocs,
  gen-docs, etc.) via case-insensitive alias map lookup.
- Phase B: Per-arg flag corrections — adds unambiguous prefix expansion
  (--proj -> --project) alongside existing single-dash, case, and fuzzy
  rules. New FlagPrefix rule with 0.95 confidence.
- Phase C: Enum value normalization — auto-corrects casing, prefixes,
  and typos for flags with known valid values. Handles both --flag value
  and --flag=value forms. Respects POSIX -- option terminator.

Changes strict/robot mode from disabling fuzzy matching entirely to using
a higher threshold (0.9 vs 0.8), still catching obvious typos like
--projct while avoiding speculative corrections that mislead agents.

New CorrectionRule variants: SubcommandAlias, ValueNormalization,
ValueFuzzy, FlagPrefix. Each has a corresponding teaching note.
Comprehensive test coverage for all new correction types including
subcommand aliases, value normalization (case, prefix, fuzzy, eq-form),
flag prefix (ambiguous rejection, eq-value preservation), and updated
strict mode behavior.
2026-02-13 17:27:39 -05:00
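Phase B's unambiguous prefix expansion can be sketched like this (illustrative only; the real rule also records a `FlagPrefix` correction with 0.95 confidence):

```rust
/// Expand `--proj` to `--project` only when exactly one known flag starts
/// with the given prefix; ambiguous prefixes are left untouched.
fn expand_flag_prefix(arg: &str, valid: &[&str]) -> Option<String> {
    // Not a long flag, or already an exact match: nothing to do.
    if !arg.starts_with("--") || valid.contains(&arg) {
        return None;
    }
    let hits: Vec<&str> = valid.iter().copied().filter(|f| f.starts_with(arg)).collect();
    match hits.as_slice() {
        // Exactly one known flag starts with this prefix: safe to expand.
        [only] => Some(only.to_string()),
        // Zero or multiple matches: leave the arg for fuzzy matching / error paths.
        _ => None,
    }
}
```

With `--project` and `--profile` both present, `--prof` would expand but `--pro` would not, which is the "ambiguous rejection" behavior the tests cover.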
teernisse
0aecbf33c0 feat(xref): extract cross-references from descriptions, user notes, and fix system note regex
- Fix MENTIONED_RE/CLOSED_BY_RE to match real GitLab format
  ('mentioned in issue #N' / 'mentioned in merge request !N')
- Add GITLAB_URL_RE + parse_url_refs() for full URL extraction
- Add extract_refs_from_descriptions() -> source_method='description_parse'
- Add extract_refs_from_user_notes() -> source_method='note_parse'
- Wire both into orchestrator after system note extraction
- 36 tests: regex fix, URL parsing, integration, idempotency
2026-02-13 17:19:36 -05:00
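The fixed formats can be illustrated with a dependency-free sketch (the real extractor uses compiled regexes such as `MENTIONED_RE`; this simplified parser only shows the note-body shapes they must accept):

```rust
/// Extract the referenced IID from a GitLab system note body like
/// "mentioned in issue #42" or "mentioned in merge request !99".
fn parse_mentioned(body: &str) -> Option<(&'static str, u64)> {
    if let Some(rest) = body.strip_prefix("mentioned in issue #") {
        return parse_iid(rest).map(|n| ("issue", n));
    }
    if let Some(rest) = body.strip_prefix("mentioned in merge request !") {
        return parse_iid(rest).map(|n| ("merge_request", n));
    }
    None
}

/// Take leading digits so trailing text ("!99 (group/repo!99)") is tolerated.
fn parse_iid(s: &str) -> Option<u64> {
    let digits: String = s.chars().take_while(|c| c.is_ascii_digit()).collect();
    digits.parse().ok()
}
```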
teernisse
c10471ddb9 feat(timeline): add entity-direct seeding (issue:N, mr:N syntax)
Adds issue:N / i:N / mr:N / m:N query syntax to bypass hybrid search
and seed the timeline directly from a known entity. All discussions for
the entity are gathered without needing Ollama.

- parse_timeline_query() detects entity-direct patterns
- resolve_entity_by_iid() resolves IID to EntityRef with ambiguity handling
- seed_timeline_direct() gathers all discussions for the entity
- 20 new tests (5 resolve, 6 direct seed, 9 parse)
- Updated CLI help text and robot-docs manifest
2026-02-13 15:22:45 -05:00
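A hypothetical sketch of the entity-direct detection (simplified; the real `parse_timeline_query` returns a richer type and feeds `resolve_entity_by_iid` for ambiguity handling):

```rust
#[derive(Debug, PartialEq)]
enum TimelineQuery {
    /// Direct seeding from a known entity, bypassing hybrid search.
    Entity { kind: &'static str, iid: u64 },
    /// Fall back to search-based seeding.
    Search(String),
}

fn parse_timeline_query(q: &str) -> TimelineQuery {
    // Longer prefixes first so "mr:" wins before the "m:" shorthand is tried.
    for (prefix, kind) in [("issue:", "issue"), ("i:", "issue"), ("mr:", "mr"), ("m:", "mr")] {
        if let Some(rest) = q.strip_prefix(prefix) {
            if let Ok(iid) = rest.parse::<u64>() {
                return TimelineQuery::Entity { kind, iid };
            }
        }
    }
    TimelineQuery::Search(q.to_string())
}
```

Note that a plain word like "migration" never matches an entity prefix, so ordinary search queries are unaffected.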
teernisse
cbce4c9f59 release: v0.8.2 2026-02-13 15:01:28 -05:00
13 changed files with 1937 additions and 109 deletions

Cargo.lock generated

@@ -1106,7 +1106,7 @@ checksum = "5e5032e24019045c762d3c0f28f5b6b8bbf38563a65908389bf7978758920897"
 [[package]]
 name = "lore"
-version = "0.8.1"
+version = "0.8.2"
 dependencies = [
  "async-stream",
  "chrono",


@@ -1,6 +1,6 @@
 [package]
 name = "lore"
-version = "0.8.1"
+version = "0.8.2"
 edition = "2024"
 description = "Gitlore - Local GitLab data management with semantic search"
 authors = ["Taylor Eernisse"]

README.md

@@ -19,7 +19,10 @@ Local GitLab data management with semantic search, people intelligence, and temp
- **Cross-reference tracking**: Automatic extraction of "closes", "mentioned" relationships between MRs and issues
- **Work item status enrichment**: Fetches issue statuses (e.g., "To do", "In progress", "Done") from GitLab's GraphQL API with adaptive page sizing, color-coded display, and case-insensitive filtering
- **Resource event history**: Tracks state changes, label events, and milestone events for issues and MRs
- **Note querying**: Rich filtering over discussion notes by author, type, path, resolution status, time range, and body content
- **Discussion drift detection**: Semantic analysis of how discussions diverge from original issue intent
- **Robot mode**: Machine-readable JSON output with structured errors, meaningful exit codes, and actionable recovery steps
- **Error tolerance**: Auto-corrects common CLI mistakes (case, typos, single-dash flags, value casing) with teaching feedback
- **Observability**: Verbosity controls, JSON log format, structured metrics, and stage timing
## Installation
@@ -71,6 +74,12 @@ lore who @asmith
# Timeline of events related to deployments
lore timeline "deployment"
# Timeline for a specific issue
lore timeline issue:42
# Query notes by author
lore notes --author alice --since 7d
# Robot mode (machine-readable JSON)
lore -J issues -n 5 | jq .
```
@@ -109,6 +118,15 @@ Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lo
"model": "nomic-embed-text",
"baseUrl": "http://localhost:11434",
"concurrency": 4
},
"scoring": {
"authorWeight": 25,
"reviewerWeight": 10,
"noteBonus": 1,
"authorHalfLifeDays": 180,
"reviewerHalfLifeDays": 90,
"noteHalfLifeDays": 45,
"excludedUsernames": ["bot-user"]
}
}
```
@@ -135,6 +153,15 @@ Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lo
| `embedding` | `model` | `nomic-embed-text` | Model name for embeddings |
| `embedding` | `baseUrl` | `http://localhost:11434` | Ollama server URL |
| `embedding` | `concurrency` | `4` | Concurrent embedding requests |
| `scoring` | `authorWeight` | `25` | Points per MR where the user authored code touching the path |
| `scoring` | `reviewerWeight` | `10` | Points per MR where the user reviewed code touching the path |
| `scoring` | `noteBonus` | `1` | Bonus per inline review comment (DiffNote) |
| `scoring` | `reviewerAssignmentWeight` | `3` | Points per MR where the user was assigned as reviewer |
| `scoring` | `authorHalfLifeDays` | `180` | Half-life in days for author contribution decay |
| `scoring` | `reviewerHalfLifeDays` | `90` | Half-life in days for reviewer contribution decay |
| `scoring` | `noteHalfLifeDays` | `45` | Half-life in days for note/comment decay |
| `scoring` | `closedMrMultiplier` | `0.5` | Score multiplier for closed (not merged) MRs |
| `scoring` | `excludedUsernames` | `[]` | Usernames excluded from expert results (e.g., bots) |
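The half-life parameters above imply exponential decay. As a sketch (the exact combination of terms in the scoring engine is an assumption here), a single contribution's effective weight would be:

```rust
/// Effective score of one contribution under exponential half-life decay:
/// after `half_life_days`, it is worth half its base weight.
fn decayed_weight(base_weight: f64, age_days: f64, half_life_days: f64) -> f64 {
    base_weight * 0.5_f64.powf(age_days / half_life_days)
}
```

With `authorWeight: 25` and `authorHalfLifeDays: 180`, an MR authored 180 days ago would contribute 12.5 points before other multipliers such as `closedMrMultiplier`.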
### Config File Resolution
@@ -262,18 +289,21 @@ lore search "login flow" --mode semantic # Vector similarity only
lore search "auth" --type issue # Filter by source type
lore search "auth" --type mr # MR documents only
lore search "auth" --type discussion # Discussion documents only
lore search "auth" --type note # Individual notes only
lore search "deploy" --author username # Filter by author
lore search "deploy" -p group/repo # Filter by project
lore search "deploy" --label backend # Filter by label (AND logic)
lore search "deploy" --path src/ # Filter by file path (trailing / for prefix)
-lore search "deploy" --after 7d # Created after (7d, 2w, 1m, or YYYY-MM-DD)
+lore search "deploy" --since 7d # Created since (7d, 2w, 1m, or YYYY-MM-DD)
-lore search "deploy" --updated-after 2w # Updated after
+lore search "deploy" --updated-since 2w # Updated since
lore search "deploy" -n 50 # Limit results (default 20, max 100)
lore search "deploy" --explain # Show ranking explanation per result
lore search "deploy" --fts-mode raw # Raw FTS5 query syntax (advanced)
```
-The `--fts-mode` flag defaults to `safe`, which sanitizes user input into valid FTS5 queries with automatic fallback. Use `raw` for advanced FTS5 query syntax (AND, OR, NOT, phrase matching, prefix queries).
+The `--fts-mode` flag defaults to `safe`, which sanitizes user input into valid FTS5 queries with automatic fallback. FTS5 boolean operators (`AND`, `OR`, `NOT`, `NEAR`) are passed through in safe mode, so queries like `"switch AND health"` work without switching to raw mode. Use `raw` for advanced FTS5 query syntax (phrase matching, column filters, prefix queries).
A progress spinner displays during search, showing the active mode (e.g., `Searching (hybrid)...`). In robot mode, spinners are suppressed for clean JSON output.
Requires `lore generate-docs` (or `lore sync`) to have been run at least once. Semantic and hybrid modes require `lore embed` (or `lore sync`) to have generated vector embeddings via Ollama.
@@ -283,7 +313,7 @@ People intelligence: discover experts, analyze workloads, review patterns, activ
#### Expert Mode
-Find who has expertise in a code area based on authoring and reviewing history (DiffNote analysis).
+Find who has expertise in a code area based on authoring and reviewing history (DiffNote analysis). Scores use exponential half-life decay so recent contributions count more than older ones. Scoring weights and half-life periods are configurable via the `scoring` config section.
```bash
lore who src/features/auth/ # Who knows about this directory?
@@ -292,6 +322,9 @@ lore who --path README.md # Root files need --path flag
lore who --path Makefile # Dotless root files too
lore who src/ --since 3m # Limit to recent 3 months
lore who src/ -p group/repo # Scope to project
lore who src/ --explain-score # Show per-component score breakdown
lore who src/ --as-of 30d # Score as if "now" was 30 days ago
lore who src/ --include-bots # Include bot users in results
```
The target is auto-detected as a path when it contains `/`. For root files without `/` (e.g., `README.md`), use the `--path` flag. Default time window: 6 months.
@@ -348,13 +381,22 @@ Shows: users with touch counts (author vs. review), linked MR references. Defaul
| `-p` / `--project` | Scope to a project (fuzzy match) |
| `--since` | Time window (7d, 2w, 6m, YYYY-MM-DD). Default varies by mode. |
| `-n` / `--limit` | Max results per section (1-500, default 20) |
| `--all-history` | Remove the default time window, query all history |
| `--detail` | Show per-MR detail breakdown (expert mode only) |
| `--explain-score` | Show per-component score breakdown (expert mode only) |
| `--as-of` | Score as if "now" is a past date (ISO 8601 or duration like 30d, expert mode only) |
| `--include-bots` | Include bot users normally excluded via `scoring.excludedUsernames` |
### `lore timeline`
Reconstruct a chronological timeline of events matching a keyword query. The pipeline discovers related entities through cross-reference graph traversal and assembles a unified, time-ordered event stream.
```bash
-lore timeline "deployment" # Events related to deployments
+lore timeline "deployment" # Search-based seeding (hybrid search)
lore timeline issue:42 # Direct entity seeding by issue IID
lore timeline i:42 # Shorthand for issue:42
lore timeline mr:99 # Direct entity seeding by MR IID
lore timeline m:99 # Shorthand for mr:99
lore timeline "auth" -p group/repo # Scoped to a project
lore timeline "auth" --since 30d # Only recent events
lore timeline "migration" --depth 2 # Deeper cross-reference expansion
@@ -363,6 +405,8 @@ lore timeline "deploy" -n 50 # Limit event count
lore timeline "auth" --max-seeds 5 # Fewer seed entities
```
The query can be either a search string (hybrid search finds matching entities) or an entity reference (`issue:N`, `i:N`, `mr:N`, `m:N`) which directly seeds the timeline from a specific entity and its cross-references.
#### Flags
| Flag | Default | Description |
@@ -375,13 +419,16 @@ lore timeline "auth" --max-seeds 5 # Fewer seed entities
| `--max-seeds` | `10` | Maximum seed entities from search |
| `--max-entities` | `50` | Maximum entities discovered via cross-references |
| `--max-evidence` | `10` | Maximum evidence notes included |
| `--fields` | all | Select output fields (comma-separated, or 'minimal' preset) |
#### Pipeline Stages
Each stage displays a numbered progress spinner (e.g., `[1/3] Seeding timeline...`). In robot mode, spinners are suppressed for clean JSON output.
-1. **SEED** -- Full-text search identifies the most relevant issues and MRs matching the query. Documents are ranked by BM25 relevance.
+1. **SEED** -- Hybrid search (FTS5 lexical + Ollama vector similarity via Reciprocal Rank Fusion) identifies the most relevant issues and MRs. Falls back to lexical-only if Ollama is unavailable. Discussion notes matching the query are also discovered and attached to their parent entities.
-2. **HYDRATE** -- Evidence notes are extracted: the top FTS-matched discussion notes with 200-character snippets explaining *why* each entity was surfaced.
+2. **HYDRATE** -- Evidence notes are extracted: the top search-matched discussion notes with 200-character snippets explaining *why* each entity was surfaced. Matched discussions are collected as full thread candidates.
3. **EXPAND** -- Breadth-first traversal over the `entity_references` graph discovers related entities via "closes", "related", and optionally "mentioned" references up to the configured depth.
-4. **COLLECT** -- Events are gathered for all discovered entities. Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, and evidence notes. Events are sorted chronologically with stable tiebreaking.
+4. **COLLECT** -- Events are gathered for all discovered entities. Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, evidence notes, and full discussion threads. Events are sorted chronologically with stable tiebreaking.
5. **RENDER** -- Events are formatted as human-readable text or structured JSON (robot mode).
#### Event Types
@@ -395,13 +442,70 @@ lore timeline "auth" --max-seeds 5 # Fewer seed entities
| `MilestoneSet` | Milestone assigned |
| `MilestoneRemoved` | Milestone removed |
| `Merged` | MR merged (deduplicated against state events) |
-| `NoteEvidence` | Discussion note matched by FTS, with snippet |
+| `NoteEvidence` | Discussion note matched by search, with snippet |
| `DiscussionThread` | Full discussion thread with all non-system notes |
| `CrossReferenced` | Reference to another entity |
#### Unresolved References
When graph expansion encounters cross-project references to entities not yet synced locally, these are collected as unresolved references in the output. This enables discovery of external dependencies and can inform future sync targets.
### `lore notes`
Query individual notes from discussions with rich filtering options.
```bash
lore notes # List 50 most recent notes
lore notes --author alice --since 7d # Notes by alice in last 7 days
lore notes --for-issue 42 -p group/repo # Notes on issue #42
lore notes --for-mr 99 -p group/repo # Notes on MR !99
lore notes --path src/ --resolution unresolved # Unresolved diff notes in src/
lore notes --note-type DiffNote # Only inline code review comments
lore notes --contains "TODO" # Substring search in note body
lore notes --include-system # Include system-generated notes
lore notes --since 2w --until 2024-12-31 # Time-bounded range
lore notes --sort updated --asc # Sort by update time, ascending
lore notes --format csv # CSV output
lore notes --format jsonl # Line-delimited JSON
lore notes -o # Open first result in browser
# Field selection (robot mode)
lore -J notes --fields minimal # Compact: id, author_username, body, created_at_iso
```
#### Filters
| Flag | Description |
|------|-------------|
| `-a` / `--author` | Filter by note author username |
| `--note-type` | Filter by note type (DiffNote, DiscussionNote) |
| `--contains` | Substring search in note body |
| `--note-id` | Filter by internal note ID |
| `--gitlab-note-id` | Filter by GitLab note ID |
| `--discussion-id` | Filter by discussion ID |
| `--include-system` | Include system notes (excluded by default) |
| `--for-issue` | Notes on a specific issue IID (requires `-p`) |
| `--for-mr` | Notes on a specific MR IID (requires `-p`) |
| `-p` / `--project` | Scope to a project (fuzzy match) |
| `--since` | Notes created since (7d, 2w, 1m, or YYYY-MM-DD) |
| `--until` | Notes created until (YYYY-MM-DD, inclusive end-of-day) |
| `--path` | Filter by file path (DiffNotes only; trailing `/` for prefix match) |
| `--resolution` | Filter by resolution status (`any`, `unresolved`, `resolved`) |
| `--sort` | Sort by `created` (default) or `updated` |
| `--asc` | Sort ascending (default: descending) |
| `--format` | Output format: `table` (default), `json`, `jsonl`, `csv` |
| `-o` / `--open` | Open first result in browser |
### `lore drift`
Detect discussion divergence from the original intent of an issue by comparing the semantic similarity of discussion content against the issue description.
```bash
lore drift issues 42 # Check divergence on issue #42
lore drift issues 42 --threshold 0.6 # Higher threshold (stricter)
lore drift issues 42 -p group/repo # Scope to project
```
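A minimal sketch of the similarity core such a comparison could rest on (assuming cosine similarity over embedding vectors; the actual drift metric and its aggregation over notes are not shown in this log):

```rust
/// Cosine similarity between two embedding vectors; drift could be flagged
/// when similarity to the issue description falls below --threshold.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f64]| v.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm(a) * norm(b))
}
```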
### `lore sync`
Run the full sync pipeline: ingest from GitLab (including work item status enrichment via GraphQL), generate searchable documents, and compute embeddings.
@@ -413,6 +517,7 @@ lore sync --force # Override stale lock
lore sync --no-embed # Skip embedding step
lore sync --no-docs # Skip document regeneration
lore sync --no-events # Skip resource event fetching
lore sync --no-file-changes # Skip MR file change fetching
lore sync --dry-run # Preview what would be synced
```
@@ -571,6 +676,7 @@ Machine-readable command manifest for agent self-discovery. Returns a JSON schem
```bash
lore robot-docs # Pretty-printed JSON
lore --robot robot-docs # Compact JSON for parsing
lore robot-docs --brief # Omit response_schema (~60% smaller)
```
### `lore version`
@@ -622,7 +728,7 @@ The `actions` array contains executable shell commands an agent can run to recov
### Field Selection
-The `--fields` flag on `issues` and `mrs` list commands controls which fields appear in the JSON response, reducing token usage for AI agent workflows:
+The `--fields` flag controls which fields appear in the JSON response, reducing token usage for AI agent workflows. Supported on `issues`, `mrs`, `notes`, `search`, `timeline`, and `who` list commands:
```bash
# Minimal preset (~60% fewer tokens)
@@ -639,6 +745,48 @@ Valid fields for issues: `iid`, `title`, `state`, `author_username`, `labels`, `
Valid fields for MRs: `iid`, `title`, `state`, `author_username`, `labels`, `draft`, `target_branch`, `source_branch`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `reviewers`
### Error Tolerance
The CLI auto-corrects common mistakes before parsing, emitting a teaching note to stderr. Corrections work in both human and robot modes:
| Correction | Example | Mode |
|-----------|---------|------|
| Single-dash long flag | `-robot` -> `--robot` | All |
| Case normalization | `--Robot` -> `--robot` | All |
| Flag prefix expansion | `--proj` -> `--project` (unambiguous only) | All |
| Fuzzy flag match | `--projct` -> `--project` | All (threshold 0.9 in robot, 0.8 in human) |
| Subcommand alias | `merge_requests` -> `mrs`, `robotdocs` -> `robot-docs` | All |
| Value normalization | `--state Opened` -> `--state opened` | All |
| Value fuzzy match | `--state opend` -> `--state opened` | All |
| Subcommand prefix | `lore iss` -> `lore issues` (unambiguous only, via clap) | All |
In robot mode, corrections emit structured JSON to stderr:
```json
{"warning":{"type":"ARG_CORRECTED","corrections":[...],"teaching":["Use double-dash for long flags: --robot (not -robot)"]}}
```
When a command or flag is still unrecognized after corrections, the error response includes a fuzzy suggestion and, for enum-like flags, lists valid values:
```json
{"error":{"code":"UNKNOWN_COMMAND","message":"...","suggestion":"Did you mean 'lore issues'? Example: lore --robot issues -n 10. Run 'lore robot-docs' for all commands"}}
```
### Command Aliases
Commands accept aliases for common variations:
| Primary | Aliases |
|---------|---------|
| `issues` | `issue` |
| `mrs` | `mr`, `merge-requests`, `merge-request` |
| `notes` | `note` |
| `search` | `find`, `query` |
| `stats` | `stat` |
| `status` | `st` |
Unambiguous prefixes also work via subcommand inference (e.g., `lore iss` -> `lore issues`, `lore time` -> `lore timeline`).
### Agent Self-Discovery
The `robot-docs` command provides a complete machine-readable manifest including response schemas for every command:


@@ -21,6 +21,10 @@ pub enum CorrectionRule {
SingleDashLongFlag,
CaseNormalization,
FuzzyFlag,
SubcommandAlias,
ValueNormalization,
ValueFuzzy,
FlagPrefix,
}
/// Result of the correction pass over raw args.
@@ -261,18 +265,45 @@ pub const ENUM_VALUES: &[(&str, &[&str])] = &[
("--state", &["opened", "closed", "merged", "locked", "all"]),
("--mode", &["lexical", "hybrid", "semantic"]),
("--sort", &["updated", "created", "iid"]),
-("--type", &["issue", "mr", "discussion"]),
+("--type", &["issue", "mr", "discussion", "note"]),
("--fts-mode", &["safe", "raw"]),
("--color", &["auto", "always", "never"]),
("--log-format", &["text", "json"]),
("--for", &["issue", "mr"]),
];
// ---------------------------------------------------------------------------
// Subcommand alias map (for forms clap aliases can't express)
// ---------------------------------------------------------------------------
/// Subcommand aliases for non-standard forms (underscores, no separators).
/// Clap `visible_alias`/`alias` handles hyphenated forms (`merge-requests`);
/// this map catches the rest.
const SUBCOMMAND_ALIASES: &[(&str, &str)] = &[
("merge_requests", "mrs"),
("merge_request", "mrs"),
("mergerequests", "mrs"),
("mergerequest", "mrs"),
("generate_docs", "generate-docs"),
("generatedocs", "generate-docs"),
("gendocs", "generate-docs"),
("gen-docs", "generate-docs"),
("robot_docs", "robot-docs"),
("robotdocs", "robot-docs"),
("sync_status", "status"),
("syncstatus", "status"),
("auth_test", "auth"),
("authtest", "auth"),
];
// ---------------------------------------------------------------------------
// Correction thresholds
// ---------------------------------------------------------------------------
const FUZZY_FLAG_THRESHOLD: f64 = 0.8;
/// Stricter threshold for robot mode — only high-confidence corrections to
/// avoid misleading agents. Still catches obvious typos like `--projct`.
const FUZZY_FLAG_THRESHOLD_STRICT: f64 = 0.9;
// ---------------------------------------------------------------------------
// Core logic
@@ -332,20 +363,29 @@ fn valid_flags_for(subcommand: Option<&str>) -> Vec<&'static str> {
/// Run the pre-clap correction pass on raw args.
///
-/// When `strict` is true (robot mode), only deterministic corrections are applied
-/// (single-dash long flags, case normalization). Fuzzy matching is disabled to
-/// prevent misleading agents with speculative corrections.
+/// Three-phase pipeline:
+/// - Phase A: Subcommand alias correction (case-insensitive alias map)
+/// - Phase B: Per-arg flag corrections (single-dash, case, prefix, fuzzy)
+/// - Phase C: Enum value normalization (case + fuzzy + prefix on known values)
+///
+/// When `strict` is true (robot mode), fuzzy matching uses a higher threshold
+/// (0.9 vs 0.8) to avoid speculative corrections while still catching obvious
+/// typos like `--projct` → `--project`.
///
/// Returns the (possibly modified) args and any corrections applied.
pub fn correct_args(raw: Vec<String>, strict: bool) -> CorrectionResult {
-let subcommand = detect_subcommand(&raw);
-let valid = valid_flags_for(subcommand);
-let mut corrected = Vec::with_capacity(raw.len());
let mut corrections = Vec::new();
+// Phase A: Subcommand alias correction
+let args = correct_subcommand(raw, &mut corrections);
+// Phase B: Per-arg flag corrections
+let valid = valid_flags_for(detect_subcommand(&args));
+let mut corrected = Vec::with_capacity(args.len());
let mut past_terminator = false;
for arg in args {
// B1: Stop correcting after POSIX `--` option terminator
if arg == "--" {
past_terminator = true;
@@ -367,12 +407,177 @@ pub fn correct_args(raw: Vec<String>, strict: bool) -> CorrectionResult {
}
}
// Phase C: Enum value normalization
normalize_enum_values(&mut corrected, &mut corrections);
CorrectionResult {
args: corrected,
corrections,
}
}
/// Phase A: Replace subcommand aliases with their canonical names.
///
/// Handles forms that can't be expressed as clap `alias`/`visible_alias`
/// (underscores, no-separator forms). Case-insensitive matching.
fn correct_subcommand(mut args: Vec<String>, corrections: &mut Vec<Correction>) -> Vec<String> {
// Find the subcommand position index, then check the alias map.
// Can't use iterators easily because we need to mutate args[i].
let mut skip_next = false;
let mut subcmd_idx = None;
for (i, arg) in args.iter().enumerate().skip(1) {
if skip_next {
skip_next = false;
continue;
}
if arg.starts_with('-') {
if arg.contains('=') {
continue;
}
if matches!(arg.as_str(), "--config" | "-c" | "--color" | "--log-format") {
skip_next = true;
}
continue;
}
subcmd_idx = Some(i);
break;
}
if let Some(i) = subcmd_idx
&& let Some((_, canonical)) = SUBCOMMAND_ALIASES
.iter()
.find(|(alias, _)| alias.eq_ignore_ascii_case(&args[i]))
{
corrections.push(Correction {
original: args[i].clone(),
corrected: (*canonical).to_string(),
rule: CorrectionRule::SubcommandAlias,
confidence: 1.0,
});
args[i] = (*canonical).to_string();
}
args
}
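Phase A's scan has to skip the values of global flags before it can identify the subcommand. A minimal standalone sketch of that scan, using a toy `ALIASES` map in place of the real `SUBCOMMAND_ALIASES` table (names here are hypothetical, not the crate's API):

```rust
// Sketch: find the first bare (non-flag) arg, which is the subcommand,
// while skipping the separate-value arguments of value-taking global
// flags. ALIASES is a toy stand-in for SUBCOMMAND_ALIASES.
const ALIASES: &[(&str, &str)] = &[("merge_requests", "mrs"), ("find", "search")];

fn canonical_subcommand(args: &[&str]) -> Option<(usize, &'static str)> {
    let mut skip_next = false;
    for (i, arg) in args.iter().enumerate().skip(1) {
        if skip_next {
            skip_next = false;
            continue;
        }
        if arg.starts_with('-') {
            // `--config=path` carries its value inline; nothing extra to skip
            if arg.contains('=') {
                continue;
            }
            // These global flags consume the *following* arg as their value
            if matches!(*arg, "--config" | "-c" | "--color" | "--log-format") {
                skip_next = true;
            }
            continue;
        }
        // First bare arg is the subcommand; map alias -> canonical (None if
        // it is not an alias, i.e. no correction needed)
        return ALIASES
            .iter()
            .find(|(a, _)| a.eq_ignore_ascii_case(arg))
            .map(|(_, c)| (i, *c));
    }
    None
}
```

The real `correct_subcommand` does the same walk but mutates `args[i]` in place and records a `Correction` with confidence 1.0.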
/// Phase C: Normalize enum values for flags with known valid values.
///
/// Handles both `--flag value` and `--flag=value` forms. Corrections are:
/// 1. Case normalization: `Opened` → `opened`
/// 2. Prefix expansion: `open` → `opened` (only if unambiguous)
/// 3. Fuzzy matching: `opend` → `opened`
fn normalize_enum_values(args: &mut [String], corrections: &mut Vec<Correction>) {
let mut i = 0;
while i < args.len() {
// Respect POSIX `--` option terminator — don't normalize values after it
if args[i] == "--" {
break;
}
// Handle --flag=value form
if let Some(eq_pos) = args[i].find('=') {
let flag = args[i][..eq_pos].to_string();
let value = args[i][eq_pos + 1..].to_string();
if let Some(valid_vals) = lookup_enum_values(&flag)
&& let Some((corrected_val, is_case_only)) = normalize_value(&value, valid_vals)
{
let original = args[i].clone();
let corrected = format!("{flag}={corrected_val}");
args[i] = corrected.clone();
corrections.push(Correction {
original,
corrected,
rule: if is_case_only {
CorrectionRule::ValueNormalization
} else {
CorrectionRule::ValueFuzzy
},
confidence: 0.95,
});
}
i += 1;
continue;
}
// Handle --flag value form
if args[i].starts_with("--")
&& let Some(valid_vals) = lookup_enum_values(&args[i])
&& i + 1 < args.len()
&& !args[i + 1].starts_with('-')
{
let value = args[i + 1].clone();
if let Some((corrected_val, is_case_only)) = normalize_value(&value, valid_vals) {
let original = args[i + 1].clone();
args[i + 1] = corrected_val.to_string();
corrections.push(Correction {
original,
corrected: corrected_val.to_string(),
rule: if is_case_only {
CorrectionRule::ValueNormalization
} else {
CorrectionRule::ValueFuzzy
},
confidence: 0.95,
});
}
i += 2;
continue;
}
i += 1;
}
}
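The two argument shapes Phase C handles (`--flag=value` and `--flag value`, stopping at the POSIX `--` terminator) can be isolated into a small sketch. `flag_value_pairs` is a hypothetical helper for illustration, not part of the crate:

```rust
// Sketch of the arg-walking logic in normalize_enum_values: collect
// (flag, value) pairs from both supported shapes, never reading past
// the `--` option terminator.
fn flag_value_pairs(args: &[String]) -> Vec<(String, String)> {
    let mut pairs = Vec::new();
    let mut i = 0;
    while i < args.len() {
        if args[i] == "--" {
            break; // POSIX terminator: everything after is positional
        }
        // --flag=value form: value is inline
        if let Some((flag, value)) = args[i].split_once('=') {
            pairs.push((flag.to_string(), value.to_string()));
            i += 1;
            continue;
        }
        // --flag value form: value is the next non-flag arg
        if args[i].starts_with("--") && i + 1 < args.len() && !args[i + 1].starts_with('-') {
            pairs.push((args[i].clone(), args[i + 1].clone()));
            i += 2;
            continue;
        }
        i += 1;
    }
    pairs
}
```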
/// Look up valid enum values for a flag (case-insensitive flag name match).
fn lookup_enum_values(flag: &str) -> Option<&'static [&'static str]> {
let lower = flag.to_lowercase();
ENUM_VALUES
.iter()
.find(|(f, _)| f.to_lowercase() == lower)
.map(|(_, vals)| *vals)
}
/// Try to normalize a value against a set of valid values.
///
/// Returns `Some((corrected, is_case_only))` if a correction is needed:
/// - `is_case_only = true` for pure case normalization
/// - `is_case_only = false` for prefix/fuzzy corrections
///
/// Returns `None` if the value is already valid or no match is found.
fn normalize_value(input: &str, valid_values: &[&str]) -> Option<(String, bool)> {
// Already valid (exact match)? No correction needed.
if valid_values.contains(&input) {
return None;
}
let lower = input.to_lowercase();
// Case-insensitive exact match
if let Some(&val) = valid_values.iter().find(|v| v.to_lowercase() == lower) {
return Some((val.to_string(), true));
}
// Prefix match (e.g., "open" → "opened") — only if unambiguous
let prefix_matches: Vec<&&str> = valid_values
.iter()
.filter(|v| v.starts_with(&*lower))
.collect();
if prefix_matches.len() == 1 {
return Some(((*prefix_matches[0]).to_string(), false));
}
// Fuzzy match
let best = valid_values
.iter()
.map(|v| (*v, jaro_winkler(&lower, v)))
.max_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));
if let Some((val, score)) = best
&& score >= 0.8
{
return Some((val.to_string(), false));
}
None
}
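The case -> prefix -> fuzzy cascade in `normalize_value` can be sketched standalone. This is a simplified illustration: it substitutes a Levenshtein distance <= 2 check for the real Jaro-Winkler >= 0.8 step, since `jaro_winkler` comes from an external crate:

```rust
// Stand-in edit distance for the fuzzy step (the real code uses
// Jaro-Winkler with a 0.8 threshold).
fn levenshtein(a: &str, b: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (a.chars().collect(), b.chars().collect());
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

// Same Some((corrected, is_case_only)) / None contract as the real
// normalize_value: case fix, then unambiguous prefix, then fuzzy.
fn normalize_value_sketch(input: &str, valid: &[&str]) -> Option<(String, bool)> {
    if valid.contains(&input) {
        return None; // already valid, no correction
    }
    let lower = input.to_lowercase();
    // 1. Case-insensitive exact match
    if let Some(v) = valid.iter().find(|v| v.to_lowercase() == lower) {
        return Some(((*v).to_string(), true));
    }
    // 2. Prefix expansion, only if exactly one candidate matches
    let prefixed: Vec<&&str> = valid.iter().filter(|v| v.starts_with(lower.as_str())).collect();
    if prefixed.len() == 1 {
        return Some(((*prefixed[0]).to_string(), false));
    }
    // 3. Fuzzy fallback (stand-in metric)
    valid
        .iter()
        .find(|v| levenshtein(&lower, v) <= 2)
        .map(|v| ((*v).to_string(), false))
}
```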
/// Clap built-in flags that should never be corrected. These are handled by clap
/// directly and are not in our GLOBAL_FLAGS registry.
const CLAP_BUILTINS: &[&str] = &["--help", "--version"];
@@ -491,10 +696,34 @@ fn try_correct(arg: &str, valid_flags: &[&str], strict: bool) -> Option<Correcti
});
}
// Rule 3: Prefix match — `--proj` -> `--project` (only if unambiguous)
let prefix_matches: Vec<&str> = valid_flags
.iter()
.filter(|f| f.starts_with(&*lower) && f.to_lowercase() != lower)
.copied()
.collect();
if prefix_matches.len() == 1 {
let matched = prefix_matches[0];
let corrected = match value_suffix {
Some(suffix) => format!("{matched}{suffix}"),
None => matched.to_string(),
};
return Some(Correction {
original: arg.to_string(),
corrected,
rule: CorrectionRule::FlagPrefix,
confidence: 0.95,
});
}
// Rule 4: Fuzzy flag match — higher threshold in strict/robot mode
let threshold = if strict {
FUZZY_FLAG_THRESHOLD_STRICT
} else {
FUZZY_FLAG_THRESHOLD
};
if let Some((best_flag, score)) = best_fuzzy_match(&lower, valid_flags)
&& score >= threshold
{
let corrected = match value_suffix {
Some(suffix) => format!("{best_flag}{suffix}"),
@@ -568,6 +797,30 @@ pub fn format_teaching_note(correction: &Correction) -> String {
correction.corrected, correction.original
)
}
CorrectionRule::SubcommandAlias => {
format!(
"Use canonical command name: {} (not {})",
correction.corrected, correction.original
)
}
CorrectionRule::ValueNormalization => {
format!(
"Values are lowercase: {} (not {})",
correction.corrected, correction.original
)
}
CorrectionRule::ValueFuzzy => {
format!(
"Correct value spelling: {} (not {})",
correction.corrected, correction.original
)
}
CorrectionRule::FlagPrefix => {
format!(
"Use full flag name: {} (not {})",
correction.corrected, correction.original
)
}
}
}
@@ -751,17 +1004,20 @@ mod tests {
assert_eq!(result.args[1], "--help");
}
// ---- Strict mode (robot) uses higher fuzzy threshold ----
#[test]
fn strict_mode_rejects_low_confidence_fuzzy() {
// `--staate` vs `--state` — close but may be below strict threshold (0.9)
// The exact score depends on Jaro-Winkler; this tests that the strict
// threshold is higher than non-strict.
let non_strict = correct_args(args("lore --robot issues --staate opened"), false);
assert_eq!(non_strict.corrections.len(), 1);
assert_eq!(non_strict.corrections[0].rule, CorrectionRule::FuzzyFlag);
// In strict mode, same typo might or might not match depending on JW score.
// We verify that at least wildly wrong flags are still rejected.
let strict = correct_args(args("lore --robot issues --xyzzy foo"), true);
assert!(strict.corrections.is_empty());
}
@@ -780,6 +1036,155 @@ mod tests {
assert_eq!(result.corrections[0].corrected, "--robot");
}
// ---- Subcommand alias correction ----
#[test]
fn subcommand_alias_merge_requests_underscore() {
let result = correct_args(args("lore --robot merge_requests -n 10"), false);
assert!(
result
.corrections
.iter()
.any(|c| c.rule == CorrectionRule::SubcommandAlias && c.corrected == "mrs")
);
assert!(result.args.contains(&"mrs".to_string()));
}
#[test]
fn subcommand_alias_mergerequests_no_sep() {
let result = correct_args(args("lore --robot mergerequests"), false);
assert!(result.corrections.iter().any(|c| c.corrected == "mrs"));
}
#[test]
fn subcommand_alias_generate_docs_underscore() {
let result = correct_args(args("lore generate_docs"), false);
assert!(
result
.corrections
.iter()
.any(|c| c.corrected == "generate-docs")
);
}
#[test]
fn subcommand_alias_case_insensitive() {
let result = correct_args(args("lore Merge_Requests"), false);
assert!(result.corrections.iter().any(|c| c.corrected == "mrs"));
}
#[test]
fn subcommand_alias_valid_command_untouched() {
let result = correct_args(args("lore issues -n 10"), false);
assert!(result.corrections.is_empty());
}
// ---- Enum value normalization ----
#[test]
fn value_case_normalization() {
let result = correct_args(args("lore issues --state Opened"), false);
assert!(
result
.corrections
.iter()
.any(|c| c.rule == CorrectionRule::ValueNormalization && c.corrected == "opened")
);
assert!(result.args.contains(&"opened".to_string()));
}
#[test]
fn value_case_normalization_eq_form() {
let result = correct_args(args("lore issues --state=Opened"), false);
assert!(
result
.corrections
.iter()
.any(|c| c.corrected == "--state=opened")
);
}
#[test]
fn value_prefix_expansion() {
// "open" is a unique prefix of "opened"
let result = correct_args(args("lore issues --state open"), false);
assert!(
result
.corrections
.iter()
.any(|c| c.corrected == "opened" && c.rule == CorrectionRule::ValueFuzzy)
);
}
#[test]
fn value_fuzzy_typo() {
let result = correct_args(args("lore issues --state opend"), false);
assert!(result.corrections.iter().any(|c| c.corrected == "opened"));
}
#[test]
fn value_already_valid_untouched() {
let result = correct_args(args("lore issues --state opened"), false);
// No value corrections expected (flag corrections may still exist)
assert!(!result.corrections.iter().any(|c| matches!(
c.rule,
CorrectionRule::ValueNormalization | CorrectionRule::ValueFuzzy
)));
}
#[test]
fn value_mode_case() {
let result = correct_args(args("lore search --mode Hybrid query"), false);
assert!(result.corrections.iter().any(|c| c.corrected == "hybrid"));
}
#[test]
fn value_normalization_respects_option_terminator() {
// Values after `--` are positional and must not be corrected
let result = correct_args(args("lore search -- --state Opened"), false);
assert!(!result.corrections.iter().any(|c| matches!(
c.rule,
CorrectionRule::ValueNormalization | CorrectionRule::ValueFuzzy
)));
assert_eq!(result.args[4], "Opened"); // preserved as-is
}
// ---- Flag prefix matching ----
#[test]
fn flag_prefix_project() {
let result = correct_args(args("lore issues --proj group/repo"), false);
assert!(
result
.corrections
.iter()
.any(|c| c.rule == CorrectionRule::FlagPrefix && c.corrected == "--project")
);
}
#[test]
fn flag_prefix_ambiguous_not_corrected() {
// --s could be --state, --since, --sort, --status — ambiguous
let result = correct_args(args("lore issues --s opened"), false);
assert!(
!result
.corrections
.iter()
.any(|c| c.rule == CorrectionRule::FlagPrefix)
);
}
#[test]
fn flag_prefix_with_eq_value() {
let result = correct_args(args("lore issues --proj=group/repo"), false);
assert!(
result
.corrections
.iter()
.any(|c| c.corrected == "--project=group/repo")
);
}
// ---- Teaching notes ---- // ---- Teaching notes ----
#[test] #[test]
@@ -819,6 +1224,43 @@ mod tests {
assert!(note.contains("spelling")); assert!(note.contains("spelling"));
} }
#[test]
fn teaching_note_subcommand_alias() {
let c = Correction {
original: "merge_requests".to_string(),
corrected: "mrs".to_string(),
rule: CorrectionRule::SubcommandAlias,
confidence: 1.0,
};
let note = format_teaching_note(&c);
assert!(note.contains("canonical"));
assert!(note.contains("mrs"));
}
#[test]
fn teaching_note_value_normalization() {
let c = Correction {
original: "Opened".to_string(),
corrected: "opened".to_string(),
rule: CorrectionRule::ValueNormalization,
confidence: 0.95,
};
let note = format_teaching_note(&c);
assert!(note.contains("lowercase"));
}
#[test]
fn teaching_note_flag_prefix() {
let c = Correction {
original: "--proj".to_string(),
corrected: "--project".to_string(),
rule: CorrectionRule::FlagPrefix,
confidence: 0.95,
};
let note = format_teaching_note(&c);
assert!(note.contains("full flag name"));
}
// ---- Post-clap suggestion helpers ----
#[test]


@@ -13,7 +13,7 @@ use crate::core::timeline::{
};
use crate::core::timeline_collect::collect_events;
use crate::core::timeline_expand::expand_timeline;
use crate::core::timeline_seed::{seed_timeline, seed_timeline_direct};
use crate::embedding::ollama::{OllamaClient, OllamaConfig};
/// Parameters for running the timeline pipeline.
@@ -30,6 +30,43 @@ pub struct TimelineParams {
pub robot_mode: bool,
}
/// Parsed timeline query: either a search string or a direct entity reference.
enum TimelineQuery {
Search(String),
EntityDirect { entity_type: String, iid: i64 },
}
/// Parse the timeline query for entity-direct patterns.
///
/// Recognized patterns (case-insensitive prefix):
/// - `issue:N`, `i:N` -> issue
/// - `mr:N`, `m:N` -> merge_request
/// - Anything else -> search query
fn parse_timeline_query(query: &str) -> TimelineQuery {
let query = query.trim();
if let Some((prefix, rest)) = query.split_once(':') {
let prefix_lower = prefix.to_ascii_lowercase();
if let Ok(iid) = rest.trim().parse::<i64>() {
match prefix_lower.as_str() {
"issue" | "i" => {
return TimelineQuery::EntityDirect {
entity_type: "issue".to_owned(),
iid,
};
}
"mr" | "m" => {
return TimelineQuery::EntityDirect {
entity_type: "merge_request".to_owned(),
iid,
};
}
_ => {}
}
}
}
TimelineQuery::Search(query.to_owned())
}
/// Run the full timeline pipeline: SEED -> EXPAND -> COLLECT.
pub async fn run_timeline(config: &Config, params: &TimelineParams) -> Result<TimelineResult> {
let db_path = get_db_path(config.storage.db_path.as_deref());
@@ -53,27 +90,42 @@ pub async fn run_timeline(config: &Config, params: &TimelineParams) -> Result<Ti
})
.transpose()?;
// Parse query for entity-direct syntax (issue:N, mr:N, i:N, m:N)
let parsed_query = parse_timeline_query(&params.query);
let seed_result = match parsed_query {
TimelineQuery::EntityDirect { entity_type, iid } => {
// Direct seeding: synchronous, no Ollama needed
let spinner = stage_spinner(1, 3, "Resolving entity...", params.robot_mode);
let result = seed_timeline_direct(&conn, &entity_type, iid, project_id)?;
spinner.finish_and_clear();
result
}
TimelineQuery::Search(ref query) => {
// Construct OllamaClient for hybrid search (same pattern as run_search)
let ollama_cfg = &config.embedding;
let client = OllamaClient::new(OllamaConfig {
base_url: ollama_cfg.base_url.clone(),
model: ollama_cfg.model.clone(),
..OllamaConfig::default()
});
// Stage 1+2: SEED + HYDRATE (hybrid search with FTS fallback)
let spinner = stage_spinner(1, 3, "Seeding timeline...", params.robot_mode);
let result = seed_timeline(
&conn,
Some(&client),
query,
project_id,
since_ms,
params.max_seeds,
params.max_evidence,
)
.await?;
spinner.finish_and_clear();
result
}
};
// Stage 3: EXPAND
let spinner = stage_spinner(2, 3, "Expanding cross-references...", params.robot_mode);
@@ -556,3 +608,84 @@ fn count_discussion_threads(events: &[TimelineEvent]) -> usize {
.filter(|e| matches!(e.event_type, TimelineEventType::DiscussionThread { .. }))
.count()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_issue_colon_number() {
let q = parse_timeline_query("issue:42");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "issue" && iid == 42)
);
}
#[test]
fn test_parse_i_colon_number() {
let q = parse_timeline_query("i:42");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "issue" && iid == 42)
);
}
#[test]
fn test_parse_mr_colon_number() {
let q = parse_timeline_query("mr:99");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "merge_request" && iid == 99)
);
}
#[test]
fn test_parse_m_colon_number() {
let q = parse_timeline_query("m:99");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "merge_request" && iid == 99)
);
}
#[test]
fn test_parse_case_insensitive() {
let q = parse_timeline_query("ISSUE:42");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "issue" && iid == 42)
);
let q = parse_timeline_query("MR:99");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "merge_request" && iid == 99)
);
let q = parse_timeline_query("Issue:7");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "issue" && iid == 7)
);
}
#[test]
fn test_parse_search_fallback() {
let q = parse_timeline_query("switch health");
assert!(matches!(q, TimelineQuery::Search(ref s) if s == "switch health"));
}
#[test]
fn test_parse_non_numeric_falls_back_to_search() {
let q = parse_timeline_query("issue:abc");
assert!(matches!(q, TimelineQuery::Search(_)));
}
#[test]
fn test_parse_unknown_prefix_falls_back_to_search() {
let q = parse_timeline_query("foo:42");
assert!(matches!(q, TimelineQuery::Search(_)));
}
#[test]
fn test_parse_whitespace_trimmed() {
let q = parse_timeline_query(" issue:42 ");
assert!(
matches!(q, TimelineQuery::EntityDirect { ref entity_type, iid } if entity_type == "issue" && iid == 42)
);
}
}


@@ -10,6 +10,7 @@ use std::io::IsTerminal;
#[command(name = "lore")]
#[command(version = env!("LORE_VERSION"), about = "Local GitLab data management with semantic search", long_about = None)]
#[command(subcommand_required = false)]
#[command(infer_subcommands = true)]
#[command(after_long_help = "\x1b[1mEnvironment:\x1b[0m
GITLAB_TOKEN GitLab personal access token (or name set in config)
LORE_ROBOT Enable robot/JSON mode (non-empty, non-zero value)
@@ -107,12 +108,19 @@ impl Cli {
#[allow(clippy::large_enum_variant)]
pub enum Commands {
/// List or show issues
#[command(visible_alias = "issue")]
Issues(IssuesArgs),
/// List or show merge requests
#[command(
visible_alias = "mr",
alias = "merge-requests",
alias = "merge-request"
)]
Mrs(MrsArgs),
/// List notes from discussions
#[command(visible_alias = "note")]
Notes(NotesArgs),
/// Ingest data from GitLab
@@ -122,6 +130,7 @@ pub enum Commands {
Count(CountArgs),
/// Show sync state
#[command(visible_alias = "st")]
Status,
/// Verify GitLab authentication
@@ -170,9 +179,11 @@ pub enum Commands {
},
/// Search indexed documents
#[command(visible_alias = "find", alias = "query")]
Search(SearchArgs),
/// Show document and index statistics
#[command(visible_alias = "stat")]
Stats(StatsArgs),
/// Generate searchable documents from ingested data
@@ -794,11 +805,14 @@ pub struct EmbedArgs {
#[derive(Parser)]
#[command(after_help = "\x1b[1mExamples:\x1b[0m
lore timeline 'deployment' # Search-based seeding
lore timeline issue:42 # Direct: issue #42 and related entities
lore timeline i:42 # Shorthand for issue:42
lore timeline mr:99 # Direct: MR !99 and related entities
lore timeline 'auth' --since 30d -p group/repo # Scoped to project and time
lore timeline 'migration' --depth 2 --expand-mentions # Deep cross-reference expansion")]
pub struct TimelineArgs {
/// Search text or entity reference (issue:N, i:N, mr:N, m:N)
pub query: String,
/// Scope to a specific project (fuzzy match)


@@ -22,20 +22,34 @@ pub struct ExtractResult {
pub parse_failures: usize,
}
// GitLab system notes include the entity type word: "mentioned in issue #5"
// or "mentioned in merge request !730". The word is mandatory in real data,
// but we also keep the old bare-sigil form as a fallback (no data uses it today,
// but other GitLab instances might differ).
static MENTIONED_RE: LazyLock<Regex> = LazyLock::new(|| {
Regex::new(
r"mentioned in (?:issue |merge request )?(?:(?P<project>[\w][\w.\-]*(?:/[\w][\w.\-]*)+))?(?P<sigil>[#!])(?P<iid>\d+)",
)
.expect("mentioned regex is valid")
});
static CLOSED_BY_RE: LazyLock<Regex> = LazyLock::new(|| {
Regex::new(
r"closed by (?:issue |merge request )?(?:(?P<project>[\w][\w.\-]*(?:/[\w][\w.\-]*)+))?(?P<sigil>[#!])(?P<iid>\d+)",
)
.expect("closed_by regex is valid")
});
/// Matches full GitLab URLs like:
/// `https://gitlab.example.com/group/project/-/issues/123`
/// `https://gitlab.example.com/group/sub/project/-/merge_requests/456`
static GITLAB_URL_RE: LazyLock<Regex> = LazyLock::new(|| {
Regex::new(
r"https?://[^\s/]+/(?P<project>[^\s]+?)/-/(?P<entity_type>issues|merge_requests)/(?P<iid>\d+)",
)
.expect("gitlab url regex is valid")
});
pub fn parse_cross_refs(body: &str) -> Vec<ParsedCrossRef> {
let mut refs = Vec::new();
@@ -54,6 +68,47 @@ pub fn parse_cross_refs(body: &str) -> Vec<ParsedCrossRef> {
refs
}
/// Extract cross-references from GitLab URLs in free-text bodies (descriptions, user notes).
pub fn parse_url_refs(body: &str) -> Vec<ParsedCrossRef> {
let mut refs = Vec::new();
let mut seen = std::collections::HashSet::new();
for caps in GITLAB_URL_RE.captures_iter(body) {
let Some(entity_type_raw) = caps.name("entity_type").map(|m| m.as_str()) else {
continue;
};
let Some(iid_str) = caps.name("iid").map(|m| m.as_str()) else {
continue;
};
let Some(project) = caps.name("project").map(|m| m.as_str()) else {
continue;
};
let Ok(iid) = iid_str.parse::<i64>() else {
continue;
};
let target_entity_type = match entity_type_raw {
"issues" => "issue",
"merge_requests" => "merge_request",
_ => continue,
};
let key = (target_entity_type, project.to_owned(), iid);
if !seen.insert(key) {
continue; // deduplicate within same body
}
refs.push(ParsedCrossRef {
reference_type: "mentioned".to_owned(),
target_entity_type: target_entity_type.to_owned(),
target_iid: iid,
target_project_path: Some(project.to_owned()),
});
}
refs
}
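The shape `GITLAB_URL_RE` captures (host, then project path, then a `/-/` separator, then `issues|merge_requests`, then the iid) can be illustrated without the `regex` crate. `extract_url_ref` is a hypothetical, regex-free sketch for a single URL, not the crate's API:

```rust
// Regex-free sketch of the URL cross-reference shape: pull
// (project, entity_type, iid) out of a GitLab entity URL like
// https://host/group/project/-/issues/123
fn extract_url_ref(url: &str) -> Option<(String, String, i64)> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // Drop the host segment, keep the path
    let (_host, path) = rest.split_once('/')?;
    // GitLab separates the project path from the resource with "/-/"
    let (project, resource) = path.split_once("/-/")?;
    let (kind, iid_str) = resource.split_once('/')?;
    let entity_type = match kind {
        "issues" => "issue",
        "merge_requests" => "merge_request",
        _ => return None,
    };
    // Take leading digits only, so trailing anchors/punctuation don't break it
    let iid: i64 = iid_str
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect::<String>()
        .parse()
        .ok()?;
    Some((project.to_string(), entity_type.to_string(), iid))
}
```

The real `parse_url_refs` additionally scans whole note bodies, maps the match into a `ParsedCrossRef` with `reference_type: "mentioned"`, and deduplicates repeated links within the same body.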
fn capture_to_cross_ref( fn capture_to_cross_ref(
caps: &regex::Captures<'_>, caps: &regex::Captures<'_>,
reference_type: &str, reference_type: &str,
@@ -233,6 +288,189 @@ fn resolve_cross_project_entity(
resolve_entity_id(conn, project_id, entity_type, iid) resolve_entity_id(conn, project_id, entity_type, iid)
} }
/// Extract cross-references from issue and MR descriptions (GitLab URLs only).
pub fn extract_refs_from_descriptions(conn: &Connection, project_id: i64) -> Result<ExtractResult> {
let mut result = ExtractResult::default();
let mut insert_stmt = conn.prepare_cached(
"INSERT OR IGNORE INTO entity_references
(project_id, source_entity_type, source_entity_id,
target_entity_type, target_entity_id,
target_project_path, target_entity_iid,
reference_type, source_method, created_at)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, 'description_parse', ?9)",
)?;
let now = now_ms();
// Issues with descriptions
let mut issue_stmt = conn.prepare_cached(
"SELECT id, iid, description FROM issues
WHERE project_id = ?1 AND description IS NOT NULL AND description != ''",
)?;
let issues: Vec<(i64, i64, String)> = issue_stmt
.query_map([project_id], |row| {
Ok((row.get(0)?, row.get(1)?, row.get(2)?))
})?
.collect::<std::result::Result<Vec<_>, _>>()?;
for (entity_id, _iid, description) in &issues {
insert_url_refs(
conn,
&mut insert_stmt,
&mut result,
project_id,
"issue",
*entity_id,
description,
now,
)?;
}
// Merge requests with descriptions
let mut mr_stmt = conn.prepare_cached(
"SELECT id, iid, description FROM merge_requests
WHERE project_id = ?1 AND description IS NOT NULL AND description != ''",
)?;
let mrs: Vec<(i64, i64, String)> = mr_stmt
.query_map([project_id], |row| {
Ok((row.get(0)?, row.get(1)?, row.get(2)?))
})?
.collect::<std::result::Result<Vec<_>, _>>()?;
for (entity_id, _iid, description) in &mrs {
insert_url_refs(
conn,
&mut insert_stmt,
&mut result,
project_id,
"merge_request",
*entity_id,
description,
now,
)?;
}
if result.inserted > 0 || result.skipped_unresolvable > 0 {
debug!(
inserted = result.inserted,
unresolvable = result.skipped_unresolvable,
"Description cross-reference extraction complete"
);
}
Ok(result)
}
/// Extract cross-references from user (non-system) notes (GitLab URLs only).
pub fn extract_refs_from_user_notes(conn: &Connection, project_id: i64) -> Result<ExtractResult> {
let mut result = ExtractResult::default();
let mut note_stmt = conn.prepare_cached(
"SELECT n.id, n.body, d.noteable_type,
COALESCE(d.issue_id, d.merge_request_id) AS entity_id
FROM notes n
JOIN discussions d ON n.discussion_id = d.id
WHERE n.is_system = 0
AND n.project_id = ?1
AND n.body IS NOT NULL",
)?;
let notes: Vec<(i64, String, String, i64)> = note_stmt
.query_map([project_id], |row| {
Ok((row.get(0)?, row.get(1)?, row.get(2)?, row.get(3)?))
})?
.collect::<std::result::Result<Vec<_>, _>>()?;
if notes.is_empty() {
return Ok(result);
}
let mut insert_stmt = conn.prepare_cached(
"INSERT OR IGNORE INTO entity_references
(project_id, source_entity_type, source_entity_id,
target_entity_type, target_entity_id,
target_project_path, target_entity_iid,
reference_type, source_method, created_at)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, 'note_parse', ?9)",
)?;
let now = now_ms();
for (_, body, noteable_type, entity_id) in &notes {
let source_entity_type = noteable_type_to_entity_type(noteable_type);
insert_url_refs(
conn,
&mut insert_stmt,
&mut result,
project_id,
source_entity_type,
*entity_id,
body,
now,
)?;
}
if result.inserted > 0 || result.skipped_unresolvable > 0 {
debug!(
inserted = result.inserted,
unresolvable = result.skipped_unresolvable,
"User note cross-reference extraction complete"
);
}
Ok(result)
}
/// Shared helper: parse URL refs from a body and insert into entity_references.
#[allow(clippy::too_many_arguments)]
fn insert_url_refs(
conn: &Connection,
insert_stmt: &mut rusqlite::CachedStatement<'_>,
result: &mut ExtractResult,
project_id: i64,
source_entity_type: &str,
source_entity_id: i64,
body: &str,
now: i64,
) -> Result<()> {
let url_refs = parse_url_refs(body);
for xref in &url_refs {
let target_entity_id = if let Some(ref path) = xref.target_project_path {
resolve_cross_project_entity(conn, path, &xref.target_entity_type, xref.target_iid)
} else {
resolve_entity_id(conn, project_id, &xref.target_entity_type, xref.target_iid)
};
let rows_changed = insert_stmt.execute(rusqlite::params![
project_id,
source_entity_type,
source_entity_id,
xref.target_entity_type,
target_entity_id,
xref.target_project_path,
if target_entity_id.is_none() {
Some(xref.target_iid)
} else {
None
},
xref.reference_type,
now,
])?;
if rows_changed > 0 {
if target_entity_id.is_none() {
result.skipped_unresolvable += 1;
} else {
result.inserted += 1;
}
}
}
Ok(())
}
#[cfg(test)]
#[path = "note_parser_tests.rs"]
mod tests;


@@ -1,8 +1,10 @@
use super::*;
// --- parse_cross_refs: real GitLab system note format ---
#[test]
fn test_parse_mentioned_in_mr() {
let refs = parse_cross_refs("mentioned in merge request !567");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].reference_type, "mentioned");
assert_eq!(refs[0].target_entity_type, "merge_request");
@@ -12,7 +14,7 @@ fn test_parse_mentioned_in_mr() {
#[test]
fn test_parse_mentioned_in_issue() {
let refs = parse_cross_refs("mentioned in issue #234");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].reference_type, "mentioned");
assert_eq!(refs[0].target_entity_type, "issue");
@@ -22,7 +24,7 @@ fn test_parse_mentioned_in_issue() {
#[test]
fn test_parse_mentioned_cross_project() {
let refs = parse_cross_refs("mentioned in merge request group/repo!789");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].reference_type, "mentioned");
assert_eq!(refs[0].target_entity_type, "merge_request");
@@ -32,7 +34,7 @@ fn test_parse_mentioned_cross_project() {
#[test]
fn test_parse_mentioned_cross_project_issue() {
let refs = parse_cross_refs("mentioned in issue group/repo#123");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].reference_type, "mentioned");
assert_eq!(refs[0].target_entity_type, "issue");
@@ -42,7 +44,7 @@ fn test_parse_mentioned_cross_project_issue() {
#[test]
fn test_parse_closed_by_mr() {
let refs = parse_cross_refs("closed by merge request !567");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].reference_type, "closes");
assert_eq!(refs[0].target_entity_type, "merge_request");
@@ -52,7 +54,7 @@ fn test_parse_closed_by_mr() {
#[test]
fn test_parse_closed_by_cross_project() {
let refs = parse_cross_refs("closed by merge request group/repo!789");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].reference_type, "closes");
assert_eq!(refs[0].target_entity_type, "merge_request");
@@ -62,7 +64,7 @@ fn test_parse_closed_by_cross_project() {
#[test]
fn test_parse_multiple_refs() {
let refs = parse_cross_refs("mentioned in merge request !123 and mentioned in issue #456");
assert_eq!(refs.len(), 2);
assert_eq!(refs[0].target_entity_type, "merge_request");
assert_eq!(refs[0].target_iid, 123);
@@ -84,7 +86,7 @@ fn test_parse_non_english_note() {
#[test]
fn test_parse_multi_level_group_path() {
let refs = parse_cross_refs("mentioned in issue top/sub/project#123");
assert_eq!(refs.len(), 1);
assert_eq!(
refs[0].target_project_path.as_deref(),
@@ -95,7 +97,7 @@ fn test_parse_multi_level_group_path() {
#[test]
fn test_parse_deeply_nested_group_path() {
let refs = parse_cross_refs("mentioned in merge request a/b/c/d/e!42");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].target_project_path.as_deref(), Some("a/b/c/d/e"));
assert_eq!(refs[0].target_iid, 42);
@@ -103,7 +105,7 @@ fn test_parse_deeply_nested_group_path() {
#[test]
fn test_parse_hyphenated_project_path() {
let refs = parse_cross_refs("mentioned in issue my-group/my-project#99");
assert_eq!(refs.len(), 1);
assert_eq!(
refs[0].target_project_path.as_deref(),
@@ -113,7 +115,7 @@ fn test_parse_hyphenated_project_path() {
#[test]
fn test_parse_dotted_project_path() {
let refs = parse_cross_refs("mentioned in issue visiostack.io/backend#123");
assert_eq!(refs.len(), 1);
assert_eq!(
refs[0].target_project_path.as_deref(),
@@ -124,7 +126,7 @@ fn test_parse_dotted_project_path() {
#[test]
fn test_parse_dotted_nested_project_path() {
let refs = parse_cross_refs("closed by merge request my.org/sub.group/my.project!42");
assert_eq!(refs.len(), 1);
assert_eq!(
refs[0].target_project_path.as_deref(),
@@ -134,16 +136,27 @@ fn test_parse_dotted_nested_project_path() {
assert_eq!(refs[0].target_iid, 42);
}
// Bare-sigil fallback (no "issue"/"merge request" word) still works
#[test]
fn test_parse_bare_sigil_fallback() {
let refs = parse_cross_refs("mentioned in #123");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].target_iid, 123);
assert_eq!(refs[0].target_entity_type, "issue");
}
#[test]
fn test_parse_bare_sigil_closed_by() {
let refs = parse_cross_refs("closed by !567");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].reference_type, "closes");
assert_eq!(refs[0].target_entity_type, "merge_request");
assert_eq!(refs[0].target_iid, 567);
}
#[test]
fn test_parse_mixed_mentioned_and_closed() {
let refs = parse_cross_refs("mentioned in merge request !10 and closed by merge request !20");
assert_eq!(refs.len(), 2);
assert_eq!(refs[0].reference_type, "mentioned");
assert_eq!(refs[0].target_iid, 10);
@@ -151,6 +164,113 @@ fn test_parse_mixed_mentioned_and_closed() {
assert_eq!(refs[1].target_iid, 20);
}
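The bare-sigil fallback these tests pin down works because the sigil character alone determines the entity type, so dropping the "issue"/"merge request" word loses no information. A minimal illustrative sketch of that classification step, under the assumption that `#` maps to issues and `!` to merge requests (this is a hypothetical helper, not the crate's actual `parse_cross_refs`):

```rust
// Illustrative only: classify a bare reference token by its sigil.
// "#123" and "issue #123" carry the same type information, which is why
// the fallback path in parse_cross_refs is safe.
fn sigil_entity_type(token: &str) -> Option<(&'static str, i64)> {
    let (entity, digits) = if let Some(d) = token.strip_prefix('#') {
        ("issue", d)
    } else if let Some(d) = token.strip_prefix('!') {
        ("merge_request", d)
    } else {
        return None; // not a cross-reference sigil (e.g. ~label)
    };
    let iid: i64 = digits.parse().ok()?;
    Some((entity, iid))
}

fn main() {
    assert_eq!(sigil_entity_type("#123"), Some(("issue", 123)));
    assert_eq!(sigil_entity_type("!567"), Some(("merge_request", 567)));
    assert_eq!(sigil_entity_type("~bug"), None);
}
```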
// --- parse_url_refs ---
#[test]
fn test_url_ref_same_project_issue() {
let refs = parse_url_refs(
"See https://gitlab.visiostack.com/vs/typescript-code/-/issues/3537 for details",
);
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].target_entity_type, "issue");
assert_eq!(refs[0].target_iid, 3537);
assert_eq!(
refs[0].target_project_path.as_deref(),
Some("vs/typescript-code")
);
assert_eq!(refs[0].reference_type, "mentioned");
}
#[test]
fn test_url_ref_merge_request() {
let refs =
parse_url_refs("https://gitlab.visiostack.com/vs/typescript-code/-/merge_requests/3548");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].target_entity_type, "merge_request");
assert_eq!(refs[0].target_iid, 3548);
assert_eq!(
refs[0].target_project_path.as_deref(),
Some("vs/typescript-code")
);
}
#[test]
fn test_url_ref_cross_project() {
let refs = parse_url_refs(
"Related: https://gitlab.visiostack.com/vs/python-code/-/merge_requests/5203",
);
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].target_entity_type, "merge_request");
assert_eq!(refs[0].target_iid, 5203);
assert_eq!(
refs[0].target_project_path.as_deref(),
Some("vs/python-code")
);
}
#[test]
fn test_url_ref_with_anchor() {
let refs =
parse_url_refs("https://gitlab.visiostack.com/vs/typescript-code/-/issues/123#note_456");
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].target_entity_type, "issue");
assert_eq!(refs[0].target_iid, 123);
}
#[test]
fn test_url_ref_markdown_link() {
let refs = parse_url_refs(
"Check [this MR](https://gitlab.visiostack.com/vs/typescript-code/-/merge_requests/100) for context",
);
assert_eq!(refs.len(), 1);
assert_eq!(refs[0].target_entity_type, "merge_request");
assert_eq!(refs[0].target_iid, 100);
}
#[test]
fn test_url_ref_multiple_urls() {
let body =
"See https://gitlab.com/a/b/-/issues/1 and https://gitlab.com/a/b/-/merge_requests/2";
let refs = parse_url_refs(body);
assert_eq!(refs.len(), 2);
assert_eq!(refs[0].target_entity_type, "issue");
assert_eq!(refs[0].target_iid, 1);
assert_eq!(refs[1].target_entity_type, "merge_request");
assert_eq!(refs[1].target_iid, 2);
}
#[test]
fn test_url_ref_deduplicates() {
let body = "See https://gitlab.com/a/b/-/issues/1 and again https://gitlab.com/a/b/-/issues/1";
let refs = parse_url_refs(body);
assert_eq!(
refs.len(),
1,
"Duplicate URLs in same body should be deduplicated"
);
}
#[test]
fn test_url_ref_non_gitlab_urls_ignored() {
let refs = parse_url_refs(
"Check https://google.com/search?q=test and https://github.com/org/repo/issues/1",
);
assert!(refs.is_empty());
}
#[test]
fn test_url_ref_deeply_nested_project() {
let refs = parse_url_refs("https://gitlab.com/org/sub/deep/project/-/issues/42");
assert_eq!(refs.len(), 1);
assert_eq!(
refs[0].target_project_path.as_deref(),
Some("org/sub/deep/project")
);
assert_eq!(refs[0].target_iid, 42);
}
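The URL tests above all encode GitLab's routing convention: the project path is separated from the resource by `/-/`, followed by `issues/<iid>` or `merge_requests/<iid>`, optionally with a `#note_NNN` anchor. A self-contained sketch of that convention (not the crate's real `parse_url_refs`, which presumably also handles scanning a whole body, markdown links, and deduplication):

```rust
// Hypothetical sketch: pull (project_path, entity_type, iid) out of a single
// GitLab entity URL like https://host/group/sub/project/-/issues/42#note_7.
// Real parsing in the crate may differ; this only demonstrates the "/-/"
// path convention the tests rely on.
fn parse_gitlab_entity_url(url: &str) -> Option<(String, &'static str, i64)> {
    let rest = url.strip_prefix("https://")?; // sketch assumes https only
    let (_host, path) = rest.split_once('/')?;
    // GitLab separates the (possibly nested) project path with "/-/".
    let (project_path, resource) = path.split_once("/-/")?;
    let (kind, iid_part) = resource.split_once('/')?;
    let entity_type = match kind {
        "issues" => "issue",
        "merge_requests" => "merge_request",
        _ => return None,
    };
    // Drop any #note_NNN anchor or query string before parsing the IID.
    let iid_str = iid_part.split(|c| c == '#' || c == '?').next()?;
    let iid: i64 = iid_str.parse().ok()?;
    Some((project_path.to_owned(), entity_type, iid))
}

fn main() {
    assert_eq!(
        parse_gitlab_entity_url("https://gitlab.com/org/sub/deep/project/-/issues/42"),
        Some(("org/sub/deep/project".to_owned(), "issue", 42))
    );
    assert_eq!(
        parse_gitlab_entity_url("https://gitlab.com/vs/code/-/merge_requests/60#note_456"),
        Some(("vs/code".to_owned(), "merge_request", 60))
    );
    // Non-GitLab-shaped URLs (no "/-/" segment) are ignored.
    assert!(parse_gitlab_entity_url("https://github.com/org/repo/issues/1").is_none());
}
```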
// --- Integration tests: system notes (updated for real format) ---
fn setup_test_db() -> Connection {
use crate::core::db::{create_connection, run_migrations};
@@ -204,27 +324,31 @@ fn seed_test_data(conn: &Connection) -> i64 {
)
.unwrap();
// System note: real GitLab format "mentioned in merge request !789"
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (40, 4000, 30, 1, 1, 'mentioned in merge request !789', ?1, ?1, ?1)",
[now],
)
.unwrap();
// System note: real GitLab format "mentioned in issue #456"
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (41, 4001, 31, 1, 1, 'mentioned in issue #456', ?1, ?1, ?1)",
[now],
)
.unwrap();
// User note (is_system=0) — should NOT be processed by system note extractor
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (42, 4002, 30, 1, 0, 'mentioned in merge request !999', ?1, ?1, ?1)",
[now],
)
.unwrap();
// System note with no cross-ref pattern
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (43, 4003, 30, 1, 1, 'added label ~bug', ?1, ?1, ?1)", VALUES (43, 4003, 30, 1, 1, 'added label ~bug', ?1, ?1, ?1)",
@@ -232,9 +356,10 @@ fn seed_test_data(conn: &Connection) -> i64 {
)
.unwrap();
// System note: cross-project ref
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (44, 4004, 30, 1, 1, 'mentioned in issue other/project#999', ?1, ?1, ?1)",
[now], [now],
)
.unwrap();
@@ -323,3 +448,323 @@ fn test_extract_refs_empty_project() {
assert_eq!(result.skipped_unresolvable, 0);
assert_eq!(result.parse_failures, 0);
}
// --- Integration tests: description extraction ---
#[test]
fn test_extract_refs_from_descriptions_issue() {
let conn = setup_test_db();
let now = now_ms();
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'vs/typescript-code', 'https://gitlab.com/vs/typescript-code', ?1, ?1)",
[now],
)
.unwrap();
// Issue with MR reference in description
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, description, created_at, updated_at, last_seen_at)
VALUES (10, 1000, 1, 3537, 'Test Issue', 'opened',
'Related to https://gitlab.com/vs/typescript-code/-/merge_requests/3548',
?1, ?1, ?1)",
[now],
)
.unwrap();
// The target MR so it resolves
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, source_branch, target_branch, author_username, created_at, updated_at, last_seen_at)
VALUES (20, 2000, 1, 3548, 'Fix MR', 'merged', 'fix', 'main', 'dev', ?1, ?1, ?1)",
[now],
)
.unwrap();
let result = extract_refs_from_descriptions(&conn, 1).unwrap();
assert_eq!(result.inserted, 1, "Should insert 1 description ref");
assert_eq!(result.skipped_unresolvable, 0);
let method: String = conn
.query_row(
"SELECT source_method FROM entity_references WHERE project_id = 1",
[],
|row| row.get(0),
)
.unwrap();
assert_eq!(method, "description_parse");
}
#[test]
fn test_extract_refs_from_descriptions_mr() {
let conn = setup_test_db();
let now = now_ms();
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'vs/typescript-code', 'https://gitlab.com/vs/typescript-code', ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, created_at, updated_at, last_seen_at)
VALUES (10, 1000, 1, 100, 'Target Issue', 'opened', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, source_branch, target_branch, author_username, description, created_at, updated_at, last_seen_at)
VALUES (20, 2000, 1, 200, 'Fixing MR', 'merged', 'fix', 'main', 'dev',
'Fixes https://gitlab.com/vs/typescript-code/-/issues/100',
?1, ?1, ?1)",
[now],
)
.unwrap();
let result = extract_refs_from_descriptions(&conn, 1).unwrap();
assert_eq!(result.inserted, 1);
let (src_type, tgt_type): (String, String) = conn
.query_row(
"SELECT source_entity_type, target_entity_type FROM entity_references WHERE project_id = 1",
[],
|row| Ok((row.get(0)?, row.get(1)?)),
)
.unwrap();
assert_eq!(src_type, "merge_request");
assert_eq!(tgt_type, "issue");
}
#[test]
fn test_extract_refs_from_descriptions_idempotent() {
let conn = setup_test_db();
let now = now_ms();
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'vs/code', 'https://gitlab.com/vs/code', ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, description, created_at, updated_at, last_seen_at)
VALUES (10, 1000, 1, 1, 'Issue', 'opened',
'See https://gitlab.com/vs/code/-/merge_requests/2', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, source_branch, target_branch, author_username, created_at, updated_at, last_seen_at)
VALUES (20, 2000, 1, 2, 'MR', 'opened', 'x', 'main', 'dev', ?1, ?1, ?1)",
[now],
)
.unwrap();
let r1 = extract_refs_from_descriptions(&conn, 1).unwrap();
assert_eq!(r1.inserted, 1);
let r2 = extract_refs_from_descriptions(&conn, 1).unwrap();
assert_eq!(r2.inserted, 0, "Second run should insert 0 (idempotent)");
}
#[test]
fn test_extract_refs_from_descriptions_cross_project_unresolved() {
let conn = setup_test_db();
let now = now_ms();
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'vs/typescript-code', 'https://gitlab.com/vs/typescript-code', ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, description, created_at, updated_at, last_seen_at)
VALUES (10, 1000, 1, 1, 'Issue', 'opened',
'See https://gitlab.com/vs/other-project/-/merge_requests/99', ?1, ?1, ?1)",
[now],
)
.unwrap();
let result = extract_refs_from_descriptions(&conn, 1).unwrap();
assert_eq!(result.inserted, 0);
assert_eq!(
result.skipped_unresolvable, 1,
"Cross-project ref with no matching project should be unresolvable"
);
let (path, iid): (String, i64) = conn
.query_row(
"SELECT target_project_path, target_entity_iid FROM entity_references WHERE target_entity_id IS NULL",
[],
|row| Ok((row.get(0)?, row.get(1)?)),
)
.unwrap();
assert_eq!(path, "vs/other-project");
assert_eq!(iid, 99);
}
// --- Integration tests: user note extraction ---
#[test]
fn test_extract_refs_from_user_notes_with_url() {
let conn = setup_test_db();
let now = now_ms();
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'vs/code', 'https://gitlab.com/vs/code', ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, created_at, updated_at, last_seen_at)
VALUES (10, 1000, 1, 50, 'Source Issue', 'opened', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, source_branch, target_branch, author_username, created_at, updated_at, last_seen_at)
VALUES (20, 2000, 1, 60, 'Target MR', 'opened', 'x', 'main', 'dev', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO discussions (id, gitlab_discussion_id, project_id, issue_id, noteable_type, last_seen_at)
VALUES (30, 'disc-user', 1, 10, 'Issue', ?1)",
[now],
)
.unwrap();
// User note with a URL
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (40, 4000, 30, 1, 0,
'This is related to https://gitlab.com/vs/code/-/merge_requests/60',
?1, ?1, ?1)",
[now],
)
.unwrap();
let result = extract_refs_from_user_notes(&conn, 1).unwrap();
assert_eq!(result.inserted, 1);
let method: String = conn
.query_row(
"SELECT source_method FROM entity_references WHERE project_id = 1",
[],
|row| row.get(0),
)
.unwrap();
assert_eq!(method, "note_parse");
}
#[test]
fn test_extract_refs_from_user_notes_no_system_note_patterns() {
let conn = setup_test_db();
let now = now_ms();
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'vs/code', 'https://gitlab.com/vs/code', ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, created_at, updated_at, last_seen_at)
VALUES (10, 1000, 1, 50, 'Source', 'opened', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, source_branch, target_branch, author_username, created_at, updated_at, last_seen_at)
VALUES (20, 2000, 1, 999, 'Target', 'opened', 'x', 'main', 'dev', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO discussions (id, gitlab_discussion_id, project_id, issue_id, noteable_type, last_seen_at)
VALUES (30, 'disc-x', 1, 10, 'Issue', ?1)",
[now],
)
.unwrap();
// User note with system-note-like text but no URL — should NOT extract
// (user notes only use URL parsing, not system note pattern matching)
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (40, 4000, 30, 1, 0, 'mentioned in merge request !999', ?1, ?1, ?1)",
[now],
)
.unwrap();
let result = extract_refs_from_user_notes(&conn, 1).unwrap();
assert_eq!(
result.inserted, 0,
"User notes should only parse URLs, not system note patterns"
);
}
#[test]
fn test_extract_refs_from_user_notes_idempotent() {
let conn = setup_test_db();
let now = now_ms();
conn.execute(
"INSERT INTO projects (id, gitlab_project_id, path_with_namespace, web_url, created_at, updated_at)
VALUES (1, 100, 'vs/code', 'https://gitlab.com/vs/code', ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO issues (id, gitlab_id, project_id, iid, title, state, created_at, updated_at, last_seen_at)
VALUES (10, 1000, 1, 1, 'Src', 'opened', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO merge_requests (id, gitlab_id, project_id, iid, title, state, source_branch, target_branch, author_username, created_at, updated_at, last_seen_at)
VALUES (20, 2000, 1, 2, 'Tgt', 'opened', 'x', 'main', 'dev', ?1, ?1, ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO discussions (id, gitlab_discussion_id, project_id, issue_id, noteable_type, last_seen_at)
VALUES (30, 'disc-y', 1, 10, 'Issue', ?1)",
[now],
)
.unwrap();
conn.execute(
"INSERT INTO notes (id, gitlab_id, discussion_id, project_id, is_system, body, created_at, updated_at, last_seen_at)
VALUES (40, 4000, 30, 1, 0,
'See https://gitlab.com/vs/code/-/merge_requests/2', ?1, ?1, ?1)",
[now],
)
.unwrap();
let r1 = extract_refs_from_user_notes(&conn, 1).unwrap();
assert_eq!(r1.inserted, 1);
let r2 = extract_refs_from_user_notes(&conn, 1).unwrap();
assert_eq!(r2.inserted, 0, "Second extraction should be idempotent");
}

View File

@@ -211,6 +211,77 @@ pub fn resolve_entity_ref(
}
}
/// Resolve an entity by its user-facing IID (e.g. issue #42) to a full [`EntityRef`].
///
/// Unlike [`resolve_entity_ref`] which takes an internal DB id, this takes the
/// GitLab IID that users see. Used by entity-direct timeline seeding (`issue:42`).
///
/// When `project_id` is `Some`, the query is scoped to that project (disambiguates
/// duplicate IIDs across projects).
///
/// Returns `LoreError::NotFound` when no match exists, `LoreError::Ambiguous` when
/// the same IID exists in multiple projects (suggest `--project`).
pub fn resolve_entity_by_iid(
conn: &Connection,
entity_type: &str,
iid: i64,
project_id: Option<i64>,
) -> Result<EntityRef> {
let table = match entity_type {
"issue" => "issues",
"merge_request" => "merge_requests",
_ => {
return Err(super::error::LoreError::NotFound(format!(
"Unknown entity type: {entity_type}"
)));
}
};
let sql = format!(
"SELECT e.id, e.iid, p.path_with_namespace
FROM {table} e
JOIN projects p ON p.id = e.project_id
WHERE e.iid = ?1 AND (?2 IS NULL OR e.project_id = ?2)"
);
let mut stmt = conn.prepare(&sql)?;
let rows: Vec<(i64, i64, String)> = stmt
.query_map(rusqlite::params![iid, project_id], |row| {
Ok((
row.get::<_, i64>(0)?,
row.get::<_, i64>(1)?,
row.get::<_, String>(2)?,
))
})?
.collect::<std::result::Result<Vec<_>, _>>()?;
match rows.len() {
0 => {
let sigil = if entity_type == "issue" { "#" } else { "!" };
Err(super::error::LoreError::NotFound(format!(
"{entity_type} {sigil}{iid} not found"
)))
}
1 => {
let (entity_id, entity_iid, project_path) = rows.into_iter().next().unwrap();
Ok(EntityRef {
entity_type: entity_type.to_owned(),
entity_id,
entity_iid,
project_path,
})
}
_ => {
let projects: Vec<&str> = rows.iter().map(|(_, _, p)| p.as_str()).collect();
let sigil = if entity_type == "issue" { "#" } else { "!" };
Err(super::error::LoreError::Ambiguous(format!(
"{entity_type} {sigil}{iid} exists in multiple projects: {}. Use --project to specify.",
projects.join(", ")
)))
}
}
}
#[cfg(test)]
mod tests {
use super::*;
@@ -409,4 +480,106 @@ mod tests {
let long = "a".repeat(300);
assert_eq!(truncate_to_chars(&long, 200).chars().count(), 200);
}
// ─── resolve_entity_by_iid tests ────────────────────────────────────────
use crate::core::db::{create_connection, run_migrations};
use std::path::Path;
fn setup_db() -> Connection {
let conn = create_connection(Path::new(":memory:")).unwrap();
run_migrations(&conn).unwrap();
conn
}
fn insert_project(conn: &Connection, gitlab_id: i64, path: &str) -> i64 {
conn.execute(
"INSERT INTO projects (gitlab_project_id, path_with_namespace, web_url) VALUES (?1, ?2, ?3)",
rusqlite::params![gitlab_id, path, format!("https://gitlab.com/{path}")],
)
.unwrap();
conn.last_insert_rowid()
}
fn insert_issue(conn: &Connection, project_id: i64, iid: i64) -> i64 {
conn.execute(
"INSERT INTO issues (gitlab_id, project_id, iid, title, state, author_username, created_at, updated_at, last_seen_at) VALUES (?1, ?2, ?3, 'Test issue', 'opened', 'alice', 1000, 2000, 3000)",
rusqlite::params![project_id * 10000 + iid, project_id, iid],
)
.unwrap();
conn.last_insert_rowid()
}
fn insert_mr(conn: &Connection, project_id: i64, iid: i64) -> i64 {
conn.execute(
"INSERT INTO merge_requests (gitlab_id, project_id, iid, title, state, author_username, created_at, updated_at, last_seen_at) VALUES (?1, ?2, ?3, 'Test MR', 'opened', 'bob', 1000, 2000, 3000)",
rusqlite::params![project_id * 10000 + iid, project_id, iid],
)
.unwrap();
conn.last_insert_rowid()
}
#[test]
fn test_resolve_entity_by_iid_issue() {
let conn = setup_db();
let project_id = insert_project(&conn, 1, "group/project");
let entity_id = insert_issue(&conn, project_id, 42);
let result = resolve_entity_by_iid(&conn, "issue", 42, None).unwrap();
assert_eq!(result.entity_type, "issue");
assert_eq!(result.entity_id, entity_id);
assert_eq!(result.entity_iid, 42);
assert_eq!(result.project_path, "group/project");
}
#[test]
fn test_resolve_entity_by_iid_mr() {
let conn = setup_db();
let project_id = insert_project(&conn, 1, "group/project");
let entity_id = insert_mr(&conn, project_id, 99);
let result = resolve_entity_by_iid(&conn, "merge_request", 99, None).unwrap();
assert_eq!(result.entity_type, "merge_request");
assert_eq!(result.entity_id, entity_id);
assert_eq!(result.entity_iid, 99);
assert_eq!(result.project_path, "group/project");
}
#[test]
fn test_resolve_entity_by_iid_not_found() {
let conn = setup_db();
insert_project(&conn, 1, "group/project");
let result = resolve_entity_by_iid(&conn, "issue", 999, None);
assert!(result.is_err());
let err = result.unwrap_err();
assert!(matches!(err, crate::core::error::LoreError::NotFound(_)));
}
#[test]
fn test_resolve_entity_by_iid_ambiguous() {
let conn = setup_db();
let proj1 = insert_project(&conn, 1, "group/project-a");
let proj2 = insert_project(&conn, 2, "group/project-b");
insert_issue(&conn, proj1, 42);
insert_issue(&conn, proj2, 42);
let result = resolve_entity_by_iid(&conn, "issue", 42, None);
assert!(result.is_err());
let err = result.unwrap_err();
assert!(matches!(err, crate::core::error::LoreError::Ambiguous(_)));
}
#[test]
fn test_resolve_entity_by_iid_project_scoped() {
let conn = setup_db();
let proj1 = insert_project(&conn, 1, "group/project-a");
let proj2 = insert_project(&conn, 2, "group/project-b");
insert_issue(&conn, proj1, 42);
let entity_id_b = insert_issue(&conn, proj2, 42);
let result = resolve_entity_by_iid(&conn, "issue", 42, Some(proj2)).unwrap();
assert_eq!(result.entity_id, entity_id_b);
assert_eq!(result.project_path, "group/project-b");
}
}
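The tests above exercise a common resolver shape: zero rows is NotFound, one row is the answer, several rows is Ambiguous with the candidate list so the caller can suggest `--project`. A generic sketch of that 0/1/many pattern (the enum and variant names here are stand-ins for illustration, not the crate's `LoreError`):

```rust
// Generic sketch of the arity-based resolution used by resolve_entity_by_iid.
#[derive(Debug, PartialEq)]
enum Resolution<T> {
    NotFound,            // no row matched the IID
    Unique(T),           // exactly one match: the answer
    Ambiguous(Vec<T>),   // same IID in several projects: caller must disambiguate
}

fn resolve<T>(mut rows: Vec<T>) -> Resolution<T> {
    match rows.len() {
        0 => Resolution::NotFound,
        1 => Resolution::Unique(rows.remove(0)),
        _ => Resolution::Ambiguous(rows),
    }
}

fn main() {
    assert_eq!(resolve::<i32>(vec![]), Resolution::NotFound);
    assert_eq!(resolve(vec![7]), Resolution::Unique(7));
    assert_eq!(resolve(vec![1, 2]), Resolution::Ambiguous(vec![1, 2]));
}
```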

View File

@@ -5,8 +5,8 @@ use tracing::debug;
use crate::core::error::Result;
use crate::core::timeline::{
EntityRef, MatchedDiscussion, TimelineEvent, TimelineEventType, resolve_entity_by_iid,
resolve_entity_ref, truncate_to_chars,
};
use crate::embedding::ollama::OllamaClient;
use crate::search::{FtsQueryMode, SearchFilters, SearchMode, search_hybrid, to_fts_query};
@@ -102,6 +102,53 @@ pub async fn seed_timeline(
})
}
/// Seed the timeline directly from an entity IID, bypassing search entirely.
///
/// Used for `issue:42` / `mr:99` syntax. Resolves the entity, gathers ALL its
/// discussions, and returns a `SeedResult` compatible with the rest of the pipeline.
pub fn seed_timeline_direct(
conn: &Connection,
entity_type: &str,
iid: i64,
project_id: Option<i64>,
) -> Result<SeedResult> {
let entity_ref = resolve_entity_by_iid(conn, entity_type, iid, project_id)?;
// Gather all discussions for this entity (not search-matched, ALL of them)
let entity_id_col = match entity_type {
"issue" => "issue_id",
"merge_request" => "merge_request_id",
_ => {
return Ok(SeedResult {
seed_entities: vec![entity_ref],
evidence_notes: Vec::new(),
matched_discussions: Vec::new(),
search_mode: "direct".to_owned(),
});
}
};
let sql = format!("SELECT id, project_id FROM discussions WHERE {entity_id_col} = ?1");
let mut stmt = conn.prepare(&sql)?;
let matched_discussions: Vec<MatchedDiscussion> = stmt
.query_map(rusqlite::params![entity_ref.entity_id], |row| {
Ok(MatchedDiscussion {
discussion_id: row.get(0)?,
entity_type: entity_type.to_owned(),
entity_id: entity_ref.entity_id,
project_id: row.get(1)?,
})
})?
.collect::<std::result::Result<Vec<_>, _>>()?;
Ok(SeedResult {
seed_entities: vec![entity_ref],
evidence_notes: Vec::new(),
matched_discussions,
search_mode: "direct".to_owned(),
})
}
/// Resolve a list of document IDs to deduplicated entity refs and matched discussions.
/// Discussion and note documents are resolved to their parent entity (issue or MR).
/// Returns (entities, matched_discussions).

View File

@@ -423,3 +423,90 @@ async fn test_seed_matched_discussions_have_correct_parent_entity() {
assert_eq!(result.matched_discussions[0].entity_type, "merge_request");
assert_eq!(result.matched_discussions[0].entity_id, mr_id);
}
// ─── seed_timeline_direct tests ─────────────────────────────────────────────

#[test]
fn test_direct_seed_resolves_entity() {
    let conn = setup_test_db();
    let project_id = insert_test_project(&conn);
    insert_test_issue(&conn, project_id, 42);

    let result = seed_timeline_direct(&conn, "issue", 42, None).unwrap();

    assert_eq!(result.seed_entities.len(), 1);
    assert_eq!(result.seed_entities[0].entity_type, "issue");
    assert_eq!(result.seed_entities[0].entity_iid, 42);
    assert_eq!(result.seed_entities[0].project_path, "group/project");
}

#[test]
fn test_direct_seed_gathers_all_discussions() {
    let conn = setup_test_db();
    let project_id = insert_test_project(&conn);
    let issue_id = insert_test_issue(&conn, project_id, 42);

    // Create 3 discussions for this issue
    let disc1 = insert_discussion(&conn, project_id, Some(issue_id), None);
    let disc2 = insert_discussion(&conn, project_id, Some(issue_id), None);
    let disc3 = insert_discussion(&conn, project_id, Some(issue_id), None);

    let result = seed_timeline_direct(&conn, "issue", 42, None).unwrap();

    assert_eq!(result.matched_discussions.len(), 3);
    let disc_ids: Vec<i64> = result
        .matched_discussions
        .iter()
        .map(|d| d.discussion_id)
        .collect();
    assert!(disc_ids.contains(&disc1));
    assert!(disc_ids.contains(&disc2));
    assert!(disc_ids.contains(&disc3));
}

#[test]
fn test_direct_seed_no_evidence_notes() {
    let conn = setup_test_db();
    let project_id = insert_test_project(&conn);
    let issue_id = insert_test_issue(&conn, project_id, 42);
    let disc_id = insert_discussion(&conn, project_id, Some(issue_id), None);
    insert_note(&conn, disc_id, project_id, "some note body", false);

    let result = seed_timeline_direct(&conn, "issue", 42, None).unwrap();

    assert!(
        result.evidence_notes.is_empty(),
        "Direct seeding should not produce evidence notes"
    );
}

#[test]
fn test_direct_seed_search_mode_is_direct() {
    let conn = setup_test_db();
    let project_id = insert_test_project(&conn);
    insert_test_issue(&conn, project_id, 42);

    let result = seed_timeline_direct(&conn, "issue", 42, None).unwrap();

    assert_eq!(result.search_mode, "direct");
}

#[test]
fn test_direct_seed_not_found() {
    let conn = setup_test_db();
    insert_test_project(&conn);

    let result = seed_timeline_direct(&conn, "issue", 999, None);
    assert!(result.is_err());
}

#[test]
fn test_direct_seed_mr() {
    let conn = setup_test_db();
    let project_id = insert_test_project(&conn);
    let mr_id = insert_test_mr(&conn, project_id, 99);
    let disc_id = insert_discussion(&conn, project_id, None, Some(mr_id));

    let result = seed_timeline_direct(&conn, "merge_request", 99, None).unwrap();

    assert_eq!(result.seed_entities.len(), 1);
    assert_eq!(result.seed_entities[0].entity_type, "merge_request");
    assert_eq!(result.seed_entities[0].entity_iid, 99);
    assert_eq!(result.matched_discussions.len(), 1);
    assert_eq!(result.matched_discussions[0].discussion_id, disc_id);
}
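The tests above hand `seed_timeline_direct` an entity type and IID that have already been split apart from a query like `issue:42` or `m:7`. As a minimal sketch of that split (a hypothetical helper, not the crate's actual parser):

```rust
/// Hypothetical sketch: split an entity-direct timeline query such as
/// "issue:42", "i:42", "mr:7", or "m:7" into (entity_type, iid).
/// Returns None for anything that should fall through to hybrid search.
fn parse_entity_ref(query: &str) -> Option<(&'static str, i64)> {
    let (prefix, rest) = query.split_once(':')?;
    let entity_type = match prefix.to_ascii_lowercase().as_str() {
        "issue" | "i" => "issue",
        "mr" | "m" => "merge_request",
        _ => return None,
    };
    let iid: i64 = rest.trim().parse().ok()?;
    Some((entity_type, iid))
}
```

Anything that fails to parse (no colon, unknown prefix, non-numeric IID) would be treated as ordinary search text.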


@@ -640,6 +640,24 @@ pub async fn ingest_project_merge_requests_with_progress(
    );
}
    let desc_refs = crate::core::note_parser::extract_refs_from_descriptions(conn, project_id)?;
    if desc_refs.inserted > 0 || desc_refs.skipped_unresolvable > 0 {
        debug!(
            inserted = desc_refs.inserted,
            unresolvable = desc_refs.skipped_unresolvable,
            "Extracted cross-references from descriptions"
        );
    }

    let user_note_refs = crate::core::note_parser::extract_refs_from_user_notes(conn, project_id)?;
    if user_note_refs.inserted > 0 || user_note_refs.skipped_unresolvable > 0 {
        debug!(
            inserted = user_note_refs.inserted,
            unresolvable = user_note_refs.skipped_unresolvable,
            "Extracted cross-references from user notes"
        );
    }
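The hunk above wires cross-reference extraction into MR ingestion. As a rough sketch of the scanning half of that job, here is a hypothetical helper that picks GitLab-style short refs (`#123` for issues, `!45` for MRs) out of free text; the real `extract_refs_from_*` functions additionally resolve each ref against the database and count the unresolvable ones:

```rust
/// Hypothetical sketch: collect GitLab-style short refs ("#123", "!45")
/// from a note or description body. Each hit is (marker, iid).
fn extract_short_refs(text: &str) -> Vec<(char, i64)> {
    let chars: Vec<char> = text.chars().collect();
    let mut refs = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let marker = chars[i];
        if marker == '#' || marker == '!' {
            // Consume the digit run following the marker, if any.
            let start = i + 1;
            let mut end = start;
            while end < chars.len() && chars[end].is_ascii_digit() {
                end += 1;
            }
            if end > start {
                let iid: i64 = chars[start..end].iter().collect::<String>().parse().unwrap();
                refs.push((marker, iid));
                i = end;
                continue;
            }
        }
        i += 1;
    }
    refs
}
```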
    {
        let enqueued = enqueue_mr_closes_issues_jobs(conn, project_id)?;
        if enqueued > 0 {


@@ -651,27 +651,37 @@ fn extract_invalid_value_context(e: &clap::Error) -> (Option<String>, Option<Vec
/// Phase 4: Suggest similar command using fuzzy matching
fn suggest_similar_command(invalid: &str) -> String {
    // Primary commands + common aliases for fuzzy matching
    const VALID_COMMANDS: &[(&str, &str)] = &[
        ("issues", "issues"),
        ("issue", "issues"),
        ("mrs", "mrs"),
        ("mr", "mrs"),
        ("merge-requests", "mrs"),
        ("search", "search"),
        ("find", "search"),
        ("query", "search"),
        ("sync", "sync"),
        ("ingest", "ingest"),
        ("count", "count"),
        ("status", "status"),
        ("auth", "auth"),
        ("doctor", "doctor"),
        ("version", "version"),
        ("init", "init"),
        ("stats", "stats"),
        ("stat", "stats"),
        ("generate-docs", "generate-docs"),
        ("embed", "embed"),
        ("migrate", "migrate"),
        ("health", "health"),
        ("robot-docs", "robot-docs"),
        ("completions", "completions"),
        ("timeline", "timeline"),
        ("who", "who"),
        ("notes", "notes"),
        ("note", "notes"),
        ("drift", "drift"),
    ];
    let invalid_lower = invalid.to_lowercase();
@@ -679,19 +689,43 @@ fn suggest_similar_command(invalid: &str) -> String {
    // Find the best match using Jaro-Winkler similarity
    let best_match = VALID_COMMANDS
        .iter()
        .map(|(alias, canonical)| (*canonical, jaro_winkler(&invalid_lower, alias)))
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));

    if let Some((cmd, score)) = best_match
        && score > 0.7
    {
        let example = command_example(cmd);
        return format!(
            "Did you mean 'lore {cmd}'? Example: {example}. Run 'lore robot-docs' for all commands"
        );
    }

    "Run 'lore robot-docs' for valid commands. Common: issues, mrs, search, sync, timeline, who"
        .to_string()
}
/// Return a contextual usage example for a command.
fn command_example(cmd: &str) -> &'static str {
    match cmd {
        "issues" => "lore --robot issues -n 10",
        "mrs" => "lore --robot mrs -n 10",
        "search" => "lore --robot search \"auth bug\"",
        "sync" => "lore --robot sync",
        "ingest" => "lore --robot ingest issues",
        "notes" => "lore --robot notes --for-issue 123",
        "count" => "lore --robot count issues",
        "status" => "lore --robot status",
        "stats" => "lore --robot stats",
        "timeline" => "lore --robot timeline \"auth flow\"",
        "who" => "lore --robot who --path src/",
        "health" => "lore --robot health",
        "generate-docs" => "lore --robot generate-docs",
        "embed" => "lore --robot embed",
        "robot-docs" => "lore robot-docs",
        "init" => "lore init",
        _ => "lore --robot <command>",
    }
}
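The alias-aware matching above scores the typed command against every alias with strsim's `jaro_winkler`, then reports the canonical command of the best-scoring alias. A self-contained sketch of the same shape, with a normalized Levenshtein similarity standing in for Jaro-Winkler so no crate is needed (the `suggest`, `similarity`, and `levenshtein` helpers are hypothetical):

```rust
/// Classic dynamic-programming edit distance.
fn levenshtein(a: &str, b: &str) -> usize {
    let (a, b): (Vec<char>, Vec<char>) = (a.chars().collect(), b.chars().collect());
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Edit distance normalized to [0, 1], where 1.0 is an exact match.
fn similarity(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / max_len as f64
}

/// Score the input against every alias; return the canonical command of the
/// best alias above the acceptance threshold, mirroring suggest_similar_command.
fn suggest(invalid: &str, table: &[(&str, &'static str)]) -> Option<&'static str> {
    table
        .iter()
        .map(|(alias, canonical)| (*canonical, similarity(&invalid.to_lowercase(), alias)))
        .filter(|(_, score)| *score > 0.7)
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .map(|(canonical, _)| canonical)
}
```

The key design point survives the swapped metric: because the table maps alias to canonical, a user who types `find` exactly is still routed to `search` rather than told about a command that does not exist.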
fn handle_issues( fn handle_issues(
@@ -2135,6 +2169,8 @@ struct RobotDocsData {
    commands: serde_json::Value,
    /// Deprecated command aliases (old -> new)
    aliases: serde_json::Value,
    /// Pre-clap error tolerance: what the CLI auto-corrects
    error_tolerance: serde_json::Value,
    exit_codes: serde_json::Value,
    /// Error codes emitted by clap parse failures
    clap_error_codes: serde_json::Value,
@@ -2345,13 +2381,17 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
"example": "lore completions bash > ~/.local/share/bash-completion/completions/lore" "example": "lore completions bash > ~/.local/share/bash-completion/completions/lore"
}, },
"timeline": { "timeline": {
"description": "Chronological timeline of events matching a keyword query", "description": "Chronological timeline of events matching a keyword query or entity reference",
"flags": ["<QUERY>", "-p/--project", "--since <duration>", "--depth <n>", "--expand-mentions", "-n/--limit", "--fields <list>", "--max-seeds", "--max-entities", "--max-evidence"], "flags": ["<QUERY>", "-p/--project", "--since <duration>", "--depth <n>", "--expand-mentions", "-n/--limit", "--fields <list>", "--max-seeds", "--max-entities", "--max-evidence"],
"example": "lore --robot timeline '<keyword>' --since 30d", "query_syntax": {
"search": "Any text -> hybrid search seeding (FTS + vector)",
"entity_direct": "issue:N, i:N, mr:N, m:N -> direct entity seeding (no search, no Ollama)"
},
"example": "lore --robot timeline issue:42",
"response_schema": { "response_schema": {
"ok": "bool", "ok": "bool",
"data": {"entities": "[{type:string, iid:int, title:string, project_path:string}]", "events": "[{timestamp:string, type:string, entity_type:string, entity_iid:int, detail:string}]", "total_events": "int"}, "data": {"entities": "[{type:string, iid:int, title:string, project_path:string}]", "events": "[{timestamp:string, type:string, entity_type:string, entity_iid:int, detail:string}]", "total_events": "int"},
"meta": {"elapsed_ms": "int"} "meta": {"elapsed_ms": "int", "search_mode": "string (hybrid|lexical|direct)"}
}, },
"fields_presets": {"minimal": ["timestamp", "type", "entity_iid", "detail"]} "fields_presets": {"minimal": ["timestamp", "type", "entity_iid", "detail"]}
}, },
@@ -2485,12 +2525,54 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
    // Phase 3: Deprecated command aliases
    let aliases = serde_json::json!({
        "deprecated_commands": {
            "list issues": "issues",
            "list mrs": "mrs",
            "show issue <IID>": "issues <IID>",
            "show mr <IID>": "mrs <IID>",
            "auth-test": "auth",
            "sync-status": "status"
        },
        "command_aliases": {
            "issue": "issues",
            "mr": "mrs",
            "merge-requests": "mrs",
            "merge-request": "mrs",
            "note": "notes",
            "find": "search",
            "query": "search",
            "stat": "stats",
            "st": "status"
        },
        "pre_clap_aliases": {
            "note": "Underscore/no-separator forms auto-corrected before parsing",
            "merge_requests": "mrs",
            "merge_request": "mrs",
            "mergerequests": "mrs",
            "mergerequest": "mrs",
            "generate_docs": "generate-docs",
            "generatedocs": "generate-docs",
            "gendocs": "generate-docs",
            "gen-docs": "generate-docs",
            "robot_docs": "robot-docs",
            "robotdocs": "robot-docs"
        },
        "prefix_matching": "Enabled via infer_subcommands. Unambiguous prefixes work: 'iss' -> issues, 'time' -> timeline, 'sea' -> search"
    });

    let error_tolerance = serde_json::json!({
        "note": "The CLI auto-corrects common mistakes before parsing. Corrections are applied silently with a teaching note on stderr.",
        "auto_corrections": [
            {"type": "single_dash_long_flag", "example": "-robot -> --robot", "mode": "all"},
            {"type": "case_normalization", "example": "--Robot -> --robot, --State -> --state", "mode": "all"},
            {"type": "flag_prefix", "example": "--proj -> --project (when unambiguous)", "mode": "all"},
            {"type": "fuzzy_flag", "example": "--projct -> --project", "mode": "all (threshold 0.9 in robot, 0.8 in human)"},
            {"type": "subcommand_alias", "example": "merge_requests -> mrs, robotdocs -> robot-docs", "mode": "all"},
            {"type": "value_normalization", "example": "--state Opened -> --state opened", "mode": "all"},
            {"type": "value_fuzzy", "example": "--state opend -> --state opened", "mode": "all"},
            {"type": "prefix_matching", "example": "lore iss -> lore issues, lore time -> lore timeline", "mode": "all (via clap infer_subcommands)"}
        ],
        "teaching_notes": "Auto-corrections emit a JSON warning on stderr: {\"warning\":{\"type\":\"ARG_CORRECTED\",\"corrections\":[...],\"teaching\":[...]}}"
    });
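The manifest above only documents the corrections; the normalization itself happens on argv before clap ever sees it. Two of the listed corrections (single-dash long flags and flag case) can be sketched in a few lines, assuming a hypothetical `normalize_arg` helper and a known-flags table; the real pre-clap pass also covers prefix matching, fuzzy flags, and value normalization:

```rust
/// Hypothetical sketch of two pre-clap corrections from the error_tolerance
/// manifest: "-robot" -> "--robot" and "--State" -> "--state". Anything not
/// recognized is passed through untouched for clap to reject normally.
fn normalize_arg(arg: &str, known_flags: &[&str]) -> String {
    // Case normalization for long flags: --State -> --state
    if let Some(body) = arg.strip_prefix("--") {
        let lower = format!("--{}", body.to_lowercase());
        if known_flags.contains(&lower.as_str()) {
            return lower;
        }
        return arg.to_string();
    }
    // Single-dash long flag: -robot -> --robot (only for known long flags,
    // so short flags like -n survive unchanged)
    if let Some(body) = arg.strip_prefix('-') {
        let candidate = format!("--{}", body.to_lowercase());
        if body.len() > 1 && known_flags.contains(&candidate.as_str()) {
            return candidate;
        }
    }
    arg.to_string()
}
```

Gating the single-dash rewrite on the flag being known is what keeps the correction safe: an unrecognized token still reaches clap unmodified and produces a normal parse error.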
    // Phase 3: Clap error codes (emitted by handle_clap_error)
@@ -2529,6 +2611,7 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
        quick_start,
        commands,
        aliases,
        error_tolerance,
        exit_codes,
        clap_error_codes,
        error_format: "stderr JSON: {\"error\":{\"code\":\"...\",\"message\":\"...\",\"suggestion\":\"...\",\"actions\":[\"...\"]}}".to_string(),