8 Commits

Author SHA1 Message Date
Taylor Eernisse
d9f99ef21d feat(cli): status display/filtering, expanded --fields, and robot-docs --brief
Work item status integration across all CLI output:

Issue listing (lore list issues):
- New Status column appears when any issue has status data, with
  hex-color rendering using ANSI 256-color approximation
- New --status flag for case-insensitive filtering (OR logic for
  multiple values): lore issues --status "In progress" --status "To do"
- Status fields (name, category, color, icon_name, synced_at) in issue
  list query and JSON output with conditional serialization

Issue detail (lore show issue):
- Displays "Status: In progress (in_progress)" with color-coded output
  using ANSI 256-color approximation from hex color values
- Status fields included in robot mode JSON with ISO timestamps
- IssueRow, IssueDetail, IssueDetailJson all carry status columns

Robot mode field selection expanded to new commands:
- search: --fields with "minimal" preset (document_id, title, source_type, score)
- timeline: --fields with "minimal" preset (timestamp, type, entity_iid, detail)
- who: --fields with per-mode presets (expert_minimal, workload_minimal, etc.)
- robot-docs: new --brief flag strips response_schema from output (~60% smaller)
- strip_schemas() utility in robot.rs for --brief mode
- expand_fields_preset() extended for search, timeline, and all who modes

Robot-docs manifest updated with --status flag documentation, --fields
flags for search/timeline/who, fields_presets sections, and corrected
search response schema field names.

Note: replaces empty commit dcfd449 which lost staging during hook execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:13:37 -05:00
Taylor Eernisse
f5967a8e52 chore: fix UBS hook stdin parsing and update beads
.claude/hooks/on-file-write.sh:
- Fix hook to read Claude Code context from JSON stdin (FILE_PATH and
  CWD extracted via jq) instead of relying on environment variables
- Scan only the changed file instead of the entire project directory,
  reducing hook execution from ~30s to <1s per save

.beads/:
- Sync issue tracker state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:12:34 -05:00
Taylor Eernisse
2c9de1a6c3 docs: add lore-service, work-item-status-graphql, and time-decay plans
Three implementation plans with iterative cross-model refinement:

lore-service (5 iterations):
  HTTP service layer exposing lore's SQLite data via REST/SSE for
  integration with external tools (dashboards, IDE extensions, chat
  agents). Covers authentication, rate limiting, caching strategy, and
  webhook-driven sync triggers.

work-item-status-graphql (7 iterations + TDD appendix):
  Detailed implementation plan for the GraphQL-based work item status
  enrichment feature (now implemented). Includes the TDD appendix with
  test-first development specifications covering GraphQL client, adaptive
  pagination, ingestion orchestration, CLI display, and robot mode output.

time-decay-expert-scoring (iteration 5 feedback):
  Updates to the existing time-decay scoring plan incorporating feedback
  on decay curve parameterization, recency weighting for discussion
  contributions, and staleness detection thresholds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:12:17 -05:00
Taylor Eernisse
1161edb212 docs: add TUI PRD v2 (FrankenTUI) with 9 plan-refine iterations
Comprehensive product requirements document for the gitlore TUI built on
FrankenTUI's Elm architecture (Msg -> update -> view). The PRD (7800+
lines) covers:

Architecture: Separate binary crate (lore-tui) with runtime delegation,
Elm-style Model/Cmd/Msg, DbManager with closure-based read pool + WAL,
TaskSupervisor for dedup/cancellation, EntityKey system for type-safe
entity references, CommandRegistry as single source of truth for
keybindings/palette/help.

Screens: Dashboard, IssueList, IssueDetail, MrList, MrDetail, Search
(lexical/hybrid/semantic with facets), Timeline (5-stage pipeline),
Who (expert/workload/reviews/active/overlap), Sync (live progress),
CommandPalette, Help overlay.

Infrastructure: InputMode state machine, Clock trait for deterministic
rendering, crash_context ring buffer with redaction, instance lock,
progressive hydration, session restore, grapheme-safe text truncation
(unicode-width + unicode-segmentation), terminal sanitization (ANSI/bidi/
C1 controls), entity LRU cache.

Testing: Snapshot tests via insta, event-fuzz, CLI/TUI parity, tiered
benchmark fixtures (S/M/L), query-plan CI enforcement, Phase 2.5
vertical slice gate.

9 plan-refine iterations (ChatGPT review -> Claude integration):
  Iter 1-3: Connection pool, debounce, EntityKey, TaskSupervisor,
    keyset pagination, capability-adaptive rendering
  Iter 4-6: Separate binary crate, ANSI hardening, session restore,
    read tx isolation, progressive hydration, unicode-width
  Iter 7-9: Per-screen LoadState, CommandRegistry, InputMode, Clock,
    log redaction, entity cache, search cancel SLO, crash diagnostics

Also includes the original tui-prd.md (ratatui-based, superseded by v2).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:11:26 -05:00
Taylor Eernisse
5ea976583e docs: update README, AGENTS, and robot-mode-design for work item status
README.md:
- Feature summary updated to mention work item status sync and GraphQL
- New config reference entry for sync.fetchWorkItemStatus (default true)
- Issue listing/show examples include --status flag usage
- Valid fields list expanded with status_name, status_category,
  status_color, status_icon_name, status_synced_at_iso
- Database schema table updated for issues table
- Ingest/sync command descriptions mention status enrichment phase
- Adaptive page sizing and graceful degradation documented

AGENTS.md:
- Robot mode example shows --status flag usage

docs/robot-mode-design.md:
- Issue available fields list expanded with status fields

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:10:51 -05:00
Taylor Eernisse
dcfd449b72 feat(cli): status display/filtering, expanded --fields, and robot-docs --brief
Work item status integration across all CLI output:

Issue listing (lore list issues):
- New Status column appears when any issue has status data, with
  hex-color rendering using ANSI 256-color approximation
- New --status flag for case-insensitive filtering (OR logic for
  multiple values): lore issues --status "In progress" --status "To do"

Issue detail (lore show issue):
- Displays "Status: In progress (in_progress)" with color-coded output
- Status fields (name, category, color, icon, synced_at) included in
  robot mode JSON with ISO timestamps

Robot mode field selection expanded to new commands:
- search: --fields with "minimal" preset (document_id, title, source_type, score)
- timeline: --fields with "minimal" preset (timestamp, type, entity_iid, detail)
- who: --fields with per-mode presets (expert_minimal, workload_minimal, etc.)
- robot-docs: new --brief flag strips response_schema from output (~60% smaller)

Robot-docs manifest updated with --status flag documentation, --fields
flags for search/timeline/who, fields_presets sections, and corrected
search response schema field names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:09:47 -05:00
Taylor Eernisse
6b75697638 feat(ingestion): enrich issues with work item status from GraphQL API
Add a "Phase 1.5" status enrichment step to the issue ingestion pipeline
that fetches work item statuses via the GitLab GraphQL API after the
standard REST API ingestion completes.

Schema changes (migration 021):
- Add status_name, status_category, status_color, status_icon_name, and
  status_synced_at columns to the issues table (all nullable)

Ingestion pipeline changes:
- New `enrich_issue_statuses_txn()` function that applies fetched
  statuses in a single transaction with two phases: clear stale statuses
  for issues that no longer have a status widget, then apply new/updated
  statuses from the GraphQL response
- ProgressEvent variants for status enrichment (complete/skipped)
- IngestProjectResult tracks enrichment metrics (seen, enriched, cleared,
  without_widget, partial_error_count, enrichment_mode, errors)
- Robot mode JSON output includes per-project status enrichment details

Configuration:
- New `sync.fetchWorkItemStatus` config option (defaults true) to disable
  GraphQL status enrichment on instances without Premium/Ultimate
- `LoreError::GitLabAuthFailed` now treated as permanent API error so
  status enrichment auth failures don't trigger retries

Also removes the unnecessary nested SAVEPOINT in store_closes_issues_refs
(already runs within the orchestrator's transaction context).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:09:21 -05:00
Taylor Eernisse
dc49f5209e feat(gitlab): add GraphQL client with adaptive pagination and work item status types
Introduce a reusable GraphQL client (`src/gitlab/graphql.rs`) that handles
GitLab's GraphQL API with full error handling for auth failures, rate
limiting, and partial errors. Key capabilities:

- Adaptive page sizing (100 → 50 → 25 → 10) to handle GitLab GraphQL
  complexity limits without hardcoding a single safe page size
- Paginated issue status fetching via the workItems GraphQL query
- Graceful detection of unsupported instances (missing GraphQL endpoint
  or forbidden auth) so ingestion continues without status data
- Retry-After header parsing via the `httpdate` crate for rate limit
  compliance

Also adds `WorkItemStatus` type to `gitlab::types` with name, category,
color, and icon_name fields (all optional except name) with comprehensive
deserialization tests covering all system statuses (TO_DO, IN_PROGRESS,
DONE, CANCELED) and edge cases (null category, unknown future values).

The `GitLabClient` gains a `graphql_client()` factory method for
ergonomic access from the ingestion pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 08:08:53 -05:00
54 changed files with 22978 additions and 124 deletions

File diff suppressed because one or more lines are too long

View File

@@ -1 +1 @@
bd-3qn6 bd-3hjh

View File

@@ -1,6 +1,11 @@
#!/bin/bash #!/bin/bash
# Ultimate Bug Scanner - Claude Code Hook # Ultimate Bug Scanner - Claude Code Hook
# Runs on every file save for UBS-supported languages (JS/TS, Python, C/C++, Rust, Go, Java, Ruby) # Runs on every file save for UBS-supported languages (JS/TS, Python, C/C++, Rust, Go, Java, Ruby)
# Claude Code hooks receive context as JSON on stdin.
INPUT=$(cat)
FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
CWD=$(echo "$INPUT" | jq -r '.cwd // empty')
if [[ "$FILE_PATH" =~ \.(js|jsx|ts|tsx|mjs|cjs|py|pyw|pyi|c|cc|cpp|cxx|h|hh|hpp|hxx|rs|go|java|rb)$ ]]; then if [[ "$FILE_PATH" =~ \.(js|jsx|ts|tsx|mjs|cjs|py|pyw|pyi|c|cc|cpp|cxx|h|hh|hpp|hxx|rs|go|java|rb)$ ]]; then
echo "🔬 Running bug scanner..." echo "🔬 Running bug scanner..."
@@ -8,5 +13,5 @@ if [[ "$FILE_PATH" =~ \.(js|jsx|ts|tsx|mjs|cjs|py|pyw|pyi|c|cc|cpp|cxx|h|hh|hpp|
echo "⚠️ 'ubs' not found in PATH; install it before using this hook." >&2 echo "⚠️ 'ubs' not found in PATH; install it before using this hook." >&2
exit 0 exit 0
fi fi
ubs "${PROJECT_DIR}" --ci 2>&1 | head -50 ubs "$FILE_PATH" --ci 2>&1 | head -50
fi fi

View File

@@ -618,6 +618,9 @@ LORE_ROBOT=1 lore issues
lore --robot issues -n 10 lore --robot issues -n 10
lore --robot mrs -s opened lore --robot mrs -s opened
# Filter issues by work item status (case-insensitive)
lore --robot issues --status "In progress"
# List with field selection (reduces token usage ~60%) # List with field selection (reduces token usage ~60%)
lore --robot issues --fields minimal lore --robot issues --fields minimal
lore --robot mrs --fields iid,title,state,draft lore --robot mrs --fields iid,title,state,draft

1
Cargo.lock generated
View File

@@ -1118,6 +1118,7 @@ dependencies = [
"dirs", "dirs",
"flate2", "flate2",
"futures", "futures",
"httpdate",
"indicatif", "indicatif",
"libc", "libc",
"open", "open",

View File

@@ -45,6 +45,7 @@ rand = "0.8"
sha2 = "0.10" sha2 = "0.10"
flate2 = "1" flate2 = "1"
chrono = { version = "0.4", features = ["serde"] } chrono = { version = "0.4", features = ["serde"] }
httpdate = "1"
uuid = { version = "1", features = ["v4"] } uuid = { version = "1", features = ["v4"] }
regex = "1" regex = "1"
strsim = "0.11" strsim = "0.11"

View File

@@ -1,6 +1,6 @@
# Gitlore # Gitlore
Local GitLab data management with semantic search, people intelligence, and temporal analysis. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, hybrid search, chronological event reconstruction, and expert discovery. Local GitLab data management with semantic search, people intelligence, and temporal analysis. Syncs issues, MRs, discussions, notes, and work item statuses from GitLab to a local SQLite database for fast, offline-capable querying, filtering, hybrid search, chronological event reconstruction, and expert discovery.
## Features ## Features
@@ -8,7 +8,7 @@ Local GitLab data management with semantic search, people intelligence, and temp
- **Incremental sync**: Cursor-based sync only fetches changes since last sync - **Incremental sync**: Cursor-based sync only fetches changes since last sync
- **Full re-sync**: Reset cursors and fetch all data from scratch when needed - **Full re-sync**: Reset cursors and fetch all data from scratch when needed
- **Multi-project**: Track issues and MRs across multiple GitLab projects - **Multi-project**: Track issues and MRs across multiple GitLab projects
- **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches - **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches, work item status
- **Hybrid search**: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion - **Hybrid search**: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
- **People intelligence**: Expert discovery, workload analysis, review patterns, active discussions, and code ownership overlap - **People intelligence**: Expert discovery, workload analysis, review patterns, active discussions, and code ownership overlap
- **Timeline pipeline**: Reconstructs chronological event histories by combining search, graph traversal, and event aggregation across related entities - **Timeline pipeline**: Reconstructs chronological event histories by combining search, graph traversal, and event aggregation across related entities
@@ -17,6 +17,7 @@ Local GitLab data management with semantic search, people intelligence, and temp
- **Raw payload storage**: Preserves original GitLab API responses for debugging - **Raw payload storage**: Preserves original GitLab API responses for debugging
- **Discussion threading**: Full support for issue and MR discussions including inline code review comments - **Discussion threading**: Full support for issue and MR discussions including inline code review comments
- **Cross-reference tracking**: Automatic extraction of "closes", "mentioned" relationships between MRs and issues - **Cross-reference tracking**: Automatic extraction of "closes", "mentioned" relationships between MRs and issues
- **Work item status enrichment**: Fetches issue statuses (e.g., "To do", "In progress", "Done") from GitLab's GraphQL API with adaptive page sizing, color-coded display, and case-insensitive filtering
- **Resource event history**: Tracks state changes, label events, and milestone events for issues and MRs - **Resource event history**: Tracks state changes, label events, and milestone events for issues and MRs
- **Robot mode**: Machine-readable JSON output with structured errors, meaningful exit codes, and actionable recovery steps - **Robot mode**: Machine-readable JSON output with structured errors, meaningful exit codes, and actionable recovery steps
- **Observability**: Verbosity controls, JSON log format, structured metrics, and stage timing - **Observability**: Verbosity controls, JSON log format, structured metrics, and stage timing
@@ -96,7 +97,8 @@ Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lo
"heartbeatIntervalSeconds": 30, "heartbeatIntervalSeconds": 30,
"cursorRewindSeconds": 2, "cursorRewindSeconds": 2,
"primaryConcurrency": 4, "primaryConcurrency": 4,
"dependentConcurrency": 2 "dependentConcurrency": 2,
"fetchWorkItemStatus": true
}, },
"storage": { "storage": {
"compressRawPayloads": true "compressRawPayloads": true
@@ -123,6 +125,7 @@ Configuration is stored in `~/.config/lore/config.json` (or `$XDG_CONFIG_HOME/lo
| `sync` | `cursorRewindSeconds` | `2` | Seconds to rewind cursor for overlap safety | | `sync` | `cursorRewindSeconds` | `2` | Seconds to rewind cursor for overlap safety |
| `sync` | `primaryConcurrency` | `4` | Concurrent GitLab requests for primary resources | | `sync` | `primaryConcurrency` | `4` | Concurrent GitLab requests for primary resources |
| `sync` | `dependentConcurrency` | `2` | Concurrent requests for dependent resources | | `sync` | `dependentConcurrency` | `2` | Concurrent requests for dependent resources |
| `sync` | `fetchWorkItemStatus` | `true` | Enrich issues with work item status via GraphQL (requires GitLab Premium/Ultimate) |
| `storage` | `dbPath` | `~/.local/share/lore/lore.db` | Database file path | | `storage` | `dbPath` | `~/.local/share/lore/lore.db` | Database file path |
| `storage` | `backupDir` | `~/.local/share/lore/backups` | Backup directory | | `storage` | `backupDir` | `~/.local/share/lore/backups` | Backup directory |
| `storage` | `compressRawPayloads` | `true` | Compress stored API responses with gzip | | `storage` | `compressRawPayloads` | `true` | Compress stored API responses with gzip |
@@ -184,6 +187,8 @@ lore issues --since 1m # Updated in last month
lore issues --since 2024-01-01 # Updated since date lore issues --since 2024-01-01 # Updated since date
lore issues --due-before 2024-12-31 # Due before date lore issues --due-before 2024-12-31 # Due before date
lore issues --has-due # Only issues with due dates lore issues --has-due # Only issues with due dates
lore issues --status "In progress" # By work item status (case-insensitive)
lore issues --status "To do" --status "In progress" # Multiple statuses (OR)
lore issues -p group/repo # Filter by project lore issues -p group/repo # Filter by project
lore issues --sort created --asc # Sort by created date, ascending lore issues --sort created --asc # Sort by created date, ascending
lore issues -o # Open first result in browser lore issues -o # Open first result in browser
@@ -193,9 +198,9 @@ lore -J issues --fields minimal # Compact: iid, title, state, updated_at_i
lore -J issues --fields iid,title,labels,state # Custom fields lore -J issues --fields iid,title,labels,state # Custom fields
``` ```
When listing, output includes: IID, title, state, author, assignee, labels, and update time. In robot mode, the `--fields` flag controls which fields appear in the JSON response. When listing, output includes: IID, title, state, status (when any issue has one), assignee, labels, and update time. Status values display with their configured color. In robot mode, the `--fields` flag controls which fields appear in the JSON response.
When showing a single issue (e.g., `lore issues 123`), output includes: title, description, state, author, assignees, labels, milestone, due date, web URL, and threaded discussions. When showing a single issue (e.g., `lore issues 123`), output includes: title, description, state, work item status (with color and category), author, assignees, labels, milestone, due date, web URL, and threaded discussions.
#### Project Resolution #### Project Resolution
@@ -397,7 +402,7 @@ When graph expansion encounters cross-project references to entities not yet syn
### `lore sync` ### `lore sync`
Run the full sync pipeline: ingest from GitLab, generate searchable documents, and compute embeddings. Run the full sync pipeline: ingest from GitLab (including work item status enrichment via GraphQL), generate searchable documents, and compute embeddings.
```bash ```bash
lore sync # Full pipeline lore sync # Full pipeline
@@ -413,11 +418,11 @@ The sync command displays animated progress bars for each stage and outputs timi
### `lore ingest` ### `lore ingest`
Sync data from GitLab to local database. Runs only the ingestion step (no doc generation or embeddings). Sync data from GitLab to local database. Runs only the ingestion step (no doc generation or embeddings). For issue ingestion, this includes a status enrichment phase that fetches work item statuses via the GitLab GraphQL API.
```bash ```bash
lore ingest # Ingest everything (issues + MRs) lore ingest # Ingest everything (issues + MRs)
lore ingest issues # Issues only lore ingest issues # Issues only (includes status enrichment)
lore ingest mrs # MRs only lore ingest mrs # MRs only
lore ingest issues -p group/repo # Single project lore ingest issues -p group/repo # Single project
lore ingest --force # Override stale lock lore ingest --force # Override stale lock
@@ -430,6 +435,8 @@ The `--full` flag resets sync cursors and discussion watermarks, then fetches al
- You want to ensure complete data after schema changes - You want to ensure complete data after schema changes
- Troubleshooting sync issues - Troubleshooting sync issues
Status enrichment uses adaptive page sizing (100 → 50 → 25 → 10) to handle GitLab GraphQL complexity limits. It gracefully handles instances without GraphQL support or Premium/Ultimate licensing. Disable via `sync.fetchWorkItemStatus: false` in config.
### `lore generate-docs` ### `lore generate-docs`
Extract searchable documents from ingested issues, MRs, and discussions for the FTS5 index. Extract searchable documents from ingested issues, MRs, and discussions for the FTS5 index.
@@ -623,7 +630,7 @@ lore -J issues --fields iid,title,state,labels,updated_at_iso
# minimal: iid, title, state, updated_at_iso # minimal: iid, title, state, updated_at_iso
``` ```
Valid fields for issues: `iid`, `title`, `state`, `author_username`, `labels`, `assignees`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path` Valid fields for issues: `iid`, `title`, `state`, `author_username`, `labels`, `assignees`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at_iso`
Valid fields for MRs: `iid`, `title`, `state`, `author_username`, `labels`, `draft`, `target_branch`, `source_branch`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `reviewers` Valid fields for MRs: `iid`, `title`, `state`, `author_username`, `labels`, `draft`, `target_branch`, `source_branch`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `reviewers`
@@ -714,7 +721,7 @@ Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
| Table | Purpose | | Table | Purpose |
|-------|---------| |-------|---------|
| `projects` | Tracked GitLab projects with metadata | | `projects` | Tracked GitLab projects with metadata |
| `issues` | Issue metadata (title, state, author, due date, milestone) | | `issues` | Issue metadata (title, state, author, due date, milestone, work item status) |
| `merge_requests` | MR metadata (title, state, draft, branches, merge status, commit SHAs) | | `merge_requests` | MR metadata (title, state, draft, branches, merge status, commit SHAs) |
| `milestones` | Project milestones with state and due dates | | `milestones` | Project milestones with state and due dates |
| `labels` | Project labels with colors | | `labels` | Project labels with colors |

View File

@@ -125,7 +125,7 @@ lore -J mrs --fields iid,title,state,draft,target_branch
### Available Fields ### Available Fields
**Issues**: `iid`, `title`, `state`, `author_username`, `labels`, `assignees`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path` **Issues**: `iid`, `title`, `state`, `author_username`, `labels`, `assignees`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at_iso`
**MRs**: `iid`, `title`, `state`, `author_username`, `labels`, `draft`, `target_branch`, `source_branch`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `reviewers` **MRs**: `iid`, `title`, `state`, `author_username`, `labels`, `draft`, `target_branch`, `source_branch`, `discussion_count`, `unresolved_count`, `created_at_iso`, `updated_at_iso`, `web_url`, `project_path`, `reviewers`

View File

@@ -0,0 +1,9 @@
ALTER TABLE issues ADD COLUMN status_name TEXT;
ALTER TABLE issues ADD COLUMN status_category TEXT;
ALTER TABLE issues ADD COLUMN status_color TEXT;
ALTER TABLE issues ADD COLUMN status_icon_name TEXT;
ALTER TABLE issues ADD COLUMN status_synced_at INTEGER;
CREATE INDEX IF NOT EXISTS idx_issues_project_status_name ON issues(project_id, status_name);
INSERT INTO schema_version (version, applied_at, description)
VALUES (21, strftime('%s', 'now') * 1000, 'Work item status columns for issues');

View File

@@ -0,0 +1,186 @@
1. **Isolate scheduled behavior from manual `sync`**
Reasoning: Your current plan injects backoff into `handle_sync_cmd`, which affects all `lore sync` calls (including manual recovery runs). Scheduled behavior should be isolated so humans arent unexpectedly blocked by service backoff.
```diff
@@ Context
-`lore sync` runs a 4-stage pipeline (issues, MRs, docs, embeddings) that takes 2-4 minutes.
+`lore sync` remains the manual/operator command.
+`lore service run` (hidden/internal) is the scheduled execution entrypoint.
@@ Commands & User Journeys
+### `lore service run` (hidden/internal)
+**What it does:** Executes one scheduled sync attempt with service-only policy:
+- applies service backoff policy
+- records service run state
+- invokes sync pipeline with configured profile
+- updates retry state on success/failure
+
+**Invocation:** scheduler always runs:
+`lore --robot service run --reason timer`
@@ Backoff Integration into `handle_sync_cmd`
-Insert **after** config load but **before** the dry_run check:
+Do not add backoff checks to `handle_sync_cmd`.
+Backoff logic lives only in `handle_service_run`.
```
2. **Use DB as source-of-truth for service state (not a standalone JSON status file)**
Reasoning: You already have `sync_runs` in SQLite. A separate JSON status file creates split-brain and race/corruption risk. Keep JSON as optional cache/export only.
```diff
@@ Status File
-Location: `{get_data_dir()}/sync-status.json`
+Primary state location: SQLite (`service_state` table) + existing `sync_runs`.
+Optional mirror file: `{get_data_dir()}/sync-status.json` (best-effort export only).
@@ File-by-File Implementation Details
-### `src/core/sync_status.rs` (NEW)
+### `migrations/015_service_state.sql` (NEW)
+CREATE TABLE service_state (
+ id INTEGER PRIMARY KEY CHECK (id = 1),
+ installed INTEGER NOT NULL DEFAULT 0,
+ platform TEXT,
+ interval_seconds INTEGER,
+ profile TEXT NOT NULL DEFAULT 'balanced',
+ consecutive_failures INTEGER NOT NULL DEFAULT 0,
+ next_retry_at_ms INTEGER,
+ last_error_code TEXT,
+ last_error_message TEXT,
+ updated_at_ms INTEGER NOT NULL
+);
+
+### `src/core/service_state.rs` (NEW)
+- read/write state row
+- derive backoff/next_retry
+- join with latest `sync_runs` for status output
```
3. **Backoff policy should be configurable, jittered, and error-aware**
Reasoning: Fixed hardcoded backoff (`base=1800`) is wrong when user sets another interval. Also permanent failures (bad token/config) should not burn retries forever; they should enter paused/error state.
```diff
@@ Backoff Logic
-// Exponential: base * 2^failures, capped at 4 hours
+// Exponential with jitter: base * 2^(failures-1), capped, ±20% jitter
+// Applies only to transient errors.
+// Permanent errors set `paused_reason` and stop retries until user action.
@@ CLI Definition Changes
+ServiceCommand::Resume, // clear paused state / failures
+ServiceCommand::Run, // hidden
@@ Error Types
+ServicePaused, // scheduler paused due to permanent error
+ServiceCommandFailed, // OS command failure with stderr context
```
4. **Add a pipeline-level single-flight lock**
Reasoning: Current locking is in ingest stages; theres still overlap risk across full sync pipelines (docs/embed can overlap with another run). Add a top-level lock for scheduled/manual sync pipeline execution.
```diff
@@ Architecture
+Add `sync_pipeline` lock at top-level sync execution.
+Keep existing ingest lock (`sync`) for ingest internals.
@@ Backoff Integration into `handle_sync_cmd`
+Before starting sync pipeline, acquire `AppLock` with:
+name = "sync_pipeline"
+stale_lock_minutes = config.sync.stale_lock_minutes
+heartbeat_interval_seconds = config.sync.heartbeat_interval_seconds
```
5. **Dont embed token in service files by default**
Reasoning: Embedding PAT into unit/plist is a high-risk secret leak path. Make secure storage explicit and default-safe.
```diff
@@ `lore service install [--interval 30m]`
+`lore service install [--interval 30m] [--token-source env-file|embedded]`
+Default: `env-file` (0600 perms, user-owned)
+`embedded` allowed only with explicit opt-in and warning
@@ Robot output
- "token_embedded": true
+ "token_source": "env_file"
@@ Human output
- Note: Your GITLAB_TOKEN is embedded in the service file.
+ Note: Token is stored in a user-private env file (0600).
```
6. **Introduce a command-runner abstraction with timeout + stderr capture**
Reasoning: `launchctl/systemctl/schtasks` calls are failure-prone; you need consistent error mapping and deterministic tests.
```diff
@@ Platform Backends
-exports free functions that dispatch via `#[cfg(target_os)]`
+exports backend + shared `CommandRunner`:
+- run(cmd, args, timeout)
+- capture stdout/stderr/exit code
+- map failure to `ServiceCommandFailed { cmd, exit_code, stderr }`
```
7. **Persist install manifest to avoid brittle file parsing**
Reasoning: Parsing timer/plist for interval/state is fragile and platform-format dependent. Persist a manifest with checksums and expected artifacts.
```diff
@@ Platform Backends
-Same pattern for ... `get_interval_seconds()`
+Add manifest: `{data_dir}/service-manifest.json`
+Stores platform, interval, profile, generated files, and command.
+`service status` reads manifest first, then verifies platform state.
@@ Acceptance criteria
+Install is idempotent:
+- if manifest+files already match, report `no_change: true`
+- if drift detected, reconcile and rewrite
```
8. **Make schedule profile explicit (`fast|balanced|full`)**
Reasoning: This makes the feature more useful and performance-tunable without requiring users to understand internal flags.
```diff
@@ `lore service install [--interval 30m]`
+`lore service install [--interval 30m] [--profile fast|balanced|full]`
+
+Profiles:
+- fast: `sync --no-docs --no-embed`
+- balanced (default): `sync --no-embed`
+- full: `sync`
```
9. **Upgrade `service status` to include scheduler health + recent run summary**
Reasoning: Single last-sync snapshot is too shallow. Include recent attempts and whether scheduler is paused/backing off/running.
```diff
@@ `lore service status`
-What it does: Shows whether the service is installed, its configuration, last sync result, and next scheduled run.
+What it does: Shows install state, scheduler state (running/backoff/paused), recent runs, and next run estimate.
@@ Robot output
- "last_sync": { ... },
- "backoff": null
+ "scheduler_state": "running|backoff|paused|idle",
+ "last_sync": { ... },
+ "recent_runs": [{"run_id":"...","status":"...","started_at_iso":"..."}],
+ "backoff": null,
+ "paused_reason": null
```
10. **Strengthen tests around determinism and cross-platform generation**
Reasoning: Time-based backoff and shell quoting are classic flaky points. Add fake clock + fake command runner for deterministic tests.
```diff
@@ Testing Strategy
+Add deterministic test seams:
+- `Clock` trait for backoff/now calculations
+- `CommandRunner` trait for backend command execution
+
+Add tests:
+- transient vs permanent error classification
+- backoff schedule with jitter bounds
+- manifest drift reconciliation
+- quoting/escaping for paths with spaces and special chars
+- `service run` does not modify manual `sync` behavior
```
If you want, I can rewrite your full plan as a single clean revised document with these changes already integrated (instead of patch fragments).

View File

@@ -0,0 +1,182 @@
**High-Impact Revisions (ordered by priority)**
1. **Make service identity project-scoped (avoid collisions across repos/users)**
Analysis: Current fixed names (`com.gitlore.sync`, `LoreSync`, `lore-sync.timer`) will collide when users run multiple gitlore workspaces. This causes silent overwrites and broken uninstall/status behavior.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Commands & User Journeys / install
- lore service install [--interval 30m] [--profile balanced] [--token-source env-file]
+ lore service install [--interval 30m] [--profile balanced] [--token-source auto] [--name <optional>]
@@ Install Manifest Schema
+ /// Stable per-install identity (default derived from project root hash)
+ pub service_id: String,
@@ Platform Backends
- Label: com.gitlore.sync
+ Label: com.gitlore.sync.{service_id}
- Task name: LoreSync
+ Task name: LoreSync-{service_id}
- ~/.config/systemd/user/lore-sync.service
+ ~/.config/systemd/user/lore-sync-{service_id}.service
```
2. **Replace token model with secure per-OS defaults**
Analysis: The current “env-file default” is not actually secure on macOS launchd (token still ends up in plist). On Windows, assumptions about inherited environment are fragile. Use OS-native secure stores by default and keep `embedded` as explicit opt-in only.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Token storage strategies
-| env-file (default) | ...
+| auto (default) | macOS: Keychain, Linux: env-file (0600), Windows: Credential Manager |
+| env-file | Linux/systemd only |
| embedded | ... explicit warning ...
@@ macOS launchd section
- env-file strategy stores canonical token in service-env but embeds token in plist
+ default strategy is Keychain lookup at runtime; no token persisted in plist
+ env-file is not offered on macOS
@@ Windows schtasks section
- token must be in user's system environment
+ default strategy stores token in Windows Credential Manager and injects at runtime
```
3. **Version and atomically persist manifest/status**
Analysis: `Option<Self>` on read hides corruption, and non-atomic writes risk truncated JSON on crashes. This will create false “not installed” and scheduler confusion.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Install Manifest Schema
+ pub schema_version: u32, // start at 1
+ pub updated_at_iso: String,
@@ Status File Schema
+ pub schema_version: u32, // start at 1
+ pub updated_at_iso: String,
@@ Read/Write
- read(path) -> Option<Self>
+ read(path) -> Result<Option<Self>, LoreError>
- write(...) -> std::io::Result<()>
+ write_atomic(...) -> std::io::Result<()> // tmp file + fsync + rename
```
4. **Persist `next_retry_at_ms` instead of recomputing jitter**
Analysis: Deterministic jitter from timestamp modulo is predictable and can herd retries. Persisting `next_retry_at_ms` at failure time makes status accurate, stable, and cheap to compute.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ SyncStatusFile
pub consecutive_failures: u32,
+ pub next_retry_at_ms: Option<i64>,
@@ Backoff Logic
- compute backoff from last_run.timestamp_ms and deterministic jitter each read
+ compute backoff once on failure, store next_retry_at_ms, read-only comparison afterward
+ jitter algorithm: full jitter in [0, cap], injectable RNG for tests
```
5. **Add circuit breaker for repeated transient failures**
Analysis: Infinite transient retries can run forever on systemic failures (DB corruption, bad network policy). After N transient failures, pause with actionable reason.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Scheduler states
- backoff — transient failures, waiting to retry
+ backoff — transient failures, waiting to retry
+ paused — permanent error OR circuit breaker tripped after N transient failures
@@ Service run flow
- On transient failure: increment failures, compute backoff
+ On transient failure: increment failures, compute backoff, if failures >= max_transient_failures -> pause
```
6. **Stage-aware outcome policy (core freshness over all-or-nothing)**
Analysis: Failing embeddings/docs should not block issues/MRs freshness. Split stage outcomes and only treat core stages as hard-fail by default. This improves reliability and practical usefulness.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Context
- lore sync runs a 4-stage pipeline ... treated as one run result
+ lore service run records per-stage outcomes (issues, mrs, docs, embeddings)
@@ Status File Schema
+ pub stage_results: Vec<StageResult>,
@@ service run flow
- Execute sync pipeline with flags derived from profile
+ Execute stage-by-stage and classify severity:
+ - critical: issues, mrs
+ - optional: docs, embeddings
+ optional stage failures mark run as degraded, not failed
```
7. **Replace cfg free-function backend with trait-based backend**
Analysis: Current backend API is hard to test end-to-end without real OS commands. A `SchedulerBackend` trait enables deterministic integration tests and cleaner architecture.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Platform Backends / Architecture
- exports free functions dispatched via #[cfg]
+ define trait SchedulerBackend { install, uninstall, state, file_paths, next_run }
+ provide LaunchdBackend, SystemdBackend, SchtasksBackend implementations
+ include FakeBackend for integration tests
```
8. **Harden platform units and detect scheduler prerequisites**
Analysis: systemd user timers often fail silently without user manager/linger; launchd context can be wrong in headless sessions. Add explicit diagnostics and unit hardening.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Linux systemd unit
[Service]
Type=oneshot
ExecStart=...
+TimeoutStartSec=900
+NoNewPrivileges=true
+PrivateTmp=true
+ProtectSystem=strict
+ProtectHome=read-only
@@ Linux install/status
+ detect user manager availability and linger state; surface warning/action
@@ macOS install/status
+ detect non-GUI bootstrap context and return actionable error
```
9. **Add operational commands: `trigger`, `doctor`, and non-interactive log tail**
Analysis: `logs` opening an editor is weak for automation and incident response. Operators need a preflight and immediate controlled run.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ ServiceCommand
+ Trigger, // run one attempt through service policy now
+ Doctor, // validate scheduler, token, paths, permissions
@@ logs
- opens editor
+ supports --tail <n> and --follow in human mode
+ robot mode can return last_n lines optionally
```
10. **Fix plan inconsistencies and edge-case correctness**
Analysis: There are internal mismatches that will cause implementation drift.
Diff:
```diff
--- a/plan.md
+++ b/plan.md
@@ Interval Parsing
- supports 's' suffix
+ remove 's' suffix (acceptance only allows 5m..24h)
@@ uninstall acceptance
- removes ALL service files only
+ explicitly also remove service-manifest and service-env (status/logs retained)
@@ SyncStatusFile schema
- pub last_run: SyncRunRecord
+ pub last_run: Option<SyncRunRecord> // matches idle/no runs state
```
---
**Recommended Architecture Upgrade Summary**
The strongest improvement set is: **(1) project-scoped IDs, (2) secure token defaults, (3) atomic/versioned state, (4) persisted retry schedule + circuit breaker, (5) stage-aware outcomes**. That combination materially improves correctness, multi-repo safety, security, operability, and real-world reliability without changing your core manual-vs-scheduled separation principle.

View File

@@ -0,0 +1,174 @@
Below are the highest-impact revisions Id make, ordered by severity/ROI. These focus on correctness first, then security, then operability and UX.
1. **Fix multi-install ambiguity (`service_id` exists, but commands cant target one explicitly)**
Analysis: The plan introduces `service-manifest-{service_id}.json`, but `status/uninstall/resume/logs` have no selector. In a multi-workspace or multi-name install scenario, behavior becomes ambiguous and error-prone. Add explicit targeting plus discovery.
```diff
@@ ## Commands & User Journeys
+### `lore service list`
+Lists installed services discovered from `{data_dir}/service-manifest-*.json`.
+Robot output includes `service_id`, `platform`, `interval_seconds`, `profile`, `installed_at_iso`.
@@ ### `lore service uninstall`
-### `lore service uninstall`
+### `lore service uninstall [--service <service_id|name>] [--all]`
@@
-2. CLI reads install manifest to find `service_id`
+2. CLI resolves target service via `--service` or current-project-derived default.
+3. If multiple candidates and no selector, return actionable error.
@@ ### `lore service status`
-### `lore service status`
+### `lore service status [--service <service_id|name>]`
```
2. **Make status state service-scoped (not global)**
Analysis: A single `sync-status.json` for all services causes cross-service contamination (pause/backoff/outcome from one profile affecting another). Keep lock global, but state per service.
```diff
@@ ## Status File
-### Location
-`{get_data_dir()}/sync-status.json`
+### Location
+`{get_data_dir()}/sync-status-{service_id}.json`
@@ ## Paths Module Additions
-pub fn get_service_status_path() -> PathBuf {
- get_data_dir().join("sync-status.json")
+pub fn get_service_status_path(service_id: &str) -> PathBuf {
+ get_data_dir().join(format!("sync-status-{service_id}.json"))
}
@@
-Note: `sync-status.json` is NOT scoped by `service_id`
+Note: status is scoped by `service_id`; lock remains global (`sync_pipeline`) to prevent overlapping writes.
```
3. **Stop classifying permanence via string matching**
Analysis: Matching `"401 Unauthorized"` in strings is brittle and will misclassify edge cases. Carry machine codes through stage results and classify by `ErrorCode` only.
```diff
@@ pub struct StageResult {
- pub error: Option<String>,
+ pub error: Option<String>,
+ pub error_code: Option<String>, // e.g., AUTH_FAILED, NETWORK_ERROR
}
@@ Error classification helpers
-fn is_permanent_error_message(msg: Option<&str>) -> bool { ...string contains... }
+fn is_permanent_error_code(code: Option<&str>) -> bool {
+ matches!(code, Some("TOKEN_NOT_SET" | "AUTH_FAILED" | "CONFIG_NOT_FOUND" | "CONFIG_INVALID" | "MIGRATION_FAILED"))
+}
```
4. **Install should be transactional (manifest written last)**
Analysis: Current order writes manifest before scheduler enable. If enable fails, you persist a false “installed” state. Use two-phase install with rollback.
```diff
@@ ### `lore service install` User journey
-9. CLI writes install manifest ...
-10. CLI runs the platform-specific enable command
+9. CLI runs the platform-specific enable command
+10. On success, CLI writes install manifest atomically
+11. On failure, CLI removes generated files and returns `ServiceCommandFailed`
```
5. **Fix launchd token security gap (env-file currently still embeds token)**
Analysis: Current “env-file” on macOS still writes token into plist, defeating the main security goal. Generate a private wrapper script that reads env file at runtime and execs `lore`.
```diff
@@ ### macOS: launchd
-<key>ProgramArguments</key>
-<array>
- <string>{binary_path}</string>
- <string>--robot</string>
- <string>service</string>
- <string>run</string>
-</array>
+<key>ProgramArguments</key>
+<array>
+ <string>{data_dir}/service-run-{service_id}.sh</string>
+</array>
@@
-`env-file`: ... token value must still appear in plist ...
+`env-file`: token never appears in plist; wrapper loads `{data_dir}/service-env-{service_id}` at runtime.
```
6. **Improve backoff math and add half-open circuit recovery**
Analysis: Current jitter + min clamp makes first retry deterministic and can over-pause. Also circuit-breaker requires manual resume forever. Add cooldown + half-open probe to self-heal.
```diff
@@ Backoff Logic
-let backoff_secs = ((base_backoff as f64) * jitter_factor) as u64;
-let backoff_secs = backoff_secs.max(base_interval_seconds);
+let max_backoff = base_backoff;
+let min_backoff = base_interval_seconds;
+let span = max_backoff.saturating_sub(min_backoff);
+let backoff_secs = min_backoff + ((span as f64) * jitter_factor) as u64;
@@ Scheduler states
-- `paused` — permanent error ... OR circuit breaker tripped ...
+- `paused` — permanent error requiring intervention
+- `half_open` — probe state after circuit cooldown; one trial run allowed
@@ Circuit breaker
-... transitions to `paused` ... Run: lore service resume
+... transitions to `half_open` after cooldown (default 30m). Successful probe closes breaker automatically; failed probe returns to backoff/paused.
```
7. **Promote backend trait to v1 (not v2) for deterministic integration tests**
Analysis: This is a reliability-critical feature spanning OS schedulers. A trait abstraction now gives true behavior tests and safer refactors.
```diff
@@ ### Platform Backends
-> Future architecture note: A `SchedulerBackend` trait ... for v2.
+Adopt `SchedulerBackend` trait in v1 with real backends (`launchd/systemd/schtasks`) and `FakeBackend` for tests.
+This enables deterministic install/uninstall/status/run-path integration tests without touching host scheduler.
```
8. **Harden `run_cmd` timeout behavior**
Analysis: If timeout occurs, child process must be killed and reaped. Otherwise you leak processes and can wedge repeated runs.
```diff
@@ fn run_cmd(...)
-// Wait with timeout
-let output = wait_with_timeout(output, timeout_secs)?;
+// Wait with timeout; on timeout kill child and wait to reap
+let output = wait_with_timeout_kill_and_reap(child, timeout_secs)?;
```
9. **Add manual control commands (`pause`, `trigger`, `repair`)**
Analysis: These are high-utility operational controls. `trigger` helps immediate sync without waiting interval. `pause` supports maintenance windows. `repair` avoids manual file deletion for corrupt state.
```diff
@@ pub enum ServiceCommand {
+ /// Pause scheduled execution without uninstalling
+ Pause { #[arg(long)] reason: Option<String> },
+ /// Trigger an immediate one-off run using installed profile
+ Trigger { #[arg(long)] ignore_backoff: bool },
+ /// Repair corrupt manifest/status by backing up and reinitializing
+ Repair { #[arg(long)] service: Option<String> },
}
```
10. **Make `logs` default non-interactive and add rotation policy**
Analysis: Opening editor by default is awkward for automation/SSH and slower for normal diagnosis. Defaulting to `tail` is more practical; `--open` can preserve editor behavior.
```diff
@@ ### `lore service logs`
-By default, opens in the user's preferred editor.
+By default, prints last 100 lines to stdout.
+Use `--open` to open editor.
@@
+Log rotation: rotate `service-stdout.log` / `service-stderr.log` at 10 MB, keep 5 files.
```
11. **Remove destructive/shell-unsafe suggested action**
Analysis: `actions(): ["rm {path}", ...]` is unsafe (shell injection + destructive guidance). Replace with safe command path.
```diff
@@ LoreError::actions()
-Self::ServiceCorruptState { path, .. } => vec![&format!("rm {path}"), "lore service install"],
+Self::ServiceCorruptState { .. } => vec!["lore service repair", "lore service install"],
```
12. **Tighten scheduler units for real-world reliability**
Analysis: Add explicit working directory and success-exit handling to reduce environment drift and edge failures.
```diff
@@ systemd service unit
[Service]
Type=oneshot
ExecStart={binary_path} --robot service run
+WorkingDirectory={data_dir}
+SuccessExitStatus=0
TimeoutStartSec=900
```
If you want, I can produce a single consolidated “v3 plan” markdown with these revisions already merged into your original structure.

View File

@@ -0,0 +1,190 @@
No `## Rejected Recommendations` section was present in the plan you shared, so the proposals below are all net-new.
1. **Make scheduled runs explicitly target a single service instance**
Analysis: right now `service run` has no selector, but the plan supports multiple installed services. That creates ambiguity and incorrect manifest/status selection. This is the most important architectural fix.
```diff
@@ `lore service install` What it does
- runs `lore --robot service run` at the specified interval
+ runs `lore --robot service run --service-id <service_id>` at the specified interval
@@ Robot output (`install`)
- "sync_command": "/usr/local/bin/lore --robot service run",
+ "sync_command": "/usr/local/bin/lore --robot service run --service-id a1b2c3d4",
@@ `ServiceCommand` enum
- #[command(hide = true)]
- Run,
+ #[command(hide = true)]
+ Run {
+ /// Internal selector injected by scheduler backend
+ #[arg(long, hide = true)]
+ service_id: String,
+ },
@@ `handle_service_run` signature
-pub fn handle_service_run(start: std::time::Instant) -> Result<(), Box<dyn std::error::Error>>
+pub fn handle_service_run(service_id: &str, start: std::time::Instant) -> Result<(), Box<dyn std::error::Error>>
@@ run flow step 1
- Read install manifest
+ Read install manifest for `service_id`
```
2. **Strengthen `service_id` derivation to avoid cross-workspace collisions**
Analysis: hashing config path alone can collide when many workspaces share one global config. Identity should represent what is being synced, not only where config lives.
```diff
@@ Key Design Principles / Project-Scoped Service Identity
- derive from a stable hash of the config file path
+ derive from a stable fingerprint of:
+ - canonical workspace root
+ - normalized configured GitLab project URLs
+ - canonical config path
+ then take first 12 hex chars of SHA-256
@@ `compute_service_id`
- Returns first 8 hex chars of SHA-256 of the canonical config path.
+ Returns first 12 hex chars of SHA-256 of a canonical identity tuple
+ (workspace_root + sorted project URLs + config_path).
```
3. **Introduce a service-state machine with a dedicated admin lock**
Analysis: install/uninstall/pause/resume/repair/status can race each other. A lock and explicit transition table prevents invalid states and file races.
```diff
@@ New section: Service State Model
+ All state mutations are serialized by `AppLock("service-admin-{service_id}")`.
+ Legal transitions:
+ - idle -> running -> success|degraded|backoff|paused
+ - backoff -> running|paused
+ - paused -> half_open|running (resume)
+ - half_open -> running|paused
+ Any invalid transition is rejected with `ServiceCorruptState`.
@@ `handle_install`, `handle_uninstall`, `handle_pause`, `handle_resume`, `handle_repair`
+ Acquire `service-admin-{service_id}` before mutating manifest/status/service files.
```
4. **Unify manual and scheduled sync execution behind one orchestrator**
Analysis: the plan currently duplicates stage logic and error classification in `service run`, increasing drift risk. A shared orchestrator gives one authoritative pipeline behavior.
```diff
@@ Key Design Principles
+ #### 6. Single Sync Orchestrator
+ Both `lore sync` and `lore service run` call `SyncOrchestrator`.
+ Service mode adds policy (backoff/circuit-breaker); manual mode bypasses policy.
@@ Service Run Implementation
- execute_sync_stages(&sync_args)
+ SyncOrchestrator::run(SyncMode::Service { profile, policy })
@@ manual sync
- separate pipeline path
+ SyncOrchestrator::run(SyncMode::Manual { flags })
```
5. **Add bounded in-run retries for transient core-stage failures**
Analysis: single-shot failure handling will over-trigger backoff on temporary network blips. One short retry per core stage significantly improves freshness without much extra runtime.
```diff
@@ Stage-aware execution
+ Core stages (`issues`, `mrs`) get up to 1 immediate retry on transient errors
+ (jittered 1-5s). Permanent errors are never retried.
+ Optional stages keep best-effort semantics.
@@ Acceptance criteria (`service run`)
+ Retries transient core stage failures once before counting run as failed.
```
6. **Harden persistence with full crash-safety semantics**
Analysis: current atomic write description is good but incomplete for power-loss durability. You should fsync parent directory after rename and include lightweight integrity metadata.
```diff
@@ `write_atomic`
- tmp file + fsync + rename
+ tmp file + fsync(file) + rename + fsync(parent_dir)
@@ `ServiceManifest` and `SyncStatusFile`
+ pub write_seq: u64,
+ pub content_sha256: String, // optional integrity guard for repair/doctor
```
7. **Fix token handling to avoid shell/env injection and add secure-store mode**
Analysis: sourcing env files in shell is brittle if token contains special chars/newlines. Also, secure OS credential stores should be first-class for production reliability/security.
```diff
@@ Token storage strategies
-| `env-file` (default) ...
+| `auto` (default) | use secure-store when available, else env-file |
+| `secure-store` | macOS Keychain / libsecret / Windows Credential Manager |
+| `env-file` | explicit fallback |
@@ macOS wrapper script
-. "{data_dir}/service-env-{service_id}"
-export {token_env_var}
+TOKEN_VALUE="$(cat "{data_dir}/service-token-{service_id}" )"
+export {token_env_var}="$TOKEN_VALUE"
@@ Acceptance criteria
+ Reject token values containing `\0` or newline for env-file mode.
+ Never eval/source untrusted token content.
```
8. **Correct platform/runtime implementation hazards**
Analysis: there are a few correctness risks that should be fixed in-plan now.
```diff
@@ macOS install steps
- Get UID via `unsafe { libc::getuid() }`
+ Get UID via safe API (`nix::unistd::Uid::current()` or equivalent safe helper)
@@ Command Runner Helper
- poll try_wait and read stdout/stderr after exit
+ avoid potential pipe backpressure deadlock:
+ use wait-with-timeout + concurrent stdout/stderr draining
@@ Linux timer
- OnUnitActiveSec={interval_seconds}s
+ OnUnitInactiveSec={interval_seconds}s
+ AccuracySec=1min
```
9. **Make logs fully service-scoped**
Analysis: you already scoped manifest/status by `service_id`; logs are still global in several places. Multi-service installs will overwrite each others logs.
```diff
@@ Paths Module Additions
-pub fn get_service_log_path() -> PathBuf
+pub fn get_service_log_path(service_id: &str, stream: LogStream) -> PathBuf
@@ log filenames
- logs/service-stderr.log
- logs/service-stdout.log
+ logs/service-{service_id}-stderr.log
+ logs/service-{service_id}-stdout.log
@@ `service logs`
- default path: `{data_dir}/logs/service-stderr.log`
+ default path: `{data_dir}/logs/service-{service_id}-stderr.log`
```
10. **Resolve internal spec contradictions and rollback gaps**
Analysis: there are a few contradictory statements and incomplete rollback behavior that will cause implementation churn.
```diff
@@ `service logs` behavior
- default (no flags): open in editor (human)
+ default (no flags): print last 100 lines (human and robot metadata mode)
+ `--open` is explicit opt-in
@@ install rollback
- On failure: removes generated service files
+ On failure: removes generated service files, env file, wrapper script, and temp manifest
@@ `handle_service_run` sample code
- let manifest_path = get_service_manifest_path();
+ let manifest_path = get_service_manifest_path(service_id);
```
If you want, I can take these revisions and produce a single consolidated “Iteration 4” replacement plan block with all sections rewritten coherently so its ready to hand to an implementer.

View File

@@ -0,0 +1,196 @@
I reviewed the full plan and avoided everything already listed in `## Rejected Recommendations`. These are the highest-impact revisions Id make.
1. **Fix identity model inconsistency and prevent `--name` alias collisions**
Why this improves the plan: your text says identity includes workspace root, but the current derivation code does not. Also, using `--name` as the actual `service_id` risks accidental cross-project collisions and destructive updates.
```diff
--- a/plan.md
+++ b/plan.md
@@ Key Design Principles / 2. Project-Scoped Service Identity
- Each installed service gets a unique `service_id` derived from a canonical identity tuple: the config file path, sorted GitLab project URLs, and workspace root.
+ Each installed service gets an immutable `identity_hash` derived from a canonical identity tuple:
+ workspace root + canonical config path + sorted normalized project URLs.
+ `service_id` remains the scheduler identifier; `--name` is a human alias only.
+ If `--name` collides with an existing service that has a different `identity_hash`, install fails with an actionable error.
@@ Install Manifest / Schema
+ /// Immutable identity hash for collision-safe matching across reinstalls
+ pub identity_hash: String,
+ /// Optional human-readable alias passed via --name
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub service_alias: Option<String>,
+ /// Canonical workspace root used in identity derivation
+ pub workspace_root: String,
@@ service_id derivation
-pub fn compute_service_id(config_path: &Path, project_urls: &[&str]) -> String
+pub fn compute_identity_hash(workspace_root: &Path, config_path: &Path, project_urls: &[&str]) -> String
```
2. **Add lock protocol to eliminate uninstall/run race conditions**
Why this improves the plan: today `service run` does not take admin lock, and admin commands do not take pipeline lock. `uninstall` can race with an active run and remove files mid-execution.
```diff
--- a/plan.md
+++ b/plan.md
@@ Key Design Principles / 6. Serialized Admin Mutations
- The `service run` entrypoint does NOT acquire the admin lock — it only acquires the `sync_pipeline` lock
+ The `service run` entrypoint acquires only `sync_pipeline`.
+ Destructive admin operations (`install` overwrite, `uninstall`, `repair --regenerate`) must:
+ 1) acquire `service-admin-{service_id}`
+ 2) disable scheduler backend entrypoint
+ 3) acquire `sync_pipeline` lock with timeout
+ 4) mutate/remove files
+ This lock ordering is mandatory to prevent deadlocks and run/delete races.
@@ lore service uninstall / User journey
- 4. Runs platform-specific disable command
- 5. Removes service files from disk
+ 4. Acquires `sync_pipeline` lock (after disabling scheduler) with bounded wait
+ 5. Removes service files from disk only after lock acquisition
```
3. **Make transient handling `Retry-After` aware**
Why this improves the plan: rate-limit and 503 responses often carry retry hints. Ignoring them causes useless retries and longer outages.
```diff
--- a/plan.md
+++ b/plan.md
@@ Transient vs permanent error classification
-| Transient | Retry with backoff | Network timeout, rate limited, DB locked, 5xx from GitLab |
+| Transient | Retry with adaptive backoff | Network timeout, DB locked, 5xx from GitLab |
+| Transient (hinted) | Respect server retry hint | Rate limited with Retry-After/X-RateLimit-Reset |
@@ Backoff Logic
+ If an error includes a retry hint (e.g., `Retry-After`), set:
+ `next_retry_at_ms = max(computed_backoff, hinted_retry_at_ms)`.
+ Persist `backoff_reason` for status visibility.
```
4. **Decouple optional stage cadence from core sync interval**
Why this improves the plan: running docs/embeddings every 530 minutes is expensive and unnecessary. Separate freshness windows reduce cost/latency while keeping core data fresh.
```diff
--- a/plan.md
+++ b/plan.md
@@ Sync profiles
-| `balanced` (default) | `--no-embed` | Issues + MRs + doc generation |
-| `full` | (none) | Full pipeline including embeddings |
+| `balanced` (default) | core every interval, docs every 60m, no embeddings | Fast + useful docs |
+| `full` | core every interval, docs every interval, embeddings every 6h (default) | Full freshness with bounded cost |
@@ Status File / StageResult
+ /// true when stage intentionally skipped due freshness window
+ #[serde(default)]
+ pub skipped: bool,
@@ lore service run / Stage-aware execution
+ Optional stages may be skipped when their last successful run is within configured freshness windows.
+ Skipped optional stages do not count as failures and are recorded explicitly.
```
5. **Give Windows parity for secure token handling (env-file + wrapper)**
Why this improves the plan: current Windows path requires global/system env and has poor UX. A wrapper+env-file model gives platform parity and avoids global token exposure.
```diff
--- a/plan.md
+++ b/plan.md
@@ Token storage strategies
-| On Windows, neither strategy applies — the token must be in the user's system environment
+| On Windows, `env-file` is supported via a generated wrapper script (`service-run-{service_id}.cmd` or `.ps1`)
+| that reads `{data_dir}/service-env-{service_id}` and launches `lore --robot service run ...`.
+| `embedded` remains opt-in and warned as less secure.
@@ Windows: schtasks
- Token handling on Windows: The env var must be set system-wide via `setx`
+ Token handling on Windows:
+ - `env-file` (default): wrapper script reads token from private file at runtime
+ - `embedded`: token passed via wrapper-set environment variable
+ - `system_env`: still supported as fallback
```
6. **Add run heartbeat and stale-run detection**
Why this improves the plan: `running` state can become misleading after crashes or stale locks. Heartbeat metadata makes status accurate and improves incident triage.
```diff
--- a/plan.md
+++ b/plan.md
@@ Status File / Schema
+ /// In-flight run metadata for crash/stale detection
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub current_run: Option<CurrentRunState>,
+
+pub struct CurrentRunState {
+ pub run_id: String,
+ pub started_at_ms: i64,
+ pub last_heartbeat_ms: i64,
+ pub pid: u32,
+}
@@ lore service status
- - `running` — currently executing (sync_pipeline lock held)
+ - `running` — currently executing with live heartbeat
+ - `running_stale` — in-flight metadata exists but heartbeat exceeded stale threshold
```
7. **Upgrade drift detection from “loaded/unloaded” to spec-level drift**
Why this improves the plan: platform state alone misses manual edits to unit/plist/wrapper files. Spec-hash drift gives reliable “what changed?” diagnostics and safe regeneration.
```diff
--- a/plan.md
+++ b/plan.md
@@ Install Manifest / Schema
+ /// Hash of generated scheduler artifacts and command spec
+ pub spec_hash: String,
@@ lore service status
- Detects manifest/platform drift and reports it
+ Detects:
+ - platform drift (loaded/unloaded mismatch)
+ - spec drift (artifact content hash mismatch)
+ - command drift (sync command differs from manifest)
@@ lore service repair
+ Add `--regenerate` to rewrite scheduler artifacts from manifest when spec drift is detected.
+ This is non-destructive and does not delete status/log history.
```
8. **Add safe operational modes: `install --dry-run` and `doctor --fix`**
Why this improves the plan: dry-run reduces risk before writing OS scheduler files; fix-mode improves operator ergonomics and lowers support burden.
```diff
--- a/plan.md
+++ b/plan.md
@@ lore service install
+ Add `--dry-run`:
+ - validates config/token/prereqs
+ - renders service files and planned commands
+ - writes nothing, executes nothing
@@ lore service doctor
+ Add `--fix` for safe, non-destructive remediations:
+ - create missing dirs
+ - correct file permissions on env/wrapper files
+ - run `systemctl --user daemon-reload` when applicable
+ - report applied fixes in robot output
```
9. **Define explicit schema migration behavior (not just `schema_version` fields)**
Why this improves the plan: version fields without migration policy become operational risk during upgrades.
```diff
--- a/plan.md
+++ b/plan.md
@@ ServiceManifest Read/Write
- `ServiceManifest::read(path: &Path) -> Result<Option<Self>, LoreError>`
+ `ServiceManifest::read_and_migrate(path: &Path) -> Result<Option<Self>, LoreError>`
+ - Migrates known older schema versions to current in-memory model
+ - Rewrites migrated file atomically
+ - Fails with actionable `ServiceCorruptState` for unknown future major versions
@@ SyncStatusFile Read/Write
- `SyncStatusFile::read(path: &Path) -> Result<Option<Self>, LoreError>`
+ `SyncStatusFile::read_and_migrate(path: &Path) -> Result<Option<Self>, LoreError>`
```
If you want, I can produce a fully rewritten v5 plan text that integrates all nine changes coherently section-by-section.

3759
plans/lore-service.md Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,128 @@
**Best Revisions To Strengthen The Plan**
1. **[Critical] Replace one-hop rename matching with canonical path identities**
Analysis and rationale: Current `old_path OR new_path` fixes direct renames, but it still breaks on rename chains (`a.rs -> b.rs -> c.rs`) and split/move patterns. A canonical `path_identity` graph built from `mr_file_changes(old_path,new_path)` gives stable identity over time, which is the right architectural boundary for expertise history.
```diff
@@ ## Context
-- Match both old and new paths in all signal queries AND path resolution probes so expertise survives file renames
+- Build canonical path identities from rename edges and score by identity, not raw path strings, so expertise survives multi-hop renames and moves
@@ ## Files to Modify
-2. **`src/cli/commands/who.rs`** — Core changes:
+2. **`src/cli/commands/who.rs`** — Core changes:
...
- - Match both `new_path` and `old_path` in all signal queries (rename awareness)
+ - Resolve queried paths to `path_identity_id` and match all aliases in that identity set
+4. **`src/core/path_identity.rs`** — New module:
+ - Build/maintain rename graph from `mr_file_changes`
+ - Resolve path -> identity + aliases for probes/scoring
```
2. **[Critical] Shift scoring input from runtime CTE joins to a normalized `expertise_events` table**
Analysis and rationale: Your SQL is correct but complex and expensive at query time. Precomputing normalized events at ingestion gives simpler, faster, and more reliable scoring queries; it also enables model versioning/backfills without touching raw MR/note tables each request.
```diff
@@ ## Files to Modify
-3. **`src/core/db.rs`** — Add migration for indexes supporting the new query shapes
+3. **`src/core/db.rs`** — Add migrations for:
+ - `expertise_events` table (normalized scoring events)
+ - supporting indexes
+4. **`src/core/ingest/expertise_events.rs`** — New:
+ - Incremental upsert of events during sync/ingest
@@ ## SQL Restructure (who.rs)
-The SQL uses CTE-based dual-path matching and hybrid aggregation...
+Runtime SQL reads precomputed `expertise_events` filtered by path identity + time window.
+Heavy joins/aggregation move to ingest-time normalization.
```
3. **[High] Upgrade reviewer engagement model beyond char-count threshold**
Analysis and rationale: `min_note_chars` is a useful guardrail but brittle (easy to game, penalizes concise high-quality comments). Add explicit review-state signals (`approved`, `changes_requested`) and trivial-comment pattern filtering to better capture real reviewer expertise.
```diff
@@ ## Scoring Formula
-| **Reviewer Participated** (left DiffNote on MR/path) | 10 | 90 days |
+| **Reviewer Participated** (substantive DiffNote and/or formal review action) | 10 | 90 days |
+| **Review Decision: changes_requested** | 6 | 120 days |
+| **Review Decision: approved** | 4 | 75 days |
@@ ### 1. ScoringConfig (config.rs)
pub reviewer_min_note_chars: u32,
+ pub reviewer_trivial_note_patterns: Vec<String>, // default: ["lgtm","+1","nit","ship it","👍"]
+ pub review_approved_weight: i64, // default: 4
+ pub review_changes_requested_weight: i64, // default: 6
```
4. **[High] Make temporal semantics explicit and deterministic**
Analysis and rationale: `--as-of` is good, but day parsing and boundary semantics can still cause subtle reproducibility issues. Define window as `[since_ms, as_of_ms)` and parse `YYYY-MM-DD` as end-of-day UTC (or explicit timezone) so user expectations match outputs.
```diff
@@ ### 5a. Reproducible Scoring via `--as-of`
-- All event selection is bounded by `[since_ms, as_of_ms]`
+- All event selection is bounded by `[since_ms, as_of_ms)` (exclusive upper bound)
+- `YYYY-MM-DD` is interpreted as `23:59:59.999Z` unless `--timezone` is provided
+- Robot output includes `window_start_iso`, `window_end_iso`, `window_end_exclusive: true`
```
5. **[High] Replace fixed default `--since 24m` with contribution-floor auto cutoff**
Analysis and rationale: A static window is simple but often over-scans data. Compute a model-derived horizon from a minimum contribution floor (for example `0.01` points) per signal; this keeps results equivalent while reducing query cost.
```diff
@@ ### 5. Default --since Change
-Expert mode: `"6m"` -> `"24m"`
+Expert mode default: `--since auto`
+`auto` computes earliest relevant timestamp from configured weights/half-lives and `min_contribution_floor`
+Add config: `min_contribution_floor` (default: 0.01)
+`--since` still overrides, `--all-history` still bypasses cutoff
```
6. **[High] Add bot/service-account filtering now (not later)**
Analysis and rationale: Bot activity can materially distort expertise rankings in real repos. This is low implementation cost with high quality gain and should be in v1 of the scoring revamp, not deferred.
```diff
@@ ### 1. ScoringConfig (config.rs)
+ pub excluded_username_patterns: Vec<String>, // default: ["bot","\\[bot\\]","service-account","ci-"]
@@ ### 2. SQL Restructure (who.rs)
+Apply username exclusion in all signal sources unless `--include-bots` is set
@@ ### 5b. Score Explainability via `--explain-score`
+Add `filtered_events` counts in robot output metadata
```
7. **[Medium] Enforce deterministic floating-point accumulation**
Analysis and rationale: Even with small sets, unordered `HashMap` iteration can cause tiny platform-dependent ranking differences near ties. Sorting contributions and using Neumaier summation removes nondeterminism and stabilizes tests/CI.
```diff
@@ ### 4. Rust-Side Aggregation (who.rs)
-Compute score as `f64`.
+Compute score as `f64` using deterministic contribution ordering:
+1) sort by (username, signal, mr_id, ts)
+2) sum with Neumaier compensation
+Tie-break remains `(raw_score DESC, last_seen DESC, username ASC)`
```
8. **[Medium] Strengthen explainability with evidence, not just totals**
Analysis and rationale: Component totals help, but disputes usually need “why this user got this score now.” Add compact top evidence rows per component (`mr_id`, `ts`, `raw_contribution`) behind an optional mode.
```diff
@@ ### 5b. Score Explainability via `--explain-score`
-Component breakdown only (4 floats per user).
+Add `--explain-score=summary|full`:
+`summary`: current 4-component totals
+`full`: adds top N evidence rows per component (default N=3)
+Robot output includes per-evidence `mr_id`, `signal`, `ts`, `contribution`
```
9. **[Medium] Make query plan strategy explicit: `UNION ALL` default for dual-path scans**
Analysis and rationale: You currently treat `UNION ALL` as fallback if planner regresses. For SQLite, OR-across-indexed-columns regressions are common enough that defaulting to branch-split queries is often more predictable.
```diff
@@ **Index optimization fallback (UNION ALL split)**
-Start with the simpler `OR` approach and only switch to `UNION ALL` if query plans confirm degradation.
+Use `UNION ALL` + dedup as default for dual-path matching.
+Keep `OR` variant as optional strategy flag for benchmarking/regression checks.
```
10. **[Medium] Add explicit performance SLO + benchmark gate**
Analysis and rationale: This plan is query-heavy and ranking-critical; add measurable performance budgets so future edits do not silently degrade UX. Include synthetic fixture benchmarks for exact, prefix, and suffix path modes.
```diff
@@ ## Verification
+8. Performance regression gate:
+ - `cargo bench --bench who_expert_scoring`
+ - Dataset tiers: 100k, 1M, 5M notes
+ - SLOs: p95 exact path < 150ms, prefix < 250ms, suffix < 400ms on reference hardware
+ - Fail CI if regression > 20% vs stored baseline
```
If you want, I can produce a single consolidated “iteration 5” plan document with these changes already merged into your current structure.

View File

@@ -2,12 +2,12 @@
plan: true plan: true
title: "" title: ""
status: iterating status: iterating
iteration: 4 iteration: 5
target_iterations: 8 target_iterations: 8
beads_revision: 0 beads_revision: 1
related_plans: [] related_plans: []
created: 2026-02-08 created: 2026-02-08
updated: 2026-02-08 updated: 2026-02-09
--- ---
# Time-Decay Expert Scoring Model # Time-Decay Expert Scoring Model
@@ -78,6 +78,7 @@ Author/reviewer signals are deduplicated per MR (one signal per distinct MR). No
- Change default `--since` from `"6m"` to `"24m"` (2 years captures all meaningful decayed signals) - Change default `--since` from `"6m"` to `"24m"` (2 years captures all meaningful decayed signals)
- Add `--as-of` flag for reproducible scoring at a fixed timestamp - Add `--as-of` flag for reproducible scoring at a fixed timestamp
- Add `--explain-score` flag for per-user score component breakdown - Add `--explain-score` flag for per-user score component breakdown
- Add `--include-bots` flag to disable bot/service-account filtering
- Sort on raw f64 score, round only for display - Sort on raw f64 score, round only for display
- Update tests - Update tests
3. **`src/core/db.rs`** — Add migration for indexes supporting the new query shapes (dual-path matching, reviewer participation CTE, path resolution probes) 3. **`src/core/db.rs`** — Add migration for indexes supporting the new query shapes (dual-path matching, reviewer participation CTE, path resolution probes)
@@ -100,6 +101,7 @@ pub struct ScoringConfig {
pub note_half_life_days: u32, // default: 45 pub note_half_life_days: u32, // default: 45
pub closed_mr_multiplier: f64, // default: 0.5 (applied to closed-without-merge MRs) pub closed_mr_multiplier: f64, // default: 0.5 (applied to closed-without-merge MRs)
pub reviewer_min_note_chars: u32, // default: 20 (minimum note body length to count as participation) pub reviewer_min_note_chars: u32, // default: 20 (minimum note body length to count as participation)
pub excluded_usernames: Vec<String>, // default: [] (exact-match usernames to exclude, e.g. ["renovate-bot", "gitlab-ci"])
} }
``` ```
@@ -108,6 +110,7 @@ pub struct ScoringConfig {
- All `*_weight` / `*_bonus` must be >= 0 (negative weights produce nonsensical scores) - All `*_weight` / `*_bonus` must be >= 0 (negative weights produce nonsensical scores)
- `closed_mr_multiplier` must be in `(0.0, 1.0]` (0 would discard closed MRs entirely; >1 would over-weight them) - `closed_mr_multiplier` must be in `(0.0, 1.0]` (0 would discard closed MRs entirely; >1 would over-weight them)
- `reviewer_min_note_chars` must be >= 0 (0 disables the filter; typical useful values: 10-50) - `reviewer_min_note_chars` must be >= 0 (0 disables the filter; typical useful values: 10-50)
- `excluded_usernames` entries must be non-empty strings (no blank entries)
- Return `LoreError::ConfigInvalid` with a clear message on failure - Return `LoreError::ConfigInvalid` with a clear message on failure
### 2. Decay Function (who.rs) ### 2. Decay Function (who.rs)
@@ -128,25 +131,51 @@ The SQL uses **CTE-based dual-path matching** and **hybrid aggregation**. Rather
MR-level signals return one row per (username, signal, mr_id) with a timestamp; note signals return one row per (username, mr_id) with `note_count` and `max_ts`. This keeps row counts bounded (dozens to low hundreds per path) while giving Rust the data it needs for decay and `log2(1+count)`. MR-level signals return one row per (username, signal, mr_id) with a timestamp; note signals return one row per (username, mr_id) with `note_count` and `max_ts`. This keeps row counts bounded (dozens to low hundreds per path) while giving Rust the data it needs for decay and `log2(1+count)`.
```sql ```sql
WITH matched_notes AS ( WITH matched_notes_raw AS (
-- Centralize dual-path matching for DiffNotes -- Branch 1: match on new_path (uses idx_notes_new_path or equivalent)
SELECT n.id, n.discussion_id, n.author_username, n.created_at, SELECT n.id, n.discussion_id, n.author_username, n.created_at, n.project_id
n.position_new_path, n.position_old_path, n.project_id
FROM notes n FROM notes n
WHERE n.note_type = 'DiffNote' WHERE n.note_type = 'DiffNote'
AND n.is_system = 0 AND n.is_system = 0
AND n.author_username IS NOT NULL AND n.author_username IS NOT NULL
AND n.created_at >= ?2 AND n.created_at >= ?2
AND n.created_at <= ?4 AND n.created_at < ?4
AND (?3 IS NULL OR n.project_id = ?3) AND (?3 IS NULL OR n.project_id = ?3)
AND (n.position_new_path {path_op} OR n.position_old_path {path_op}) AND n.position_new_path {path_op}
UNION ALL
-- Branch 2: match on old_path (uses idx_notes_old_path_author)
SELECT n.id, n.discussion_id, n.author_username, n.created_at, n.project_id
FROM notes n
WHERE n.note_type = 'DiffNote'
AND n.is_system = 0
AND n.author_username IS NOT NULL
AND n.created_at >= ?2
AND n.created_at < ?4
AND (?3 IS NULL OR n.project_id = ?3)
AND n.position_old_path {path_op}
), ),
matched_file_changes AS ( matched_notes AS (
-- Centralize dual-path matching for file changes -- Dedup: prevent double-counting when old_path = new_path (no rename)
SELECT DISTINCT id, discussion_id, author_username, created_at, project_id
FROM matched_notes_raw
),
matched_file_changes_raw AS (
-- Branch 1: match on new_path (uses idx_mfc_new_path_project_mr)
SELECT fc.merge_request_id, fc.project_id SELECT fc.merge_request_id, fc.project_id
FROM mr_file_changes fc FROM mr_file_changes fc
WHERE (?3 IS NULL OR fc.project_id = ?3) WHERE (?3 IS NULL OR fc.project_id = ?3)
AND (fc.new_path {path_op} OR fc.old_path {path_op}) AND fc.new_path {path_op}
UNION ALL
-- Branch 2: match on old_path (uses idx_mfc_old_path_project_mr)
SELECT fc.merge_request_id, fc.project_id
FROM mr_file_changes fc
WHERE (?3 IS NULL OR fc.project_id = ?3)
AND fc.old_path {path_op}
),
matched_file_changes AS (
-- Dedup: prevent double-counting when old_path = new_path (no rename)
SELECT DISTINCT merge_request_id, project_id
FROM matched_file_changes_raw
), ),
reviewer_participation AS ( reviewer_participation AS (
-- Precompute which (mr_id, username) pairs have substantive DiffNote participation. -- Precompute which (mr_id, username) pairs have substantive DiffNote participation.
@@ -196,7 +225,7 @@ raw AS (
WHERE m.author_username IS NOT NULL WHERE m.author_username IS NOT NULL
AND m.state IN ('opened','merged','closed') AND m.state IN ('opened','merged','closed')
AND {state_aware_ts} >= ?2 AND {state_aware_ts} >= ?2
AND {state_aware_ts} <= ?4 AND {state_aware_ts} < ?4
UNION ALL UNION ALL
@@ -212,7 +241,7 @@ raw AS (
AND (m.author_username IS NULL OR r.username != m.author_username) AND (m.author_username IS NULL OR r.username != m.author_username)
AND m.state IN ('opened','merged','closed') AND m.state IN ('opened','merged','closed')
AND {state_aware_ts} >= ?2 AND {state_aware_ts} >= ?2
AND {state_aware_ts} <= ?4 AND {state_aware_ts} < ?4
UNION ALL UNION ALL
@@ -229,7 +258,7 @@ raw AS (
AND (m.author_username IS NULL OR r.username != m.author_username) AND (m.author_username IS NULL OR r.username != m.author_username)
AND m.state IN ('opened','merged','closed') AND m.state IN ('opened','merged','closed')
AND {state_aware_ts} >= ?2 AND {state_aware_ts} >= ?2
AND {state_aware_ts} <= ?4 AND {state_aware_ts} < ?4
), ),
aggregated AS ( aggregated AS (
-- MR-level signals: 1 row per (username, signal_class, mr_id) with MAX(ts) -- MR-level signals: 1 row per (username, signal_class, mr_id) with MAX(ts)
@@ -245,11 +274,11 @@ aggregated AS (
SELECT username, signal, mr_id, qty, ts, mr_state FROM aggregated WHERE username IS NOT NULL SELECT username, signal, mr_id, qty, ts, mr_state FROM aggregated WHERE username IS NOT NULL
``` ```
Where `{state_aware_ts}` is the state-aware timestamp expression (defined in the next section), `{path_op}` is either `= ?1` or `LIKE ?1 ESCAPE '\\'` depending on the path query type, `?4` is the `as_of_ms` upper bound (defaults to `now_ms` when `--as-of` is not specified), and `{reviewer_min_note_chars}` is the configured `reviewer_min_note_chars` value (default 20, inlined as a literal in the SQL string). The `BETWEEN ?2 AND ?4` pattern ensures that when `--as-of` is set to a past date, events after that date are excluded — without this, "future" events would leak in with full weight, breaking reproducibility. Where `{state_aware_ts}` is the state-aware timestamp expression (defined in the next section), `{path_op}` is either `= ?1` or `LIKE ?1 ESCAPE '\\'` depending on the path query type, `?4` is the `as_of_ms` exclusive upper bound (defaults to `now_ms` when `--as-of` is not specified), and `{reviewer_min_note_chars}` is the configured `reviewer_min_note_chars` value (default 20, inlined as a literal in the SQL string). The `>= ?2 AND < ?4` pattern (half-open interval) ensures that when `--as-of` is set to a past date, events at or after that date are excluded — without this, "future" events would leak in with full weight, breaking reproducibility. The exclusive upper bound avoids edge-case ambiguity when events have timestamps exactly equal to the as-of value.
**Rationale for CTE-based dual-path matching**: The previous approach (repeating `OR old_path` in every signal subquery) duplicated the path matching logic 5 times. Factoring it into `matched_notes` and `matched_file_changes` CTEs means path matching is defined once, the indexes are hit once, and adding future path resolution logic (e.g., alias chains) only requires changes in one place. **Rationale for CTE-based dual-path matching**: The previous approach (repeating `OR old_path` in every signal subquery) duplicated the path matching logic 5 times. Factoring it into foundational CTEs (`matched_notes_raw``matched_notes`, `matched_file_changes_raw``matched_file_changes`) means path matching is defined once, each index branch is explicit, and adding future path resolution logic (e.g., alias chains) only requires changes in one place. The UNION ALL + dedup pattern ensures SQLite uses the optimal index for each path column independently.
**Index optimization fallback (UNION ALL split)**: SQLite's query planner sometimes struggles with `OR` across two indexed columns, falling back to a full table scan instead of using either index. If EXPLAIN QUERY PLAN shows this during step 6 verification, replace the `OR`-based CTEs with a `UNION ALL` split + dedup pattern: **Dual-path matching strategy (UNION ALL split)**: SQLite's query planner commonly struggles with `OR` across two indexed columns, falling back to a full table scan instead of using either index. Rather than starting with `OR` and hoping the planner cooperates, use `UNION ALL` + dedup as the default strategy:
```sql ```sql
matched_notes AS ( matched_notes AS (
SELECT ... FROM notes n WHERE ... AND n.position_new_path {path_op} SELECT ... FROM notes n WHERE ... AND n.position_new_path {path_op}
@@ -261,18 +290,18 @@ matched_notes_dedup AS (
FROM matched_notes FROM matched_notes
), ),
``` ```
This ensures each branch can use its respective index independently. The dedup CTE prevents double-counting when `old_path = new_path` (no rename). Start with the simpler `OR` approach and only switch to `UNION ALL` if query plans confirm the degradation. This ensures each branch can use its respective index independently. The dedup CTE prevents double-counting when `old_path = new_path` (no rename). The same pattern applies to `matched_file_changes`. The simpler `OR` variant is retained as a comment for benchmarking — if a future SQLite version handles `OR` well, the split can be collapsed.
**Rationale for precomputed participation set**: The previous approach used correlated `EXISTS`/`NOT EXISTS` subqueries to classify reviewers. The `reviewer_participation` CTE materializes the set of `(mr_id, username)` pairs from matched DiffNotes once, then signal 4a JOINs against it (participated) and signal 4b LEFT JOINs with `IS NULL` (assigned-only). This avoids per-reviewer-row correlated scans, is easier to reason about, and produces the same exhaustive split — every `mr_reviewers` row falls into exactly one bucket. **Rationale for precomputed participation set**: The previous approach used correlated `EXISTS`/`NOT EXISTS` subqueries to classify reviewers. The `reviewer_participation` CTE materializes the set of `(mr_id, username)` pairs from matched DiffNotes once, then signal 4a JOINs against it (participated) and signal 4b LEFT JOINs with `IS NULL` (assigned-only). This avoids per-reviewer-row correlated scans, is easier to reason about, and produces the same exhaustive split — every `mr_reviewers` row falls into exactly one bucket.
**Rationale for hybrid over fully-raw**: Pre-aggregating note counts in SQL prevents row explosion from heavy DiffNote volume on frequently-discussed paths. MR-level signals are already 1-per-MR by nature (deduped via GROUP BY in each subquery). This keeps memory and latency predictable regardless of review activity density. **Rationale for hybrid over fully-raw**: Pre-aggregating note counts in SQL prevents row explosion from heavy DiffNote volume on frequently-discussed paths. MR-level signals are already 1-per-MR by nature (deduped via GROUP BY in each subquery). This keeps memory and latency predictable regardless of review activity density.
**Path rename awareness**: Both `matched_notes` and `matched_file_changes` CTEs match against both old and new path columns: **Path rename awareness**: Both `matched_notes` and `matched_file_changes` use UNION ALL + dedup to match against both old and new path columns independently, ensuring each branch uses its respective index:
- Notes: `(n.position_new_path {path_op} OR n.position_old_path {path_op})` - Notes: branch 1 matches `position_new_path`, branch 2 matches `position_old_path`, deduped by `notes.id`
- File changes: `(fc.new_path {path_op} OR fc.old_path {path_op})` - File changes: branch 1 matches `new_path`, branch 2 matches `old_path`, deduped by `(merge_request_id, project_id)`
Both columns already exist in the schema (`notes.position_old_path` from migration 002, `mr_file_changes.old_path` from migration 016). The `OR` match ensures expertise is credited even when a file was renamed after the work was done. For prefix queries (`--path src/foo/`), the `LIKE` operator applies to both columns identically. Both columns already exist in the schema (`notes.position_old_path` from migration 002, `mr_file_changes.old_path` from migration 016). The UNION ALL approach ensures expertise is credited even when a file was renamed after the work was done. For prefix queries (`--path src/foo/`), the `LIKE` operator applies to both columns identically.
**Signal 4 splits into two**: The current signal 4 (`file_reviewer`) joins `mr_reviewers` but doesn't distinguish participation. In the new plan: **Signal 4 splits into two**: The current signal 4 (`file_reviewer`) joins `mr_reviewers` but doesn't distinguish participation. In the new plan:
@@ -332,7 +361,7 @@ For each username, accumulate into a struct with:
The `mr_state` field from each SQL row is stored alongside the timestamp so the Rust-side can apply `closed_mr_multiplier` when `mr_state == "closed"`. The `mr_state` field from each SQL row is stored alongside the timestamp so the Rust-side can apply `closed_mr_multiplier` when `mr_state == "closed"`.
Compute score as `f64`. Each MR-level contribution is multiplied by `closed_mr_multiplier` (default 0.5) when the MR's state is `"closed"`: Compute score as `f64` with **deterministic contribution ordering**: within each signal type, sort contributions by `(mr_id ASC)` before summing. This eliminates platform-dependent HashMap iteration order as a source of f64 rounding variance near ties, ensuring CI reproducibility without the complexity of compensated summation (Neumaier/Kahan). Each MR-level contribution is multiplied by `closed_mr_multiplier` (default 0.5) when the MR's state is `"closed"`:
``` ```
state_mult(mr) = if mr.state == "closed" { closed_mr_multiplier } else { 1.0 } state_mult(mr) = if mr.state == "closed" { closed_mr_multiplier } else { 1.0 }
@@ -352,6 +381,8 @@ Compute counts from the accumulated data:
- `review_note_count = notes_per_mr.values().map(|(count, _)| count).sum()` - `review_note_count = notes_per_mr.values().map(|(count, _)| count).sum()`
- `author_mr_count = author_mrs.len()` - `author_mr_count = author_mrs.len()`
**Bot/service-account filtering**: After accumulating all user scores and before sorting, filter out any username that appears in `config.scoring.excluded_usernames` (exact match, case-insensitive). This is applied in Rust post-query (not SQL) to keep the SQL clean and avoid parameter explosion. When `--include-bots` is active, the filter is skipped entirely. The robot JSON `resolved_input` includes `excluded_usernames_applied: true|false` to indicate whether filtering was active.
Truncate to limit after sorting. Truncate to limit after sorting.
### 5. Default --since Change ### 5. Default --since Change
@@ -364,10 +395,11 @@ At 2 years, author decay = 6%, reviewer decay = 0.4%, note decay = 0.006% — ne
### 5a. Reproducible Scoring via `--as-of` ### 5a. Reproducible Scoring via `--as-of`
Add `--as-of <RFC3339|YYYY-MM-DD>` flag that overrides the `now_ms` reference point used for decay calculations. When set: Add `--as-of <RFC3339|YYYY-MM-DD>` flag that overrides the `now_ms` reference point used for decay calculations. When set:
- All event selection is bounded by `[since_ms, as_of_ms]` — events after `as_of_ms` are excluded from SQL results entirely (not just decayed) - All event selection is bounded by `[since_ms, as_of_ms)` — exclusive upper bound; events at or after `as_of_ms` are excluded from SQL results entirely (not just decayed). The SQL uses `< ?4` (strict less-than), not `<= ?4`.
- `YYYY-MM-DD` input (without time component) is interpreted as end-of-day UTC: `T23:59:59.999Z`. This matches user intuition that `--as-of 2025-06-01` means "as of the end of June 1st" rather than "as of midnight at the start of June 1st" which would exclude the entire day's activity.
- All decay computations use `as_of_ms` instead of `SystemTime::now()` - All decay computations use `as_of_ms` instead of `SystemTime::now()`
- The `--since` window is calculated relative to `as_of_ms` (not wall clock) - The `--since` window is calculated relative to `as_of_ms` (not wall clock)
- Robot JSON `resolved_input` includes `as_of_ms` and `as_of_iso` fields - Robot JSON `resolved_input` includes `as_of_ms`, `as_of_iso`, `window_start_iso`, `window_end_iso`, and `window_end_exclusive: true` fields — making the exact query window unambiguous in output
**Rationale**: Decayed scoring is time-sensitive by nature. Without a fixed reference point, the same query run minutes apart produces different rankings, making debugging and test reproducibility difficult. `--as-of` pins the clock so that results are deterministic for a given dataset. The upper-bound filter in SQL is critical — without it, events after the as-of date would enter with full weight (since `elapsed.max(0.0)` clamps negative elapsed time to zero), breaking the reproducibility guarantee. **Rationale**: Decayed scoring is time-sensitive by nature. Without a fixed reference point, the same query run minutes apart produces different rankings, making debugging and test reproducibility difficult. `--as-of` pins the clock so that results are deterministic for a given dataset. The upper-bound filter in SQL is critical — without it, events after the as-of date would enter with full weight (since `elapsed.max(0.0)` clamps negative elapsed time to zero), breaking the reproducibility guarantee.
@@ -484,7 +516,13 @@ Add timestamp-aware variants:
**`test_closed_mr_multiplier`**: Two identical MRs (same author, same age, same path). One is `merged`, one is `closed`. The merged MR should contribute `author_weight * decay(...)`, the closed MR should contribute `author_weight * closed_mr_multiplier * decay(...)`. With default multiplier 0.5, the closed MR contributes half. **`test_closed_mr_multiplier`**: Two identical MRs (same author, same age, same path). One is `merged`, one is `closed`. The merged MR should contribute `author_weight * decay(...)`, the closed MR should contribute `author_weight * closed_mr_multiplier * decay(...)`. With default multiplier 0.5, the closed MR contributes half.
**`test_as_of_excludes_future_events`**: Insert events at timestamps T1 (past) and T2 (future relative to as-of). With `--as-of` set between T1 and T2, only T1 events should appear in results. T2 events must be excluded entirely, not just decayed. Validates the upper-bound filtering in SQL. **`test_as_of_excludes_future_events`**: Insert events at timestamps T1 (past) and T2 (future relative to as-of). With `--as-of` set between T1 and T2, only T1 events should appear in results. T2 events must be excluded entirely, not just decayed. Validates the exclusive upper-bound (`< ?4`) filtering in SQL.
**`test_as_of_exclusive_upper_bound`**: Insert an event with timestamp exactly equal to the `as_of_ms` value. Verify it is excluded from results (strict less-than, not less-than-or-equal). This validates the half-open interval `[since, as_of)` semantics.
**`test_excluded_usernames_filters_bots`**: Insert signals for a user named "renovate-bot" and a user named "jsmith", both with the same activity. With `excluded_usernames: ["renovate-bot"]` in config, only "jsmith" should appear in results. Validates the Rust-side post-query filtering.
**`test_include_bots_flag_disables_filtering`**: Same setup as above, but with `--include-bots` active. Both "renovate-bot" and "jsmith" should appear in results.
**`test_null_timestamp_fallback_to_created_at`**: Insert a merged MR with `merged_at = NULL` (edge case: old data before the column was populated). The state-aware timestamp should fall back to `created_at`. Verify the score reflects `created_at`, not 0 or a panic. **`test_null_timestamp_fallback_to_created_at`**: Insert a merged MR with `merged_at = NULL` (edge case: old data before the column was populated). The state-aware timestamp should fall back to `created_at`. Verify the score reflects `created_at`, not 0 or a panic.
@@ -496,11 +534,13 @@ Add timestamp-aware variants:
**`test_reviewer_split_is_exhaustive`**: For a reviewer assigned to an MR, they must appear in exactly one of: participated (has substantive DiffNotes meeting `reviewer_min_note_chars`) or assigned-only (no DiffNotes, or only trivial ones below the threshold). Never both, never neither. Test three cases: (1) reviewer with substantive DiffNotes -> participated only, (2) reviewer with no DiffNotes -> assigned-only only, (3) reviewer with only trivial notes ("LGTM") -> assigned-only only. **`test_reviewer_split_is_exhaustive`**: For a reviewer assigned to an MR, they must appear in exactly one of: participated (has substantive DiffNotes meeting `reviewer_min_note_chars`) or assigned-only (no DiffNotes, or only trivial ones below the threshold). Never both, never neither. Test three cases: (1) reviewer with substantive DiffNotes -> participated only, (2) reviewer with no DiffNotes -> assigned-only only, (3) reviewer with only trivial notes ("LGTM") -> assigned-only only.
**`test_deterministic_accumulation_order`**: Insert signals for a user with contributions at many different timestamps (10+ MRs with varied ages). Run `query_expert` 100 times in a loop. All 100 runs must produce the exact same `f64` score (bit-identical). Validates that the sorted contribution ordering eliminates HashMap-iteration-order nondeterminism.
### 9. Existing Test Compatibility ### 9. Existing Test Compatibility
All existing tests insert data with `now_ms()`. With decay, elapsed ~0ms means decay ~1.0, so scores round to the same integers as before. No existing test assertions should break. All existing tests insert data with `now_ms()`. With decay, elapsed ~0ms means decay ~1.0, so scores round to the same integers as before. No existing test assertions should break.
The `test_expert_scoring_weights_are_configurable` test needs `..Default::default()` added to fill the new half-life fields, `reviewer_assignment_weight` / `reviewer_assignment_half_life_days`, `closed_mr_multiplier`, and `reviewer_min_note_chars` fields. The `test_expert_scoring_weights_are_configurable` test needs `..Default::default()` added to fill the new half-life fields, `reviewer_assignment_weight` / `reviewer_assignment_half_life_days`, `closed_mr_multiplier`, `reviewer_min_note_chars`, and `excluded_usernames` fields.
## Verification ## Verification
@@ -511,11 +551,17 @@ The `test_expert_scoring_weights_are_configurable` test needs `..Default::defaul
5. `ubs src/cli/commands/who.rs src/core/config.rs src/core/db.rs` — no bug scanner findings 5. `ubs src/cli/commands/who.rs src/core/config.rs src/core/db.rs` — no bug scanner findings
6. Manual query plan verification (not automated — SQLite planner varies across versions): 6. Manual query plan verification (not automated — SQLite planner varies across versions):
- Run `EXPLAIN QUERY PLAN` on the expert query (both exact and prefix modes) against a real database - Run `EXPLAIN QUERY PLAN` on the expert query (both exact and prefix modes) against a real database
- Confirm that `matched_notes` CTE uses `idx_notes_old_path_author` or the existing new_path index (not a full table scan) - Confirm that `matched_notes_raw` branch 1 uses the existing new_path index and branch 2 uses `idx_notes_old_path_author` (not a full table scan on either branch)
- Confirm that `matched_file_changes` CTE uses `idx_mfc_old_path_project_mr` or `idx_mfc_new_path_project_mr` - Confirm that `matched_file_changes_raw` branch 1 uses `idx_mfc_new_path_project_mr` and branch 2 uses `idx_mfc_old_path_project_mr`
- Confirm that `reviewer_participation` CTE uses `idx_notes_diffnote_discussion_author` - Confirm that `reviewer_participation` CTE uses `idx_notes_diffnote_discussion_author`
- Document the observed plan in a comment near the SQL for future regression reference - Document the observed plan in a comment near the SQL for future regression reference
7. Real-world validation: 7. Performance baseline (manual, not CI-gated):
- Run `time cargo run --release -- who --path <exact-path>` on the real database for exact, prefix, and suffix modes
- Target SLOs: p95 exact path < 200ms, prefix < 300ms, suffix < 500ms on development hardware
- Record baseline timings as a comment near the SQL for regression reference
- If any mode exceeds 2x the baseline after future changes, investigate before merging
- Note: These are soft targets for developer awareness, not automated CI gates. Automated benchmarking with synthetic fixtures (100k/1M/5M notes) is a v2 investment if performance becomes a real concern.
8. Real-world validation:
- `cargo run --release -- who --path MeasurementQualityDialog.tsx` — verify jdefting/zhayes old reviews are properly discounted relative to recent authors - `cargo run --release -- who --path MeasurementQualityDialog.tsx` — verify jdefting/zhayes old reviews are properly discounted relative to recent authors
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --all-history` — compare full history vs 24m window to validate cutoff is reasonable - `cargo run --release -- who --path MeasurementQualityDialog.tsx --all-history` — compare full history vs 24m window to validate cutoff is reasonable
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --explain-score` — verify component breakdown sums to total and authored signal dominates for known authors - `cargo run --release -- who --path MeasurementQualityDialog.tsx --explain-score` — verify component breakdown sums to total and authored signal dominates for known authors
@@ -524,6 +570,7 @@ The `test_expert_scoring_weights_are_configurable` test needs `..Default::defaul
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --as-of 2025-06-01` — verify deterministic output across repeated runs - `cargo run --release -- who --path MeasurementQualityDialog.tsx --as-of 2025-06-01` — verify deterministic output across repeated runs
- Spot-check that reviewers who only left "LGTM"-style notes are classified as assigned-only (not participated) - Spot-check that reviewers who only left "LGTM"-style notes are classified as assigned-only (not participated)
- Verify closed MRs contribute at ~50% of equivalent merged MR scores via `--explain-score` - Verify closed MRs contribute at ~50% of equivalent merged MR scores via `--explain-score`
- If the project has known bot accounts (e.g., renovate-bot), add them to `excluded_usernames` config and verify they no longer appear in results. Run again with `--include-bots` to confirm they reappear.
## Accepted from External Review ## Accepted from External Review
@@ -553,12 +600,20 @@ Ideas incorporated from ChatGPT review (feedback-1 through feedback-4) that genu
- **EXPLAIN QUERY PLAN verification step**: Manual check that the restructured queries use the new indexes (not automated, since SQLite planner varies across versions). - **EXPLAIN QUERY PLAN verification step**: Manual check that the restructured queries use the new indexes (not automated, since SQLite planner varies across versions).
**From feedback-4:** **From feedback-4:**
- **`--as-of` temporal correctness (critical)**: The plan described `--as-of` but the SQL only enforced a lower bound (`>= ?2`). Events after the as-of date would leak in with full weight (because `elapsed.max(0.0)` clamps negative elapsed time to zero). Added `<= ?4` upper bound to all SQL timestamp filters, making the query window `[since_ms, as_of_ms]`. Without this, `--as-of` reproducibility was fundamentally broken. - **`--as-of` temporal correctness (critical)**: The plan described `--as-of` but the SQL only enforced a lower bound (`>= ?2`). Events after the as-of date would leak in with full weight (because `elapsed.max(0.0)` clamps negative elapsed time to zero). Added `< ?4` upper bound to all SQL timestamp filters, making the query window `[since_ms, as_of_ms)`. Without this, `--as-of` reproducibility was fundamentally broken. (Refined to exclusive upper bound in feedback-5.)
- **Closed-state inconsistency resolution**: The state-aware CASE expression handled `closed` state but the WHERE clause filtered to `('opened','merged')` only — dead code. Resolved by including `'closed'` in state filters and adding a `closed_mr_multiplier` (default 0.5) applied in Rust to all signals from closed-without-merge MRs. This credits real review effort on abandoned MRs while appropriately discounting it. - **Closed-state inconsistency resolution**: The state-aware CASE expression handled `closed` state but the WHERE clause filtered to `('opened','merged')` only — dead code. Resolved by including `'closed'` in state filters and adding a `closed_mr_multiplier` (default 0.5) applied in Rust to all signals from closed-without-merge MRs. This credits real review effort on abandoned MRs while appropriately discounting it.
- **Substantive note threshold for reviewer participation**: A single "LGTM" shouldn't promote a reviewer from 3-point (assigned-only) to 10-point (participated) weight. Added `reviewer_min_note_chars` (default 20) config field and `LENGTH(TRIM(body))` filter in the `reviewer_participation` CTE. This raises the bar for participation classification to actual substantive review comments. - **Substantive note threshold for reviewer participation**: A single "LGTM" shouldn't promote a reviewer from 3-point (assigned-only) to 10-point (participated) weight. Added `reviewer_min_note_chars` (default 20) config field and `LENGTH(TRIM(body))` filter in the `reviewer_participation` CTE. This raises the bar for participation classification to actual substantive review comments.
- **UNION ALL optimization fallback for path predicates**: SQLite's planner can degrade `OR` across two indexed columns to a table scan. Added documentation of a `UNION ALL` split + dedup fallback pattern to use if EXPLAIN QUERY PLAN shows degradation during verification. Start with the simpler `OR` approach; switch only if needed. - **UNION ALL optimization for path predicates**: SQLite's planner can degrade `OR` across two indexed columns to a table scan. Originally documented as a fallback; promoted to default strategy in feedback-5 iteration. The UNION ALL + dedup approach ensures each index branch is used independently.
- **New tests**: `test_trivial_note_does_not_count_as_participation`, `test_closed_mr_multiplier`, `test_as_of_excludes_future_events` — cover the three new features added from this review round. - **New tests**: `test_trivial_note_does_not_count_as_participation`, `test_closed_mr_multiplier`, `test_as_of_excludes_future_events` — cover the three new features added from this review round.
**From feedback-5 (ChatGPT review):**
- **Exclusive upper bound for `--as-of`**: Changed from `[since_ms, as_of_ms]` (inclusive) to `[since_ms, as_of_ms)` (exclusive). Half-open intervals are the standard convention in temporal systems — they eliminate edge-case ambiguity when events have timestamps exactly at the boundary. Also added `YYYY-MM-DD` → end-of-day UTC parsing and window metadata in robot output.
- **UNION ALL as default for dual-path matching**: Promoted from "fallback if planner regresses" to default strategy. SQLite `OR`-across-indexed-columns degradation is common enough that the predictable UNION ALL + dedup approach is the safer starting point. The simpler `OR` variant is retained as a comment for benchmarking.
- **Deterministic contribution ordering**: Within each signal type, sort contributions by `mr_id` before summing. This eliminates HashMap iteration order as a source of f64 rounding variance near ties, ensuring CI reproducibility without the overhead of compensated summation (Neumaier/Kahan was rejected as overkill at this scale).
- **Minimal bot/service-account filtering**: Added `excluded_usernames` (exact match, case-insensitive) to `ScoringConfig` and `--include-bots` CLI flag. Applied as a Rust-side post-filter (not SQL) to keep queries clean. Scope is deliberately minimal — no regex patterns, no heuristic detection. Users configure the list for their team's specific bots.
- **Performance baseline SLOs**: Added manual performance baseline step to verification — record timings for exact/prefix/suffix modes and flag >2x regressions. Kept lightweight (no CI gating, no synthetic benchmarks) to match the project's current maturity.
- **New tests**: `test_as_of_exclusive_upper_bound`, `test_excluded_usernames_filters_bots`, `test_include_bots_flag_disables_filtering`, `test_deterministic_accumulation_order` — cover the newly-accepted features.
## Rejected Ideas (with rationale) ## Rejected Ideas (with rationale)
These suggestions were considered during review but explicitly excluded from this iteration: These suggestions were considered during review but explicitly excluded from this iteration:
@@ -573,3 +628,10 @@ These suggestions were considered during review but explicitly excluded from thi
- **Split scoring engine into core module** (feedback-4 #5): Proposed extracting scoring math from `who.rs` into `src/core/scoring/model_v2_decay.rs`. Premature modularization — `who.rs` is the only consumer and is ~800 lines. Adding module plumbing and indirection for a single call site adds complexity without reducing it. If we add a second scoring consumer (e.g., automated triage), revisit. - **Split scoring engine into core module** (feedback-4 #5): Proposed extracting scoring math from `who.rs` into `src/core/scoring/model_v2_decay.rs`. Premature modularization — `who.rs` is the only consumer and is ~800 lines. Adding module plumbing and indirection for a single call site adds complexity without reducing it. If we add a second scoring consumer (e.g., automated triage), revisit.
- **Bot/service-account filtering** (feedback-4 #7): Real concern but orthogonal to time-decay scoring. This is a general data quality feature that belongs in its own issue — it affects all `who` modes, not just expert scoring. Adding `excluded_username_patterns` config and `--include-bots` flag is scope expansion that should be designed and tested independently. - **Bot/service-account filtering** (feedback-4 #7): Real concern but orthogonal to time-decay scoring. This is a general data quality feature that belongs in its own issue — it affects all `who` modes, not just expert scoring. Adding `excluded_username_patterns` config and `--include-bots` flag is scope expansion that should be designed and tested independently.
- **Model compare mode / rank-delta diagnostics** (feedback-4 #9): Over-engineered rollout safety for an internal CLI tool with ~3 users. Maintaining two parallel scoring codepaths (v1 flat + v2 decayed) doubles test surface and code complexity. The `--explain-score` + `--as-of` combination already provides debugging capability. If a future model change is risky enough to warrant A/B comparison, build it then. - **Model compare mode / rank-delta diagnostics** (feedback-4 #9): Over-engineered rollout safety for an internal CLI tool with ~3 users. Maintaining two parallel scoring codepaths (v1 flat + v2 decayed) doubles test surface and code complexity. The `--explain-score` + `--as-of` combination already provides debugging capability. If a future model change is risky enough to warrant A/B comparison, build it then.
- **Canonical path identity graph** (feedback-5 #1, also feedback-2 #2, feedback-4 #4): Third time proposed, third time rejected. Building a rename graph from `mr_file_changes(old_path, new_path)` with identity resolution requires new schema (`path_identities`, `path_aliases` tables), ingestion pipeline changes, graph traversal at query time, and backfill logic for existing data. The UNION ALL dual-path matching already covers the 80%+ case (direct renames). Multi-hop rename chains (A→B→C) are rare in practice and can be addressed in v2 with real usage data showing the gap matters.
- **Normalized `expertise_events` table** (feedback-5 #2): Proposes shifting from query-time CTE joins to a precomputed `expertise_events` table populated at ingest time. While architecturally appealing for read performance, this doubles the data surface area (raw tables + derived events), requires new ingestion pipelines with incremental upsert logic, backfill tooling for existing databases, and introduces consistency risks when raw data is corrected/re-synced. The CTE approach is correct, maintainable, and performant at our current scale. If query latency becomes a real bottleneck (see performance baseline SLOs), materialized views or derived tables become a v2 optimization.
- **Reviewer engagement model upgrade** (feedback-5 #3): Proposes adding `approved`/`changes_requested` review-state signals and trivial-comment pattern matching (`["lgtm","+1","nit","ship it"]`). Expands the signal type count from 4 to 6 and adds a fragile pattern-matching layer (what about "don't ship it"? "lgtm but..."?). The `reviewer_min_note_chars` threshold is imperfect but pragmatic — it's a single configurable number with no false-positive risk from substring matching. Review-state signals may be worth adding later as a separate enhancement when we have data on how often they diverge from DiffNote participation.
- **Contribution-floor auto cutoff for `--since`** (feedback-5 #5): Proposes `--since auto` computing the earliest relevant timestamp from `min_contribution_floor` (e.g., 0.01 points). Adds a non-obvious config parameter for minimal benefit — the 24m default is already mathematically justified from the decay curves (author: 6%, reviewer: 0.4% at 2 years) and easily overridden with `--since` or `--all-history`. The auto-derivation formula (`ceil(max_half_life * log2(1/floor))`) is opaque to users who just want to understand why a certain time range was selected.
- **Full evidence drill-down in `--explain-score`** (feedback-5 #8): Proposes `--explain-score=summary|full` with per-MR evidence rows. Already rejected in feedback-2 #7. Component totals are sufficient for v1 debugging — they answer "which signal type drives this user's score." Per-MR drill-down requires additional SQL queries and significant output format complexity. Deferred unless component breakdowns prove insufficient.
- **Neumaier compensated summation** (feedback-5 #7 partial): Accepted the sorting aspect for deterministic ordering, but rejected Neumaier/Kahan compensated summation. At the scale of dozens to low hundreds of contributions per user, the rounding error from naive f64 summation is on the order of 1e-14 — several orders of magnitude below any meaningful score difference. Compensated summation adds code complexity and a maintenance burden for no practical benefit at this scale.
- **Automated CI benchmark gate** (feedback-5 #10 partial): Accepted manual performance baselines, but rejected automated CI regression gating with synthetic fixtures (100k/1M/5M notes). Building and maintaining benchmark infrastructure is a significant investment that's premature for a CLI tool with ~3 users. Manual timing checks during development are sufficient until performance becomes a real concern.

View File

@@ -0,0 +1,209 @@
No `## Rejected Recommendations` section was present, so these are all net-new improvements.
1. Keep core `lore` stable; isolate nightly to a TUI crate
Rationale: the current plan says “whole project nightly” but later assumes TUI is feature-gated. Isolating nightly removes unnecessary risk from non-TUI users, CI, and release cadence.
```diff
@@ 3.2 Nightly Rust Strategy
-- The entire gitlore project moves to pinned nightly, not just the TUI feature.
+- Keep core `lore` on stable Rust.
+- Add workspace member `lore-tui` pinned to nightly for FrankenTUI.
+- Ship `lore tui` only when `--features tui` (or separate `lore-tui` binary) is enabled.
@@ 10.1 New Files
+- crates/lore-tui/Cargo.toml
+- crates/lore-tui/src/main.rs
@@ 11. Assumptions
-17. TUI module is feature-gated.
+17. TUI is isolated in a workspace crate and feature-gated in root CLI integration.
```
2. Add a framework adapter boundary from day 1
Rationale: the “3-day ratatui escape hatch” is optimistic without a strict interface. A tiny `UiRuntime` + screen renderer trait makes fallback real, not aspirational.
```diff
@@ 4. Architecture
+### 4.9 UI Runtime Abstraction
+Introduce `UiRuntime` trait (`run`, `send`, `subscribe`) and `ScreenRenderer` trait.
+FrankenTUI implementation is default; ratatui adapter can be dropped in with no state/action rewrite.
@@ 3.5 Escape Hatch
-- The migration cost to ratatui is ~3 days
+- Migration cost target is ~3-5 days, validated by one ratatui spike screen in Phase 1.
```
3. Stop using CLI command modules as the TUI query API
Rationale: coupling TUI to CLI output-era structs creates long-term friction and accidental regressions. Create a shared domain query layer used by both CLI and TUI.
```diff
@@ 10.20 Refactor: Extract Query Functions
-- extract query_* from cli/commands/*
+- introduce `src/domain/query/*` as the canonical read model API.
+- CLI and TUI both depend on domain query layer.
+- CLI modules retain formatting/output only.
@@ 10.2 Modified Files
+- src/domain/query/mod.rs
+- src/domain/query/issues.rs
+- src/domain/query/mrs.rs
+- src/domain/query/search.rs
+- src/domain/query/who.rs
```
4. Replace single `Arc<Mutex<Connection>>` with connection manager
Rationale: one locked connection serializes everything and hurts responsiveness, especially during sync. Use separate read pool + writer connection with WAL and busy timeout.
```diff
@@ 4.4 App — Implementing the Model Trait
- pub db: Arc<Mutex<Connection>>,
+ pub db: Arc<DbManager>, // read pool + single writer coordination
@@ 4.5 Async Action System
- Each Cmd::task closure locks the mutex, runs the query, and returns a Msg
+ Reads use pooled read-only connections.
+ Sync/write path uses dedicated writer connection.
+ Enforce WAL, busy_timeout, and retry policy for SQLITE_BUSY.
```
5. Make debouncing/cancellation explicit and correct
Rationale: “runtime coalesces rapid keypresses” is not a safe correctness guarantee. Add request IDs and stale-response dropping to prevent flicker and wrong data.
```diff
@@ 4.3 Core Types (Msg)
+ SearchRequestStarted { request_id: u64, query: String }
- SearchExecuted(SearchResults),
+ SearchExecuted { request_id: u64, results: SearchResults },
@@ 4.4 maybe_debounced_query()
- runtime coalesces rapid keypresses
+ use explicit 200ms debounce timer + monotonic request_id
+ ignore results whose request_id != current_search_request_id
```
6. Implement true streaming sync, not batch-at-end pseudo-streaming
Rationale: the plan promises real-time logs/progress but code currently returns one completion message. This gap will disappoint users and complicate cancellation.
```diff
@@ 4.4 start_sync_task()
- Pragmatic approach: run sync synchronously, collect all progress events, return summary.
+ Use event channel subscription for `SyncProgress`/`SyncLogLine` streaming.
+ Keep `SyncCompleted` only as terminal event.
+ Add cooperative cancel token mapped to `Esc` while running.
@@ 5.9 Sync
+ Add "Resume from checkpoint" option for interrupted syncs.
```
7. Fix entity identity ambiguity across projects
Rationale: using `iid` alone is unsafe in multi-project datasets. Navigation and cross-refs should key by `(project_id, iid)` or global ID.
```diff
@@ 4.3 Core Types
- IssueDetail(i64)
- MrDetail(i64)
+ IssueDetail(EntityKey)
+ MrDetail(EntityKey)
+ pub struct EntityKey { pub project_id: i64, pub iid: i64, pub kind: EntityKind }
@@ 10.12.4 Cross-Reference Widget
- parse "group/project#123" -> iid only
+ parse into `{project_path, iid, kind}` then resolve to `project_id` before navigation
```
8. Resolve keybinding conflicts and formalize keymap precedence
Rationale: current spec conflicts (`Tab` sort vs focus filter; `gg` vs go-prefix). A deterministic keymap contract prevents UX bugs.
```diff
@@ 8.2 List Screens
- Tab | Cycle sort column
- f | Focus filter bar
+ Tab | Focus filter bar
+ S | Cycle sort column
+ / | Focus filter bar (alias)
@@ 4.4 interpret_key()
+ Add explicit precedence table:
+ 1) modal/palette
+ 2) focused input
+ 3) global
+ 4) screen-local
+ Add configurable go-prefix timeout (default 500ms) with cancel feedback.
```
9. Add performance SLOs and DB/index plan
Rationale: “fast enough” is vague. Add measurable budgets, required indexes, and query-plan gates in CI for predictable performance.
```diff
@@ 3.1 Risk Matrix
+ Add risk: "Query latency regressions on large datasets"
@@ 9.3 Phase 0 — Toolchain Gate
+7. p95 list query latency < 75ms on 100k issues synthetic fixture
+8. p95 search latency < 200ms on 1M docs (lexical mode)
@@ 11. Assumptions
-5. SQLite queries are fast enough for interactive use (<50ms for filtered results).
+5. Performance budgets are enforced by benchmark fixtures and query-plan checks.
+6. Required indexes documented and migration-backed before TUI GA.
```
10. Add reliability/observability model (error classes, retries, tracing)
Rationale: one string toast is not enough for production debugging. Add typed errors, retry policy, and an in-TUI diagnostics pane.
```diff
@@ 4.3 Core Types (Msg)
- Error(String),
+ Error(AppError),
+ pub enum AppError {
+ DbBusy, DbCorruption, NetworkRateLimited, NetworkUnavailable,
+ AuthFailed, ParseError, Internal(String)
+ }
@@ 5.11 Doctor / Stats
+ Add "Diagnostics" tab:
+ - last 100 errors
+ - retry counts
+ - current sync/backoff state
+ - DB contention metrics
```
11. Add “Saved Views + Watchlist” as high-value product features
Rationale: this makes the TUI compelling daily, not just navigable. Users can persist filters and monitor critical slices (e.g., “P1 auth issues updated in last 24h”).
```diff
@@ 1. Executive Summary
+ - Saved Views (named filters and layouts)
+ - Watchlist panel (tracked queries with delta badges)
@@ 5. Screen Taxonomy
+### 5.12 Saved Views / Watchlist
+Persistent named filters for Issues/MRs/Search.
+Dashboard shows per-watchlist deltas since last session.
@@ 6. User Flows
+### 6.9 Flow: "Run morning watchlist triage"
+Dashboard -> Watchlist -> filtered IssueList/MRList -> detail drilldown
```
12. Strengthen testing plan with deterministic behavior and chaos cases
Rationale: snapshot tests alone wont catch race/staleness/cancellation issues. Add concurrency, cancellation, and flaky terminal behavior tests.
```diff
@@ 9.2 Phases
+Phase 5.5 Reliability Test Pack (2d)
+ - stale response drop tests
+ - sync cancel/resume tests
+ - SQLITE_BUSY retry tests
+ - resize storm and rapid key-chord tests
@@ 10.9 Snapshot Test Example
+ Add non-snapshot tests:
+ - property tests for navigation invariants
+ - integration tests for request ordering correctness
+ - benchmark tests for query budgets
```
If you want, I can produce a consolidated “PRD v2.1 patch” with all of the above merged into one coherent updated document structure.

View File

@@ -0,0 +1,203 @@
I excluded the two items in your `## Rejected Recommendations` and focused on net-new improvements.
These are the highest-impact revisions Id make.
### 1. Fix the package graph now (avoid a hard Cargo cycle)
Your current plan has `root -> optional lore-tui` and `lore-tui -> lore (root)`, which creates a cyclic dependency risk. Split shared logic into a dedicated core crate so CLI and TUI both depend downward.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 9.1 Dependency Changes
-[workspace]
-members = [".", "crates/lore-tui"]
+[workspace]
+members = [".", "crates/lore-core", "crates/lore-tui"]
@@
-[dependencies]
-lore-tui = { path = "crates/lore-tui", optional = true }
+[dependencies]
+lore-core = { path = "crates/lore-core" }
+lore-tui = { path = "crates/lore-tui", optional = true }
@@ # crates/lore-tui/Cargo.toml
-lore = { path = "../.." } # Core lore library
+lore-core = { path = "../lore-core" } # Shared domain/query crate (acyclic graph)
```
### 2. Stop coupling TUI to `cli/commands/*` internals
Calling CLI command modules from TUI is brittle and will drift. Introduce a shared query/service layer with DTOs owned by core.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 4.1 Module Structure
- action.rs # Async action runners (DB queries, GitLab calls)
+ action.rs # Task dispatch only
+ service/
+ mod.rs
+ query.rs # Shared read services (CLI + TUI)
+ sync.rs # Shared sync orchestration facade
+ dto.rs # UI-agnostic data contracts
@@ ## 10.2 Modified Files
-src/cli/commands/list.rs # Extract query_issues(), query_mrs() as pub fns
-src/cli/commands/show.rs # Extract query_issue_detail(), query_mr_detail() as pub fns
-src/cli/commands/who.rs # Extract query_experts(), etc. as pub fns
-src/cli/commands/search.rs # Extract run_search_query() as pub fn
+crates/lore-core/src/query/issues.rs # Canonical issue queries
+crates/lore-core/src/query/mrs.rs # Canonical MR queries
+crates/lore-core/src/query/show.rs # Canonical detail queries
+crates/lore-core/src/query/who.rs # Canonical people queries
+crates/lore-core/src/query/search.rs # Canonical search queries
+src/cli/commands/*.rs # Consume lore-core query services
+crates/lore-tui/src/action.rs # Consume lore-core query services
```
### 3. Add a real task supervisor (dedupe + cancellation + priority)
Right now tasks are ad hoc and can overrun each other. Add a scheduler keyed by screen+intent.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 4.5 Async Action System
-The `Cmd::task(|| { ... })` pattern runs a blocking closure on a background thread pool.
+The TUI uses a `TaskSupervisor`:
+- Keyed tasks (`TaskKey`) to dedupe redundant requests
+- Priority lanes (`Input`, `Navigation`, `Background`)
+- Cooperative cancellation tokens per task
+- Late-result drop via generation IDs (not just search)
@@ ## 4.3 Core Types
+pub enum TaskKey {
+ LoadScreen(Screen),
+ Search { generation: u64 },
+ SyncStream,
+}
```
### 4. Correct sync streaming architecture (current sketch loses streamed events)
The sample creates `tx/rx` then drops `rx`; events never reach update loop. Define an explicit stream subscription with bounded queue and backpressure policy.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 4.4 App — Implementing the Model Trait
- let (tx, _rx) = std::sync::mpsc::channel::<Msg>();
+ let (tx, rx) = std::sync::mpsc::sync_channel::<Msg>(1024);
+ // rx is registered via Subscription::from_receiver("sync-stream", rx)
@@
- let result = crate::ingestion::orchestrator::run_sync(
+ let result = crate::ingestion::orchestrator::run_sync(
&config,
&conn,
|event| {
@@
- let _ = tx.send(Msg::SyncProgress(event.clone()));
- let _ = tx.send(Msg::SyncLogLine(format!("{event:?}")));
+ if tx.try_send(Msg::SyncProgress(event.clone())).is_err() {
+ let _ = tx.try_send(Msg::SyncBackpressureDrop);
+ }
+ let _ = tx.try_send(Msg::SyncLogLine(format!("{event:?}")));
},
);
```
### 5. Upgrade data-plane performance plan (keyset pagination + index contracts)
Virtualized list without keyset paging still forces expensive scans. Add explicit keyset pagination and query-plan CI checks.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 9.3 Phase 0 — Toolchain Gate
-7. p95 list query latency < 75ms on synthetic fixture (10k issues, 5k MRs)
+7. p95 list page fetch latency < 75ms using keyset pagination (10k issues, 5k MRs)
+8. EXPLAIN QUERY PLAN must show index usage for top 10 TUI queries
+9. No full table scan on issues/MRs/discussions under default filters
@@
-8. p95 search latency < 200ms on synthetic fixture (50k documents, lexical mode)
+10. p95 search latency < 200ms on synthetic fixture (50k documents, lexical mode)
+## 9.4 Required Indexes (GA blocker)
+- `issues(project_id, state, updated_at DESC, iid DESC)`
+- `merge_requests(project_id, state, updated_at DESC, iid DESC)`
+- `discussions(project_id, entity_type, entity_iid, created_at DESC)`
+- `notes(discussion_id, created_at ASC)`
```
### 6. Enforce `EntityKey` everywhere (remove bare IID paths)
You correctly identified multi-project IID collisions, but many message/state signatures still use `i64`. Make `EntityKey` mandatory in all navigation and detail loaders.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 4.3 Core Types
- IssueSelected(i64),
+ IssueSelected(EntityKey),
@@
- MrSelected(i64),
+ MrSelected(EntityKey),
@@
- IssueDetailLoaded(IssueDetail),
+ IssueDetailLoaded { key: EntityKey, detail: IssueDetail },
@@
- MrDetailLoaded(MrDetail),
+ MrDetailLoaded { key: EntityKey, detail: MrDetail },
@@ ## 10.10 State Module — Complete
- Cmd::msg(Msg::NavigateTo(Screen::IssueDetail(iid)))
+ Cmd::msg(Msg::NavigateTo(Screen::IssueDetail(entity_key)))
```
### 7. Harden filter/search semantics (strict parser + inline diagnostics + explain scores)
Current filter parser silently ignores unknown fields; that causes hidden mistakes. Add strict parse diagnostics and search score explainability.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 10.12.1 Filter Bar Widget
- _ => {} // Unknown fields silently ignored
+ _ => self.errors.push(format!("Unknown filter field: {}", token.field))
+ pub errors: Vec<String>, // inline parse/validation errors
+ pub warnings: Vec<String>, // non-fatal coercions
@@ ## 5.6 Search
-- **Live preview:** Selected result shows snippet + metadata in right pane
+- **Live preview:** Selected result shows snippet + metadata in right pane
+- **Explain score:** Optional breakdown (lexical, semantic, recency, boosts) for trust/debug
```
### 8. Add operational resilience: safe mode + panic report + startup fallback
TUI failures should degrade gracefully, not block usage.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 3.1 Risk Matrix
+| Runtime panic leaves user blocked | High | Medium | Panic hook writes crash report, restores terminal, offers fallback CLI command |
@@ ## 10.3 Entry Point
+pub fn launch_tui(config: Config, db_path: &Path) -> Result<(), LoreError> {
+ install_panic_hook_for_tui(); // terminal restore + crash dump path
+ ...
+}
@@ ## 8.1 Global (Available Everywhere)
+| `:` | Show fallback equivalent CLI command for current screen/action |
```
### 9. Add a “jump list” (forward/back navigation, not only stack pop)
Current model has only push/pop and reset. Add browser-like history for investigation workflows.
```diff
diff --git a/PRD.md b/PRD.md
@@ ## 4.7 Navigation Stack Implementation
pub struct NavigationStack {
- stack: Vec<Screen>,
+ back_stack: Vec<Screen>,
+ current: Screen,
+ forward_stack: Vec<Screen>,
+ jump_list: Vec<Screen>, // recent entity/detail hops
}
@@ ## 8.1 Global (Available Everywhere)
+| `Ctrl+o` | Jump backward in jump list |
+| `Ctrl+i` | Jump forward in jump list |
```
If you want, I can produce a single consolidated “PRD v2.1” patch that applies all nine revisions coherently section-by-section.

View File

@@ -0,0 +1,163 @@
I excluded everything already listed in `## Rejected Recommendations`.
These are the highest-impact net-new revisions Id make.
1. **Enforce Entity Identity Consistency End-to-End (P0)**
Analysis: The PRD defines `EntityKey`, but many code paths still pass bare `iid` (`IssueSelected(item.iid)`, timeline refs, search refs). In multi-project datasets this will cause wrong-entity navigation and subtle data corruption in cached state. Make `EntityKey` mandatory in every navigation message and add compile-time constructors.
```diff
@@ 4.3 Core Types
pub struct EntityKey {
pub project_id: i64,
pub iid: i64,
pub kind: EntityKind,
}
+impl EntityKey {
+ pub fn issue(project_id: i64, iid: i64) -> Self { Self { project_id, iid, kind: EntityKind::Issue } }
+ pub fn mr(project_id: i64, iid: i64) -> Self { Self { project_id, iid, kind: EntityKind::MergeRequest } }
+}
@@ 10.10 state/issue_list.rs
- .map(|item| Msg::IssueSelected(item.iid))
+ .map(|item| Msg::IssueSelected(EntityKey::issue(item.project_id, item.iid)))
@@ 10.10 state/mr_list.rs
- .map(|item| Msg::MrSelected(item.iid))
+ .map(|item| Msg::MrSelected(EntityKey::mr(item.project_id, item.iid)))
```
2. **Make TaskSupervisor Mandatory for All Background Work (P0)**
Analysis: The plan introduces `TaskSupervisor` but still dispatches many direct `Cmd::task` calls. That will reintroduce stale updates, duplicate queries, and priority inversion under rapid input. Centralize all background task creation through one spawn path that enforces dedupe, cancellation tokening, and generation checks.
```diff
@@ 4.5.1 Task Supervisor (Dedup + Cancellation + Priority)
-The supervisor is owned by `LoreApp` and consulted before dispatching any `Cmd::task`.
+The supervisor is owned by `LoreApp` and is the ONLY allowed path for background work.
+All task launches use `LoreApp::spawn_task(TaskKey, TaskPriority, closure)`.
@@ 4.4 App — Implementing the Model Trait
- Cmd::task(move || { ... })
+ self.spawn_task(TaskKey::LoadScreen(screen.clone()), TaskPriority::Navigation, move |token| { ... })
```
3. **Remove the Sync Streaming TODO and Make Real-Time Streaming a GA Gate (P0)**
Analysis: Current text admits sync progress is buffered with a TODO. That undercuts one of the main value props. Make streaming progress/log delivery non-optional, with bounded buffers and dropped-line accounting.
```diff
@@ 4.4 start_sync_task()
- // TODO: Register rx as subscription when FrankenTUI supports it.
- // For now, the task returns the final Msg and progress is buffered.
+ // Register rx as a live subscription (`Subscription::from_receiver` adapter).
+ // Progress and logs must render in real time (no batch-at-end fallback).
+ // Keep a bounded ring buffer (N=5000) and surface `dropped_log_lines` in UI.
@@ 9.3 Phase 0 — Toolchain Gate
+11. Real-time sync stream verified: progress updates visible during run, not only at completion.
```
4. **Upgrade List/Search Data Strategy to Windowed Keyset + Prefetch (P0)**
Analysis: “Virtualized list” alone does not solve query/transfer cost if full result sets are loaded. Move to fixed-size keyset windows with next-window prefetch and fast first paint; this keeps latency predictable on 100k+ records.
```diff
@@ 5.2 Issue List
- Pagination: Virtual scrolling for large result sets
+ Pagination: Windowed keyset pagination (window=200 rows) with background prefetch of next window.
+ First paint uses current window only; no full-result materialization.
@@ 5.4 MR List
+ Same windowed keyset pagination strategy as Issue List.
@@ 9.3 Success criteria
- 7. p95 list page fetch latency < 75ms using keyset pagination on synthetic fixture (10k issues, 5k MRs)
+ 7. p95 first-paint latency < 50ms and p95 next-window fetch < 75ms on synthetic fixture (100k issues, 50k MRs)
```
5. **Add Resumable Sync Checkpoints + Per-Project Fault Isolation (P1)**
Analysis: If sync is interrupted or one project fails, current design mostly falls back to cancel/fail. Add checkpoints so long runs can resume, and isolate failures to project/resource scope while continuing others.
```diff
@@ 3.1 Risk Matrix
+| Interrupted sync loses progress | High | Medium | Persist phase checkpoints and offer resume |
@@ 5.9 Sync
+Running mode: failed project/resource lanes are marked degraded while other lanes continue.
+Summary mode: offer `[R]esume interrupted sync` from last checkpoint.
@@ 11 Assumptions
-16. No new SQLite tables needed (but required indexes must be verified — see Performance SLOs).
+16. Add minimal internal tables for reliability: `sync_runs` and `sync_checkpoints` (append-only metadata).
```
6. **Add Capability-Adaptive Rendering Modes (P1)**
Analysis: Terminal compatibility is currently test-focused, but runtime adaptation is under-specified. Add explicit degradations for no-truecolor, no-unicode, slow SSH/tmux paths to reduce rendering artifacts and support incidents.
```diff
@@ 3.4 Terminal Compatibility Testing
+Add capability matrix validation: truecolor/256/16 color, unicode/ascii glyphs, alt-screen on/off.
@@ 10.19 CLI Integration
+Tui {
+ #[arg(long, default_value="auto")] render_mode: String, // auto|full|minimal
+ #[arg(long)] ascii: bool,
+ #[arg(long)] no_alt_screen: bool,
+}
```
7. **Harden Browser/Open and Log Privacy (P1)**
Analysis: `open_current_in_browser` currently trusts stored URLs; sync logs may expose tokens/emails from upstream messages. Add host allowlisting and redaction pipeline by default.
```diff
@@ 4.4 open_current_in_browser()
- if let Some(url) = url { ... open ... }
+ if let Some(url) = url {
+ if !self.state.security.is_allowed_gitlab_url(&url) {
+ self.state.set_error("Blocked non-GitLab URL".into());
+ return;
+ }
+ ... open ...
+ }
@@ 5.9 Sync
+Log stream passes through redaction (tokens, auth headers, email local-parts) before render/storage.
```
8. **Add “My Workbench” Screen for Daily Pull (P1, new feature)**
Analysis: The PRD is strong on exploration, weaker on “what should I do now?”. Add a focused operator screen aggregating assigned issues, requested reviews, unresolved threads mentioning me, and stale approvals. This makes the TUI habit-forming.
```diff
@@ 5. Screen Taxonomy
+### 5.12 My Workbench
+Single-screen triage cockpit:
+- Assigned-to-me open issues/MRs
+- Review requests awaiting action
+- Threads mentioning me and unresolved
+- Recently stale approvals / blocked MRs
@@ 8.1 Global
+| `gb` | Go to My Workbench |
@@ 9.2 Phases
+section Phase 3.5 — Daily Workflow
+My Workbench screen + queries :p35a, after p3d, 2d
```
9. **Add Rollout, SLO Telemetry, and Kill-Switch Plan (P0)**
Analysis: You have implementation phases but no production rollout control. Add explicit experiment flags, health telemetry, and rollback criteria so risk is operationally bounded.
```diff
@@ Table of Contents
-11. [Assumptions](#11-assumptions)
+11. [Assumptions](#11-assumptions)
+12. [Rollout & Telemetry](#12-rollout--telemetry)
@@ NEW SECTION 12
+## 12. Rollout & Telemetry
+- Feature flags: `tui_experimental`, `tui_sync_streaming`, `tui_workbench`
+- Metrics: startup_ms, frame_render_p95_ms, db_busy_rate, panic_free_sessions, sync_drop_events
+- Kill-switch: disable `tui` feature path at runtime if panic rate > 0.5% sessions over 24h
+- Canary rollout: internal only -> opt-in beta -> default-on
```
10. **Strengthen Reliability Pack with Event-Fuzz + Soak Tests (P0)**
Analysis: Current tests are good but still light on prolonged event pressure. Add deterministic fuzzed key/resize/paste streams and a long soak to catch rare deadlocks/leaks and state corruption.
```diff
@@ 9.2 Phase 5.5 — Reliability Test Pack
+Event fuzz tests (key/resize/paste interleavings) :p55g, after p55e, 1d
+30-minute soak test (no panic, bounded memory) :p55h, after p55g, 1d
@@ 9.3 Success criteria
+12. Event-fuzz suite passes with zero invariant violations across 10k randomized traces.
+13. 30-minute soak: no panic, no deadlock, memory growth < 5%.
```
If you want, I can produce a single consolidated unified diff of the full PRD text next (all edits merged, ready to apply as v3).

View File

@@ -0,0 +1,157 @@
Below are my strongest revisions, focused on correctness, reliability, and long-term maintainability, while avoiding all items in your `## Rejected Recommendations`.
1. **Fix the Cargo/toolchain architecture (current plan has a real dependency-cycle risk and shaky per-member toolchain behavior).**
Analysis: The current plan has `lore -> lore-tui (optional)` and `lore-tui -> lore`, which creates a package cycle when `tui` is enabled. Also, per-member `rust-toolchain.toml` in a workspace is easy to misapply in CI/dev workflows. The cleanest robust shape is: `lore-tui` is a separate binary crate (nightly), `lore` remains stable and delegates at runtime (`lore tui` shells out to `lore-tui`).
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 3.2 Nightly Rust Strategy
-- The `lore` binary integrates TUI via `lore tui` subcommand. The `lore-tui` crate is a library dependency feature-gated in the root.
+- `lore-tui` is a separate binary crate built on pinned nightly.
+- `lore` (stable) does not compile-link `lore-tui`; `lore tui` delegates by spawning `lore-tui`.
+- This removes Cargo dependency-cycle risk and keeps stable builds nightly-free.
@@ 9.1 Dependency Changes
-[features]
-tui = ["dep:lore-tui"]
-[dependencies]
-lore-tui = { path = "crates/lore-tui", optional = true }
+[dependencies]
+# no compile-time dependency on lore-tui from lore
+# runtime delegation keeps toolchains isolated
@@ 10.19 CLI Integration
-Add Tui match arm that directly calls crate::tui::launch_tui(...)
+Add Tui match arm that resolves and spawns `lore-tui` with passthrough args.
+If missing, print actionable install/build command.
```
2. **Make `TaskSupervisor` the *actual* single async path (remove contradictory direct `Cmd::task` usage in state handlers).**
Analysis: You declare “direct `Cmd::task` is prohibited outside supervisor,” but later `handle_screen_msg` still launches tasks directly. That contradiction will reintroduce stale-result bugs and race conditions. Make state handlers pure (intent-only); all async launch/cancel/dedup goes through one supervised API.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 4.5.1 Task Supervisor
-The supervisor is the ONLY allowed path for background work.
+The supervisor is the ONLY allowed path for background work, enforced by architecture:
+`AppState` emits intents only; `LoreApp::update` launches tasks via `spawn_task(...)`.
@@ 10.10 State Module — Complete
-pub fn handle_screen_msg(..., db: &Arc<Mutex<Connection>>) -> Cmd<Msg>
+pub fn handle_screen_msg(...) -> ScreenIntent
+// no DB access, no Cmd::task in state layer
```
3. **Enforce `EntityKey` everywhere (remove raw IID navigation paths).**
Analysis: Multi-project identity is one of your strongest ideas, but multiple snippets still navigate by bare IID (`document_id`, `EntityRef::Issue(i64)`). That can misroute across projects and create silent correctness bugs. Make all navigation-bearing results carry `EntityKey` end-to-end.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 4.3 Core Types
-pub enum EntityRef { Issue(i64), MergeRequest(i64) }
+pub enum EntityRef { Issue(EntityKey), MergeRequest(EntityKey) }
@@ 10.10 state/search.rs
-Some(Msg::NavigateTo(Screen::IssueDetail(r.document_id)))
+Some(Msg::NavigateTo(Screen::IssueDetail(r.entity_key.clone())))
@@ 10.11 action.rs
-pub fn fetch_issue_detail(conn: &Connection, iid: i64) -> Result<IssueDetail, LoreError>
+pub fn fetch_issue_detail(conn: &Connection, key: &EntityKey) -> Result<IssueDetail, LoreError>
```
4. **Introduce a shared query boundary inside the existing crate (not a new crate) to decouple TUI from CLI presentation structs.**
Analysis: Reusing CLI command modules directly is fast initially, but it ties TUI to output-layer types and command concerns. A minimal internal `core::query::*` module gives a stable data contract used by both CLI and TUI without the overhead of a new crate split.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 10.2 Modified Files
-src/cli/commands/list.rs # extract query_issues/query_mrs as pub
-src/cli/commands/show.rs # extract query_issue_detail/query_mr_detail as pub
+src/core/query/mod.rs
+src/core/query/issues.rs
+src/core/query/mrs.rs
+src/core/query/detail.rs
+src/core/query/search.rs
+src/core/query/who.rs
+src/cli/commands/* now call core::query::* + format output
+TUI action.rs calls core::query::* directly
```
5. **Add terminal-safety sanitization for untrusted text (ANSI/OSC injection hardening).**
Analysis: Issue/MR bodies, notes, and logs are untrusted text in a terminal context. Without sanitization, terminal escape/control sequences can spoof UI or trigger unintended behavior. Add explicit sanitization and a strict URL policy before rendering/opening.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 3.1 Risk Matrix
+| Terminal escape/control-sequence injection via issue/note text | High | Medium | Strip ANSI/OSC/control chars before render; escape markdown output; allowlist URL scheme+host |
@@ 4.1 Module Structure
+ safety.rs # sanitize_for_terminal(), safe_url_policy()
@@ 10.5/10.8/10.14/10.16
+All user-sourced text passes through `sanitize_for_terminal()` before widget rendering.
+Disable markdown raw HTML and clickable links unless URL policy passes.
```
6. **Move resumable sync checkpoints into v1 (lightweight version).**
Analysis: You already identify interruption risk as real. Deferring resumability to post-v1 leaves a major reliability gap in exactly the heaviest workflow. A lightweight checkpoint table (resource cursor + updated-at watermark) gives large reliability gain with modest complexity.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 3.1 Risk Matrix
-- Resumable checkpoints planned for post-v1
+Resumable checkpoints included in v1 (lightweight cursors per project/resource lane)
@@ 9.3 Success Criteria
+14. Interrupt-and-resume test: sync resumes from checkpoint and reaches completion without full restart.
@@ 9.3.1 Required Indexes (GA Blocker)
+CREATE TABLE IF NOT EXISTS sync_checkpoints (
+ project_id INTEGER NOT NULL,
+ lane TEXT NOT NULL,
+ cursor TEXT,
+ updated_at_ms INTEGER NOT NULL,
+ PRIMARY KEY (project_id, lane)
+);
```
7. **Strengthen performance gates with tiered fixtures and memory ceilings.**
Analysis: Current thresholds are good, but fixture sizes are too close to mid-scale only. Add S/M/L fixtures and memory budget checks so regressions appear before real-world datasets hit them. This gives much more confidence in long-term scalability.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 9.3 Phase 0 — Toolchain Gate
-7. p95 first-paint latency < 50ms ... (100k issues, 50k MRs)
-10. p95 search latency < 200ms ... (50k documents)
+7. Tiered fixtures:
+ S: 10k issues / 5k MRs / 50k notes
+ M: 100k issues / 50k MRs / 500k notes
+ L: 250k issues / 100k MRs / 1M notes
+ Enforce p95 targets per tier and memory ceiling (<250MB RSS in M tier).
+10. Search SLO validated in S and M tiers, lexical and hybrid modes.
```
8. **Add session restore (last screen + filters + selection), with explicit `--fresh` opt-out.**
Analysis: This is high-value daily UX with low complexity, and it makes the TUI feel materially more “compelling/useful” without feature bloat. It also reduces friction when recovering from crash/restart.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 1. Executive Summary
+- **Session restore** — resume last screen, filters, and selection on startup.
@@ 4.1 Module Structure
+ session.rs # persisted UI session state
@@ 8.1 Global
+| `Ctrl+R` | Reset session state for current screen |
@@ 10.19 CLI Integration
+`lore tui --fresh` starts without restoring prior session state.
@@ 11. Assumptions
-12. No TUI-specific configuration initially.
+12. Minimal TUI state file is allowed for session restore only.
```
9. **Add parity tests between TUI data panels and `--robot` outputs.**
Analysis: You already have `ShowCliEquivalent`; parity tests make that claim trustworthy and prevent drift between interfaces. This is a strong reliability multiplier and helps future refactors.
```diff
--- a/Gitlore_TUI_PRD_v2.md
+++ b/Gitlore_TUI_PRD_v2.md
@@ 9.2 Phases / 9.3 Success Criteria
+Phase 5.6 — CLI/TUI Parity Pack
+ - Dashboard count parity vs `lore --robot count/status`
+ - List/detail parity for issues/MRs on sampled entities
+ - Search result identity parity (top-N ids) for lexical mode
+Success criterion: parity suite passes on CI fixtures.
```
If you want, I can produce a single consolidated patch of the PRD text (one unified diff) so you can drop it directly into the next iteration.

View File

@@ -0,0 +1,200 @@
1. **Fix the structural inconsistency between `src/tui` and `crates/lore-tui/src`**
Analysis: The PRD currently defines two different code layouts for the same system. That will cause implementation drift, wrong imports, and duplicated modules. Locking to one canonical layout early prevents execution churn and makes every snippet/action item unambiguous.
```diff
@@ 4.1 Module Structure @@
-src/
- tui/
+crates/lore-tui/src/
mod.rs
app.rs
message.rs
@@
-### 10.5 Dashboard View (FrankenTUI Native)
-// src/tui/view/dashboard.rs
+### 10.5 Dashboard View (FrankenTUI Native)
+// crates/lore-tui/src/view/dashboard.rs
@@
-### 10.6 Sync View
-// src/tui/view/sync.rs
+### 10.6 Sync View
+// crates/lore-tui/src/view/sync.rs
```
2. **Add a small `ui_adapter` seam to contain FrankenTUI API churn**
Analysis: You already identified high likelihood of upstream breakage. Pinning a commit helps, but if every screen imports raw `ftui_*` types directly, churn ripples through dozens of files. A thin adapter layer reduces upgrade cost without introducing the rejected “full portability abstraction”.
```diff
@@ 3.1 Risk Matrix @@
| API breaking changes | High | High (v0.x) | Pin exact git commit; vendor source if needed |
+| API breakage blast radius across app code | High | High | Constrain ftui usage behind `ui_adapter/*` wrappers |
@@ 4.1 Module Structure @@
+ ui_adapter/
+ mod.rs # Re-export stable local UI primitives
+ runtime.rs # App launch/options wrappers
+ widgets.rs # Table/List/Modal wrapper constructors
+ input.rs # Text input + focus helpers
@@ 9.3 Phase 0 — Toolchain Gate @@
+14. `ui_adapter` compile-check: no screen module imports `ftui_*` directly (lint-enforced)
```
3. **Correct search mode behavior and replace sleep-based debounce with cancelable scheduling**
Analysis: Current plan hardcodes `"hybrid"` in `execute_search`, so mode switching is UI-only and incorrect. Also, spawning sleeping tasks per keypress is wasteful under fast typing. Make mode a first-class query parameter and debounce via one cancelable scheduled event per input domain.
```diff
@@ 4.4 maybe_debounced_query @@
-std::thread::sleep(std::time::Duration::from_millis(200));
-match crate::tui::action::execute_search(&conn, &query, &filters) {
+// no thread sleep; schedule SearchRequestStarted after 200ms via debounce scheduler
+match crate::tui::action::execute_search(&conn, &query, &filters, mode) {
@@ 10.11 Action Module — Query Bridge @@
-pub fn execute_search(conn: &Connection, query: &str, filters: &SearchCliFilters) -> Result<SearchResponse, LoreError> {
- let mode_str = "hybrid"; // default; TUI mode selector overrides
+pub fn execute_search(
+ conn: &Connection,
+ query: &str,
+ filters: &SearchCliFilters,
+ mode: SearchMode,
+) -> Result<SearchResponse, LoreError> {
+ let mode_str = match mode {
+ SearchMode::Hybrid => "hybrid",
+ SearchMode::Lexical => "lexical",
+ SearchMode::Semantic => "semantic",
+ };
@@ 9.3 Phase 0 — Toolchain Gate @@
+15. Search mode parity: lexical/hybrid/semantic each return mode-consistent top-N IDs on fixture
```
4. **Guarantee consistent multi-query reads and add query interruption for responsiveness**
Analysis: Detail screens combine multiple queries that can observe mixed states during sync writes. Wrap each detail fetch in a single read transaction for snapshot consistency. Add cancellation/interrupt checks for long-running queries so UI remains responsive under heavy datasets.
```diff
@@ 4.5 Async Action System @@
+All detail fetches (`issue_detail`, `mr_detail`, timeline expansion) run inside one read transaction
+to guarantee snapshot consistency across subqueries.
@@ 10.11 Action Module — Query Bridge @@
+pub fn with_read_snapshot<T>(
+ conn: &Connection,
+ f: impl FnOnce(&rusqlite::Transaction<'_>) -> Result<T, LoreError>,
+) -> Result<T, LoreError> { ... }
+// Long queries register interrupt checks tied to CancelToken
+// to avoid >1s uninterruptible stalls during rapid navigation/filtering.
```
5. **Formalize sync event streaming contract to prevent “stuck” states**
Analysis: Dropping events on backpressure is acceptable, but completion must never be dropped and event ordering must be explicit. Add a typed `SyncUiEvent` stream with guaranteed terminal sentinel and progress coalescing to reduce load while preserving correctness.
```diff
@@ 4.4 start_sync_task @@
-let (tx, rx) = std::sync::mpsc::sync_channel::<Msg>(1024);
+let (tx, rx) = std::sync::mpsc::sync_channel::<SyncUiEvent>(2048);
-// drop this progress update rather than blocking the sync thread
+// coalesce progress to max 30Hz per lane; never drop terminal events
+// always emit SyncUiEvent::StreamClosed { outcome }
@@ 5.9 Sync @@
-- Log viewer with streaming output
+- Log viewer with streaming output and explicit stream-finalization state
+- UI shows dropped/coalesced event counters for transparency
```
6. **Version and validate session restore payloads**
Analysis: A raw JSON session file without schema/version checks is fragile across releases and DB switches. Add schema version, DB fingerprint, and safe fallback rules so session restore never blocks startup or applies stale state incorrectly.
```diff
@@ 11. Assumptions @@
-12. Minimal TUI state file allowed for session restore only ...
+12. Versioned TUI state file allowed for session restore only:
+ fields include `schema_version`, `app_version`, `db_fingerprint`, `saved_at`, `state`.
@@ 10.1 New Files @@
crates/lore-tui/src/session.rs # Lightweight session state persistence
+ # + versioning, validation, corruption quarantine
@@ 4.1 Module Structure @@
session.rs # Lightweight session state persistence
+ # corrupted file -> `.bad-<timestamp>` and fresh start
```
7. **Harden terminal safety beyond ANSI stripping**
Analysis: ANSI stripping is necessary but not sufficient. Bidi controls and invisible Unicode controls can still spoof displayed content. URL checks should normalize host/port and disallow deceptive variants. This closes realistic terminal spoofing vectors.
```diff
@@ 3.1 Risk Matrix @@
| Terminal escape/control-sequence injection via issue/note text | High | Medium | Strip ANSI/OSC/control chars via sanitize_for_terminal() ... |
+| Bidi/invisible Unicode spoofing in rendered text | High | Medium | Strip bidi overrides + zero-width controls in untrusted text |
@@ 10.4.1 Terminal Safety — Untrusted Text Sanitization @@
-Strip ANSI escape sequences, OSC commands, and control characters
+Strip ANSI/OSC/control chars, bidi overrides (RLO/LRO/PDF/RLI/LRI/FSI/PDI),
+and zero-width/invisible controls from untrusted text
-pub fn is_safe_url(url: &str, allowed_hosts: &[String]) -> bool {
+pub fn is_safe_url(url: &str, allowed_origins: &[Origin]) -> bool {
+ // normalize host (IDNA), enforce scheme+host+port match
```
8. **Use progressive hydration for detail screens**
Analysis: Issue/MR detail first-paint can become slow when discussions are large. Split fetch into phases: metadata first, then discussions/file changes, then deep thread content on expand. This improves perceived performance and keeps navigation snappy on large repos.
```diff
@@ 5.3 Issue Detail @@
-Data source: `lore issues <iid>` + discussions + cross-references
+Data source (progressive):
+1) metadata/header (first paint)
+2) discussions summary + cross-refs
+3) full thread bodies loaded on demand when expanded
@@ 5.5 MR Detail @@
-Unique features: File changes list, Diff discussions ...
+Unique features (progressive hydration):
+- file change summary in first paint
+- diff discussion bodies loaded lazily per expanded thread
@@ 9.3 Phase 0 — Toolchain Gate @@
+16. Detail first-paint p95 < 75ms on M-tier fixtures (metadata-only phase)
```
9. **Make reliability tests reproducible with deterministic clocks/seeds**
Analysis: Relative-time rendering and fuzz tests are currently tied to wall clock/randomness, which makes CI flakes hard to diagnose. Introduce a `Clock` abstraction and deterministic fuzz seeds with failure replay output.
```diff
@@ 10.9.1 Non-Snapshot Tests @@
+/// All time-based rendering uses injected `Clock` in tests.
+/// Fuzz failures print deterministic seed for replay.
@@ 9.2 Phase 5.5 — Reliability Test Pack @@
-Event fuzz tests (key/resize/paste):p55g
+Event fuzz tests (key/resize/paste, deterministic seed replay):p55g
+Deterministic clock/render tests:p55i
```
10. **Add an “Actionable Insights” dashboard panel for stronger day-to-day utility**
Analysis: Current dashboard is informative, but not prioritizing. Adding ranked insights (stale P1s, blocked MRs, discussion hotspots) turns it into a decision surface, not just a metrics screen. This makes the TUI materially more compelling for triage workflows.
```diff
@@ 1. Executive Summary @@
- Dashboard — sync status, project health, counts at a glance
+- Dashboard — sync status, project health, counts, and ranked actionable insights
@@ 5.1 Dashboard (Home Screen) @@
-│ Recent Activity │
+│ Recent Activity │
+│ Actionable Insights │
+│ 1) 7 opened P1 issues >14d │
+│ 2) 3 MRs blocked by unresolved │
+│ 3) auth/ has +42% note velocity │
@@ 6. User Flows @@
+### 6.9 Flow: "Risk-first morning sweep"
+Dashboard -> select insight -> jump to pre-filtered list/detail
```
These 10 changes stay clear of your `Rejected Recommendations` list and materially improve correctness, operability, and product value without adding speculative architecture.

View File

@@ -0,0 +1,150 @@
Your plan is strong and unusually detailed. The biggest upgrades Id make are around build isolation, async correctness, terminal correctness, and turning existing data into sharper triage workflows.
## 1) Fix toolchain isolation so stable builds cannot accidentally pull nightly
Rationale: a `rust-toolchain.toml` inside `crates/lore-tui` is not a complete guard when running workspace commands from repo root. You should structurally prevent stable workflows from touching nightly-only code.
```diff
@@ 3.2 Nightly Rust Strategy
-[workspace]
-members = [".", "crates/lore-tui"]
+[workspace]
+members = ["."]
+exclude = ["crates/lore-tui"]
+`crates/lore-tui` is built as an isolated workspace/package with explicit toolchain invocation:
+ cargo +nightly-2026-02-08 check --manifest-path crates/lore-tui/Cargo.toml
+Core repo remains:
+ cargo +stable check --workspace
```
## 2) Add an explicit `lore` <-> `lore-tui` compatibility contract
Rationale: runtime delegation is correct, but version drift between binaries will become the #1 support failure mode. Add a handshake before launch.
```diff
@@ 10.19 CLI Integration — Adding `lore tui`
+Before spawning `lore-tui`, `lore` runs:
+ lore-tui --print-contract-json
+and validates:
+ - minimum_core_version
+ - supported_db_schema_range
+ - contract_version
+On mismatch, print actionable remediation:
+ cargo install --path crates/lore-tui
```
## 3) Make TaskSupervisor truly authoritative (remove split async paths)
Rationale: the document says supervisor is the only path, but examples still use direct `Cmd::task` and `search_request_id`. Close that contradiction now to avoid stale-data races.
```diff
@@ 4.4 App — Implementing the Model Trait
- search_request_id: u64,
+ task_supervisor: TaskSupervisor,
@@ 4.5.1 Task Supervisor
-The `search_request_id` field in `LoreApp` is superseded...
+`search_request_id` is removed. All async work uses TaskSupervisor generations.
+No direct `Cmd::task` from screen handlers or ad-hoc helpers.
```
## 4) Resolve keybinding conflicts and implement real go-prefix timeout
Rationale: `Ctrl+I` collides with `Tab` in terminals. Also your 500ms go-prefix timeout is described but not enforced in code.
```diff
@@ 8.1 Global (Available Everywhere)
-| `Ctrl+I` | Jump forward in jump list (entity hops) |
+| `Alt+o` | Jump forward in jump list (entity hops) |
@@ 8.2 Keybinding precedence
+Go-prefix timeout is enforced by timestamped state + tick check.
+Backspace global-back behavior is implemented (currently documented but not wired).
```
## 5) Add a shared display-width text utility (Unicode-safe truncation and alignment)
Rationale: current `truncate()` implementations use byte/char length and will misalign CJK/emoji/full-width text in tables and trees.
```diff
@@ 10.1 New Files
+crates/lore-tui/src/text_width.rs # grapheme-safe truncation + display width helpers
@@ 10.5 Dashboard View / 10.13 Issue List / 10.16 Who View
-fn truncate(s: &str, max: usize) -> String { ... }
+use crate::text_width::truncate_display_width;
+// all column fitting/truncation uses terminal display width, not bytes/chars
```
## 6) Upgrade sync streaming to a QoS event bus with sequence IDs
Rationale: today progress/log events can be dropped under load with weak observability. Keep UI responsive while guaranteeing completion semantics and visible gap accounting.
```diff
@@ 4.4 start_sync_task()
-let (tx, rx) = std::sync::mpsc::sync_channel::<SyncUiEvent>(2048);
+let (ctrl_tx, ctrl_rx) = std::sync::mpsc::sync_channel::<SyncCtrlEvent>(256); // never-drop
+let (data_tx, data_rx) = std::sync::mpsc::sync_channel::<SyncDataEvent>(4096); // coalescible
+Every streamed event carries seq_no.
+UI detects gaps and renders: "Dropped N log/progress events due to backpressure."
+Terminal events (started/completed/failed/cancelled) remain lossless.
```
## 7) Make list pagination truly keyset-driven in state, not just in prose
Rationale: plan text promises windowed keyset paging, but state examples still keep a single list without cursor model. Encode pagination state explicitly.
```diff
@@ 10.10 state/issue_list.rs
-pub items: Vec<IssueListRow>,
+pub window: Vec<IssueListRow>,
+pub next_cursor: Option<IssueCursor>,
+pub prev_cursor: Option<IssueCursor>,
+pub prefetch: Option<Vec<IssueListRow>>,
+pub window_size: usize, // default 200
@@ 5.2 Issue List
-Pagination: Windowed keyset pagination...
+Pagination: Keyset cursor model is first-class state with forward/back cursors and prefetch buffer.
```
## 8) Harden session restore with atomic persistence + integrity checksum
Rationale: versioning/quarantine is good, but you still need crash-safe write semantics and tamper/corruption detection to avoid random boot failures.
```diff
@@ 10.1 New Files
-crates/lore-tui/src/session.rs # Versioned session state persistence + validation + corruption quarantine
+crates/lore-tui/src/session.rs # + atomic write (tmp->fsync->rename), checksum, max-size guard
@@ 11. Assumptions
+Session writes are atomic and checksummed.
+Invalid checksum or oversized file triggers quarantine and fresh boot.
```
## 9) Evolve Doctor from read-only text into actionable remediation
Rationale: your CLI already returns machine-actionable `actions`. TUI should surface those as one-key fixes; this materially increases usefulness.
```diff
@@ 5.11 Doctor / Stats (Info Screens)
-Simple read-only views rendering the output...
+Doctor is interactive:
+ - shows health checks + severity
+ - exposes suggested `actions` from robot-mode errors
+ - Enter runs selected action command (with confirmation modal)
+Stats remains read-only.
```
## 10) Add a Dependency Lens to Issue/MR detail (high-value triage feature)
Rationale: you already have cross-refs + discussions + timeline. A compact dependency panel (blocked-by / blocks / unresolved threads) makes this data operational for prioritization.
```diff
@@ 5.3 Issue Detail
-│ ┌─ Cross-References ─────────────────────────────────────────┐ │
+│ ┌─ Dependency Lens ──────────────────────────────────────────┐ │
+│ │ Blocked by: #1198 (open, stale 9d) │ │
+│ │ Blocks: !458 (opened, 2 unresolved threads) │ │
+│ │ Risk: High (P1 + stale blocker + open MR discussion) │ │
+│ └────────────────────────────────────────────────────────────┘ │
@@ 9.2 Phases
+Dependency Lens (issue/mr detail, computed risk score) :p3e, after p2e, 1d
```
---
If you want, I can next produce a consolidated **“v2.1 patch”** of the PRD with all these edits merged into one coherent updated document structure.

View File

@@ -0,0 +1,264 @@
1. **Fix a critical contradiction in workspace/toolchain isolation**
Rationale: Section `3.2` says `crates/lore-tui` is excluded from the root workspace, but Section `9.1` currently adds it as a member. That inconsistency will cause broken CI/tooling behavior and confusion about whether stable-only workflows remain safe.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 9.1 Dependency Changes
-# Root Cargo.toml changes
-[workspace]
-members = [".", "crates/lore-tui"]
+# Root Cargo.toml changes
+[workspace]
+members = ["."]
+exclude = ["crates/lore-tui"]
@@
-# Add workspace member (no lore-tui dep, no tui feature)
+# Keep lore-tui EXCLUDED from root workspace (nightly isolation boundary)
@@ 9.3 Phase 0 — Toolchain Gate
-1. `cargo check --all-targets` passes on pinned nightly (TUI crate) and stable (core)
+1. `cargo +stable check --workspace --all-targets` passes for root workspace
+2. `cargo +nightly-2026-02-08 check --manifest-path crates/lore-tui/Cargo.toml --all-targets` passes
```
2. **Replace global loading spinner with per-screen stale-while-revalidate**
Rationale: A single `is_loading` flag causes full-screen flicker and blocked context during quick refreshes. Per-screen load states keep existing data visible while background refresh runs, improving perceived performance and usability.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 10.10 State Module — Complete
- pub is_loading: bool,
+ pub load_state: ScreenLoadStateMap,
@@
- pub fn set_loading(&mut self, loading: bool) {
- self.is_loading = loading;
- }
+ pub fn set_loading(&mut self, screen: ScreenId, state: LoadState) {
+ self.load_state.insert(screen, state);
+ }
+
+pub enum LoadState {
+ Idle,
+ LoadingInitial,
+ Refreshing, // stale data remains visible
+ Error(String),
+}
@@ 4.4 App — Implementing the Model Trait
- // Loading spinner overlay (while async data is fetching)
- if self.state.is_loading {
- crate::tui::view::common::render_loading(frame, body);
- } else {
- match self.navigation.current() { ... }
- }
+ // Always render screen; show lightweight refresh indicator when needed.
+ match self.navigation.current() { ... }
+ crate::tui::view::common::render_refresh_indicator_if_needed(
+ self.navigation.current(), &self.state.load_state, frame, body
+ );
```
3. **Make `TaskSupervisor` a real scheduler (not just token registry)**
Rationale: Current design declares priority lanes but still dispatches directly with `Cmd::task`, and debounce uses `thread::sleep` per keystroke (wastes worker threads). A bounded scheduler with queued tasks and timer-driven debounce will reduce contention and tail latency.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 4.5.1 Task Supervisor (Dedup + Cancellation + Priority)
-pub struct TaskSupervisor {
- active: HashMap<TaskKey, Arc<CancelToken>>,
- generation: AtomicU64,
-}
+pub struct TaskSupervisor {
+ active: HashMap<TaskKey, Arc<CancelToken>>,
+ generation: AtomicU64,
+ queue: BinaryHeap<ScheduledTask>,
+ inflight: HashMap<TaskPriority, usize>,
+ limits: TaskLaneLimits, // e.g. Input=4, Navigation=2, Background=1
+}
@@
-// 200ms debounce via cancelable scheduled event (not thread::sleep).
-Cmd::task(move || {
- std::thread::sleep(std::time::Duration::from_millis(200));
- ...
-})
+// Debounce via runtime timer message; no sleeping worker thread.
+self.state.search.debounce_deadline = Some(now + 200ms);
+Cmd::none()
@@ 4.4 update()
+Msg::Tick => {
+ if self.state.search.debounce_expired(now) {
+ return self.dispatch_supervised(TaskKey::Search, TaskPriority::Input, ...);
+ }
+ self.task_supervisor.dispatch_ready(now)
+}
```
4. **Add a sync run ledger for exact “new since sync” navigation**
Rationale: “Since last sync” based on timestamps is ambiguous with partial failures, retries, and clock drift. A lightweight `sync_runs` + `sync_deltas` ledger makes summary-mode drill-down exact and auditable without implementing full resumable checkpoints.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 5.9 Sync
-- `i` navigates to Issue List pre-filtered to "since last sync"
-- `m` navigates to MR List pre-filtered to "since last sync"
+- `i` navigates to Issue List pre-filtered to `sync_run_id=<last_run>`
+- `m` navigates to MR List pre-filtered to `sync_run_id=<last_run>`
+- Filters are driven by persisted `sync_deltas` rows (exact entity keys changed in run)
@@ 10.1 New Files
+src/core/migrations/00xx_add_sync_run_ledger.sql
@@ New migration (appendix)
+CREATE TABLE sync_runs (
+ id INTEGER PRIMARY KEY,
+ started_at_ms INTEGER NOT NULL,
+ completed_at_ms INTEGER,
+ status TEXT NOT NULL
+);
+CREATE TABLE sync_deltas (
+ sync_run_id INTEGER NOT NULL,
+ entity_kind TEXT NOT NULL,
+ project_id INTEGER NOT NULL,
+ iid INTEGER NOT NULL,
+ change_kind TEXT NOT NULL
+);
+CREATE INDEX idx_sync_deltas_run_kind ON sync_deltas(sync_run_id, entity_kind);
@@ 11 Assumptions
-16. No new SQLite tables needed for v1
+16. Two small v1 tables are added: `sync_runs` and `sync_deltas` for deterministic post-sync UX.
```
5. **Expand the GA index set to match actual filter surface**
Rationale: Current required indexes only cover default sort paths; they do not match common filters like `author`, `assignee`, `reviewer`, `target_branch`, label-based filtering. This will likely miss p95 SLOs at M tier.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 9.3.1 Required Indexes (GA Blocker)
CREATE INDEX IF NOT EXISTS idx_issues_list_default
ON issues(project_id, state, updated_at DESC, iid DESC);
+CREATE INDEX IF NOT EXISTS idx_issues_author_updated
+ ON issues(project_id, state, author_username, updated_at DESC, iid DESC);
+CREATE INDEX IF NOT EXISTS idx_issues_assignee_updated
+ ON issues(project_id, state, assignee_username, updated_at DESC, iid DESC);
@@
CREATE INDEX IF NOT EXISTS idx_mrs_list_default
ON merge_requests(project_id, state, updated_at DESC, iid DESC);
+CREATE INDEX IF NOT EXISTS idx_mrs_reviewer_updated
+ ON merge_requests(project_id, state, reviewer_username, updated_at DESC, iid DESC);
+CREATE INDEX IF NOT EXISTS idx_mrs_target_updated
+ ON merge_requests(project_id, state, target_branch, updated_at DESC, iid DESC);
+CREATE INDEX IF NOT EXISTS idx_mrs_source_updated
+ ON merge_requests(project_id, state, source_branch, updated_at DESC, iid DESC);
@@
+-- If labels are normalized through join table:
+CREATE INDEX IF NOT EXISTS idx_issue_labels_label_issue ON issue_labels(label, issue_id);
+CREATE INDEX IF NOT EXISTS idx_mr_labels_label_mr ON mr_labels(label, mr_id);
@@ CI enforcement
-asserts that none show `SCAN TABLE` for the primary entity tables
+asserts that none show full scans for primary tables under default filters AND top 8 user-facing filter combinations
```
6. **Add DB schema compatibility preflight (separate from binary compat)**
Rationale: Binary compat (`--compat-version`) does not protect against schema mismatches. Add explicit schema version checks before booting the TUI to avoid runtime SQL errors deep in navigation paths.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 3.2 Nightly Rust Strategy
-- **Compatibility contract:** Before spawning `lore-tui`, the `lore tui` subcommand runs `lore-tui --compat-version` ...
+- **Compatibility contract:** Before spawning `lore-tui`, `lore tui` validates:
+ 1) binary compat version (`lore-tui --compat-version`)
+ 2) DB schema range (`lore-tui --check-schema <db-path>`)
+If schema is out-of-range, print remediation: `lore migrate`.
@@ 9.3 Phase 0 — Toolchain Gate
+17. Schema preflight test: incompatible DB schema yields actionable error and non-zero exit before entering TUI loop.
```
7. **Refine terminal sanitization to preserve legitimate Unicode while blocking control attacks**
Rationale: Current sanitizer strips zero-width joiners and similar characters, which breaks emoji/grapheme rendering and undermines your own `text_width` goals. Keep benign Unicode, remove only dangerous controls/bidi spoof vectors, and sanitize markdown link targets too.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 10.4.1 Terminal Safety — Untrusted Text Sanitization
-- Strip bidi overrides ... and zero-width/invisible controls ...
+- Strip ANSI/OSC/control chars and bidi spoof controls.
+- Preserve legitimate grapheme-joining characters (ZWJ/ZWNJ/combining marks) for correct Unicode rendering.
+- Sanitize markdown link targets with strict URL allowlist before rendering clickable links.
@@ safety.rs
- // Strip zero-width and invisible controls
- '\u{200B}' | '\u{200C}' | '\u{200D}' | '\u{FEFF}' | '\u{00AD}' => {}
+ // Preserve grapheme/emoji join behavior; remove only harmful controls.
+ // (ZWJ/ZWNJ/combining marks are retained)
@@ Enforcement rule
- Search result snippets
- Author names and labels
+- Markdown link destinations (scheme + origin validation before render/open)
```
8. **Add key normalization layer for terminal portability**
Rationale: Collision notes are good, but you still need a canonicalization layer because terminals emit different sequences for Alt/Meta/Backspace/Enter variants. This reduces “works in iTerm, broken in tmux/SSH” bugs.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 8.2 List Screens
**Terminal keybinding safety notes:**
@@
- `Ctrl+M` is NOT used — it collides with `Enter` ...
+
+**Key normalization layer (new):**
+- Introduce `KeyNormalizer` before `interpret_key()`:
+ - normalize Backspace variants (`^H`, `DEL`)
+ - normalize Alt/Meta prefixes
+ - normalize Shift+Tab vs Tab where terminal supports it
+ - normalize kitty/CSI-u enhanced key protocols when present
@@ 9.2 Phases
+ Key normalization integration tests :p5d, after p5c, 1d
+ Terminal profile replay tests :p5e, after p5d, 1d
```
9. **Add deterministic event-trace capture for crash reproduction**
Rationale: Panic logs without recent event context are often insufficient for TUI race bugs. Persist last-N normalized events + active screen + task state snapshot on panic for one-command repro.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 3.1 Risk Matrix
| Runtime panic leaves user blocked | High | Medium | Panic hook writes crash report, restores terminal, offers fallback CLI command |
+| Hard-to-reproduce input race bugs | Medium | Medium | Persist last 2k normalized events + state hash on panic for deterministic replay |
@@ 10.3 Entry Point / panic hook
- // 2. Write crash dump
+ // 2. Write crash dump + event trace snapshot
+ // Includes: last 2000 normalized events, current screen, in-flight task keys/generations
@@ 10.9.1 Non-Snapshot Tests
+/// Replay captured event trace from panic artifact and assert no panic.
+#[test]
+fn replay_trace_artifact_is_stable() { ... }
```
10. **Do a plan-wide consistency pass on pseudocode contracts**
Rationale: There are internal mismatches that will create implementation churn (`search_request_id` still referenced after replacement, `items` vs `window`, keybinding mismatch `Ctrl+I` vs `Alt+o`). Tightening these now saves real engineering time later.
```diff
--- a/PRD.md
+++ b/PRD.md
@@ 4.4 LoreApp::new
- search_request_id: 0,
+ // dedup generation handled by TaskSupervisor
@@ 8.1 Global
-| `Ctrl+O` | Jump backward in jump list (entity hops) |
-| `Alt+o` | Jump forward in jump list (entity hops) |
+| `Ctrl+O` | Jump backward in jump list (entity hops) |
+| `Alt+o` | Jump forward in jump list (entity hops) |
@@ 10.10 IssueListState
- pub fn selected_item(&self) -> Option<&IssueListRow> {
- self.items.get(self.selected_index)
- }
+ pub fn selected_item(&self) -> Option<&IssueListRow> {
+ self.window.get(self.selected_index)
+ }
```
If you want, I can now produce a single consolidated unified diff patch of the full PRD with these revisions merged end-to-end.

View File

@@ -0,0 +1,211 @@
Below are the strongest revisions Id make. I intentionally avoided anything in your `## Rejected Recommendations`.
1. **Unify commands/keybindings/help/palette into one registry**
Rationale: your plan currently duplicates action definitions across `execute_palette_action`, `ShowCliEquivalent`, help overlay text, and status hints. That will drift quickly and create correctness bugs. A single `CommandRegistry` makes behavior consistent and testable.
```diff
diff --git a/PRD.md b/PRD.md
@@ 4.1 Module Structure
+ commands.rs # Single source of truth for actions, keybindings, CLI equivalents
@@ 4.4 App — Implementing the Model Trait
- fn execute_palette_action(&self, action_id: &str) -> Cmd<Msg> { ... big match ... }
+ fn execute_palette_action(&self, action_id: &str) -> Cmd<Msg> {
+ if let Some(spec) = self.commands.get(action_id) {
+ return self.update(spec.to_msg(self.navigation.current()));
+ }
+ Cmd::none()
+ }
@@ 8. Keybinding Reference
+All keybinding/help/status/palette definitions are generated from `commands.rs`.
+No hardcoded duplicate maps in view/state modules.
```
2. **Replace ad-hoc key flags with explicit input state machine**
Rationale: `pending_go` + `go_prefix_instant` is fragile and already inconsistent with documented behavior. A typed `InputMode` removes edge-case bugs and makes prefix timeout deterministic.
```diff
diff --git a/PRD.md b/PRD.md
@@ 4.4 LoreApp struct
- pending_go: bool,
- go_prefix_instant: Option<std::time::Instant>,
+ input_mode: InputMode, // Normal | Text | Palette | GoPrefix { started_at }
@@ 8.2 List Screens
-| `g` `g` | Jump to top |
+| `g` `g` | Jump to top (current list screen) |
@@ 4.4 interpret_key
- KeyCode::Char('g') => Msg::IssueListScrollToTop
+ KeyCode::Char('g') => Msg::ScrollToTopCurrentScreen
```
3. **Fix TaskSupervisor contract and message schema drift**
Rationale: the plan mixes `request_id` and `generation`, and `TaskKey::Search { generation }` defeats dedup by making every key unique. This can silently reintroduce stale-result races.
```diff
diff --git a/PRD.md b/PRD.md
@@ 4.3 Core Types (Msg)
- SearchRequestStarted { request_id: u64, query: String },
- SearchExecuted { request_id: u64, results: SearchResults },
+ SearchRequestStarted { generation: u64, query: String },
+ SearchExecuted { generation: u64, results: SearchResults },
@@ 4.5.1 Task Supervisor
- Search { generation: u64 },
+ Search,
+ struct TaskStamp { key: TaskKey, generation: u64 }
@@ 10.9.1 Non-Snapshot Tests
- Msg::SearchExecuted { request_id: 3, ... }
+ Msg::SearchExecuted { generation: 3, ... }
```
4. **Add a `Clock` boundary everywhere time is computed**
Rationale: you call `SystemTime::now()` in many query/render paths, causing inconsistent relative-time labels inside one frame and flaky tests. Injected clock gives deterministic rendering and lower per-frame overhead.
```diff
diff --git a/PRD.md b/PRD.md
@@ 4.1 Module Structure
+ clock.rs # Clock trait: SystemClock/FakeClock
@@ 4.4 LoreApp struct
+ clock: Arc<dyn Clock>,
@@ 10.11 action.rs
- let now_ms = std::time::SystemTime::now()...
+ let now_ms = clock.now_ms();
@@ 9.3 Phase 0 success criteria
+19. Relative-time rendering deterministic under FakeClock across snapshot runs.
```
5. **Upgrade text truncation to grapheme-safe width handling**
Rationale: `unicode-width` alone is not enough for safe truncation; it can split grapheme clusters (emoji ZWJ sequences, skin tones, flags). You need width + grapheme segmentation together.
```diff
diff --git a/PRD.md b/PRD.md
@@ 10.1 New Files
-crates/lore-tui/src/text_width.rs # ... using unicode-width crate
+crates/lore-tui/src/text_width.rs # Grapheme-safe width/truncation using unicode-width + unicode-segmentation
@@ 10.1 New Files
+Cargo.toml (lore-tui): unicode-segmentation = "1"
@@ 9.3 Phase 0 success criteria
+20. Unicode rendering tests pass for CJK, emoji ZWJ, combining marks, RTL text.
```
6. **Redact sensitive values in logs and crash dumps**
Rationale: current crash/log strategy risks storing tokens/credentials in plain text. This is a serious operational/security gap for local tooling too.
```diff
diff --git a/PRD.md b/PRD.md
@@ 4.1 Module Structure
safety.rs # sanitize_for_terminal(), safe_url_policy()
+ redact.rs # redact_sensitive() for logs/crash reports
@@ 10.3 install_panic_hook_for_tui
- let _ = std::fs::write(&crash_path, format!("{panic_info:#?}"));
+ let report = redact_sensitive(format!("{panic_info:#?}"));
+ let _ = std::fs::write(&crash_path, report);
@@ 9.3 Phase 0 success criteria
+21. Redaction tests confirm tokens/Authorization headers never appear in persisted crash/log artifacts.
```
7. **Add search capability detection and mode fallback UX**
Rationale: semantic/hybrid mode should not silently degrade when embeddings are absent/stale. Explicit capability state increases trust and avoids “why are results weird?” confusion.
```diff
diff --git a/PRD.md b/PRD.md
@@ 5.6 Search
+Capability-aware modes:
+- If embeddings unavailable/stale, semantic mode is disabled with inline reason.
+- Hybrid mode auto-falls back to lexical and shows badge: "semantic unavailable".
@@ 4.3 Core Types
+ SearchCapabilitiesLoaded(SearchCapabilities)
@@ 9.3 Phase 0 success criteria
+22. Mode availability checks validated: lexical/hybrid/semantic correctly enabled/disabled by fixture capabilities.
```
8. **Define sync cancel latency SLO and enforce fine-grained checks**
Rationale: “check cancel between phases” is too coarse on big projects. Users need fast cancel acknowledgment and bounded stop time.
```diff
diff --git a/PRD.md b/PRD.md
@@ 5.9 Sync
-CANCELLATION: checked between sync phases
+CANCELLATION: checked at page boundaries, batch upsert boundaries, and before each network request.
+UX target: cancel acknowledged <250ms, sync stop p95 <2s after Esc.
@@ 9.3 Phase 0 success criteria
+23. Cancel latency test passes: p95 stop time <2s under M-tier fixtures.
```
9. **Add a “Hotspots” screen for risk/churn triage**
Rationale: this is high-value and uses existing data (events, unresolved discussions, stale items). It makes the TUI more compelling without needing new sync tables or rejected features.
```diff
diff --git a/PRD.md b/PRD.md
@@ 1. Executive Summary
+- **Hotspots** — file/path risk ranking by churn × unresolved discussion pressure × staleness
@@ 5. Screen Taxonomy
+### 5.12 Hotspots
+Shows top risky paths with drill-down to related issues/MRs/timeline.
@@ 8.1 Global
+| `gx` | Go to Hotspots |
@@ 10.1 New Files
+crates/lore-tui/src/state/hotspots.rs
+crates/lore-tui/src/view/hotspots.rs
```
10. **Add degraded startup mode when compat/schema checks fail**
Rationale: hard-exit on mismatch blocks users. A degraded mode that shells to `lore --robot` for read-only summary/doctor keeps the product usable and gives guided recovery.
```diff
diff --git a/PRD.md b/PRD.md
@@ 3.2 Nightly Rust Strategy
- On mismatch: actionable error and exit
+ On mismatch: actionable error with `--degraded` option.
+ `--degraded` launches limited TUI (Dashboard/Doctor/Stats via `lore --robot` subprocess calls).
@@ 10.3 TuiCli
+ /// Allow limited mode when schema/compat checks fail
+ #[arg(long)]
+ degraded: bool,
```
11. **Harden query-plan CI checks (dont rely on `SCAN TABLE` string matching)**
Rationale: SQLite planner text varies by version. Parse opcode structure and assert index usage semantically; otherwise CI will be flaky or miss regressions.
```diff
diff --git a/PRD.md b/PRD.md
@@ 9.3.1 Required Indexes (CI enforcement)
- asserts that none show `SCAN TABLE`
+ parses EXPLAIN QUERY PLAN rows and asserts:
+ - top-level loop uses expected index families
+ - no full scan on primary entity tables under default and top filter combos
+ - join order remains bounded (no accidental cartesian expansions)
```
12. **Enforce single-instance lock for session/state safety**
Rationale: assumption says no concurrent TUI sessions, but accidental double-launch will still happen. Locking prevents state corruption and confusing interleaved sync actions.
```diff
diff --git a/PRD.md b/PRD.md
@@ 10.1 New Files
+crates/lore-tui/src/instance_lock.rs # lock file with stale-lock recovery
@@ 11. Assumptions
-21. No concurrent TUI sessions.
+21. Concurrent sessions unsupported and actively prevented by instance lock (with clear error message).
```
If you want, I can turn this into a consolidated patched PRD (single unified diff) next.

View File

@@ -0,0 +1,198 @@
I reviewed the full PRD end-to-end and avoided all items already listed in `## Rejected Recommendations`.
These are the highest-impact revisions Id make.
1. **Fix keybinding/state-machine correctness gaps (critical)**
The plan currently has an internal conflict: the doc says jump-forward is `Alt+o`, but code sample uses `Ctrl+i` (which collides with `Tab` in many terminals). Also, `g`-prefix timeout depends on `Tick`, but `Tick` isnt guaranteed when idle, so prefix mode can get “stuck.” This is a correctness bug, not polish.
```diff
@@ 8.1 Global (Available Everywhere)
-| `Ctrl+O` | Jump backward in jump list (entity hops) |
-| `Alt+o` | Jump forward in jump list (entity hops) |
+| `Ctrl+O` | Jump backward in jump list (entity hops) |
+| `Alt+o` | Jump forward in jump list (entity hops) |
+| `Backspace` | Go back (when no text input is focused) |
@@ 4.4 LoreApp::interpret_key
- (KeyCode::Char('i'), m) if m.contains(Modifiers::CTRL) => {
- return Some(Msg::JumpForward);
- }
+ (KeyCode::Char('o'), m) if m.contains(Modifiers::ALT) => {
+ return Some(Msg::JumpForward);
+ }
+ (KeyCode::Backspace, Modifiers::NONE) => {
+ return Some(Msg::GoBack);
+ }
@@ 4.4 Model::subscriptions
+ // Go-prefix timeout enforcement must tick even when nothing is loading.
+ if matches!(self.input_mode, InputMode::GoPrefix { .. }) {
+ subs.push(Box::new(
+ Every::with_id(2, Duration::from_millis(50), || Msg::Tick)
+ ));
+ }
```
2. **Make `TaskSupervisor` API internally consistent and enforceable**
The plan uses `submit()`/`is_current()` in one place and `register()`/`next_generation()` in another. That inconsistency will cause implementation drift and stale-result bugs. Use one coherent API with a returned handle containing `{key, generation, cancel_token}`.
```diff
@@ 4.5.1 Task Supervisor (Dedup + Cancellation + Priority)
-pub struct TaskSupervisor {
- active: HashMap<TaskKey, Arc<CancelToken>>,
- generation: AtomicU64,
-}
+pub struct TaskSupervisor {
+ active: HashMap<TaskKey, TaskHandle>,
+}
+
+pub struct TaskHandle {
+ pub key: TaskKey,
+ pub generation: u64,
+ pub cancel: Arc<CancelToken>,
+}
- pub fn register(&mut self, key: TaskKey) -> Arc<CancelToken>
- pub fn next_generation(&self) -> u64
+ pub fn submit(&mut self, key: TaskKey) -> TaskHandle
+ pub fn is_current(&self, key: &TaskKey, generation: u64) -> bool
+ pub fn complete(&mut self, key: &TaskKey, generation: u64)
```
3. **Replace thread-sleep debounce with runtime timer messages**
`std::thread::sleep(200ms)` inside task closures wastes pool threads under fast typing and reduces responsiveness under contention. Use timer-driven debounce messages and only fire the latest generation. This improves latency stability on large datasets.
```diff
@@ 4.3 Core Types (Msg enum)
+ SearchDebounceArmed { generation: u64, query: String },
+ SearchDebounceFired { generation: u64 },
@@ 4.4 maybe_debounced_query
- Cmd::task(move || {
- std::thread::sleep(std::time::Duration::from_millis(200));
- ...
- })
+ // Arm debounce only; runtime timer emits SearchDebounceFired.
+ Cmd::msg(Msg::SearchDebounceArmed { generation, query })
@@ 4.4 subscriptions()
+ if self.state.search.debounce_pending() {
+ subs.push(Box::new(
+ Every::with_id(3, Duration::from_millis(200), || Msg::SearchDebounceFired { generation: ... })
+ ));
+ }
```
4. **Harden `DbManager` API to avoid lock-poison panics and accidental long-held guards**
Returning raw `MutexGuard<Connection>` invites accidental lock scope expansion and `expect("lock poisoned")` panics. Move to closure-based access (`with_reader`, `with_writer`) returning `Result`, and use cached statements. This reduces deadlock risk and tail latency.
```diff
@@ 4.4 DbManager
- pub fn reader(&self) -> MutexGuard<'_, Connection> { ...expect("reader lock poisoned") }
- pub fn writer(&self) -> MutexGuard<'_, Connection> { ...expect("writer lock poisoned") }
+ pub fn with_reader<T>(&self, f: impl FnOnce(&Connection) -> Result<T, LoreError>) -> Result<T, LoreError>
+ pub fn with_writer<T>(&self, f: impl FnOnce(&Connection) -> Result<T, LoreError>) -> Result<T, LoreError>
@@ 10.11 action.rs
- let conn = db.reader();
- match fetch_issues(&conn, &filter) { ... }
+ match db.with_reader(|conn| fetch_issues(conn, &filter)) { ... }
+ // Query hot paths use prepare_cached() to reduce parse overhead.
```
5. **Add read-path entity cache (LRU) for repeated drill-in/out workflows**
Your core daily flow is Enter/Esc bouncing between list/detail. Without caching, identical detail payloads are re-queried repeatedly. A bounded LRU by `EntityKey` with invalidation on sync completion gives near-instant reopen behavior and reduces DB pressure.
```diff
@@ 4.1 Module Structure
+ entity_cache.rs # Bounded LRU cache for detail payloads
@@ app.rs LoreApp fields
+ entity_cache: EntityCache,
@@ load_screen(Screen::IssueDetail / MrDetail)
+ if let Some(cached) = self.entity_cache.get_issue(&key) {
+ return Cmd::msg(Msg::IssueDetailLoaded { key, detail: cached.clone() });
+ }
@@ Msg::IssueDetailLoaded / Msg::MrDetailLoaded handlers
+ self.entity_cache.put_issue(key.clone(), detail.clone());
@@ Msg::SyncCompleted
+ self.entity_cache.invalidate_all();
```
6. **Tighten sync-stream observability and drop semantics without adding heavy architecture**
You already handle backpressure, but operators need visibility when it happens. Track dropped-progress count and max queue depth in state and surface it in running/summary views. This keeps the current simple design while making reliability measurable.
```diff
@@ 4.3 Msg
+ SyncStreamStats { dropped_progress: u64, max_queue_depth: usize },
@@ 5.9 Sync (Running mode footer)
-| Esc cancel f full sync e embed after d dry-run l log level|
+| Esc cancel f full sync e embed after d dry-run l log level stats:drop=12 qmax=1847 |
@@ 9.3 Success criteria
+24. Sync stream stats are emitted and rendered; terminal events (completed/failed/cancelled) delivery is 100% under induced backpressure.
```
7. **Make crash reporting match the promised diagnostic value**
The PRD promises event replay context, but sample hook writes only panic text. Add explicit crash context capture (`last events`, `current screen`, `task handles`, `build id`, `db fingerprint`) and retention policy. This materially improves post-mortem debugging.
```diff
@@ 4.1 Module Structure
+ crash_context.rs # ring buffer of normalized events + task/screen snapshot
@@ 10.3 install_panic_hook_for_tui()
- let report = crate::redact::redact_sensitive(&format!("{panic_info:#?}"));
+ let ctx = crate::crash_context::snapshot();
+ let report = crate::redact::redact_sensitive(&format!("{panic_info:#?}\n{ctx:#?}"));
+ // Retention: keep latest 20 crash files, delete oldest metadata entries only.
```
8. **Add Search Facets panel for faster triage (high-value feature, low risk)**
Search is central, but right now filtering requires manual field edits. Add facet counts (`issues`, `MRs`, `discussions`, top labels/projects/authors) with one-key apply. This makes search more compelling and actionable without introducing schema changes.
```diff
@@ 5.6 Search
-- Layout: Split pane — results list (left) + preview (right)
+- Layout: Three-pane on wide terminals — results (left) + preview (center) + facets (right)
+**Facets panel:**
+- Entity type counts (issue/MR/discussion)
+- Top labels/projects/authors for current query
+- `1/2/3` quick-apply type facet; `l` cycles top label facet
@@ 8.2 List/Search keybindings
+| `1` `2` `3` | Apply facet: Issue / MR / Discussion |
+| `l` | Apply next top-label facet |
```
9. **Strengthen text sanitization for terminal edge cases**
Current sanitizer is strong, but still misses some control-space edge cases (C1 controls, directional marks beyond the listed bidi set). Add those and test them. This closes spoofing/render confusion gaps with minimal complexity.
```diff
@@ 10.4.1 sanitize_for_terminal()
+ // Strip C1 control block (U+0080..U+009F) and additional directional marks
+ c if ('\u{0080}'..='\u{009F}').contains(&c) => {}
+ '\u{200E}' | '\u{200F}' | '\u{061C}' => {} // LRM, RLM, ALM
@@ tests
+ #[test] fn strips_c1_controls() { ... }
+ #[test] fn strips_lrm_rlm_alm() { ... }
```
10. **Add an explicit vertical-slice gate before broad screen expansion**
The plan is comprehensive, but risk is still front-loaded on framework + runtime behavior. Insert a strict vertical slice gate (`Dashboard + IssueList + IssueDetail + Sync running`) with perf and stability thresholds before Phase 3 features. This reduces rework if foundational assumptions break.
```diff
@@ 9.2 Phases
+section Phase 2.5 — Vertical Slice Gate
+Dashboard + IssueList + IssueDetail + Sync (running) integrated :p25a, after p2c, 3d
+Gate: p95 nav latency < 75ms on M tier; zero stuck-input-state bugs; cancel p95 < 2s :p25b, after p25a, 1d
+Only then proceed to Search/Timeline/Who/Palette expansion.
```
If you want, I can produce a full consolidated `diff` block against the entire PRD text (single patch), but the above is the set Id prioritize first.

File diff suppressed because it is too large Load Diff

2075
plans/tui-prd.md Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,157 @@
**Top Revisions I Recommend**
1. **Fix auth semantics + a real inconsistency in your test plan**
Your ACs require graceful handling for `403`, but the test list says the “403” test returns `401`. That hides the exact behavior you care about and can let permission regressions slip through.
```diff
@@ AC-1: GraphQL Client (Unit)
- [ ] HTTP 401 → `LoreError::GitLabAuthFailed`
+ [ ] HTTP 401 → `LoreError::GitLabAuthFailed`
+ [ ] HTTP 403 → `LoreError::GitLabForbidden`
@@ AC-3: Status Fetcher (Integration)
- [ ] GraphQL 403 → returns `Ok(HashMap::new())` with warning log
+ [ ] GraphQL 403 (`GitLabForbidden`) → returns `Ok(HashMap::new())` with warning log
@@ TDD Plan (RED)
- 13. `test_fetch_statuses_403_graceful` — mock returns 401 → `Ok(HashMap::new())`
+ 13. `test_fetch_statuses_403_graceful` — mock returns 403 → `Ok(HashMap::new())`
```
2. **Make enrichment atomic and stale-safe**
Current plan can leave stale status values forever when a widget disappears or status becomes null. Make writes transactional and clear status fields for fetched scope before upserts.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] Enrichment DB writes are transactional per project (all-or-nothing)
+ [ ] Status fields are cleared for fetched issue scope before applying new statuses
+ [ ] If enrichment fails mid-project, prior persisted statuses are unchanged (rollback)
@@ File 6: `src/ingestion/orchestrator.rs`
- fn enrich_issue_statuses(...)
+ fn enrich_issue_statuses_txn(...)
+ // BEGIN TRANSACTION
+ // clear status columns for fetched issue scope
+ // apply updates
+ // COMMIT
```
3. **Add transient retry/backoff (429/5xx/network)**
Right now one transient failure loses status enrichment for that sync. Retrying with bounded backoff gives much better reliability at low cost.
```diff
@@ AC-1: GraphQL Client (Unit)
+ [ ] Retries 429/502/503/504/network errors with bounded exponential backoff + jitter (max 3 attempts)
+ [ ] Honors `Retry-After` on 429 before retrying
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] Cancellation signal is checked before each retry sleep and between paginated calls
```
4. **Stop full GraphQL scans when nothing changed**
Running full pagination on every sync will dominate runtime on large repos. Trigger enrichment only when issue ingestion reports changes, with a manual override.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Runs on every sync (not gated by `--full`)
+ [ ] Runs when issue ingestion changed at least one issue in the project
+ [ ] New override flag `--refresh-status` forces enrichment even with zero issue deltas
+ [ ] Optional periodic full refresh (e.g. every N syncs) to prevent long-tail drift
```
5. **Do not expose raw token via `client.token()`**
Architecturally cleaner and safer: keep token encapsulated and expose a GraphQL-ready client factory from `GitLabClient`.
```diff
@@ File 13: `src/gitlab/client.rs`
- pub fn token(&self) -> &str
+ pub fn graphql_client(&self) -> crate::gitlab::graphql::GraphqlClient
@@ File 6: `src/ingestion/orchestrator.rs`
- let graphql_client = GraphqlClient::new(&config.gitlab.base_url, client.token());
+ let graphql_client = client.graphql_client();
```
6. **Add indexes for new status filters**
`--status` on large tables will otherwise full-scan `issues`. Add compound indexes aligned with project-scoped list queries.
```diff
@@ AC-4: Migration 021 (Unit)
+ [ ] Adds index `idx_issues_project_status_name(project_id, status_name)`
+ [ ] Adds index `idx_issues_project_status_category(project_id, status_category)`
@@ File 14: `migrations/021_work_item_status.sql`
ALTER TABLE issues ADD COLUMN status_name TEXT;
ALTER TABLE issues ADD COLUMN status_category TEXT;
ALTER TABLE issues ADD COLUMN status_color TEXT;
ALTER TABLE issues ADD COLUMN status_icon_name TEXT;
+CREATE INDEX IF NOT EXISTS idx_issues_project_status_name
+ ON issues(project_id, status_name);
+CREATE INDEX IF NOT EXISTS idx_issues_project_status_category
+ ON issues(project_id, status_category);
```
7. **Improve filter UX: add category filter + case-insensitive status**
Case-sensitive exact name matches are brittle with custom lifecycle names. Category filter is stable and useful for automation.
```diff
@@ AC-9: List Issues Filter (E2E)
- [ ] Filter is case-sensitive (matches GitLab's exact status name)
+ [ ] `--status` uses case-insensitive exact match by default (`COLLATE NOCASE`)
+ [ ] New filter `--status-category` supports `triage|to_do|in_progress|done|canceled`
+ [ ] `--status-exact` enables strict case-sensitive behavior when needed
```
8. **Add capability probe/cache to avoid pointless calls**
Free tier / old GitLab versions will never return status widget. Cache that capability per project (with TTL) to reduce noise and wasted requests.
```diff
@@ GitLab API Constraints
+### Capability Probe
+On first sync per project, detect status-widget support and cache result for 24h.
+If unsupported, skip enrichment silently (debug log) until TTL expiry.
@@ AC-3: Status Fetcher (Integration)
+ [ ] Unsupported capability state bypasses GraphQL fetch and warning spam
```
9. **Use a nested robot `status` object instead of 4 top-level fields**
This is cleaner schema design and scales better as status metadata grows (IDs, lifecycle, timestamps, etc.).
```diff
@@ AC-7: Show Issue Display (Robot)
- [ ] JSON includes `status_name`, `status_category`, `status_color`, `status_icon_name` fields
- [ ] Fields are `null` (not absent) when status not available
+ [ ] JSON includes `status` object:
+ `{ "name": "...", "category": "...", "color": "...", "icon_name": "..." }` or `null`
@@ AC-8: List Issues Display (Robot)
- [ ] JSON includes `status_name`, `status_category` fields on each issue
+ [ ] JSON includes `status` object (or `null`) on each issue
```
10. **Add one compelling feature: status analytics, not just status display**
Right now this is mostly a transport/display enhancement. Make it genuinely useful with “stale in-progress” detection and age-in-status filters.
```diff
@@ Acceptance Criteria
+### AC-11: Status Aging & Triage Value (E2E)
+- [ ] `lore list issues --status-category in_progress --stale-days 14` filters to stale work
+- [ ] Human table shows `Status Age` (days) when status exists
+- [ ] Robot output includes `status_age_days` (nullable integer)
```
11. **Harden test plan around failure modes youll actually hit**
The current tests are good, but miss rollback/staleness/retry behavior that drives real reliability.
```diff
@@ TDD Plan (RED) additions
+21. `test_enrich_clears_removed_status`
+22. `test_enrich_transaction_rolls_back_on_failure`
+23. `test_graphql_retry_429_then_success`
+24. `test_graphql_retry_503_then_success`
+25. `test_cancel_during_backoff_aborts_cleanly`
+26. `test_status_filter_query_uses_project_status_index` (EXPLAIN smoke test)
```
If you want, I can produce a fully revised v3 plan document end-to-end (frontmatter + reordered ACs + updated file list + updated TDD matrix) so it is ready to implement directly.

View File

@@ -0,0 +1,159 @@
Your plan is already strong, but Id revise it in 10 places to reduce risk at scale and make it materially more useful.
1. Shared transport + retries for GraphQL (must-have)
Reasoning: `REST` already has throttling/retry in `src/gitlab/client.rs`; your proposed GraphQL client would bypass that and can spike rate limits under concurrent project ingest (`src/cli/commands/ingest.rs`). Unifying transport prevents split behavior and cuts production incidents.
```diff
@@ AC-1: GraphQL Client (Unit)
- [ ] Network error → `LoreError::Other`
+ [ ] GraphQL requests use shared GitLab transport (same timeout, rate limiter, retry policy as REST)
+ [ ] Retries 429/502/503/504/network errors (max 3) with exponential backoff + jitter
+ [ ] 429 honors `Retry-After` before retrying
+ [ ] Exhausted network retries → `LoreError::GitLabNetworkError`
@@ Decisions
- 8. **No retry/backoff in v1** — DEFER.
+ 8. **Retry/backoff in v1** — YES (shared REST+GraphQL reliability policy).
@@ Implementation Detail
+ File 15: `src/gitlab/transport.rs` (NEW) — shared HTTP execution and retry/backoff policy.
```
2. Capability cache for unsupported projects (must-have)
Reasoning: Free tier / older GitLab will repeatedly emit warning noise every sync and waste calls. Cache support status per project and re-probe on TTL.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] On any GraphQL error: logs warning, continues to next project (never fails the sync)
+ [ ] Unsupported capability responses (missing endpoint/type/widget) are cached per project
+ [ ] While cached unsupported, enrichment is skipped without repeated warning spam
+ [ ] Capability cache auto-expires (default 24h) and is re-probed
@@ Migration Numbering
- This feature uses **migration 021**.
+ This feature uses **migrations 021-022**.
@@ Files Changed (Summary)
+ `migrations/022_project_capabilities.sql` | NEW — support cache table for project capabilities
```
3. Delta-first enrichment with periodic full reconcile (must-have)
Reasoning: Full GraphQL scan every sync is expensive for large projects. You already compute issue deltas in ingestion; use that as fast path and keep a periodic full sweep as safety net.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Runs on every sync (not gated by `--full`)
+ [ ] Fast path: skip status enrichment when issue ingestion upserted 0 issues for that project
+ [ ] Safety net: run full reconciliation every `status_full_reconcile_hours` (default 24)
+ [ ] `--full` always forces reconciliation
@@ AC-5: Config Toggle (Unit)
+ [ ] `SyncConfig` has `status_full_reconcile_hours: u32` (default 24)
```
4. Strongly typed widget parsing via `__typename` (must-have)
Reasoning: current “deserialize arbitrary widget JSON into `StatusWidget`” is fragile. Query/type by `__typename` for forward compatibility and fewer silent parse mistakes.
```diff
@@ AC-3: Status Fetcher (Integration)
- [ ] Extracts status from `widgets` array by matching `WorkItemWidgetStatus` fragment
+ [ ] Query includes `widgets { __typename ... }` and parser matches `__typename == "WorkItemWidgetStatus"`
+ [ ] Non-status widgets are ignored deterministically (no heuristic JSON-deserialize attempts)
@@ GraphQL Query
+ widgets {
+ __typename
+ ... on WorkItemWidgetStatus { ... }
+ }
```
5. Set-based transactional DB apply (must-have)
Reasoning: row-by-row clear/update loops will be slow on large projects and hold write locks longer. Temp-table + set-based SQL inside one txn is faster and easier to reason about rollback.
```diff
@@ AC-3: Status Fetcher (Integration)
- `all_fetched_iids: Vec<i64>`
+ `all_fetched_iids: HashSet<i64>`
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Before applying updates, NULL out status fields ... (loop per IID)
- [ ] UPDATE SQL: `SET status_name=?, ... WHERE project_id=? AND iid=?`
+ [ ] Use temp tables and set-based SQL in one transaction:
+ [ ] (1) clear stale statuses for fetched IIDs absent from status rows
+ [ ] (2) apply status values for fetched IIDs with status
+ [ ] One commit per project; rollback leaves prior state intact
```
6. Fix index strategy for `COLLATE NOCASE` + default sorting (must-have)
Reasoning: your proposed `(project_id, status_name)` index may not fully help `COLLATE NOCASE` + `ORDER BY updated_at`. Tune index to real query shape in `src/cli/commands/list.rs`.
```diff
@@ AC-4: Migration 021 (Unit)
- [ ] Adds compound index `idx_issues_project_status_name(project_id, status_name)` for `--status` filter performance
+ [ ] Adds covering NOCASE-aware index:
+ [ ] `idx_issues_project_status_name_nocase_updated(project_id, status_name COLLATE NOCASE, updated_at DESC)`
+ [ ] Adds category index:
+ [ ] `idx_issues_project_status_category_nocase(project_id, status_category COLLATE NOCASE)`
```
7. Add stable/automation-friendly filters now (high-value feature)
Reasoning: status names are user-customizable and renameable; category is more stable. Also add `--no-status` for quality checks and migration visibility.
```diff
@@ AC-9: List Issues Filter (E2E)
+ [ ] `lore list issues --status-category in_progress` filters by category (case-insensitive)
+ [ ] `lore list issues --no-status` returns only issues where `status_name IS NULL`
+ [ ] `--status` + `--status-category` combine with AND logic
@@ File 9: `src/cli/mod.rs`
+ Add flags: `--status-category`, `--no-status`
@@ File 11: `src/cli/autocorrect.rs`
+ Register `--status-category` and `--no-status` for `issues`
```
8. Better enrichment observability and failure accounting (must-have ops)
Reasoning: only tracking `statuses_enriched` hides skipped/cleared/errors, and auth failures become silent partial data quality issues. Add counters and explicit progress events.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] `IngestProjectResult` gains `statuses_enriched: usize` counter
- [ ] Progress event: `ProgressEvent::StatusEnrichmentComplete { enriched: usize }`
+ [ ] `IngestProjectResult` gains:
+ [ ] `statuses_enriched`, `statuses_cleared`, `status_enrichment_skipped`, `status_enrichment_failed`
+ [ ] Progress events:
+ [ ] `StatusEnrichmentStarted`, `StatusEnrichmentSkipped`, `StatusEnrichmentComplete`, `StatusEnrichmentFailed`
+ [ ] End-of-sync summary includes per-project enrichment outcome counts
```
9. Add `status_changed_at` for immediately useful workflow analytics (high-value feature)
Reasoning: without change timestamp, you cant answer “how long has this been in progress?” which is one of the most useful agent/human queries.
```diff
@@ AC-4: Migration 021 (Unit)
+ [ ] Adds nullable INTEGER column `status_changed_at` (ms epoch UTC)
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] If status_name/category changes, update `status_changed_at = now_ms()`
+ [ ] If status is cleared, set `status_changed_at = NULL`
@@ AC-9: List Issues Filter (E2E)
+ [ ] `lore list issues --stale-status-days N` filters by `status_changed_at <= now - N days`
```
10. Expand test matrix for real-world failure/perf paths (must-have)
Reasoning: current tests are good, but the highest-risk failures are retry behavior, capability caching, idempotency under repeated runs, and large-project performance.
```diff
@@ TDD Plan — RED Phase
+ 26. `test_graphql_retries_429_with_retry_after_then_succeeds`
+ 27. `test_graphql_retries_503_then_fails_after_max_attempts`
+ 28. `test_capability_cache_skips_unsupported_project_until_ttl_expiry`
+ 29. `test_delta_skip_when_no_issue_upserts`
+ 30. `test_periodic_full_reconcile_runs_after_threshold`
+ 31. `test_set_based_enrichment_scales_10k_issues_without_timeout`
+ 32. `test_enrichment_idempotent_across_two_runs`
+ 33. `test_status_changed_at_updates_only_on_actual_status_change`
```
If you want, I can now produce a single consolidated revised plan document (full rewritten Markdown) with these changes merged in-place so its ready to execute.

View File

@@ -0,0 +1,124 @@
Your plan is already strong and implementation-aware. The best upgrades are mostly about reliability under real-world API instability, large-scale performance, and making the feature more useful for automation.
1. Promote retry/backoff from deferred to in-scope now.
Reason: Right now, transient failures cause silent status gaps until a later sync. Bounded retries with jitter and a time budget dramatically improve successful enrichment without making syncs hang.
```diff
@@ AC-1: GraphQL Client (Unit) @@
- [ ] Network error → `LoreError::Other`
+ [ ] Transient failures (`429`, `502`, `503`, `504`, timeout, connect reset) retry with exponential backoff + jitter (max 3 attempts)
+ [ ] `Retry-After` supports both delta-seconds and HTTP-date formats
+ [ ] Per-request retry budget capped (e.g. 120s total) to preserve cancellation responsiveness
@@ AC-6: Enrichment in Orchestrator (Integration) @@
- [ ] On any GraphQL error: logs warning, continues to next project (never fails the sync)
+ [ ] On transient GraphQL errors: retry policy applied before warning/skip behavior
@@ Decisions @@
- 8. **No retry/backoff in v1** — DEFER.
+ 8. **Retry/backoff in v1** — YES. Required for reliable enrichment under normal GitLab/API turbulence.
```
2. Add a capability cache so unsupported projects stop paying repeated GraphQL cost.
Reason: Free tier / older instances will never return status widgets. Re-querying every sync is wasted time and noisy logs.
```diff
@@ Acceptance Criteria @@
+ ### AC-11: Capability Probe & Cache (Integration)
+ - [ ] Add `project_capabilities` cache with `supports_work_item_status`, `checked_at`, `cooldown_until`
+ - [ ] 404/403/known-unsupported responses update capability cache and suppress repeated warnings until TTL expires
+ - [ ] Supported projects still enrich every run (subject to normal schedule)
@@ Future Enhancements (Not in Scope) @@
- **Capability probe/cache**: Detect status-widget support per project ... (deferred)
+ (moved into scope as AC-11)
```
3. Make enrichment delta-aware with periodic forced reconciliation.
Reason: Full pagination every sync is expensive on large projects. You can skip unnecessary status fetches when no issue changes occurred, while still doing periodic safety sweeps.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration) @@
- [ ] Runs on every sync (not gated by `--full`)
+ [ ] Runs when issue ingestion reports project issue deltas OR reconcile window elapsed
+ [ ] New config: `status_reconcile_hours` (default: 24) for periodic full sweep
+ [ ] `--refresh-status` forces enrichment regardless of delta/reconcile window
```
4. Replace row-by-row update loops with set-based SQL via temp table.
Reason: Current per-IID loops are simple but slow at scale and hold locks longer. Set-based updates are much faster and reduce lock contention.
```diff
@@ File 6: `src/ingestion/orchestrator.rs` (MODIFY) @@
- for iid in all_fetched_iids { ... UPDATE issues ... }
- for (iid, status) in statuses { ... UPDATE issues ... }
+ CREATE TEMP TABLE temp_issue_status_updates(...)
+ bulk INSERT temp rows (iid, name, category, color, icon_name)
+ single set-based UPDATE for enriched rows
+ single set-based NULL-clear for fetched-without-status rows
+ commit transaction
```
5. Add strict mode and explicit partial-failure reporting.
Reason: “Warn and continue” is good default UX, but automation needs a fail-fast option and machine-readable failure output.
```diff
@@ AC-5: Config Toggle (Unit) @@
+ - [ ] `SyncConfig` adds `status_enrichment_strict: bool` (default false)
@@ AC-6: Enrichment in Orchestrator (Integration) @@
- [ ] On any GraphQL error: logs warning, continues to next project (never fails the sync)
+ [ ] Default mode: warn + continue
+ [ ] Strict mode: status enrichment error fails sync for that run
@@ AC-6: IngestProjectResult @@
+ - [ ] Adds `status_enrichment_error: Option<String>`
@@ AC-8 / Robot sync envelope @@
+ - [ ] Robot output includes `partial_failures` array with per-project enrichment failures
```
6. Fix case-insensitive matching robustness and track freshness.
Reason: SQLite `COLLATE NOCASE` is ASCII-centric; custom statuses may be non-ASCII. Also you need visibility into staleness.
```diff
@@ AC-4: Migration 021 (Unit) @@
- [ ] Migration adds 4 nullable TEXT columns to `issues`
+ [ ] Migration adds 6 columns:
+ `status_name`, `status_category`, `status_color`, `status_icon_name`,
+ `status_name_fold`, `status_synced_at`
- [ ] Adds compound index `idx_issues_project_status_name(project_id, status_name)`
+ [ ] Adds compound index `idx_issues_project_status_name_fold(project_id, status_name_fold)`
@@ AC-9: List Issues Filter (E2E) @@
- [ ] Filter uses case-insensitive matching (`COLLATE NOCASE`)
+ [ ] Filter uses `status_name_fold` (Unicode-safe fold normalization done at write time)
```
7. Expand filtering to category and missing-status workflows.
Reason: Name filters are useful, but automation is better on semantic categories and “missing data” detection.
```diff
@@ AC-9: List Issues Filter (E2E) @@
+ - [ ] `--status-category in_progress` filters by `status_category` (case-insensitive)
+ - [ ] `--no-status` returns only issues where `status_name IS NULL`
+ - [ ] `--status` and `--status-category` can be combined with AND logic
```
8. Change robot payload from flat status fields to a nested `status` object.
Reason: Better schema evolution and less top-level field sprawl as you add metadata (`synced_at`, future lifecycle fields).
```diff
@@ AC-7: Show Issue Display (E2E) @@
- [ ] JSON includes `status_name`, `status_category`, `status_color`, `status_icon_name` fields
- [ ] Fields are `null` (not absent) when status not available
+ [ ] JSON includes `status` object:
+ `{ "name", "category", "color", "icon_name", "synced_at" }`
+ [ ] `status: null` when not available
@@ AC-8: List Issues Display (E2E) @@
- [ ] `--fields` supports: `status_name`, `status_category`, `status_color`, `status_icon_name`
+ [ ] `--fields` supports: `status.name,status.category,status.color,status.icon_name,status.synced_at`
```
If you want, I can produce a fully rewritten “Iteration 5” plan document with these changes integrated end-to-end (ACs, files, migrations, TDD batches, and updated decisions/future-scope).

View File

@@ -0,0 +1,130 @@
Your iteration-5 plan is strong. The biggest remaining gaps are outcome ambiguity, cancellation safety, and long-term status identity. These are the revisions Id make.
1. **Make enrichment outcomes explicit (not “empty success”)**
Analysis:
Right now `404/403 -> Ok(empty)` is operationally ambiguous: “project has no statuses” vs “feature unavailable/auth issue.” Agents and dashboards need that distinction to make correct decisions.
This improves reliability and observability without making sync fail-hard.
```diff
@@ AC-3: Status Fetcher (Integration)
-- [ ] `fetch_issue_statuses()` returns `FetchStatusResult` containing:
+- [ ] `fetch_issue_statuses()` returns `FetchStatusOutcome`:
+ - `Fetched(FetchStatusResult)`
+ - `Unsupported { reason: UnsupportedReason }`
+ - `CancelledPartial(FetchStatusResult)`
@@
-- [ ] GraphQL 404 → returns `Ok(FetchStatusResult)` with empty collections + warning log
-- [ ] GraphQL 403 (`GitLabAuthFailed`) → returns `Ok(FetchStatusResult)` with empty collections + warning log
+- [ ] GraphQL 404 → `Unsupported { reason: GraphqlEndpointMissing }` + warning log
+- [ ] GraphQL 403 (`GitLabAuthFailed`) → `Unsupported { reason: AuthForbidden }` + warning log
@@ AC-10: Robot Sync Envelope (E2E)
-- [ ] `status_enrichment` object: `{ "enriched": N, "cleared": N, "error": null | "message" }`
+- [ ] `status_enrichment` object: `{ "mode": "fetched|unsupported|cancelled_partial", "reason": null|"...", "enriched": N, "cleared": N, "error": null|"message" }`
```
2. **Add cancellation and pagination loop safety**
Analysis:
Large projects can run long. Current flow checks cancellation only before enrichment starts; pagination and per-row update loops can ignore cancellation for too long. Also, GraphQL cursor bugs can create infinite loops (`hasNextPage=true` with unchanged cursor).
This is a robustness must-have.
```diff
@@ AC-3: Status Fetcher (Integration)
+ [ ] `fetch_issue_statuses()` accepts cancellation signal and checks it between page requests
+ [ ] Pagination guard: if `hasNextPage=true` but `endCursor` is `None` or unchanged, abort loop with warning and return partial outcome
+ [ ] Emits `pages_fetched` count for diagnostics
@@ File 1: `src/gitlab/graphql.rs`
-- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str) -> Result<FetchStatusResult>
+- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str, signal: &CancellationSignal) -> Result<FetchStatusOutcome>
```
3. **Persist stable `status_id` in addition to name**
Analysis:
`status_name` is display-oriented and mutable (rename/custom lifecycle changes). A stable status identifier is critical for durable automations, analytics, and future migrations.
This is a schema decision that is cheap now and expensive later if skipped.
```diff
@@ AC-2: Status Types (Unit)
-- [ ] `WorkItemStatus` struct has `name`, `category`, `color`, `icon_name`
+- [ ] `WorkItemStatus` struct has `id: String`, `name`, `category`, `color`, `icon_name`
@@ AC-4: Migration 021 (Unit)
-- [ ] Migration adds 5 nullable columns to `issues`: `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at`
+- [ ] Migration adds 6 nullable columns to `issues`: `status_id`, `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at`
+ [ ] Adds index `idx_issues_project_status_id(project_id, status_id)` for stable-machine filters
@@ GraphQL query
- status { name category color iconName }
+ status { id name category color iconName }
@@ AC-7 / AC-8 Robot
+ [ ] JSON includes `status_id` (null when unavailable)
```
4. **Handle GraphQL partial-data responses correctly**
Analysis:
GraphQL can return both `data` and `errors` in the same response. Current plan treats any `errors` as hard failure, which can discard valid data and reduce reliability.
Use partial-data semantics: keep data, log/report warnings.
```diff
@@ AC-1: GraphQL Client (Unit)
-- [ ] Error response: if top-level `errors` array is non-empty, returns `LoreError` with first error message
+- [ ] If `errors` non-empty and `data` missing: return `LoreError` with first error message
+- [ ] If `errors` non-empty and `data` present: return `data` + warning metadata (do not fail the whole fetch)
@@ TDD Plan (RED)
+ 33. `test_graphql_partial_data_with_errors_returns_data_and_warning`
```
5. **Extract status enrichment from orchestrator into dedicated module**
Analysis:
`orchestrator.rs` already has many phases. Putting status transport/parsing/transaction policy directly there increases coupling and test friction.
A dedicated module improves architecture clarity and makes future enhancements safer.
```diff
@@ Implementation Detail
+- File 15: `src/ingestion/enrichment/status.rs` (NEW)
+ - `run_status_enrichment(...)`
+ - `enrich_issue_statuses_txn(...)`
+ - outcome mapping + telemetry
@@ File 6: `src/ingestion/orchestrator.rs`
-- Inline Phase 1.5 logic + helper function
+- Delegates to `enrichment::status::run_status_enrichment(...)` and records returned stats
```
6. **Add status/state consistency checks**
Analysis:
GitLab states status categories and issue state should synchronize, but ingestion drift or API edge cases can violate this. Detecting mismatch is high-signal for data integrity issues.
This is compelling for agents because it catches “looks correct but isnt” problems.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] Enrichment computes `status_state_mismatches` count:
+ - `DONE|CANCELED` with `state=open` or `TO_DO|IN_PROGRESS|TRIAGE` with `state=closed`
+ [ ] Logs warning summary when mismatches > 0
@@ AC-10: Robot Sync Envelope (E2E)
+ [ ] `status_enrichment` includes `state_mismatches: N`
```
7. **Add explicit performance envelope acceptance criterion**
Analysis:
Plan claims large-project handling, but no hard validation target is defined. Add a bounded, reproducible performance criterion to prevent regressions.
This is especially important with pagination + per-row writes.
```diff
@@ Acceptance Criteria
+ ### AC-12: Performance Envelope (Integration)
+ - [ ] 10k-issue fixture completes status fetch + apply within defined budget on CI baseline machine
+ - [ ] Memory usage remains O(page_size), not O(total_issues)
+ - [ ] Cancellation during large sync exits within a bounded latency target
@@ TDD Plan (RED)
+ 34. `test_enrichment_large_project_budget`
+ 35. `test_fetch_statuses_memory_bound_by_page`
+ 36. `test_cancellation_latency_during_pagination`
```
If you want, I can next produce a single consolidated “iteration 6” plan draft with these diffs fully merged so its ready to execute.

View File

@@ -0,0 +1,118 @@
**Highest-Impact Revisions (new, not in your rejected list)**
1. **Critical: Preserve GraphQL partial-error metadata end-to-end (dont just log it)**
Rationale: Right now partial GraphQL errors are warning-only. Agents get no machine-readable signal that status data may be incomplete, which can silently corrupt downstream automation decisions. Exposing partial-error metadata in `FetchStatusResult` and robot sync output makes reliability observable and actionable.
```diff
@@ AC-1: GraphQL Client (Unit)
- [ ] Partial-data response: if `errors` array is non-empty BUT `data` field is present and non-null, returns `data` and logs warning with first error message
+ [ ] Partial-data response: if `errors` array is non-empty BUT `data` field is present and non-null, returns `data` and warning metadata (`had_errors=true`, `first_error_message`)
+ [ ] `GraphqlClient::query()` returns `GraphqlQueryResult { data, had_errors, first_error_message }`
@@ AC-3: Status Fetcher (Integration)
+ [ ] `FetchStatusResult` includes `partial_error_count: usize` and `first_partial_error: Option<String>`
+ [ ] Partial GraphQL errors increment `partial_error_count` and are surfaced to orchestrator result
@@ AC-10: Robot Sync Envelope (E2E)
- { "mode": "...", "reason": ..., "enriched": N, "cleared": N, "error": ... }
+ { "mode": "...", "reason": ..., "enriched": N, "cleared": N, "error": ..., "partial_errors": N, "first_partial_error": null|"..." }
@@ File 1: src/gitlab/graphql.rs
- pub async fn query(...) -> Result<serde_json::Value>
+ pub async fn query(...) -> Result<GraphqlQueryResult>
+ pub struct GraphqlQueryResult { pub data: serde_json::Value, pub had_errors: bool, pub first_error_message: Option<String> }
```
2. **High: Add adaptive page-size fallback for GraphQL complexity/timeout failures**
Rationale: Fixed `first: 100` is brittle on self-hosted instances with stricter complexity/time limits. Adaptive page size (100→50→25→10) improves success rate without retries/backoff and avoids failing an entire project due to one tunable server constraint.
```diff
@@ Query Path
-query($projectPath: ID!, $after: String) { ... workItems(types: [ISSUE], first: 100, after: $after) ... }
+query($projectPath: ID!, $after: String, $first: Int!) { ... workItems(types: [ISSUE], first: $first, after: $after) ... }
@@ AC-3: Status Fetcher (Integration)
+ [ ] Starts with `first=100`; on GraphQL complexity/timeout errors, retries same cursor with smaller page size (50, 25, 10)
+ [ ] If smallest page size still fails, returns error as today
+ [ ] Emits warning including page size downgrade event
@@ TDD Plan (RED)
+ 36. `test_fetch_statuses_complexity_error_reduces_page_size`
+ 37. `test_fetch_statuses_timeout_error_reduces_page_size`
```
3. **High: Make project path lookup failure non-fatal for the sync**
Rationale: Enrichment is optional. If `projects.path_with_namespace` lookup fails for any reason, sync should continue with a structured enrichment error instead of risking full project pipeline failure.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] If project path lookup fails/missing, status enrichment is skipped for that project, warning logged, and sync continues
+ [ ] `status_enrichment_error` captures `"project_path_missing"` (or DB error text)
@@ File 6: src/ingestion/orchestrator.rs
- let project_path: String = conn.query_row(...)?;
+ let project_path = conn.query_row(...).optional()?;
+ if project_path.is_none() {
+ result.status_enrichment_error = Some("project_path_missing".to_string());
+ result.status_enrichment_mode = "fetched".to_string(); // attempted but unavailable locally
+ emit(ProgressEvent::StatusEnrichmentComplete { enriched: 0, cleared: 0 });
+ // continue to discussion sync
+ }
```
4. **Medium: Upgrade `--status` from single-value to repeatable multi-value filter**
Rationale: Practical usage often needs “active buckets” (`To do` OR `In progress`). Repeatable `--status` with OR semantics dramatically improves usefulness without adding new conceptual surface area.
```diff
@@ AC-9: List Issues Filter (E2E)
- [ ] `lore list issues --status "In progress"` → only issues where `status_name = 'In progress'`
+ [ ] `lore list issues --status "In progress"` → unchanged single-value behavior
+ [ ] Repeatable flags supported: `--status "In progress" --status "To do"` (OR semantics across status values)
+ [ ] Repeated `--status` remains AND-composed with other filters
@@ File 9: src/cli/mod.rs
- pub status: Option<String>,
+ pub status: Vec<String>, // repeatable flag
@@ File 8: src/cli/commands/list.rs
- if let Some(status) = filters.status { where_clauses.push("i.status_name = ? COLLATE NOCASE"); ... }
+ if !filters.statuses.is_empty() { /* dynamic OR/IN clause with case-insensitive matching */ }
```
5. **Medium: Add coverage telemetry (`seen`, `with_status`, `without_status`)**
Rationale: `enriched`/`cleared` alone is not enough to judge enrichment health. Coverage counters make it obvious whether a project truly has no statuses, is unsupported, or has unexpectedly low status population.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] `IngestProjectResult` gains `statuses_seen: usize` and `statuses_without_widget: usize`
+ [ ] Enrichment log includes `seen`, `enriched`, `cleared`, `without_widget`
@@ AC-10: Robot Sync Envelope (E2E)
- status_enrichment: { mode, reason, enriched, cleared, error }
+ status_enrichment: { mode, reason, seen, enriched, cleared, without_widget, error, partial_errors }
@@ File 6: src/ingestion/orchestrator.rs
+ result.statuses_seen = fetch_result.all_fetched_iids.len();
+ result.statuses_without_widget = result.statuses_seen.saturating_sub(result.statuses_enriched);
```
6. **Medium: Centralize color parsing/render decisions (single helper used by show/list)**
Rationale: Color parsing is duplicated in `show.rs` and `list.rs`, which invites drift and inconsistent behavior. One shared helper gives consistent fallback behavior and simpler tests.
```diff
@@ File 7: src/cli/commands/show.rs
- fn style_with_hex(...) { ...hex parse logic... }
+ use crate::cli::commands::color::style_with_hex;
@@ File 8: src/cli/commands/list.rs
- fn colored_cell_hex(...) { ...hex parse logic... }
+ use crate::cli::commands::color::colored_cell_hex;
@@ Files Changed (Summary)
+ `src/cli/commands/color.rs` (NEW) — shared hex parsing + styling helpers
- duplicated hex parsing blocks removed from show/list
```
---
If you want, I can produce a **single consolidated patch-style diff of the plan document itself** (all section edits merged, ready to paste as iteration 7).

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -61,6 +61,7 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
"--assignee", "--assignee",
"--label", "--label",
"--milestone", "--milestone",
"--status",
"--since", "--since",
"--due-before", "--due-before",
"--has-due", "--has-due",
@@ -134,6 +135,7 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
"--since", "--since",
"--updated-since", "--updated-since",
"--limit", "--limit",
"--fields",
"--explain", "--explain",
"--no-explain", "--no-explain",
"--fts-mode", "--fts-mode",
@@ -162,6 +164,7 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
"--depth", "--depth",
"--expand-mentions", "--expand-mentions",
"--limit", "--limit",
"--fields",
"--max-seeds", "--max-seeds",
"--max-entities", "--max-entities",
"--max-evidence", "--max-evidence",
@@ -177,6 +180,7 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
"--since", "--since",
"--project", "--project",
"--limit", "--limit",
"--fields",
"--detail", "--detail",
"--no-detail", "--no-detail",
], ],
@@ -193,6 +197,7 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
), ),
("generate-docs", &["--full", "--project"]), ("generate-docs", &["--full", "--project"]),
("completions", &[]), ("completions", &[]),
("robot-docs", &["--brief"]),
( (
"list", "list",
&[ &[

View File

@@ -44,6 +44,21 @@ pub struct IngestResult {
pub resource_events_failed: usize, pub resource_events_failed: usize,
pub mr_diffs_fetched: usize, pub mr_diffs_fetched: usize,
pub mr_diffs_failed: usize, pub mr_diffs_failed: usize,
pub status_enrichment_errors: usize,
pub status_enrichment_projects: Vec<ProjectStatusEnrichment>,
}
/// Per-project status enrichment result, collected during ingestion.
pub struct ProjectStatusEnrichment {
pub mode: String,
pub reason: Option<String>,
pub seen: usize,
pub enriched: usize,
pub cleared: usize,
pub without_widget: usize,
pub partial_errors: usize,
pub first_partial_error: Option<String>,
pub error: Option<String>,
} }
#[derive(Debug, Default, Clone, Serialize)] #[derive(Debug, Default, Clone, Serialize)]
@@ -517,6 +532,14 @@ async fn run_ingest_inner(
ProgressEvent::MrDiffsFetchComplete { .. } => { ProgressEvent::MrDiffsFetchComplete { .. } => {
disc_bar_clone.finish_and_clear(); disc_bar_clone.finish_and_clear();
} }
ProgressEvent::StatusEnrichmentComplete { enriched, cleared } => {
if enriched > 0 || cleared > 0 {
stage_bar_clone.set_message(format!(
"Status enrichment: {enriched} enriched, {cleared} cleared"
));
}
}
ProgressEvent::StatusEnrichmentSkipped => {}
}) })
}; };
@@ -587,6 +610,22 @@ async fn run_ingest_inner(
total.issues_skipped_discussion_sync += result.issues_skipped_discussion_sync; total.issues_skipped_discussion_sync += result.issues_skipped_discussion_sync;
total.resource_events_fetched += result.resource_events_fetched; total.resource_events_fetched += result.resource_events_fetched;
total.resource_events_failed += result.resource_events_failed; total.resource_events_failed += result.resource_events_failed;
if result.status_enrichment_error.is_some() {
total.status_enrichment_errors += 1;
}
total
.status_enrichment_projects
.push(ProjectStatusEnrichment {
mode: result.status_enrichment_mode.clone(),
reason: result.status_unsupported_reason.clone(),
seen: result.statuses_seen,
enriched: result.statuses_enriched,
cleared: result.statuses_cleared,
without_widget: result.statuses_without_widget,
partial_errors: result.partial_error_count,
first_partial_error: result.first_partial_error.clone(),
error: result.status_enrichment_error.clone(),
});
} }
Ok(ProjectIngestOutcome::Mrs { Ok(ProjectIngestOutcome::Mrs {
ref path, ref path,
@@ -767,6 +806,25 @@ struct IngestJsonData {
notes_upserted: usize, notes_upserted: usize,
resource_events_fetched: usize, resource_events_fetched: usize,
resource_events_failed: usize, resource_events_failed: usize,
#[serde(skip_serializing_if = "Vec::is_empty")]
status_enrichment: Vec<StatusEnrichmentJson>,
status_enrichment_errors: usize,
}
#[derive(Serialize)]
struct StatusEnrichmentJson {
mode: String,
#[serde(skip_serializing_if = "Option::is_none")]
reason: Option<String>,
seen: usize,
enriched: usize,
cleared: usize,
without_widget: usize,
partial_errors: usize,
#[serde(skip_serializing_if = "Option::is_none")]
first_partial_error: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
error: Option<String>,
} }
#[derive(Serialize)] #[derive(Serialize)]
@@ -814,6 +872,22 @@ pub fn print_ingest_summary_json(result: &IngestResult, elapsed_ms: u64) {
) )
}; };
let status_enrichment: Vec<StatusEnrichmentJson> = result
.status_enrichment_projects
.iter()
.map(|p| StatusEnrichmentJson {
mode: p.mode.clone(),
reason: p.reason.clone(),
seen: p.seen,
enriched: p.enriched,
cleared: p.cleared,
without_widget: p.without_widget,
partial_errors: p.partial_errors,
first_partial_error: p.first_partial_error.clone(),
error: p.error.clone(),
})
.collect();
let output = IngestJsonOutput { let output = IngestJsonOutput {
ok: true, ok: true,
data: IngestJsonData { data: IngestJsonData {
@@ -826,6 +900,8 @@ pub fn print_ingest_summary_json(result: &IngestResult, elapsed_ms: u64) {
notes_upserted: result.notes_upserted, notes_upserted: result.notes_upserted,
resource_events_fetched: result.resource_events_fetched, resource_events_fetched: result.resource_events_fetched,
resource_events_failed: result.resource_events_failed, resource_events_failed: result.resource_events_failed,
status_enrichment,
status_enrichment_errors: result.status_enrichment_errors,
}, },
meta: RobotMeta { elapsed_ms }, meta: RobotMeta { elapsed_ms },
}; };

View File

@@ -19,6 +19,29 @@ fn colored_cell(content: impl std::fmt::Display, color: Color) -> Cell {
} }
} }
fn colored_cell_hex(content: &str, hex: Option<&str>) -> Cell {
if !console::colors_enabled() {
return Cell::new(content);
}
let Some(hex) = hex else {
return Cell::new(content);
};
let hex = hex.trim_start_matches('#');
if hex.len() != 6 {
return Cell::new(content);
}
let Ok(r) = u8::from_str_radix(&hex[0..2], 16) else {
return Cell::new(content);
};
let Ok(g) = u8::from_str_radix(&hex[2..4], 16) else {
return Cell::new(content);
};
let Ok(b) = u8::from_str_radix(&hex[4..6], 16) else {
return Cell::new(content);
};
Cell::new(content).fg(Color::Rgb { r, g, b })
}
#[derive(Debug, Serialize)] #[derive(Debug, Serialize)]
pub struct IssueListRow { pub struct IssueListRow {
pub iid: i64, pub iid: i64,
@@ -34,6 +57,16 @@ pub struct IssueListRow {
pub assignees: Vec<String>, pub assignees: Vec<String>,
pub discussion_count: i64, pub discussion_count: i64,
pub unresolved_count: i64, pub unresolved_count: i64,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_name: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_category: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_color: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_icon_name: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_synced_at: Option<i64>,
} }
#[derive(Serialize)] #[derive(Serialize)]
@@ -51,6 +84,16 @@ pub struct IssueListRowJson {
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
pub web_url: Option<String>, pub web_url: Option<String>,
pub project_path: String, pub project_path: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_name: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_category: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_color: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_icon_name: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub status_synced_at_iso: Option<String>,
} }
impl From<&IssueListRow> for IssueListRowJson { impl From<&IssueListRow> for IssueListRowJson {
@@ -68,6 +111,11 @@ impl From<&IssueListRow> for IssueListRowJson {
updated_at_iso: ms_to_iso(row.updated_at), updated_at_iso: ms_to_iso(row.updated_at),
web_url: row.web_url.clone(), web_url: row.web_url.clone(),
project_path: row.project_path.clone(), project_path: row.project_path.clone(),
status_name: row.status_name.clone(),
status_category: row.status_category.clone(),
status_color: row.status_color.clone(),
status_icon_name: row.status_icon_name.clone(),
status_synced_at_iso: row.status_synced_at.map(ms_to_iso),
} }
} }
} }
@@ -194,6 +242,7 @@ pub struct ListFilters<'a> {
pub since: Option<&'a str>, pub since: Option<&'a str>,
pub due_before: Option<&'a str>, pub due_before: Option<&'a str>,
pub has_due_date: bool, pub has_due_date: bool,
pub statuses: &'a [String],
pub sort: &'a str, pub sort: &'a str,
pub order: &'a str, pub order: &'a str,
} }
@@ -291,6 +340,22 @@ fn query_issues(conn: &Connection, filters: &ListFilters) -> Result<ListResult>
where_clauses.push("i.due_date IS NOT NULL"); where_clauses.push("i.due_date IS NOT NULL");
} }
let status_in_clause;
if filters.statuses.len() == 1 {
where_clauses.push("i.status_name = ? COLLATE NOCASE");
params.push(Box::new(filters.statuses[0].clone()));
} else if filters.statuses.len() > 1 {
let placeholders: Vec<&str> = filters.statuses.iter().map(|_| "?").collect();
status_in_clause = format!(
"i.status_name COLLATE NOCASE IN ({})",
placeholders.join(", ")
);
where_clauses.push(&status_in_clause);
for s in filters.statuses {
params.push(Box::new(s.clone()));
}
}
let where_sql = if where_clauses.is_empty() { let where_sql = if where_clauses.is_empty() {
String::new() String::new()
} else { } else {
@@ -338,7 +403,12 @@ fn query_issues(conn: &Connection, filters: &ListFilters) -> Result<ListResult>
(SELECT COUNT(*) FROM discussions d (SELECT COUNT(*) FROM discussions d
WHERE d.issue_id = i.id) AS discussion_count, WHERE d.issue_id = i.id) AS discussion_count,
(SELECT COUNT(*) FROM discussions d (SELECT COUNT(*) FROM discussions d
WHERE d.issue_id = i.id AND d.resolvable = 1 AND d.resolved = 0) AS unresolved_count WHERE d.issue_id = i.id AND d.resolvable = 1 AND d.resolved = 0) AS unresolved_count,
i.status_name,
i.status_category,
i.status_color,
i.status_icon_name,
i.status_synced_at
FROM issues i FROM issues i
JOIN projects p ON i.project_id = p.id JOIN projects p ON i.project_id = p.id
{where_sql} {where_sql}
@@ -375,6 +445,11 @@ fn query_issues(conn: &Connection, filters: &ListFilters) -> Result<ListResult>
assignees, assignees,
discussion_count: row.get(10)?, discussion_count: row.get(10)?,
unresolved_count: row.get(11)?, unresolved_count: row.get(11)?,
status_name: row.get(12)?,
status_category: row.get(13)?,
status_color: row.get(14)?,
status_icon_name: row.get(15)?,
status_synced_at: row.get(16)?,
}) })
})? })?
.collect::<std::result::Result<Vec<_>, _>>()?; .collect::<std::result::Result<Vec<_>, _>>()?;
@@ -683,19 +758,28 @@ pub fn print_list_issues(result: &ListResult) {
result.total_count result.total_count
); );
let mut table = Table::new(); let has_any_status = result.issues.iter().any(|i| i.status_name.is_some());
table
.set_content_arrangement(ContentArrangement::Dynamic) let mut header = vec![
.set_header(vec![
Cell::new("IID").add_attribute(Attribute::Bold), Cell::new("IID").add_attribute(Attribute::Bold),
Cell::new("Title").add_attribute(Attribute::Bold), Cell::new("Title").add_attribute(Attribute::Bold),
Cell::new("State").add_attribute(Attribute::Bold), Cell::new("State").add_attribute(Attribute::Bold),
];
if has_any_status {
header.push(Cell::new("Status").add_attribute(Attribute::Bold));
}
header.extend([
Cell::new("Assignee").add_attribute(Attribute::Bold), Cell::new("Assignee").add_attribute(Attribute::Bold),
Cell::new("Labels").add_attribute(Attribute::Bold), Cell::new("Labels").add_attribute(Attribute::Bold),
Cell::new("Disc").add_attribute(Attribute::Bold), Cell::new("Disc").add_attribute(Attribute::Bold),
Cell::new("Updated").add_attribute(Attribute::Bold), Cell::new("Updated").add_attribute(Attribute::Bold),
]); ]);
let mut table = Table::new();
table
.set_content_arrangement(ContentArrangement::Dynamic)
.set_header(header);
for issue in &result.issues { for issue in &result.issues {
let title = truncate_with_ellipsis(&issue.title, 45); let title = truncate_with_ellipsis(&issue.title, 45);
let relative_time = format_relative_time(issue.updated_at); let relative_time = format_relative_time(issue.updated_at);
@@ -709,15 +793,28 @@ pub fn print_list_issues(result: &ListResult) {
colored_cell(&issue.state, Color::DarkGrey) colored_cell(&issue.state, Color::DarkGrey)
}; };
table.add_row(vec![ let mut row = vec![
colored_cell(format!("#{}", issue.iid), Color::Cyan), colored_cell(format!("#{}", issue.iid), Color::Cyan),
Cell::new(title), Cell::new(title),
state_cell, state_cell,
];
if has_any_status {
match &issue.status_name {
Some(status) => {
row.push(colored_cell_hex(status, issue.status_color.as_deref()));
}
None => {
row.push(Cell::new(""));
}
}
}
row.extend([
colored_cell(assignee, Color::Magenta), colored_cell(assignee, Color::Magenta),
colored_cell(labels, Color::Yellow), colored_cell(labels, Color::Yellow),
Cell::new(discussions), Cell::new(discussions),
colored_cell(relative_time, Color::DarkGrey), colored_cell(relative_time, Color::DarkGrey),
]); ]);
table.add_row(row);
} }
println!("{table}"); println!("{table}");

View File

@@ -386,11 +386,20 @@ struct SearchMeta {
elapsed_ms: u64, elapsed_ms: u64,
} }
pub fn print_search_results_json(response: &SearchResponse, elapsed_ms: u64) { pub fn print_search_results_json(
response: &SearchResponse,
elapsed_ms: u64,
fields: Option<&[String]>,
) {
let output = SearchJsonOutput { let output = SearchJsonOutput {
ok: true, ok: true,
data: response, data: response,
meta: SearchMeta { elapsed_ms }, meta: SearchMeta { elapsed_ms },
}; };
println!("{}", serde_json::to_string(&output).unwrap()); let mut value = serde_json::to_value(&output).unwrap();
if let Some(f) = fields {
let expanded = crate::cli::robot::expand_fields_preset(f, "search");
crate::cli::robot::filter_fields(&mut value, "results", &expanded);
}
println!("{}", serde_json::to_string(&value).unwrap());
} }

View File

@@ -83,6 +83,11 @@ pub struct IssueDetail {
pub milestone: Option<String>, pub milestone: Option<String>,
pub closing_merge_requests: Vec<ClosingMrRef>, pub closing_merge_requests: Vec<ClosingMrRef>,
pub discussions: Vec<DiscussionDetail>, pub discussions: Vec<DiscussionDetail>,
pub status_name: Option<String>,
pub status_category: Option<String>,
pub status_color: Option<String>,
pub status_icon_name: Option<String>,
pub status_synced_at: Option<i64>,
} }
#[derive(Debug, Serialize)] #[derive(Debug, Serialize)]
@@ -134,6 +139,11 @@ pub fn run_show_issue(
milestone: issue.milestone_title, milestone: issue.milestone_title,
closing_merge_requests: closing_mrs, closing_merge_requests: closing_mrs,
discussions, discussions,
status_name: issue.status_name,
status_category: issue.status_category,
status_color: issue.status_color,
status_icon_name: issue.status_icon_name,
status_synced_at: issue.status_synced_at,
}) })
} }
@@ -150,6 +160,11 @@ struct IssueRow {
project_path: String, project_path: String,
due_date: Option<String>, due_date: Option<String>,
milestone_title: Option<String>, milestone_title: Option<String>,
status_name: Option<String>,
status_category: Option<String>,
status_color: Option<String>,
status_icon_name: Option<String>,
status_synced_at: Option<i64>,
} }
fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Result<IssueRow> { fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Result<IssueRow> {
@@ -159,7 +174,9 @@ fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Resu
( (
"SELECT i.id, i.iid, i.title, i.description, i.state, i.author_username, "SELECT i.id, i.iid, i.title, i.description, i.state, i.author_username,
i.created_at, i.updated_at, i.web_url, p.path_with_namespace, i.created_at, i.updated_at, i.web_url, p.path_with_namespace,
i.due_date, i.milestone_title i.due_date, i.milestone_title,
i.status_name, i.status_category, i.status_color,
i.status_icon_name, i.status_synced_at
FROM issues i FROM issues i
JOIN projects p ON i.project_id = p.id JOIN projects p ON i.project_id = p.id
WHERE i.iid = ? AND i.project_id = ?", WHERE i.iid = ? AND i.project_id = ?",
@@ -169,7 +186,9 @@ fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Resu
None => ( None => (
"SELECT i.id, i.iid, i.title, i.description, i.state, i.author_username, "SELECT i.id, i.iid, i.title, i.description, i.state, i.author_username,
i.created_at, i.updated_at, i.web_url, p.path_with_namespace, i.created_at, i.updated_at, i.web_url, p.path_with_namespace,
i.due_date, i.milestone_title i.due_date, i.milestone_title,
i.status_name, i.status_category, i.status_color,
i.status_icon_name, i.status_synced_at
FROM issues i FROM issues i
JOIN projects p ON i.project_id = p.id JOIN projects p ON i.project_id = p.id
WHERE i.iid = ?", WHERE i.iid = ?",
@@ -195,6 +214,11 @@ fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Resu
project_path: row.get(9)?, project_path: row.get(9)?,
due_date: row.get(10)?, due_date: row.get(10)?,
milestone_title: row.get(11)?, milestone_title: row.get(11)?,
status_name: row.get(12)?,
status_category: row.get(13)?,
status_color: row.get(14)?,
status_icon_name: row.get(15)?,
status_synced_at: row.get(16)?,
}) })
})? })?
.collect::<std::result::Result<Vec<_>, _>>()?; .collect::<std::result::Result<Vec<_>, _>>()?;
@@ -603,6 +627,17 @@ pub fn print_show_issue(issue: &IssueDetail) {
}; };
println!("State: {}", state_styled); println!("State: {}", state_styled);
if let Some(status) = &issue.status_name {
let display = match &issue.status_category {
Some(cat) => format!("{status} ({})", cat.to_ascii_lowercase()),
None => status.clone(),
};
println!(
"Status: {}",
style_with_hex(&display, issue.status_color.as_deref())
);
}
println!("Author: @{}", issue.author_username); println!("Author: @{}", issue.author_username);
if !issue.assignees.is_empty() { if !issue.assignees.is_empty() {
@@ -864,6 +899,32 @@ fn print_diff_position(pos: &DiffNotePosition) {
} }
} }
fn style_with_hex<'a>(text: &'a str, hex: Option<&str>) -> console::StyledObject<&'a str> {
let styled = console::style(text);
let Some(hex) = hex else { return styled };
let hex = hex.trim_start_matches('#');
if hex.len() != 6 {
return styled;
}
let Ok(r) = u8::from_str_radix(&hex[0..2], 16) else {
return styled;
};
let Ok(g) = u8::from_str_radix(&hex[2..4], 16) else {
return styled;
};
let Ok(b) = u8::from_str_radix(&hex[4..6], 16) else {
return styled;
};
styled.color256(ansi256_from_rgb(r, g, b))
}
fn ansi256_from_rgb(r: u8, g: u8, b: u8) -> u8 {
let ri = (u16::from(r) * 5 + 127) / 255;
let gi = (u16::from(g) * 5 + 127) / 255;
let bi = (u16::from(b) * 5 + 127) / 255;
(16 + 36 * ri + 6 * gi + bi) as u8
}
#[derive(Serialize)] #[derive(Serialize)]
pub struct IssueDetailJson { pub struct IssueDetailJson {
pub id: i64, pub id: i64,
@@ -882,6 +943,11 @@ pub struct IssueDetailJson {
pub milestone: Option<String>, pub milestone: Option<String>,
pub closing_merge_requests: Vec<ClosingMrRefJson>, pub closing_merge_requests: Vec<ClosingMrRefJson>,
pub discussions: Vec<DiscussionDetailJson>, pub discussions: Vec<DiscussionDetailJson>,
pub status_name: Option<String>,
pub status_category: Option<String>,
pub status_color: Option<String>,
pub status_icon_name: Option<String>,
pub status_synced_at: Option<String>,
} }
#[derive(Serialize)] #[derive(Serialize)]
@@ -934,6 +1000,11 @@ impl From<&IssueDetail> for IssueDetailJson {
}) })
.collect(), .collect(),
discussions: issue.discussions.iter().map(|d| d.into()).collect(), discussions: issue.discussions.iter().map(|d| d.into()).collect(),
status_name: issue.status_name.clone(),
status_category: issue.status_category.clone(),
status_color: issue.status_color.clone(),
status_icon_name: issue.status_icon_name.clone(),
status_synced_at: issue.status_synced_at.map(ms_to_iso),
} }
} }
} }
@@ -1103,6 +1174,12 @@ mod tests {
.unwrap(); .unwrap();
} }
#[test]
fn test_ansi256_from_rgb() {
assert_eq!(ansi256_from_rgb(0, 0, 0), 16);
assert_eq!(ansi256_from_rgb(255, 255, 255), 231);
}
#[test] #[test]
fn test_get_issue_assignees_empty() { fn test_get_issue_assignees_empty() {
let conn = setup_test_db(); let conn = setup_test_db();

View File

@@ -39,6 +39,7 @@ pub struct SyncResult {
pub mr_diffs_failed: usize, pub mr_diffs_failed: usize,
pub documents_regenerated: usize, pub documents_regenerated: usize,
pub documents_embedded: usize, pub documents_embedded: usize,
pub status_enrichment_errors: usize,
} }
fn stage_spinner(stage: u8, total: u8, msg: &str, robot_mode: bool) -> ProgressBar { fn stage_spinner(stage: u8, total: u8, msg: &str, robot_mode: bool) -> ProgressBar {
@@ -123,6 +124,7 @@ pub async fn run_sync(
result.discussions_fetched += issues_result.discussions_fetched; result.discussions_fetched += issues_result.discussions_fetched;
result.resource_events_fetched += issues_result.resource_events_fetched; result.resource_events_fetched += issues_result.resource_events_fetched;
result.resource_events_failed += issues_result.resource_events_failed; result.resource_events_failed += issues_result.resource_events_failed;
result.status_enrichment_errors += issues_result.status_enrichment_errors;
spinner.finish_and_clear(); spinner.finish_and_clear();
if signal.is_cancelled() { if signal.is_cancelled() {

View File

@@ -252,6 +252,7 @@ pub fn print_timeline_json_with_meta(
total_events_before_limit: usize, total_events_before_limit: usize,
depth: u32, depth: u32,
expand_mentions: bool, expand_mentions: bool,
fields: Option<&[String]>,
) { ) {
let output = TimelineJsonEnvelope { let output = TimelineJsonEnvelope {
ok: true, ok: true,
@@ -268,10 +269,18 @@ pub fn print_timeline_json_with_meta(
}, },
}; };
match serde_json::to_string(&output) { let mut value = match serde_json::to_value(&output) {
Ok(json) => println!("{json}"), Ok(v) => v,
Err(e) => eprintln!("Error serializing timeline JSON: {e}"), Err(e) => {
eprintln!("Error serializing timeline JSON: {e}");
return;
} }
};
if let Some(f) = fields {
let expanded = crate::cli::robot::expand_fields_preset(f, "timeline");
crate::cli::robot::filter_fields(&mut value, "events", &expanded);
}
println!("{}", serde_json::to_string(&value).unwrap());
} }
#[derive(Serialize)] #[derive(Serialize)]

View File

@@ -2201,12 +2201,28 @@ pub fn print_who_json(run: &WhoRun, args: &WhoArgs, elapsed_ms: u64) {
meta: RobotMeta { elapsed_ms }, meta: RobotMeta { elapsed_ms },
}; };
println!( let mut value = serde_json::to_value(&output).unwrap_or_else(|e| {
"{}", serde_json::json!({"ok":false,"error":{"code":"INTERNAL_ERROR","message":format!("JSON serialization failed: {e}")}})
serde_json::to_string(&output).unwrap_or_else(|e| { });
format!(r#"{{"ok":false,"error":{{"code":"INTERNAL_ERROR","message":"JSON serialization failed: {e}"}}}}"#)
}) if let Some(f) = &args.fields {
); let preset_key = format!("who_{mode}");
let expanded = crate::cli::robot::expand_fields_preset(f, &preset_key);
// Each who mode uses a different array key; try all possible keys
for key in &[
"experts",
"assigned_issues",
"authored_mrs",
"review_mrs",
"categories",
"discussions",
"users",
] {
crate::cli::robot::filter_fields(&mut value, key, &expanded);
}
}
println!("{}", serde_json::to_string(&value).unwrap());
} }
#[derive(Serialize)] #[derive(Serialize)]
@@ -2577,6 +2593,7 @@ mod tests {
limit: 20, limit: 20,
detail: false, detail: false,
no_detail: false, no_detail: false,
fields: None,
}) })
.unwrap(), .unwrap(),
WhoMode::Expert { .. } WhoMode::Expert { .. }
@@ -2595,6 +2612,7 @@ mod tests {
limit: 20, limit: 20,
detail: false, detail: false,
no_detail: false, no_detail: false,
fields: None,
}) })
.unwrap(), .unwrap(),
WhoMode::Workload { .. } WhoMode::Workload { .. }
@@ -2613,6 +2631,7 @@ mod tests {
limit: 20, limit: 20,
detail: false, detail: false,
no_detail: false, no_detail: false,
fields: None,
}) })
.unwrap(), .unwrap(),
WhoMode::Workload { .. } WhoMode::Workload { .. }
@@ -2631,6 +2650,7 @@ mod tests {
limit: 20, limit: 20,
detail: false, detail: false,
no_detail: false, no_detail: false,
fields: None,
}) })
.unwrap(), .unwrap(),
WhoMode::Reviews { .. } WhoMode::Reviews { .. }
@@ -2649,6 +2669,7 @@ mod tests {
limit: 20, limit: 20,
detail: false, detail: false,
no_detail: false, no_detail: false,
fields: None,
}) })
.unwrap(), .unwrap(),
WhoMode::Expert { .. } WhoMode::Expert { .. }
@@ -2667,6 +2688,7 @@ mod tests {
limit: 20, limit: 20,
detail: false, detail: false,
no_detail: false, no_detail: false,
fields: None,
}) })
.unwrap(), .unwrap(),
WhoMode::Expert { .. } WhoMode::Expert { .. }
@@ -2686,6 +2708,7 @@ mod tests {
limit: 20, limit: 20,
detail: true, detail: true,
no_detail: false, no_detail: false,
fields: None,
}; };
let mode = resolve_mode(&args).unwrap(); let mode = resolve_mode(&args).unwrap();
let err = validate_mode_flags(&mode, &args).unwrap_err(); let err = validate_mode_flags(&mode, &args).unwrap_err();
@@ -2709,6 +2732,7 @@ mod tests {
limit: 20, limit: 20,
detail: true, detail: true,
no_detail: false, no_detail: false,
fields: None,
}; };
let mode = resolve_mode(&args).unwrap(); let mode = resolve_mode(&args).unwrap();
assert!(validate_mode_flags(&mode, &args).is_ok()); assert!(validate_mode_flags(&mode, &args).is_ok());

View File

@@ -186,7 +186,11 @@ pub enum Commands {
/// Machine-readable command manifest for agent self-discovery /// Machine-readable command manifest for agent self-discovery
#[command(name = "robot-docs")] #[command(name = "robot-docs")]
RobotDocs, RobotDocs {
/// Omit response_schema from output (~60% smaller)
#[arg(long)]
brief: bool,
},
/// Generate shell completions /// Generate shell completions
#[command(long_about = "Generate shell completions for lore.\n\n\ #[command(long_about = "Generate shell completions for lore.\n\n\
@@ -315,6 +319,10 @@ pub struct IssuesArgs {
#[arg(short = 'm', long, help_heading = "Filters")] #[arg(short = 'm', long, help_heading = "Filters")]
pub milestone: Option<String>, pub milestone: Option<String>,
/// Filter by work-item status name (repeatable, OR logic)
#[arg(long, help_heading = "Filters")]
pub status: Vec<String>,
/// Filter by time (7d, 2w, 1m, or YYYY-MM-DD) /// Filter by time (7d, 2w, 1m, or YYYY-MM-DD)
#[arg(long, help_heading = "Filters")] #[arg(long, help_heading = "Filters")]
pub since: Option<String>, pub since: Option<String>,
@@ -563,6 +571,10 @@ pub struct SearchArgs {
)] )]
pub limit: usize, pub limit: usize,
/// Select output fields (comma-separated, or 'minimal' preset: document_id,title,source_type,score)
#[arg(long, help_heading = "Output", value_delimiter = ',')]
pub fields: Option<Vec<String>>,
/// Show ranking explanation per result /// Show ranking explanation per result
#[arg(long, help_heading = "Output", overrides_with = "no_explain")] #[arg(long, help_heading = "Output", overrides_with = "no_explain")]
pub explain: bool, pub explain: bool,
@@ -682,6 +694,10 @@ pub struct TimelineArgs {
)] )]
pub limit: usize, pub limit: usize,
/// Select output fields (comma-separated, or 'minimal' preset: timestamp,type,entity_iid,detail)
#[arg(long, help_heading = "Output", value_delimiter = ',')]
pub fields: Option<Vec<String>>,
/// Maximum seed entities from search /// Maximum seed entities from search
#[arg(long = "max-seeds", default_value = "10", help_heading = "Expansion")] #[arg(long = "max-seeds", default_value = "10", help_heading = "Expansion")]
pub max_seeds: usize, pub max_seeds: usize,
@@ -752,6 +768,10 @@ pub struct WhoArgs {
)] )]
pub limit: u16, pub limit: u16,
/// Select output fields (comma-separated, or 'minimal' preset; varies by mode)
#[arg(long, help_heading = "Output", value_delimiter = ',')]
pub fields: Option<Vec<String>>,
/// Show per-MR detail breakdown (expert mode only) /// Show per-MR detail breakdown (expert mode only)
#[arg(long, help_heading = "Output", overrides_with = "no_detail")] #[arg(long, help_heading = "Output", overrides_with = "no_detail")]
pub detail: bool, pub detail: bool,

View File

@@ -36,9 +36,48 @@ pub fn expand_fields_preset(fields: &[String], entity: &str) -> Vec<String> {
.iter() .iter()
.map(|s| (*s).to_string()) .map(|s| (*s).to_string())
.collect(), .collect(),
"search" => ["document_id", "title", "source_type", "score"]
.iter()
.map(|s| (*s).to_string())
.collect(),
"timeline" => ["timestamp", "type", "entity_iid", "detail"]
.iter()
.map(|s| (*s).to_string())
.collect(),
"who_expert" => ["username", "score"]
.iter()
.map(|s| (*s).to_string())
.collect(),
"who_workload" => ["iid", "title", "state"]
.iter()
.map(|s| (*s).to_string())
.collect(),
"who_active" => ["entity_type", "iid", "title", "participants"]
.iter()
.map(|s| (*s).to_string())
.collect(),
"who_overlap" => ["username", "touch_count"]
.iter()
.map(|s| (*s).to_string())
.collect(),
"who_reviews" => ["name", "count", "percentage"]
.iter()
.map(|s| (*s).to_string())
.collect(),
_ => fields.to_vec(), _ => fields.to_vec(),
} }
} else { } else {
fields.to_vec() fields.to_vec()
} }
} }
/// Strip `response_schema` from every command entry for `--brief` mode.
pub fn strip_schemas(commands: &mut serde_json::Value) {
if let Some(map) = commands.as_object_mut() {
for (_cmd_name, cmd) in map.iter_mut() {
if let Some(obj) = cmd.as_object_mut() {
obj.remove("response_schema");
}
}
}
}

View File

@@ -52,6 +52,9 @@ pub struct SyncConfig {
#[serde(rename = "fetchMrFileChanges", default = "default_true")] #[serde(rename = "fetchMrFileChanges", default = "default_true")]
pub fetch_mr_file_changes: bool, pub fetch_mr_file_changes: bool,
#[serde(rename = "fetchWorkItemStatus", default = "default_true")]
pub fetch_work_item_status: bool,
} }
fn default_true() -> bool { fn default_true() -> bool {
@@ -70,6 +73,7 @@ impl Default for SyncConfig {
requests_per_second: 30.0, requests_per_second: 30.0,
fetch_resource_events: true, fetch_resource_events: true,
fetch_mr_file_changes: true, fetch_mr_file_changes: true,
fetch_work_item_status: true,
} }
} }
} }
@@ -348,6 +352,22 @@ mod tests {
); );
} }
#[test]
fn test_config_fetch_work_item_status_default_true() {
let config = SyncConfig::default();
assert!(config.fetch_work_item_status);
}
#[test]
fn test_config_deserialize_without_key() {
let json = r#"{}"#;
let config: SyncConfig = serde_json::from_str(json).unwrap();
assert!(
config.fetch_work_item_status,
"Missing key should default to true"
);
}
#[test] #[test]
fn test_load_rejects_negative_note_bonus() { fn test_load_rejects_negative_note_bonus() {
let dir = TempDir::new().unwrap(); let dir = TempDir::new().unwrap();

View File

@@ -65,6 +65,10 @@ const MIGRATIONS: &[(&str, &str)] = &[
"020", "020",
include_str!("../../migrations/020_mr_diffs_watermark.sql"), include_str!("../../migrations/020_mr_diffs_watermark.sql"),
), ),
(
"021",
include_str!("../../migrations/021_work_item_status.sql"),
),
]; ];
pub fn create_connection(db_path: &Path) -> Result<Connection> { pub fn create_connection(db_path: &Path) -> Result<Connection> {

View File

@@ -232,7 +232,7 @@ impl LoreError {
} }
pub fn is_permanent_api_error(&self) -> bool { pub fn is_permanent_api_error(&self) -> bool {
matches!(self, Self::GitLabNotFound { .. }) matches!(self, Self::GitLabNotFound { .. } | Self::GitLabAuthFailed)
} }
pub fn exit_code(&self) -> i32 { pub fn exit_code(&self) -> i32 {

View File

@@ -95,6 +95,10 @@ impl GitLabClient {
} }
} }
pub fn graphql_client(&self) -> crate::gitlab::graphql::GraphqlClient {
crate::gitlab::graphql::GraphqlClient::new(&self.base_url, &self.token)
}
pub async fn get_current_user(&self) -> Result<GitLabUser> { pub async fn get_current_user(&self) -> Result<GitLabUser> {
self.request("/api/v4/user").await self.request("/api/v4/user").await
} }

1281
src/gitlab/graphql.rs Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,4 +1,5 @@
pub mod client; pub mod client;
pub mod graphql;
pub mod transformers; pub mod transformers;
pub mod types; pub mod types;
@@ -10,5 +11,5 @@ pub use transformers::{
pub use types::{ pub use types::{
GitLabAuthor, GitLabDiscussion, GitLabIssue, GitLabIssueRef, GitLabLabelEvent, GitLabLabelRef, GitLabAuthor, GitLabDiscussion, GitLabIssue, GitLabIssueRef, GitLabLabelEvent, GitLabLabelRef,
GitLabMergeRequestRef, GitLabMilestoneEvent, GitLabMilestoneRef, GitLabMrDiff, GitLabNote, GitLabMergeRequestRef, GitLabMilestoneEvent, GitLabMilestoneRef, GitLabMrDiff, GitLabNote,
GitLabNotePosition, GitLabProject, GitLabStateEvent, GitLabUser, GitLabVersion, GitLabNotePosition, GitLabProject, GitLabStateEvent, GitLabUser, GitLabVersion, WorkItemStatus,
}; };

View File

@@ -262,3 +262,86 @@ pub struct GitLabMergeRequest {
pub merge_commit_sha: Option<String>, pub merge_commit_sha: Option<String>,
pub squash_commit_sha: Option<String>, pub squash_commit_sha: Option<String>,
} }
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct WorkItemStatus {
pub name: String,
pub category: Option<String>,
pub color: Option<String>,
#[serde(rename = "iconName")]
pub icon_name: Option<String>,
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_work_item_status_deserialize() {
let json = r##"{"name":"In progress","category":"IN_PROGRESS","color":"#1f75cb","iconName":"status-in-progress"}"##;
let status: WorkItemStatus = serde_json::from_str(json).unwrap();
assert_eq!(status.name, "In progress");
assert_eq!(status.category.as_deref(), Some("IN_PROGRESS"));
assert_eq!(status.color.as_deref(), Some("#1f75cb"));
assert_eq!(status.icon_name.as_deref(), Some("status-in-progress"));
}
#[test]
fn test_work_item_status_optional_fields() {
let json = r#"{"name":"To do"}"#;
let status: WorkItemStatus = serde_json::from_str(json).unwrap();
assert_eq!(status.name, "To do");
assert!(status.category.is_none());
assert!(status.color.is_none());
assert!(status.icon_name.is_none());
}
#[test]
fn test_work_item_status_unknown_category() {
let json = r#"{"name":"Custom","category":"SOME_FUTURE_VALUE"}"#;
let status: WorkItemStatus = serde_json::from_str(json).unwrap();
assert_eq!(status.category.as_deref(), Some("SOME_FUTURE_VALUE"));
}
#[test]
fn test_work_item_status_null_category() {
let json = r#"{"name":"In progress","category":null}"#;
let status: WorkItemStatus = serde_json::from_str(json).unwrap();
assert!(status.category.is_none());
}
#[test]
fn test_work_item_status_all_system_statuses() {
let cases = [
(
r##"{"name":"To do","category":"TO_DO","color":"#737278"}"##,
"TO_DO",
),
(
r##"{"name":"In progress","category":"IN_PROGRESS","color":"#1f75cb"}"##,
"IN_PROGRESS",
),
(
r##"{"name":"Done","category":"DONE","color":"#108548"}"##,
"DONE",
),
(
r##"{"name":"Won't do","category":"CANCELED","color":"#DD2B0E"}"##,
"CANCELED",
),
(
r##"{"name":"Duplicate","category":"CANCELED","color":"#DD2B0E"}"##,
"CANCELED",
),
];
for (json, expected_cat) in cases {
let status: WorkItemStatus = serde_json::from_str(json).unwrap();
assert_eq!(
status.category.as_deref(),
Some(expected_cat),
"Failed for: {}",
status.name
);
}
}
}

View File

@@ -45,6 +45,8 @@ pub enum ProgressEvent {
MrDiffsFetchStarted { total: usize }, MrDiffsFetchStarted { total: usize },
MrDiffFetched { current: usize, total: usize }, MrDiffFetched { current: usize, total: usize },
MrDiffsFetchComplete { fetched: usize, failed: usize }, MrDiffsFetchComplete { fetched: usize, failed: usize },
StatusEnrichmentComplete { enriched: usize, cleared: usize },
StatusEnrichmentSkipped,
} }
#[derive(Debug, Default)] #[derive(Debug, Default)]
@@ -59,6 +61,15 @@ pub struct IngestProjectResult {
pub issues_skipped_discussion_sync: usize, pub issues_skipped_discussion_sync: usize,
pub resource_events_fetched: usize, pub resource_events_fetched: usize,
pub resource_events_failed: usize, pub resource_events_failed: usize,
pub statuses_enriched: usize,
pub statuses_cleared: usize,
pub statuses_seen: usize,
pub statuses_without_widget: usize,
pub partial_error_count: usize,
pub first_partial_error: Option<String>,
pub status_enrichment_error: Option<String>,
pub status_enrichment_mode: String,
pub status_unsupported_reason: Option<String>,
} }
#[derive(Debug, Default)] #[derive(Debug, Default)]
@@ -135,6 +146,107 @@ pub async fn ingest_project_issues_with_progress(
total: result.issues_fetched, total: result.issues_fetched,
}); });
// ── Phase 1.5: Status enrichment via GraphQL ──────────────────────
if config.sync.fetch_work_item_status && !signal.is_cancelled() {
use rusqlite::OptionalExtension;
let project_path: Option<String> = conn
.query_row(
"SELECT path_with_namespace FROM projects WHERE id = ?1",
[project_id],
|r| r.get(0),
)
.optional()?;
match project_path {
None => {
warn!("Cannot enrich statuses: project path not found for project_id={project_id}");
result.status_enrichment_error = Some("project_path_missing".into());
result.status_enrichment_mode = "fetched".into();
emit(ProgressEvent::StatusEnrichmentComplete {
enriched: 0,
cleared: 0,
});
}
Some(path) => {
let graphql_client = client.graphql_client();
match crate::gitlab::graphql::fetch_issue_statuses(&graphql_client, &path).await {
Ok(fetch_result) => {
if let Some(ref reason) = fetch_result.unsupported_reason {
result.status_enrichment_mode = "unsupported".into();
result.status_unsupported_reason = Some(match reason {
crate::gitlab::graphql::UnsupportedReason::GraphqlEndpointMissing => {
"graphql_endpoint_missing".into()
}
crate::gitlab::graphql::UnsupportedReason::AuthForbidden => {
"auth_forbidden".into()
}
});
} else {
result.status_enrichment_mode = "fetched".into();
}
result.statuses_seen = fetch_result.all_fetched_iids.len();
result.partial_error_count = fetch_result.partial_error_count;
if fetch_result.first_partial_error.is_some() {
result.first_partial_error = fetch_result.first_partial_error.clone();
}
if signal.is_cancelled() {
info!("Shutdown requested after status fetch, skipping DB write");
emit(ProgressEvent::StatusEnrichmentComplete {
enriched: 0,
cleared: 0,
});
} else {
match enrich_issue_statuses_txn(
conn,
project_id,
&fetch_result.statuses,
&fetch_result.all_fetched_iids,
) {
Ok((enriched, cleared)) => {
result.statuses_enriched = enriched;
result.statuses_cleared = cleared;
result.statuses_without_widget =
result.statuses_seen.saturating_sub(enriched);
info!(
seen = result.statuses_seen,
enriched,
cleared,
without_widget = result.statuses_without_widget,
"Status enrichment complete"
);
}
Err(e) => {
warn!("Status enrichment DB write failed: {e}");
result.status_enrichment_error =
Some(format!("db_write_failed: {e}"));
}
}
emit(ProgressEvent::StatusEnrichmentComplete {
enriched: result.statuses_enriched,
cleared: result.statuses_cleared,
});
}
}
Err(e) => {
warn!("Status enrichment fetch failed: {e}");
result.status_enrichment_error = Some(e.to_string());
result.status_enrichment_mode = "fetched".into();
emit(ProgressEvent::StatusEnrichmentComplete {
enriched: 0,
cleared: 0,
});
}
}
}
}
} else if !config.sync.fetch_work_item_status {
result.status_enrichment_mode = "skipped".into();
emit(ProgressEvent::StatusEnrichmentSkipped);
}
let issues_needing_sync = issue_result.issues_needing_discussion_sync; let issues_needing_sync = issue_result.issues_needing_discussion_sync;
let total_issues: i64 = conn.query_row( let total_issues: i64 = conn.query_row(
@@ -238,6 +350,66 @@ pub async fn ingest_project_issues_with_progress(
Ok(result) Ok(result)
} }
fn enrich_issue_statuses_txn(
conn: &Connection,
project_id: i64,
statuses: &std::collections::HashMap<i64, crate::gitlab::types::WorkItemStatus>,
all_fetched_iids: &std::collections::HashSet<i64>,
) -> std::result::Result<(usize, usize), rusqlite::Error> {
let now_ms = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH)
.unwrap_or_default()
.as_millis() as i64;
let tx = conn.unchecked_transaction()?;
let mut cleared = 0usize;
let mut enriched = 0usize;
// Phase 1: Clear stale statuses (fetched but no status widget)
{
let mut clear_stmt = tx.prepare_cached(
"UPDATE issues SET status_name = NULL, status_category = NULL, status_color = NULL,
status_icon_name = NULL, status_synced_at = ?3
WHERE project_id = ?1 AND iid = ?2 AND status_name IS NOT NULL",
)?;
for iid in all_fetched_iids {
if !statuses.contains_key(iid) {
let rows = clear_stmt.execute(rusqlite::params![project_id, iid, now_ms])?;
if rows > 0 {
cleared += 1;
}
}
}
}
// Phase 2: Apply new/updated statuses
{
let mut update_stmt = tx.prepare_cached(
"UPDATE issues SET status_name = ?1, status_category = ?2, status_color = ?3,
status_icon_name = ?4, status_synced_at = ?5
WHERE project_id = ?6 AND iid = ?7",
)?;
for (iid, status) in statuses {
let rows = update_stmt.execute(rusqlite::params![
&status.name,
&status.category,
&status.color,
&status.icon_name,
now_ms,
project_id,
iid,
])?;
if rows > 0 {
enriched += 1;
}
}
}
tx.commit()?;
Ok((enriched, cleared))
}
#[allow(clippy::too_many_arguments)] #[allow(clippy::too_many_arguments)]
async fn sync_discussions_sequential( async fn sync_discussions_sequential(
conn: &Connection, conn: &Connection,
@@ -1181,8 +1353,6 @@ fn store_closes_issues_refs(
mr_local_id: i64, mr_local_id: i64,
closes_issues: &[crate::gitlab::types::GitLabIssueRef], closes_issues: &[crate::gitlab::types::GitLabIssueRef],
) -> Result<()> { ) -> Result<()> {
conn.execute_batch("SAVEPOINT store_closes_refs")?;
let inner = || -> Result<()> {
for issue_ref in closes_issues { for issue_ref in closes_issues {
let target_local_id = resolve_issue_local_id(conn, project_id, issue_ref.iid)?; let target_local_id = resolve_issue_local_id(conn, project_id, issue_ref.iid)?;
@@ -1210,17 +1380,6 @@ fn store_closes_issues_refs(
insert_entity_reference(conn, &ref_)?; insert_entity_reference(conn, &ref_)?;
} }
Ok(()) Ok(())
};
match inner() {
Ok(()) => {
conn.execute_batch("RELEASE store_closes_refs")?;
Ok(())
}
Err(e) => {
let _ = conn.execute_batch("ROLLBACK TO store_closes_refs; RELEASE store_closes_refs");
Err(e)
}
}
} }
// ─── MR Diffs (file changes) ──────────────────────────────────────────────── // ─── MR Diffs (file changes) ────────────────────────────────────────────────
@@ -1469,6 +1628,15 @@ mod tests {
assert_eq!(result.issues_skipped_discussion_sync, 0); assert_eq!(result.issues_skipped_discussion_sync, 0);
assert_eq!(result.resource_events_fetched, 0); assert_eq!(result.resource_events_fetched, 0);
assert_eq!(result.resource_events_failed, 0); assert_eq!(result.resource_events_failed, 0);
assert_eq!(result.statuses_enriched, 0);
assert_eq!(result.statuses_cleared, 0);
assert_eq!(result.statuses_seen, 0);
assert_eq!(result.statuses_without_widget, 0);
assert_eq!(result.partial_error_count, 0);
assert!(result.first_partial_error.is_none());
assert!(result.status_enrichment_error.is_none());
assert!(result.status_enrichment_mode.is_empty());
assert!(result.status_unsupported_reason.is_none());
} }
#[test] #[test]
@@ -1509,5 +1677,10 @@ mod tests {
fetched: 8, fetched: 8,
failed: 2, failed: 2,
}; };
let _status_complete = ProgressEvent::StatusEnrichmentComplete {
enriched: 5,
cleared: 1,
};
let _status_skipped = ProgressEvent::StatusEnrichmentSkipped;
} }
} }

View File

@@ -24,7 +24,7 @@ use lore::cli::commands::{
run_init, run_list_issues, run_list_mrs, run_search, run_show_issue, run_show_mr, run_stats, run_init, run_list_issues, run_list_mrs, run_search, run_show_issue, run_show_mr, run_stats,
run_sync, run_sync_status, run_timeline, run_who, run_sync, run_sync_status, run_timeline, run_who,
}; };
use lore::cli::robot::RobotMeta; use lore::cli::robot::{RobotMeta, strip_schemas};
use lore::cli::{ use lore::cli::{
Cli, Commands, CountArgs, EmbedArgs, GenerateDocsArgs, IngestArgs, IssuesArgs, MrsArgs, Cli, Commands, CountArgs, EmbedArgs, GenerateDocsArgs, IngestArgs, IssuesArgs, MrsArgs,
SearchArgs, StatsArgs, SyncArgs, TimelineArgs, WhoArgs, SearchArgs, StatsArgs, SyncArgs, TimelineArgs, WhoArgs,
@@ -162,7 +162,7 @@ async fn main() {
// Phase 2: Handle no-args case - in robot mode, output robot-docs; otherwise show help // Phase 2: Handle no-args case - in robot mode, output robot-docs; otherwise show help
None => { None => {
if robot_mode { if robot_mode {
handle_robot_docs(robot_mode) handle_robot_docs(robot_mode, false)
} else { } else {
use clap::CommandFactory; use clap::CommandFactory;
let mut cmd = Cli::command(); let mut cmd = Cli::command();
@@ -224,7 +224,7 @@ async fn main() {
Some(Commands::Reset { yes: _ }) => handle_reset(robot_mode), Some(Commands::Reset { yes: _ }) => handle_reset(robot_mode),
Some(Commands::Migrate) => handle_migrate(cli.config.as_deref(), robot_mode).await, Some(Commands::Migrate) => handle_migrate(cli.config.as_deref(), robot_mode).await,
Some(Commands::Health) => handle_health(cli.config.as_deref(), robot_mode).await, Some(Commands::Health) => handle_health(cli.config.as_deref(), robot_mode).await,
Some(Commands::RobotDocs) => handle_robot_docs(robot_mode), Some(Commands::RobotDocs { brief }) => handle_robot_docs(robot_mode, brief),
Some(Commands::List { Some(Commands::List {
entity, entity,
@@ -703,6 +703,7 @@ fn handle_issues(
since: args.since.as_deref(), since: args.since.as_deref(),
due_before: args.due_before.as_deref(), due_before: args.due_before.as_deref(),
has_due_date: has_due, has_due_date: has_due,
statuses: &args.status,
sort: &args.sort, sort: &args.sort,
order, order,
}; };
@@ -1697,6 +1698,7 @@ fn handle_timeline(
result.total_events_before_limit, result.total_events_before_limit,
params.depth, params.depth,
params.expand_mentions, params.expand_mentions,
args.fields.as_deref(),
); );
} else { } else {
print_timeline(&result); print_timeline(&result);
@@ -1740,7 +1742,7 @@ async fn handle_search(
let elapsed_ms = start.elapsed().as_millis() as u64; let elapsed_ms = start.elapsed().as_millis() as u64;
if robot_mode { if robot_mode {
print_search_results_json(&response, elapsed_ms); print_search_results_json(&response, elapsed_ms, args.fields.as_deref());
} else { } else {
print_search_results(&response); print_search_results(&response);
} }
@@ -2038,7 +2040,7 @@ struct RobotDocsActivation {
auto: String, auto: String,
} }
fn handle_robot_docs(robot_mode: bool) -> Result<(), Box<dyn std::error::Error>> { fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::error::Error>> {
let version = env!("CARGO_PKG_VERSION").to_string(); let version = env!("CARGO_PKG_VERSION").to_string();
let commands = serde_json::json!({ let commands = serde_json::json!({
@@ -2105,7 +2107,7 @@ fn handle_robot_docs(robot_mode: bool) -> Result<(), Box<dyn std::error::Error>>
}, },
"issues": { "issues": {
"description": "List or show issues", "description": "List or show issues",
"flags": ["<IID>", "-n/--limit", "--fields <list>", "-s/--state", "-p/--project", "-a/--author", "-A/--assignee", "-l/--label", "-m/--milestone", "--since", "--due-before", "--has-due", "--no-has-due", "--sort", "--asc", "--no-asc", "-o/--open", "--no-open"], "flags": ["<IID>", "-n/--limit", "--fields <list>", "-s/--state", "--status <name>", "-p/--project", "-a/--author", "-A/--assignee", "-l/--label", "-m/--milestone", "--since", "--due-before", "--has-due", "--no-has-due", "--sort", "--asc", "--no-asc", "-o/--open", "--no-open"],
"example": "lore --robot issues --state opened --limit 10", "example": "lore --robot issues --state opened --limit 10",
"response_schema": { "response_schema": {
"list": { "list": {
@@ -2141,13 +2143,14 @@ fn handle_robot_docs(robot_mode: bool) -> Result<(), Box<dyn std::error::Error>>
}, },
"search": { "search": {
"description": "Search indexed documents (lexical, hybrid, semantic)", "description": "Search indexed documents (lexical, hybrid, semantic)",
"flags": ["<QUERY>", "--mode", "--type", "--author", "-p/--project", "--label", "--path", "--since", "--updated-since", "-n/--limit", "--explain", "--no-explain", "--fts-mode"], "flags": ["<QUERY>", "--mode", "--type", "--author", "-p/--project", "--label", "--path", "--since", "--updated-since", "-n/--limit", "--fields <list>", "--explain", "--no-explain", "--fts-mode"],
"example": "lore --robot search 'authentication bug' --mode hybrid --limit 10", "example": "lore --robot search 'authentication bug' --mode hybrid --limit 10",
"response_schema": { "response_schema": {
"ok": "bool", "ok": "bool",
"data": {"results": "[{doc_id:int, source_type:string, title:string, snippet:string, score:float, project_path:string, web_url:string?}]", "total_count": "int", "query": "string", "mode": "string"}, "data": {"results": "[{document_id:int, source_type:string, title:string, snippet:string, score:float, url:string?, author:string?, created_at:string?, updated_at:string?, project_path:string, labels:[string], paths:[string]}]", "total_results": "int", "query": "string", "mode": "string", "warnings": "[string]"},
"meta": {"elapsed_ms": "int"} "meta": {"elapsed_ms": "int"}
} },
"fields_presets": {"minimal": ["document_id", "title", "source_type", "score"]}
}, },
"count": { "count": {
"description": "Count entities in local database", "description": "Count entities in local database",
@@ -2226,17 +2229,18 @@ fn handle_robot_docs(robot_mode: bool) -> Result<(), Box<dyn std::error::Error>>
}, },
"timeline": { "timeline": {
"description": "Chronological timeline of events matching a keyword query", "description": "Chronological timeline of events matching a keyword query",
"flags": ["<QUERY>", "-p/--project", "--since <duration>", "--depth <n>", "--expand-mentions", "-n/--limit", "--max-seeds", "--max-entities", "--max-evidence"], "flags": ["<QUERY>", "-p/--project", "--since <duration>", "--depth <n>", "--expand-mentions", "-n/--limit", "--fields <list>", "--max-seeds", "--max-entities", "--max-evidence"],
"example": "lore --robot timeline '<keyword>' --since 30d", "example": "lore --robot timeline '<keyword>' --since 30d",
"response_schema": { "response_schema": {
"ok": "bool", "ok": "bool",
"data": {"entities": "[{type:string, iid:int, title:string, project_path:string}]", "events": "[{timestamp:string, type:string, entity_type:string, entity_iid:int, detail:string}]", "total_events": "int"}, "data": {"entities": "[{type:string, iid:int, title:string, project_path:string}]", "events": "[{timestamp:string, type:string, entity_type:string, entity_iid:int, detail:string}]", "total_events": "int"},
"meta": {"elapsed_ms": "int"} "meta": {"elapsed_ms": "int"}
} },
"fields_presets": {"minimal": ["timestamp", "type", "entity_iid", "detail"]}
}, },
"who": { "who": {
"description": "People intelligence: experts, workload, active discussions, overlap, review patterns", "description": "People intelligence: experts, workload, active discussions, overlap, review patterns",
"flags": ["<target>", "--path <path>", "--active", "--overlap <path>", "--reviews", "--since <duration>", "-p/--project", "-n/--limit"], "flags": ["<target>", "--path <path>", "--active", "--overlap <path>", "--reviews", "--since <duration>", "-p/--project", "-n/--limit", "--fields <list>"],
"modes": { "modes": {
"expert": "lore who <file-path> -- Who knows about this area? (also: --path for root files)", "expert": "lore who <file-path> -- Who knows about this area? (also: --path for root files)",
"workload": "lore who <username> -- What is someone working on?", "workload": "lore who <username> -- What is someone working on?",
@@ -2254,15 +2258,26 @@ fn handle_robot_docs(robot_mode: bool) -> Result<(), Box<dyn std::error::Error>>
"...": "mode-specific fields" "...": "mode-specific fields"
}, },
"meta": {"elapsed_ms": "int"} "meta": {"elapsed_ms": "int"}
},
"fields_presets": {
"expert_minimal": ["username", "score"],
"workload_minimal": ["entity_type", "iid", "title", "state"],
"active_minimal": ["entity_type", "iid", "title", "participants"]
} }
}, },
"robot-docs": { "robot-docs": {
"description": "This command (agent self-discovery manifest)", "description": "This command (agent self-discovery manifest)",
"flags": [], "flags": ["--brief"],
"example": "lore robot-docs" "example": "lore robot-docs --brief"
} }
}); });
// --brief: strip response_schema from every command (~60% smaller)
let mut commands = commands;
if brief {
strip_schemas(&mut commands);
}
let exit_codes = serde_json::json!({ let exit_codes = serde_json::json!({
"0": "Success", "0": "Success",
"1": "Internal error", "1": "Internal error",
@@ -2429,6 +2444,7 @@ async fn handle_list_compat(
since: since_filter, since: since_filter,
due_before: due_before_filter, due_before: due_before_filter,
has_due_date, has_due_date,
statuses: &[],
sort, sort,
order, order,
}; };