Replace serde_json::to_string(&output).unwrap() with match-based error
handling across all robot-mode JSON printers. On serialization failure,
the error is now written to stderr instead of panicking. This hardens
the CLI against unexpected Serialize failures in production.
Affected commands: count (2), embed, generate-docs, ingest (2), search,
stats, sync (2), sync-status, timeline.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add -t/--timings flag to the sync subcommand, allowing users to opt
into a per-stage timing breakdown after the sync summary. Wire the flag
through main.rs into print_sync() which passes it to the new
should_print_timings() gate.
Enrich the data structures that flow through the sync pipeline so
downstream renderers have full error visibility:
- ProjectSummary gains status_errors (issue-side status enrichment
failures per project)
- ProjectStatusEnrichment gains path (project path for sub-row display)
- SyncResult gains documents_errored and embedding_failed so the
summary can surface doc-gen and embed failures separately
- Autocorrect table updated with --timings for fuzzy flag matching
Add per-project detail rows beneath stage completion lines during multi-project
syncs, showing itemized counts (issues/MRs, discussions, events, statuses, diffs)
for each project. Previously, only aggregate totals were visible, making it hard
to diagnose which project contributed what during a sync.
Status enrichment gets proper progress bars replacing the old spinner-only
display: StatusEnrichmentStarted now carries a total count so the CLI can
render a determinate bar with rate and ETA. The enrichment SQL is tightened
to use IS NOT comparisons for diff-only UPDATEs (skip rows where values
haven't changed), and a follow-up touch_stmt ensures status_synced_at is
updated even for unchanged rows so staleness detection works correctly.
Other improvements:
- New ProjectSummary struct aggregates per-project metrics during ingestion
- SyncResult gains statuses_enriched + per-project summary vectors
- "Already up to date" message when sync finds zero changes
- Remove Arc<AtomicBool> tick_started pattern from docs/embed stages
(enable_steady_tick is idempotent, the guard was unnecessary)
- Progress bar styling: dim spinner, dark_gray track, per_sec + eta display
- Tick intervals tightened from 100ms to 60ms for smoother animation
- statuses_without_widget calculation uses fetch_result.statuses.len()
instead of subtracting enriched (more accurate when some statuses lack
work item widgets)
- Status enrichment completion log downgraded from info to debug
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace all console::style() calls in command modules with the centralized
Theme API and render:: utility functions. This ensures consistent color
behavior across the entire CLI, proper NO_COLOR/--color never support via
the LoreRenderer singleton, and eliminates duplicated formatting code.
Changes per module:
- count.rs: Theme for table headers, render::format_number replacing local
duplicate. Removed local format_number implementation.
- doctor.rs: Theme::success/warning/error for check status symbols and
messages. Unicode escapes for check/warning/cross symbols.
- drift.rs: Theme::bold/error/success for drift detection headers and
status messages.
- embed.rs: Compact output format — headline with count, zero-suppressed
detail lines, 'nothing to embed' short-circuit for no-op runs.
- generate_docs.rs: Same compact pattern — headline + detail + hint for
next step. No-op short-circuit when regenerated==0.
- ingest.rs: Theme for project summaries, sync status, dry-run preview.
All console::style -> Theme replacements.
- list.rs: Replace comfy-table with render::LoreTable for issue/MR listing.
Remove local colored_cell, colored_cell_hex, format_relative_time,
truncate_with_ellipsis, and format_labels (all moved to render.rs).
- list_tests.rs: Update test assertions to use render:: functions.
- search.rs: Add render_snippet() for FTS5 <mark> tag highlighting via
Theme::bold().underline(). Compact result layout with type badges.
- show.rs: Theme for entity detail views, delegate format_date and
wrap_text to render module.
- stats.rs: Section-based layout using render::section_divider. Compact
middle-dot format for document counts. Color-coded embedding coverage
percentage (green >=95%, yellow >=50%, red <50%).
- sync.rs: Compact sync summary — headline with counts and elapsed time,
zero-suppressed detail lines, visually prominent error-only section.
- sync_status.rs: Theme for run history headers, removed local
format_number duplicate.
- timeline.rs: Theme for headers/footers, render:: for date/truncate,
standard format! padding replacing console::pad_str.
- who.rs: Theme for all expert/workload/active/overlap/review output
modes, render:: for relative time and truncation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously the status enrichment phase (GraphQL work item status fetch)
ran silently — users saw no feedback between "syncing issues" and the
final enrichment summary. For projects with hundreds of issues and
adaptive page-size retries, this felt like a hang.
Changes across three layers:
GraphQL (graphql.rs):
- Extract fetch_issue_statuses_with_progress() accepting an optional
on_page callback invoked after each paginated fetch with the
running count of fetched IIDs
- Original fetch_issue_statuses() preserved as a zero-cost
delegation wrapper (no callback overhead)
Orchestrator (orchestrator.rs):
- Three new ProgressEvent variants: StatusEnrichmentStarted,
StatusEnrichmentPageFetched, StatusEnrichmentWriting
- Wire the page callback through to the new _with_progress fn
CLI (ingest.rs):
- Handle all three new events in the progress callback, updating
both the per-project spinner and the stage bar with live counts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a "Phase 1.5" status enrichment step to the issue ingestion pipeline
that fetches work item statuses via the GitLab GraphQL API after the
standard REST API ingestion completes.
Schema changes (migration 021):
- Add status_name, status_category, status_color, status_icon_name, and
status_synced_at columns to the issues table (all nullable)
Ingestion pipeline changes:
- New `enrich_issue_statuses_txn()` function that applies fetched
statuses in a single transaction with two phases: clear stale statuses
for issues that no longer have a status widget, then apply new/updated
statuses from the GraphQL response
- ProgressEvent variants for status enrichment (complete/skipped)
- IngestProjectResult tracks enrichment metrics (seen, enriched, cleared,
without_widget, partial_error_count, enrichment_mode, errors)
- Robot mode JSON output includes per-project status enrichment details
Configuration:
- New `sync.fetchWorkItemStatus` config option (defaults true) to disable
GraphQL status enrichment on instances without Premium/Ultimate
- `LoreError::GitLabAuthFailed` now treated as permanent API error so
status enrichment auth failures don't trigger retries
Also removes the unnecessary nested SAVEPOINT in store_closes_issues_refs
(already runs within the orchestrator's transaction context).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds mr_diffs_fetched and mr_diffs_failed fields to IngestResult and
SyncResult, threads them through the orchestrator aggregation, includes
them in the structured tracing span and human-readable sync summary.
Previously MR diff failures were silently swallowed — now they appear
alongside resource event counts for full pipeline observability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Chain: bd-jec (config flag) -> bd-2yo (fetch MR diffs) -> bd-3qn6 (rewrite who queries)
- Add fetch_mr_file_changes config option and --no-file-changes CLI flag
- Add GitLab MR diffs API fetch pipeline with watermark-based sync
- Create migration 020 for diffs_synced_for_updated_at watermark column
- Rewrite query_expert() and query_overlap() to use 4-signal UNION ALL:
DiffNote reviewers, DiffNote MR authors, file-change authors, file-change reviewers
- Deduplicate across signal types via COUNT(DISTINCT CASE WHEN ... THEN mr_id END)
- Add insert_file_change test helper, 8 new who tests, all 397 tests pass
- Also includes: list performance migration 019, autocorrect module, README updates
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Robot mode consistency improvements across all command output:
Timing:
- Every robot JSON response now includes meta.elapsed_ms measuring
wall-clock time from command start to serialization. Agents can use
this to detect slow queries and tune --limit or --project filters.
Field selection (--fields):
- print_list_issues_json and print_list_mrs_json accept an optional
fields slice that prunes each item in the response array to only
the requested keys. A "minimal" preset expands to
[iid, title, state, updated_at_iso] for token-efficient agent scans.
- filter_fields and expand_fields_preset live in the new
src/cli/robot.rs module alongside RobotMeta.
Actionable error recovery:
- LoreError gains an actions() method returning concrete shell commands
an agent can execute to recover (e.g. "ollama serve" for
OllamaUnavailable, "lore init" for ConfigNotFound).
- RobotError now serializes an "actions" array (empty array omitted)
so agents can parse and offer one-click fixes.
Envelope consistency:
- show issue/MR JSON responses now use the standard
{"ok":true,"data":...,"meta":...} envelope instead of bare data,
matching all other commands.
Files: src/cli/robot.rs (new), src/core/error.rs,
src/cli/commands/{count,embed,generate_docs,ingest,list,show,stats,sync_status}.rs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three fixes to the sync pipeline:
1. Atomic watermarks: wrap complete_job + update_watermark in a single
SQLite transaction so crash between them can't leave partial state.
2. Concurrent drain loops: prefetch HTTP requests via join_all (batch
size = dependent_concurrency), then write serially to DB. Reduces
~9K sequential requests from ~19 min to ~2.4 min.
3. Graceful shutdown: install Ctrl+C handler via ShutdownSignal
(Arc<AtomicBool>), thread through orchestrator/CLI, release locked
jobs on interrupt, record sync_run as "failed".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enables preview of operations without making changes, useful for
understanding what would happen before committing to a full sync.
Ingest dry-run (--dry-run flag):
- Shows resource type, sync mode (full vs incremental), project list
- Per-project info: existing count, has_cursor, last_synced timestamp
- No GitLab API calls, no database writes
Sync dry-run (--dry-run flag):
- Preview all four stages: issues ingest, MRs ingest, docs, embed
- Shows which stages would run vs be skipped (--no-docs, --no-embed)
- Per-project breakdown for both entity types
Stats repair dry-run (--dry-run flag):
- Shows what would be repaired without executing repairs
- "would fix" vs "fixed" indicator in terminal output
- dry_run: true field in JSON response
Implementation details:
- DryRunPreview struct captures project-level sync state
- SyncDryRunResult aggregates previews for all sync stages
- Terminal output uses yellow styling for "would" actions
- JSON output includes dry_run: true at top level
Flag handling:
- --dry-run and --no-dry-run pair for explicit control
- Defaults to false (normal operation)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Removes module-level doc comments (//! lines) and excessive inline doc
comments that were duplicating information already evident from:
- Function/struct names (self-documenting code)
- Type signatures (the what is clear from types)
- Implementation context (the how is clear from code)
Affected modules:
- cli/* - Removed command descriptions duplicating clap help text
- core/* - Removed module headers and obvious function docs
- documents/* - Removed extractor/regenerator/truncation docs
- embedding/* - Removed pipeline and chunking docs
- gitlab/* - Removed client and transformer docs (kept type definitions)
- ingestion/* - Removed orchestrator and ingestion docs
- search/* - Removed FTS and vector search docs
Philosophy: Code should be self-documenting. Comments should explain
"why" (business decisions, non-obvious constraints) not "what" (which
the code itself shows). This change reduces noise and maintenance burden
while keeping the codebase just as understandable.
Retains comments for:
- Non-obvious business logic
- Important safety invariants
- Complex algorithm explanations
- Public API boundaries where generated docs matter
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The sync command's stage spinners now show real-time aggregate progress
for each pipeline phase instead of static "syncing..." messages.
- Add `progress_callback` parameter to `run_embed` and
`run_generate_docs` so callers can receive `(processed, total)` updates
- Add `stage_bar` parameter to `run_ingest` for aggregate progress
across concurrently-ingested projects using shared AtomicUsize counters
- Update `stage_spinner` to use `{prefix}` for the `[N/M]` label,
allowing `{msg}` to be updated independently with progress details
- Thread `ProgressBar` clones into each concurrent project task so
per-entity progress (fetch, discussions, events) is reflected on the
aggregate spinner
- Pass `None` for progress callbacks at standalone CLI entry points
(handle_ingest, handle_generate_docs, handle_embed) to preserve
existing behavior when commands are run outside of sync
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add end-to-end observability to the sync and ingest pipelines:
Sync command:
- Generate UUID-based run_id for each sync invocation, propagated through
all child spans for log correlation across stages
- Accept MetricsLayer reference to extract hierarchical StageTiming data
after pipeline completion for robot-mode performance output
- Record sync runs in DB via SyncRunRecorder (start/succeed/fail lifecycle)
- Wrap entire sync execution in a root tracing span with run_id field
Ingest command:
- Wrap run_ingest in an instrumented root span with run_id and resource_type
- Add project path prefix to discussion progress bars for multi-project clarity
- Reset resource_events_synced_for_updated_at on --full re-sync
Sync status:
- Expand from single last_run to configurable recent runs list (default 10)
- Parse and expose StageTiming metrics from stored metrics_json
- Add run_id, total_items_processed, total_errors to SyncRunInfo
- Add mr_count to DataSummary for complete entity coverage
Orchestrator:
- Add #[instrument] with structured fields to issue and MR ingestion functions
- Record items_processed, items_skipped, errors on span close for MetricsLayer
- Emit granular progress events (IssuesFetchStarted, IssuesFetchComplete)
- Pass project_id through to drain_resource_events for scoped job claiming
Document regenerator and embedding pipeline:
- Add #[instrument] spans with items_processed, items_skipped, errors fields
- Record final counts on span close for metrics extraction
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The sync pipeline was bottlenecked at 10 req/s (hardcoded) with
sequential project processing and no retry on rate limiting. These
changes target 3-5x throughput improvement.
Rate limit configuration:
- Add requestsPerSecond to SyncConfig (default 30.0, was hardcoded 10)
- Pass configured rate through to GitLabClient::new from ingest
- Floor rate at 0.1 rps in RateLimiter::new to prevent panic on
Duration::from_secs_f64(1.0 / 0.0) — now reachable via user config
429 auto-retry:
- Both request() and request_with_headers() retry up to 3 times on
HTTP 429, respecting the retry-after header (default 60s)
- Extract parse_retry_after helper, reused by handle_response fallback
- After exhausting retries, the 429 error propagates as before
- Improved JSON decode errors now include a response body preview
Concurrent project ingestion:
- Derive Clone on GitLabClient (cheap: shares Arc<Mutex<RateLimiter>>
and reqwest::Client which is already Arc-backed)
- Restructure project loop to use futures::stream::buffer_unordered
with primary_concurrency (default 4) as the parallelism bound
- Each project gets its own SQLite connection (WAL mode + busy_timeout
handles concurrent writes)
- Add show_spinner field to IngestDisplay to separate the per-project
spinner from the sync-level stage spinner
- Error aggregation defers failures: all successful projects get their
summaries printed and results counted before returning the first error
- Bump dependentConcurrency default from 2 to 8 for discussion prefetch
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Three bugs fixed:
1. Early return in orchestrator when no discussions needed sync also
skipped resource event enqueue+drain. On incremental syncs (the most
common case), resource events were never fetched. Restructured to use
if/else instead of early return so Step 4 always executes.
2. Ingest command JSON and human-readable output silently dropped
resource_events_fetched/failed counts. Added to IngestJsonData and
print_ingest_summary.
3. Progress bar reuse after finish_and_clear caused indicatif to silently
ignore subsequent set_position/set_length calls. Added reset() call
before reconfiguring the bar for resource events.
Also removed stale comment referencing "unsafe" that didn't reflect
the actual unchecked_transaction approach.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Integrate resource event fetching as Step 4 of both issue and MR
ingestion, gated behind the fetch_resource_events config flag.
Orchestrator changes:
- Add ProgressEvent variants: ResourceEventsFetchStarted,
ResourceEventFetched, ResourceEventsFetchComplete
- Add resource_events_fetched/failed fields to IngestProjectResult
and IngestMrProjectResult
- New enqueue_resource_events_for_entity_type() queries all
issues/MRs for a project and enqueues resource_events jobs via
the dependent queue (INSERT OR IGNORE for idempotency)
- New drain_resource_events() claims jobs in batches, fetches
state/label/milestone events from GitLab API, stores them
atomically via unchecked_transaction, and handles failures
with exponential backoff via fail_job()
- Max-iterations guard prevents infinite retry loops within a
single drain run
- New store_resource_events() + per-type _tx helpers write events
using prepared statements inside a single transaction
- DrainResult struct tracks fetched/failed counts
CLI ingest changes:
- IngestResult gains resource_events_fetched/failed fields
- Progress bar repurposed for resource event fetch phase
(reuses discussion bar with updated template)
- Accumulates event counts from both issue and MR ingestion
CLI sync changes:
- SyncResult gains resource_events_fetched/failed fields
- Accumulates counts from both ingest stages
- print_sync() conditionally displays event counts
- Structured logging includes event counts
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ingest:
- Introduce IngestDisplay struct with show_progress/show_text booleans
to decouple progress bars from text output. Replaces the robot_mode
bool parameter with explicit display control, enabling sync to show
progress without duplicating summary text (progress_only mode).
- Use resolve_project() for --project filtering instead of LIKE queries,
providing proper error messages for ambiguous or missing projects.
List:
- Add colored_cell() helper that checks console::colors_enabled() before
applying comfy-table foreground colors, bridging the gap between the
console and comfy-table crates for --color flag support.
- Use resolve_project() for project filtering (exact ID match).
- Improve since filter to return explicit errors instead of silently
ignoring invalid values.
- Improve format_relative_time for proper singular/plural forms.
Search:
- Validate --after/--updated-after with explicit error messages.
- Handle optional title field (Option<String>) in HydratedRow.
Show:
- Use resolve_project() for project disambiguation.
Sync:
- Thread robot_mode via SyncOptions for IngestDisplay selection.
- Use IngestDisplay::progress_only() in interactive sync mode.
GenerateDocs:
- Use resolve_project() for --project filtering.
Co-Authored-By: Claude (us.anthropic.claude-opus-4-5-20251101-v1:0) <noreply@anthropic.com>
Extends the CLI with six new commands that complete the search pipeline:
- lore search <QUERY>: Hybrid search with mode selection (lexical,
hybrid, semantic), rich filtering (--type, --author, --project,
--label, --path, --after, --updated-after), result limits, and
optional explain mode showing RRF score breakdowns. Safe FTS mode
sanitizes user input; raw mode passes through for power users.
- lore stats: Document and index statistics with optional --check
for integrity verification and --repair to fix inconsistencies
(orphaned documents, missing FTS entries, stale dirty queue items).
- lore embed: Generate vector embeddings via Ollama. Supports
--retry-failed to re-attempt previously failed embeddings.
- lore generate-docs: Drain the dirty queue to regenerate documents.
--full seeds all entities for complete rebuild. --project scopes
to a single project.
- lore sync: Full pipeline orchestration (ingest issues + MRs,
generate-docs, embed) with --no-embed and --no-docs flags for
partial runs. Reports per-stage results and total elapsed time.
- lore health: Quick pre-flight check (config exists, DB exists,
schema current). Returns exit code 1 if unhealthy. Designed for
agent pre-flight scripts.
- lore robot-docs: Machine-readable command manifest for agent
self-discovery. Returns all commands, flags, examples, exit codes,
and recommended workflows as structured JSON.
Also enhances lore init with --gitlab-url, --token-env-var, and
--projects flags for fully non-interactive robot-mode initialization.
Fixes init's force/non-interactive precedence logic and adds JSON
output for robot mode.
Updates all command files for the GiError -> LoreError rename.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ingestion counters (discussions_upserted, notes_upserted,
discussions_fetched, diffnotes_count) were incremented before
tx.commit(), meaning a failed commit would report inflated
metrics. Counters now increment only after successful commit
so reported numbers accurately reflect persisted state.
Also simplifies the stale-removal guard in issue discussions:
the received_first_response flag was unnecessary since an empty
seen_discussion_ids list is safe to pass to remove_stale -- if
there were no discussions, stale removal correctly sweeps all
previously-stored discussions. The two separate code paths
(empty vs populated) are collapsed into a single branch.
Derives Default on IngestResult to eliminate verbose zero-init.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Extends all data commands to support merge requests alongside issues,
with consistent patterns and JSON output for robot mode.
List command (gi list mrs):
- MR-specific columns: branches, draft status, reviewers
- Filters: --state (opened|merged|closed|locked|all), --draft,
--no-draft, --reviewer, --target-branch, --source-branch
- Discussion count with unresolved indicator (e.g., "5/2!")
- JSON output includes full MR metadata
Show command (gi show mr <iid>):
- MR details with branches, assignees, reviewers, merge status
- DiffNote positions showing file:line for code review comments
- Full description and discussion bodies (no truncation in JSON)
- --json flag for structured output with ISO timestamps
Count command (gi count mrs):
- MR counting with optional --type filter for discussions/notes
- JSON output with breakdown by state
Ingest command (gi ingest --type mrs):
- Full MR sync with discussion prefetch
- Progress output shows MR-specific metrics (diffnotes count)
- JSON summary with comprehensive sync statistics
All commands respect global --robot mode for auto-JSON output.
The pattern "gi list mrs --json | jq '.mrs[] | .iid'" now works
for scripted MR processing.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This is a P1 fix from the CP1-CP2 alignment audit. The --full flag was
designed to enable complete data re-synchronization, but it only reset
sync_cursors for issues—it failed to reset the per-issue
discussions_synced_for_updated_at watermark.
The result was an inconsistent state: issues would be re-fetched from
GitLab (because sync_cursors were cleared), but their discussions would
NOT be re-synced (because the watermark comparison prevented it). This
was a subtle bug because the watermark check uses:
WHERE updated_at > COALESCE(discussions_synced_for_updated_at, 0)
When discussions_synced_for_updated_at is already set to the issue's
updated_at, the comparison fails and discussions are skipped.
Fix: Before clearing sync_cursors, set discussions_synced_for_updated_at
to NULL for all issues in the project. This makes COALESCE return 0,
ensuring all issues become eligible for discussion sync.
The ordering is important: watermarks must be reset BEFORE cursors to
ensure the full sync behaves consistently.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>