
teernisse a0519a4d0d feat(surgical-sync): add per-IID surgical sync pipeline

Implement lore sync --issue <IID> --mr <IID> -p <project> for on-demand
sync of specific entities without running the full project-wide pipeline.
Completes in seconds by fetching only targeted entities, their discussions,
resource events, and dependent data, then scoping doc regeneration and
embedding to only affected documents.

Pipeline stages: PREFLIGHT -> TOCTOU -> INGEST -> DEPENDENTS -> DOCS -> EMBED

New files:
- src/ingestion/surgical.rs: TOCTOU guard, preflight fetch, per-entity ingest
- src/ingestion/surgical_tests.rs: 17 unit/wiremock tests
- src/cli/commands/sync_surgical.rs: 719-line orchestrator
- src/embedding/pipeline_tests.rs: scoped embedding tests
- src/gitlab/client_tests.rs: get_by_iid wiremock tests
- migrations/027_surgical_sync_runs.sql: 12 surgical columns + indexes

Key changes:
- SyncOptions: issue_iids, mr_iids, project, preflight_only fields
- SyncResult: surgical_mode, surgical_iids, entity_results fields
- SyncRunRecorder: surgical lifecycle methods (set_surgical_metadata, etc)
- GitLabClient: get_issue_by_iid, get_mr_by_iid
- Scoped docs: regenerate_dirty_documents_for_sources
- Scoped embed: embed_documents_by_ids
- run_sync dispatches to run_sync_surgical when is_surgical()
- robot-docs updated with surgical sync schema + workflows
- All 1019 tests pass, clippy clean

Closes: bd-1sc6, bd-tiux, bd-159p, bd-1lja, bd-hs6j, bd-1elx, bd-arka,
        bd-3sez, bd-wcja, bd-kanh, bd-1i4i, bd-3bec

2026-02-18 15:39:14 -05:00

3.5 KiB


Trace/File-History Empty-Result Diagnostics

AC-1: Human mode shows searched paths on empty results

When lore trace <path> returns 0 chains in human mode, the output includes the resolved path(s) that were searched. If renames were followed, the output also shows the full rename chain.

AC-2: Human mode shows actionable reason on empty results

When 0 chains are found, the hint message distinguishes between two cases:

- "No MR file changes synced yet" (the mr_file_changes table is empty for this project) -> suggest lore sync
- "File paths not found in MR file changes" (sync has run, but this file has no matches) -> suggest checking the path, or note that the file may predate the sync window

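The two cases in AC-2 could be told apart with a check along these lines. This is only a sketch: the `EmptyReason` type, the `classify_empty` and `hint` functions, and the wording after each colon are hypothetical, and the row count is assumed to come from a query against mr_file_changes for the project.

```rust
/// Why a trace came back empty (hypothetical type, not named in the spec).
#[derive(Debug, PartialEq)]
enum EmptyReason {
    /// The mr_file_changes table has no rows for this project.
    NothingSynced,
    /// Rows exist, but none match the searched paths.
    PathNotFound,
}

/// Pick a reason from the project's mr_file_changes row count.
/// Intended to be called only when the trace returned 0 chains.
fn classify_empty(project_file_change_rows: u64) -> EmptyReason {
    if project_file_change_rows == 0 {
        EmptyReason::NothingSynced
    } else {
        EmptyReason::PathNotFound
    }
}

/// Human-readable hint. The text before the colon follows AC-2;
/// the suggestion after it is illustrative wording.
fn hint(reason: &EmptyReason) -> &'static str {
    match reason {
        EmptyReason::NothingSynced => "No MR file changes synced yet: run `lore sync`",
        EmptyReason::PathNotFound => {
            "File paths not found in MR file changes: check the path, or the file may predate the sync window"
        }
    }
}

fn main() {
    println!("{}", hint(&classify_empty(0)));
    println!("{}", hint(&classify_empty(42)));
}
```

Keeping the classification separate from the message text would let robot mode (AC-3) reuse the same reason with different rendering.
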
AC-3: Robot mode includes diagnostics object on empty results

When total_chains == 0 in robot JSON output, add a "diagnostics" key to "meta" containing:

- paths_searched: [...] (already present as resolved_paths in data -- no duplication needed)
- hints: [string] -- same actionable reasons as AC-2 but machine-readable

AC-4: Info-level logging at each pipeline stage

Add tracing::info! calls visible with -v:

- After rename resolution: number of paths found
- After MR query: number of MRs found
- After issue/discussion enrichment: counts per MR

AC-5: Apply same pattern to `lore file-history`

All of the above (AC-1 through AC-4) also apply to lore file-history empty results.

Secure Token Resolution for Cron

AC-6: Config file supports optional stored token

config.json accepts an optional "token" field in the gitlab section. Existing configs without this field continue to load without error.

AC-7: Token resolution chain: env var wins, config file is the fallback

When resolving the GitLab token, the CLI checks in order: (1) the environment variable named by tokenEnvVar, (2) the token field in the config file. The first non-empty value wins. If neither is set, the existing TOKEN_NOT_SET error is returned.
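
A minimal sketch of this resolution order, under stated assumptions: the `GitlabConfig` struct shape, the `TokenSource` and `TokenNotSet` types, the return type, and the `resolve_token_with` helper (which takes the environment lookup as a closure so the order can be exercised without touching the real environment) are all hypothetical. Only `resolve_token()` and `token_env_var` are named in this spec.

```rust
use std::env;

/// Hypothetical mirror of the `gitlab` section of config.json.
struct GitlabConfig {
    token_env_var: String,
    token: Option<String>,
}

/// Where the token was found (assumed type).
#[derive(Debug, PartialEq)]
enum TokenSource {
    EnvVar,
    ConfigFile,
}

/// Stand-in for the existing TOKEN_NOT_SET error.
#[derive(Debug, PartialEq)]
struct TokenNotSet;

impl GitlabConfig {
    /// Resolution order: env var first, then config file; empty values are skipped.
    fn resolve_token_with<F>(&self, lookup: F) -> Result<(String, TokenSource), TokenNotSet>
    where
        F: Fn(&str) -> Option<String>,
    {
        if let Some(v) = lookup(&self.token_env_var).filter(|v| !v.is_empty()) {
            return Ok((v, TokenSource::EnvVar));
        }
        if let Some(v) = self.token.as_ref().filter(|v| !v.is_empty()) {
            return Ok((v.clone(), TokenSource::ConfigFile));
        }
        Err(TokenNotSet)
    }

    /// Same chain, reading the real process environment.
    fn resolve_token(&self) -> Result<(String, TokenSource), TokenNotSet> {
        self.resolve_token_with(|name| env::var(name).ok())
    }
}

fn main() {
    let cfg = GitlabConfig {
        token_env_var: "GITLAB_TOKEN".to_string(),
        token: Some("token-from-config".to_string()),
    };
    match cfg.resolve_token() {
        Ok((_, source)) => println!("token found via {:?}", source),
        Err(TokenNotSet) => println!("TOKEN_NOT_SET"),
    }
}
```

Returning the source alongside the value is one possible design; it would also give commands such as lore doctor and lore token show what they need to report where the token came from.
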

AC-8: `lore token set` stores token securely

A new lore token set subcommand:

- Accepts token via --token flag, stdin pipe, or interactive prompt (masked input)
- Validates the token against the GitLab API before storing
- Writes the token into the existing config.json without disturbing other fields
- Sets file permissions to 0600 (owner read/write only) on the config file after writing
- Works in both human and robot mode
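
The 0600 permission step can be sketched with the standard library alone. This is Unix-only, and the function names `restrict_to_owner` and `permission_bits` as well as the temporary file name are hypothetical; the real command would apply the call to the actual config.json path after writing it.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Restrict a file to owner read/write only (mode 0600). Unix-only.
fn restrict_to_owner(path: &Path) -> io::Result<()> {
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))
}

/// Read back the permission bits (lower nine bits of the mode).
fn permission_bits(path: &Path) -> io::Result<u32> {
    Ok(fs::metadata(path)?.permissions().mode() & 0o777)
}

fn main() -> io::Result<()> {
    // Demo on a throwaway file rather than a real config.
    let path = std::env::temp_dir().join("lore-config-permissions-demo.json");
    fs::write(&path, "{}")?;
    restrict_to_owner(&path)?;
    println!("mode after write: {:o}", permission_bits(&path)?);
    fs::remove_file(&path)
}
```
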

AC-9: `lore token show` reveals stored token

A new lore token show subcommand:

- Shows the token source (env var or config file) and a masked version by default
- --unmask flag reveals the full token value
- Works in both human and robot mode
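
The spec does not say what the masked version looks like. One possible scheme, shown purely as an assumption, keeps a short prefix visible and replaces every remaining character with an asterisk; the `mask_token` function and its `visible` parameter are hypothetical.

```rust
/// Mask a token for display, keeping only the first `visible` characters.
/// Tokens no longer than `visible` are masked entirely.
fn mask_token(token: &str, visible: usize) -> String {
    let total = token.chars().count();
    if total <= visible {
        // Too short to reveal anything safely: mask every character.
        return "*".repeat(total);
    }
    let prefix: String = token.chars().take(visible).collect();
    format!("{}{}", prefix, "*".repeat(total - visible))
}

fn main() {
    // Example token value is made up for illustration.
    println!("{}", mask_token("glpat-abc123", 4));
}
```

This variant preserves the token length in the output; a fixed-width mask would avoid revealing the length if that matters.
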

AC-10: All token lookups use centralized resolver

The 5 call sites that currently do std::env::var(&config.gitlab.token_env_var) are replaced with a single config.gitlab.resolve_token() method. No inline env var lookups remain for the GitLab token.

AC-11: `lore cron install` warns when no stored token

After installing the cron entry, if resolve_token() would fail without an env var (i.e., no token in config file), print a warning directing the user to run lore token set.

AC-12: `TOKEN_NOT_SET` error suggests `lore token set`

The error suggestion and actions array for TOKEN_NOT_SET include lore token set as the primary recommended fix, with env var export as the secondary option.

AC-13: `lore doctor` reports token source

The doctor command's GitLab check shows where the token was found: "from config file" or "from GITLAB_TOKEN env var", helping users diagnose cron issues.
