docs: add lore-service, work-item-status-graphql, and time-decay plans

Three implementation plans with iterative cross-model refinement:

- lore-service (5 iterations): HTTP service layer exposing lore's SQLite data via REST/SSE for integration with external tools (dashboards, IDE extensions, chat agents). Covers authentication, rate limiting, caching strategy, and webhook-driven sync triggers.
- work-item-status-graphql (7 iterations + TDD appendix): detailed implementation plan for the GraphQL-based work item status enrichment feature (now implemented). Includes the TDD appendix with test-first development specifications covering the GraphQL client, adaptive pagination, ingestion orchestration, CLI display, and robot-mode output.
- time-decay-expert-scoring (iteration 5 feedback): updates to the existing time-decay scoring plan incorporating feedback on decay-curve parameterization, recency weighting for discussion contributions, and staleness-detection thresholds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
186  plans/lore-service.feedback-1.md  Normal file
@@ -0,0 +1,186 @@
1. **Isolate scheduled behavior from manual `sync`**

Reasoning: Your current plan injects backoff into `handle_sync_cmd`, which affects all `lore sync` calls (including manual recovery runs). Scheduled behavior should be isolated so humans aren't unexpectedly blocked by service backoff.

```diff
@@ Context
-`lore sync` runs a 4-stage pipeline (issues, MRs, docs, embeddings) that takes 2-4 minutes.
+`lore sync` remains the manual/operator command.
+`lore service run` (hidden/internal) is the scheduled execution entrypoint.

@@ Commands & User Journeys
+### `lore service run` (hidden/internal)
+**What it does:** Executes one scheduled sync attempt with service-only policy:
+- applies service backoff policy
+- records service run state
+- invokes sync pipeline with configured profile
+- updates retry state on success/failure
+
+**Invocation:** scheduler always runs:
+`lore --robot service run --reason timer`

@@ Backoff Integration into `handle_sync_cmd`
-Insert **after** config load but **before** the dry_run check:
+Do not add backoff checks to `handle_sync_cmd`.
+Backoff logic lives only in `handle_service_run`.
```
2. **Use the DB as source of truth for service state (not a standalone JSON status file)**

Reasoning: You already have `sync_runs` in SQLite. A separate JSON status file creates split-brain and race/corruption risk. Keep JSON as an optional cache/export only.

```diff
@@ Status File
-Location: `{get_data_dir()}/sync-status.json`
+Primary state location: SQLite (`service_state` table) + existing `sync_runs`.
+Optional mirror file: `{get_data_dir()}/sync-status.json` (best-effort export only).

@@ File-by-File Implementation Details
-### `src/core/sync_status.rs` (NEW)
+### `migrations/015_service_state.sql` (NEW)
+CREATE TABLE service_state (
+    id INTEGER PRIMARY KEY CHECK (id = 1),
+    installed INTEGER NOT NULL DEFAULT 0,
+    platform TEXT,
+    interval_seconds INTEGER,
+    profile TEXT NOT NULL DEFAULT 'balanced',
+    consecutive_failures INTEGER NOT NULL DEFAULT 0,
+    next_retry_at_ms INTEGER,
+    last_error_code TEXT,
+    last_error_message TEXT,
+    updated_at_ms INTEGER NOT NULL
+);
+
+### `src/core/service_state.rs` (NEW)
+- read/write the state row
+- derive backoff/next_retry
+- join with latest `sync_runs` for status output
```
3. **Backoff policy should be configurable, jittered, and error-aware**

Reasoning: A fixed hardcoded backoff (`base=1800`) is wrong when the user sets a different interval. Permanent failures (bad token/config) should also not burn retries forever; they should enter a paused/error state.

```diff
@@ Backoff Logic
-// Exponential: base * 2^failures, capped at 4 hours
+// Exponential with jitter: base * 2^(failures-1), capped, ±20% jitter
+// Applies only to transient errors.
+// Permanent errors set `paused_reason` and stop retries until user action.

@@ CLI Definition Changes
+ServiceCommand::Resume, // clear paused state / failures
+ServiceCommand::Run,    // hidden

@@ Error Types
+ServicePaused,        // scheduler paused due to permanent error
+ServiceCommandFailed, // OS command failure with stderr context
```
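A minimal sketch of this policy in Rust. The function and type names (`next_retry`, `RetryDecision`) are illustrative, not from the plan; the jitter sample is injected so tests stay deterministic, and permanence is assumed to be classified upstream from the error code:

```rust
#[derive(PartialEq, Debug)]
enum RetryDecision {
    RetryAfterSecs(u64),
    Pause, // permanent error: stop retrying until user action
}

fn next_retry(
    base_secs: u64,  // from config, not hardcoded
    cap_secs: u64,   // e.g. 4 hours
    failures: u32,   // consecutive transient failures (>= 1)
    permanent: bool, // classified upstream from the error code
    jitter: f64,     // injected RNG sample in [0, 1)
) -> RetryDecision {
    if permanent {
        return RetryDecision::Pause;
    }
    // base * 2^(failures-1), with the shift clamped to avoid overflow
    let exp = base_secs.saturating_mul(1u64 << failures.saturating_sub(1).min(16));
    let capped = exp.min(cap_secs) as f64;
    // ±20% jitter around the capped delay
    let jittered = capped * (0.8 + 0.4 * jitter);
    RetryDecision::RetryAfterSecs(jittered as u64)
}

fn main() {
    // 2nd consecutive transient failure, base 1800s: 3600s at midpoint jitter
    println!("{:?}", next_retry(1800, 14_400, 2, false, 0.5));
    println!("{:?}", next_retry(1800, 14_400, 3, true, 0.5));
}
```

Injecting `jitter` rather than calling an RNG inside the function is what makes the "jitter bounds" tests in the testing section deterministic.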
4. **Add a pipeline-level single-flight lock**

Reasoning: Current locking is in the ingest stages; there's still overlap risk across full sync pipelines (docs/embed can overlap with another run). Add a top-level lock for scheduled and manual sync pipeline execution.

```diff
@@ Architecture
+Add `sync_pipeline` lock at top-level sync execution.
+Keep existing ingest lock (`sync`) for ingest internals.

@@ Backoff Integration into `handle_sync_cmd`
+Before starting the sync pipeline, acquire `AppLock` with:
+name = "sync_pipeline"
+stale_lock_minutes = config.sync.stale_lock_minutes
+heartbeat_interval_seconds = config.sync.heartbeat_interval_seconds
```
5. **Don't embed the token in service files by default**

Reasoning: Embedding a PAT into a unit/plist is a high-risk secret-leak path. Make secure storage explicit and default-safe.

```diff
@@ `lore service install [--interval 30m]`
+`lore service install [--interval 30m] [--token-source env-file|embedded]`
+Default: `env-file` (0600 perms, user-owned)
+`embedded` allowed only with explicit opt-in and a warning

@@ Robot output
- "token_embedded": true
+ "token_source": "env_file"

@@ Human output
- Note: Your GITLAB_TOKEN is embedded in the service file.
+ Note: Token is stored in a user-private env file (0600).
```
6. **Introduce a command-runner abstraction with timeout + stderr capture**

Reasoning: `launchctl`/`systemctl`/`schtasks` calls are failure-prone; you need consistent error mapping and deterministic tests.

```diff
@@ Platform Backends
-exports free functions that dispatch via `#[cfg(target_os)]`
+exports backend + shared `CommandRunner`:
+- run(cmd, args, timeout)
+- capture stdout/stderr/exit code
+- map failure to `ServiceCommandFailed { cmd, exit_code, stderr }`
```
7. **Persist an install manifest to avoid brittle file parsing**

Reasoning: Parsing the timer/plist for interval/state is fragile and platform-format dependent. Persist a manifest with checksums and expected artifacts.

```diff
@@ Platform Backends
-Same pattern for ... `get_interval_seconds()`
+Add manifest: `{data_dir}/service-manifest.json`
+Stores platform, interval, profile, generated files, and command.
+`service status` reads the manifest first, then verifies platform state.

@@ Acceptance criteria
+Install is idempotent:
+- if manifest+files already match, report `no_change: true`
+- if drift is detected, reconcile and rewrite
```
8. **Make the schedule profile explicit (`fast|balanced|full`)**

Reasoning: This makes the feature more useful and performance-tunable without requiring users to understand internal flags.

```diff
@@ `lore service install [--interval 30m]`
+`lore service install [--interval 30m] [--profile fast|balanced|full]`
+
+Profiles:
+- fast: `sync --no-docs --no-embed`
+- balanced (default): `sync --no-embed`
+- full: `sync`
```
9. **Upgrade `service status` to include scheduler health + a recent-run summary**

Reasoning: A single last-sync snapshot is too shallow. Include recent attempts and whether the scheduler is paused, backing off, or running.

```diff
@@ `lore service status`
-What it does: Shows whether the service is installed, its configuration, last sync result, and next scheduled run.
+What it does: Shows install state, scheduler state (running/backoff/paused), recent runs, and next run estimate.

@@ Robot output
- "last_sync": { ... },
- "backoff": null
+ "scheduler_state": "running|backoff|paused|idle",
+ "last_sync": { ... },
+ "recent_runs": [{"run_id":"...","status":"...","started_at_iso":"..."}],
+ "backoff": null,
+ "paused_reason": null
```
10. **Strengthen tests around determinism and cross-platform generation**

Reasoning: Time-based backoff and shell quoting are classic flaky points. Add a fake clock and a fake command runner for deterministic tests.

```diff
@@ Testing Strategy
+Add deterministic test seams:
+- `Clock` trait for backoff/now calculations
+- `CommandRunner` trait for backend command execution
+
+Add tests:
+- transient vs permanent error classification
+- backoff schedule with jitter bounds
+- manifest drift reconciliation
+- quoting/escaping for paths with spaces and special chars
+- `service run` does not modify manual `sync` behavior
```
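The `Clock` seam could be sketched as follows; the trait, `FakeClock`, and the `retry_due` consumer are illustrative names, not from the plan:

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Test seam: production code asks a `Clock` for "now" instead of
/// reading SystemTime directly, so backoff math is deterministic in tests.
trait Clock {
    fn now_ms(&self) -> i64;
}

struct SystemClock;
impl Clock for SystemClock {
    fn now_ms(&self) -> i64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_millis() as i64
    }
}

/// Fixed clock for tests.
struct FakeClock(i64);
impl Clock for FakeClock {
    fn now_ms(&self) -> i64 {
        self.0
    }
}

/// Example consumer: has the persisted retry deadline passed?
fn retry_due(clock: &dyn Clock, next_retry_at_ms: Option<i64>) -> bool {
    match next_retry_at_ms {
        None => true, // no backoff recorded: run is due
        Some(t) => clock.now_ms() >= t,
    }
}

fn main() {
    let clock = FakeClock(1_000);
    println!("{}", retry_due(&clock, Some(2_000))); // false: deadline not reached
    println!("{}", retry_due(&clock, None)); // true: nothing pending
}
```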
If you want, I can rewrite your full plan as a single clean revised document with these changes already integrated (instead of patch fragments).
182  plans/lore-service.feedback-2.md  Normal file
@@ -0,0 +1,182 @@
**High-Impact Revisions (ordered by priority)**
1. **Make service identity project-scoped (avoid collisions across repos/users)**

Analysis: The current fixed names (`com.gitlore.sync`, `LoreSync`, `lore-sync.timer`) will collide when users run multiple gitlore workspaces. This causes silent overwrites and broken uninstall/status behavior.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Commands & User Journeys / install
- lore service install [--interval 30m] [--profile balanced] [--token-source env-file]
+ lore service install [--interval 30m] [--profile balanced] [--token-source auto] [--name <optional>]
@@ Install Manifest Schema
+ /// Stable per-install identity (default derived from project root hash)
+ pub service_id: String,
@@ Platform Backends
- Label: com.gitlore.sync
+ Label: com.gitlore.sync.{service_id}
- Task name: LoreSync
+ Task name: LoreSync-{service_id}
- ~/.config/systemd/user/lore-sync.service
+ ~/.config/systemd/user/lore-sync-{service_id}.service
```
2. **Replace the token model with secure per-OS defaults**

Analysis: The current "env-file default" is not actually secure on macOS launchd (the token still ends up in the plist). On Windows, assumptions about the inherited environment are fragile. Use OS-native secure stores by default and keep `embedded` as explicit opt-in only.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Token storage strategies
-| env-file (default) | ...
+| auto (default) | macOS: Keychain, Linux: env-file (0600), Windows: Credential Manager |
+| env-file | Linux/systemd only |
 | embedded | ... explicit warning ...
@@ macOS launchd section
- env-file strategy stores canonical token in service-env but embeds token in plist
+ default strategy is Keychain lookup at runtime; no token persisted in plist
+ env-file is not offered on macOS
@@ Windows schtasks section
- token must be in user's system environment
+ default strategy stores token in Windows Credential Manager and injects at runtime
```
3. **Version and atomically persist the manifest/status**

Analysis: `Option<Self>` on read hides corruption, and non-atomic writes risk truncated JSON on crashes. This will create false "not installed" states and scheduler confusion.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Install Manifest Schema
+ pub schema_version: u32, // start at 1
+ pub updated_at_iso: String,
@@ Status File Schema
+ pub schema_version: u32, // start at 1
+ pub updated_at_iso: String,
@@ Read/Write
- read(path) -> Option<Self>
+ read(path) -> Result<Option<Self>, LoreError>
- write(...) -> std::io::Result<()>
+ write_atomic(...) -> std::io::Result<()> // tmp file + fsync + rename
```
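The tmp-file + fsync + rename pattern can be sketched as below; the sibling `.tmp` naming and the target path are illustrative assumptions, not from the plan:

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// Write `data` to `path` atomically: write a sibling tmp file, fsync it,
/// then rename over the target. Readers never observe a truncated file.
fn write_atomic(path: &Path, data: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut f = File::create(&tmp)?;
        f.write_all(data)?;
        f.sync_all()?; // flush to disk before the rename becomes visible
    }
    fs::rename(&tmp, path) // atomic on POSIX within one filesystem
}

fn main() -> std::io::Result<()> {
    let target = std::env::temp_dir().join("sync-status.json");
    write_atomic(&target, br#"{"schema_version":1}"#)?;
    println!("{}", fs::read_to_string(&target)?);
    Ok(())
}
```

Note the rename is only atomic when the tmp file lives on the same filesystem as the target, which is why the sketch uses a sibling path rather than the system temp directory for the tmp file.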
4. **Persist `next_retry_at_ms` instead of recomputing jitter**

Analysis: Deterministic jitter from a timestamp modulo is predictable and can herd retries. Persisting `next_retry_at_ms` at failure time makes status accurate, stable, and cheap to compute.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ SyncStatusFile
 pub consecutive_failures: u32,
+ pub next_retry_at_ms: Option<i64>,
@@ Backoff Logic
- compute backoff from last_run.timestamp_ms and deterministic jitter each read
+ compute backoff once on failure, store next_retry_at_ms, read-only comparison afterward
+ jitter algorithm: full jitter in [0, cap], injectable RNG for tests
```
5. **Add a circuit breaker for repeated transient failures**

Analysis: Infinite transient retries can run forever on systemic failures (DB corruption, bad network policy). After N transient failures, pause with an actionable reason.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Scheduler states
- backoff — transient failures, waiting to retry
+ backoff — transient failures, waiting to retry
+ paused — permanent error OR circuit breaker tripped after N transient failures
@@ Service run flow
- On transient failure: increment failures, compute backoff
+ On transient failure: increment failures, compute backoff, if failures >= max_transient_failures -> pause
```
6. **Stage-aware outcome policy (core freshness over all-or-nothing)**

Analysis: Failing embeddings/docs should not block issues/MRs freshness. Split stage outcomes and only treat core stages as hard-fail by default. This improves reliability and practical usefulness.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Context
- lore sync runs a 4-stage pipeline ... treated as one run result
+ lore service run records per-stage outcomes (issues, mrs, docs, embeddings)
@@ Status File Schema
+ pub stage_results: Vec<StageResult>,
@@ service run flow
- Execute sync pipeline with flags derived from profile
+ Execute stage-by-stage and classify severity:
+ - critical: issues, mrs
+ - optional: docs, embeddings
+ optional stage failures mark run as degraded, not failed
```
7. **Replace the cfg free-function backend with a trait-based backend**

Analysis: The current backend API is hard to test end-to-end without real OS commands. A `SchedulerBackend` trait enables deterministic integration tests and a cleaner architecture.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Platform Backends / Architecture
- exports free functions dispatched via #[cfg]
+ define trait SchedulerBackend { install, uninstall, state, file_paths, next_run }
+ provide LaunchdBackend, SystemdBackend, SchtasksBackend implementations
+ include FakeBackend for integration tests
```
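A minimal sketch of the trait-plus-fake shape, with the method set reduced for brevity; signatures and error types are illustrative assumptions, not the plan's API:

```rust
/// Reduced sketch of a `SchedulerBackend` trait with an in-memory fake.
/// Real implementations (launchd/systemd/schtasks) would shell out;
/// the fake lets install/uninstall/status paths run in tests untouched.
#[derive(Clone, Copy, PartialEq, Debug)]
enum BackendState {
    NotInstalled,
    Installed,
}

trait SchedulerBackend {
    fn install(&mut self, interval_seconds: u64) -> Result<(), String>;
    fn uninstall(&mut self) -> Result<(), String>;
    fn state(&self) -> BackendState;
}

/// In-memory fake: records what was requested, touches no OS scheduler.
struct FakeBackend {
    state: BackendState,
    interval_seconds: Option<u64>,
}

impl FakeBackend {
    fn new() -> Self {
        Self { state: BackendState::NotInstalled, interval_seconds: None }
    }
}

impl SchedulerBackend for FakeBackend {
    fn install(&mut self, interval_seconds: u64) -> Result<(), String> {
        self.state = BackendState::Installed;
        self.interval_seconds = Some(interval_seconds);
        Ok(())
    }
    fn uninstall(&mut self) -> Result<(), String> {
        self.state = BackendState::NotInstalled;
        self.interval_seconds = None;
        Ok(())
    }
    fn state(&self) -> BackendState {
        self.state
    }
}

fn main() {
    let mut b = FakeBackend::new();
    b.install(1800).unwrap();
    println!("{:?}", b.state()); // Installed
    b.uninstall().unwrap();
    println!("{:?}", b.state()); // NotInstalled
}
```

Command handlers would take `&mut dyn SchedulerBackend` (or a generic), so the same code path is exercised with real and fake backends.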
8. **Harden platform units and detect scheduler prerequisites**

Analysis: systemd user timers often fail silently without a user manager/linger; the launchd context can be wrong in headless sessions. Add explicit diagnostics and unit hardening.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Linux systemd unit
 [Service]
 Type=oneshot
 ExecStart=...
+TimeoutStartSec=900
+NoNewPrivileges=true
+PrivateTmp=true
+ProtectSystem=strict
+ProtectHome=read-only
@@ Linux install/status
+ detect user manager availability and linger state; surface warning/action
@@ macOS install/status
+ detect non-GUI bootstrap context and return actionable error
```
9. **Add operational commands: `trigger`, `doctor`, and a non-interactive log tail**

Analysis: `logs` opening an editor is weak for automation and incident response. Operators need a preflight check and an immediate controlled run.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ ServiceCommand
+ Trigger, // run one attempt through service policy now
+ Doctor,  // validate scheduler, token, paths, permissions
@@ logs
- opens editor
+ supports --tail <n> and --follow in human mode
+ robot mode can optionally return the last n lines
```
10. **Fix plan inconsistencies and edge-case correctness**

Analysis: There are internal mismatches that will cause implementation drift.

Diff:

```diff
--- a/plan.md
+++ b/plan.md
@@ Interval Parsing
- supports 's' suffix
+ remove 's' suffix (acceptance only allows 5m..24h)
@@ uninstall acceptance
- removes ALL service files only
+ explicitly also remove service-manifest and service-env (status/logs retained)
@@ SyncStatusFile schema
- pub last_run: SyncRunRecord
+ pub last_run: Option<SyncRunRecord> // matches idle/no-runs state
```
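The tightened interval rule ('m'/'h' suffixes only, 5m..24h acceptance range) could look like this; the function name `parse_interval` is hypothetical:

```rust
/// Parse "30m" / "2h" into seconds, rejecting the 's' suffix and
/// enforcing the 5m..24h acceptance range.
fn parse_interval(s: &str) -> Result<u64, String> {
    let (num, unit) = s.split_at(s.len().saturating_sub(1));
    let n: u64 = num.parse().map_err(|_| format!("invalid interval: {s}"))?;
    let secs = match unit {
        "m" => n * 60,
        "h" => n * 3600,
        _ => return Err(format!("unsupported unit in {s:?} (use m or h)")),
    };
    // acceptance range: 5 minutes .. 24 hours
    if !(300..=86_400).contains(&secs) {
        return Err(format!("interval out of range (5m..24h): {s}"));
    }
    Ok(secs)
}

fn main() {
    println!("{:?}", parse_interval("30m")); // Ok(1800)
    println!("{:?}", parse_interval("30s")); // Err: unsupported unit
}
```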
---
**Recommended Architecture Upgrade Summary**
The strongest improvement set is: **(1) project-scoped IDs, (2) secure token defaults, (3) atomic/versioned state, (4) persisted retry schedule + circuit breaker, (5) stage-aware outcomes**. That combination materially improves correctness, multi-repo safety, security, operability, and real-world reliability without changing your core manual-vs-scheduled separation principle.
174  plans/lore-service.feedback-3.md  Normal file
@@ -0,0 +1,174 @@
Below are the highest-impact revisions I’d make, ordered by severity/ROI. These focus on correctness first, then security, then operability and UX.
1. **Fix multi-install ambiguity (`service_id` exists, but commands can't target one explicitly)**

Analysis: The plan introduces `service-manifest-{service_id}.json`, but `status`/`uninstall`/`resume`/`logs` have no selector. In a multi-workspace or multi-name install scenario, behavior becomes ambiguous and error-prone. Add explicit targeting plus discovery.

```diff
@@ ## Commands & User Journeys
+### `lore service list`
+Lists installed services discovered from `{data_dir}/service-manifest-*.json`.
+Robot output includes `service_id`, `platform`, `interval_seconds`, `profile`, `installed_at_iso`.

@@ ### `lore service uninstall`
-### `lore service uninstall`
+### `lore service uninstall [--service <service_id|name>] [--all]`
@@
-2. CLI reads install manifest to find `service_id`
+2. CLI resolves target service via `--service` or current-project-derived default.
+3. If multiple candidates and no selector, return actionable error.

@@ ### `lore service status`
-### `lore service status`
+### `lore service status [--service <service_id|name>]`
```
2. **Make status state service-scoped (not global)**

Analysis: A single `sync-status.json` for all services causes cross-service contamination (pause/backoff/outcome from one profile affecting another). Keep the lock global, but state per service.

```diff
@@ ## Status File
-### Location
-`{get_data_dir()}/sync-status.json`
+### Location
+`{get_data_dir()}/sync-status-{service_id}.json`

@@ ## Paths Module Additions
-pub fn get_service_status_path() -> PathBuf {
-    get_data_dir().join("sync-status.json")
+pub fn get_service_status_path(service_id: &str) -> PathBuf {
+    get_data_dir().join(format!("sync-status-{service_id}.json"))
 }
@@
-Note: `sync-status.json` is NOT scoped by `service_id`
+Note: status is scoped by `service_id`; the lock remains global (`sync_pipeline`) to prevent overlapping writes.
```
3. **Stop classifying permanence via string matching**

Analysis: Matching `"401 Unauthorized"` in strings is brittle and will misclassify edge cases. Carry machine codes through stage results and classify by `ErrorCode` only.

```diff
@@ pub struct StageResult {
- pub error: Option<String>,
+ pub error: Option<String>,
+ pub error_code: Option<String>, // e.g., AUTH_FAILED, NETWORK_ERROR
 }
@@ Error classification helpers
-fn is_permanent_error_message(msg: Option<&str>) -> bool { ...string contains... }
+fn is_permanent_error_code(code: Option<&str>) -> bool {
+    matches!(code, Some("TOKEN_NOT_SET" | "AUTH_FAILED" | "CONFIG_NOT_FOUND" | "CONFIG_INVALID" | "MIGRATION_FAILED"))
+}
```
4. **Install should be transactional (manifest written last)**

Analysis: The current order writes the manifest before the scheduler is enabled. If enable fails, you persist a false "installed" state. Use a two-phase install with rollback.

```diff
@@ ### `lore service install` User journey
-9. CLI writes install manifest ...
-10. CLI runs the platform-specific enable command
+9. CLI runs the platform-specific enable command
+10. On success, CLI writes the install manifest atomically
+11. On failure, CLI removes generated files and returns `ServiceCommandFailed`
```
5. **Fix the launchd token security gap (env-file currently still embeds the token)**

Analysis: The current "env-file" on macOS still writes the token into the plist, defeating the main security goal. Generate a private wrapper script that reads the env file at runtime and execs `lore`.

```diff
@@ ### macOS: launchd
-<key>ProgramArguments</key>
-<array>
-    <string>{binary_path}</string>
-    <string>--robot</string>
-    <string>service</string>
-    <string>run</string>
-</array>
+<key>ProgramArguments</key>
+<array>
+    <string>{data_dir}/service-run-{service_id}.sh</string>
+</array>
@@
-`env-file`: ... token value must still appear in plist ...
+`env-file`: token never appears in plist; wrapper loads `{data_dir}/service-env-{service_id}` at runtime.
```
6. **Improve backoff math and add half-open circuit recovery**

Analysis: The current jitter + min clamp makes the first retry deterministic and can over-pause. The circuit breaker also requires manual resume forever. Add a cooldown + half-open probe to self-heal.

```diff
@@ Backoff Logic
-let backoff_secs = ((base_backoff as f64) * jitter_factor) as u64;
-let backoff_secs = backoff_secs.max(base_interval_seconds);
+let max_backoff = base_backoff;
+let min_backoff = base_interval_seconds;
+let span = max_backoff.saturating_sub(min_backoff);
+let backoff_secs = min_backoff + ((span as f64) * jitter_factor) as u64;

@@ Scheduler states
-- `paused` — permanent error ... OR circuit breaker tripped ...
+- `paused` — permanent error requiring intervention
+- `half_open` — probe state after circuit cooldown; one trial run allowed

@@ Circuit breaker
-... transitions to `paused` ... Run: lore service resume
+... transitions to `half_open` after cooldown (default 30m). A successful probe closes the breaker automatically; a failed probe returns to backoff/paused.
```
7. **Promote the backend trait to v1 (not v2) for deterministic integration tests**

Analysis: This is a reliability-critical feature spanning OS schedulers. A trait abstraction now gives true behavior tests and safer refactors.

```diff
@@ ### Platform Backends
-> Future architecture note: A `SchedulerBackend` trait ... for v2.
+Adopt the `SchedulerBackend` trait in v1 with real backends (`launchd`/`systemd`/`schtasks`) and a `FakeBackend` for tests.
+This enables deterministic install/uninstall/status/run-path integration tests without touching the host scheduler.
```
8. **Harden `run_cmd` timeout behavior**

Analysis: If a timeout occurs, the child process must be killed and reaped. Otherwise you leak processes and can wedge repeated runs.

```diff
@@ fn run_cmd(...)
-// Wait with timeout
-let output = wait_with_timeout(output, timeout_secs)?;
+// Wait with timeout; on timeout kill child and wait to reap
+let output = wait_with_timeout_kill_and_reap(child, timeout_secs)?;
```
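A kill-and-reap helper can be done with std alone by polling `try_wait`; this is a sketch (the polling interval and the `Option<ExitStatus>` return convention are assumptions, and the demo relies on a Unix `sleep` binary being present):

```rust
use std::process::{Child, Command, ExitStatus};
use std::time::{Duration, Instant};

/// Poll the child until it exits or `timeout` elapses; on timeout,
/// kill it and still call `wait()` so the zombie is reaped.
fn wait_with_timeout_kill_and_reap(
    mut child: Child,
    timeout: Duration,
) -> std::io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status)); // exited in time
        }
        if Instant::now() >= deadline {
            child.kill()?; // SIGKILL / TerminateProcess
            child.wait()?; // reap so no zombie process is left behind
            return Ok(None); // caller maps this to a timeout error
        }
        std::thread::sleep(Duration::from_millis(50));
    }
}

fn main() -> std::io::Result<()> {
    // Demo: a child that would run 10s is killed after ~200ms.
    let child = Command::new("sleep").arg("10").spawn()?;
    let res = wait_with_timeout_kill_and_reap(child, Duration::from_millis(200))?;
    println!("timed out: {}", res.is_none());
    Ok(())
}
```

The explicit `wait()` after `kill()` is the part the plan's revision calls out: without it, the killed child stays as a zombie and repeated scheduled runs accumulate them.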
9. **Add manual control commands (`pause`, `trigger`, `repair`)**

Analysis: These are high-utility operational controls. `trigger` allows an immediate sync without waiting for the interval. `pause` supports maintenance windows. `repair` avoids manual file deletion for corrupt state.

```diff
@@ pub enum ServiceCommand {
+ /// Pause scheduled execution without uninstalling
+ Pause { #[arg(long)] reason: Option<String> },
+ /// Trigger an immediate one-off run using the installed profile
+ Trigger { #[arg(long)] ignore_backoff: bool },
+ /// Repair corrupt manifest/status by backing up and reinitializing
+ Repair { #[arg(long)] service: Option<String> },
 }
```
10. **Make `logs` non-interactive by default and add a rotation policy**

Analysis: Opening an editor by default is awkward for automation/SSH and slower for normal diagnosis. Defaulting to `tail` is more practical; `--open` can preserve the editor behavior.

```diff
@@ ### `lore service logs`
-By default, opens in the user's preferred editor.
+By default, prints the last 100 lines to stdout.
+Use `--open` to open an editor.
@@
+Log rotation: rotate `service-stdout.log` / `service-stderr.log` at 10 MB, keep 5 files.
```
11. **Remove the destructive/shell-unsafe suggested action**

Analysis: `actions(): ["rm {path}", ...]` is unsafe (shell injection + destructive guidance). Replace it with a safe command path.

```diff
@@ LoreError::actions()
-Self::ServiceCorruptState { path, .. } => vec![&format!("rm {path}"), "lore service install"],
+Self::ServiceCorruptState { .. } => vec!["lore service repair", "lore service install"],
```
12. **Tighten scheduler units for real-world reliability**

Analysis: Add an explicit working directory and success-exit handling to reduce environment drift and edge-case failures.

```diff
@@ systemd service unit
 [Service]
 Type=oneshot
 ExecStart={binary_path} --robot service run
+WorkingDirectory={data_dir}
+SuccessExitStatus=0
 TimeoutStartSec=900
```
If you want, I can produce a single consolidated “v3 plan” markdown with these revisions already merged into your original structure.
190  plans/lore-service.feedback-4.md  Normal file
@@ -0,0 +1,190 @@
No `## Rejected Recommendations` section was present in the plan you shared, so the proposals below are all net-new.
1. **Make scheduled runs explicitly target a single service instance**

Analysis: Right now `service run` has no selector, but the plan supports multiple installed services. That creates ambiguity and incorrect manifest/status selection. This is the most important architectural fix.

```diff
@@ `lore service install` What it does
- runs `lore --robot service run` at the specified interval
+ runs `lore --robot service run --service-id <service_id>` at the specified interval

@@ Robot output (`install`)
- "sync_command": "/usr/local/bin/lore --robot service run",
+ "sync_command": "/usr/local/bin/lore --robot service run --service-id a1b2c3d4",

@@ `ServiceCommand` enum
- #[command(hide = true)]
- Run,
+ #[command(hide = true)]
+ Run {
+     /// Internal selector injected by the scheduler backend
+     #[arg(long, hide = true)]
+     service_id: String,
+ },

@@ `handle_service_run` signature
-pub fn handle_service_run(start: std::time::Instant) -> Result<(), Box<dyn std::error::Error>>
+pub fn handle_service_run(service_id: &str, start: std::time::Instant) -> Result<(), Box<dyn std::error::Error>>

@@ run flow step 1
- Read install manifest
+ Read install manifest for `service_id`
```
2. **Strengthen `service_id` derivation to avoid cross-workspace collisions**
Analysis: hashing the config path alone can collide when many workspaces share one global config. Identity should represent what is being synced, not only where the config lives.

```diff
@@ Key Design Principles / Project-Scoped Service Identity
- derive from a stable hash of the config file path
+ derive from a stable fingerprint of:
+ - canonical workspace root
+ - normalized configured GitLab project URLs
+ - canonical config path
+ then take first 12 hex chars of SHA-256

@@ `compute_service_id`
- Returns first 8 hex chars of SHA-256 of the canonical config path.
+ Returns first 12 hex chars of SHA-256 of a canonical identity tuple
+ (workspace_root + sorted project URLs + config_path).
```

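The fingerprint derivation can be sketched as follows (Python here for brevity; the target codebase is Rust). The field ordering and NUL separator are illustrative assumptions, not the plan's final format:

```python
import hashlib

def compute_identity_hash(workspace_root: str, config_path: str, project_urls: list[str]) -> str:
    """First 12 hex chars of SHA-256 over a canonical identity tuple."""
    # Normalize URLs so trailing slashes and casing do not change identity.
    normalized_urls = sorted(u.rstrip("/").lower() for u in project_urls)
    # NUL separators keep the tuple unambiguous even if a field contains spaces.
    material = "\0".join([workspace_root, config_path, *normalized_urls])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:12]
```

Two workspaces sharing one global config now hash to different ids because the workspace root differs, which is the collision the revision targets.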
3. **Introduce a service-state machine with a dedicated admin lock**
Analysis: install/uninstall/pause/resume/repair/status can race each other. A lock and an explicit transition table prevent invalid states and file races.

```diff
@@ New section: Service State Model
+ All state mutations are serialized by `AppLock("service-admin-{service_id}")`.
+ Legal transitions:
+ - idle -> running -> success|degraded|backoff|paused
+ - backoff -> running|paused
+ - paused -> half_open|running (resume)
+ - half_open -> running|paused
+ Any invalid transition is rejected with `ServiceCorruptState`.

@@ `handle_install`, `handle_uninstall`, `handle_pause`, `handle_resume`, `handle_repair`
+ Acquire `service-admin-{service_id}` before mutating manifest/status/service files.
```

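The transition table is small enough to express as data; a sketch follows. The re-entry edges from finished run outcomes (`success`/`degraded` back to `running` on the next scheduled tick) are an assumption not spelled out in the diff:

```python
# Legal transitions from the proposed service-state model.
TRANSITIONS = {
    "idle": {"running"},
    "running": {"success", "degraded", "backoff", "paused"},
    "backoff": {"running", "paused"},
    "paused": {"half_open", "running"},
    "half_open": {"running", "paused"},
    # Assumed: a finished run re-enters "running" on the next scheduled tick.
    "success": {"running"},
    "degraded": {"running"},
}

class ServiceCorruptState(Exception):
    """Raised when a state mutation does not match the transition table."""

def transition(current: str, target: str) -> str:
    """Validate and perform one state transition."""
    if target not in TRANSITIONS.get(current, set()):
        raise ServiceCorruptState(f"illegal transition {current} -> {target}")
    return target
```

Keeping the table as data makes the "reject with `ServiceCorruptState`" rule trivially testable.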
4. **Unify manual and scheduled sync execution behind one orchestrator**
Analysis: the plan currently duplicates stage logic and error classification in `service run`, increasing drift risk. A shared orchestrator gives one authoritative pipeline behavior.

```diff
@@ Key Design Principles
+ #### 6. Single Sync Orchestrator
+ Both `lore sync` and `lore service run` call `SyncOrchestrator`.
+ Service mode adds policy (backoff/circuit-breaker); manual mode bypasses policy.

@@ Service Run Implementation
- execute_sync_stages(&sync_args)
+ SyncOrchestrator::run(SyncMode::Service { profile, policy })

@@ manual sync
- separate pipeline path
+ SyncOrchestrator::run(SyncMode::Manual { flags })
```

5. **Add bounded in-run retries for transient core-stage failures**
Analysis: single-shot failure handling will over-trigger backoff on temporary network blips. One short retry per core stage significantly improves freshness without much extra runtime.

```diff
@@ Stage-aware execution
+ Core stages (`issues`, `mrs`) get up to 1 immediate retry on transient errors
+ (jittered 1-5s). Permanent errors are never retried.
+ Optional stages keep best-effort semantics.

@@ Acceptance criteria (`service run`)
+ Retries transient core stage failures once before counting the run as failed.
```

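The retry rule can be sketched as a small wrapper; `is_transient` stands in for the plan's error classifier, and the injectable `sleep` parameter exists purely for testability:

```python
import random
import time

def run_core_stage(stage_fn, is_transient, sleep=time.sleep) -> None:
    """Run a core stage with at most one immediate retry on transient errors."""
    try:
        stage_fn()
    except Exception as err:
        if not is_transient(err):
            raise  # permanent errors are never retried
        sleep(random.uniform(1.0, 5.0))  # jittered 1-5s pause
        stage_fn()  # a second failure propagates and fails the run
```

Bounding retries to one attempt keeps worst-case run time predictable for the scheduler interval.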
6. **Harden persistence with full crash-safety semantics**
Analysis: the current atomic-write description is good but incomplete for power-loss durability. You should fsync the parent directory after rename and include lightweight integrity metadata.

```diff
@@ `write_atomic`
- tmp file + fsync + rename
+ tmp file + fsync(file) + rename + fsync(parent_dir)

@@ `ServiceManifest` and `SyncStatusFile`
+ pub write_seq: u64,
+ pub content_sha256: String, // optional integrity guard for repair/doctor
```

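A POSIX-oriented sketch of the upgraded `write_atomic`; the directory `fsync` after the rename is the step the diff adds (non-POSIX platforms would need their own handling):

```python
import os
import tempfile

def write_atomic(path: str, data: bytes) -> None:
    """tmp file + fsync(file) + rename + fsync(parent dir)."""
    parent = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=parent, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # durable file contents
        os.replace(tmp, path)     # atomic rename within one filesystem
        dir_fd = os.open(parent, os.O_RDONLY)
        try:
            os.fsync(dir_fd)      # durable directory entry after rename
        finally:
            os.close(dir_fd)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Without the final directory fsync, a power loss can leave the rename itself unrecorded even though the new file's contents were synced.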
7. **Fix token handling to avoid shell/env injection and add secure-store mode**
Analysis: sourcing env files in shell is brittle if the token contains special chars/newlines. Also, secure OS credential stores should be first-class for production reliability/security.

```diff
@@ Token storage strategies
-| `env-file` (default) ...
+| `auto` (default) | use secure-store when available, else env-file |
+| `secure-store` | macOS Keychain / libsecret / Windows Credential Manager |
+| `env-file` | explicit fallback |

@@ macOS wrapper script
-. "{data_dir}/service-env-{service_id}"
-export {token_env_var}
+TOKEN_VALUE="$(cat "{data_dir}/service-token-{service_id}")"
+export {token_env_var}="$TOKEN_VALUE"

@@ Acceptance criteria
+ Reject token values containing `\0` or newline for env-file mode.
+ Never eval/source untrusted token content.
```

8. **Correct platform/runtime implementation hazards**
Analysis: there are a few correctness risks that should be fixed in-plan now.

```diff
@@ macOS install steps
- Get UID via `unsafe { libc::getuid() }`
+ Get UID via safe API (`nix::unistd::Uid::current()` or equivalent safe helper)

@@ Command Runner Helper
- poll try_wait and read stdout/stderr after exit
+ avoid potential pipe backpressure deadlock:
+ use wait-with-timeout + concurrent stdout/stderr draining

@@ Linux timer
- OnUnitActiveSec={interval_seconds}s
+ OnUnitInactiveSec={interval_seconds}s
+ AccuracySec=1min
```

9. **Make logs fully service-scoped**
Analysis: you already scoped manifest/status by `service_id`; logs are still global in several places. Multi-service installs will overwrite each other’s logs.

```diff
@@ Paths Module Additions
-pub fn get_service_log_path() -> PathBuf
+pub fn get_service_log_path(service_id: &str, stream: LogStream) -> PathBuf

@@ log filenames
- logs/service-stderr.log
- logs/service-stdout.log
+ logs/service-{service_id}-stderr.log
+ logs/service-{service_id}-stdout.log

@@ `service logs`
- default path: `{data_dir}/logs/service-stderr.log`
+ default path: `{data_dir}/logs/service-{service_id}-stderr.log`
```

10. **Resolve internal spec contradictions and rollback gaps**
Analysis: there are a few contradictory statements and incomplete rollback behavior that will cause implementation churn.

```diff
@@ `service logs` behavior
- default (no flags): open in editor (human)
+ default (no flags): print last 100 lines (human and robot metadata mode)
+ `--open` is explicit opt-in

@@ install rollback
- On failure: removes generated service files
+ On failure: removes generated service files, env file, wrapper script, and temp manifest

@@ `handle_service_run` sample code
- let manifest_path = get_service_manifest_path();
+ let manifest_path = get_service_manifest_path(service_id);
```

If you want, I can take these revisions and produce a single consolidated “Iteration 4” replacement plan block with all sections rewritten coherently so it’s ready to hand to an implementer.

196	plans/lore-service.feedback-5.md	Normal file
@@ -0,0 +1,196 @@

I reviewed the full plan and avoided everything already listed in `## Rejected Recommendations`. These are the highest-impact revisions I’d make.

1. **Fix identity model inconsistency and prevent `--name` alias collisions**

Why this improves the plan: your text says identity includes workspace root, but the current derivation code does not. Also, using `--name` as the actual `service_id` risks accidental cross-project collisions and destructive updates.

```diff
--- a/plan.md
+++ b/plan.md
@@ Key Design Principles / 2. Project-Scoped Service Identity
- Each installed service gets a unique `service_id` derived from a canonical identity tuple: the config file path, sorted GitLab project URLs, and workspace root.
+ Each installed service gets an immutable `identity_hash` derived from a canonical identity tuple:
+ workspace root + canonical config path + sorted normalized project URLs.
+ `service_id` remains the scheduler identifier; `--name` is a human alias only.
+ If `--name` collides with an existing service that has a different `identity_hash`, install fails with an actionable error.

@@ Install Manifest / Schema
+ /// Immutable identity hash for collision-safe matching across reinstalls
+ pub identity_hash: String,
+ /// Optional human-readable alias passed via --name
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub service_alias: Option<String>,
+ /// Canonical workspace root used in identity derivation
+ pub workspace_root: String,

@@ service_id derivation
-pub fn compute_service_id(config_path: &Path, project_urls: &[&str]) -> String
+pub fn compute_identity_hash(workspace_root: &Path, config_path: &Path, project_urls: &[&str]) -> String
```

2. **Add lock protocol to eliminate uninstall/run race conditions**

Why this improves the plan: today `service run` does not take the admin lock, and admin commands do not take the pipeline lock. `uninstall` can race with an active run and remove files mid-execution.

```diff
--- a/plan.md
+++ b/plan.md
@@ Key Design Principles / 6. Serialized Admin Mutations
- The `service run` entrypoint does NOT acquire the admin lock — it only acquires the `sync_pipeline` lock
+ The `service run` entrypoint acquires only `sync_pipeline`.
+ Destructive admin operations (`install` overwrite, `uninstall`, `repair --regenerate`) must:
+ 1) acquire `service-admin-{service_id}`
+ 2) disable scheduler backend entrypoint
+ 3) acquire `sync_pipeline` lock with timeout
+ 4) mutate/remove files
+ This lock ordering is mandatory to prevent deadlocks and run/delete races.

@@ lore service uninstall / User journey
- 4. Runs platform-specific disable command
- 5. Removes service files from disk
+ 4. Acquires `sync_pipeline` lock (after disabling scheduler) with bounded wait
+ 5. Removes service files from disk only after lock acquisition
```

3. **Make transient handling `Retry-After` aware**

Why this improves the plan: rate-limit and 503 responses often carry retry hints. Ignoring them causes useless retries and longer outages.

```diff
--- a/plan.md
+++ b/plan.md
@@ Transient vs permanent error classification
-| Transient | Retry with backoff | Network timeout, rate limited, DB locked, 5xx from GitLab |
+| Transient | Retry with adaptive backoff | Network timeout, DB locked, 5xx from GitLab |
+| Transient (hinted) | Respect server retry hint | Rate limited with Retry-After/X-RateLimit-Reset |

@@ Backoff Logic
+ If an error includes a retry hint (e.g., `Retry-After`), set:
+ `next_retry_at_ms = max(computed_backoff, hinted_retry_at_ms)`.
+ Persist `backoff_reason` for status visibility.
```

4. **Decouple optional stage cadence from core sync interval**

Why this improves the plan: running docs/embeddings every 5–30 minutes is expensive and unnecessary. Separate freshness windows reduce cost/latency while keeping core data fresh.

```diff
--- a/plan.md
+++ b/plan.md
@@ Sync profiles
-| `balanced` (default) | `--no-embed` | Issues + MRs + doc generation |
-| `full` | (none) | Full pipeline including embeddings |
+| `balanced` (default) | core every interval, docs every 60m, no embeddings | Fast + useful docs |
+| `full` | core every interval, docs every interval, embeddings every 6h (default) | Full freshness with bounded cost |

@@ Status File / StageResult
+ /// true when stage intentionally skipped due to its freshness window
+ #[serde(default)]
+ pub skipped: bool,

@@ lore service run / Stage-aware execution
+ Optional stages may be skipped when their last successful run is within configured freshness windows.
+ Skipped optional stages do not count as failures and are recorded explicitly.
```

5. **Give Windows parity for secure token handling (env-file + wrapper)**

Why this improves the plan: the current Windows path requires a global/system env var and has poor UX. A wrapper+env-file model gives platform parity and avoids global token exposure.

```diff
--- a/plan.md
+++ b/plan.md
@@ Token storage strategies
-| On Windows, neither strategy applies — the token must be in the user's system environment
+| On Windows, `env-file` is supported via a generated wrapper script (`service-run-{service_id}.cmd` or `.ps1`)
+| that reads `{data_dir}/service-env-{service_id}` and launches `lore --robot service run ...`.
+| `embedded` remains opt-in and warned as less secure.

@@ Windows: schtasks
- Token handling on Windows: The env var must be set system-wide via `setx`
+ Token handling on Windows:
+ - `env-file` (default): wrapper script reads token from private file at runtime
+ - `embedded`: token passed via wrapper-set environment variable
+ - `system_env`: still supported as fallback
```

6. **Add run heartbeat and stale-run detection**

Why this improves the plan: the `running` state can become misleading after crashes or stale locks. Heartbeat metadata makes status accurate and improves incident triage.

```diff
--- a/plan.md
+++ b/plan.md
@@ Status File / Schema
+ /// In-flight run metadata for crash/stale detection
+ #[serde(skip_serializing_if = "Option::is_none")]
+ pub current_run: Option<CurrentRunState>,
+
+pub struct CurrentRunState {
+    pub run_id: String,
+    pub started_at_ms: i64,
+    pub last_heartbeat_ms: i64,
+    pub pid: u32,
+}

@@ lore service status
- - `running` — currently executing (sync_pipeline lock held)
+ - `running` — currently executing with live heartbeat
+ - `running_stale` — in-flight metadata exists but heartbeat exceeded stale threshold
```

7. **Upgrade drift detection from “loaded/unloaded” to spec-level drift**

Why this improves the plan: platform state alone misses manual edits to unit/plist/wrapper files. Spec-hash drift gives reliable “what changed?” diagnostics and safe regeneration.

```diff
--- a/plan.md
+++ b/plan.md
@@ Install Manifest / Schema
+ /// Hash of generated scheduler artifacts and command spec
+ pub spec_hash: String,

@@ lore service status
- Detects manifest/platform drift and reports it
+ Detects:
+ - platform drift (loaded/unloaded mismatch)
+ - spec drift (artifact content hash mismatch)
+ - command drift (sync command differs from manifest)

@@ lore service repair
+ Add `--regenerate` to rewrite scheduler artifacts from manifest when spec drift is detected.
+ This is non-destructive and does not delete status/log history.
```

8. **Add safe operational modes: `install --dry-run` and `doctor --fix`**

Why this improves the plan: dry-run reduces risk before writing OS scheduler files; fix-mode improves operator ergonomics and lowers support burden.

```diff
--- a/plan.md
+++ b/plan.md
@@ lore service install
+ Add `--dry-run`:
+ - validates config/token/prereqs
+ - renders service files and planned commands
+ - writes nothing, executes nothing

@@ lore service doctor
+ Add `--fix` for safe, non-destructive remediations:
+ - create missing dirs
+ - correct file permissions on env/wrapper files
+ - run `systemctl --user daemon-reload` when applicable
+ - report applied fixes in robot output
```

9. **Define explicit schema migration behavior (not just `schema_version` fields)**

Why this improves the plan: version fields without a migration policy become an operational risk during upgrades.

```diff
--- a/plan.md
+++ b/plan.md
@@ ServiceManifest Read/Write
- `ServiceManifest::read(path: &Path) -> Result<Option<Self>, LoreError>`
+ `ServiceManifest::read_and_migrate(path: &Path) -> Result<Option<Self>, LoreError>`
+ - Migrates known older schema versions to current in-memory model
+ - Rewrites migrated file atomically
+ - Fails with actionable `ServiceCorruptState` for unknown future major versions

@@ SyncStatusFile Read/Write
- `SyncStatusFile::read(path: &Path) -> Result<Option<Self>, LoreError>`
+ `SyncStatusFile::read_and_migrate(path: &Path) -> Result<Option<Self>, LoreError>`
```

If you want, I can produce a fully rewritten v5 plan text that integrates all nine changes coherently section-by-section.

3759	plans/lore-service.md	Normal file
File diff suppressed because it is too large

128	plans/time-decay-expert-scoring.feedback-5.md	Normal file
@@ -0,0 +1,128 @@

**Best Revisions To Strengthen The Plan**

1. **[Critical] Replace one-hop rename matching with canonical path identities**

Analysis and rationale: Current `old_path OR new_path` fixes direct renames, but it still breaks on rename chains (`a.rs -> b.rs -> c.rs`) and split/move patterns. A canonical `path_identity` graph built from `mr_file_changes(old_path, new_path)` gives stable identity over time, which is the right architectural boundary for expertise history.

```diff
@@ ## Context
-- Match both old and new paths in all signal queries AND path resolution probes so expertise survives file renames
+- Build canonical path identities from rename edges and score by identity, not raw path strings, so expertise survives multi-hop renames and moves

@@ ## Files to Modify
 2. **`src/cli/commands/who.rs`** — Core changes:
 ...
-   - Match both `new_path` and `old_path` in all signal queries (rename awareness)
+   - Resolve queried paths to `path_identity_id` and match all aliases in that identity set
+4. **`src/core/path_identity.rs`** — New module:
+   - Build/maintain rename graph from `mr_file_changes`
+   - Resolve path -> identity + aliases for probes/scoring
```

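The rename graph reduces to union-find over `(old_path, new_path)` edges; a minimal sketch, assuming rename edges are replayed in commit order (the Rust module would persist this mapping in SQLite):

```python
class PathIdentity:
    """Union-find over rename edges: paths linked by renames share one identity."""

    def __init__(self) -> None:
        self.parent: dict[str, str] = {}

    def find(self, path: str) -> str:
        self.parent.setdefault(path, path)
        while self.parent[path] != path:
            self.parent[path] = self.parent[self.parent[path]]  # path halving
            path = self.parent[path]
        return path

    def add_rename(self, old_path: str, new_path: str) -> None:
        self.parent[self.find(old_path)] = self.find(new_path)

    def aliases(self, path: str) -> set[str]:
        """All historical names of this file, to be matched in signal queries."""
        root = self.find(path)
        return {p for p in list(self.parent) if self.find(p) == root}
```

A query for `c.rs` then matches notes and MR signals recorded against any alias in the identity set, which is exactly what the one-hop `old_path OR new_path` predicate cannot do for chains.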
2. **[Critical] Shift scoring input from runtime CTE joins to a normalized `expertise_events` table**

Analysis and rationale: Your SQL is correct but complex and expensive at query time. Precomputing normalized events at ingestion gives simpler, faster, and more reliable scoring queries; it also enables model versioning/backfills without touching raw MR/note tables on each request.

```diff
@@ ## Files to Modify
-3. **`src/core/db.rs`** — Add migration for indexes supporting the new query shapes
+3. **`src/core/db.rs`** — Add migrations for:
+   - `expertise_events` table (normalized scoring events)
+   - supporting indexes
+4. **`src/core/ingest/expertise_events.rs`** — New:
+   - Incremental upsert of events during sync/ingest

@@ ## SQL Restructure (who.rs)
-The SQL uses CTE-based dual-path matching and hybrid aggregation...
+Runtime SQL reads precomputed `expertise_events` filtered by path identity + time window.
+Heavy joins/aggregation move to ingest-time normalization.
```

3. **[High] Upgrade reviewer engagement model beyond char-count threshold**

Analysis and rationale: `min_note_chars` is a useful guardrail but brittle (easy to game, and it penalizes concise high-quality comments). Add explicit review-state signals (`approved`, `changes_requested`) and trivial-comment pattern filtering to better capture real reviewer expertise.

```diff
@@ ## Scoring Formula
-| **Reviewer Participated** (left DiffNote on MR/path) | 10 | 90 days |
+| **Reviewer Participated** (substantive DiffNote and/or formal review action) | 10 | 90 days |
+| **Review Decision: changes_requested** | 6 | 120 days |
+| **Review Decision: approved** | 4 | 75 days |

@@ ### 1. ScoringConfig (config.rs)
 pub reviewer_min_note_chars: u32,
+ pub reviewer_trivial_note_patterns: Vec<String>, // default: ["lgtm","+1","nit","ship it","👍"]
+ pub review_approved_weight: i64, // default: 4
+ pub review_changes_requested_weight: i64, // default: 6
```

4. **[High] Make temporal semantics explicit and deterministic**

Analysis and rationale: `--as-of` is good, but day parsing and boundary semantics can still cause subtle reproducibility issues. Define the window as `[since_ms, as_of_ms)` and parse `YYYY-MM-DD` as end-of-day UTC (or an explicit timezone) so user expectations match outputs.

```diff
@@ ### 5a. Reproducible Scoring via `--as-of`
-- All event selection is bounded by `[since_ms, as_of_ms]`
+- All event selection is bounded by `[since_ms, as_of_ms)` (exclusive upper bound)
+- `YYYY-MM-DD` is interpreted as end of that day in UTC, i.e. an exclusive bound at the next day's `00:00:00.000Z`, unless `--timezone` is provided
+- Robot output includes `window_start_iso`, `window_end_iso`, `window_end_exclusive: true`
```

5. **[High] Replace fixed default `--since 24m` with contribution-floor auto cutoff**

Analysis and rationale: A static window is simple but often over-scans data. Compute a model-derived horizon from a minimum contribution floor (for example `0.01` points) per signal; this keeps results equivalent while reducing query cost.

```diff
@@ ### 5. Default --since Change
-Expert mode: `"6m"` -> `"24m"`
+Expert mode default: `--since auto`
+`auto` computes the earliest relevant timestamp from configured weights/half-lives and `min_contribution_floor`
+Add config: `min_contribution_floor` (default: 0.01)
+`--since` still overrides, `--all-history` still bypasses cutoff
```

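With exponential decay `w * 0.5^(age / half_life)`, a signal drops below the floor once `age > half_life * log2(w / floor)`, so the auto horizon is just the maximum of that expression over configured signals. A sketch (function and parameter names are assumptions):

```python
import math

def auto_since_days(signals: list[tuple[float, float]], floor: float = 0.01) -> float:
    """Horizon in days beyond which every (weight, half_life_days) signal decays below the floor."""
    horizons = [
        half_life * math.log2(weight / floor)
        for weight, half_life in signals
        if weight > floor  # weights at or under the floor never contribute
    ]
    return max(horizons, default=0.0)
```

For example, a weight-10 signal with a 90-day half-life stays above a 0.01 floor for `90 * log2(1000)`, roughly 897 days, so the auto cutoff lands near the old 24-month default without hard-coding it.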
6. **[High] Add bot/service-account filtering now (not later)**

Analysis and rationale: Bot activity can materially distort expertise rankings in real repos. This is low implementation cost with high quality gain and should be in v1 of the scoring revamp, not deferred.

```diff
@@ ### 1. ScoringConfig (config.rs)
+ pub excluded_username_patterns: Vec<String>, // default: ["bot","\\[bot\\]","service-account","ci-"]

@@ ### 2. SQL Restructure (who.rs)
+Apply username exclusion in all signal sources unless `--include-bots` is set

@@ ### 5b. Score Explainability via `--explain-score`
+Add `filtered_events` counts in robot output metadata
```

7. **[Medium] Enforce deterministic floating-point accumulation**

Analysis and rationale: Even with small sets, unordered `HashMap` iteration can cause tiny platform-dependent ranking differences near ties. Sorting contributions and using Neumaier summation removes nondeterminism and stabilizes tests/CI.

```diff
@@ ### 4. Rust-Side Aggregation (who.rs)
-Compute score as `f64`.
+Compute score as `f64` using deterministic contribution ordering:
+1) sort by (username, signal, mr_id, ts)
+2) sum with Neumaier compensation
+Tie-break remains `(raw_score DESC, last_seen DESC, username ASC)`
```

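Neumaier summation is Kahan's compensated sum extended to handle addends larger than the running total; the accumulation step translates directly to Rust. A sketch:

```python
def neumaier_sum(values: list[float]) -> float:
    """Compensated summation: recovers low-order bits a naive sum discards."""
    total = 0.0
    compensation = 0.0  # running account of lost low-order bits
    for v in values:
        t = total + v
        if abs(total) >= abs(v):
            compensation += (total - t) + v  # v's low bits were lost
        else:
            compensation += (v - t) + total  # total's low bits were lost
        total = t
    return total + compensation
```

Combined with the sorted contribution order, the final score is bit-identical across platforms, which is what makes the tie-break rule reproducible in CI.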
8. **[Medium] Strengthen explainability with evidence, not just totals**

Analysis and rationale: Component totals help, but disputes usually need “why this user got this score now.” Add compact top evidence rows per component (`mr_id`, `ts`, `raw_contribution`) behind an optional mode.

```diff
@@ ### 5b. Score Explainability via `--explain-score`
-Component breakdown only (4 floats per user).
+Add `--explain-score=summary|full`:
+`summary`: current 4-component totals
+`full`: adds top N evidence rows per component (default N=3)
+Robot output includes per-evidence `mr_id`, `signal`, `ts`, `contribution`
```

9. **[Medium] Make query plan strategy explicit: `UNION ALL` default for dual-path scans**

Analysis and rationale: You currently treat `UNION ALL` as a fallback if the planner regresses. For SQLite, OR-across-indexed-columns regressions are common enough that defaulting to branch-split queries is often more predictable.

```diff
@@ **Index optimization fallback (UNION ALL split)**
-Start with the simpler `OR` approach and only switch to `UNION ALL` if query plans confirm degradation.
+Use `UNION ALL` + dedup as default for dual-path matching.
+Keep `OR` variant as optional strategy flag for benchmarking/regression checks.
```

10. **[Medium] Add explicit performance SLO + benchmark gate**

Analysis and rationale: This plan is query-heavy and ranking-critical; add measurable performance budgets so future edits do not silently degrade UX. Include synthetic fixture benchmarks for exact, prefix, and suffix path modes.

```diff
@@ ## Verification
+8. Performance regression gate:
+   - `cargo bench --bench who_expert_scoring`
+   - Dataset tiers: 100k, 1M, 5M notes
+   - SLOs: p95 exact path < 150ms, prefix < 250ms, suffix < 400ms on reference hardware
+   - Fail CI if regression > 20% vs stored baseline
```

If you want, I can produce a single consolidated “iteration 5” plan document with these changes already merged into your current structure.

@@ -2,12 +2,12 @@
|
|||||||
plan: true
|
plan: true
|
||||||
title: ""
|
title: ""
|
||||||
status: iterating
|
status: iterating
|
||||||
iteration: 4
|
iteration: 5
|
||||||
target_iterations: 8
|
target_iterations: 8
|
||||||
beads_revision: 0
|
beads_revision: 1
|
||||||
related_plans: []
|
related_plans: []
|
||||||
created: 2026-02-08
|
created: 2026-02-08
|
||||||
updated: 2026-02-08
|
updated: 2026-02-09
|
||||||
---
|
---
|
||||||
|
|
||||||
# Time-Decay Expert Scoring Model
|
# Time-Decay Expert Scoring Model
|
||||||
@@ -78,6 +78,7 @@ Author/reviewer signals are deduplicated per MR (one signal per distinct MR). No
|
|||||||
- Change default `--since` from `"6m"` to `"24m"` (2 years captures all meaningful decayed signals)
|
- Change default `--since` from `"6m"` to `"24m"` (2 years captures all meaningful decayed signals)
|
||||||
- Add `--as-of` flag for reproducible scoring at a fixed timestamp
|
- Add `--as-of` flag for reproducible scoring at a fixed timestamp
|
||||||
- Add `--explain-score` flag for per-user score component breakdown
|
- Add `--explain-score` flag for per-user score component breakdown
|
||||||
|
- Add `--include-bots` flag to disable bot/service-account filtering
|
||||||
- Sort on raw f64 score, round only for display
|
- Sort on raw f64 score, round only for display
|
||||||
- Update tests
|
- Update tests
|
||||||
3. **`src/core/db.rs`** — Add migration for indexes supporting the new query shapes (dual-path matching, reviewer participation CTE, path resolution probes)
|
3. **`src/core/db.rs`** — Add migration for indexes supporting the new query shapes (dual-path matching, reviewer participation CTE, path resolution probes)
|
||||||
@@ -100,6 +101,7 @@ pub struct ScoringConfig {
|
|||||||
pub note_half_life_days: u32, // default: 45
|
pub note_half_life_days: u32, // default: 45
|
||||||
pub closed_mr_multiplier: f64, // default: 0.5 (applied to closed-without-merge MRs)
|
pub closed_mr_multiplier: f64, // default: 0.5 (applied to closed-without-merge MRs)
|
||||||
pub reviewer_min_note_chars: u32, // default: 20 (minimum note body length to count as participation)
|
pub reviewer_min_note_chars: u32, // default: 20 (minimum note body length to count as participation)
|
||||||
|
pub excluded_usernames: Vec<String>, // default: [] (exact-match usernames to exclude, e.g. ["renovate-bot", "gitlab-ci"])
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -108,6 +110,7 @@ pub struct ScoringConfig {
|
|||||||
- All `*_weight` / `*_bonus` must be >= 0 (negative weights produce nonsensical scores)
|
- All `*_weight` / `*_bonus` must be >= 0 (negative weights produce nonsensical scores)
|
||||||
- `closed_mr_multiplier` must be in `(0.0, 1.0]` (0 would discard closed MRs entirely; >1 would over-weight them)
|
- `closed_mr_multiplier` must be in `(0.0, 1.0]` (0 would discard closed MRs entirely; >1 would over-weight them)
|
||||||
- `reviewer_min_note_chars` must be >= 0 (0 disables the filter; typical useful values: 10-50)
|
- `reviewer_min_note_chars` must be >= 0 (0 disables the filter; typical useful values: 10-50)
|
||||||
|
- `excluded_usernames` entries must be non-empty strings (no blank entries)
|
||||||
- Return `LoreError::ConfigInvalid` with a clear message on failure
|
- Return `LoreError::ConfigInvalid` with a clear message on failure
|
||||||
|
|
||||||
### 2. Decay Function (who.rs)

@@ -128,25 +131,51 @@ The SQL uses **CTE-based dual-path matching** and **hybrid aggregation**. Rather

MR-level signals return one row per (username, signal, mr_id) with a timestamp; note signals return one row per (username, mr_id) with `note_count` and `max_ts`. This keeps row counts bounded (dozens to low hundreds per path) while giving Rust the data it needs for decay and `log2(1+count)`.

```sql
WITH matched_notes_raw AS (
    -- Branch 1: match on new_path (uses idx_notes_new_path or equivalent)
    SELECT n.id, n.discussion_id, n.author_username, n.created_at, n.project_id
    FROM notes n
    WHERE n.note_type = 'DiffNote'
      AND n.is_system = 0
      AND n.author_username IS NOT NULL
      AND n.created_at >= ?2
      AND n.created_at < ?4
      AND (?3 IS NULL OR n.project_id = ?3)
      AND n.position_new_path {path_op}
    UNION ALL
    -- Branch 2: match on old_path (uses idx_notes_old_path_author)
    SELECT n.id, n.discussion_id, n.author_username, n.created_at, n.project_id
    FROM notes n
    WHERE n.note_type = 'DiffNote'
      AND n.is_system = 0
      AND n.author_username IS NOT NULL
      AND n.created_at >= ?2
      AND n.created_at < ?4
      AND (?3 IS NULL OR n.project_id = ?3)
      AND n.position_old_path {path_op}
),
matched_notes AS (
    -- Dedup: prevent double-counting when old_path = new_path (no rename)
    SELECT DISTINCT id, discussion_id, author_username, created_at, project_id
    FROM matched_notes_raw
),
matched_file_changes_raw AS (
    -- Branch 1: match on new_path (uses idx_mfc_new_path_project_mr)
    SELECT fc.merge_request_id, fc.project_id
    FROM mr_file_changes fc
    WHERE (?3 IS NULL OR fc.project_id = ?3)
      AND fc.new_path {path_op}
    UNION ALL
    -- Branch 2: match on old_path (uses idx_mfc_old_path_project_mr)
    SELECT fc.merge_request_id, fc.project_id
    FROM mr_file_changes fc
    WHERE (?3 IS NULL OR fc.project_id = ?3)
      AND fc.old_path {path_op}
),
matched_file_changes AS (
    -- Dedup: prevent double-counting when old_path = new_path (no rename)
    SELECT DISTINCT merge_request_id, project_id
    FROM matched_file_changes_raw
),
reviewer_participation AS (
    -- Precompute which (mr_id, username) pairs have substantive DiffNote participation.
@@ -196,7 +225,7 @@ raw AS (
WHERE m.author_username IS NOT NULL
  AND m.state IN ('opened','merged','closed')
  AND {state_aware_ts} >= ?2
  AND {state_aware_ts} < ?4

UNION ALL

@@ -212,7 +241,7 @@ raw AS (
  AND (m.author_username IS NULL OR r.username != m.author_username)
  AND m.state IN ('opened','merged','closed')
  AND {state_aware_ts} >= ?2
  AND {state_aware_ts} < ?4

UNION ALL

@@ -229,7 +258,7 @@ raw AS (
  AND (m.author_username IS NULL OR r.username != m.author_username)
  AND m.state IN ('opened','merged','closed')
  AND {state_aware_ts} >= ?2
  AND {state_aware_ts} < ?4
),
aggregated AS (
    -- MR-level signals: 1 row per (username, signal_class, mr_id) with MAX(ts)
@@ -245,11 +274,11 @@ aggregated AS (
SELECT username, signal, mr_id, qty, ts, mr_state FROM aggregated WHERE username IS NOT NULL
```

Where `{state_aware_ts}` is the state-aware timestamp expression (defined in the next section), `{path_op}` is either `= ?1` or `LIKE ?1 ESCAPE '\\'` depending on the path query type, `?4` is the `as_of_ms` exclusive upper bound (defaults to `now_ms` when `--as-of` is not specified), and `{reviewer_min_note_chars}` is the configured `reviewer_min_note_chars` value (default 20, inlined as a literal in the SQL string). The `>= ?2 AND < ?4` pattern (half-open interval) ensures that when `--as-of` is set to a past date, events at or after that date are excluded — without this, "future" events would leak in with full weight, breaking reproducibility. The exclusive upper bound avoids edge-case ambiguity when events have timestamps exactly equal to the as-of value.

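As a minimal illustration of the Rust-side decay applied to these rows, assuming exponential half-life decay over millisecond timestamps (the `decay` function shown here is a sketch, not the plan's actual API):

```rust
const MS_PER_DAY: f64 = 86_400_000.0;

// Exponential half-life decay: weight halves every `half_life_days`.
// The `.max(0.0)` clamp gives events "after" the reference point full
// weight, which is exactly why the SQL must exclude them with `< ?4`.
fn decay(event_ms: i64, as_of_ms: i64, half_life_days: f64) -> f64 {
    let elapsed_days = ((as_of_ms - event_ms) as f64 / MS_PER_DAY).max(0.0);
    0.5_f64.powf(elapsed_days / half_life_days)
}
```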
**Rationale for CTE-based dual-path matching**: The previous approach (repeating `OR old_path` in every signal subquery) duplicated the path matching logic 5 times. Factoring it into foundational CTEs (`matched_notes_raw` → `matched_notes`, `matched_file_changes_raw` → `matched_file_changes`) means path matching is defined once, each index branch is explicit, and adding future path resolution logic (e.g., alias chains) only requires changes in one place. The UNION ALL + dedup pattern ensures SQLite uses the optimal index for each path column independently.

**Dual-path matching strategy (UNION ALL split)**: SQLite's query planner commonly struggles with `OR` across two indexed columns, falling back to a full table scan instead of using either index. Rather than starting with `OR` and hoping the planner cooperates, use `UNION ALL` + dedup as the default strategy:
```sql
matched_notes AS (
    SELECT ... FROM notes n WHERE ... AND n.position_new_path {path_op}
@@ -261,18 +290,18 @@ matched_notes_dedup AS (
    FROM matched_notes
),
```
This ensures each branch can use its respective index independently. The dedup CTE prevents double-counting when `old_path = new_path` (no rename). The same pattern applies to `matched_file_changes`. The simpler `OR` variant is retained as a comment for benchmarking — if a future SQLite version handles `OR` well, the split can be collapsed.

**Rationale for precomputed participation set**: The previous approach used correlated `EXISTS`/`NOT EXISTS` subqueries to classify reviewers. The `reviewer_participation` CTE materializes the set of `(mr_id, username)` pairs from matched DiffNotes once, then signal 4a JOINs against it (participated) and signal 4b LEFT JOINs with `IS NULL` (assigned-only). This avoids per-reviewer-row correlated scans, is easier to reason about, and produces the same exhaustive split — every `mr_reviewers` row falls into exactly one bucket.

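The exhaustive split can be sketched as a membership test against the materialized participation set; the Rust types below are invented for illustration (the real split happens in SQL via JOIN vs. LEFT JOIN ... IS NULL):

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum ReviewerBucket {
    Participated, // has substantive DiffNotes on the MR
    AssignedOnly, // assigned, but no substantive DiffNotes
}

// Mirrors the reviewer_participation CTE: every (mr_id, username) reviewer
// row falls into exactly one bucket, never both, never neither.
fn classify(
    mr_id: i64,
    username: &str,
    participation: &HashSet<(i64, String)>,
) -> ReviewerBucket {
    if participation.contains(&(mr_id, username.to_string())) {
        ReviewerBucket::Participated
    } else {
        ReviewerBucket::AssignedOnly
    }
}
```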
**Rationale for hybrid over fully-raw**: Pre-aggregating note counts in SQL prevents row explosion from heavy DiffNote volume on frequently-discussed paths. MR-level signals are already 1-per-MR by nature (deduped via GROUP BY in each subquery). This keeps memory and latency predictable regardless of review activity density.

**Path rename awareness**: Both `matched_notes` and `matched_file_changes` use UNION ALL + dedup to match against both old and new path columns independently, ensuring each branch uses its respective index:

- Notes: branch 1 matches `position_new_path`, branch 2 matches `position_old_path`, deduped by `notes.id`
- File changes: branch 1 matches `new_path`, branch 2 matches `old_path`, deduped by `(merge_request_id, project_id)`

Both columns already exist in the schema (`notes.position_old_path` from migration 002, `mr_file_changes.old_path` from migration 016). The UNION ALL approach ensures expertise is credited even when a file was renamed after the work was done. For prefix queries (`--path src/foo/`), the `LIKE` operator applies to both columns identically.

**Signal 4 splits into two**: The current signal 4 (`file_reviewer`) joins `mr_reviewers` but doesn't distinguish participation. In the new plan:

@@ -332,7 +361,7 @@ For each username, accumulate into a struct with:

The `mr_state` field from each SQL row is stored alongside the timestamp so the Rust-side can apply `closed_mr_multiplier` when `mr_state == "closed"`.

Compute score as `f64` with **deterministic contribution ordering**: within each signal type, sort contributions by `(mr_id ASC)` before summing. This eliminates platform-dependent HashMap iteration order as a source of f64 rounding variance near ties, ensuring CI reproducibility without the complexity of compensated summation (Neumaier/Kahan). Each MR-level contribution is multiplied by `closed_mr_multiplier` (default 0.5) when the MR's state is `"closed"`:
```
state_mult(mr) = if mr.state == "closed" { closed_mr_multiplier } else { 1.0 }
```

@@ -352,6 +381,8 @@ Compute counts from the accumulated data:
- `review_note_count = notes_per_mr.values().map(|(count, _)| count).sum()`
- `author_mr_count = author_mrs.len()`

**Bot/service-account filtering**: After accumulating all user scores and before sorting, filter out any username that appears in `config.scoring.excluded_usernames` (exact match, case-insensitive). This is applied in Rust post-query (not SQL) to keep the SQL clean and avoid parameter explosion. When `--include-bots` is active, the filter is skipped entirely. The robot JSON `resolved_input` includes `excluded_usernames_applied: true|false` to indicate whether filtering was active.
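A minimal sketch of that post-query filter, with invented names (`UserScore`, `filter_bots`) standing in for the real types:

```rust
struct UserScore {
    username: String,
    score: f64,
}

// Case-insensitive exact-match filter; skipped entirely under --include-bots.
fn filter_bots(scores: Vec<UserScore>, excluded: &[String], include_bots: bool) -> Vec<UserScore> {
    if include_bots {
        return scores;
    }
    let excluded_lower: Vec<String> = excluded.iter().map(|u| u.to_lowercase()).collect();
    scores
        .into_iter()
        .filter(|s| !excluded_lower.contains(&s.username.to_lowercase()))
        .collect()
}
```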

Truncate to limit after sorting.

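The deterministic contribution ordering described above can be sketched as follows; the `Contribution` type is hypothetical, and the point is only that sorting by `mr_id` fixes the f64 accumulation order:

```rust
struct Contribution {
    mr_id: i64,
    weighted_value: f64, // weight * state_mult * decay, precomputed upstream
}

// Sort by mr_id before summing so the f64 accumulation order is fixed,
// regardless of HashMap iteration order upstream.
fn deterministic_sum(mut contributions: Vec<Contribution>) -> f64 {
    contributions.sort_by_key(|c| c.mr_id);
    contributions.iter().map(|c| c.weighted_value).sum()
}
```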
### 5. Default --since Change
@@ -364,10 +395,11 @@ At 2 years, author decay = 6%, reviewer decay = 0.4%, note decay = 0.006% — ne
### 5a. Reproducible Scoring via `--as-of`

Add `--as-of <RFC3339|YYYY-MM-DD>` flag that overrides the `now_ms` reference point used for decay calculations. When set:
- All event selection is bounded by `[since_ms, as_of_ms)` — exclusive upper bound; events at or after `as_of_ms` are excluded from SQL results entirely (not just decayed). The SQL uses `< ?4` (strict less-than), not `<= ?4`.
- `YYYY-MM-DD` input (without time component) is interpreted as end-of-day UTC: `T23:59:59.999Z`. This matches user intuition that `--as-of 2025-06-01` means "as of the end of June 1st" rather than "as of midnight at the start of June 1st" which would exclude the entire day's activity.
- All decay computations use `as_of_ms` instead of `SystemTime::now()`
- The `--since` window is calculated relative to `as_of_ms` (not wall clock)
- Robot JSON `resolved_input` includes `as_of_ms`, `as_of_iso`, `window_start_iso`, `window_end_iso`, and `window_end_exclusive: true` fields — making the exact query window unambiguous in output

**Rationale**: Decayed scoring is time-sensitive by nature. Without a fixed reference point, the same query run minutes apart produces different rankings, making debugging and test reproducibility difficult. `--as-of` pins the clock so that results are deterministic for a given dataset. The upper-bound filter in SQL is critical — without it, events after the as-of date would enter with full weight (since `elapsed.max(0.0)` clamps negative elapsed time to zero), breaking the reproducibility guarantee.

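A stdlib-only sketch of the `YYYY-MM-DD` to end-of-day-UTC conversion, using the standard civil-date-to-epoch-days algorithm (illustrative; the real implementation would likely lean on a date library):

```rust
const MS_PER_DAY: i64 = 86_400_000;

// Days since 1970-01-01 for a civil date (Howard Hinnant's algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y - era * 400;                          // [0, 399]
    let mp = (m + 9) % 12;                            // Mar=0 ... Feb=11
    let doy = (153 * mp + 2) / 5 + d - 1;             // [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;  // [0, 146096]
    era * 146_097 + doe - 719_468
}

// "YYYY-MM-DD" -> epoch ms at 23:59:59.999 UTC (end of day, so the
// exclusive `< ?4` bound still includes that day's activity).
fn as_of_end_of_day_ms(date: &str) -> Option<i64> {
    let mut parts = date.splitn(3, '-');
    let y: i64 = parts.next()?.parse().ok()?;
    let m: i64 = parts.next()?.parse().ok()?;
    let d: i64 = parts.next()?.parse().ok()?;
    Some(days_from_civil(y, m, d) * MS_PER_DAY + (MS_PER_DAY - 1))
}
```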
@@ -484,7 +516,13 @@ Add timestamp-aware variants:

**`test_closed_mr_multiplier`**: Two identical MRs (same author, same age, same path). One is `merged`, one is `closed`. The merged MR should contribute `author_weight * decay(...)`, the closed MR should contribute `author_weight * closed_mr_multiplier * decay(...)`. With default multiplier 0.5, the closed MR contributes half.

**`test_as_of_excludes_future_events`**: Insert events at timestamps T1 (past) and T2 (future relative to as-of). With `--as-of` set between T1 and T2, only T1 events should appear in results. T2 events must be excluded entirely, not just decayed. Validates the exclusive upper-bound (`< ?4`) filtering in SQL.

**`test_as_of_exclusive_upper_bound`**: Insert an event with timestamp exactly equal to the `as_of_ms` value. Verify it is excluded from results (strict less-than, not less-than-or-equal). This validates the half-open interval `[since, as_of)` semantics.

**`test_excluded_usernames_filters_bots`**: Insert signals for a user named "renovate-bot" and a user named "jsmith", both with the same activity. With `excluded_usernames: ["renovate-bot"]` in config, only "jsmith" should appear in results. Validates the Rust-side post-query filtering.

**`test_include_bots_flag_disables_filtering`**: Same setup as above, but with `--include-bots` active. Both "renovate-bot" and "jsmith" should appear in results.

**`test_null_timestamp_fallback_to_created_at`**: Insert a merged MR with `merged_at = NULL` (edge case: old data before the column was populated). The state-aware timestamp should fall back to `created_at`. Verify the score reflects `created_at`, not 0 or a panic.

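The fallback this test exercises can be sketched as a Rust helper; in the plan the logic lives in the SQL `{state_aware_ts}` CASE expression, and which timestamp each state prefers is an assumption here:

```rust
// State-aware timestamp: prefer the state's own timestamp, fall back to
// created_at when it is NULL (e.g. old rows from before merged_at existed).
// The closed_at column is assumed for illustration.
fn state_aware_ts(state: &str, merged_at: Option<i64>, closed_at: Option<i64>, created_at: i64) -> i64 {
    match state {
        "merged" => merged_at.unwrap_or(created_at),
        "closed" => closed_at.unwrap_or(created_at),
        _ => created_at,
    }
}
```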
@@ -496,11 +534,13 @@ Add timestamp-aware variants:
|
|||||||
|
|
||||||
**`test_reviewer_split_is_exhaustive`**: For a reviewer assigned to an MR, they must appear in exactly one of: participated (has substantive DiffNotes meeting `reviewer_min_note_chars`) or assigned-only (no DiffNotes, or only trivial ones below the threshold). Never both, never neither. Test three cases: (1) reviewer with substantive DiffNotes -> participated only, (2) reviewer with no DiffNotes -> assigned-only only, (3) reviewer with only trivial notes ("LGTM") -> assigned-only only.
|
**`test_reviewer_split_is_exhaustive`**: For a reviewer assigned to an MR, they must appear in exactly one of: participated (has substantive DiffNotes meeting `reviewer_min_note_chars`) or assigned-only (no DiffNotes, or only trivial ones below the threshold). Never both, never neither. Test three cases: (1) reviewer with substantive DiffNotes -> participated only, (2) reviewer with no DiffNotes -> assigned-only only, (3) reviewer with only trivial notes ("LGTM") -> assigned-only only.
|
||||||
|
|
||||||
|
**`test_deterministic_accumulation_order`**: Insert signals for a user with contributions at many different timestamps (10+ MRs with varied ages). Run `query_expert` 100 times in a loop. All 100 runs must produce the exact same `f64` score (bit-identical). Validates that the sorted contribution ordering eliminates HashMap-iteration-order nondeterminism.
|
||||||
|
|
||||||
### 9. Existing Test Compatibility
|
### 9. Existing Test Compatibility
|
||||||
|
|
||||||
All existing tests insert data with `now_ms()`. With decay, elapsed ~0ms means decay ~1.0, so scores round to the same integers as before. No existing test assertions should break.
|
All existing tests insert data with `now_ms()`. With decay, elapsed ~0ms means decay ~1.0, so scores round to the same integers as before. No existing test assertions should break.
|
||||||
|
|
||||||
The `test_expert_scoring_weights_are_configurable` test needs `..Default::default()` added to fill the new half-life fields, `reviewer_assignment_weight` / `reviewer_assignment_half_life_days`, `closed_mr_multiplier`, and `reviewer_min_note_chars` fields.
|
The `test_expert_scoring_weights_are_configurable` test needs `..Default::default()` added to fill the new half-life fields, `reviewer_assignment_weight` / `reviewer_assignment_half_life_days`, `closed_mr_multiplier`, `reviewer_min_note_chars`, and `excluded_usernames` fields.
|
||||||
|
|
||||||
## Verification
|
## Verification
|
||||||
|
|
||||||
@@ -511,11 +551,17 @@ The `test_expert_scoring_weights_are_configurable` test needs `..Default::defaul
|
|||||||
5. `ubs src/cli/commands/who.rs src/core/config.rs src/core/db.rs` — no bug scanner findings
|
5. `ubs src/cli/commands/who.rs src/core/config.rs src/core/db.rs` — no bug scanner findings
|
||||||
6. Manual query plan verification (not automated — SQLite planner varies across versions):
|
6. Manual query plan verification (not automated — SQLite planner varies across versions):
|
||||||
- Run `EXPLAIN QUERY PLAN` on the expert query (both exact and prefix modes) against a real database
|
- Run `EXPLAIN QUERY PLAN` on the expert query (both exact and prefix modes) against a real database
|
||||||
- Confirm that `matched_notes` CTE uses `idx_notes_old_path_author` or the existing new_path index (not a full table scan)
|
- Confirm that `matched_notes_raw` branch 1 uses the existing new_path index and branch 2 uses `idx_notes_old_path_author` (not a full table scan on either branch)
|
||||||
- Confirm that `matched_file_changes` CTE uses `idx_mfc_old_path_project_mr` or `idx_mfc_new_path_project_mr`
|
- Confirm that `matched_file_changes_raw` branch 1 uses `idx_mfc_new_path_project_mr` and branch 2 uses `idx_mfc_old_path_project_mr`
|
||||||
- Confirm that `reviewer_participation` CTE uses `idx_notes_diffnote_discussion_author`
|
- Confirm that `reviewer_participation` CTE uses `idx_notes_diffnote_discussion_author`
|
||||||
- Document the observed plan in a comment near the SQL for future regression reference
|
- Document the observed plan in a comment near the SQL for future regression reference
|
||||||
7. Real-world validation:
|
7. Performance baseline (manual, not CI-gated):
|
||||||
|
- Run `time cargo run --release -- who --path <exact-path>` on the real database for exact, prefix, and suffix modes
|
||||||
|
- Target SLOs: p95 exact path < 200ms, prefix < 300ms, suffix < 500ms on development hardware
|
||||||
|
- Record baseline timings as a comment near the SQL for regression reference
|
||||||
|
- If any mode exceeds 2x the baseline after future changes, investigate before merging
|
||||||
|
- Note: These are soft targets for developer awareness, not automated CI gates. Automated benchmarking with synthetic fixtures (100k/1M/5M notes) is a v2 investment if performance becomes a real concern.
|
||||||
|
8. Real-world validation:
|
||||||
- `cargo run --release -- who --path MeasurementQualityDialog.tsx` — verify jdefting/zhayes old reviews are properly discounted relative to recent authors
|
- `cargo run --release -- who --path MeasurementQualityDialog.tsx` — verify jdefting/zhayes old reviews are properly discounted relative to recent authors
|
||||||
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --all-history` — compare full history vs 24m window to validate cutoff is reasonable
|
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --all-history` — compare full history vs 24m window to validate cutoff is reasonable
|
||||||
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --explain-score` — verify component breakdown sums to total and authored signal dominates for known authors
|
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --explain-score` — verify component breakdown sums to total and authored signal dominates for known authors
|
||||||
@@ -524,6 +570,7 @@ The `test_expert_scoring_weights_are_configurable` test needs `..Default::defaul
|
|||||||
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --as-of 2025-06-01` — verify deterministic output across repeated runs
|
- `cargo run --release -- who --path MeasurementQualityDialog.tsx --as-of 2025-06-01` — verify deterministic output across repeated runs
|
||||||
- Spot-check that reviewers who only left "LGTM"-style notes are classified as assigned-only (not participated)
|
- Spot-check that reviewers who only left "LGTM"-style notes are classified as assigned-only (not participated)
|
||||||
- Verify closed MRs contribute at ~50% of equivalent merged MR scores via `--explain-score`
|
- Verify closed MRs contribute at ~50% of equivalent merged MR scores via `--explain-score`
|
||||||
|
- If the project has known bot accounts (e.g., renovate-bot), add them to `excluded_usernames` config and verify they no longer appear in results. Run again with `--include-bots` to confirm they reappear.
|
||||||
|
|
||||||
## Accepted from External Review
|
## Accepted from External Review
|
||||||
|
|
||||||
@@ -553,12 +600,20 @@ Ideas incorporated from ChatGPT review (feedback-1 through feedback-4) that genu
|
|||||||
- **EXPLAIN QUERY PLAN verification step**: Manual check that the restructured queries use the new indexes (not automated, since SQLite planner varies across versions).
|
- **EXPLAIN QUERY PLAN verification step**: Manual check that the restructured queries use the new indexes (not automated, since SQLite planner varies across versions).
|
||||||
|
|
||||||
**From feedback-4:**
|
**From feedback-4:**
|
||||||
- **`--as-of` temporal correctness (critical)**: The plan described `--as-of` but the SQL only enforced a lower bound (`>= ?2`). Events after the as-of date would leak in with full weight (because `elapsed.max(0.0)` clamps negative elapsed time to zero). Added `<= ?4` upper bound to all SQL timestamp filters, making the query window `[since_ms, as_of_ms]`. Without this, `--as-of` reproducibility was fundamentally broken.
|
- **`--as-of` temporal correctness (critical)**: The plan described `--as-of` but the SQL only enforced a lower bound (`>= ?2`). Events after the as-of date would leak in with full weight (because `elapsed.max(0.0)` clamps negative elapsed time to zero). Added `< ?4` upper bound to all SQL timestamp filters, making the query window `[since_ms, as_of_ms)`. Without this, `--as-of` reproducibility was fundamentally broken. (Refined to exclusive upper bound in feedback-5.)
|
||||||
- **Closed-state inconsistency resolution**: The state-aware CASE expression handled `closed` state but the WHERE clause filtered to `('opened','merged')` only — dead code. Resolved by including `'closed'` in state filters and adding a `closed_mr_multiplier` (default 0.5) applied in Rust to all signals from closed-without-merge MRs. This credits real review effort on abandoned MRs while appropriately discounting it.
|
- **Closed-state inconsistency resolution**: The state-aware CASE expression handled `closed` state but the WHERE clause filtered to `('opened','merged')` only — dead code. Resolved by including `'closed'` in state filters and adding a `closed_mr_multiplier` (default 0.5) applied in Rust to all signals from closed-without-merge MRs. This credits real review effort on abandoned MRs while appropriately discounting it.
|
||||||
- **Substantive note threshold for reviewer participation**: A single "LGTM" shouldn't promote a reviewer from 3-point (assigned-only) to 10-point (participated) weight. Added `reviewer_min_note_chars` (default 20) config field and `LENGTH(TRIM(body))` filter in the `reviewer_participation` CTE. This raises the bar for participation classification to actual substantive review comments.
|
- **Substantive note threshold for reviewer participation**: A single "LGTM" shouldn't promote a reviewer from 3-point (assigned-only) to 10-point (participated) weight. Added `reviewer_min_note_chars` (default 20) config field and `LENGTH(TRIM(body))` filter in the `reviewer_participation` CTE. This raises the bar for participation classification to actual substantive review comments.
|
||||||
- **UNION ALL optimization fallback for path predicates**: SQLite's planner can degrade `OR` across two indexed columns to a table scan. Added documentation of a `UNION ALL` split + dedup fallback pattern to use if EXPLAIN QUERY PLAN shows degradation during verification. Start with the simpler `OR` approach; switch only if needed.
|
- **UNION ALL optimization for path predicates**: SQLite's planner can degrade `OR` across two indexed columns to a table scan. Originally documented as a fallback; promoted to default strategy in feedback-5 iteration. The UNION ALL + dedup approach ensures each index branch is used independently.
|
||||||
- **New tests**: `test_trivial_note_does_not_count_as_participation`, `test_closed_mr_multiplier`, `test_as_of_excludes_future_events` — cover the three new features added from this review round.
|
- **New tests**: `test_trivial_note_does_not_count_as_participation`, `test_closed_mr_multiplier`, `test_as_of_excludes_future_events` — cover the three new features added from this review round.
|

**From feedback-5 (ChatGPT review):**

- **Exclusive upper bound for `--as-of`**: Changed from `[since_ms, as_of_ms]` (inclusive) to `[since_ms, as_of_ms)` (exclusive). Half-open intervals are the standard convention in temporal systems — they eliminate edge-case ambiguity when events have timestamps exactly at the boundary. Also added `YYYY-MM-DD` → end-of-day UTC parsing and window metadata in robot output.
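The half-open convention can be sketched as a plain predicate (illustrative only; the function name and millisecond parameter names are not from the plan):

```rust
/// Half-open window check over millisecond timestamps: an event at exactly
/// `since_ms` is included, while one at exactly `as_of_ms` is excluded,
/// so adjacent windows never double-count a boundary event.
fn window_contains(since_ms: i64, as_of_ms: i64, event_ms: i64) -> bool {
    since_ms <= event_ms && event_ms < as_of_ms
}
```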

- **UNION ALL as default for dual-path matching**: Promoted from "fallback if planner regresses" to default strategy. SQLite `OR`-across-indexed-columns degradation is common enough that the predictable UNION ALL + dedup approach is the safer starting point. The simpler `OR` variant is retained as a comment for benchmarking.

- **Deterministic contribution ordering**: Within each signal type, sort contributions by `mr_id` before summing. This eliminates HashMap iteration order as a source of f64 rounding variance near ties, ensuring CI reproducibility without the overhead of compensated summation (Neumaier/Kahan was rejected as overkill at this scale).
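The sort-before-sum rule is a few lines; this sketch assumes a per-MR contribution map (the function name is illustrative):

```rust
use std::collections::HashMap;

/// Sum per-MR contributions in ascending `mr_id` order so the f64
/// accumulation order is stable across runs, regardless of how the
/// HashMap happens to iterate.
fn sum_contributions(by_mr: &HashMap<i64, f64>) -> f64 {
    let mut entries: Vec<(i64, f64)> = by_mr.iter().map(|(k, v)| (*k, *v)).collect();
    entries.sort_by_key(|(mr_id, _)| *mr_id);
    entries.into_iter().map(|(_, v)| v).sum()
}
```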

- **Minimal bot/service-account filtering**: Added `excluded_usernames` (exact match, case-insensitive) to `ScoringConfig` and `--include-bots` CLI flag. Applied as a Rust-side post-filter (not SQL) to keep queries clean. Scope is deliberately minimal — no regex patterns, no heuristic detection. Users configure the list for their team's specific bots.
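A minimal sketch of that post-filter, assuming ASCII-case comparison is acceptable (the function signature is illustrative, not the plan's API):

```rust
/// Post-query bot filter: case-insensitive exact match against the
/// configured `excluded_usernames` list; passing `include_bots = true`
/// (the `--include-bots` flag) disables filtering entirely.
fn filter_bots<'a>(users: Vec<&'a str>, excluded: &[&str], include_bots: bool) -> Vec<&'a str> {
    if include_bots {
        return users;
    }
    users
        .into_iter()
        .filter(|u| !excluded.iter().any(|e| e.eq_ignore_ascii_case(u)))
        .collect()
}
```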

- **Performance baseline SLOs**: Added manual performance baseline step to verification — record timings for exact/prefix/suffix modes and flag >2x regressions. Kept lightweight (no CI gating, no synthetic benchmarks) to match the project's current maturity.

- **New tests**: `test_as_of_exclusive_upper_bound`, `test_excluded_usernames_filters_bots`, `test_include_bots_flag_disables_filtering`, `test_deterministic_accumulation_order` — cover the newly-accepted features.

## Rejected Ideas (with rationale)

These suggestions were considered during review but explicitly excluded from this iteration:
@@ -573,3 +628,10 @@ These suggestions were considered during review but explicitly excluded from thi

- **Split scoring engine into core module** (feedback-4 #5): Proposed extracting scoring math from `who.rs` into `src/core/scoring/model_v2_decay.rs`. Premature modularization — `who.rs` is the only consumer and is ~800 lines. Adding module plumbing and indirection for a single call site adds complexity without reducing it. If we add a second scoring consumer (e.g., automated triage), revisit.

- **Bot/service-account filtering** (feedback-4 #7): Real concern but orthogonal to time-decay scoring. This is a general data quality feature that belongs in its own issue — it affects all `who` modes, not just expert scoring. Adding `excluded_username_patterns` config and `--include-bots` flag is scope expansion that should be designed and tested independently.

- **Model compare mode / rank-delta diagnostics** (feedback-4 #9): Over-engineered rollout safety for an internal CLI tool with ~3 users. Maintaining two parallel scoring codepaths (v1 flat + v2 decayed) doubles test surface and code complexity. The `--explain-score` + `--as-of` combination already provides debugging capability. If a future model change is risky enough to warrant A/B comparison, build it then.

- **Canonical path identity graph** (feedback-5 #1, also feedback-2 #2, feedback-4 #4): Third time proposed, third time rejected. Building a rename graph from `mr_file_changes(old_path, new_path)` with identity resolution requires new schema (`path_identities`, `path_aliases` tables), ingestion pipeline changes, graph traversal at query time, and backfill logic for existing data. The UNION ALL dual-path matching already covers the 80%+ case (direct renames). Multi-hop rename chains (A→B→C) are rare in practice and can be addressed in v2 with real usage data showing the gap matters.

- **Normalized `expertise_events` table** (feedback-5 #2): Proposes shifting from query-time CTE joins to a precomputed `expertise_events` table populated at ingest time. While architecturally appealing for read performance, this doubles the data surface area (raw tables + derived events), requires new ingestion pipelines with incremental upsert logic, backfill tooling for existing databases, and introduces consistency risks when raw data is corrected/re-synced. The CTE approach is correct, maintainable, and performant at our current scale. If query latency becomes a real bottleneck (see performance baseline SLOs), materialized views or derived tables become a v2 optimization.

- **Reviewer engagement model upgrade** (feedback-5 #3): Proposes adding `approved`/`changes_requested` review-state signals and trivial-comment pattern matching (`["lgtm","+1","nit","ship it"]`). Expands the signal type count from 4 to 6 and adds a fragile pattern-matching layer (what about "don't ship it"? "lgtm but..."?). The `reviewer_min_note_chars` threshold is imperfect but pragmatic — it's a single configurable number with no false-positive risk from substring matching. Review-state signals may be worth adding later as a separate enhancement when we have data on how often they diverge from DiffNote participation.

- **Contribution-floor auto cutoff for `--since`** (feedback-5 #5): Proposes `--since auto` computing the earliest relevant timestamp from `min_contribution_floor` (e.g., 0.01 points). Adds a non-obvious config parameter for minimal benefit — the 24m default is already mathematically justified from the decay curves (author: 6%, reviewer: 0.4% at 2 years) and easily overridden with `--since` or `--all-history`. The auto-derivation formula (`ceil(max_half_life * log2(1/floor))`) is opaque to users who just want to understand why a certain time range was selected.

- **Full evidence drill-down in `--explain-score`** (feedback-5 #8): Proposes `--explain-score=summary|full` with per-MR evidence rows. Already rejected in feedback-2 #7. Component totals are sufficient for v1 debugging — they answer "which signal type drives this user's score." Per-MR drill-down requires additional SQL queries and significant output format complexity. Deferred unless component breakdowns prove insufficient.

- **Neumaier compensated summation** (feedback-5 #7 partial): Accepted the sorting aspect for deterministic ordering, but rejected Neumaier/Kahan compensated summation. At the scale of dozens to low hundreds of contributions per user, the rounding error from naive f64 summation is on the order of 1e-14 — several orders of magnitude below any meaningful score difference. Compensated summation adds code complexity and a maintenance burden for no practical benefit at this scale.

- **Automated CI benchmark gate** (feedback-5 #10 partial): Accepted manual performance baselines, but rejected automated CI regression gating with synthetic fixtures (100k/1M/5M notes). Building and maintaining benchmark infrastructure is a significant investment that's premature for a CLI tool with ~3 users. Manual timing checks during development are sufficient until performance becomes a real concern.

157
plans/work-item-status-graphql.feedback-3.md
Normal file

@@ -0,0 +1,157 @@
**Top Revisions I Recommend**

1. **Fix auth semantics + a real inconsistency in your test plan**

Your ACs require graceful handling for `403`, but the test list says the “403” test returns `401`. That hides the exact behavior you care about and can let permission regressions slip through.

```diff
@@ AC-1: GraphQL Client (Unit)
- [ ] HTTP 401 → `LoreError::GitLabAuthFailed`
+ [ ] HTTP 401 → `LoreError::GitLabAuthFailed`
+ [ ] HTTP 403 → `LoreError::GitLabForbidden`

@@ AC-3: Status Fetcher (Integration)
- [ ] GraphQL 403 → returns `Ok(HashMap::new())` with warning log
+ [ ] GraphQL 403 (`GitLabForbidden`) → returns `Ok(HashMap::new())` with warning log

@@ TDD Plan (RED)
- 13. `test_fetch_statuses_403_graceful` — mock returns 401 → `Ok(HashMap::new())`
+ 13. `test_fetch_statuses_403_graceful` — mock returns 403 → `Ok(HashMap::new())`
```

2. **Make enrichment atomic and stale-safe**

Current plan can leave stale status values forever when a widget disappears or status becomes null. Make writes transactional and clear status fields for fetched scope before upserts.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] Enrichment DB writes are transactional per project (all-or-nothing)
+ [ ] Status fields are cleared for fetched issue scope before applying new statuses
+ [ ] If enrichment fails mid-project, prior persisted statuses are unchanged (rollback)

@@ File 6: `src/ingestion/orchestrator.rs`
- fn enrich_issue_statuses(...)
+ fn enrich_issue_statuses_txn(...)
+ // BEGIN TRANSACTION
+ // clear status columns for fetched issue scope
+ // apply updates
+ // COMMIT
```
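The clear-vs-apply scope split inside that transaction is a pure set computation; a sketch under the assumption that fetched IIDs and statuses arrive as a set and a map (`plan_status_writes` is a hypothetical helper, not from the plan):

```rust
use std::collections::{HashMap, HashSet};

/// Split the fetched issue scope into IIDs whose status columns must be
/// cleared (no status widget returned) and IIDs that receive a new value.
/// Both lists are sorted so the SQL apply order is deterministic.
fn plan_status_writes(
    fetched_iids: &HashSet<i64>,
    statuses: &HashMap<i64, String>,
) -> (Vec<i64>, Vec<(i64, String)>) {
    let mut clear: Vec<i64> = fetched_iids
        .iter()
        .copied()
        .filter(|iid| !statuses.contains_key(iid))
        .collect();
    clear.sort_unstable();

    let mut apply: Vec<(i64, String)> = statuses
        .iter()
        .filter(|(iid, _)| fetched_iids.contains(*iid))
        .map(|(iid, name)| (*iid, name.clone()))
        .collect();
    apply.sort_unstable_by_key(|(iid, _)| *iid);

    (clear, apply)
}
```

Statuses for IIDs outside the fetched scope are ignored, which keeps the transaction from touching rows the sync never saw.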

3. **Add transient retry/backoff (429/5xx/network)**

Right now one transient failure loses status enrichment for that sync. Retrying with bounded backoff gives much better reliability at low cost.

```diff
@@ AC-1: GraphQL Client (Unit)
+ [ ] Retries 429/502/503/504/network errors with bounded exponential backoff + jitter (max 3 attempts)
+ [ ] Honors `Retry-After` on 429 before retrying

@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] Cancellation signal is checked before each retry sleep and between paginated calls
```
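The delay schedule can be kept as one pure function; a sketch with assumed base/cap constants (500 ms base, 8 s cap), and jitter omitted to stay deterministic even though the AC above calls for it:

```rust
use std::time::Duration;

/// Bounded backoff sketch: base 500 ms doubled per attempt, capped at 8 s,
/// at most 3 attempts. A server-provided `Retry-After` value (in seconds,
/// from a 429) overrides the computed delay.
fn retry_delay(attempt: u32, retry_after_secs: Option<u64>) -> Option<Duration> {
    const MAX_ATTEMPTS: u32 = 3;
    if attempt >= MAX_ATTEMPTS {
        return None; // retries exhausted; caller surfaces the error
    }
    if let Some(secs) = retry_after_secs {
        return Some(Duration::from_secs(secs)); // honor 429 Retry-After
    }
    Some(Duration::from_millis((500u64 << attempt).min(8_000)))
}
```

Returning `None` rather than sleeping forever keeps the cancellation check above meaningful: the caller re-checks the signal before every `Some` delay.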

4. **Stop full GraphQL scans when nothing changed**

Running full pagination on every sync will dominate runtime on large repos. Trigger enrichment only when issue ingestion reports changes, with a manual override.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Runs on every sync (not gated by `--full`)
+ [ ] Runs when issue ingestion changed at least one issue in the project
+ [ ] New override flag `--refresh-status` forces enrichment even with zero issue deltas
+ [ ] Optional periodic full refresh (e.g. every N syncs) to prevent long-tail drift
```

5. **Do not expose raw token via `client.token()`**

Architecturally cleaner and safer: keep token encapsulated and expose a GraphQL-ready client factory from `GitLabClient`.

```diff
@@ File 13: `src/gitlab/client.rs`
- pub fn token(&self) -> &str
+ pub fn graphql_client(&self) -> crate::gitlab::graphql::GraphqlClient

@@ File 6: `src/ingestion/orchestrator.rs`
- let graphql_client = GraphqlClient::new(&config.gitlab.base_url, client.token());
+ let graphql_client = client.graphql_client();
```
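The factory shape can be sketched with stub types (everything here is illustrative: the field layout, the example host, and the `/api/graphql` path are assumptions, not the project's actual code):

```rust
/// GraphQL client stub: holds the derived endpoint and a copy of the token.
struct GraphqlClient {
    endpoint: String,
    token: String,
}

/// REST client stub: the token stays a private field with no accessor,
/// so callers can only obtain a fully configured GraphQL client.
struct GitLabClient {
    base_url: String,
    token: String,
}

impl GitLabClient {
    fn graphql_client(&self) -> GraphqlClient {
        GraphqlClient {
            endpoint: format!("{}/api/graphql", self.base_url),
            token: self.token.clone(),
        }
    }
}
```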

6. **Add indexes for new status filters**

`--status` on large tables will otherwise full-scan `issues`. Add compound indexes aligned with project-scoped list queries.

```diff
@@ AC-4: Migration 021 (Unit)
+ [ ] Adds index `idx_issues_project_status_name(project_id, status_name)`
+ [ ] Adds index `idx_issues_project_status_category(project_id, status_category)`

@@ File 14: `migrations/021_work_item_status.sql`
 ALTER TABLE issues ADD COLUMN status_name TEXT;
 ALTER TABLE issues ADD COLUMN status_category TEXT;
 ALTER TABLE issues ADD COLUMN status_color TEXT;
 ALTER TABLE issues ADD COLUMN status_icon_name TEXT;
+CREATE INDEX IF NOT EXISTS idx_issues_project_status_name
+  ON issues(project_id, status_name);
+CREATE INDEX IF NOT EXISTS idx_issues_project_status_category
+  ON issues(project_id, status_category);
```

7. **Improve filter UX: add category filter + case-insensitive status**

Case-sensitive exact name matches are brittle with custom lifecycle names. Category filter is stable and useful for automation.

```diff
@@ AC-9: List Issues Filter (E2E)
- [ ] Filter is case-sensitive (matches GitLab's exact status name)
+ [ ] `--status` uses case-insensitive exact match by default (`COLLATE NOCASE`)
+ [ ] New filter `--status-category` supports `triage|to_do|in_progress|done|canceled`
+ [ ] `--status-exact` enables strict case-sensitive behavior when needed
```

8. **Add capability probe/cache to avoid pointless calls**

Free tier / old GitLab versions will never return the status widget. Cache that capability per project (with TTL) to reduce noise and wasted requests.

```diff
@@ GitLab API Constraints
+### Capability Probe
+On first sync per project, detect status-widget support and cache result for 24h.
+If unsupported, skip enrichment silently (debug log) until TTL expiry.

@@ AC-3: Status Fetcher (Integration)
+ [ ] Unsupported capability state bypasses GraphQL fetch and warning spam
```

9. **Use a nested robot `status` object instead of 4 top-level fields**

This is cleaner schema design and scales better as status metadata grows (IDs, lifecycle, timestamps, etc.).

```diff
@@ AC-7: Show Issue Display (Robot)
- [ ] JSON includes `status_name`, `status_category`, `status_color`, `status_icon_name` fields
- [ ] Fields are `null` (not absent) when status not available
+ [ ] JSON includes `status` object:
+     `{ "name": "...", "category": "...", "color": "...", "icon_name": "..." }` or `null`

@@ AC-8: List Issues Display (Robot)
- [ ] JSON includes `status_name`, `status_category` fields on each issue
+ [ ] JSON includes `status` object (or `null`) on each issue
```
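The object-or-`null` contract can be shown with a hand-rolled renderer (real code would derive this with serde and an `Option<Status>` field; only two fields are shown and no string escaping is done in this sketch):

```rust
/// Minimal status payload for the sketch.
struct Status {
    name: String,
    category: String,
}

/// Render the nested robot-mode `status` field: an object when present,
/// literal JSON `null` when absent, never an omitted key.
fn status_json(status: Option<&Status>) -> String {
    match status {
        Some(s) => format!("{{\"name\":\"{}\",\"category\":\"{}\"}}", s.name, s.category),
        None => "null".to_string(),
    }
}
```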

10. **Add one compelling feature: status analytics, not just status display**

Right now this is mostly a transport/display enhancement. Make it genuinely useful with “stale in-progress” detection and age-in-status filters.

```diff
@@ Acceptance Criteria
+### AC-11: Status Aging & Triage Value (E2E)
+- [ ] `lore list issues --status-category in_progress --stale-days 14` filters to stale work
+- [ ] Human table shows `Status Age` (days) when status exists
+- [ ] Robot output includes `status_age_days` (nullable integer)
```

11. **Harden test plan around failure modes you’ll actually hit**

The current tests are good, but miss rollback/staleness/retry behavior that drives real reliability.

```diff
@@ TDD Plan (RED) additions
+21. `test_enrich_clears_removed_status`
+22. `test_enrich_transaction_rolls_back_on_failure`
+23. `test_graphql_retry_429_then_success`
+24. `test_graphql_retry_503_then_success`
+25. `test_cancel_during_backoff_aborts_cleanly`
+26. `test_status_filter_query_uses_project_status_index` (EXPLAIN smoke test)
```

If you want, I can produce a fully revised v3 plan document end-to-end (frontmatter + reordered ACs + updated file list + updated TDD matrix) so it is ready to implement directly.

159
plans/work-item-status-graphql.feedback-4.md
Normal file

@@ -0,0 +1,159 @@
Your plan is already strong, but I’d revise it in 10 places to reduce risk at scale and make it materially more useful.

1. Shared transport + retries for GraphQL (must-have)

Reasoning: `REST` already has throttling/retry in `src/gitlab/client.rs`; your proposed GraphQL client would bypass that and can spike rate limits under concurrent project ingest (`src/cli/commands/ingest.rs`). Unifying transport prevents split behavior and cuts production incidents.

```diff
@@ AC-1: GraphQL Client (Unit)
- [ ] Network error → `LoreError::Other`
+ [ ] GraphQL requests use shared GitLab transport (same timeout, rate limiter, retry policy as REST)
+ [ ] Retries 429/502/503/504/network errors (max 3) with exponential backoff + jitter
+ [ ] 429 honors `Retry-After` before retrying
+ [ ] Exhausted network retries → `LoreError::GitLabNetworkError`

@@ Decisions
- 8. **No retry/backoff in v1** — DEFER.
+ 8. **Retry/backoff in v1** — YES (shared REST+GraphQL reliability policy).

@@ Implementation Detail
+ File 15: `src/gitlab/transport.rs` (NEW) — shared HTTP execution and retry/backoff policy.
```

2. Capability cache for unsupported projects (must-have)

Reasoning: Free tier / older GitLab will repeatedly emit warning noise every sync and waste calls. Cache support status per project and re-probe on TTL.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] On any GraphQL error: logs warning, continues to next project (never fails the sync)
+ [ ] Unsupported capability responses (missing endpoint/type/widget) are cached per project
+ [ ] While cached unsupported, enrichment is skipped without repeated warning spam
+ [ ] Capability cache auto-expires (default 24h) and is re-probed

@@ Migration Numbering
- This feature uses **migration 021**.
+ This feature uses **migrations 021-022**.

@@ Files Changed (Summary)
+ `migrations/022_project_capabilities.sql` | NEW — support cache table for project capabilities
```

3. Delta-first enrichment with periodic full reconcile (must-have)

Reasoning: Full GraphQL scan every sync is expensive for large projects. You already compute issue deltas in ingestion; use that as the fast path and keep a periodic full sweep as a safety net.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Runs on every sync (not gated by `--full`)
+ [ ] Fast path: skip status enrichment when issue ingestion upserted 0 issues for that project
+ [ ] Safety net: run full reconciliation every `status_full_reconcile_hours` (default 24)
+ [ ] `--full` always forces reconciliation

@@ AC-5: Config Toggle (Unit)
+ [ ] `SyncConfig` has `status_full_reconcile_hours: u32` (default 24)
```

4. Strongly typed widget parsing via `__typename` (must-have)

Reasoning: the current “deserialize arbitrary widget JSON into `StatusWidget`” approach is fragile. Query/type by `__typename` for forward compatibility and fewer silent parse mistakes.

```diff
@@ AC-3: Status Fetcher (Integration)
- [ ] Extracts status from `widgets` array by matching `WorkItemWidgetStatus` fragment
+ [ ] Query includes `widgets { __typename ... }` and parser matches `__typename == "WorkItemWidgetStatus"`
+ [ ] Non-status widgets are ignored deterministically (no heuristic JSON-deserialize attempts)

@@ GraphQL Query
+ widgets {
+   __typename
+   ... on WorkItemWidgetStatus { ... }
+ }
```
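The deterministic selection step reduces to matching one tag; a dependency-free sketch where `(typename, payload)` string pairs stand in for parsed GraphQL JSON (the function name is illustrative):

```rust
/// Deterministic widget selection: only the entry whose `__typename` equals
/// `WorkItemWidgetStatus` is treated as the status widget; all other widget
/// types are skipped without any speculative deserialization attempt.
fn find_status_widget<'a>(widgets: &[(&'a str, &'a str)]) -> Option<&'a str> {
    widgets
        .iter()
        .find(|(typename, _)| *typename == "WorkItemWidgetStatus")
        .map(|(_, payload)| *payload)
}
```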

5. Set-based transactional DB apply (must-have)

Reasoning: row-by-row clear/update loops will be slow on large projects and hold write locks longer. Temp-table + set-based SQL inside one txn is faster and easier to reason about rollback.

```diff
@@ AC-3: Status Fetcher (Integration)
- `all_fetched_iids: Vec<i64>`
+ `all_fetched_iids: HashSet<i64>`

@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Before applying updates, NULL out status fields ... (loop per IID)
- [ ] UPDATE SQL: `SET status_name=?, ... WHERE project_id=? AND iid=?`
+ [ ] Use temp tables and set-based SQL in one transaction:
+ [ ] (1) clear stale statuses for fetched IIDs absent from status rows
+ [ ] (2) apply status values for fetched IIDs with status
+ [ ] One commit per project; rollback leaves prior state intact
```

6. Fix index strategy for `COLLATE NOCASE` + default sorting (must-have)

Reasoning: your proposed `(project_id, status_name)` index may not fully help `COLLATE NOCASE` + `ORDER BY updated_at`. Tune the index to the real query shape in `src/cli/commands/list.rs`.

```diff
@@ AC-4: Migration 021 (Unit)
- [ ] Adds compound index `idx_issues_project_status_name(project_id, status_name)` for `--status` filter performance
+ [ ] Adds covering NOCASE-aware index:
+ [ ] `idx_issues_project_status_name_nocase_updated(project_id, status_name COLLATE NOCASE, updated_at DESC)`
+ [ ] Adds category index:
+ [ ] `idx_issues_project_status_category_nocase(project_id, status_category COLLATE NOCASE)`
```

7. Add stable/automation-friendly filters now (high-value feature)

Reasoning: status names are user-customizable and renameable; category is more stable. Also add `--no-status` for quality checks and migration visibility.

```diff
@@ AC-9: List Issues Filter (E2E)
+ [ ] `lore list issues --status-category in_progress` filters by category (case-insensitive)
+ [ ] `lore list issues --no-status` returns only issues where `status_name IS NULL`
+ [ ] `--status` + `--status-category` combine with AND logic

@@ File 9: `src/cli/mod.rs`
+ Add flags: `--status-category`, `--no-status`

@@ File 11: `src/cli/autocorrect.rs`
+ Register `--status-category` and `--no-status` for `issues`
```

8. Better enrichment observability and failure accounting (must-have ops)

Reasoning: only tracking `statuses_enriched` hides skipped/cleared/errors, and auth failures become silent partial data-quality issues. Add counters and explicit progress events.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] `IngestProjectResult` gains `statuses_enriched: usize` counter
- [ ] Progress event: `ProgressEvent::StatusEnrichmentComplete { enriched: usize }`
+ [ ] `IngestProjectResult` gains:
+ [ ] `statuses_enriched`, `statuses_cleared`, `status_enrichment_skipped`, `status_enrichment_failed`
+ [ ] Progress events:
+ [ ] `StatusEnrichmentStarted`, `StatusEnrichmentSkipped`, `StatusEnrichmentComplete`, `StatusEnrichmentFailed`
+ [ ] End-of-sync summary includes per-project enrichment outcome counts
```

9. Add `status_changed_at` for immediately useful workflow analytics (high-value feature)

Reasoning: without change timestamp, you can’t answer “how long has this been in progress?” which is one of the most useful agent/human queries.

```diff
@@ AC-4: Migration 021 (Unit)
+ [ ] Adds nullable INTEGER column `status_changed_at` (ms epoch UTC)

@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] If status_name/category changes, update `status_changed_at = now_ms()`
+ [ ] If status is cleared, set `status_changed_at = NULL`

@@ AC-9: List Issues Filter (E2E)
+ [ ] `lore list issues --stale-status-days N` filters by `status_changed_at <= now - N days`
```
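The update rule can be captured as one pure function; a sketch where statuses are `(name, category)` pairs and the function name is illustrative:

```rust
/// Decide the next `status_changed_at` value: the timestamp moves only when
/// the (name, category) pair actually changes, is kept when the status is
/// unchanged, and is cleared together with the status itself.
fn next_status_changed_at(
    old: Option<(&str, &str)>,
    new: Option<(&str, &str)>,
    prev_changed_at: Option<i64>,
    now_ms: i64,
) -> Option<i64> {
    match (old, new) {
        (_, None) => None,                   // status cleared
        (o, n) if o == n => prev_changed_at, // unchanged: keep old timestamp
        _ => Some(now_ms),                   // first status, or a real change
    }
}
```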

10. Expand test matrix for real-world failure/perf paths (must-have)

Reasoning: current tests are good, but the highest-risk failures are retry behavior, capability caching, idempotency under repeated runs, and large-project performance.

```diff
@@ TDD Plan — RED Phase
+ 26. `test_graphql_retries_429_with_retry_after_then_succeeds`
+ 27. `test_graphql_retries_503_then_fails_after_max_attempts`
+ 28. `test_capability_cache_skips_unsupported_project_until_ttl_expiry`
+ 29. `test_delta_skip_when_no_issue_upserts`
+ 30. `test_periodic_full_reconcile_runs_after_threshold`
+ 31. `test_set_based_enrichment_scales_10k_issues_without_timeout`
+ 32. `test_enrichment_idempotent_across_two_runs`
+ 33. `test_status_changed_at_updates_only_on_actual_status_change`
```

If you want, I can now produce a single consolidated revised plan document (full rewritten Markdown) with these changes merged in-place so it’s ready to execute.

124
plans/work-item-status-graphql.feedback-5.md
Normal file

@@ -0,0 +1,124 @@
Your plan is already strong and implementation-aware. The best upgrades are mostly about reliability under real-world API instability, large-scale performance, and making the feature more useful for automation.

1. Promote retry/backoff from deferred to in-scope now.

Reason: Right now, transient failures cause silent status gaps until a later sync. Bounded retries with jitter and a time budget dramatically improve successful enrichment without making syncs hang.

```diff
@@ AC-1: GraphQL Client (Unit) @@
- [ ] Network error → `LoreError::Other`
+ [ ] Transient failures (`429`, `502`, `503`, `504`, timeout, connect reset) retry with exponential backoff + jitter (max 3 attempts)
+ [ ] `Retry-After` supports both delta-seconds and HTTP-date formats
+ [ ] Per-request retry budget capped (e.g. 120s total) to preserve cancellation responsiveness

@@ AC-6: Enrichment in Orchestrator (Integration) @@
- [ ] On any GraphQL error: logs warning, continues to next project (never fails the sync)
+ [ ] On transient GraphQL errors: retry policy applied before warning/skip behavior

@@ Decisions @@
- 8. **No retry/backoff in v1** — DEFER.
+ 8. **Retry/backoff in v1** — YES. Required for reliable enrichment under normal GitLab/API turbulence.
```

2. Add a capability cache so unsupported projects stop paying repeated GraphQL cost.

Reason: Free tier / older instances will never return status widgets. Re-querying every sync is wasted time and noisy logs.

```diff
@@ Acceptance Criteria @@
+ ### AC-11: Capability Probe & Cache (Integration)
+ - [ ] Add `project_capabilities` cache with `supports_work_item_status`, `checked_at`, `cooldown_until`
+ - [ ] 404/403/known-unsupported responses update capability cache and suppress repeated warnings until TTL expires
+ - [ ] Supported projects still enrich every run (subject to normal schedule)

@@ Future Enhancements (Not in Scope) @@
- **Capability probe/cache**: Detect status-widget support per project ... (deferred)
+ (moved into scope as AC-11)
```
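The cache semantics fit in a small struct; an in-memory sketch with an injected clock for testability (the suggested design persists this in a `project_capabilities` table instead, and the type and method names here are illustrative):

```rust
use std::collections::HashMap;

/// Per-project capability cache: the last probe result plus its expiry.
struct CapabilityCache {
    ttl_ms: i64,
    entries: HashMap<i64, (bool, i64)>, // project_id -> (supported, expires_at_ms)
}

impl CapabilityCache {
    fn new(ttl_ms: i64) -> Self {
        CapabilityCache { ttl_ms, entries: HashMap::new() }
    }

    fn record(&mut self, project_id: i64, supported: bool, now_ms: i64) {
        self.entries.insert(project_id, (supported, now_ms + self.ttl_ms));
    }

    /// `None` means unknown or expired: probe the project again.
    fn supported(&self, project_id: i64, now_ms: i64) -> Option<bool> {
        self.entries
            .get(&project_id)
            .filter(|(_, expires_at)| now_ms < *expires_at)
            .map(|(supported, _)| *supported)
    }
}
```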

3. Make enrichment delta-aware with periodic forced reconciliation.

Reason: Full pagination every sync is expensive on large projects. You can skip unnecessary status fetches when no issue changes occurred, while still doing periodic safety sweeps.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration) @@
- [ ] Runs on every sync (not gated by `--full`)
+ [ ] Runs when issue ingestion reports project issue deltas OR reconcile window elapsed
+ [ ] New config: `status_reconcile_hours` (default: 24) for periodic full sweep
+ [ ] `--refresh-status` forces enrichment regardless of delta/reconcile window
```

4. Replace row-by-row update loops with set-based SQL via temp table.

Reason: Current per-IID loops are simple but slow at scale and hold locks longer. Set-based updates are much faster and reduce lock contention.

```diff
@@ File 6: `src/ingestion/orchestrator.rs` (MODIFY) @@
- for iid in all_fetched_iids { ... UPDATE issues ... }
- for (iid, status) in statuses { ... UPDATE issues ... }
+ CREATE TEMP TABLE temp_issue_status_updates(...)
+ bulk INSERT temp rows (iid, name, category, color, icon_name)
+ single set-based UPDATE for enriched rows
+ single set-based NULL-clear for fetched-without-status rows
+ commit transaction
```

5. Add strict mode and explicit partial-failure reporting.

Reason: “Warn and continue” is a good default UX, but automation needs a fail-fast option and machine-readable failure output.

```diff
@@ AC-5: Config Toggle (Unit) @@
+ - [ ] `SyncConfig` adds `status_enrichment_strict: bool` (default false)

@@ AC-6: Enrichment in Orchestrator (Integration) @@
- [ ] On any GraphQL error: logs warning, continues to next project (never fails the sync)
+ [ ] Default mode: warn + continue
+ [ ] Strict mode: status enrichment error fails sync for that run

@@ AC-6: IngestProjectResult @@
+ - [ ] Adds `status_enrichment_error: Option<String>`

@@ AC-8 / Robot sync envelope @@
+ - [ ] Robot output includes `partial_failures` array with per-project enrichment failures
```

6. Fix case-insensitive matching robustness and track freshness.

Reason: SQLite `COLLATE NOCASE` is ASCII-centric; custom statuses may be non-ASCII. Also you need visibility into staleness.

```diff
@@ AC-4: Migration 021 (Unit) @@
- [ ] Migration adds 4 nullable TEXT columns to `issues`
+ [ ] Migration adds 6 columns:
+ `status_name`, `status_category`, `status_color`, `status_icon_name`,
+ `status_name_fold`, `status_synced_at`
- [ ] Adds compound index `idx_issues_project_status_name(project_id, status_name)`
+ [ ] Adds compound index `idx_issues_project_status_name_fold(project_id, status_name_fold)`

@@ AC-9: List Issues Filter (E2E) @@
- [ ] Filter uses case-insensitive matching (`COLLATE NOCASE`)
+ [ ] Filter uses `status_name_fold` (Unicode-safe fold normalization done at write time)
```
|
||||||
|
|
||||||
|
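The ASCII limitation is easy to demonstrate with stdlib SQLite. Column names follow the feedback; using Python's `str.casefold()` as the write-time fold rule is an assumption (the Rust implementation would pick its own Unicode fold).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (iid INTEGER, status_name TEXT, status_name_fold TEXT)")

def insert_issue(iid, status_name):
    # Fold once at write time; reads then use plain indexed equality.
    conn.execute("INSERT INTO issues VALUES (?, ?, ?)",
                 (iid, status_name, status_name.casefold()))

insert_issue(1, "İşlemde")  # non-ASCII status name with dotted capital İ
insert_issue(2, "Done")

# COLLATE NOCASE folds only ASCII A-Z, so the dotted-İ row is missed:
nocase = conn.execute(
    "SELECT iid FROM issues WHERE status_name = 'işlemde' COLLATE NOCASE").fetchall()
# The write-time fold column matches, because casefold() is full Unicode:
folded = conn.execute(
    "SELECT iid FROM issues WHERE status_name_fold = ?",
    ("İşlemde".casefold(),)).fetchall()
print(nocase, folded)  # → [] [(1,)]
```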
7. Expand filtering to category and missing-status workflows.

Reason: Name filters are useful, but automation works better on semantic categories and “missing data” detection.

```diff
@@ AC-9: List Issues Filter (E2E) @@
+ - [ ] `--status-category in_progress` filters by `status_category` (case-insensitive)
+ - [ ] `--no-status` returns only issues where `status_name IS NULL`
+ - [ ] `--status` and `--status-category` can be combined with AND logic
```

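The AND composition of these flags can be sketched as a small WHERE-clause builder. Flag and column names follow the feedback; the builder itself is illustrative, not the project's CLI code.

```python
import sqlite3

def build_issue_filter(status=None, status_category=None, no_status=False):
    clauses, params = [], []
    if status is not None:
        clauses.append("status_name = ? COLLATE NOCASE")
        params.append(status)
    if status_category is not None:
        clauses.append("status_category = ? COLLATE NOCASE")
        params.append(status_category)
    if no_status:
        clauses.append("status_name IS NULL")  # missing-data detection
    where = " AND ".join(clauses) if clauses else "1=1"
    return f"SELECT iid FROM issues WHERE {where} ORDER BY iid", params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (iid INTEGER, status_name TEXT, status_category TEXT)")
conn.executemany("INSERT INTO issues VALUES (?, ?, ?)",
                 [(1, "In progress", "in_progress"), (2, "To do", "to_do"), (3, None, None)])

sql, params = build_issue_filter(status_category="IN_PROGRESS")
print(conn.execute(sql, params).fetchall())  # → [(1,)]
sql, params = build_issue_filter(no_status=True)
print(conn.execute(sql, params).fetchall())  # → [(3,)]
```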
8. Change the robot payload from flat status fields to a nested `status` object.

Reason: Better schema evolution and less top-level field sprawl as you add metadata (`synced_at`, future lifecycle fields).

```diff
@@ AC-7: Show Issue Display (E2E) @@
- [ ] JSON includes `status_name`, `status_category`, `status_color`, `status_icon_name` fields
- [ ] Fields are `null` (not absent) when status not available
+ [ ] JSON includes `status` object:
+   `{ "name", "category", "color", "icon_name", "synced_at" }`
+ [ ] `status: null` when not available

@@ AC-8: List Issues Display (E2E) @@
- [ ] `--fields` supports: `status_name`, `status_category`, `status_color`, `status_icon_name`
+ [ ] `--fields` supports: `status.name,status.category,status.color,status.icon_name,status.synced_at`
```

If you want, I can produce a fully rewritten “Iteration 5” plan document with these changes integrated end-to-end (ACs, files, migrations, TDD batches, and updated decisions/future-scope).
130
plans/work-item-status-graphql.feedback-6.md
Normal file
@@ -0,0 +1,130 @@
Your iteration-5 plan is strong. The biggest remaining gaps are outcome ambiguity, cancellation safety, and long-term status identity. These are the revisions I’d make.
1. **Make enrichment outcomes explicit (not “empty success”)**

Analysis:
Right now `404/403 -> Ok(empty)` is operationally ambiguous: “project has no statuses” vs “feature unavailable/auth issue.” Agents and dashboards need that distinction to make correct decisions.
This improves reliability and observability without making sync fail hard.

```diff
@@ AC-3: Status Fetcher (Integration)
-- [ ] `fetch_issue_statuses()` returns `FetchStatusResult` containing:
+- [ ] `fetch_issue_statuses()` returns `FetchStatusOutcome`:
+  - `Fetched(FetchStatusResult)`
+  - `Unsupported { reason: UnsupportedReason }`
+  - `CancelledPartial(FetchStatusResult)`
@@
-- [ ] GraphQL 404 → returns `Ok(FetchStatusResult)` with empty collections + warning log
-- [ ] GraphQL 403 (`GitLabAuthFailed`) → returns `Ok(FetchStatusResult)` with empty collections + warning log
+- [ ] GraphQL 404 → `Unsupported { reason: GraphqlEndpointMissing }` + warning log
+- [ ] GraphQL 403 (`GitLabAuthFailed`) → `Unsupported { reason: AuthForbidden }` + warning log

@@ AC-10: Robot Sync Envelope (E2E)
-- [ ] `status_enrichment` object: `{ "enriched": N, "cleared": N, "error": null | "message" }`
+- [ ] `status_enrichment` object: `{ "mode": "fetched|unsupported|cancelled_partial", "reason": null|"...", "enriched": N, "cleared": N, "error": null|"message" }`
```

2. **Add cancellation and pagination loop safety**

Analysis:
Large projects can run long. The current flow checks cancellation only before enrichment starts; the pagination and per-row update loops can ignore cancellation for too long. GraphQL cursor bugs can also create infinite loops (`hasNextPage=true` with an unchanged cursor).
This is a robustness must-have.

```diff
@@ AC-3: Status Fetcher (Integration)
+ [ ] `fetch_issue_statuses()` accepts cancellation signal and checks it between page requests
+ [ ] Pagination guard: if `hasNextPage=true` but `endCursor` is `None` or unchanged, abort loop with warning and return partial outcome
+ [ ] Emits `pages_fetched` count for diagnostics

@@ File 1: `src/gitlab/graphql.rs`
-- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str) -> Result<FetchStatusResult>
+- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str, signal: &CancellationSignal) -> Result<FetchStatusOutcome>
```

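The guarded page loop can be sketched language-agnostically; this Python version uses a `threading.Event` as a stand-in cancellation signal and a hypothetical `fetch_page` callable returning the connection's `hasNextPage`/`endCursor` fields.

```python
import threading

def fetch_all(fetch_page, cancel: threading.Event):
    """fetch_page(cursor) -> {"nodes": [...], "hasNextPage": bool, "endCursor": str|None}."""
    items, cursor, pages = [], None, 0
    while True:
        if cancel.is_set():            # checked between page requests
            return items, pages, "cancelled_partial"
        page = fetch_page(cursor)
        items.extend(page["nodes"])
        pages += 1
        if not page["hasNextPage"]:
            return items, pages, "fetched"
        if page["endCursor"] is None or page["endCursor"] == cursor:
            # Server bug: this would loop forever; abort with a partial outcome.
            return items, pages, "aborted_bad_cursor"
        cursor = page["endCursor"]

# A buggy server that always reports hasNextPage=true with a stuck cursor:
def stuck(cursor):
    return {"nodes": [1], "hasNextPage": True, "endCursor": "c1"}

result = fetch_all(stuck, threading.Event())
print(result)  # → ([1, 1], 2, 'aborted_bad_cursor')
```

The loop terminates after the second page because the cursor did not advance; without the guard it would spin indefinitely.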
3. **Persist stable `status_id` in addition to name**

Analysis:
`status_name` is display-oriented and mutable (renames, custom lifecycle changes). A stable status identifier is critical for durable automations, analytics, and future migrations.
This is a schema decision that is cheap now and expensive later if skipped.

```diff
@@ AC-2: Status Types (Unit)
-- [ ] `WorkItemStatus` struct has `name`, `category`, `color`, `icon_name`
+- [ ] `WorkItemStatus` struct has `id: String`, `name`, `category`, `color`, `icon_name`

@@ AC-4: Migration 021 (Unit)
-- [ ] Migration adds 5 nullable columns to `issues`: `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at`
+- [ ] Migration adds 6 nullable columns to `issues`: `status_id`, `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at`
+ [ ] Adds index `idx_issues_project_status_id(project_id, status_id)` for stable machine filters

@@ GraphQL query
- status { name category color iconName }
+ status { id name category color iconName }

@@ AC-7 / AC-8 Robot
+ [ ] JSON includes `status_id` (null when unavailable)
```

4. **Handle GraphQL partial-data responses correctly**

Analysis:
GraphQL can return both `data` and `errors` in the same response. The current plan treats any `errors` as a hard failure, which can discard valid data and reduce reliability.
Use partial-data semantics: keep the data, log/report warnings.

```diff
@@ AC-1: GraphQL Client (Unit)
-- [ ] Error response: if top-level `errors` array is non-empty, returns `LoreError` with first error message
+- [ ] If `errors` non-empty and `data` missing: return `LoreError` with first error message
+- [ ] If `errors` non-empty and `data` present: return `data` + warning metadata (do not fail the whole fetch)

@@ TDD Plan (RED)
+ 33. `test_graphql_partial_data_with_errors_returns_data_and_warning`
```

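The response-envelope shape (`data` plus optional `errors`) is defined by the GraphQL spec; the handler below is a minimal sketch of the proposed partial-data semantics, with an illustrative result dict rather than the plan's Rust types.

```python
def handle_graphql_body(body):
    """Partial-data semantics: fail only when there is no usable data."""
    errors = body.get("errors") or []
    data = body.get("data")
    if errors and data is None:
        # Hard failure: errors and nothing usable.
        raise RuntimeError(errors[0]["message"])
    # Keep the data; surface errors as warning metadata instead of failing.
    return {"data": data,
            "had_errors": bool(errors),
            "first_error_message": errors[0]["message"] if errors else None}

ok = handle_graphql_body(
    {"data": {"project": {"workItems": []}}, "errors": [{"message": "field X deprecated"}]})
print(ok["had_errors"], ok["first_error_message"])  # → True field X deprecated
```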
5. **Extract status enrichment from the orchestrator into a dedicated module**

Analysis:
`orchestrator.rs` already has many phases. Putting status transport, parsing, and transaction policy directly there increases coupling and test friction.
A dedicated module improves architectural clarity and makes future enhancements safer.

```diff
@@ Implementation Detail
+- File 15: `src/ingestion/enrichment/status.rs` (NEW)
+  - `run_status_enrichment(...)`
+  - `enrich_issue_statuses_txn(...)`
+  - outcome mapping + telemetry

@@ File 6: `src/ingestion/orchestrator.rs`
-- Inline Phase 1.5 logic + helper function
+- Delegates to `enrichment::status::run_status_enrichment(...)` and records returned stats
```

6. **Add status/state consistency checks**

Analysis:
GitLab intends status categories and issue state to stay in sync, but ingestion drift or API edge cases can violate this. Detecting mismatches is high-signal for data-integrity issues.
This is compelling for agents because it catches “looks correct but isn’t” problems.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] Enrichment computes `status_state_mismatches` count:
+   - `DONE|CANCELED` with `state=open` or `TO_DO|IN_PROGRESS|TRIAGE` with `state=closed`
+ [ ] Logs warning summary when mismatches > 0

@@ AC-10: Robot Sync Envelope (E2E)
+ [ ] `status_enrichment` includes `state_mismatches: N`
```

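The mismatch rule above is just set membership plus a state check; a sketch, with the category names taken from the diff and purely illustrative rows:

```python
# Terminal categories should pair with closed issues; active ones with open issues.
TERMINAL = {"DONE", "CANCELED"}
ACTIVE = {"TO_DO", "IN_PROGRESS", "TRIAGE"}

def count_mismatches(rows):
    """rows: (issue_state, status_category) pairs."""
    n = 0
    for state, category in rows:
        if category in TERMINAL and state == "open":
            n += 1
        elif category in ACTIVE and state == "closed":
            n += 1
    return n

rows = [("open", "DONE"),          # mismatch: terminal status on open issue
        ("closed", "IN_PROGRESS"), # mismatch: active status on closed issue
        ("open", "TO_DO"),         # consistent
        ("closed", "DONE")]        # consistent
print(count_mismatches(rows))  # → 2
```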
7. **Add an explicit performance-envelope acceptance criterion**

Analysis:
The plan claims large-project handling, but defines no hard validation target. Add a bounded, reproducible performance criterion to prevent regressions.
This is especially important with pagination + per-row writes.

```diff
@@ Acceptance Criteria
+ ### AC-12: Performance Envelope (Integration)
+ - [ ] 10k-issue fixture completes status fetch + apply within defined budget on CI baseline machine
+ - [ ] Memory usage remains O(page_size), not O(total_issues)
+ - [ ] Cancellation during large sync exits within a bounded latency target

@@ TDD Plan (RED)
+ 34. `test_enrichment_large_project_budget`
+ 35. `test_fetch_statuses_memory_bound_by_page`
+ 36. `test_cancellation_latency_during_pagination`
```

If you want, I can next produce a single consolidated “iteration 6” plan draft with these diffs fully merged so it’s ready to execute.
118
plans/work-item-status-graphql.feedback-7.md
Normal file
@@ -0,0 +1,118 @@
**Highest-Impact Revisions (new, not in your rejected list)**
1. **Critical: Preserve GraphQL partial-error metadata end-to-end (don’t just log it)**

Rationale: Right now partial GraphQL errors are warning-only. Agents get no machine-readable signal that status data may be incomplete, which can silently corrupt downstream automation decisions. Exposing partial-error metadata in `FetchStatusResult` and the robot sync output makes reliability observable and actionable.

```diff
@@ AC-1: GraphQL Client (Unit)
- [ ] Partial-data response: if `errors` array is non-empty BUT `data` field is present and non-null, returns `data` and logs warning with first error message
+ [ ] Partial-data response: if `errors` array is non-empty BUT `data` field is present and non-null, returns `data` and warning metadata (`had_errors=true`, `first_error_message`)
+ [ ] `GraphqlClient::query()` returns `GraphqlQueryResult { data, had_errors, first_error_message }`

@@ AC-3: Status Fetcher (Integration)
+ [ ] `FetchStatusResult` includes `partial_error_count: usize` and `first_partial_error: Option<String>`
+ [ ] Partial GraphQL errors increment `partial_error_count` and are surfaced to orchestrator result

@@ AC-10: Robot Sync Envelope (E2E)
- { "mode": "...", "reason": ..., "enriched": N, "cleared": N, "error": ... }
+ { "mode": "...", "reason": ..., "enriched": N, "cleared": N, "error": ..., "partial_errors": N, "first_partial_error": null|"..." }

@@ File 1: src/gitlab/graphql.rs
- pub async fn query(...) -> Result<serde_json::Value>
+ pub async fn query(...) -> Result<GraphqlQueryResult>
+ pub struct GraphqlQueryResult { pub data: serde_json::Value, pub had_errors: bool, pub first_error_message: Option<String> }
```

2. **High: Add adaptive page-size fallback for GraphQL complexity/timeout failures**

Rationale: A fixed `first: 100` is brittle on self-hosted instances with stricter complexity/time limits. An adaptive page size (100→50→25→10) improves the success rate without a generic retry/backoff layer, and avoids failing an entire project due to one tunable server constraint.

```diff
@@ Query Path
-query($projectPath: ID!, $after: String) { ... workItems(types: [ISSUE], first: 100, after: $after) ... }
+query($projectPath: ID!, $after: String, $first: Int!) { ... workItems(types: [ISSUE], first: $first, after: $after) ... }

@@ AC-3: Status Fetcher (Integration)
+ [ ] Starts with `first=100`; on GraphQL complexity/timeout errors, retries same cursor with smaller page size (50, 25, 10)
+ [ ] If smallest page size still fails, returns error as today
+ [ ] Emits warning including page size downgrade event

@@ TDD Plan (RED)
+ 36. `test_fetch_statuses_complexity_error_reduces_page_size`
+ 37. `test_fetch_statuses_timeout_error_reduces_page_size`
```

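The downgrade ladder can be sketched as a loop over page sizes that retries the same cursor. `ComplexityError`, `run_query`, and the picky server are hypothetical stand-ins for the real GraphQL transport and its error classification.

```python
class ComplexityError(Exception):
    """Stand-in for a server-side complexity/timeout rejection."""

PAGE_SIZES = [100, 50, 25, 10]

def fetch_page_adaptive(run_query, cursor):
    last_err = None
    for first in PAGE_SIZES:
        try:
            return run_query(cursor, first), first
        except ComplexityError as e:
            last_err = e  # downgrade page size and retry the SAME cursor
    raise last_err  # smallest page size still failed: surface the error

def picky_server(cursor, first):
    # Rejects anything above first=25, like a stricter self-hosted instance.
    if first > 25:
        raise ComplexityError(f"query too complex at first={first}")
    return {"nodes": list(range(first))}

page, used = fetch_page_adaptive(picky_server, None)
print(used, len(page["nodes"]))  # → 25 25
```

Because the cursor is unchanged across retries, no items are skipped when the page size drops mid-project.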
3. **High: Make project path lookup failure non-fatal for the sync**

Rationale: Enrichment is optional. If the `projects.path_with_namespace` lookup fails for any reason, sync should continue with a structured enrichment error instead of risking full project pipeline failure.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] If project path lookup fails/missing, status enrichment is skipped for that project, warning logged, and sync continues
+ [ ] `status_enrichment_error` captures `"project_path_missing"` (or DB error text)

@@ File 6: src/ingestion/orchestrator.rs
- let project_path: String = conn.query_row(...)?;
+ let project_path = conn.query_row(...).optional()?;
+ if project_path.is_none() {
+     result.status_enrichment_error = Some("project_path_missing".to_string());
+     result.status_enrichment_mode = "fetched".to_string(); // attempted but unavailable locally
+     emit(ProgressEvent::StatusEnrichmentComplete { enriched: 0, cleared: 0 });
+     // continue to discussion sync
+ }
```

4. **Medium: Upgrade `--status` from a single-value to a repeatable multi-value filter**

Rationale: Practical usage often needs “active buckets” (`To do` OR `In progress`). A repeatable `--status` with OR semantics dramatically improves usefulness without adding new conceptual surface area.

```diff
@@ AC-9: List Issues Filter (E2E)
- [ ] `lore list issues --status "In progress"` → only issues where `status_name = 'In progress'`
+ [ ] `lore list issues --status "In progress"` → unchanged single-value behavior
+ [ ] Repeatable flags supported: `--status "In progress" --status "To do"` (OR semantics across status values)
+ [ ] Repeated `--status` remains AND-composed with other filters

@@ File 9: src/cli/mod.rs
- pub status: Option<String>,
+ pub status: Vec<String>, // repeatable flag

@@ File 8: src/cli/commands/list.rs
- if let Some(status) = filters.status { where_clauses.push("i.status_name = ? COLLATE NOCASE"); ... }
+ if !filters.statuses.is_empty() { /* dynamic OR/IN clause with case-insensitive matching */ }
```

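The “dynamic OR/IN clause” can be sketched with stdlib SQLite: build one placeholder per repeated value and fold both sides for a case-insensitive match. SQLite's `lower()` is ASCII-only, which mirrors the `COLLATE NOCASE` caveat raised earlier; names and data here are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (iid INTEGER, status_name TEXT)")
conn.executemany("INSERT INTO issues VALUES (?, ?)",
                 [(1, "In progress"), (2, "To do"), (3, "Done")])

def list_issues(statuses):
    """statuses: values collected from repeated --status flags (OR semantics)."""
    if not statuses:
        return conn.execute("SELECT iid FROM issues ORDER BY iid").fetchall()
    placeholders = ",".join("?" * len(statuses))  # one ? per repeated flag
    sql = (f"SELECT iid FROM issues WHERE lower(status_name) IN ({placeholders}) "
           "ORDER BY iid")
    return conn.execute(sql, [s.lower() for s in statuses]).fetchall()

print(list_issues(["in progress", "TO DO"]))  # → [(1,), (2,)]
```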
5. **Medium: Add coverage telemetry (`seen`, `with_status`, `without_status`)**

Rationale: `enriched`/`cleared` alone is not enough to judge enrichment health. Coverage counters make it obvious whether a project truly has no statuses, is unsupported, or has unexpectedly low status population.

```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] `IngestProjectResult` gains `statuses_seen: usize` and `statuses_without_widget: usize`
+ [ ] Enrichment log includes `seen`, `enriched`, `cleared`, `without_widget`

@@ AC-10: Robot Sync Envelope (E2E)
- status_enrichment: { mode, reason, enriched, cleared, error }
+ status_enrichment: { mode, reason, seen, enriched, cleared, without_widget, error, partial_errors }

@@ File 6: src/ingestion/orchestrator.rs
+ result.statuses_seen = fetch_result.all_fetched_iids.len();
+ result.statuses_without_widget = result.statuses_seen.saturating_sub(result.statuses_enriched);
```

6. **Medium: Centralize color parsing/render decisions (single helper used by show/list)**

Rationale: Color parsing is duplicated in `show.rs` and `list.rs`, which invites drift and inconsistent behavior. One shared helper gives consistent fallback behavior and simpler tests.

```diff
@@ File 7: src/cli/commands/show.rs
- fn style_with_hex(...) { ...hex parse logic... }
+ use crate::cli::commands::color::style_with_hex;

@@ File 8: src/cli/commands/list.rs
- fn colored_cell_hex(...) { ...hex parse logic... }
+ use crate::cli::commands::color::colored_cell_hex;

@@ Files Changed (Summary)
+ `src/cli/commands/color.rs` (NEW) — shared hex parsing + styling helpers
- duplicated hex parsing blocks removed from show/list
```

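A sketch of what the shared helper buys: parse `#RRGGBB` in exactly one place and give every caller the same plain-text fallback on bad input. The function names echo the feedback, but the ANSI truecolor rendering is an assumption about how the helper would style output.

```python
def parse_hex(color):
    """Parse '#RRGGBB' into an (r, g, b) tuple, or None on any malformed input."""
    if not (isinstance(color, str) and len(color) == 7 and color.startswith("#")):
        return None
    try:
        return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))
    except ValueError:
        return None

def style_with_hex(text, color):
    rgb = parse_hex(color)
    if rgb is None:
        return text  # one consistent fallback, shared by show and list
    r, g, b = rgb
    return f"\x1b[38;2;{r};{g};{b}m{text}\x1b[0m"

print(parse_hex("#1f75cb"), style_with_hex("In progress", "oops"))
# → (31, 117, 203) In progress
```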
---

If you want, I can produce a **single consolidated patch-style diff of the plan document itself** (all section edits merged, ready to paste as iteration 7).
1627
plans/work-item-status-graphql.md
Normal file
File diff suppressed because it is too large
2036
plans/work-item-status-graphql.tdd-appendix.md
Normal file
File diff suppressed because it is too large