Your iteration-5 plan is strong. The biggest remaining gaps are outcome ambiguity, cancellation safety, and long-term status identity. These are the revisions I’d make. 1. **Make enrichment outcomes explicit (not “empty success”)** Analysis: Right now `404/403 -> Ok(empty)` is operationally ambiguous: “project has no statuses” vs “feature unavailable/auth issue.” Agents and dashboards need that distinction to make correct decisions. This improves reliability and observability without making sync fail-hard. ```diff @@ AC-3: Status Fetcher (Integration) -- [ ] `fetch_issue_statuses()` returns `FetchStatusResult` containing: +- [ ] `fetch_issue_statuses()` returns `FetchStatusOutcome`: + - `Fetched(FetchStatusResult)` + - `Unsupported { reason: UnsupportedReason }` + - `CancelledPartial(FetchStatusResult)` @@ -- [ ] GraphQL 404 → returns `Ok(FetchStatusResult)` with empty collections + warning log -- [ ] GraphQL 403 (`GitLabAuthFailed`) → returns `Ok(FetchStatusResult)` with empty collections + warning log +- [ ] GraphQL 404 → `Unsupported { reason: GraphqlEndpointMissing }` + warning log +- [ ] GraphQL 403 (`GitLabAuthFailed`) → `Unsupported { reason: AuthForbidden }` + warning log @@ AC-10: Robot Sync Envelope (E2E) -- [ ] `status_enrichment` object: `{ "enriched": N, "cleared": N, "error": null | "message" }` +- [ ] `status_enrichment` object: `{ "mode": "fetched|unsupported|cancelled_partial", "reason": null|"...", "enriched": N, "cleared": N, "error": null|"message" }` ``` 2. **Add cancellation and pagination loop safety** Analysis: Large projects can run long. Current flow checks cancellation only before enrichment starts; pagination and per-row update loops can ignore cancellation for too long. Also, GraphQL cursor bugs can create infinite loops (`hasNextPage=true` with unchanged cursor). This is a robustness must-have. ```diff @@ AC-3: Status Fetcher (Integration) + [ ] `fetch_issue_statuses()` accepts cancellation signal and checks it between page requests + [ ] Pagination guard: if `hasNextPage=true` but `endCursor` is `None` or unchanged, abort loop with warning and return partial outcome + [ ] Emits `pages_fetched` count for diagnostics @@ File 1: `src/gitlab/graphql.rs` -- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str) -> Result +- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str, signal: &CancellationSignal) -> Result ``` 3. **Persist stable `status_id` in addition to name** Analysis: `status_name` is display-oriented and mutable (rename/custom lifecycle changes). A stable status identifier is critical for durable automations, analytics, and future migrations. This is a schema decision that is cheap now and expensive later if skipped. ```diff @@ AC-2: Status Types (Unit) -- [ ] `WorkItemStatus` struct has `name`, `category`, `color`, `icon_name` +- [ ] `WorkItemStatus` struct has `id: String`, `name`, `category`, `color`, `icon_name` @@ AC-4: Migration 021 (Unit) -- [ ] Migration adds 5 nullable columns to `issues`: `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at` +- [ ] Migration adds 6 nullable columns to `issues`: `status_id`, `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at` + [ ] Adds index `idx_issues_project_status_id(project_id, status_id)` for stable-machine filters @@ GraphQL query - status { name category color iconName } + status { id name category color iconName } @@ AC-7 / AC-8 Robot + [ ] JSON includes `status_id` (null when unavailable) ``` 4. **Handle GraphQL partial-data responses correctly** Analysis: GraphQL can return both `data` and `errors` in the same response. Current plan treats any `errors` as hard failure, which can discard valid data and reduce reliability. Use partial-data semantics: keep data, log/report warnings. ```diff @@ AC-1: GraphQL Client (Unit) -- [ ] Error response: if top-level `errors` array is non-empty, returns `LoreError` with first error message +- [ ] If `errors` non-empty and `data` missing: return `LoreError` with first error message +- [ ] If `errors` non-empty and `data` present: return `data` + warning metadata (do not fail the whole fetch) @@ TDD Plan (RED) + 33. `test_graphql_partial_data_with_errors_returns_data_and_warning` ``` 5. **Extract status enrichment from orchestrator into dedicated module** Analysis: `orchestrator.rs` already has many phases. Putting status transport/parsing/transaction policy directly there increases coupling and test friction. A dedicated module improves architecture clarity and makes future enhancements safer. ```diff @@ Implementation Detail +- File 15: `src/ingestion/enrichment/status.rs` (NEW) + - `run_status_enrichment(...)` + - `enrich_issue_statuses_txn(...)` + - outcome mapping + telemetry @@ File 6: `src/ingestion/orchestrator.rs` -- Inline Phase 1.5 logic + helper function +- Delegates to `enrichment::status::run_status_enrichment(...)` and records returned stats ``` 6. **Add status/state consistency checks** Analysis: GitLab states status categories and issue state should synchronize, but ingestion drift or API edge cases can violate this. Detecting mismatch is high-signal for data integrity issues. This is compelling for agents because it catches “looks correct but isn’t” problems. ```diff @@ AC-6: Enrichment in Orchestrator (Integration) + [ ] Enrichment computes `status_state_mismatches` count: + - `DONE|CANCELED` with `state=open` or `TO_DO|IN_PROGRESS|TRIAGE` with `state=closed` + [ ] Logs warning summary when mismatches > 0 @@ AC-10: Robot Sync Envelope (E2E) + [ ] `status_enrichment` includes `state_mismatches: N` ``` 7. **Add explicit performance envelope acceptance criterion** Analysis: Plan claims large-project handling, but no hard validation target is defined. Add a bounded, reproducible performance criterion to prevent regressions. This is especially important with pagination + per-row writes. ```diff @@ Acceptance Criteria + ### AC-12: Performance Envelope (Integration) + - [ ] 10k-issue fixture completes status fetch + apply within defined budget on CI baseline machine + - [ ] Memory usage remains O(page_size), not O(total_issues) + - [ ] Cancellation during large sync exits within a bounded latency target @@ TDD Plan (RED) + 34. `test_enrichment_large_project_budget` + 35. `test_fetch_statuses_memory_bound_by_page` + 36. `test_cancellation_latency_during_pagination` ``` If you want, I can next produce a single consolidated “iteration 6” plan draft with these diffs fully merged so it’s ready to execute.