docs: add lore-service, work-item-status-graphql, and time-decay plans

Three implementation plans with iterative cross-model refinement:

lore-service (5 iterations):
  HTTP service layer exposing lore's SQLite data via REST/SSE for
  integration with external tools (dashboards, IDE extensions, chat
  agents). Covers authentication, rate limiting, caching strategy, and
  webhook-driven sync triggers.

work-item-status-graphql (7 iterations + TDD appendix):
  Detailed implementation plan for the GraphQL-based work item status
  enrichment feature (now implemented). Includes the TDD appendix with
  test-first development specifications covering GraphQL client, adaptive
  pagination, ingestion orchestration, CLI display, and robot mode output.

time-decay-expert-scoring (iteration 5 feedback):
  Updates to the existing time-decay scoring plan incorporating feedback
  on decay curve parameterization, recency weighting for discussion
  contributions, and staleness detection thresholds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Taylor Eernisse
2026-02-11 08:12:17 -05:00
parent 1161edb212
commit 2c9de1a6c3
15 changed files with 9261 additions and 33 deletions

View File

@@ -0,0 +1,130 @@
Your iteration-5 plan is strong. The biggest remaining gaps are outcome ambiguity, cancellation safety, and long-term status identity. These are the revisions Id make.
1. **Make enrichment outcomes explicit (not “empty success”)**
Analysis:
Right now `404/403 -> Ok(empty)` is operationally ambiguous: “project has no statuses” vs “feature unavailable/auth issue.” Agents and dashboards need that distinction to make correct decisions.
This improves reliability and observability without making sync fail-hard.
```diff
@@ AC-3: Status Fetcher (Integration)
-- [ ] `fetch_issue_statuses()` returns `FetchStatusResult` containing:
+- [ ] `fetch_issue_statuses()` returns `FetchStatusOutcome`:
+ - `Fetched(FetchStatusResult)`
+ - `Unsupported { reason: UnsupportedReason }`
+ - `CancelledPartial(FetchStatusResult)`
@@
-- [ ] GraphQL 404 → returns `Ok(FetchStatusResult)` with empty collections + warning log
-- [ ] GraphQL 403 (`GitLabAuthFailed`) → returns `Ok(FetchStatusResult)` with empty collections + warning log
+- [ ] GraphQL 404 → `Unsupported { reason: GraphqlEndpointMissing }` + warning log
+- [ ] GraphQL 403 (`GitLabAuthFailed`) → `Unsupported { reason: AuthForbidden }` + warning log
@@ AC-10: Robot Sync Envelope (E2E)
-- [ ] `status_enrichment` object: `{ "enriched": N, "cleared": N, "error": null | "message" }`
+- [ ] `status_enrichment` object: `{ "mode": "fetched|unsupported|cancelled_partial", "reason": null|"...", "enriched": N, "cleared": N, "error": null|"message" }`
```
2. **Add cancellation and pagination loop safety**
Analysis:
Large projects can run long. Current flow checks cancellation only before enrichment starts; pagination and per-row update loops can ignore cancellation for too long. Also, GraphQL cursor bugs can create infinite loops (`hasNextPage=true` with unchanged cursor).
This is a robustness must-have.
```diff
@@ AC-3: Status Fetcher (Integration)
+ [ ] `fetch_issue_statuses()` accepts cancellation signal and checks it between page requests
+ [ ] Pagination guard: if `hasNextPage=true` but `endCursor` is `None` or unchanged, abort loop with warning and return partial outcome
+ [ ] Emits `pages_fetched` count for diagnostics
@@ File 1: `src/gitlab/graphql.rs`
-- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str) -> Result<FetchStatusResult>
+- pub async fn fetch_issue_statuses(client: &GraphqlClient, project_path: &str, signal: &CancellationSignal) -> Result<FetchStatusOutcome>
```
3. **Persist stable `status_id` in addition to name**
Analysis:
`status_name` is display-oriented and mutable (rename/custom lifecycle changes). A stable status identifier is critical for durable automations, analytics, and future migrations.
This is a schema decision that is cheap now and expensive later if skipped.
```diff
@@ AC-2: Status Types (Unit)
-- [ ] `WorkItemStatus` struct has `name`, `category`, `color`, `icon_name`
+- [ ] `WorkItemStatus` struct has `id: String`, `name`, `category`, `color`, `icon_name`
@@ AC-4: Migration 021 (Unit)
-- [ ] Migration adds 5 nullable columns to `issues`: `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at`
+- [ ] Migration adds 6 nullable columns to `issues`: `status_id`, `status_name`, `status_category`, `status_color`, `status_icon_name`, `status_synced_at`
+ [ ] Adds index `idx_issues_project_status_id(project_id, status_id)` for stable-machine filters
@@ GraphQL query
- status { name category color iconName }
+ status { id name category color iconName }
@@ AC-7 / AC-8 Robot
+ [ ] JSON includes `status_id` (null when unavailable)
```
4. **Handle GraphQL partial-data responses correctly**
Analysis:
GraphQL can return both `data` and `errors` in the same response. Current plan treats any `errors` as hard failure, which can discard valid data and reduce reliability.
Use partial-data semantics: keep data, log/report warnings.
```diff
@@ AC-1: GraphQL Client (Unit)
-- [ ] Error response: if top-level `errors` array is non-empty, returns `LoreError` with first error message
+- [ ] If `errors` non-empty and `data` missing: return `LoreError` with first error message
+- [ ] If `errors` non-empty and `data` present: return `data` + warning metadata (do not fail the whole fetch)
@@ TDD Plan (RED)
+ 33. `test_graphql_partial_data_with_errors_returns_data_and_warning`
```
5. **Extract status enrichment from orchestrator into dedicated module**
Analysis:
`orchestrator.rs` already has many phases. Putting status transport/parsing/transaction policy directly there increases coupling and test friction.
A dedicated module improves architecture clarity and makes future enhancements safer.
```diff
@@ Implementation Detail
+- File 15: `src/ingestion/enrichment/status.rs` (NEW)
+ - `run_status_enrichment(...)`
+ - `enrich_issue_statuses_txn(...)`
+ - outcome mapping + telemetry
@@ File 6: `src/ingestion/orchestrator.rs`
-- Inline Phase 1.5 logic + helper function
+- Delegates to `enrichment::status::run_status_enrichment(...)` and records returned stats
```
6. **Add status/state consistency checks**
Analysis:
GitLab states status categories and issue state should synchronize, but ingestion drift or API edge cases can violate this. Detecting mismatch is high-signal for data integrity issues.
This is compelling for agents because it catches “looks correct but isnt” problems.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] Enrichment computes `status_state_mismatches` count:
+ - `DONE|CANCELED` with `state=open` or `TO_DO|IN_PROGRESS|TRIAGE` with `state=closed`
+ [ ] Logs warning summary when mismatches > 0
@@ AC-10: Robot Sync Envelope (E2E)
+ [ ] `status_enrichment` includes `state_mismatches: N`
```
7. **Add explicit performance envelope acceptance criterion**
Analysis:
Plan claims large-project handling, but no hard validation target is defined. Add a bounded, reproducible performance criterion to prevent regressions.
This is especially important with pagination + per-row writes.
```diff
@@ Acceptance Criteria
+ ### AC-12: Performance Envelope (Integration)
+ - [ ] 10k-issue fixture completes status fetch + apply within defined budget on CI baseline machine
+ - [ ] Memory usage remains O(page_size), not O(total_issues)
+ - [ ] Cancellation during large sync exits within a bounded latency target
@@ TDD Plan (RED)
+ 34. `test_enrichment_large_project_budget`
+ 35. `test_fetch_statuses_memory_bound_by_page`
+ 36. `test_cancellation_latency_during_pagination`
```
If you want, I can next produce a single consolidated “iteration 6” plan draft with these diffs fully merged so its ready to execute.