docs: add lore-service, work-item-status-graphql, and time-decay plans

Three implementation plans with iterative cross-model refinement:

lore-service (5 iterations):
  HTTP service layer exposing lore's SQLite data via REST/SSE for
  integration with external tools (dashboards, IDE extensions, chat
  agents). Covers authentication, rate limiting, caching strategy, and
  webhook-driven sync triggers.

work-item-status-graphql (7 iterations + TDD appendix):
  Detailed implementation plan for the GraphQL-based work item status
  enrichment feature (now implemented). Includes the TDD appendix with
  test-first development specifications covering GraphQL client, adaptive
  pagination, ingestion orchestration, CLI display, and robot mode output.

time-decay-expert-scoring (iteration 5 feedback):
  Updates to the existing time-decay scoring plan incorporating feedback
  on decay curve parameterization, recency weighting for discussion
  contributions, and staleness detection thresholds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
Taylor Eernisse
2026-02-11 08:12:17 -05:00
parent 1161edb212
commit 2c9de1a6c3
15 changed files with 9261 additions and 33 deletions

View File

@@ -0,0 +1,159 @@
Your plan is already strong, but Id revise it in 10 places to reduce risk at scale and make it materially more useful.
1. Shared transport + retries for GraphQL (must-have)
Reasoning: `REST` already has throttling/retry in `src/gitlab/client.rs`; your proposed GraphQL client would bypass that and can spike rate limits under concurrent project ingest (`src/cli/commands/ingest.rs`). Unifying transport prevents split behavior and cuts production incidents.
```diff
@@ AC-1: GraphQL Client (Unit)
- [ ] Network error → `LoreError::Other`
+ [ ] GraphQL requests use shared GitLab transport (same timeout, rate limiter, retry policy as REST)
+ [ ] Retries 429/502/503/504/network errors (max 3) with exponential backoff + jitter
+ [ ] 429 honors `Retry-After` before retrying
+ [ ] Exhausted network retries → `LoreError::GitLabNetworkError`
@@ Decisions
- 8. **No retry/backoff in v1** — DEFER.
+ 8. **Retry/backoff in v1** — YES (shared REST+GraphQL reliability policy).
@@ Implementation Detail
+ File 15: `src/gitlab/transport.rs` (NEW) — shared HTTP execution and retry/backoff policy.
```
2. Capability cache for unsupported projects (must-have)
Reasoning: Free tier / older GitLab will repeatedly emit warning noise every sync and waste calls. Cache support status per project and re-probe on TTL.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] On any GraphQL error: logs warning, continues to next project (never fails the sync)
+ [ ] Unsupported capability responses (missing endpoint/type/widget) are cached per project
+ [ ] While cached unsupported, enrichment is skipped without repeated warning spam
+ [ ] Capability cache auto-expires (default 24h) and is re-probed
@@ Migration Numbering
- This feature uses **migration 021**.
+ This feature uses **migrations 021-022**.
@@ Files Changed (Summary)
+ `migrations/022_project_capabilities.sql` | NEW — support cache table for project capabilities
```
3. Delta-first enrichment with periodic full reconcile (must-have)
Reasoning: Full GraphQL scan every sync is expensive for large projects. You already compute issue deltas in ingestion; use that as fast path and keep a periodic full sweep as safety net.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Runs on every sync (not gated by `--full`)
+ [ ] Fast path: skip status enrichment when issue ingestion upserted 0 issues for that project
+ [ ] Safety net: run full reconciliation every `status_full_reconcile_hours` (default 24)
+ [ ] `--full` always forces reconciliation
@@ AC-5: Config Toggle (Unit)
+ [ ] `SyncConfig` has `status_full_reconcile_hours: u32` (default 24)
```
4. Strongly typed widget parsing via `__typename` (must-have)
Reasoning: current “deserialize arbitrary widget JSON into `StatusWidget`” is fragile. Query/type by `__typename` for forward compatibility and fewer silent parse mistakes.
```diff
@@ AC-3: Status Fetcher (Integration)
- [ ] Extracts status from `widgets` array by matching `WorkItemWidgetStatus` fragment
+ [ ] Query includes `widgets { __typename ... }` and parser matches `__typename == "WorkItemWidgetStatus"`
+ [ ] Non-status widgets are ignored deterministically (no heuristic JSON-deserialize attempts)
@@ GraphQL Query
+ widgets {
+ __typename
+ ... on WorkItemWidgetStatus { ... }
+ }
```
5. Set-based transactional DB apply (must-have)
Reasoning: row-by-row clear/update loops will be slow on large projects and hold write locks longer. Temp-table + set-based SQL inside one txn is faster and easier to reason about rollback.
```diff
@@ AC-3: Status Fetcher (Integration)
- `all_fetched_iids: Vec<i64>`
+ `all_fetched_iids: HashSet<i64>`
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] Before applying updates, NULL out status fields ... (loop per IID)
- [ ] UPDATE SQL: `SET status_name=?, ... WHERE project_id=? AND iid=?`
+ [ ] Use temp tables and set-based SQL in one transaction:
+ [ ] (1) clear stale statuses for fetched IIDs absent from status rows
+ [ ] (2) apply status values for fetched IIDs with status
+ [ ] One commit per project; rollback leaves prior state intact
```
6. Fix index strategy for `COLLATE NOCASE` + default sorting (must-have)
Reasoning: your proposed `(project_id, status_name)` index may not fully help `COLLATE NOCASE` + `ORDER BY updated_at`. Tune index to real query shape in `src/cli/commands/list.rs`.
```diff
@@ AC-4: Migration 021 (Unit)
- [ ] Adds compound index `idx_issues_project_status_name(project_id, status_name)` for `--status` filter performance
+ [ ] Adds covering NOCASE-aware index:
+ [ ] `idx_issues_project_status_name_nocase_updated(project_id, status_name COLLATE NOCASE, updated_at DESC)`
+ [ ] Adds category index:
+ [ ] `idx_issues_project_status_category_nocase(project_id, status_category COLLATE NOCASE)`
```
7. Add stable/automation-friendly filters now (high-value feature)
Reasoning: status names are user-customizable and renameable; category is more stable. Also add `--no-status` for quality checks and migration visibility.
```diff
@@ AC-9: List Issues Filter (E2E)
+ [ ] `lore list issues --status-category in_progress` filters by category (case-insensitive)
+ [ ] `lore list issues --no-status` returns only issues where `status_name IS NULL`
+ [ ] `--status` + `--status-category` combine with AND logic
@@ File 9: `src/cli/mod.rs`
+ Add flags: `--status-category`, `--no-status`
@@ File 11: `src/cli/autocorrect.rs`
+ Register `--status-category` and `--no-status` for `issues`
```
8. Better enrichment observability and failure accounting (must-have ops)
Reasoning: only tracking `statuses_enriched` hides skipped/cleared/errors, and auth failures become silent partial data quality issues. Add counters and explicit progress events.
```diff
@@ AC-6: Enrichment in Orchestrator (Integration)
- [ ] `IngestProjectResult` gains `statuses_enriched: usize` counter
- [ ] Progress event: `ProgressEvent::StatusEnrichmentComplete { enriched: usize }`
+ [ ] `IngestProjectResult` gains:
+ [ ] `statuses_enriched`, `statuses_cleared`, `status_enrichment_skipped`, `status_enrichment_failed`
+ [ ] Progress events:
+ [ ] `StatusEnrichmentStarted`, `StatusEnrichmentSkipped`, `StatusEnrichmentComplete`, `StatusEnrichmentFailed`
+ [ ] End-of-sync summary includes per-project enrichment outcome counts
```
9. Add `status_changed_at` for immediately useful workflow analytics (high-value feature)
Reasoning: without change timestamp, you cant answer “how long has this been in progress?” which is one of the most useful agent/human queries.
```diff
@@ AC-4: Migration 021 (Unit)
+ [ ] Adds nullable INTEGER column `status_changed_at` (ms epoch UTC)
@@ AC-6: Enrichment in Orchestrator (Integration)
+ [ ] If status_name/category changes, update `status_changed_at = now_ms()`
+ [ ] If status is cleared, set `status_changed_at = NULL`
@@ AC-9: List Issues Filter (E2E)
+ [ ] `lore list issues --stale-status-days N` filters by `status_changed_at <= now - N days`
```
10. Expand test matrix for real-world failure/perf paths (must-have)
Reasoning: current tests are good, but the highest-risk failures are retry behavior, capability caching, idempotency under repeated runs, and large-project performance.
```diff
@@ TDD Plan — RED Phase
+ 26. `test_graphql_retries_429_with_retry_after_then_succeeds`
+ 27. `test_graphql_retries_503_then_fails_after_max_attempts`
+ 28. `test_capability_cache_skips_unsupported_project_until_ttl_expiry`
+ 29. `test_delta_skip_when_no_issue_upserts`
+ 30. `test_periodic_full_reconcile_runs_after_threshold`
+ 31. `test_set_based_enrichment_scales_10k_issues_without_timeout`
+ 32. `test_enrichment_idempotent_across_two_runs`
+ 33. `test_status_changed_at_updates_only_on_actual_status_change`
```
If you want, I can now produce a single consolidated revised plan document (full rewritten Markdown) with these changes merged in-place so its ready to execute.