docs: add feature ideas catalog, time-decay scoring plan, and timeline issue doc

Ideas catalog (docs/ideas/): 25 feature concept documents covering future
lore capabilities including bottleneck detection, churn analysis, expert
scoring, collaboration patterns, milestone risk, knowledge silos, and more.
Each doc includes motivation, implementation sketch, data requirements, and
dependencies on existing infrastructure. README.md provides an overview and
SYSTEM-PROPOSAL.md presents the unified analytics vision.

Plans (plans/): Time-decay expert scoring design with four rounds of review
feedback exploring decay functions, scoring algebra, and integration points
with the existing who-expert pipeline.

Issue doc (docs/issues/001): Documents the timeline pipeline bug where
EntityRef was missing project context, causing ambiguous cross-project
references during the EXPAND stage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Taylor Eernisse
2026-02-09 10:16:48 -05:00
parent d54f669c5e
commit 4185abe05d
32 changed files with 4170 additions and 0 deletions

@@ -0,0 +1,555 @@
# Project Manager System — Design Proposal
## The Problem
We have a growing backlog of ideas and issues in markdown files. Agents can ship
features in under an hour. The constraint isn't execution speed — it's knowing
WHAT to execute NEXT, in what ORDER, and detecting when the plan needs to change.
We need a system that:
1. Automatically scores and sequences work items
2. Detects when scope changes during spec generation
3. Tracks the full lifecycle: idea → spec → beads → shipped
4. Re-triages instantly when the dependency graph changes
5. Runs in seconds, not minutes
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ docs/ideas/*.md │
│ docs/issues/*.md │
│ (YAML frontmatter) │
└──────────────────────────┬──────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ IDEA TRIAGE SKILL │
│ │
│ Phase 1: INGEST — parse all frontmatter │
│ Phase 2: VALIDATE — check refs, detect staleness │
│ Phase 3: EVALUATE — detect scope changes since last run │
│ Phase 4: SCORE — compute priority with unlock graph │
│ Phase 5: SEQUENCE — topological sort by dependency + score │
│ Phase 6: RECOMMEND — top 3 + unlock advisories + warnings │
└──────────────────────────┬──────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ HUMAN DECIDES │
│ (picks from top 3, takes seconds) │
└──────────────────────────┬──────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ SPEC GENERATION (Claude/GPT) │
│ Takes the idea doc, generates detailed implementation spec │
│ ALSO: re-evaluates frontmatter fields based on deeper │
│ understanding. Updates effort, blocked-by, components. │
│ This is the SCOPE CHANGE DETECTION point. │
└──────────────────────────┬──────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ PLAN-TO-BEADS (existing skill) │
│ Spec → granular beads with dependencies via br CLI │
│ Links bead IDs back into the idea frontmatter │
└──────────────────────────┬──────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ AGENT IMPLEMENTATION │
│ Works beads via br/bv workflow │
│ bv --robot-triage handles execution-phase prioritization │
└──────────────────────────┬──────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ COMPLETION & RE-TRIAGE │
│ Beads close → idea status updates to implemented │
│ Skill re-runs → newly unblocked ideas surface │
│ Loop back to top │
└─────────────────────────────────────────────────────────────┘
```
## The Two Systems and Their Boundary
| Concern | Ideas System (new) | Beads System (existing) |
|---------|-------------------|------------------------|
| Phase | Pre-commitment (what to build) | Execution (how to build) |
| Data | docs/ideas/*.md, docs/issues/*.md | .beads/issues.jsonl |
| Triage | Idea triage skill | bv --robot-triage |
| Tracking | YAML frontmatter | JSONL records |
| Granularity | Feature-level | Task-level |
| Lifecycle | proposed → specced → promoted | open → in_progress → closed |

**The handoff point is promotion.** An idea becomes one or more beads. After that,
the ideas system only tracks the idea's status (promoted/implemented). Beads owns
execution.

An idea file is NEVER deleted. It's a permanent design record. Even after
implementation, it documents WHY the feature was built and what tradeoffs were made.
---
## Data Model
### Frontmatter Schema
```yaml
---
# ── Identity ──
id: idea-009 # stable unique identifier
title: Similar Issues Finder
type: idea # idea | issue
status: proposed # see lifecycle below
# ── Timestamps ──
created: 2026-02-09
updated: 2026-02-09
eval-hash: null # SHA of scoring fields at last triage run
# ── Scoring Inputs ──
impact: high # high | medium | low
effort: small # small | medium | large | xlarge
severity: null # critical | high | medium | low (issues only)
autonomy: full # full | needs-design | needs-human
# ── Dependency Graph ──
blocked-by: [] # IDs of ideas/issues that must complete first
unlocks: # IDs that become possible/better after this ships
- idea-recurring-patterns
requires: [] # external prerequisites (gate names)
related: # soft links, not blocking
- issue-001
# ── Implementation Context ──
components: # source code paths this will touch
- src/search/
- src/embedding/
command: lore similar # proposed CLI command (null for issues)
has-spec: false # detailed spec has been generated
spec-path: null # path to spec doc if it exists
beads: [] # bead IDs after promotion
# ── Classification ──
tags:
- embeddings
- search
---
```
### Status Lifecycle
```
IDEA lifecycle:
  proposed ──→ accepted ──→ specced ──→ promoted ──→ implemented
     │                         │
     └──→ rejected             └──→ (scope changed, back to accepted)
ISSUE lifecycle:
  open ──→ accepted ──→ specced ──→ promoted ──→ resolved
   └──→ wontfix
```
Transitions:
- `proposed → accepted`: Human confirms this is worth building
- `accepted → specced`: Detailed implementation spec has been generated
- `specced → promoted`: Beads created from the spec
- `promoted → implemented`: All beads closed
- Any → `rejected`/`wontfix`: Decided not to build (with reason in body)
- `specced → accepted`: Scope changed during spec, needs re-evaluation
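The transition rules above can be sketched as a small lookup table. This is an illustrative sketch only: `build_transitions` and `can_transition` are hypothetical names, "Any" is read as "any status that is not already finished", and the `specced → accepted` back-edge is assumed to apply to issues as well as ideas.

```python
def build_transitions(path, reject_status):
    """Forward chain, plus the scope-change back-edge and the reject edge."""
    table = {status: set() for status in path}
    for current, nxt in zip(path, path[1:]):
        table[current].add(nxt)
    table["specced"].add("accepted")   # scope changed during spec
    for status in path[:-1]:           # assumption: finished items cannot be rejected
        table[status].add(reject_status)
    table[reject_status] = set()       # rejected/wontfix is terminal
    return table


IDEA = build_transitions(
    ["proposed", "accepted", "specced", "promoted", "implemented"], "rejected"
)
ISSUE = build_transitions(
    ["open", "accepted", "specced", "promoted", "resolved"], "wontfix"
)


def can_transition(item_type, current, new):
    """True if the lifecycle allows moving from `current` to `new`."""
    table = IDEA if item_type == "idea" else ISSUE
    return new in table.get(current, set())
```

A validator could call `can_transition("idea", "proposed", "promoted")` to reject an idea that skipped the spec step.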
### Effort Calibration (Agent-Executed)
| Level | Wall Clock | Autonomy | Example |
|-------|-----------|----------|---------|
| small | ~30 min | Agent ships end-to-end | stale-discussions, closure-gaps |
| medium | ~1 hour | Agent ships end-to-end | similar-issues, digest |
| large | 1-2 hours | May need one design decision | recurring-patterns, experts |
| xlarge | 2+ hours | Needs human architecture input | project groups |
### Gates Registry (docs/gates.yaml)
```yaml
gates:
  gate-1:
    title: Resource Events Ingestion
    status: complete
    completed: 2025-12-15
  gate-2:
    title: Cross-References & Entity Graph
    status: complete
    completed: 2026-01-10
  gate-3:
    title: Timeline Pipeline
    status: complete
    completed: 2026-01-25
  gate-4:
    title: MR File Changes Ingestion
    status: partial
    notes: Schema ready (migration 016), ingestion code exists but untested
    tracks: mr_file_changes table population
  gate-5:
    title: Code Trace (file:line → commit → MR → issue)
    status: not-started
    blocked-by: gate-4
    notes: Requires git log parsing + commit SHA matching
```
The skill reads this file to determine which `requires` entries are satisfied.
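As a rough sketch of that lookup, assuming the registry has already been parsed into a dictionary and that only `status: complete` satisfies a requirement (so `partial` does not), the check might look like this. `unmet_requires` is a hypothetical helper name.

```python
# A parsed subset of docs/gates.yaml, reduced to the fields the check needs.
GATES = {
    "gate-1": {"status": "complete"},
    "gate-4": {"status": "partial"},
    "gate-5": {"status": "not-started", "blocked-by": "gate-4"},
}


def unmet_requires(requires, gates):
    """Return the gate names in `requires` that are not complete.

    Unknown gate names are treated as unmet here; validation would
    report them separately as errors.
    """
    return [g for g in requires if gates.get(g, {}).get("status") != "complete"]
```

An item is ready with respect to gates only when `unmet_requires(item_requires, GATES)` returns an empty list.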
---
## Scoring Algorithm
### Priority Score
```
For ideas:
  base      = impact_weight                  # high=3, medium=2, low=1
  unlock    = 1 + (0.5 × count_of_unlocks)   # items this directly enables
  readiness = 0 if blocked, 1 if ready
  priority  = base × unlock × readiness
For issues:
  base      = severity_weight × 1.5          # critical=6, high=4.5, medium=3, low=1.5
  unlock    = 1 + (0.5 × count_of_unlocks)   # (bugs rarely unlock, but can)
  readiness = 0 if blocked, 1 if ready
  priority  = base × unlock × readiness
Tiebreak (among equal priority):
  1. Prefer smaller effort (ships faster, starts next cycle sooner)
  2. Prefer autonomy:full over needs-design over needs-human
  3. Prefer older items (FIFO within same score)
```
### Why This Works
- High-impact items that unlock other items float to the top
- Blocked items score 0 regardless of impact (can't be worked)
- Effort is a tiebreaker, not a primary factor (since execution is fast)
- Issues with severity get a 1.5× multiplier (bugs degrade existing value)
- Unlock multiplier captures the "do Gate 4 first" insight automatically
### Example Rankings
| Item | Impact | Unlocks | Readiness | Score |
|------|--------|---------|-----------|-------|
| project-ergonomics | high(3) | 10 | ready(1) | 3 × 6.0 = 18.0 |
| gate-4-completion | med(2) | 5 | ready(1) | 2 × 3.5 = 7.0 |
| similar-issues | high(3) | 1 | ready(1) | 3 × 1.5 = 4.5 |
| stale-discussions | high(3) | 0 | ready(1) | 3 × 1.0 = 3.0 |
| hotspots | high(3) | 1 | blocked(0) | 0.0 |
Project-ergonomics dominates because it unlocks 10 downstream items. This is the
correct recommendation — it's the highest-leverage work even though "stale-discussions"
is simpler.
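The formula and the example rankings can be checked with a few lines of Python. This is a sketch, not the skill's implementation: `priority` is a hypothetical function name, and the severity weights (4/3/2/1) are inferred by dividing the stated issue bases (6, 4.5, 3, 1.5) by the 1.5 multiplier.

```python
IMPACT_WEIGHT = {"high": 3, "medium": 2, "low": 1}
# Inferred from "critical=6, high=4.5, medium=3, low=1.5" after the 1.5x multiplier.
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}


def priority(item_type, level, unlock_count, blocked):
    """Priority = base x unlock multiplier x readiness."""
    if item_type == "issue":
        base = SEVERITY_WEIGHT[level] * 1.5
    else:
        base = IMPACT_WEIGHT[level]
    unlock = 1 + 0.5 * unlock_count
    readiness = 0 if blocked else 1
    return base * unlock * readiness


# Rows from the example rankings table: (name, impact, unlocks, blocked).
EXAMPLES = [
    ("project-ergonomics", "high", 10, False),
    ("gate-4-completion", "medium", 5, False),
    ("similar-issues", "high", 1, False),
    ("stale-discussions", "high", 0, False),
    ("hotspots", "high", 1, True),
]

for name, impact, unlocks, blocked in EXAMPLES:
    print(f"{name:20s} {priority('idea', impact, unlocks, blocked):5.1f}")
```

Because readiness is a hard 0/1 multiplier, a blocked item always scores zero no matter how many items it would unlock.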
---
## Scope Change Detection
This is the hardest problem. An idea's scope can change in three ways:
### 1. During Spec Generation (Primary Detection Point)
When Claude/GPT generates a detailed implementation spec from an idea doc, it
understands the idea more deeply than the original sketch. The spec process should
be instructed to:
- Re-evaluate effort (now that implementation is understood in detail)
- Discover new dependencies (need to change schema first, need a new config option)
- Identify component changes (touches more modules than originally thought)
- Assess impact more accurately (this is actually higher/lower value than estimated)
**Mechanism:** The spec generation prompt includes an explicit "re-evaluate frontmatter"
step. The spec output includes an updated frontmatter block. If scoring-relevant
fields changed, the skill flags it:
```
SCOPE CHANGE DETECTED:
idea-009 (Similar Issues Finder)
- effort: small → medium (needs embedding aggregation strategy)
- blocked-by: [] → [gate-embeddings-populated]
- components: +src/cli/commands/similar.rs (new file)
Previous score: 4.5 → New score: 0.0 (blocked until gate-embeddings-populated resolves)
Recommendation: No longer actionable; re-enters top-3 contention once unblocked.
```
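A field-level diff like the one in that report could be produced as follows. This is a minimal sketch that assumes both frontmatter versions are available as dictionaries; `scope_diff` and the default field list are illustrative names, not part of an existing tool.

```python
SCORING_FIELDS = ("impact", "effort", "blocked-by", "unlocks", "requires")


def scope_diff(before, after, fields=SCORING_FIELDS):
    """List 'field: old → new' for every tracked field whose value changed."""
    changes = []
    for field in fields:
        old, new = before.get(field), after.get(field)
        if old != new:
            changes.append(f"{field}: {old} → {new}")
    return changes
```

Passing a longer `fields` tuple (for example one that includes `components`) would extend the report to non-scoring fields.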
### 2. During Implementation (Discovered Complexity)
An agent working on beads may discover the spec was wrong:
- "This requires a database migration I didn't anticipate"
- "This module doesn't expose the API I need"
**Mechanism:** When a bead is blocked or takes significantly longer than estimated,
the agent should update the idea's frontmatter. The skill detects the change on
next triage run via eval-hash comparison.
### 3. External Changes (Gate Completion, New Ideas)
When a gate completes or a new idea is added that changes the dependency graph:
- Gate 4 completes → 5 ideas become unblocked
- New idea added that's higher priority than current top-3
- Two ideas discovered to be duplicates
**Mechanism:** The skill detects these automatically by re-computing the full graph
on every run. The eval-hash tracks what the scoring fields looked like last time;
if they haven't changed but the SCORE changed (because a dependency was resolved),
the skill flags it as "newly unblocked."
### The eval-hash Field
```yaml
eval-hash: "a1b2c3d4" # SHA-256 of: impact + effort + blocked-by + unlocks + requires
```
Computed by hashing the concatenation of all scoring-relevant fields. When the skill
runs, it compares:
- If eval-hash matches AND score is same → no change, skip
- If eval-hash matches BUT score changed → external change (dependency resolved)
- If eval-hash differs → item was modified, re-evaluate
This avoids re-announcing unchanged items on every run.
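One way to compute and compare the hash is sketched below. Several details are assumptions: the fields are serialized as canonical JSON rather than plain concatenation (so that list boundaries are unambiguous), the digest is truncated to 8 hex characters to match the example value, and the previous score is supplied by the caller because the proposal does not say where it is stored.

```python
import hashlib
import json

SCORING_FIELDS = ("impact", "effort", "blocked-by", "unlocks", "requires")


def eval_hash(frontmatter):
    """Short SHA-256 digest over the scoring-relevant fields only."""
    payload = {field: frontmatter.get(field) for field in SCORING_FIELDS}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]


def classify(stored_hash, current_hash, previous_score, current_score):
    """Apply the three comparison rules listed above."""
    if stored_hash != current_hash:
        return "SCOPE_CHANGED"      # item was modified, re-evaluate
    if previous_score != current_score:
        return "NEWLY_UNBLOCKED"    # external change (dependency resolved)
    return "UNCHANGED"              # skip
```

With this sketch, an item whose stored `eval-hash` is still `null` is reported as changed on its first triage run.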
---
## Skill Design
### Location
`.claude/skills/idea-triage/SKILL.md` (project-local)
### Trigger Phrases
- "triage ideas" / "what should I build next?"
- "idea triage" / "prioritize ideas"
- "what's the highest value work?"
- `/idea-triage`
### Workflow Phases
**Phase 1: INGEST**
- Glob docs/ideas/*.md and docs/issues/*.md
- Parse YAML frontmatter from each file
- Read docs/gates.yaml for capability status
- Collect: id, title, type, status, impact, effort, severity, autonomy,
blocked-by, unlocks, requires, has-spec, beads, eval-hash
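The skill is a prompt, so no script is required for this phase. Purely as an illustration of what "parse YAML frontmatter" involves, the sketch below separates the frontmatter block from the body; `split_frontmatter` is a hypothetical helper, and the returned text would still need a YAML parser (for example PyYAML's `yaml.safe_load`) to become a dictionary.

```python
def split_frontmatter(text):
    """Return (frontmatter_text, body).

    frontmatter_text is None when the file does not open with a
    '---' delimited block, which triage would report as a warning.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    return None, text  # opening delimiter without a closing one
```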
**Phase 2: VALIDATE**
- Required fields present (id, title, type, status, impact, effort)
- All blocked-by IDs reference existing files
- All unlocks IDs reference existing files
- All requires entries exist in gates.yaml
- No dependency cycles (blocked-by graph is a DAG)
- Status transitions are valid (no "proposed" with beads linked)
- Output: list of validation errors/warnings
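The reference and cycle checks can be sketched with a depth-first search over the blocked-by graph. This assumes the graph is available as a dictionary from item ID to its list of blockers; `validate_graph` is a hypothetical name.

```python
def validate_graph(blocked_by):
    """Report dangling blocked-by references and dependency cycles."""
    errors = []
    for item, deps in blocked_by.items():
        for dep in deps:
            if dep not in blocked_by:
                errors.append(f"{item}: blocked-by references unknown id {dep!r}")

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {item: WHITE for item in blocked_by}

    def visit(node, path):
        colour[node] = GREY
        for dep in blocked_by[node]:
            if dep not in colour:
                continue  # dangling reference, already reported
            if colour[dep] == GREY:
                cycle = path[path.index(dep):] + [dep]
                errors.append("cycle: " + " -> ".join(cycle))
            elif colour[dep] == WHITE:
                visit(dep, path + [dep])
        colour[node] = BLACK

    for item in blocked_by:
        if colour[item] == WHITE:
            visit(item, [item])
    return errors
```

An empty result means every reference resolves and the blocked-by graph is a DAG.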
**Phase 3: EVALUATE (Scope Change Detection)**
- For each item, compute current eval-hash from scoring fields
- Compare against stored eval-hash in frontmatter
- If different: flag as SCOPE_CHANGED with field-level diff
- If same but score changed (due to external dep resolution): flag as NEWLY_UNBLOCKED
- If status is specced but has-spec is false: flag as INCONSISTENT
**Phase 4: SCORE**
- Resolve requires against gates.yaml (is the gate complete?)
- Resolve blocked-by against other items (is the blocker done?)
- Compute readiness: 0 if any hard blocker is unresolved, 1 otherwise
- Compute unlock count: count items whose blocked-by includes this ID
- Apply scoring formula:
- Ideas: impact_weight × (1 + 0.5 × unlock_count) × readiness
- Issues: severity_weight × 1.5 × (1 + 0.5 × unlock_count) × readiness
- Apply tiebreak: effort_weight, autonomy, created date
**Phase 5: SEQUENCE**
- Separate into: actionable (score > 0) vs blocked (score = 0)
- Among actionable: sort by score descending with tiebreak
- Among blocked: sort by "what-if score" (score if blockers were resolved)
- Compute unlock advisories: "completing X unblocks Y items worth Z total score"
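The sequencing rules might be expressed as a composite sort key. In this sketch each item is assumed to be a dictionary that already carries `score`, a precomputed `what_if` score, and a `blocked_by` list; those key names, and the helper names, are illustrative rather than part of the schema.

```python
EFFORT_ORDER = {"small": 0, "medium": 1, "large": 2, "xlarge": 3}
AUTONOMY_ORDER = {"full": 0, "needs-design": 1, "needs-human": 2}


def _sort_key(score_field):
    def key(item):
        return (
            -item[score_field],                # higher score first
            EFFORT_ORDER[item["effort"]],      # then smaller effort
            AUTONOMY_ORDER[item["autonomy"]],  # then more autonomous
            item["created"],                   # then older (ISO dates sort as text)
        )
    return key


def sequence(items):
    """Split into actionable and blocked lists, each in recommendation order."""
    actionable = sorted((i for i in items if i["score"] > 0), key=_sort_key("score"))
    blocked = sorted((i for i in items if i["score"] == 0), key=_sort_key("what_if"))
    return actionable, blocked


def unlock_advisory(blocker_id, items):
    """Count the items waiting on `blocker_id` and sum their what-if scores."""
    waiting = [i for i in items if blocker_id in i.get("blocked_by", [])]
    return len(waiting), sum(i["what_if"] for i in waiting)
```

Note that `unlock_advisory` counts every item that lists the blocker; an item with additional unresolved blockers would still stay blocked after it completes.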
**Phase 6: RECOMMEND**
Output structured report:
```
== IDEA TRIAGE ==
Run: 2026-02-09T14:30:00Z
Items: 22 (18 proposed, 2 accepted, 1 specced, 1 implemented)

RECOMMENDED SEQUENCE:
  1. [idea-project-ergonomics] Multi-Project Ergonomics
     impact:high effort:medium autonomy:full score:18.0
     WHY FIRST: Unlocks 10 downstream ideas. Highest leverage.
     COMPONENTS: src/core/config.rs, src/core/project.rs, src/cli/
  2. [idea-009] Similar Issues Finder
     impact:high effort:small autonomy:full score:4.5
     WHY NEXT: Highest standalone impact. Ships in ~30 min.
     UNLOCKS: idea-recurring-patterns
  3. [idea-004] Stale Discussion Finder
     impact:high effort:small autonomy:full score:3.0
     WHY NEXT: Quick win, no dependencies, immediate user value.

BLOCKED (would rank high if unblocked):
  idea-014 File Hotspots score-if-unblocked:4.5 BLOCKED BY: gate-4
  idea-021 Knowledge Silos score-if-unblocked:3.0 BLOCKED BY: gate-4

UNLOCK ADVISORY: Completing gate-4 unblocks 5 items (combined: 15.0)

SCOPE CHANGES DETECTED:
  idea-009: effort changed small→medium (eval-hash mismatch)
  idea-017: now has spec (has-spec flipped to true)

NEWLY UNBLOCKED:
  (none this run)

WARNINGS:
  idea-016: status=proposed, unchanged for 30+ days
  idea-008: blocked-by references "idea-gate4" which doesn't exist (typo?)

HEALTH:
  Proposed: 18 | Accepted: 2 | Specced: 1 | Promoted: 0 | Implemented: 1
  Blocked: 6 | Actionable: 16
  Backlog runway at ~5/day: ~3 days
```
### What the Skill Does NOT Do
- **Never modifies files.** Read-only triage. The agent or human updates frontmatter.
Exception: the skill CAN update eval-hash after a triage run (opt-in).
- **Never creates beads.** That's plan-to-beads skill territory.
- **Never replaces bv.** Once work is in beads, bv --robot-triage handles execution
prioritization. This skill owns pre-commitment only.
- **Never generates specs.** That's a separate step with Claude/GPT.
---
## Integration Points
### With Spec Generation
The spec generation prompt (separate from this skill) should include:
```
After generating the implementation spec, re-evaluate the idea's frontmatter:
1. Is the effort estimate still accurate? (small/medium/large/xlarge)
2. Did you discover new dependencies? (add to blocked-by)
3. Are there components not listed? (add to components)
4. Has the impact assessment changed?
5. Can an agent ship this autonomously? (autonomy: full/needs-design/needs-human)
Output an UPDATED frontmatter block at the end of the spec.
If any scoring field changed, explain what changed and why.
```
### With plan-to-beads
When promoting an idea to beads:
1. Run plan-to-beads on the spec
2. Capture the created bead IDs
3. Update the idea's frontmatter: status → promoted, beads → [bd-xxx, bd-yyy]
4. Run br sync --flush-only && git add .beads/
### With bv --robot-triage
These systems don't talk to each other directly. The boundary is:
- Idea triage skill → "build idea-009 next"
- Human/agent generates spec → plan-to-beads → beads created
- bv --robot-triage → "work on bd-xxx next"
- Beads close → human/agent updates idea frontmatter → idea triage re-runs
### With New Item Ingestion
When someone adds a new file to docs/ideas/ or docs/issues/:
- If it has valid frontmatter: picked up automatically on next triage run
- If it has no/invalid frontmatter: flagged in WARNINGS section
- Skill can suggest default frontmatter based on content analysis
---
## Failure Modes and Mitigations
### 1. Frontmatter Rot
**Risk:** Fields don't get updated. Status says "proposed" but it's actually shipped.
**Mitigation:** Cross-reference with beads. If an idea has beads and all beads are
closed, flag that the idea should be "implemented" even if frontmatter says otherwise.
The skill detects this inconsistency.
### 2. Score Gaming
**Risk:** Someone inflates impact or unlocks count to make their idea rank higher.
**Mitigation:** Unlocks are verified — the skill checks that the referenced items
actually have this idea in their blocked-by. Impact is subjective but reviewed during
spec generation (second opinion from a different model/session).
### 3. Stale Gates Registry
**Risk:** gate-4 is actually complete but gates.yaml wasn't updated.
**Mitigation:** Skill warns when a gate has been "partial" for a long time. Could
also probe the codebase (check if mr_file_changes ingestion code exists and has tests).
### 4. Circular Dependencies
**Risk:** A blocks B blocks A.
**Mitigation:** Phase 2 validation explicitly checks for cycles in the blocked-by
graph and reports them as errors.
### 5. Unlock Count Inflation
**Risk:** An item claims to unlock 20 things, making it score astronomically.
**Mitigation:** Unlock count is VERIFIED by checking reverse blocked-by references.
If idea-X says it unlocks idea-Y, but idea-Y's blocked-by doesn't include idea-X,
the claim is discounted. Both explicit unlocks and reverse blocked-by contribute to
the count, but unverified claims are flagged.
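The verification step could look like the sketch below. It assumes that only unlocks confirmed by a reverse blocked-by entry count toward the score and that claims without one are merely reported; `verified_unlocks` is a hypothetical name.

```python
def verified_unlocks(item_id, claimed, blocked_by):
    """Return (verified, unverified) unlock targets for `item_id`.

    `claimed` is the item's own unlocks list; `blocked_by` maps every
    item id to the ids it lists as blockers.
    """
    reverse = {other for other, deps in blocked_by.items() if item_id in deps}
    verified = sorted(reverse)                 # confirmed by the blocked item itself
    unverified = sorted(set(claimed) - reverse)  # claimed, but not reciprocated
    return verified, unverified
```

The unlock count used for scoring would then be `len(verified)`, with `unverified` surfaced in the WARNINGS section.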
### 6. Scope Creep During Spec
**Risk:** Spec generation reveals the idea is actually 5× harder than estimated.
Its ranking drops, but the human has already mentally committed.
**Mitigation:** The scope change detection makes this VISIBLE. The triage output
explicitly shows the change (e.g. "effort changed small→xlarge") and the item's new
position. Under the current formula effort is only a tiebreaker, so the score itself
moves only if impact, unlocks, or blockers were also revised.
Human can then decide: proceed anyway, or switch to a different top-3 pick.
### 7. Orphaned Ideas
**Risk:** Ideas get promoted to beads, beads get implemented, but the idea file
never gets updated. It sits in "promoted" forever.
**Mitigation:** Skill checks: for each idea with status=promoted, look up the
linked beads. If all beads are closed, flag: "idea-009 appears complete, update
status to implemented."
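A sketch of that check follows. The layout of `.beads/issues.jsonl` is not described here, so the sketch assumes bead statuses have already been loaded into a dictionary keyed by bead ID; `completion_flags` is a hypothetical name.

```python
def completion_flags(ideas, bead_status):
    """Flag promoted ideas whose linked beads are all closed."""
    flags = []
    for idea in ideas:
        beads = idea.get("beads", [])
        all_closed = bool(beads) and all(
            bead_status.get(bead) == "closed" for bead in beads
        )
        if idea["status"] == "promoted" and all_closed:
            flags.append(
                f"{idea['id']} appears complete, update status to implemented"
            )
    return flags
```

An idea with an empty `beads` list is never flagged, since there is nothing to confirm completion against.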
---
## Implementation Plan
### Step 1: Create the Frontmatter Schema (this doc → applied to all files)
- Define the exact YAML schema (above)
- Create docs/gates.yaml
- Apply frontmatter to all 22 existing files in docs/ideas/ and docs/issues/
### Step 2: Build the Skill
- Create .claude/skills/idea-triage/SKILL.md
- Implement all 6 phases in the skill prompt
- The skill uses Glob, Read, and text processing — no external scripts needed
(25 files is small enough for Claude to process directly)
### Step 3: Test the System
- Run the skill against current files
- Verify scoring matches manual expectations
- Check that project-ergonomics ranks #1 (it should, due to unlock count)
- Verify blocked items score 0
- Check validation catches intentional errors
### Step 4: Run One Full Cycle
- Pick the top recommendation
- Generate a spec (separate session)
- Verify scope change detection works (spec should update frontmatter)
- Promote to beads via plan-to-beads
- Implement
- Verify completion detection works
### Step 5: Iterate
- Run triage again after implementation
- Verify newly unblocked items surface
- Adjust scoring weights if rankings feel wrong
- Add new ideas as they emerge