# Project Manager System: Design Proposal

## The Problem

We have a growing backlog of ideas and issues in markdown files. Agents can ship features in under an hour. The constraint isn't execution speed; it's knowing WHAT to execute NEXT, in what ORDER, and detecting when the plan needs to change.

We need a system that:
1. Automatically scores and sequences work items
2. Detects when scope changes during spec generation
3. Tracks the full lifecycle: idea → spec → beads → shipped
4. Re-triages instantly when the dependency graph changes
5. Runs in seconds, not minutes

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                       docs/ideas/*.md                       │
│                      docs/issues/*.md                       │
│                     (YAML frontmatter)                      │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                      IDEA TRIAGE SKILL                      │
│                                                             │
│  Phase 1: INGEST - parse all frontmatter                    │
│  Phase 2: VALIDATE - check refs, detect staleness           │
│  Phase 3: EVALUATE - detect scope changes since last run    │
│  Phase 4: SCORE - compute priority with unlock graph        │
│  Phase 5: SEQUENCE - topological sort by dependency + score │
│  Phase 6: RECOMMEND - top 3 + unlock advisories + warnings  │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                        HUMAN DECIDES                        │
│              (picks from top 3, takes seconds)              │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                SPEC GENERATION (Claude/GPT)                 │
│  Takes the idea doc, generates detailed implementation spec │
│  ALSO: re-evaluates frontmatter fields based on deeper      │
│  understanding. Updates effort, blocked-by, components.     │
│  This is the SCOPE CHANGE DETECTION point.                  │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│               PLAN-TO-BEADS (existing skill)                │
│  Spec → granular beads with dependencies via br CLI         │
│  Links bead IDs back into the idea frontmatter              │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                    AGENT IMPLEMENTATION                     │
│  Works beads via br/bv workflow                             │
│  bv --robot-triage handles execution-phase prioritization   │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                   COMPLETION & RE-TRIAGE                    │
│  Beads close → idea status updates to implemented           │
│  Skill re-runs → newly unblocked ideas surface              │
│  Loop back to top                                           │
└─────────────────────────────────────────────────────────────┘
```

## The Two Systems and Their Boundary

| Concern | Ideas System (new) | Beads System (existing) |
|---------|-------------------|------------------------|
| Phase | Pre-commitment (what to build) | Execution (how to build) |
| Data | docs/ideas/*.md, docs/issues/*.md | .beads/issues.jsonl |
| Triage | Idea triage skill | bv --robot-triage |
| Tracking | YAML frontmatter | JSONL records |
| Granularity | Feature-level | Task-level |
| Lifecycle | proposed → specced → promoted | open → in_progress → closed |

**The handoff point is promotion.** An idea becomes one or more beads. After that, the ideas system only tracks the idea's status (promoted/implemented). Beads owns execution.

An idea file is NEVER deleted. It's a permanent design record. Even after implementation, it documents WHY the feature was built and what tradeoffs were made.
---

## Data Model

### Frontmatter Schema

```yaml
---
# ── Identity ──
id: idea-009  # stable unique identifier
title: Similar Issues Finder
type: idea  # idea | issue
status: proposed  # see lifecycle below

# ── Timestamps ──
created: 2026-02-09
updated: 2026-02-09
eval-hash: null  # SHA of scoring fields at last triage run

# ── Scoring Inputs ──
impact: high  # high | medium | low
effort: small  # small | medium | large | xlarge
severity: null  # critical | high | medium | low (issues only)
autonomy: full  # full | needs-design | needs-human

# ── Dependency Graph ──
blocked-by: []  # IDs of ideas/issues that must complete first
unlocks:  # IDs that become possible/better after this ships
  - idea-recurring-patterns
requires: []  # external prerequisites (gate names)
related:  # soft links, not blocking
  - issue-001

# ── Implementation Context ──
components:  # source code paths this will touch
  - src/search/
  - src/embedding/
command: lore similar  # proposed CLI command (null for issues)
has-spec: false  # detailed spec has been generated
spec-path: null  # path to spec doc if it exists
beads: []  # bead IDs after promotion

# ── Classification ──
tags:
  - embeddings
  - search
---
```

### Status Lifecycle

```
IDEA lifecycle:
  proposed ──→ accepted ──→ specced ──→ promoted ──→ implemented
     │                        │
     └──→ rejected            └──→ (scope changed, back to accepted)

ISSUE lifecycle:
  open ──→ accepted ──→ specced ──→ promoted ──→ resolved
    │
    └──→ wontfix
```

Transitions:
- `proposed → accepted`: Human confirms this is worth building
- `accepted → specced`: Detailed implementation spec has been generated
- `specced → promoted`: Beads created from the spec
- `promoted → implemented`: All beads closed
- Any → `rejected`/`wontfix`: Decided not to build (with reason in body)
- `specced → accepted`: Scope changed during spec, needs re-evaluation

### Effort Calibration (Agent-Executed)

| Level | Wall Clock | Autonomy | Example |
|-------|-----------|----------|---------|
| small | ~30 min | Agent ships end-to-end | stale-discussions, closure-gaps |
| medium | ~1 hour | Agent ships end-to-end | similar-issues, digest |
| large | 1-2 hours | May need one design decision | recurring-patterns, experts |
| xlarge | 2+ hours | Needs human architecture input | project groups |

### Gates Registry (docs/gates.yaml)

```yaml
gates:
  gate-1:
    title: Resource Events Ingestion
    status: complete
    completed: 2025-12-15

  gate-2:
    title: Cross-References & Entity Graph
    status: complete
    completed: 2026-01-10

  gate-3:
    title: Timeline Pipeline
    status: complete
    completed: 2026-01-25

  gate-4:
    title: MR File Changes Ingestion
    status: partial
    notes: Schema ready (migration 016), ingestion code exists but untested
    tracks: mr_file_changes table population

  gate-5:
    title: Code Trace (file:line → commit → MR → issue)
    status: not-started
    blocked-by: gate-4
    notes: Requires git log parsing + commit SHA matching
```

The skill reads this file to determine which `requires` entries are satisfied.

---

## Scoring Algorithm

### Priority Score

```
For ideas:
  base      = impact_weight                  # high=3, medium=2, low=1
  unlock    = 1 + (0.5 × count_of_unlocks)   # items this directly enables
  readiness = 0 if blocked, 1 if ready
  priority  = base × unlock × readiness

For issues:
  base      = severity_weight × 1.5          # critical=6, high=4.5, medium=3, low=1.5
  unlock    = 1 + (0.5 × count_of_unlocks)   # (bugs rarely unlock, but can)
  readiness = 0 if blocked, 1 if ready
  priority  = base × unlock × readiness

Tiebreak (among equal priority):
  1. Prefer smaller effort (ships faster, starts next cycle sooner)
  2. Prefer autonomy:full over needs-design over needs-human
  3. Prefer older items (FIFO within same score)
```

### Why This Works

- High-impact items that unlock other items float to the top
- Blocked items score 0 regardless of impact (can't be worked)
- Effort is a tiebreaker, not a primary factor (since execution is fast)
- Issues with severity get a 1.5× multiplier (bugs degrade existing value)
- Unlock multiplier captures the "do Gate 4 first" insight automatically

### Example Rankings

| Item | Impact | Unlocks | Readiness | Score |
|------|--------|---------|-----------|-------|
| project-ergonomics | high(3) | 10 | ready(1) | 3 × 6.0 = 18.0 |
| gate-4-completion | med(2) | 5 | ready(1) | 2 × 3.5 = 7.0 |
| similar-issues | high(3) | 1 | ready(1) | 3 × 1.5 = 4.5 |
| stale-discussions | high(3) | 0 | ready(1) | 3 × 1.0 = 3.0 |
| hotspots | high(3) | 1 | blocked(0) | 0.0 |

Project-ergonomics dominates because it unlocks 10 downstream items. This is the correct recommendation: it's the highest-leverage work even though "stale-discussions" is simpler.

---

## Scope Change Detection

This is the hardest problem. An idea's scope can change in three ways:

### 1. During Spec Generation (Primary Detection Point)

When Claude/GPT generates a detailed implementation spec from an idea doc, it understands the idea more deeply than the original sketch. The spec process should be instructed to:

- Re-evaluate effort (now that implementation is understood in detail)
- Discover new dependencies (need to change schema first, need a new config option)
- Identify component changes (touches more modules than originally thought)
- Assess impact more accurately (this is actually higher/lower value than estimated)

**Mechanism:** The spec generation prompt includes an explicit "re-evaluate frontmatter" step. The spec output includes an updated frontmatter block.
If scoring-relevant fields changed, the skill flags it:

```
SCOPE CHANGE DETECTED: idea-009 (Similar Issues Finder)
  - effort: small → medium (needs embedding aggregation strategy)
  - blocked-by: [] → [gate-embeddings-populated]
  - components: +src/cli/commands/similar.rs (new file)
  Previous score: 4.5 → New score: 3.0
  Recommendation: Still top-3, but sequencing may change.
```

### 2. During Implementation (Discovered Complexity)

An agent working on beads may discover the spec was wrong:
- "This requires a database migration I didn't anticipate"
- "This module doesn't expose the API I need"

**Mechanism:** When a bead is blocked or takes significantly longer than estimated, the agent should update the idea's frontmatter. The skill detects the change on next triage run via eval-hash comparison.

### 3. External Changes (Gate Completion, New Ideas)

When a gate completes or a new idea is added that changes the dependency graph:
- Gate 4 completes → 5 ideas become unblocked
- New idea added that's higher priority than current top-3
- Two ideas discovered to be duplicates

**Mechanism:** The skill detects these automatically by re-computing the full graph on every run. The eval-hash tracks what the scoring fields looked like last time; if they haven't changed but the SCORE changed (because a dependency was resolved), the skill flags it as "newly unblocked."

### The eval-hash Field

```yaml
eval-hash: "a1b2c3d4"  # SHA-256 of: impact + effort + blocked-by + unlocks + requires
```

Computed by hashing the concatenation of all scoring-relevant fields. When the skill runs, it compares:
- If eval-hash matches AND score is same → no change, skip
- If eval-hash matches BUT score changed → external change (dependency resolved)
- If eval-hash differs → item was modified, re-evaluate

This avoids re-announcing unchanged items on every run.
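
The comparison rules above can be sketched in Python. This is an illustration under stated assumptions, not the skill's implementation (the skill itself is a prompt). The names `compute_eval_hash` and `classify_change`, the JSON canonicalization, the 8-character truncation (inferred from the `"a1b2c3d4"` example), the `NEW` label for a null `eval-hash`, and the availability of the previous run's score are all hypothetical. The default field list follows the comment in the YAML above; `severity` is not in that list even though it drives issue scores, so the list is left as a parameter.

```python
import hashlib
import json

# Default field list copied from the eval-hash comment in the proposal.
SCORING_FIELDS = ("impact", "effort", "blocked-by", "unlocks", "requires")


def compute_eval_hash(frontmatter, fields=SCORING_FIELDS, length=8):
    """Hash the scoring-relevant fields in a canonical form (assumed format)."""
    canonical = {}
    for name in fields:
        value = frontmatter.get(name)
        # Sort list fields so that reordering IDs is not reported as a scope change.
        canonical[name] = sorted(value) if isinstance(value, list) else value
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return digest[:length]  # truncation length is an assumption


def classify_change(stored_hash, current_hash, previous_score, current_score):
    """Apply the three comparison rules, plus an assumed NEW case for null hashes."""
    if stored_hash is None:
        return "NEW"  # assumption: never triaged before
    if stored_hash != current_hash:
        return "SCOPE_CHANGED"  # item was modified, re-evaluate
    if previous_score != current_score:
        return "NEWLY_UNBLOCKED"  # external change, e.g. a dependency resolved
    return "UNCHANGED"  # skip, nothing to announce


if __name__ == "__main__":
    idea = {
        "impact": "high",
        "effort": "small",
        "blocked-by": [],
        "unlocks": ["idea-recurring-patterns"],
        "requires": [],
    }
    before = compute_eval_hash(idea)
    after = compute_eval_hash({**idea, "effort": "medium"})
    print(classify_change(before, after, 4.5, 4.5))
```

Hashing a canonical JSON form rather than a raw string concatenation is a design choice of this sketch: it keeps `["a", "b"]` and `["b", "a"]` from producing different hashes and avoids ambiguity between adjacent fields.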
---

## Skill Design

### Location

`.claude/skills/idea-triage/SKILL.md` (project-local)

### Trigger Phrases

- "triage ideas" / "what should I build next?"
- "idea triage" / "prioritize ideas"
- "what's the highest value work?"
- `/idea-triage`

### Workflow Phases

**Phase 1: INGEST**
- Glob docs/ideas/*.md and docs/issues/*.md
- Parse YAML frontmatter from each file
- Read docs/gates.yaml for capability status
- Collect: id, title, type, status, impact, effort, severity, autonomy, blocked-by, unlocks, requires, has-spec, beads, eval-hash

**Phase 2: VALIDATE**
- Required fields present (id, title, type, status, impact, effort)
- All blocked-by IDs reference existing files
- All unlocks IDs reference existing files
- All requires entries exist in gates.yaml
- No dependency cycles (blocked-by graph is a DAG)
- Status transitions are valid (no "proposed" with beads linked)
- Output: list of validation errors/warnings

**Phase 3: EVALUATE (Scope Change Detection)**
- For each item, compute current eval-hash from scoring fields
- Compare against stored eval-hash in frontmatter
- If different: flag as SCOPE_CHANGED with field-level diff
- If same but score changed (due to external dep resolution): flag as NEWLY_UNBLOCKED
- If status is specced but has-spec is false: flag as INCONSISTENT

**Phase 4: SCORE**
- Resolve requires against gates.yaml (is the gate complete?)
- Resolve blocked-by against other items (is the blocker done?)
- Compute readiness: 0 if any hard blocker is unresolved, 1 otherwise
- Compute unlock count: count items whose blocked-by includes this ID
- Apply scoring formula:
  - Ideas: impact_weight × (1 + 0.5 × unlock_count) × readiness
  - Issues: severity_weight × 1.5 × (1 + 0.5 × unlock_count) × readiness
- Apply tiebreak: effort_weight, autonomy, created date

**Phase 5: SEQUENCE**
- Separate into: actionable (score > 0) vs blocked (score = 0)
- Among actionable: sort by score descending with tiebreak
- Among blocked: sort by "what-if score" (score if blockers were resolved)
- Compute unlock advisories: "completing X unblocks Y items worth Z total score"

**Phase 6: RECOMMEND**

Output structured report:

```
== IDEA TRIAGE ==
Run: 2026-02-09T14:30:00Z
Items: 22 (18 proposed, 2 accepted, 1 specced, 1 implemented)

RECOMMENDED SEQUENCE:

1. [idea-project-ergonomics] Multi-Project Ergonomics
   impact:high effort:medium autonomy:full score:18.0
   WHY FIRST: Unlocks 10 downstream ideas. Highest leverage.
   COMPONENTS: src/core/config.rs, src/core/project.rs, src/cli/

2. [idea-009] Similar Issues Finder
   impact:high effort:small autonomy:full score:4.5
   WHY NEXT: Highest standalone impact. Ships in ~30 min.
   UNLOCKS: idea-recurring-patterns

3. [idea-004] Stale Discussion Finder
   impact:high effort:small autonomy:full score:3.0
   WHY NEXT: Quick win, no dependencies, immediate user value.

BLOCKED (would rank high if unblocked):
  idea-014 File Hotspots score-if-unblocked:4.5 BLOCKED BY: gate-4
  idea-021 Knowledge Silos score-if-unblocked:3.0 BLOCKED BY: gate-4

UNLOCK ADVISORY:
  Completing gate-4 unblocks 5 items (combined: 15.0)

SCOPE CHANGES DETECTED:
  idea-009: effort changed small→medium (eval-hash mismatch)
  idea-017: now has spec (has-spec flipped to true)

NEWLY UNBLOCKED:
  (none this run)

WARNINGS:
  idea-016: status=proposed, unchanged for 30+ days
  idea-008: blocked-by references "idea-gate4" which doesn't exist (typo?)

HEALTH:
  Proposed: 18 | Accepted: 2 | Specced: 1 | Promoted: 0 | Implemented: 1
  Blocked: 6 | Actionable: 16
  Backlog runway at ~5/day: ~3 days
```

### What the Skill Does NOT Do

- **Never modifies files.** Read-only triage. The agent or human updates frontmatter. Exception: the skill CAN update eval-hash after a triage run (opt-in).
- **Never creates beads.** That's plan-to-beads skill territory.
- **Never replaces bv.** Once work is in beads, bv --robot-triage handles execution prioritization. This skill owns pre-commitment only.
- **Never generates specs.** That's a separate step with Claude/GPT.

---

## Integration Points

### With Spec Generation

The spec generation prompt (separate from this skill) should include:

```
After generating the implementation spec, re-evaluate the idea's frontmatter:

1. Is the effort estimate still accurate? (small/medium/large/xlarge)
2. Did you discover new dependencies? (add to blocked-by)
3. Are there components not listed? (add to components)
4. Has the impact assessment changed?
5. Can an agent ship this autonomously? (autonomy: full/needs-design/needs-human)

Output an UPDATED frontmatter block at the end of the spec.
If any scoring field changed, explain what changed and why.
```

### With plan-to-beads

When promoting an idea to beads:
1. Run plan-to-beads on the spec
2. Capture the created bead IDs
3. Update the idea's frontmatter: status → promoted, beads → [bd-xxx, bd-yyy]
4. Run br sync --flush-only && git add .beads/

### With bv --robot-triage

These systems don't talk to each other directly.
The boundary is:

- Idea triage skill → "build idea-009 next"
- Human/agent generates spec → plan-to-beads → beads created
- bv --robot-triage → "work on bd-xxx next"
- Beads close → human/agent updates idea frontmatter → idea triage re-runs

### With New Item Ingestion

When someone adds a new file to docs/ideas/ or docs/issues/:
- If it has valid frontmatter: picked up automatically on next triage run
- If it has no/invalid frontmatter: flagged in WARNINGS section
- Skill can suggest default frontmatter based on content analysis

---

## Failure Modes and Mitigations

### 1. Frontmatter Rot

**Risk:** Fields don't get updated. Status says "proposed" but it's actually shipped.

**Mitigation:** Cross-reference with beads. If an idea has beads and all beads are closed, flag that the idea should be "implemented" even if frontmatter says otherwise. The skill detects this inconsistency.

### 2. Score Gaming

**Risk:** Someone inflates impact or unlocks count to make their idea rank higher.

**Mitigation:** Unlocks are verified: the skill checks that the referenced items actually have this idea in their blocked-by. Impact is subjective but reviewed during spec generation (second opinion from a different model/session).

### 3. Stale Gates Registry

**Risk:** gate-4 is actually complete but gates.yaml wasn't updated.

**Mitigation:** Skill warns when a gate has been "partial" for a long time. Could also probe the codebase (check if mr_file_changes ingestion code exists and has tests).

### 4. Circular Dependencies

**Risk:** A blocks B blocks A.

**Mitigation:** Phase 2 validation explicitly checks for cycles in the blocked-by graph and reports them as errors.

### 5. Unlock Count Inflation

**Risk:** An item claims to unlock 20 things, making it score astronomically.

**Mitigation:** Unlock count is VERIFIED by checking reverse blocked-by references. If idea-X says it unlocks idea-Y, but idea-Y's blocked-by doesn't include idea-X, the claim is discounted.
Both explicit unlocks and reverse blocked-by contribute to the count, but unverified claims are flagged.

### 6. Scope Creep During Spec

**Risk:** Spec generation reveals the idea is actually 5× harder than estimated. The score drops, but the human has already mentally committed.

**Mitigation:** The scope change detection makes this VISIBLE. The triage output explicitly shows "effort changed small→xlarge, score dropped from 4.5 to 0.75." Human can then decide: proceed anyway, or switch to a different top-3 pick.

### 7. Orphaned Ideas

**Risk:** Ideas get promoted to beads, beads get implemented, but the idea file never gets updated. It sits in "promoted" forever.

**Mitigation:** Skill checks: for each idea with status=promoted, look up the linked beads. If all beads are closed, flag: "idea-009 appears complete, update status to implemented."

---

## Implementation Plan

### Step 1: Create the Frontmatter Schema (this doc → applied to all files)

- Define the exact YAML schema (above)
- Create docs/gates.yaml
- Apply frontmatter to all 22 existing files in docs/ideas/ and docs/issues/

### Step 2: Build the Skill

- Create .claude/skills/idea-triage/SKILL.md
- Implement all 6 phases in the skill prompt
- The skill uses Glob, Read, and text processing; no external scripts are needed (roughly 25 files is small enough for Claude to process directly)

### Step 3: Test the System

- Run the skill against current files
- Verify scoring matches manual expectations
- Check that project-ergonomics ranks #1 (it should, due to unlock count)
- Verify blocked items score 0
- Check validation catches intentional errors

### Step 4: Run One Full Cycle

- Pick the top recommendation
- Generate a spec (separate session)
- Verify scope change detection works (spec should update frontmatter)
- Promote to beads via plan-to-beads
- Implement
- Verify completion detection works

### Step 5: Iterate

- Run triage again after implementation
- Verify newly unblocked items surface
- Adjust scoring weights if rankings feel wrong
- Add new ideas as they emerge
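
Steps 3 and 5 both depend on being able to recompute scores by hand and to see what a weight change does to the ranking. The following is a minimal Python sketch of the priority formula and tiebreak from the Scoring Algorithm section, intended only as a reference for checking the skill's output (the skill itself needs no scripts). The `Item` class, its field names, and the function names are hypothetical; the severity weights 4/3/2/1 are derived from the stated post-multiplier values (critical=6, high=4.5, medium=3, low=1.5).

```python
from dataclasses import dataclass

# Weights from the Scoring Algorithm section; edit these to experiment.
IMPACT_WEIGHT = {"high": 3, "medium": 2, "low": 1}
SEVERITY_WEIGHT = {"critical": 4, "high": 3, "medium": 2, "low": 1}
ISSUE_MULTIPLIER = 1.5
UNLOCK_STEP = 0.5

# Tiebreak order: smaller effort first, then more autonomy, then older items.
EFFORT_RANK = {"small": 0, "medium": 1, "large": 2, "xlarge": 3}
AUTONOMY_RANK = {"full": 0, "needs-design": 1, "needs-human": 2}


@dataclass
class Item:  # hypothetical container for the frontmatter fields used in scoring
    id: str
    kind: str  # "idea" or "issue"
    level: str  # impact for ideas, severity for issues
    unlock_count: int
    ready: bool
    effort: str = "small"
    autonomy: str = "full"
    created: str = "2026-02-09"  # ISO dates sort correctly as strings


def priority(item):
    """priority = base x unlock x readiness, as defined in the proposal."""
    if item.kind == "issue":
        base = SEVERITY_WEIGHT[item.level] * ISSUE_MULTIPLIER
    else:
        base = IMPACT_WEIGHT[item.level]
    unlock = 1 + UNLOCK_STEP * item.unlock_count
    readiness = 1 if item.ready else 0
    return base * unlock * readiness


def sequence(items):
    """Sort by score descending, then apply the three tiebreak rules."""
    return sorted(
        items,
        key=lambda i: (
            -priority(i),
            EFFORT_RANK[i.effort],
            AUTONOMY_RANK[i.autonomy],
            i.created,
        ),
    )


if __name__ == "__main__":
    backlog = [
        Item("stale-discussions", "idea", "high", 0, True),
        Item("hotspots", "idea", "high", 1, False),
        Item("project-ergonomics", "idea", "high", 10, True, effort="medium"),
    ]
    for item in sequence(backlog):
        print(item.id, priority(item))
```

Keeping the weights in module-level dictionaries makes the "adjust scoring weights" step a one-line change that can be checked against the Example Rankings table before the skill prompt is edited.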