chore(plans): add input-history and model-selection plans

plans/input-history.md:
- Implementation plan for shell-style up/down arrow message history in SimpleInput, deriving history from session log conversation data
- Covers prop threading, history derivation, navigation state, keybinding details, modal parity, and test cases

plans/model-selection.md:
- Three-phase plan for model visibility and control: display current model, model picker at spawn, mid-session model switching via Zellij

plans/PLAN-tool-result-display.md:
- Updates to tool result display plan (pre-existing changes)

plans/subagent-visibility.md:
- Updates to subagent visibility plan (pre-existing changes)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 14:51:28 -05:00
parent abbede923d
commit c5b1fb3a80
4 changed files with 765 additions and 89 deletions
--- a/plans/PLAN-tool-result-display.md
+++ b/plans/PLAN-tool-result-display.md
@@ -20,6 +20,8 @@ Add the ability to view tool call results (diffs, bash output, file contents) di
 - Copy-to-clipboard functionality
 - Virtual scrolling / performance optimization
 - Editor integration (clicking paths to open files)
+- Accessibility (keyboard navigation, focus management, ARIA labels — deferred to v2)
+- Lazy-fetch API for tool results (consider for v2 if payload size becomes an issue)

 ---

@@ -61,44 +63,46 @@ Add the ability to view tool call results (diffs, bash output, file contents) di
 - **AC-1:** Tool calls render as expandable elements showing tool name and summary
 - **AC-2:** Clicking a collapsed tool call expands to show its result
 - **AC-3:** Clicking an expanded tool call collapses it
- **AC-4:** Tool results in the most recent assistant message are expanded by default
- **AC-5:** When a new assistant message arrives, previous tool results collapse
- **AC-6:** Edit and Write tool diffs remain expanded regardless of message age
- **AC-7:** Tool calls without results display as non-expandable with muted styling
+- **AC-4:** In active sessions, tool results in the most recent assistant message are expanded by default
+- **AC-5:** When a new assistant message arrives, previous non-diff tool results collapse unless the user has manually toggled them in that message
+- **AC-6:** Edit and Write results remain expanded regardless of message age or session status (even if Write only has confirmation text)
+- **AC-7:** In completed sessions, all non-diff tool results start collapsed
+- **AC-8:** Tool calls without results display as non-expandable with muted styling; in active sessions, pending tool calls show a spinner to distinguish in-progress from permanently missing

 ### Diff Rendering

- **AC-8:** Edit/Write results display structuredPatch data as syntax-highlighted diff
- **AC-9:** Diff additions render with VS Code dark theme green background (rgba(46, 160, 67, 0.15))
- **AC-10:** Diff deletions render with VS Code dark theme red background (rgba(248, 81, 73, 0.15))
- **AC-11:** Full file path displays above each diff block
- **AC-12:** Diff context lines use structuredPatch as-is (no recomputation)
+- **AC-9:** Edit/Write results display structuredPatch data as syntax-highlighted diff; falls back to raw content text if structuredPatch is malformed or absent
+- **AC-10:** Diff additions render with VS Code dark theme green background (rgba(46, 160, 67, 0.15))
+- **AC-11:** Diff deletions render with VS Code dark theme red background (rgba(248, 81, 73, 0.15))
+- **AC-12:** Full file path displays above each diff block
+- **AC-13:** Diff context lines use structuredPatch as-is (no recomputation)

 ### Other Tool Types

- **AC-13:** Bash results display stdout in monospace, stderr separately if present
- **AC-14:** Read results display file content with syntax highlighting based on file extension
- **AC-15:** Grep/Glob results display file list with match counts
- **AC-16:** WebFetch results display URL and response summary
+- **AC-14:** Bash results display stdout in monospace, stderr separately if present
+- **AC-15:** Bash output with ANSI escape codes renders as colored HTML (via ansi_up)
+- **AC-16:** Read results display file content with syntax highlighting based on file extension
+- **AC-17:** Grep/Glob results display file list with match counts
+- **AC-18:** Unknown tools (WebFetch, Task, etc.) use GenericResult fallback showing raw content

 ### Truncation

- **AC-17:** Long outputs truncate at thresholds matching Claude Code behavior
- **AC-18:** Truncated outputs show "Show full output (N lines)" link
- **AC-19:** Clicking "Show full output" opens a dedicated lightweight modal
- **AC-20:** Modal displays full content with syntax highlighting, scrollable
+- **AC-19:** Long outputs truncate at configurable line/character thresholds (defaults tuned to approximate Claude Code behavior)
+- **AC-20:** Truncated outputs show "Show full output (N lines)" link
+- **AC-21:** Clicking "Show full output" opens a dedicated lightweight modal
+- **AC-22:** Modal displays full content with syntax highlighting, scrollable

 ### Error States

- **AC-21:** Failed tool calls display with red-tinted background
- **AC-22:** Error content (stderr, error messages) is clearly distinguishable from success content
- **AC-23:** is_error flag from tool_result determines error state
+- **AC-23:** Failed tool calls display with red-tinted background
+- **AC-24:** Error content (stderr, error messages) is clearly distinguishable from success content
+- **AC-25:** is_error flag from tool_result determines error state

 ### API Contract

- **AC-24:** /api/conversation response includes tool results nested in tool_calls
- **AC-25:** Each tool_call has: name, id, input, result (when available)
- **AC-26:** Result structure varies by tool type (documented in IMP-SERVER)
+- **AC-26:** /api/conversation response includes tool results nested in tool_calls
+- **AC-27:** Each tool_call has: name, id, input, result (when available)
+- **AC-28:** All tool results conform to a normalized envelope: `{ kind, status, content, is_error }` with tool-specific fields nested in `content`

 ---

@@ -130,6 +134,23 @@ Full output can be thousands of lines. Inline expansion would:

 A modal provides a focused reading experience without disrupting conversation layout.

+### Why a Normalized Result Contract
+
+Raw `toolUseResult` shapes vary wildly by tool type — Edit has `structuredPatch`, Bash has `stdout`/`stderr`, Glob has `filenames`. Passing these raw to the frontend means every renderer must know the exact JSONL format, and adding Codex support (v2) would require duplicating all that branching.
+
+Instead, the server normalizes each result into a stable envelope:
+
+```python
+{
+    "kind": "diff" | "bash" | "file_content" | "file_list" | "generic",
+    "status": "success" | "error" | "pending",
+    "is_error": bool,
+    "content": { ... }  # tool-specific fields, documented per kind
+}
+```
+
+The frontend switches on `kind` (5 cases) rather than tool name (unbounded). This also gives us a clean seam for the `result_mode` query parameter if payload size becomes an issue later.
+
 ### Component Structure

 ```
@@ -157,7 +178,7 @@ FullOutputModal (new, top-level)

 ### IMP-SERVER: Parse and Attach Tool Results

-**Fulfills:** AC-24, AC-25, AC-26
+**Fulfills:** AC-26, AC-27, AC-28

 **Location:** `amc_server/mixins/conversation.py`

@@ -167,38 +188,43 @@ Two-pass parsing:
 1. First pass: Scan all entries, build map of `tool_use_id` → `toolUseResult`
 2. Second pass: Parse messages as before, but when encountering `tool_use`, lookup and attach result

-**Tool call schema after change:**
+**API query parameter:** `/api/conversation?result_mode=full` (default). Future option: `result_mode=preview` to return truncated previews and reduce payload size without an API-breaking change.
+
+**Normalization step:** After looking up the raw `toolUseResult`, the server normalizes it into the stable envelope before attaching:
+
 ```python
 {
    "name": "Edit",
    "id": "toolu_abc123",
    "input": {"file_path": "...", "old_string": "...", "new_string": "..."},
    "result": {
-        "content": "The file has been updated successfully.",
+        "kind": "diff",
+        "status": "success",
        "is_error": False,
-        "structuredPatch": [...],
-        "filePath": "...",
-        # ... other fields from toolUseResult
+        "content": {
+            "structuredPatch": [...],
+            "filePath": "...",
+            "text": "The file has been updated successfully."
+        }
    }
 }
 ```

-**Result Structure by Tool Type:**
+**Normalized `kind` mapping:**

-| Tool | Result Fields |
-|------|---------------|
-| Edit | `structuredPatch`, `filePath`, `oldString`, `newString` |
-| Write | `filePath`, content confirmation |
-| Read | `file`, `type`, content in `content` field |
-| Bash | `stdout`, `stderr`, `interrupted` |
-| Glob | `filenames`, `numFiles`, `truncated` |
-| Grep | `content`, `filenames`, `numFiles`, `numLines` |
+| kind | Source Tools | `content` Fields |
+|------|-------------|-----------------|
+| `diff` | Edit, Write | `structuredPatch`, `filePath`, `text` |
+| `bash` | Bash | `stdout`, `stderr`, `interrupted` |
+| `file_content` | Read | `file`, `type`, `text` |
+| `file_list` | Glob, Grep | `filenames`, `numFiles`, `truncated`, `numLines` |
+| `generic` | All others | `text` (raw content string) |

 ---

 ### IMP-TOOLCALL: Expandable Tool Call Component

-**Fulfills:** AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7
+**Fulfills:** AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7, AC-8

 **Location:** `dashboard/lib/markdown.js` (refactor `renderToolCalls`)

@@ -213,16 +239,21 @@ Renders a single tool call with:

 **State Management:**

-Track expanded state per message. When new assistant message arrives:
+Track two sets per message: `autoExpanded` (system-controlled) and `userToggled` (manual clicks).
+
+When new assistant message arrives:
 - Compare latest assistant message ID to stored ID
- If different, reset expanded set to empty
+- If different, reset `autoExpanded` to empty for previous messages
+- `userToggled` entries are never reset — user intent is preserved
 - Edit/Write tools bypass this logic (always expanded via CSS/logic)

+Expand/collapse logic: a tool call is expanded if it is in `userToggled` (explicit click) OR in `autoExpanded` (latest message) OR is Edit/Write kind.
+
 ---

 ### IMP-DIFF: Diff Rendering Component

-**Fulfills:** AC-8, AC-9, AC-10, AC-11, AC-12
+**Fulfills:** AC-9, AC-10, AC-11, AC-12, AC-13

 **Location:** `dashboard/lib/markdown.js` (new function `renderDiff`)

@@ -234,12 +265,13 @@ hljs.registerLanguage('diff', langDiff);

 **Diff Renderer:**

-1. Convert `structuredPatch` array to unified diff text:
+1. If `structuredPatch` is present and valid, convert to unified diff text:
   - Each hunk: `@@ -oldStart,oldLines +newStart,newLines @@`
   - Followed by hunk.lines array
-2. Syntax highlight with hljs diff language
-3. Sanitize with DOMPurify before rendering
-4. Wrap in container with file path header
+2. If `structuredPatch` is missing or malformed, fall back to raw `content.text` in a monospace block
+3. Syntax highlight with hljs diff language
+4. Sanitize with DOMPurify before rendering
+5. Wrap in container with file path header

 **CSS styling:**
 - Container: dark border, rounded corners
@@ -252,22 +284,33 @@ hljs.registerLanguage('diff', langDiff);

 ### IMP-BASH: Bash Output Component

-**Fulfills:** AC-13, AC-21, AC-22
+**Fulfills:** AC-14, AC-15, AC-23, AC-24

 **Location:** `dashboard/lib/markdown.js` (new function `renderBashResult`)

-Renders:
- `stdout` in monospace pre block
+**ANSI-to-HTML conversion:**
+```javascript
+import AnsiUp from 'https://esm.sh/ansi_up';
+const ansi = new AnsiUp();
+const html = ansi.ansi_to_html(bashOutput);
+```
+
+The `ansi_up` library (zero dependencies, ~8KB) converts ANSI escape codes to styled HTML spans, preserving colored test output, progress indicators, and error highlighting from CLI tools.
+
+**Renders:**
+- `stdout` in monospace pre block with ANSI colors preserved
 - `stderr` in separate block with error styling (if present)
 - "Command interrupted" notice (if interrupted flag)

+**Sanitization order (CRITICAL):** First convert ANSI to HTML via ansi_up, THEN sanitize with DOMPurify. Sanitizing before conversion would strip escape codes; sanitizing after preserves the styled spans while preventing XSS.
+
 Error state: `is_error` or presence of stderr triggers error styling (red tint, left border).

 ---

 ### IMP-TRUNCATE: Output Truncation

-**Fulfills:** AC-17, AC-18
+**Fulfills:** AC-19, AC-20

 **Truncation Thresholds (match Claude Code):**

@@ -289,7 +332,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r

 ### IMP-MODAL: Full Output Modal

-**Fulfills:** AC-19, AC-20
+**Fulfills:** AC-21, AC-22

 **Location:** `dashboard/components/FullOutputModal.js` (new file)

@@ -305,7 +348,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r

 ### IMP-ERROR: Error State Styling

-**Fulfills:** AC-21, AC-22, AC-23
+**Fulfills:** AC-23, AC-24, AC-25

 **Styling:**
 - Tool call header: red-tinted background when `result.is_error`
@@ -331,17 +374,19 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r

 ---

-### Slice 2: Server-Side Tool Result Parsing
+### Slice 2: Server-Side Tool Result Parsing and Normalization

-**Goal:** API returns tool results nested in tool_calls
+**Goal:** API returns normalized tool results nested in tool_calls

 **Deliverables:**
 1. Two-pass parsing in `_parse_claude_conversation`
-2. Tool results attached with `id` field
-3. Unit tests for result attachment
-4. Handle missing results gracefully (return tool_call without result)
+2. Normalization layer: raw `toolUseResult` → `{ kind, status, is_error, content }` envelope
+3. Tool results attached with `id` field
+4. Unit tests for result attachment and normalization per tool type
+5. Handle missing results gracefully (return tool_call without result)
+6. Support `result_mode=full` query parameter (only mode for now, but wired up for future `preview`)

-**Exit Criteria:** AC-24, AC-25, AC-26 pass
+**Exit Criteria:** AC-26, AC-27, AC-28 pass

 ---

@@ -356,7 +401,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r
 4. Collapse on new assistant message
 5. Keep Edit/Write always expanded

-**Exit Criteria:** AC-1 through AC-7 pass
+**Exit Criteria:** AC-1 through AC-8 pass

 ---

@@ -370,7 +415,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r
 3. VS Code dark theme styling
 4. Full file path header

-**Exit Criteria:** AC-8 through AC-12 pass
+**Exit Criteria:** AC-9 through AC-13 pass

 ---

@@ -379,12 +424,13 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r
 **Goal:** Bash, Read, Glob, Grep render appropriately

 **Deliverables:**
-1. `renderBashResult` with stdout/stderr separation
-2. `renderFileContent` for Read
-3. `renderFileList` for Glob/Grep
-4. Generic fallback for unknown tools
+1. Import and configure `ansi_up` for ANSI-to-HTML conversion
+2. `renderBashResult` with stdout/stderr separation and ANSI color preservation
+3. `renderFileContent` for Read
+4. `renderFileList` for Glob/Grep
+5. `GenericResult` fallback for unknown tools (WebFetch, Task, etc.)

-**Exit Criteria:** AC-13 through AC-16 pass
+**Exit Criteria:** AC-14 through AC-18 pass

 ---

@@ -398,7 +444,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r
 3. `FullOutputModal` component
 4. Syntax highlighting in modal

-**Exit Criteria:** AC-17 through AC-20 pass
+**Exit Criteria:** AC-19 through AC-22 pass

 ---

@@ -412,15 +458,16 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r
 3. Test with interrupted sessions
 4. Cross-browser testing

-**Exit Criteria:** AC-21 through AC-23 pass, feature complete
+**Exit Criteria:** AC-23 through AC-25 pass, feature complete

 ---

 ## Open Questions

-1. **Exact Claude Code truncation thresholds** — need to verify against Claude Code source or experiment
+1. ~~**Exact Claude Code truncation thresholds**~~ — **Resolved:** using reasonable defaults with a note to tune via testing. AC-19 updated.
 2. **Performance with 100+ tool calls** — monitor after ship, optimize if needed
-3. **Codex support timeline** — when should we prioritize v2?
+3. **Codex support timeline** — when should we prioritize v2? The normalized `kind` contract makes this easier: add Codex normalizers without touching renderers.
+4. ~~**Lazy-fetch for large payloads**~~ — **Resolved:** `result_mode` query parameter wired into API contract. Only `full` implemented in v1; `preview` deferred.

 ---
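Reviewer sketch for IMP-DIFF steps 1 and 2 (not part of the patch): one way the `structuredPatch` conversion and the raw-text fallback could be expressed. The function name `patchToUnifiedDiff` and the `{ mode, text }` return shape are hypothetical; the hunk fields (`oldStart`, `oldLines`, `newStart`, `newLines`, `lines`) and `content.text` are taken from the plan. Highlighting and DOMPurify sanitization are assumed to happen afterwards, as the plan states.

```javascript
// Hypothetical helper: converts the normalized `content` of a "diff" result
// into unified diff text, or falls back to the raw confirmation text.
function isValidHunk(h) {
  return Boolean(h) &&
    Number.isInteger(h.oldStart) && Number.isInteger(h.oldLines) &&
    Number.isInteger(h.newStart) && Number.isInteger(h.newLines) &&
    Array.isArray(h.lines) && h.lines.every((l) => typeof l === 'string');
}

function patchToUnifiedDiff(content) {
  const hunks = content ? content.structuredPatch : undefined;
  const usable = Array.isArray(hunks) && hunks.length > 0 && hunks.every(isValidHunk);

  if (!usable) {
    // Missing or malformed patch: show content.text in a monospace block instead.
    return { mode: 'raw', text: (content && content.text) || '' };
  }

  const text = hunks
    .map((h) => {
      const header = `@@ -${h.oldStart},${h.oldLines} +${h.newStart},${h.newLines} @@`;
      return [header, ...h.lines].join('\n');
    })
    .join('\n');

  return { mode: 'diff', text };
}
```

A renderer could branch on `mode` to decide whether to run the hljs diff highlighter or emit a plain `pre` block. Treating an empty `structuredPatch` array as "absent" is a choice made here so that a Write result carrying only confirmation text still renders something.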

--- a/plans/input-history.md
+++ b/plans/input-history.md
@@ -0,0 +1,96 @@
+# Input History (Up/Down Arrow)
+
+## Summary
+
+Add shell-style up/down arrow navigation through past messages in SimpleInput. History is derived from the conversation data already parsed from session logs -- no new state management, no server changes.
+
+## How It Works Today
+
+1. Server parses JSONL session logs, extracts user messages with `role: "user"` (`conversation.py:57-66`)
+2. App.js stores parsed conversations in `conversations` state, refreshed via SSE on `conversation_mtime_ns` change
+3. SessionCard receives `conversation` as a prop but does **not** pass it to SimpleInput
+4. SimpleInput has no awareness of past messages
+
+## Step 1: Pipe Conversation to SimpleInput
+
+Pass the conversation array from SessionCard into SimpleInput so it can derive history.
+
+- `SessionCard.js:165-169` -- add `conversation` prop to SimpleInput
+- The QuestionBlock path (line 162) would need the same prop only if it accepted freeform input; skip it for now, since QuestionBlock is option-based
+
+**Files**: `dashboard/components/SessionCard.js`
+
+## Step 2: Derive User Message History
+
+Inside SimpleInput, filter conversation to user messages only.
+
+```js
+const userHistory = useMemo(
+  () => (conversation || []).filter(m => m.role === 'user').map(m => m.content),
+  [conversation]
+);
+```
+
+This updates automatically whenever the session log changes (SSE triggers conversation refresh, new prop flows down).
+
+**Files**: `dashboard/components/SimpleInput.js`
+
+## Step 3: History Navigation State
+
+Add refs for tracking position in history and preserving the draft.
+
+```js
+const historyIndexRef = useRef(-1);  // -1 = not browsing
+const draftRef = useRef('');         // saves in-progress text before browsing
+```
+
+Use refs (not state) because index changes don't need re-renders -- only `setText` triggers the visual update.
+
+**Files**: `dashboard/components/SimpleInput.js`
+
+## Step 4: ArrowUp/ArrowDown Keybinding
+
+In the `onKeyDown` handler (after the autocomplete block, before Enter-to-submit), add history navigation:
+
+- **ArrowUp**: only when autocomplete is closed AND cursor is at position 0 (prevents hijacking multiline cursor movement). On first press, save current text to `draftRef`. Walk backward through `userHistory`. Call `setText()` with the history entry.
+- **ArrowDown**: walk forward through history. If past the newest entry, restore `draftRef` and reset index to -1.
+- **Reset on submit**: set `historyIndexRef.current = -1` in `handleSubmit` after successful send.
+- **Reset on manual edit**: in `onInput`, reset `historyIndexRef.current = -1` so typing after browsing exits history mode.
+
+### Cursor position check
+
+```js
+const atStart = e.target.selectionStart === 0 && e.target.selectionEnd === 0;
+```
+
+Only intercept ArrowUp when `atStart` is true. This lets multiline text cursor movement work normally. ArrowDown can use similar logic (check if cursor is at end of text) or always navigate history when `historyIndexRef.current !== -1` (already browsing).
+
+**Files**: `dashboard/components/SimpleInput.js`
+
+## Step 5: Modal Parity
+
+The Modal (`Modal.js:71`) also renders SimpleInput with `onRespond`. Verify it passes `conversation` through. The same SessionCard is used in enlarged mode, so this should work automatically if Step 1 is done correctly.
+
+**Files**: `dashboard/components/Modal.js` (verify, likely no change needed)
+
+## Non-Goals
+
+- No localStorage persistence -- history comes from session logs which survive across page reloads
+- No server changes -- conversation parsing already extracts what we need
+- No new API endpoints
+- No changes to QuestionBlock (option-based, not free-text history)
+
+## Test Cases
+
+| Scenario | Expected |
+|----------|----------|
+| Press up with empty input | Fills with most recent user message |
+| Press up multiple times | Walks backward through user messages |
+| Press down after browsing up | Walks forward; past newest restores draft |
+| Press up with text in input | Saves text as draft, shows history |
+| Press down past end | Restores saved draft |
+| Type after browsing | Exits history mode (index resets) |
+| Submit after browsing | Sends displayed text, resets index |
+| Up arrow in multiline text (cursor not at pos 0) | Normal cursor movement, no history |
+| New message arrives via SSE | userHistory updates, no index disruption |
+| Session with no prior messages | Up arrow does nothing |
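Reviewer sketch for Steps 3 and 4 (not part of the patch): the navigation rules written as a pure function, so the test-case table can be checked without a DOM. Everything here is an assumption layered on the plan: the name `navigateHistory`, the `{ state, text, handled }` return shape, and the convention that `index` counts back from the newest entry (0 = most recent). In the real component the state would live in `historyIndexRef` and `draftRef`, and `text` would be passed to `setText()`.

```javascript
// Hypothetical pure model of the ArrowUp/ArrowDown rules.
// state.index: -1 = not browsing, 0 = newest history entry, 1 = one older, ...
// state.draft: text that was in the input before browsing started.
function navigateHistory(state, key, history, currentText, atStart) {
  const unhandled = { state, text: currentText, handled: false };
  const entryAt = (i) => history[history.length - 1 - i];

  if (key === 'ArrowUp') {
    // Only intercept at cursor position 0, and only if there is history.
    if (!atStart || history.length === 0) return unhandled;
    const draft = state.index === -1 ? currentText : state.draft;
    const index = Math.min(state.index + 1, history.length - 1);
    return { state: { index, draft }, text: entryAt(index), handled: true };
  }

  if (key === 'ArrowDown') {
    // Only meaningful while already browsing.
    if (state.index === -1) return unhandled;
    if (state.index === 0) {
      // Walking past the newest entry restores the saved draft.
      return { state: { index: -1, draft: '' }, text: state.draft, handled: true };
    }
    const index = state.index - 1;
    return { state: { index, draft: state.draft }, text: entryAt(index), handled: true };
  }

  return unhandled;
}
```

When `handled` is true the key handler would call `e.preventDefault()`; when false the browser's normal cursor movement proceeds. Clamping at the oldest entry (rather than wrapping) is a choice made in this sketch, since the plan does not say what happens when ArrowUp is pressed at the oldest message.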
--- a/plans/model-selection.md
+++ b/plans/model-selection.md
@@ -0,0 +1,51 @@
+# Model Selection & Display
+
+## Summary
+
+Add model visibility and control to the AMC dashboard. Users can see which model each agent is running, pick a model when spawning, and switch models mid-session.
+
+## Models
+
+| Label | Value sent to Claude Code |
+|-------|--------------------------|
+| Opus 4.6 | `opus` |
+| Opus 4.5 | `claude-opus-4-5-20251101` |
+| Sonnet 4.6 | `sonnet` |
+| Haiku | `haiku` |
+
+## Step 1: Display Current Model
+
+Surface `context_usage.model` in `SessionCard.js`.
+
+- Data already extracted by `parsing.py` (line 202) from conversation JSONL
+- Already available via `/api/state` in `context_usage.model`
+- Add model name formatter: `claude-opus-4-5-20251101` -> `Opus 4.5`
+- Show in SessionCard (near agent badge or context usage area)
+- `context_usage.model` is `null` until the first assistant message arrives
+
+**Files**: `dashboard/components/SessionCard.js`
+
+## Step 2: Model Picker at Spawn
+
+Add model dropdown to `SpawnModal.js`. Pass to spawn API, which appends `--model <value>` to the claude command.
+
+- Extend `/api/spawn` to accept optional `model` param
+- Validate against allowed model list
+- Append `--model {model}` to the command in `AGENT_COMMANDS`
+- Default: no flag (uses Claude Code's default)
+
+**Files**: `dashboard/components/SpawnModal.js`, `amc_server/mixins/spawn.py`
+
+## Step 3: Mid-Session Model Switch
+
+Dropdown on SessionCard to change model for running sessions via Zellij.
+
+- Send `/model <value>` to the agent's Zellij pane:
+  ```bash
+  zellij -s {session} action write-chars "/model {value}" --pane-id {pane}
+  zellij -s {session} action write 10 --pane-id {pane}
+  ```
+- New endpoint: `POST /api/session/{id}/model` with `{"model": "opus"}`
+- Only works when agent is idle (waiting for input). If mid-turn, command queues and applies after.
+
+**Files**: `dashboard/components/SessionCard.js`, `amc_server/mixins/state.py` (or new mixin)
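Reviewer sketch for Steps 1 and 2 (not part of the patch): a possible model name formatter and allowlist check built from the Models table. `MODEL_LABELS`, `formatModelName`, and `isAllowedModel` are hypothetical names, and the generic `claude-<family>-<major>-<minor>[-<date>]` pattern is an assumption generalized from the single example `claude-opus-4-5-20251101`; IDs that do not fit it are returned unchanged.

```javascript
// Values from the plan's Models table, keyed by the string sent to Claude Code.
const MODEL_LABELS = {
  'opus': 'Opus 4.6',
  'claude-opus-4-5-20251101': 'Opus 4.5',
  'sonnet': 'Sonnet 4.6',
  'haiku': 'Haiku',
};

// Own-property check so keys like "constructor" are not accepted by accident.
function isAllowedModel(value) {
  return typeof value === 'string' &&
    Object.prototype.hasOwnProperty.call(MODEL_LABELS, value);
}

function formatModelName(model) {
  // context_usage.model is null until the first assistant message.
  if (!model) return null;
  if (isAllowedModel(model)) return MODEL_LABELS[model];

  // Assumed ID shape: claude-<family>-<major>-<minor>, optionally followed by -<YYYYMMDD>.
  const match = /^claude-([a-z]+)-(\d+)-(\d{1,2})(?:-\d{8})?$/.exec(model);
  if (match) {
    const family = match[1].charAt(0).toUpperCase() + match[1].slice(1);
    return `${family} ${match[2]}.${match[3]}`;
  }

  // Unknown shape: show the raw ID rather than guess.
  return model;
}
```

The same allowlist could back the server-side validation in `/api/spawn`, although the plan places that check in `amc_server/mixins/spawn.py`, so a Python equivalent would be needed there.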
--- a/plans/subagent-visibility.md
+++ b/plans/subagent-visibility.md
@@ -1,26 +1,27 @@
 # Subagent & Agent Team Visibility for AMC

 > **Status**: Draft
-> **Last Updated**: 2026-02-27
+> **Last Updated**: 2026-03-02

 ## Summary

-Add a button in the turn stats section showing the count of active subagents/team members. Clicking it opens a list with names and lifetime stats (time taken, tokens used). Mirrors Claude Code's own agent display.
+Add visibility into Claude Code subagents (Task tool spawns and team members) within AMC session cards. A pill button shows active agent count; clicking opens a popover with names, status, and stats. Claude-only (Codex does not support subagents).

 ---

 ## User Workflow

 1. User views a session card in AMC
-2. Turn stats area shows: `2h 15m | 84k tokens | 3 agents`
+2. Session status area shows: `[●] Working 2m 15s · 42k tokens     32% ctx  [3 agents]`
 3. User clicks "3 agents" button
-4. List opens showing:
+4. Popover opens showing:
   ```
-   claude-code-guide (running)    12m    42,000 tokens
-   Explore (completed)             3m    18,500 tokens
-   Explore (completed)             5m    23,500 tokens
+   Explore-a250de    ● running     12m    42,000 tokens
+   code-reviewer     ○ completed    3m    18,500 tokens
+   action-wirer      ○ completed    5m    23,500 tokens
   ```
-5. List updates in real-time as agents complete
+5. Popover auto-updates every 2s while open
+6. Button hidden when session has no subagents

 ---

@@ -28,24 +29,505 @@ Add a button in the turn stats section showing the count of active subagents/tea

 ### Discovery

- **AC-1**: Subagent JSONL files discovered at `{session_dir}/subagents/agent-*.jsonl`
- **AC-2**: Both regular subagents (Task tool) and team members (Task with `team_name`) are discovered from same location
+- **AC-1**: Subagent JSONL files discovered for Claude sessions at `{claude_projects}/{encoded_project_dir}/{session_id}/subagents/agent-*.jsonl`
+- **AC-2**: Team members discovered from same location (team spawning uses Task tool, stores in subagents dir)
+- **AC-3**: Codex sessions do not show subagent button (Codex does not support subagents)

 ### Status Detection

- **AC-3**: Subagent is "running" if: parent session is alive AND last assistant entry has `stop_reason != "end_turn"`
- **AC-4**: Subagent is "completed" if: last assistant entry has `stop_reason == "end_turn"` OR parent session is dead
+- **AC-4**: Subagent is "running" if parent session is not dead AND last assistant entry has `stop_reason != "end_turn"`
+- **AC-5**: Subagent is "completed" if last assistant entry has `stop_reason == "end_turn"` OR parent session is dead
+
+### Name Resolution
+
+- **AC-6**: Team member names extracted from agentId format `{name}@{team_name}` (O(1) string split)
+- **AC-7**: Non-team subagent names generated as `agent-{agentId_prefix}` (no parent session parsing required)

 ### Stats Extraction

- **AC-5**: Subagent name extracted from parent's Task tool invocation: use `name` if present (team member), else `subagent_type`
- **AC-6**: Lifetime duration = first entry timestamp to last entry timestamp (or now if running)
- **AC-7**: Lifetime tokens = sum of all assistant entries' `usage.input_tokens + usage.output_tokens`
+- **AC-8**: Duration = first entry timestamp to last entry timestamp (or server time if running)
+- **AC-9**: Tokens = sum of `input_tokens + output_tokens` from all assistant entries (excludes cache tokens)
+
+### API
+
+- **AC-10**: `/api/state` includes `subagent_count` and `subagent_running_count` for each Claude session
+- **AC-11**: New endpoint `/api/sessions/{id}/subagents` returns full subagent list with name, status, duration_ms, tokens
+- **AC-12**: Subagent endpoint supports session_id path param; returns 404 if session not found

 ### UI

- **AC-8**: Turn stats area shows agent count button when subagents exist
- **AC-9**: Button shows count + running indicator (e.g., "3 agents" or "2 agents (1 running)")
- **AC-10**: Clicking button opens popover with: name, status, duration, token count
- **AC-11**: Running agents show activity indicator
- **AC-12**: List updates via existing polling/SSE
+- **AC-13**: Context usage displays as plain text (remove badge styling)
+- **AC-14**: Agent count button appears as bordered pill to the right of context text
+- **AC-15**: Button hidden when `subagent_count == 0`
+- **AC-16**: Button shows running indicator: "3 agents" when none running, "3 agents (1 running)" when some running
+- **AC-17**: Clicking button opens popover anchored to button
+- **AC-18**: Popover shows list: name, status indicator, duration, token count per row
+- **AC-19**: Running agents show filled indicator (●), completed show empty (○)
+- **AC-20**: Popover polls `/api/sessions/{id}/subagents` every 2s while open
+- **AC-21**: Popover closes on outside click or Escape key
+- **AC-22**: Subagent rows are display-only (no click action in v1)
+
+---
+
+## Architecture
+
+### Why This Structure
+
+| Decision | Rationale | Fulfills |
+|----------|-----------|----------|
+| Aggregate counts in `/api/state` + detail endpoint | Minimizes payload size; hash stability (counts change less than durations) | AC-10, AC-11 |
+| Claude-only | Codex lacks subagent infrastructure | AC-3 |
+| Name from agentId pattern | Avoids expensive parent session parsing; team names encoded in agentId | AC-6, AC-7 |
+| Input+output tokens only | Matches "work done" mental model; simpler than cache tracking | AC-9 |
+| Auto-poll in popover | Real-time feel consistent with session card updates | AC-20 |
+| Hide button when empty | Reduces visual noise for sessions without agents | AC-15 |
+
+### Data Flow
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│ Backend (Python)                                                │
+│                                                                 │
+│  _collect_sessions()                                            │
+│       │                                                         │
+│       ├── For each Claude session:                              │
+│       │     └── _get_subagent_counts(session_id, project_dir)   │
+│       │           ├── glob subagents/agent-*.jsonl              │
+│       │           ├── count files, check running status         │
+│       │           └── return (count, running_count)             │
+│       │                                                         │
+│       └── Attach subagent_count, subagent_running_count         │
+│                                                                 │
+│  _serve_subagents(session_id)                                   │
+│       ├── _get_claude_session_dir(session_id, project_dir)      │
+│       ├── glob subagents/agent-*.jsonl                          │
+│       ├── For each file:                                        │
+│       │     ├── Parse name from agentId                         │
+│       │     ├── Determine status from stop_reason               │
+│       │     ├── Calculate duration from timestamps              │
+│       │     └── Sum tokens from assistant usage                 │
+│       └── Return JSON list                                      │
+│                                                                 │
+└─────────────────────────────────────────────────────────────────┘
+
+┌─────────────────────────────────────────────────────────────────┐
+│ Frontend (Preact)                                               │
+│                                                                 │
+│  SessionCard                                                    │
+│       │                                                         │
+│       ├── Session Status Area:                                  │
+│       │     ├── AgentActivityIndicator (left)                   │
+│       │     ├── Context text (center-right, plain)              │
+│       │     └── SubagentButton (far right, if count > 0)        │
+│       │                                                         │
+│       └── SubagentButton                                        │
+│             ├── Shows "{count} agents" or "{count} ({running})" │
+│             ├── onClick: opens SubagentPopover                  │
+│             └── SubagentPopover                                 │
+│                   ├── Polls /api/sessions/{id}/subagents        │
+│                   ├── Renders list with status indicators       │
+│                   └── Closes on outside click or Escape         │
+│                                                                 │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+### File Changes
+
+| File | Change | ACs |
+|------|--------|-----|
+| `amc_server/mixins/subagent.py` | New mixin for subagent discovery and stats | AC-1,2,4-9 |
+| `amc_server/mixins/state.py` | Call subagent mixin, attach counts to session | AC-10 |
+| `amc_server/mixins/http.py` | Add route `/api/sessions/{id}/subagents` | AC-11,12 |
+| `amc_server/handler.py` | Add SubagentMixin to handler class | - |
+| `dashboard/components/SessionCard.js` | Update status area layout | AC-13,14 |
+| `dashboard/components/SubagentButton.js` | New component for button + popover | AC-15-22 |
+| `dashboard/utils/api.js` | Add `fetchSubagents(sessionId)` function | AC-20 |
+
+---
+
+## Implementation Specs
+
+### IMP-1: SubagentMixin (Python)
+
+**Fulfills:** AC-1, AC-2, AC-4, AC-5, AC-6, AC-7, AC-8, AC-9
+
+```python
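+# amc_server/mixins/subagent.py
+
+from datetime import datetime, timezone
+from pathlib import Path
+
+# CLAUDE_PROJECTS_DIR and self._read_jsonl_tail_entries() are not defined in
+# this plan; they are assumed to come from the existing server code.
+
+class SubagentMixin: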
+    def _get_subagent_counts(self, session_id: str, project_dir: str) -> tuple[int, int]:
+        """Return (total_count, running_count) for a Claude session."""
+        subagents_dir = self._get_subagents_dir(session_id, project_dir)
+        if not subagents_dir or not subagents_dir.exists():
+            return (0, 0)
+
+        total = 0
+        running = 0
+        for jsonl_file in subagents_dir.glob("agent-*.jsonl"):
+            total += 1
+            if self._is_subagent_running(jsonl_file):
+                running += 1
+        return (total, running)
+
+    def _get_subagents_dir(self, session_id: str, project_dir: str) -> Path | None:
+        """Construct path to subagents directory."""
+        if not project_dir:
+            return None
+        encoded_dir = project_dir.replace("/", "-")
+        if not encoded_dir.startswith("-"):
+            encoded_dir = "-" + encoded_dir
+        return CLAUDE_PROJECTS_DIR / encoded_dir / session_id / "subagents"
+
+    def _is_subagent_running(self, jsonl_file: Path) -> bool:
+        """Check if subagent is still running based on last assistant stop_reason."""
+        try:
+            # Read last few lines to find last assistant entry
+            entries = self._read_jsonl_tail_entries(jsonl_file, max_lines=20)
+            for entry in reversed(entries):
+                if entry.get("type") == "assistant":
+                    stop_reason = entry.get("message", {}).get("stop_reason")
+                    return stop_reason != "end_turn"
+            return True  # No assistant entries yet = still starting
+        except Exception:
+            return False
+
+    def _get_subagent_list(self, session_id: str, project_dir: str, parent_is_dead: bool) -> list[dict]:
+        """Return full subagent list with stats."""
+        subagents_dir = self._get_subagents_dir(session_id, project_dir)
+        if not subagents_dir or not subagents_dir.exists():
+            return []
+
+        result = []
+        for jsonl_file in subagents_dir.glob("agent-*.jsonl"):
+            subagent = self._parse_subagent(jsonl_file, parent_is_dead)
+            if subagent:
+                result.append(subagent)
+
+        # Sort: running first, then by name
+        result.sort(key=lambda s: (0 if s["status"] == "running" else 1, s["name"]))
+        return result
+
+    def _parse_subagent(self, jsonl_file: Path, parent_is_dead: bool) -> dict | None:
+        """Parse a single subagent JSONL file."""
+        try:
+            entries = self._read_jsonl_tail_entries(jsonl_file, max_lines=500, max_bytes=512*1024)
+            if not entries:
+                return None
+
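+            # Get agentId from the first entry that was read.
+            # NOTE: entries is a tail window (at most 500 lines / 512 KiB), so
+            # for longer logs this is not the file's true first entry and the
+            # duration computed below will be an underestimate.
+            first_entry = entries[0]
+            agent_id = first_entry.get("agentId", "")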
+
+            # Resolve name
+            name = self._resolve_subagent_name(agent_id, jsonl_file)
+
+            # Determine status
+            is_running = False
+            if not parent_is_dead:
+                for entry in reversed(entries):
+                    if entry.get("type") == "assistant":
+                        stop_reason = entry.get("message", {}).get("stop_reason")
+                        is_running = stop_reason != "end_turn"
+                        break
+            status = "running" if is_running else "completed"
+
+            # Calculate duration
+            first_ts = first_entry.get("timestamp")
+            last_ts = entries[-1].get("timestamp") if entries else None
+            duration_ms = self._calculate_duration_ms(first_ts, last_ts, is_running)
+
+            # Sum tokens
+            tokens = self._sum_assistant_tokens(entries)
+
+            return {
+                "name": name,
+                "status": status,
+                "duration_ms": duration_ms,
+                "tokens": tokens,
+            }
+        except Exception:
+            return None
+
+    def _resolve_subagent_name(self, agent_id: str, jsonl_file: Path) -> str:
+        """Extract display name from agentId or filename."""
+        # Team members: "reviewer-wcja@surgical-sync" -> "reviewer-wcja"
+        if "@" in agent_id:
+            return agent_id.split("@")[0]
+
+        # Regular subagents: use prefix from agentId
+        # agent_id like "a250dec6325c589be" -> "a250de"
+        prefix = agent_id[:6] if agent_id else "agent"
+
+        # Try to get subagent_type from filename if it contains it
+        # Filename: agent-acompact-b857538cac0d5172.jsonl -> might indicate "compact"
+        # For now, use generic fallback
+        return f"agent-{prefix}"
+
+    def _calculate_duration_ms(self, first_ts: str | None, last_ts: str | None, is_running: bool) -> int:
+        """Calculate duration in milliseconds."""
+        if not first_ts:
+            return 0
+        try:
+            first = datetime.fromisoformat(first_ts.replace("Z", "+00:00"))
+            if is_running:
+                end = datetime.now(timezone.utc)
+            elif last_ts:
+                end = datetime.fromisoformat(last_ts.replace("Z", "+00:00"))
+            else:
+                return 0
+            return max(0, int((end - first).total_seconds() * 1000))
+        except Exception:
+            return 0
+
+    def _sum_assistant_tokens(self, entries: list[dict]) -> int:
+        """Sum input_tokens + output_tokens from all assistant entries."""
+        total = 0
+        for entry in entries:
+            if entry.get("type") != "assistant":
+                continue
+            usage = entry.get("message", {}).get("usage", {})
+            input_tok = usage.get("input_tokens", 0) or 0
+            output_tok = usage.get("output_tokens", 0) or 0
+            total += input_tok + output_tok
+        return total
+```
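
IMP-1 calls `self._read_jsonl_tail_entries(...)`, which this plan does not define. For reference, here is a minimal sketch of what such a helper might look like, written as a free function. The name `read_jsonl_tail_entries`, its default limits, and its skip-malformed-lines behavior are assumptions for illustration, not the project's actual implementation.

```python
import json
from pathlib import Path


def read_jsonl_tail_entries(path: Path, max_lines: int = 20,
                            max_bytes: int = 64 * 1024) -> list[dict]:
    """Return up to max_lines parsed JSON objects from the end of a JSONL file.

    Hypothetical stand-in for the helper IMP-1 relies on. Reads at most
    max_bytes from the end of the file so large logs stay cheap to inspect.
    """
    if max_lines <= 0:
        return []

    with path.open("rb") as f:
        f.seek(0, 2)                      # seek to end to learn the file size
        size = f.tell()
        start = max(0, size - max_bytes)
        f.seek(start)
        chunk = f.read()

    lines = chunk.split(b"\n")
    if start > 0:
        # The first line was probably cut mid-record; drop it.
        lines = lines[1:]
    lines = [ln for ln in lines if ln.strip()]

    entries = []
    for raw in lines[-max_lines:]:
        try:
            obj = json.loads(raw)
        except ValueError:                # malformed JSON or bad encoding
            continue
        if isinstance(obj, dict):
            entries.append(obj)
    return entries
```

Because the window is bounded from the end of the file, callers that need the first record of a long log (for example, to compute a start timestamp) would have to read it separately.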
+
+### IMP-2: State Integration (Python)
+
+**Fulfills:** AC-10
+
+```python
+# In amc_server/mixins/state.py, within _collect_sessions():
+
+# After computing is_dead, add:
+if data.get("agent") == "claude":
+    subagent_count, subagent_running = self._get_subagent_counts(
+        data.get("session_id", ""),
+        data.get("project_dir", "")
+    )
+    if subagent_count > 0:
+        data["subagent_count"] = subagent_count
+        data["subagent_running_count"] = subagent_running
+```
+
+### IMP-3: Subagents Endpoint (Python)
+
+**Fulfills:** AC-11, AC-12
+
+```python
+# In amc_server/mixins/http.py, add route handling:
+
+def _route_request(self):
+    # ... existing routes ...
+
+    # /api/sessions/{id}/subagents
+    subagent_match = re.match(r"^/api/sessions/([^/]+)/subagents$", self.path)
+    if subagent_match:
+        session_id = subagent_match.group(1)
+        self._serve_subagents(session_id)
+        return
+
+def _serve_subagents(self, session_id):
+    """Serve subagent list for a specific session."""
+    # Find session to get project_dir and is_dead
+    session_file = SESSIONS_DIR / f"{session_id}.json"
+    if not session_file.exists():
+        self._send_json(404, {"error": "Session not found"})
+        return
+
+    try:
+        session_data = json.loads(session_file.read_text())
+    except (json.JSONDecodeError, OSError):
+        self._send_json(404, {"error": "Session not found"})
+        return
+
+    if session_data.get("agent") != "claude":
+        self._send_json(200, {"subagents": []})
+        return
+
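+    # NOTE: IMP-2 describes is_dead as computed in _collect_sessions(); if it is
+    # not also persisted to the session file, this lookup always yields False.
+    parent_is_dead = session_data.get("is_dead", False)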
+    subagents = self._get_subagent_list(
+        session_id,
+        session_data.get("project_dir", ""),
+        parent_is_dead
+    )
+    self._send_json(200, {"subagents": subagents})
+```
+
+### IMP-4: SubagentButton Component (JavaScript)
+
+**Fulfills:** AC-14, AC-15, AC-16, AC-17, AC-18, AC-19, AC-20, AC-21, AC-22
+
+```javascript
+// dashboard/components/SubagentButton.js
+
+import { html, useState, useEffect, useRef } from '../lib/preact.js';
+import { fetchSubagents } from '../utils/api.js';
+
+export function SubagentButton({ sessionId, count, runningCount }) {
+  const [isOpen, setIsOpen] = useState(false);
+  const [subagents, setSubagents] = useState([]);
+  const buttonRef = useRef(null);
+  const popoverRef = useRef(null);
+
+  // Format button label
+  const label = runningCount > 0
+    ? `${count} agents (${runningCount} running)`
+    : `${count} agents`;
+
+  // Poll while open
+  useEffect(() => {
+    if (!isOpen) return;
+
+    const fetchData = async () => {
+      const data = await fetchSubagents(sessionId);
+      if (data?.subagents) {
+        setSubagents(data.subagents);
+      }
+    };
+
+    fetchData();
+    const interval = setInterval(fetchData, 2000);
+    return () => clearInterval(interval);
+  }, [isOpen, sessionId]);
+
+  // Close on outside click or Escape
+  useEffect(() => {
+    if (!isOpen) return;
+
+    const handleClickOutside = (e) => {
+      if (popoverRef.current && !popoverRef.current.contains(e.target) &&
+          buttonRef.current && !buttonRef.current.contains(e.target)) {
+        setIsOpen(false);
+      }
+    };
+
+    const handleEscape = (e) => {
+      if (e.key === 'Escape') setIsOpen(false);
+    };
+
+    document.addEventListener('mousedown', handleClickOutside);
+    document.addEventListener('keydown', handleEscape);
+    return () => {
+      document.removeEventListener('mousedown', handleClickOutside);
+      document.removeEventListener('keydown', handleEscape);
+    };
+  }, [isOpen]);
+
+  const formatDuration = (ms) => {
+    const sec = Math.floor(ms / 1000);
+    if (sec < 60) return `${sec}s`;
+    const min = Math.floor(sec / 60);
+    return `${min}m`;
+  };
+
+  // Parameter named `n` to avoid shadowing the `count` prop.
+  const formatTokens = (n) => {
+    if (n >= 1000) return `${(n / 1000).toFixed(1)}k`;
+    return String(n);
+  };
+
+  return html`
+    <div class="relative">
+      <button
+        ref=${buttonRef}
+        onClick=${() => setIsOpen(!isOpen)}
+        class="rounded-lg border border-selection/80 bg-bg/45 px-2.5 py-1 font-mono text-label text-dim hover:border-starting/50 hover:text-bright transition-colors"
+      >
+        ${label}
+      </button>
+
+      ${isOpen && html`
+        <div
+          ref=${popoverRef}
+          class="absolute right-0 top-full mt-2 z-50 min-w-[280px] rounded-lg border border-selection/80 bg-surface shadow-lg"
+        >
+          <div class="p-2">
+            ${subagents.length === 0 ? html`
+              <div class="text-center text-dim text-sm py-4">Loading...</div>
+            ` : subagents.map(agent => html`
+              <div class="flex items-center gap-3 px-3 py-2 rounded hover:bg-bg/40">
+                <span class="w-2 h-2 rounded-full ${agent.status === 'running' ? 'bg-active' : 'border border-dim'}"></span>
+                <span class="flex-1 font-mono text-sm text-bright truncate">${agent.name}</span>
+                <span class="font-mono text-label text-dim">${formatDuration(agent.duration_ms)}</span>
+                <span class="font-mono text-label text-dim">${formatTokens(agent.tokens)}</span>
+              </div>
+            `)}
+          </div>
+        </div>
+      `}
+    </div>
+  `;
+}
+```
+
+### IMP-5: SessionCard Status Area Update (JavaScript)
+
+**Fulfills:** AC-13, AC-14, AC-15
+
+```javascript
+// In dashboard/components/SessionCard.js, update the Session Status Area:
+
+// Replace the contextUsage badge with plain text + SubagentButton
+
+<!-- Session Status Area -->
+<div class="flex items-center justify-between gap-3 px-4 py-2 border-b border-selection/50 bg-bg/60">
+  <${AgentActivityIndicator} session=${session} />
+  <div class="flex items-center gap-3">
+    ${contextUsage && html`
+      <span class="font-mono text-label text-dim" title=${contextUsage.title}>
+        ${contextUsage.headline}
+      </span>
+    `}
+    ${session.subagent_count > 0 && session.agent === 'claude' && html`
+      <${SubagentButton}
+        sessionId=${session.session_id}
+        count=${session.subagent_count}
+        runningCount=${session.subagent_running_count || 0}
+      />
+    `}
+  </div>
+</div>
+```
+
+### IMP-6: API Function (JavaScript)
+
+**Fulfills:** AC-20
+
+```javascript
+// In dashboard/utils/api.js, add:
+
+export async function fetchSubagents(sessionId) {
+  try {
+    const response = await fetch(`/api/sessions/${encodeURIComponent(sessionId)}/subagents`);
+    if (!response.ok) return null;
+    return await response.json();
+  } catch (e) {
+    console.error('Failed to fetch subagents:', e);
+    return null;
+  }
+}
+```
+
+---
+
+## Rollout Slices
+
+### Slice 1: Backend Discovery (AC-1, AC-2, AC-4, AC-5, AC-6, AC-7, AC-8, AC-9, AC-10)
+- Create `amc_server/mixins/subagent.py` with discovery and stats logic
+- Integrate into `state.py` to add counts to session payload
+- Unit tests for name resolution, status detection, token summing
+
+### Slice 2: Backend Endpoint (AC-11, AC-12)
+- Add `/api/sessions/{id}/subagents` route
+- Return 404 for missing sessions, empty list for non-Claude sessions (e.g. Codex)
+- Integration test with real session data
+
+### Slice 3: Frontend Button (AC-13, AC-14, AC-15, AC-16)
+- Update SessionCard status area layout
+- Create SubagentButton component with label logic
+- Test: button shows when count > 0, hidden when 0
+
+### Slice 4: Frontend Popover (AC-17, AC-18, AC-19, AC-20, AC-21, AC-22)
+- Add popover with polling
+- Style running/completed indicators
+- Test: popover opens, polls, closes on outside click/Escape