From c5b1fb3a803925e97726a7fbb665fa469769094d Mon Sep 17 00:00:00 2001 From: teernisse Date: Fri, 6 Mar 2026 14:51:28 -0500 Subject: [PATCH] chore(plans): add input-history and model-selection plans plans/input-history.md: - Implementation plan for shell-style up/down arrow message history in SimpleInput, deriving history from session log conversation data - Covers prop threading, history derivation, navigation state, keybinding details, modal parity, and test cases plans/model-selection.md: - Three-phase plan for model visibility and control: display current model, model picker at spawn, mid-session model switching via Zellij plans/PLAN-tool-result-display.md: - Updates to tool result display plan (pre-existing changes) plans/subagent-visibility.md: - Updates to subagent visibility plan (pre-existing changes) Co-Authored-By: Claude Opus 4.6 --- plans/PLAN-tool-result-display.md | 185 +++++++---- plans/input-history.md | 96 ++++++ plans/model-selection.md | 51 +++ plans/subagent-visibility.md | 522 ++++++++++++++++++++++++++++-- 4 files changed, 765 insertions(+), 89 deletions(-) create mode 100644 plans/input-history.md create mode 100644 plans/model-selection.md diff --git a/plans/PLAN-tool-result-display.md b/plans/PLAN-tool-result-display.md index 494dcdf..a27fef6 100644 --- a/plans/PLAN-tool-result-display.md +++ b/plans/PLAN-tool-result-display.md @@ -20,6 +20,8 @@ Add the ability to view tool call results (diffs, bash output, file contents) di - Copy-to-clipboard functionality - Virtual scrolling / performance optimization - Editor integration (clicking paths to open files) +- Accessibility (keyboard navigation, focus management, ARIA labels — deferred to v2) +- Lazy-fetch API for tool results (consider for v2 if payload size becomes an issue) --- @@ -61,44 +63,46 @@ Add the ability to view tool call results (diffs, bash output, file contents) di - **AC-1:** Tool calls render as expandable elements showing tool name and summary - **AC-2:** Clicking a 
collapsed tool call expands to show its result - **AC-3:** Clicking an expanded tool call collapses it -- **AC-4:** Tool results in the most recent assistant message are expanded by default -- **AC-5:** When a new assistant message arrives, previous tool results collapse -- **AC-6:** Edit and Write tool diffs remain expanded regardless of message age -- **AC-7:** Tool calls without results display as non-expandable with muted styling +- **AC-4:** In active sessions, tool results in the most recent assistant message are expanded by default +- **AC-5:** When a new assistant message arrives, previous non-diff tool results collapse unless the user has manually toggled them in that message +- **AC-6:** Edit and Write results remain expanded regardless of message age or session status (even if Write only has confirmation text) +- **AC-7:** In completed sessions, all non-diff tool results start collapsed +- **AC-8:** Tool calls without results display as non-expandable with muted styling; in active sessions, pending tool calls show a spinner to distinguish in-progress from permanently missing ### Diff Rendering -- **AC-8:** Edit/Write results display structuredPatch data as syntax-highlighted diff -- **AC-9:** Diff additions render with VS Code dark theme green background (rgba(46, 160, 67, 0.15)) -- **AC-10:** Diff deletions render with VS Code dark theme red background (rgba(248, 81, 73, 0.15)) -- **AC-11:** Full file path displays above each diff block -- **AC-12:** Diff context lines use structuredPatch as-is (no recomputation) +- **AC-9:** Edit/Write results display structuredPatch data as syntax-highlighted diff; falls back to raw content text if structuredPatch is malformed or absent +- **AC-10:** Diff additions render with VS Code dark theme green background (rgba(46, 160, 67, 0.15)) +- **AC-11:** Diff deletions render with VS Code dark theme red background (rgba(248, 81, 73, 0.15)) +- **AC-12:** Full file path displays above each diff block +- **AC-13:** Diff 
context lines use structuredPatch as-is (no recomputation) ### Other Tool Types -- **AC-13:** Bash results display stdout in monospace, stderr separately if present -- **AC-14:** Read results display file content with syntax highlighting based on file extension -- **AC-15:** Grep/Glob results display file list with match counts -- **AC-16:** WebFetch results display URL and response summary +- **AC-14:** Bash results display stdout in monospace, stderr separately if present +- **AC-15:** Bash output with ANSI escape codes renders as colored HTML (via ansi_up) +- **AC-16:** Read results display file content with syntax highlighting based on file extension +- **AC-17:** Grep/Glob results display file list with match counts +- **AC-18:** Unknown tools (WebFetch, Task, etc.) use GenericResult fallback showing raw content ### Truncation -- **AC-17:** Long outputs truncate at thresholds matching Claude Code behavior -- **AC-18:** Truncated outputs show "Show full output (N lines)" link -- **AC-19:** Clicking "Show full output" opens a dedicated lightweight modal -- **AC-20:** Modal displays full content with syntax highlighting, scrollable +- **AC-19:** Long outputs truncate at configurable line/character thresholds (defaults tuned to approximate Claude Code behavior) +- **AC-20:** Truncated outputs show "Show full output (N lines)" link +- **AC-21:** Clicking "Show full output" opens a dedicated lightweight modal +- **AC-22:** Modal displays full content with syntax highlighting, scrollable ### Error States -- **AC-21:** Failed tool calls display with red-tinted background -- **AC-22:** Error content (stderr, error messages) is clearly distinguishable from success content -- **AC-23:** is_error flag from tool_result determines error state +- **AC-23:** Failed tool calls display with red-tinted background +- **AC-24:** Error content (stderr, error messages) is clearly distinguishable from success content +- **AC-25:** is_error flag from tool_result determines error state 
### API Contract -- **AC-24:** /api/conversation response includes tool results nested in tool_calls -- **AC-25:** Each tool_call has: name, id, input, result (when available) -- **AC-26:** Result structure varies by tool type (documented in IMP-SERVER) +- **AC-26:** /api/conversation response includes tool results nested in tool_calls +- **AC-27:** Each tool_call has: name, id, input, result (when available) +- **AC-28:** All tool results conform to a normalized envelope: `{ kind, status, content, is_error }` with tool-specific fields nested in `content` --- @@ -130,6 +134,23 @@ Full output can be thousands of lines. Inline expansion would: A modal provides a focused reading experience without disrupting conversation layout. +### Why a Normalized Result Contract + +Raw `toolUseResult` shapes vary wildly by tool type — Edit has `structuredPatch`, Bash has `stdout`/`stderr`, Glob has `filenames`. Passing these raw to the frontend means every renderer must know the exact JSONL format, and adding Codex support (v2) would require duplicating all that branching. + +Instead, the server normalizes each result into a stable envelope: + +```python +{ + "kind": "diff" | "bash" | "file_content" | "file_list" | "generic", + "status": "success" | "error" | "pending", + "is_error": bool, + "content": { ... } # tool-specific fields, documented per kind +} +``` + +The frontend switches on `kind` (5 cases) rather than tool name (unbounded). This also gives us a clean seam for the `result_mode` query parameter if payload size becomes an issue later. + ### Component Structure ``` @@ -157,7 +178,7 @@ FullOutputModal (new, top-level) ### IMP-SERVER: Parse and Attach Tool Results -**Fulfills:** AC-24, AC-25, AC-26 +**Fulfills:** AC-26, AC-27, AC-28 **Location:** `amc_server/mixins/conversation.py` @@ -167,38 +188,43 @@ Two-pass parsing: 1. First pass: Scan all entries, build map of `tool_use_id` → `toolUseResult` 2. 
Second pass: Parse messages as before, but when encountering `tool_use`, lookup and attach result -**Tool call schema after change:** +**API query parameter:** `/api/conversation?result_mode=full` (default). Future option: `result_mode=preview` to return truncated previews and reduce payload size without an API-breaking change. + +**Normalization step:** After looking up the raw `toolUseResult`, the server normalizes it into the stable envelope before attaching: + ```python { "name": "Edit", "id": "toolu_abc123", "input": {"file_path": "...", "old_string": "...", "new_string": "..."}, "result": { - "content": "The file has been updated successfully.", + "kind": "diff", + "status": "success", "is_error": False, - "structuredPatch": [...], - "filePath": "...", - # ... other fields from toolUseResult + "content": { + "structuredPatch": [...], + "filePath": "...", + "text": "The file has been updated successfully." + } } } ``` -**Result Structure by Tool Type:** +**Normalized `kind` mapping:** -| Tool | Result Fields | -|------|---------------| -| Edit | `structuredPatch`, `filePath`, `oldString`, `newString` | -| Write | `filePath`, content confirmation | -| Read | `file`, `type`, content in `content` field | -| Bash | `stdout`, `stderr`, `interrupted` | -| Glob | `filenames`, `numFiles`, `truncated` | -| Grep | `content`, `filenames`, `numFiles`, `numLines` | +| kind | Source Tools | `content` Fields | +|------|-------------|-----------------| +| `diff` | Edit, Write | `structuredPatch`, `filePath`, `text` | +| `bash` | Bash | `stdout`, `stderr`, `interrupted` | +| `file_content` | Read | `file`, `type`, `text` | +| `file_list` | Glob, Grep | `filenames`, `numFiles`, `truncated`, `numLines` | +| `generic` | All others | `text` (raw content string) | --- ### IMP-TOOLCALL: Expandable Tool Call Component -**Fulfills:** AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7 +**Fulfills:** AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7, AC-8 **Location:** `dashboard/lib/markdown.js` 
(refactor `renderToolCalls`) @@ -213,16 +239,21 @@ Renders a single tool call with: **State Management:** -Track expanded state per message. When new assistant message arrives: +Track two sets per message: `autoExpanded` (system-controlled) and `userToggled` (manual clicks). + +When new assistant message arrives: - Compare latest assistant message ID to stored ID -- If different, reset expanded set to empty +- If different, reset `autoExpanded` to empty for previous messages +- `userToggled` entries are never reset — user intent is preserved - Edit/Write tools bypass this logic (always expanded via CSS/logic) +Expand/collapse logic: a tool call is expanded if it is in `userToggled` (explicit click) OR in `autoExpanded` (latest message) OR is Edit/Write kind. + --- ### IMP-DIFF: Diff Rendering Component -**Fulfills:** AC-8, AC-9, AC-10, AC-11, AC-12 +**Fulfills:** AC-9, AC-10, AC-11, AC-12, AC-13 **Location:** `dashboard/lib/markdown.js` (new function `renderDiff`) @@ -234,12 +265,13 @@ hljs.registerLanguage('diff', langDiff); **Diff Renderer:** -1. Convert `structuredPatch` array to unified diff text: +1. If `structuredPatch` is present and valid, convert to unified diff text: - Each hunk: `@@ -oldStart,oldLines +newStart,newLines @@` - Followed by hunk.lines array -2. Syntax highlight with hljs diff language -3. Sanitize with DOMPurify before rendering -4. Wrap in container with file path header +2. If `structuredPatch` is missing or malformed, fall back to raw `content.text` in a monospace block +3. Syntax highlight with hljs diff language +4. Sanitize with DOMPurify before rendering +5. 
Wrap in container with file path header **CSS styling:** - Container: dark border, rounded corners @@ -252,22 +284,33 @@ hljs.registerLanguage('diff', langDiff); ### IMP-BASH: Bash Output Component -**Fulfills:** AC-13, AC-21, AC-22 +**Fulfills:** AC-14, AC-15, AC-23, AC-24 **Location:** `dashboard/lib/markdown.js` (new function `renderBashResult`) -Renders: -- `stdout` in monospace pre block +**ANSI-to-HTML conversion:** +```javascript +import AnsiUp from 'https://esm.sh/ansi_up'; +const ansi = new AnsiUp(); +const html = ansi.ansi_to_html(bashOutput); +``` + +The `ansi_up` library (zero dependencies, ~8KB) converts ANSI escape codes to styled HTML spans, preserving colored test output, progress indicators, and error highlighting from CLI tools. + +**Renders:** +- `stdout` in monospace pre block with ANSI colors preserved - `stderr` in separate block with error styling (if present) - "Command interrupted" notice (if interrupted flag) +**Sanitization order (CRITICAL):** First convert ANSI to HTML via ansi_up, THEN sanitize with DOMPurify. Sanitizing before conversion would strip escape codes; sanitizing after preserves the styled spans while preventing XSS. + Error state: `is_error` or presence of stderr triggers error styling (red tint, left border). --- ### IMP-TRUNCATE: Output Truncation -**Fulfills:** AC-17, AC-18 +**Fulfills:** AC-19, AC-20 **Truncation Thresholds (match Claude Code):** @@ -289,7 +332,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r ### IMP-MODAL: Full Output Modal -**Fulfills:** AC-19, AC-20 +**Fulfills:** AC-21, AC-22 **Location:** `dashboard/components/FullOutputModal.js` (new file) @@ -305,7 +348,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. 
If truncated, r ### IMP-ERROR: Error State Styling -**Fulfills:** AC-21, AC-22, AC-23 +**Fulfills:** AC-23, AC-24, AC-25 **Styling:** - Tool call header: red-tinted background when `result.is_error` @@ -331,17 +374,19 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r --- -### Slice 2: Server-Side Tool Result Parsing +### Slice 2: Server-Side Tool Result Parsing and Normalization -**Goal:** API returns tool results nested in tool_calls +**Goal:** API returns normalized tool results nested in tool_calls **Deliverables:** 1. Two-pass parsing in `_parse_claude_conversation` -2. Tool results attached with `id` field -3. Unit tests for result attachment -4. Handle missing results gracefully (return tool_call without result) +2. Normalization layer: raw `toolUseResult` → `{ kind, status, is_error, content }` envelope +3. Tool results attached with `id` field +4. Unit tests for result attachment and normalization per tool type +5. Handle missing results gracefully (return tool_call without result) +6. Support `result_mode=full` query parameter (only mode for now, but wired up for future `preview`) -**Exit Criteria:** AC-24, AC-25, AC-26 pass +**Exit Criteria:** AC-26, AC-27, AC-28 pass --- @@ -356,7 +401,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r 4. Collapse on new assistant message 5. Keep Edit/Write always expanded -**Exit Criteria:** AC-1 through AC-7 pass +**Exit Criteria:** AC-1 through AC-8 pass --- @@ -370,7 +415,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r 3. VS Code dark theme styling 4. Full file path header -**Exit Criteria:** AC-8 through AC-12 pass +**Exit Criteria:** AC-9 through AC-13 pass --- @@ -379,12 +424,13 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r **Goal:** Bash, Read, Glob, Grep render appropriately **Deliverables:** -1. `renderBashResult` with stdout/stderr separation -2. 
`renderFileContent` for Read -3. `renderFileList` for Glob/Grep -4. Generic fallback for unknown tools +1. Import and configure `ansi_up` for ANSI-to-HTML conversion +2. `renderBashResult` with stdout/stderr separation and ANSI color preservation +3. `renderFileContent` for Read +4. `renderFileList` for Glob/Grep +5. `GenericResult` fallback for unknown tools (WebFetch, Task, etc.) -**Exit Criteria:** AC-13 through AC-16 pass +**Exit Criteria:** AC-14 through AC-18 pass --- @@ -398,7 +444,7 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r 3. `FullOutputModal` component 4. Syntax highlighting in modal -**Exit Criteria:** AC-17 through AC-20 pass +**Exit Criteria:** AC-19 through AC-22 pass --- @@ -412,15 +458,16 @@ Takes content string, returns `{ text, truncated, totalLines }`. If truncated, r 3. Test with interrupted sessions 4. Cross-browser testing -**Exit Criteria:** AC-21 through AC-23 pass, feature complete +**Exit Criteria:** AC-23 through AC-25 pass, feature complete --- ## Open Questions -1. **Exact Claude Code truncation thresholds** — need to verify against Claude Code source or experiment +1. ~~**Exact Claude Code truncation thresholds**~~ — **Resolved:** using reasonable defaults with a note to tune via testing. AC-19 updated. 2. **Performance with 100+ tool calls** — monitor after ship, optimize if needed -3. **Codex support timeline** — when should we prioritize v2? +3. **Codex support timeline** — when should we prioritize v2? The normalized `kind` contract makes this easier: add Codex normalizers without touching renderers. +4. ~~**Lazy-fetch for large payloads**~~ — **Resolved:** `result_mode` query parameter wired into API contract. Only `full` implemented in v1; `preview` deferred. 
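The point in question 3 (new normalizers without touching renderers) can be illustrated with a small dispatcher keyed on the normalized `kind`. This is a sketch under assumptions: the renderer bodies below are placeholder stubs that return tagged strings instead of sanitized HTML, and the `renderers` table and `renderToolResult` name are hypothetical, not part of the plan.

```javascript
// Hypothetical stand-ins for the renderers named in the plan. Real renderers
// would return sanitized HTML; these return tagged strings for illustration.
const renderers = {
  diff: (content) => `diff:${content.filePath}`,
  bash: (content) => `bash:${content.stdout}`,
  file_content: (content) => `file:${content.text}`,
  file_list: (content) => `list:${(content.filenames || []).length}`,
  generic: (content) => `generic:${content.text}`,
};

// Dispatch on the normalized `kind`, never on the tool name.
// Unknown or missing kinds fall back to the generic renderer.
function renderToolResult(result) {
  if (!result || result.status === 'pending') return null; // nothing to show yet
  const known = Object.prototype.hasOwnProperty.call(renderers, result.kind);
  const render = known ? renderers[result.kind] : renderers.generic;
  return render(result.content || {});
}
```

With this shape, supporting another agent's logs would only require a server-side normalizer that emits one of the five kinds; the table above stays unchanged.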
--- diff --git a/plans/input-history.md b/plans/input-history.md new file mode 100644 index 0000000..019b643 --- /dev/null +++ b/plans/input-history.md @@ -0,0 +1,96 @@ +# Input History (Up/Down Arrow) + +## Summary + +Add shell-style up/down arrow navigation through past messages in SimpleInput. History is derived from the conversation data already parsed from session logs -- no new state management, no server changes. + +## How It Works Today + +1. Server parses JSONL session logs, extracts user messages with `role: "user"` (`conversation.py:57-66`) +2. App.js stores parsed conversations in `conversations` state, refreshed via SSE on `conversation_mtime_ns` change +3. SessionCard receives `conversation` as a prop but does **not** pass it to SimpleInput +4. SimpleInput has no awareness of past messages + +## Step 1: Pipe Conversation to SimpleInput + +Pass the conversation array from SessionCard into SimpleInput so it can derive history. + +- `SessionCard.js:165-169` -- add `conversation` prop to SimpleInput +- Same for the QuestionBlock path if freeform input is used there (line 162) -- skip for now, QuestionBlock is option-based + +**Files**: `dashboard/components/SessionCard.js` + +## Step 2: Derive User Message History + +Inside SimpleInput, filter conversation to user messages only. + +```js +const userHistory = useMemo( + () => (conversation || []).filter(m => m.role === 'user').map(m => m.content), + [conversation] +); +``` + +This updates automatically whenever the session log changes (SSE triggers conversation refresh, new prop flows down). + +**Files**: `dashboard/components/SimpleInput.js` + +## Step 3: History Navigation State + +Add refs for tracking position in history and preserving the draft. + +```js +const historyIndexRef = useRef(-1); // -1 = not browsing +const draftRef = useRef(''); // saves in-progress text before browsing +``` + +Use refs (not state) because index changes don't need re-renders -- only `setText` triggers the visual update. 
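To make the role of the two refs concrete, here is a minimal sketch of the index and draft bookkeeping, written as plain functions so it runs outside Preact. The `nav` object stands in for `historyIndexRef` and `draftRef`; the function names and the convention that index 0 means the most recent message are assumptions for illustration only.

```javascript
// `nav` mirrors the two refs: index -1 means "not browsing".
// Assumption: index counts back from the newest message (0 = most recent).
function historyUp(nav, userHistory, currentText) {
  if (userHistory.length === 0) return currentText;     // nothing to browse
  if (nav.index === -1) nav.draft = currentText;         // save draft on first press
  nav.index = Math.min(nav.index + 1, userHistory.length - 1);
  return userHistory[userHistory.length - 1 - nav.index];
}

function historyDown(nav, userHistory, currentText) {
  if (nav.index === -1) return currentText;              // not browsing: no-op
  nav.index -= 1;
  if (nav.index === -1) return nav.draft;                // past newest: restore draft
  return userHistory[userHistory.length - 1 - nav.index];
}
```

In the component, the returned string would be passed to `setText()`, which is the only step that triggers a re-render.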
+ +**Files**: `dashboard/components/SimpleInput.js` + +## Step 4: ArrowUp/ArrowDown Keybinding + +In the `onKeyDown` handler (after the autocomplete block, before Enter-to-submit), add history navigation: + +- **ArrowUp**: only when autocomplete is closed AND cursor is at position 0 (prevents hijacking multiline cursor movement). On first press, save current text to `draftRef`. Walk backward through `userHistory`. Call `setText()` with the history entry. +- **ArrowDown**: walk forward through history. If past the newest entry, restore `draftRef` and reset index to -1. +- **Reset on submit**: set `historyIndexRef.current = -1` in `handleSubmit` after successful send. +- **Reset on manual edit**: in `onInput`, reset `historyIndexRef.current = -1` so typing after browsing exits history mode. + +### Cursor position check + +```js +const atStart = e.target.selectionStart === 0 && e.target.selectionEnd === 0; +``` + +Only intercept ArrowUp when `atStart` is true. This lets multiline text cursor movement work normally. ArrowDown can use similar logic (check if cursor is at end of text) or always navigate history when `historyIndexRef.current !== -1` (already browsing). + +**Files**: `dashboard/components/SimpleInput.js` + +## Step 5: Modal Parity + +The Modal (`Modal.js:71`) also renders SimpleInput with `onRespond`. Verify it passes `conversation` through. The same SessionCard is used in enlarged mode, so this should work automatically if Step 1 is done correctly. 
+ +**Files**: `dashboard/components/Modal.js` (verify, likely no change needed) + +## Non-Goals + +- No localStorage persistence -- history comes from session logs which survive across page reloads +- No server changes -- conversation parsing already extracts what we need +- No new API endpoints +- No changes to QuestionBlock (option-based, not free-text history) + +## Test Cases + +| Scenario | Expected | +|----------|----------| +| Press up with empty input | Fills with most recent user message | +| Press up multiple times | Walks backward through user messages | +| Press down after browsing up | Walks forward; past newest restores draft | +| Press up with text in input | Saves text as draft, shows history | +| Press down past end | Restores saved draft | +| Type after browsing | Exits history mode (index resets) | +| Submit after browsing | Sends displayed text, resets index | +| Up arrow in multiline text (cursor not at pos 0) | Normal cursor movement, no history | +| New message arrives via SSE | userHistory updates, no index disruption | +| Session with no prior messages | Up arrow does nothing | diff --git a/plans/model-selection.md b/plans/model-selection.md new file mode 100644 index 0000000..95b9b5e --- /dev/null +++ b/plans/model-selection.md @@ -0,0 +1,51 @@ +# Model Selection & Display + +## Summary + +Add model visibility and control to the AMC dashboard. Users can see which model each agent is running, pick a model when spawning, and switch models mid-session. + +## Models + +| Label | Value sent to Claude Code | +|-------|--------------------------| +| Opus 4.6 | `opus` | +| Opus 4.5 | `claude-opus-4-5-20251101` | +| Sonnet 4.6 | `sonnet` | +| Haiku | `haiku` | + +## Step 1: Display Current Model + +Surface `context_usage.model` in `SessionCard.js`. 
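A display formatter for this value might look like the sketch below. The alias labels are taken from the Models table above; the `formatModelName` name and the regex for dated identifiers are assumptions, and the strings actually reported in `context_usage.model` may differ from the aliases sent at spawn.

```javascript
// Labels for the short aliases listed in the Models table.
const MODEL_LABELS = {
  opus: 'Opus 4.6',
  sonnet: 'Sonnet 4.6',
  haiku: 'Haiku',
};

// Assumption: full identifiers look like "claude-<family>-<major>-<minor>-<date>".
function formatModelName(model) {
  if (!model) return '';                                  // no assistant message yet
  if (Object.prototype.hasOwnProperty.call(MODEL_LABELS, model)) {
    return MODEL_LABELS[model];
  }
  const m = /^claude-([a-z]+)-(\d+)-(\d+)-\d{8}$/.exec(model);
  if (!m) return model;                                   // unknown shape: show raw value
  const family = m[1][0].toUpperCase() + m[1].slice(1);
  return `${family} ${m[2]}.${m[3]}`;
}
```

Returning the raw string for unrecognized identifiers keeps the card informative when a model outside the table appears.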
+ +- Data already extracted by `parsing.py` (line 202) from conversation JSONL +- Already available via `/api/state` in `context_usage.model` +- Add model name formatter: `claude-opus-4-5-20251101` -> `Opus 4.5` +- Show in SessionCard (near agent badge or context usage area) +- Shows `null` until first assistant message + +**Files**: `dashboard/components/SessionCard.js` + +## Step 2: Model Picker at Spawn + +Add model dropdown to `SpawnModal.js`. Pass to spawn API, which appends `--model {model}` to the claude command. + +- Extend `/api/spawn` to accept optional `model` param +- Validate against allowed model list +- Prepend `--model {model}` to command in `AGENT_COMMANDS` +- Default: no flag (uses Claude Code's default) + +**Files**: `dashboard/components/SpawnModal.js`, `amc_server/mixins/spawn.py` + +## Step 3: Mid-Session Model Switch + +Dropdown on SessionCard to change model for running sessions via Zellij. + +- Send `/model {value}` to the agent's Zellij pane: + ```bash + zellij -s {session} action write-chars "/model {value}" --pane-id {pane} + zellij -s {session} action write 10 --pane-id {pane} + ``` +- New endpoint: `POST /api/session/{id}/model` with `{"model": "opus"}` +- Applies immediately only when the agent is idle (waiting for input). If sent mid-turn, the command queues and applies after the turn ends. + +**Files**: `dashboard/components/SessionCard.js`, `amc_server/mixins/state.py` (or new mixin) diff --git a/plans/subagent-visibility.md b/plans/subagent-visibility.md index 878b89a..b61c9ab 100644 --- a/plans/subagent-visibility.md +++ b/plans/subagent-visibility.md @@ -1,26 +1,27 @@ # Subagent & Agent Team Visibility for AMC > **Status**: Draft -> **Last Updated**: 2026-02-27 +> **Last Updated**: 2026-03-02 ## Summary -Add a button in the turn stats section showing the count of active subagents/team members. Clicking it opens a list with names and lifetime stats (time taken, tokens used). Mirrors Claude Code's own agent display. 
+Add visibility into Claude Code subagents (Task tool spawns and team members) within AMC session cards. A pill button shows active agent count; clicking opens a popover with names, status, and stats. Claude-only (Codex does not support subagents). --- ## User Workflow 1. User views a session card in AMC -2. Turn stats area shows: `2h 15m | 84k tokens | 3 agents` +2. Session status area shows: `[●] Working 2m 15s · 42k tokens 32% ctx [3 agents]` 3. User clicks "3 agents" button -4. List opens showing: +4. Popover opens showing: ``` - claude-code-guide (running) 12m 42,000 tokens - Explore (completed) 3m 18,500 tokens - Explore (completed) 5m 23,500 tokens + Explore-a250de ● running 12m 42,000 tokens + code-reviewer ○ completed 3m 18,500 tokens + action-wirer ○ completed 5m 23,500 tokens ``` -5. List updates in real-time as agents complete +5. Popover auto-updates every 2s while open +6. Button hidden when session has no subagents --- @@ -28,24 +29,505 @@ Add a button in the turn stats section showing the count of active subagents/tea ### Discovery -- **AC-1**: Subagent JSONL files discovered at `{session_dir}/subagents/agent-*.jsonl` -- **AC-2**: Both regular subagents (Task tool) and team members (Task with `team_name`) are discovered from same location +- **AC-1**: Subagent JSONL files discovered for Claude sessions at `{claude_projects}/{encoded_project_dir}/{session_id}/subagents/agent-*.jsonl` +- **AC-2**: Team members discovered from same location (team spawning uses Task tool, stores in subagents dir) +- **AC-3**: Codex sessions do not show subagent button (Codex does not support subagents) ### Status Detection -- **AC-3**: Subagent is "running" if: parent session is alive AND last assistant entry has `stop_reason != "end_turn"` -- **AC-4**: Subagent is "completed" if: last assistant entry has `stop_reason == "end_turn"` OR parent session is dead +- **AC-4**: Subagent is "running" if parent session is not dead AND last assistant entry has `stop_reason != 
"end_turn"` +- **AC-5**: Subagent is "completed" if last assistant entry has `stop_reason == "end_turn"` OR parent session is dead + +### Name Resolution + +- **AC-6**: Team member names extracted from agentId format `{name}@{team_name}` (O(1) string split) +- **AC-7**: Non-team subagent names generated as `agent-{agentId_prefix}` (no parent session parsing required) ### Stats Extraction -- **AC-5**: Subagent name extracted from parent's Task tool invocation: use `name` if present (team member), else `subagent_type` -- **AC-6**: Lifetime duration = first entry timestamp to last entry timestamp (or now if running) -- **AC-7**: Lifetime tokens = sum of all assistant entries' `usage.input_tokens + usage.output_tokens` +- **AC-8**: Duration = first entry timestamp to last entry timestamp (or server time if running) +- **AC-9**: Tokens = sum of `input_tokens + output_tokens` from all assistant entries (excludes cache tokens) + +### API + +- **AC-10**: `/api/state` includes `subagent_count` and `subagent_running_count` for each Claude session +- **AC-11**: New endpoint `/api/sessions/{id}/subagents` returns full subagent list with name, status, duration_ms, tokens +- **AC-12**: Subagent endpoint supports session_id path param; returns 404 if session not found ### UI -- **AC-8**: Turn stats area shows agent count button when subagents exist -- **AC-9**: Button shows count + running indicator (e.g., "3 agents" or "2 agents (1 running)") -- **AC-10**: Clicking button opens popover with: name, status, duration, token count -- **AC-11**: Running agents show activity indicator -- **AC-12**: List updates via existing polling/SSE +- **AC-13**: Context usage displays as plain text (remove badge styling) +- **AC-14**: Agent count button appears as bordered pill to the right of context text +- **AC-15**: Button hidden when `subagent_count == 0` +- **AC-16**: Button shows running indicator: "3 agents" when none running, "3 agents (1 running)" when some running +- **AC-17**: Clicking 
button opens popover anchored to button +- **AC-18**: Popover shows list: name, status indicator, duration, token count per row +- **AC-19**: Running agents show filled indicator (●), completed show empty (○) +- **AC-20**: Popover polls `/api/sessions/{id}/subagents` every 2s while open +- **AC-21**: Popover closes on outside click or Escape key +- **AC-22**: Subagent rows are display-only (no click action in v1) + +--- + +## Architecture + +### Why This Structure + +| Decision | Rationale | Fulfills | +|----------|-----------|----------| +| Aggregate counts in `/api/state` + detail endpoint | Minimizes payload size; hash stability (counts change less than durations) | AC-10, AC-11 | +| Claude-only | Codex lacks subagent infrastructure | AC-3 | +| Name from agentId pattern | Avoids expensive parent session parsing; team names encoded in agentId | AC-6, AC-7 | +| Input+output tokens only | Matches "work done" mental model; simpler than cache tracking | AC-9 | +| Auto-poll in popover | Real-time feel consistent with session card updates | AC-20 | +| Hide button when empty | Reduces visual noise for sessions without agents | AC-15 | + +### Data Flow + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Backend (Python) │ +│ │ +│ _collect_sessions() │ +│ │ │ +│ ├── For each Claude session: │ +│ │ └── _count_subagents(session_id, project_dir) │ +│ │ ├── glob subagents/agent-*.jsonl │ +│ │ ├── count files, check running status │ +│ │ └── return (count, running_count) │ +│ │ │ +│ └── Attach subagent_count, subagent_running_count │ +│ │ +│ _serve_subagents(session_id) │ +│ ├── _get_claude_session_dir(session_id, project_dir) │ +│ ├── glob subagents/agent-*.jsonl │ +│ ├── For each file: │ +│ │ ├── Parse name from agentId │ +│ │ ├── Determine status from stop_reason │ +│ │ ├── Calculate duration from timestamps │ +│ │ └── Sum tokens from assistant usage │ +│ └── Return JSON list │ +│ │ 
+└─────────────────────────────────────────────────────────────────┘
+
+┌─────────────────────────────────────────────────────────────────┐
+│  Frontend (Preact)                                              │
+│                                                                 │
+│  SessionCard                                                    │
+│      │                                                          │
+│      ├── Session Status Area:                                   │
+│      │     ├── AgentActivityIndicator (left)                    │
+│      │     ├── Context text (center-right, plain)               │
+│      │     └── SubagentButton (far right, if count > 0)         │
+│      │                                                          │
+│      └── SubagentButton                                         │
+│            ├── Shows "{count} agents" or "{count} ({running})"  │
+│            ├── onClick: opens SubagentPopover                   │
+│            └── SubagentPopover                                  │
+│                  ├── Polls /api/sessions/{id}/subagents         │
+│                  ├── Renders list with status indicators        │
+│                  └── Closes on outside click or Escape          │
+│                                                                 │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+### File Changes
+
+| File | Change | ACs |
+|------|--------|-----|
+| `amc_server/mixins/subagent.py` | New mixin for subagent discovery and stats | AC-1,2,4-9 |
+| `amc_server/mixins/state.py` | Call subagent mixin, attach counts to session | AC-10 |
+| `amc_server/mixins/http.py` | Add route `/api/sessions/{id}/subagents` | AC-11,12 |
+| `amc_server/handler.py` | Add SubagentMixin to handler class | - |
+| `dashboard/components/SessionCard.js` | Update status area layout | AC-13,14 |
+| `dashboard/components/SubagentButton.js` | New component for button + popover | AC-15-22 |
+| `dashboard/utils/api.js` | Add `fetchSubagents(sessionId)` function | AC-20 |
+
+---
+
+## Implementation Specs
+
+### IMP-1: SubagentMixin (Python)
+
+**Fulfills:** AC-1, AC-2, AC-4, AC-5, AC-6, AC-7, AC-8, AC-9
+
+```python
+# amc_server/mixins/subagent.py
+
+from datetime import datetime, timezone
+from pathlib import Path
+
+# CLAUDE_PROJECTS_DIR and self._read_jsonl_tail_entries() are assumed to be
+# provided by the existing server modules/mixins.
+
+
+class SubagentMixin:
+    def _get_subagent_counts(self, session_id: str, project_dir: str) -> tuple[int, int]:
+        """Return (total_count, running_count) for a Claude session."""
+        subagents_dir = self._get_subagents_dir(session_id, project_dir)
+        if not subagents_dir or not subagents_dir.exists():
+            return (0, 0)
+
+        total = 0
+        running = 0
+        for jsonl_file in subagents_dir.glob("agent-*.jsonl"):
+            total += 1
+            if self._is_subagent_running(jsonl_file):
+                running += 1
+        return (total, running)
+
+    def _get_subagents_dir(self, session_id: str, project_dir: str) -> Path | None:
+        """Construct path to subagents directory."""
+        if not project_dir:
+            return None
+        encoded_dir = project_dir.replace("/", "-")
+        if not encoded_dir.startswith("-"):
+            encoded_dir = "-" + encoded_dir
+        return CLAUDE_PROJECTS_DIR / encoded_dir / session_id / "subagents"
+
+    def _is_subagent_running(self, jsonl_file: Path) -> bool:
+        """Check if subagent is still running based on last assistant stop_reason."""
+        try:
+            # Read last few lines to find last assistant entry
+            entries = self._read_jsonl_tail_entries(jsonl_file, max_lines=20)
+            for entry in reversed(entries):
+                if entry.get("type") == "assistant":
+                    stop_reason = entry.get("message", {}).get("stop_reason")
+                    return stop_reason != "end_turn"
+            return True  # No assistant entries yet = still starting
+        except Exception:
+            return False
+
+    def _get_subagent_list(self, session_id: str, project_dir: str, parent_is_dead: bool) -> list[dict]:
+        """Return full subagent list with stats."""
+        subagents_dir = self._get_subagents_dir(session_id, project_dir)
+        if not subagents_dir or not subagents_dir.exists():
+            return []
+
+        result = []
+        for jsonl_file in subagents_dir.glob("agent-*.jsonl"):
+            subagent = self._parse_subagent(jsonl_file, parent_is_dead)
+            if subagent:
+                result.append(subagent)
+
+        # Sort: running first, then by name
+        result.sort(key=lambda s: (0 if s["status"] == "running" else 1, s["name"]))
+        return result
+
+    def _parse_subagent(self, jsonl_file: Path, parent_is_dead: bool) -> dict | None:
+        """Parse a single subagent JSONL file."""
+        try:
+            entries = self._read_jsonl_tail_entries(jsonl_file, max_lines=500, max_bytes=512*1024)
+            if not entries:
+                return None
+
+            # Get agentId from the first entry read
+            first_entry = entries[0]
+            agent_id = first_entry.get("agentId", "")
+
+            # Resolve name
+            name = self._resolve_subagent_name(agent_id, jsonl_file)
+
+            # Determine status (a dead parent means nothing can still be running)
+            is_running = False
+            if not parent_is_dead:
+                for entry in reversed(entries):
+                    if entry.get("type") == "assistant":
+                        stop_reason = entry.get("message", {}).get("stop_reason")
+                        is_running = stop_reason != "end_turn"
+                        break
+            status = "running" if is_running else "completed"
+
+            # Calculate duration
+            first_ts = first_entry.get("timestamp")
+            last_ts = entries[-1].get("timestamp")
+            duration_ms = self._calculate_duration_ms(first_ts, last_ts, is_running)
+
+            # Sum tokens
+            tokens = self._sum_assistant_tokens(entries)
+
+            return {
+                "name": name,
+                "status": status,
+                "duration_ms": duration_ms,
+                "tokens": tokens,
+            }
+        except Exception:
+            return None
+
+    def _resolve_subagent_name(self, agent_id: str, jsonl_file: Path) -> str:
+        """Extract display name from agentId or filename."""
+        # Team members: "reviewer-wcja@surgical-sync" -> "reviewer-wcja"
+        if "@" in agent_id:
+            return agent_id.split("@")[0]
+
+        # Regular subagents: use prefix from agentId
+        # agent_id like "a250dec6325c589be" -> "agent-a250de"
+        prefix = agent_id[:6] if agent_id else "agent"
+
+        # Try to get subagent_type from filename if it contains it
+        # Filename: agent-acompact-b857538cac0d5172.jsonl -> might indicate "compact"
+        # For now, use generic fallback (jsonl_file is not consulted yet)
+        return f"agent-{prefix}"
+
+    def _calculate_duration_ms(self, first_ts: str | None, last_ts: str | None, is_running: bool) -> int:
+        """Calculate duration in milliseconds."""
+        if not first_ts:
+            return 0
+        try:
+            first = datetime.fromisoformat(first_ts.replace("Z", "+00:00"))
+            if is_running:
+                end = datetime.now(timezone.utc)
+            elif last_ts:
+                end = datetime.fromisoformat(last_ts.replace("Z", "+00:00"))
+            else:
+                return 0
+            return max(0, int((end - first).total_seconds() * 1000))
+        except Exception:
+            return 0
+
+    def _sum_assistant_tokens(self, entries: list[dict]) -> int:
+        """Sum input_tokens + output_tokens from all assistant entries."""
+        total = 0
+        for entry in entries:
+            if entry.get("type") != "assistant":
+                continue
+            usage = entry.get("message", {}).get("usage", {})
+            input_tok = usage.get("input_tokens", 0) or 0
+            output_tok = usage.get("output_tokens", 0) or 0
+            total += input_tok + output_tok
+        return total
+```
+
+### IMP-2: State Integration (Python)
+
+**Fulfills:** AC-10
+
+```python
+# In amc_server/mixins/state.py, within _collect_sessions():
+
+# After computing is_dead, add:
+if data.get("agent") == "claude":
+    subagent_count, subagent_running = self._get_subagent_counts(
+        data.get("session_id", ""),
+        data.get("project_dir", "")
+    )
+    if subagent_count > 0:
+        data["subagent_count"] = subagent_count
+        data["subagent_running_count"] = subagent_running
+```
+
+### IMP-3: Subagents Endpoint (Python)
+
+**Fulfills:** AC-11, AC-12
+
+```python
+# In amc_server/mixins/http.py, add route handling
+# (uses the standard-library `re` and `json` modules):
+
+def _route_request(self):
+    # ... existing routes ...
+
+    # /api/sessions/{id}/subagents
+    subagent_match = re.match(r"^/api/sessions/([^/]+)/subagents$", self.path)
+    if subagent_match:
+        session_id = subagent_match.group(1)
+        self._serve_subagents(session_id)
+        return
+
+def _serve_subagents(self, session_id):
+    """Serve subagent list for a specific session."""
+    # Find session to get project_dir and is_dead
+    session_file = SESSIONS_DIR / f"{session_id}.json"
+    if not session_file.exists():
+        self._send_json(404, {"error": "Session not found"})
+        return
+
+    try:
+        session_data = json.loads(session_file.read_text())
+    except (json.JSONDecodeError, OSError):
+        self._send_json(404, {"error": "Session not found"})
+        return
+
+    if session_data.get("agent") != "claude":
+        self._send_json(200, {"subagents": []})
+        return
+
+    parent_is_dead = session_data.get("is_dead", False)
+    subagents = self._get_subagent_list(
+        session_id,
+        session_data.get("project_dir", ""),
+        parent_is_dead
+    )
+    self._send_json(200, {"subagents": subagents})
+```
+
+### IMP-4: SubagentButton Component (JavaScript)
+
+**Fulfills:** AC-14, AC-15, AC-16, AC-17, AC-18, AC-19, AC-20, AC-21, AC-22
+
+```javascript
+// dashboard/components/SubagentButton.js
+
+import { html, useState, useEffect, useRef } from '../lib/preact.js';
+import { fetchSubagents } from '../utils/api.js';
+
+export function SubagentButton({ sessionId, count, runningCount }) {
+  const [isOpen, setIsOpen] = useState(false);
+  const [subagents, setSubagents] = useState([]);
+  const buttonRef = useRef(null);
+  const popoverRef = useRef(null);
+
+  // Format button label
+  const label = runningCount > 0
+    ? `${count} agents (${runningCount} running)`
+    : `${count} agents`;
+
+  // Poll while open
+  useEffect(() => {
+    if (!isOpen) return;
+
+    let cancelled = false;
+    const fetchData = async () => {
+      const data = await fetchSubagents(sessionId);
+      // Ignore responses that arrive after the popover has closed
+      if (!cancelled && data?.subagents) {
+        setSubagents(data.subagents);
+      }
+    };
+
+    fetchData();
+    const interval = setInterval(fetchData, 2000);
+    return () => {
+      cancelled = true;
+      clearInterval(interval);
+    };
+  }, [isOpen, sessionId]);
+
+  // Close on outside click or Escape
+  useEffect(() => {
+    if (!isOpen) return;
+
+    const handleClickOutside = (e) => {
+      if (popoverRef.current && !popoverRef.current.contains(e.target) &&
+          buttonRef.current && !buttonRef.current.contains(e.target)) {
+        setIsOpen(false);
+      }
+    };
+
+    const handleEscape = (e) => {
+      if (e.key === 'Escape') setIsOpen(false);
+    };
+
+    document.addEventListener('mousedown', handleClickOutside);
+    document.addEventListener('keydown', handleEscape);
+    return () => {
+      document.removeEventListener('mousedown', handleClickOutside);
+      document.removeEventListener('keydown', handleEscape);
+    };
+  }, [isOpen]);
+
+  const formatDuration = (ms) => {
+    const sec = Math.floor(ms / 1000);
+    if (sec < 60) return `${sec}s`;
+    const min = Math.floor(sec / 60);
+    return `${min}m`;
+  };
+
+  // Parameter named `tokens` to avoid shadowing the `count` prop
+  const formatTokens = (tokens) => {
+    if (tokens >= 1000) return `${(tokens / 1000).toFixed(1)}k`;
+    return String(tokens);
+  };
+
+  return html`
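+    <div>
+      <button ref=${buttonRef} onClick=${() => setIsOpen((open) => !open)}>
+        ${label}
+      </button>
+      ${isOpen && html`
+        <div ref=${popoverRef}>
+          <div>
+            ${subagents.length === 0 ? html`
+              <div>Loading...</div>
+            ` : subagents.map(agent => html`
+              <div>
+                <span>${agent.status === 'running' ? '●' : '○'}</span>
+                <span>${agent.name}</span>
+                <span>${formatDuration(agent.duration_ms)}</span>
+                <span>${formatTokens(agent.tokens)}</span>
+              </div>
+            `)}
+          </div>
+        </div>
+      `}
+    </div>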
+  `;
+}
+```
+
+### IMP-5: SessionCard Status Area Update (JavaScript)
+
+**Fulfills:** AC-13, AC-14, AC-15
+
+```javascript
+// In dashboard/components/SessionCard.js, update the Session Status Area:
+
+// Replace the contextUsage badge with plain text + SubagentButton
+
+<div>
+  <${AgentActivityIndicator} session=${session} />
+  <div>
+    ${contextUsage && html`
+      <span>
+        ${contextUsage.headline}
+      </span>
+    `}
+    ${session.subagent_count > 0 && session.agent === 'claude' && html`
+      <${SubagentButton}
+        sessionId=${session.session_id}
+        count=${session.subagent_count}
+        runningCount=${session.subagent_running_count || 0}
+      />
+    `}
+  </div>
+</div>
+```
+
+### IMP-6: API Function (JavaScript)
+
+**Fulfills:** AC-20
+
+```javascript
+// In dashboard/utils/api.js, add:
+
+export async function fetchSubagents(sessionId) {
+  try {
+    // Encode the ID so unusual characters cannot alter the request path
+    const response = await fetch(`/api/sessions/${encodeURIComponent(sessionId)}/subagents`);
+    if (!response.ok) return null;
+    return await response.json();
+  } catch (e) {
+    console.error('Failed to fetch subagents:', e);
+    return null;
+  }
+}
+```
+
+---
+
+## Rollout Slices
+
+### Slice 1: Backend Discovery (AC-1, AC-2, AC-4, AC-5, AC-6, AC-7, AC-8, AC-9, AC-10)
+- Create `amc_server/mixins/subagent.py` with discovery and stats logic
+- Integrate into `state.py` to add counts to session payload
+- Unit tests for name resolution, status detection, token summing
+
+### Slice 2: Backend Endpoint (AC-11, AC-12)
+- Add `/api/sessions/{id}/subagents` route
+- Return 404 for missing sessions, empty list for Codex
+- Integration test with real session data
+
+### Slice 3: Frontend Button (AC-13, AC-14, AC-15, AC-16)
+- Update SessionCard status area layout
+- Create SubagentButton component with label logic
+- Test: button shows when count > 0, hidden when 0
+
+### Slice 4: Frontend Popover (AC-17, AC-18, AC-19, AC-20, AC-21, AC-22)
+- Add popover with polling
+- Style running/completed indicators
+- Test: popover opens, polls, closes on outside click/Escape
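
The Slice 1 unit tests (name resolution, status detection, token summing) could start from something like the sketch below. It is only an illustration: the three module-level functions are hypothetical stand-ins that mirror the IMP-1 mixin logic so the checks run without the server, and the sample entries are invented for the test rather than taken from real session logs.

```python
# Hypothetical standalone mirrors of the IMP-1 helpers (not the mixin itself).

def resolve_subagent_name(agent_id: str) -> str:
    """Team members keep the part before '@'; others get 'agent-<6-char prefix>'."""
    if "@" in agent_id:
        return agent_id.split("@")[0]
    prefix = agent_id[:6] if agent_id else "agent"
    return f"agent-{prefix}"


def is_subagent_running(entries: list[dict]) -> bool:
    """Running unless the last assistant entry stopped with 'end_turn'."""
    for entry in reversed(entries):
        if entry.get("type") == "assistant":
            return entry.get("message", {}).get("stop_reason") != "end_turn"
    return True  # no assistant entries yet: treated as still starting


def sum_assistant_tokens(entries: list[dict]) -> int:
    """Sum input and output tokens over assistant entries, treating None as 0."""
    total = 0
    for entry in entries:
        if entry.get("type") != "assistant":
            continue
        usage = entry.get("message", {}).get("usage", {})
        total += (usage.get("input_tokens", 0) or 0) + (usage.get("output_tokens", 0) or 0)
    return total


# Invented sample data for the checks below.
SAMPLE_ENTRIES = [
    {"type": "user", "message": {"usage": {"input_tokens": 999}}},
    {"type": "assistant", "message": {"stop_reason": "tool_use",
                                      "usage": {"input_tokens": 10, "output_tokens": 5}}},
    {"type": "assistant", "message": {"stop_reason": "end_turn",
                                      "usage": {"input_tokens": None, "output_tokens": 7}}},
]


def test_name_resolution():
    assert resolve_subagent_name("reviewer-wcja@surgical-sync") == "reviewer-wcja"
    assert resolve_subagent_name("a250dec6325c589be") == "agent-a250de"


def test_status_detection():
    assert is_subagent_running([]) is True
    assert is_subagent_running(SAMPLE_ENTRIES[:2]) is True
    assert is_subagent_running(SAMPLE_ENTRIES) is False


def test_token_summing():
    assert sum_assistant_tokens(SAMPLE_ENTRIES) == 22
```

The real tests would call the mixin methods directly and feed them JSONL fixtures; keeping the pure logic separable like this is one way to make those tests cheap to write.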