Plan: Tool Result Display in AMC Dashboard
Status: Draft, awaiting review and mockup phase
Author: Claude + Taylor
Created: 2026-02-27
Summary
Add the ability to view tool call results (diffs, bash output, file contents) directly in the AMC dashboard conversation view. Currently, users see that a tool was called but cannot see what it did. This feature brings Claude Code's result visibility to the multi-agent dashboard.
Goals
- See code changes as they happen — diffs from Edit/Write tools always visible
- Debug agent behavior — inspect Bash output, Read content, search results
- Match Claude Code UX — familiar expand/collapse behavior with latest results expanded
Non-Goals (v1)
- Codex agent support (different JSONL format — deferred to v2)
- Copy-to-clipboard functionality
- Virtual scrolling / performance optimization
- Editor integration (clicking paths to open files)
User Workflows
Workflow 1: Watching an Active Session
- User opens a session card showing an active Claude agent
- Agent calls Edit tool to modify a file
- User immediately sees the diff expanded below the tool call pill
- Agent calls Bash to run tests
- User sees bash output expanded, previous Edit diff stays expanded (it's a diff)
- Agent sends a text message explaining results
- Bash output collapses (new assistant message arrived), Edit diff stays expanded
Workflow 2: Reviewing a Completed Session
- User opens a completed session to review what the agent did
- All tool calls are collapsed by default (no "latest" assistant message)
- Exception: Edit/Write diffs are still expanded
- User clicks a Bash tool call to see what command ran and its output
- User clicks "Show full output" when output is truncated
- Lightweight modal opens with full scrollable content
- User closes modal and continues reviewing
Workflow 3: Debugging a Failed Tool Call
- Agent runs a Bash command that fails
- Tool result block shows with red-tinted background
- stderr content is visible, clearly marked as error
- User can see what went wrong without leaving the dashboard
Acceptance Criteria
Display Behavior
- AC-1: Tool calls render as expandable elements showing tool name and summary
- AC-2: Clicking a collapsed tool call expands to show its result
- AC-3: Clicking an expanded tool call collapses it
- AC-4: Tool results in the most recent assistant message are expanded by default
- AC-5: When a new assistant message arrives, previous tool results collapse
- AC-6: Edit and Write tool diffs remain expanded regardless of message age
- AC-7: Tool calls without results display as non-expandable with muted styling
Diff Rendering
- AC-8: Edit/Write results display structuredPatch data as syntax-highlighted diff
- AC-9: Diff additions render with VS Code dark theme green background (rgba(46, 160, 67, 0.15))
- AC-10: Diff deletions render with VS Code dark theme red background (rgba(248, 81, 73, 0.15))
- AC-11: Full file path displays above each diff block
- AC-12: Diff context lines use structuredPatch as-is (no recomputation)
Other Tool Types
- AC-13: Bash results display stdout in monospace, stderr separately if present
- AC-14: Read results display file content with syntax highlighting based on file extension
- AC-15: Grep/Glob results display file list with match counts
- AC-16: WebFetch results display URL and response summary
Truncation
- AC-17: Long outputs truncate at thresholds matching Claude Code behavior
- AC-18: Truncated outputs show "Show full output (N lines)" link
- AC-19: Clicking "Show full output" opens a dedicated lightweight modal
- AC-20: Modal displays full content with syntax highlighting, scrollable
Error States
- AC-21: Failed tool calls display with red-tinted background
- AC-22: Error content (stderr, error messages) is clearly distinguishable from success content
- AC-23: is_error flag from tool_result determines error state
API Contract
- AC-24: /api/conversation response includes tool results nested in tool_calls
- AC-25: Each tool_call has: name, id, input, result (when available)
- AC-26: Result structure varies by tool type (documented in IMP-SERVER)
Architecture
Why Two-Pass JSONL Parsing
The Claude Code JSONL stores tool_use and tool_result as separate entries linked by tool_use_id. To nest results inside tool_calls for the API response, the server must:
- First pass: Build a map of tool_use_id → toolUseResult
- Second pass: Parse messages, attaching results to matching tool_calls
This adds parsing overhead but keeps the API contract simple. Alternatives considered:
- Streaming/incremental: More complex, doesn't help since we need full conversation anyway
- Client-side joining: Shifts complexity to frontend, increases payload size
Why Render Everything, Not Virtual Scroll
Sessions typically have 20-80 tool calls. Modern browsers handle hundreds of DOM elements efficiently. Virtual scrolling adds significant complexity (measuring, windowing, scroll position management) for marginal benefit.
Decision: Ship simple, measure real-world performance, optimize if >100ms render times observed.
Why Dedicated Modal Over Inline Expansion
Full output can be thousands of lines. Inline expansion would:
- Push other content out of view
- Make scrolling confusing
- Lose context of surrounding conversation
A modal provides a focused reading experience without disrupting conversation layout.
Component Structure
MessageBubble
├── Content (text)
├── Thinking (existing)
└── ToolCallList (new)
    └── ToolCallItem (repeated)
        ├── Header (pill: chevron, name, summary, status)
        └── ResultContent (conditional)
            ├── DiffResult (for Edit/Write)
            ├── BashResult (for Bash)
            ├── FileListResult (for Glob/Grep)
            └── GenericResult (fallback)
FullOutputModal (new, top-level)
├── Header (tool name, file path)
├── Content (full output, scrollable)
└── CloseButton
Implementation Specifications
IMP-SERVER: Parse and Attach Tool Results
Fulfills: AC-24, AC-25, AC-26
Location: amc_server/mixins/conversation.py
Changes to _parse_claude_conversation:
Two-pass parsing:
- First pass: Scan all entries, build map of tool_use_id → toolUseResult
- Second pass: Parse messages as before, but when encountering tool_use, look up and attach result
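A minimal sketch of the two passes, assuming entries shaped like the samples in the appendix. The function name `attach_tool_results` and its flat list return value are illustrative and are not the actual `_parse_claude_conversation` signature. How `toolUseResult` fields merge with the `tool_result` block (here, `toolUseResult` keys win on a clash and `content` falls back to the block's text) is also an assumption.

```python
import json


def attach_tool_results(jsonl_lines):
    """Return tool calls with results nested under "result" when available."""
    entries = [json.loads(line) for line in jsonl_lines if line.strip()]

    # First pass: map tool_use_id -> result data found in user entries.
    results = {}
    for entry in entries:
        if entry.get("type") != "user":
            continue
        content = entry.get("message", {}).get("content")
        if not isinstance(content, list):
            continue  # plain-string user messages carry no tool_result blocks
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_result":
                rich = entry.get("toolUseResult")
                result = dict(rich) if isinstance(rich, dict) else {}
                # Assumption: keep toolUseResult fields, fall back to block text.
                result.setdefault("content", block.get("content"))
                result["is_error"] = bool(block.get("is_error", False))
                results[block["tool_use_id"]] = result

    # Second pass: collect tool_use blocks, attaching results by id.
    tool_calls = []
    for entry in entries:
        if entry.get("type") != "assistant":
            continue
        content = entry.get("message", {}).get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                call = {
                    "name": block["name"],
                    "id": block["id"],
                    "input": block.get("input", {}),
                }
                # Missing results are tolerated: the key is simply omitted.
                if block["id"] in results:
                    call["result"] = results[block["id"]]
                tool_calls.append(call)
    return tool_calls
```

Omitting the `result` key for unmatched calls lets the frontend render them as non-expandable (AC-7) without a separate flag.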
Tool call schema after change:
{
    "name": "Edit",
    "id": "toolu_abc123",
    "input": {"file_path": "...", "old_string": "...", "new_string": "..."},
    "result": {
        "content": "The file has been updated successfully.",
        "is_error": False,
        "structuredPatch": [...],
        "filePath": "...",
        # ... other fields from toolUseResult
    }
}
Result Structure by Tool Type:
| Tool | Result Fields |
|---|---|
| Edit | structuredPatch, filePath, oldString, newString |
| Write | filePath, content confirmation |
| Read | file, type, content in content field |
| Bash | stdout, stderr, interrupted |
| Glob | filenames, numFiles, truncated |
| Grep | content, filenames, numFiles, numLines |
IMP-TOOLCALL: Expandable Tool Call Component
Fulfills: AC-1, AC-2, AC-3, AC-4, AC-5, AC-6, AC-7
Location: dashboard/lib/markdown.js (refactor renderToolCalls)
New function: ToolCallItem
Renders a single tool call with:
- Chevron for expand/collapse (when result exists and not Edit/Write)
- Tool name (bold, colored)
- Summary (from existing getToolSummary)
- Status icon (checkmark or X)
- Result content (when expanded)
State Management:
Track expanded state per message. When new assistant message arrives:
- Compare latest assistant message ID to stored ID
- If different, reset expanded set to empty
- Edit/Write tools bypass this logic (always expanded via CSS/logic)
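The collapse rule above can be captured in a small state holder. This sketch is in Python only to keep the examples in this plan in one language; the dashboard would carry the same logic in JavaScript. The class `ExpandState` and its method names are hypothetical. It stores the set of tool calls the user has toggled away from their default, which is one possible reading of "expanded set" and not necessarily the final design.

```python
ALWAYS_EXPANDED = {"Edit", "Write"}


class ExpandState:
    """Tracks which tool results are expanded in the conversation view."""

    def __init__(self):
        self.latest_id = None  # ID of the latest assistant message seen
        self.toggled = set()   # tool call IDs the user flipped from default

    def on_assistant_message(self, message_id):
        # A new assistant message resets all manual toggles (AC-5).
        if message_id != self.latest_id:
            self.latest_id = message_id
            self.toggled.clear()

    def toggle(self, tool_call_id):
        # Symmetric difference adds the ID if absent, removes it if present.
        self.toggled ^= {tool_call_id}

    def is_expanded(self, tool_call, message_id):
        if "result" not in tool_call:
            return False  # nothing to show; rendered non-expandable (AC-7)
        if tool_call["name"] in ALWAYS_EXPANDED:
            return True   # diffs bypass the collapse logic (AC-6)
        default = message_id == self.latest_id  # latest message starts open (AC-4)
        return default != (tool_call["id"] in self.toggled)
```

For a completed session no "latest" message is registered, so `latest_id` stays `None` and everything except Edit/Write starts collapsed, matching Workflow 2.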
IMP-DIFF: Diff Rendering Component
Fulfills: AC-8, AC-9, AC-10, AC-11, AC-12
Location: dashboard/lib/markdown.js (new function renderDiff)
Add diff language to highlight.js:
import langDiff from 'https://esm.sh/highlight.js@11.11.1/lib/languages/diff';
hljs.registerLanguage('diff', langDiff);
Diff Renderer:
- Convert structuredPatch array to unified diff text:
  - Each hunk: @@ -oldStart,oldLines +newStart,newLines @@
  - Followed by hunk.lines array
- Syntax highlight with hljs diff language
- Sanitize with DOMPurify before rendering
- Wrap in container with file path header
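The first step of the renderer, turning hunks into unified diff text, is plain string assembly. The sketch below shows it in Python for consistency with the other examples here; `renderDiff` itself would do the equivalent in JavaScript before handing the text to highlight.js and DOMPurify. The function name `patch_to_unified_diff` is hypothetical, and it assumes each entry in `hunk.lines` already carries its leading space, `+`, or `-` marker.

```python
def patch_to_unified_diff(structured_patch):
    """Join structuredPatch hunks into unified diff text (no file headers)."""
    out = []
    for hunk in structured_patch:
        # Hunk header in the standard unified diff form.
        out.append(
            f"@@ -{hunk['oldStart']},{hunk['oldLines']} "
            f"+{hunk['newStart']},{hunk['newLines']} @@"
        )
        # Assumption: lines are already prefixed with " ", "+", or "-",
        # so they are used as-is (AC-12: no recomputation).
        out.extend(hunk["lines"])
    return "\n".join(out)
```

The file path is left out of the diff text on purpose, since the plan places it in a separate header above the block.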
CSS styling:
- Container: dark border, rounded corners
- Header: muted background, monospace font, full file path
- Content: monospace, horizontal scroll
- Additions: background: rgba(46, 160, 67, 0.15)
- Deletions: background: rgba(248, 81, 73, 0.15)
IMP-BASH: Bash Output Component
Fulfills: AC-13, AC-21, AC-22
Location: dashboard/lib/markdown.js (new function renderBashResult)
Renders:
- stdout in monospace pre block
- stderr in separate block with error styling (if present)
- "Command interrupted" notice (if interrupted flag)
Error state: is_error or presence of stderr triggers error styling (red tint, left border).
IMP-TRUNCATE: Output Truncation
Fulfills: AC-17, AC-18
Truncation Thresholds (match Claude Code):
| Tool Type | Max Lines | Max Chars |
|---|---|---|
| Bash stdout | 100 | 10000 |
| Bash stderr | 50 | 5000 |
| Read content | 500 | 50000 |
| Grep matches | 100 | 10000 |
| Glob files | 100 | 5000 |
Note: These thresholds need verification against Claude Code behavior. May require adjustment based on testing.
Truncation Helper:
Takes content string, returns { text, truncated, totalLines }. If truncated, result renderers show "Show full output (N lines)" link.
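A sketch of the helper under the return shape described above. The name `truncate_output` is hypothetical, and the limits are passed in as arguments because the thresholds in the table are still unverified. It is written in Python to match the other examples; the dashboard version would be JavaScript. Applying the line limit first and the character limit second is an assumption.

```python
def truncate_output(content, max_lines, max_chars):
    """Return the visible text plus truncation metadata for a tool output."""
    lines = content.splitlines()
    kept = "\n".join(lines[:max_lines])  # apply the line limit first
    text = kept[:max_chars]              # then the character limit
    truncated = len(lines) > max_lines or len(kept) > max_chars
    return {"text": text, "truncated": truncated, "totalLines": len(lines)}
```

For example, `truncate_output(stdout, 100, 10000)` would apply the Bash stdout row of the table, and `totalLines` feeds the N in "Show full output (N lines)".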
IMP-MODAL: Full Output Modal
Fulfills: AC-19, AC-20
Location: dashboard/components/FullOutputModal.js (new file)
Structure:
- Overlay (click to close)
- Modal container (click does NOT close)
- Header: title (tool name + file path), close button
- Content: scrollable pre/code block with syntax highlighting
Integration: Modal state managed at App level or ChatMessages level. "Show full output" link sets state with content + metadata.
IMP-ERROR: Error State Styling
Fulfills: AC-21, AC-22, AC-23
Styling:
- Tool call header: red-tinted background when result.is_error
- Status icon: red X instead of green checkmark
- Bash stderr: red text, italic, distinct from stdout
- Overall: left border accent in error color
Rollout Slices
Slice 1: Design Mockups (Pre-Implementation)
Goal: Validate visual design before building
Deliverables:
- Create /mockups test route with static data
- Implement 3-4 design variants (card-based, minimal, etc.)
- Use real tool result data from session JSONL
- User reviews and selects preferred design
Exit Criteria: Design direction locked
Slice 2: Server-Side Tool Result Parsing
Goal: API returns tool results nested in tool_calls
Deliverables:
- Two-pass parsing in _parse_claude_conversation
- Tool results attached with id field
- Unit tests for result attachment
- Handle missing results gracefully (return tool_call without result)
Exit Criteria: AC-24, AC-25, AC-26 pass
Slice 3: Basic Expand/Collapse UI
Goal: Tool calls are expandable, show raw result content
Deliverables:
- Refactor renderToolCalls to ToolCallList component
- Implement expand/collapse with chevron
- Track expanded state per message
- Collapse on new assistant message
- Keep Edit/Write always expanded
Exit Criteria: AC-1 through AC-7 pass
Slice 4: Diff Rendering
Goal: Edit/Write show beautiful diffs
Deliverables:
- Add diff language to highlight.js
- Implement renderDiff function
- VS Code dark theme styling
- Full file path header
Exit Criteria: AC-8 through AC-12 pass
Slice 5: Other Tool Types
Goal: Bash, Read, Glob, Grep render appropriately
Deliverables:
- renderBashResult with stdout/stderr separation
- renderFileContent for Read
- renderFileList for Glob/Grep
- Generic fallback for unknown tools
Exit Criteria: AC-13 through AC-16 pass
Slice 6: Truncation and Modal
Goal: Long outputs truncate with modal expansion
Deliverables:
- Truncation helper with Claude Code thresholds
- "Show full output" link
- FullOutputModal component
- Syntax highlighting in modal
Exit Criteria: AC-17 through AC-20 pass
Slice 7: Error States and Polish
Goal: Failed tools visually distinct, edge cases handled
Deliverables:
- Error state styling (red tint)
- Muted styling for missing results
- Test with interrupted sessions
- Cross-browser testing
Exit Criteria: AC-21 through AC-23 pass, feature complete
Open Questions
- Exact Claude Code truncation thresholds — need to verify against Claude Code source or experiment
- Performance with 100+ tool calls — monitor after ship, optimize if needed
- Codex support timeline — when should we prioritize v2?
Appendix: Research Findings
Claude Code JSONL Format
Tool calls and results are stored as separate entries:
// Assistant sends tool_use
{"type": "assistant", "message": {"content": [{"type": "tool_use", "id": "toolu_abc", "name": "Edit", "input": {...}}]}}
// Result in separate user entry
{"type": "user", "message": {"content": [{"type": "tool_result", "tool_use_id": "toolu_abc", "content": "Success"}]}, "toolUseResult": {...}}
The toolUseResult object contains rich structured data varying by tool type.
Missing Results Statistics
Across 55 sessions with 2,063 tool calls:
- 11 missing results (0.5%)
- Affected tools: Edit (4), Read (2), Bash (1), others
Interrupt Handling
User interrupts create a separate user message:
{"type": "user", "message": {"content": [{"type": "text", "text": "[Request interrupted by user for tool use]"}]}}
Tool results for completed tools are still present; the interrupt message indicates the turn ended early.
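If the server needs to recognise these entries, for instance when testing interrupted sessions, a check along these lines may be enough. The helper name `is_interrupt_entry` is hypothetical, and matching on the exact marker string is an assumption drawn from the single sample above; other interrupt wordings may exist.

```python
INTERRUPT_TEXT = "[Request interrupted by user for tool use]"


def is_interrupt_entry(entry):
    """Return True if a parsed JSONL entry is a user-interrupt marker."""
    if entry.get("type") != "user":
        return False
    content = entry.get("message", {}).get("content")
    if not isinstance(content, list):
        return False  # plain-string content is an ordinary user message
    # Assumption: the marker appears verbatim as a text block.
    return any(
        isinstance(block, dict)
        and block.get("type") == "text"
        and block.get("text") == INTERRUPT_TEXT
        for block in content
    )
```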