docs(jsonl): add comprehensive Claude JSONL session log reference

Create authoritative documentation suite for Claude Code JSONL session log
processing, synthesized from codebase analysis, official Anthropic
documentation, and community tooling research.

Documentation structure (docs/claude-jsonl-reference/):

01-format-specification.md (214 lines):
- Complete message envelope structure with all fields
- Content block types (text, thinking, tool_use, tool_result)
- Usage object for token reporting
- Model identifiers and version history
- Conversation DAG structure via parentUuid

02-message-types.md (346 lines):
- Every message type with concrete JSON examples
- User messages (string content vs array for tool results)
- Assistant messages with all content block variants
- Progress events (hooks, bash, MCP)
- System, summary, and file-history-snapshot types
- Codex format differences (response_item, function_call)

03-tool-lifecycle.md (341 lines):
- Complete tool invocation to result flow
- Hook input/output formats (PreToolUse, PostToolUse)
- Parallel tool call handling
- Tool-to-result pairing algorithm
- Missing result edge cases
- Codex tool format differences

04-subagent-teams.md (363 lines):
- Task tool invocation and input fields
- Subagent transcript locations and format
- Team coordination (TeamCreate, SendMessage)
- Hook events (SubagentStart, SubagentStop)
- AMC spawn tracking with pending spawn registry
- Worktree isolation for subagents

05-edge-cases.md (475 lines):
- Parsing edge cases (invalid JSON, type ambiguity)
- Type coercion gotchas (bool vs int in Python)
- Session state edge cases (orphans, dead detection)
- Tool call edge cases (missing results, parallel ordering)
- Codex-specific quirks (content injection, buffering)
- File system safety (path traversal, permissions)
- Cache invalidation strategies

06-quick-reference.md (238 lines):
- File locations cheat sheet
- jq recipes for common queries
- Python parsing snippets
- Common gotchas table
- Useful constants
- Debugging commands

Also adds CLAUDE.md at project root linking to documentation and providing
project overview for agents working on AMC.

Sources include Claude Code hooks.md, headless.md, Anthropic Messages API
reference, and community tools (claude-code-log, claude-JSONL-browser).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 00:48:34 -05:00
parent ac629bd149
commit 781e74cda2
8 changed files with 2102 additions and 0 deletions
--- a/docs/claude-jsonl-reference/06-quick-reference.md
+++ b/docs/claude-jsonl-reference/06-quick-reference.md
@@ -0,0 +1,238 @@
+# Quick Reference
+
+Cheat sheet for common Claude JSONL operations.
+
+## File Locations
+
+```bash
+# Claude sessions
+~/.claude/projects/-Users-user-projects-myapp/*.jsonl
+
+# Codex sessions
+~/.codex/sessions/**/*.jsonl
+
+# Subagent transcripts
+~/.claude/projects/.../session-id/subagents/agent-*.jsonl
+
+# AMC session state
+~/.local/share/amc/sessions/*.json
+```
+
+## Path Encoding
+
+```python
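+# Encode: /Users/dev/myproject -> -Users-dev-myproject
+# (the leading '/' already becomes the leading '-')
+encoded = project_path.replace('/', '-')
+
+# Decode: -Users-dev-myproject -> /Users/dev/myproject
+# Lossy: a hyphen inside a directory name (e.g. my-app) also becomes '/'
+decoded = encoded.replace('-', '/')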
+```
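+
+The round trip can be sketched as a pair of helpers. This is a minimal illustration that assumes the encoding is nothing more than replacing `/` with `-`, as the comments above describe; the function names are hypothetical and not part of any Claude Code API.
+
+```python
+def encode_project_path(project_path: str) -> str:
+    """Turn an absolute path into the directory name used under projects/."""
+    return project_path.replace('/', '-')
+
+
+def decode_project_path(encoded: str) -> str:
+    """Best-effort inverse; cannot tell original hyphens from separators."""
+    return encoded.replace('-', '/')
+
+
+print(encode_project_path('/Users/dev/myproject'))  # -Users-dev-myproject
+print(decode_project_path('-Users-dev-my-app'))     # /Users/dev/my/app
+```
+
+Because decoding is lossy, it is usually safer to encode a known project path and compare directory names than to decode a directory name back into a path.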
+
+## Message Type Quick ID
+
+| If you see... | It's a... |
+|---------------|-----------|
+| `"type": "user"` + string content | User input |
+| `"type": "user"` + array content | Tool results |
+| `"type": "assistant"` | Claude response |
+| `"type": "progress"` | Hook/tool execution |
+| `"type": "summary"` | Session summary |
+| `"type": "system"` | Metadata/commands |
+
+## Content Block Quick ID
+
+| Block Type | Key Fields |
+|------------|------------|
+| `text` | `text` |
+| `thinking` | `thinking`, `signature` |
+| `tool_use` | `id`, `name`, `input` |
+| `tool_result` | `tool_use_id`, `content`, `is_error` |
+
+## jq Recipes
+
+```bash
+# Count messages by type
+jq -s 'group_by(.type) | map({type: .[0].type, count: length})' session.jsonl
+
+# Extract all tool calls
+jq -c 'select(.type=="assistant") | .message.content[]? | select(.type=="tool_use")' session.jsonl
+
+# Get user messages only
+jq -c 'select(.type=="user" and (.message.content | type)=="string")' session.jsonl
+
+# Sum tokens
+jq -s '[.[].message.usage? | select(.) | .input_tokens + .output_tokens] | add' session.jsonl
+
+# List tools used (-r prints raw names without JSON quotes)
+jq -r 'select(.type=="assistant") | .message.content[]? | select(.type=="tool_use") | .name' session.jsonl | sort | uniq -c
+
+# Find errors
+jq -c 'select(.type=="user") | .message.content[]? | select(.type=="tool_result" and .is_error==true)' session.jsonl
+```
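+
+Where `jq` is not available, the first recipe can be approximated in Python. This is a sketch only: `count_by_type` is a hypothetical helper, and it accepts any iterable of already-parsed events, such as the output of a JSONL reader.
+
+```python
+from collections import Counter
+
+
+def count_by_type(events):
+    """Count events per 'type' field; events without one count as 'unknown'."""
+    return Counter(
+        event.get('type', 'unknown')
+        for event in events
+        if isinstance(event, dict)  # skip lines that parsed to non-objects
+    )
+
+
+sample = [{'type': 'user'}, {'type': 'assistant'}, {'type': 'user'}, {}]
+print(count_by_type(sample))
+```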
+
+## Python Snippets
+
+### Read JSONL
+```python
+import json
+
+def read_jsonl(path):
+    with open(path, encoding='utf-8') as f:
+        for line in f:
+            if line.strip():
+                try:
+                    yield json.loads(line)
+                except json.JSONDecodeError:
+                    continue
+```
+
+### Extract Conversation
+```python
+def extract_conversation(path):
+    messages = []
+    for event in read_jsonl(path):
+        if event['type'] == 'user':
+            content = event['message']['content']
+            if isinstance(content, str):
+                messages.append({'role': 'user', 'content': content})
+        elif event['type'] == 'assistant':
+            for block in event['message'].get('content', []):
+                if block.get('type') == 'text':
+                    messages.append({'role': 'assistant', 'content': block['text']})
+    return messages
+```
+
+### Get Token Usage
+```python
+def get_token_usage(path):
+    total_input = 0
+    total_output = 0
+
+    for event in read_jsonl(path):
+        if event['type'] == 'assistant':
+            usage = event.get('message', {}).get('usage', {})
+            total_input += usage.get('input_tokens', 0)
+            total_output += usage.get('output_tokens', 0)
+
+    return {'input': total_input, 'output': total_output}
+```
+
+### Find Tool Calls
+```python
+def find_tool_calls(path):
+    tools = []
+    for event in read_jsonl(path):
+        if event['type'] == 'assistant':
+            for block in event['message'].get('content', []):
+                if block.get('type') == 'tool_use':
+                    tools.append({
+                        'name': block['name'],
+                        'id': block['id'],
+                        'input': block['input']
+                    })
+    return tools
+```
+
+### Pair Tools with Results
+```python
+def pair_tools_results(path):
+    pending = {}
+
+    for event in read_jsonl(path):
+        if event['type'] == 'assistant':
+            for block in event['message'].get('content', []):
+                if block.get('type') == 'tool_use':
+                    pending[block['id']] = {'use': block, 'result': None}
+
+        elif event['type'] == 'user':
+            content = event['message'].get('content', [])
+            if isinstance(content, list):
+                for block in content:
+                    if block.get('type') == 'tool_result':
+                        tool_id = block['tool_use_id']
+                        if tool_id in pending:
+                            pending[tool_id]['result'] = block
+
+    return pending
+```
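+
+Entries whose `result` is still `None` after pairing are tool calls that never received a result, for example because the session was interrupted. A minimal sketch for listing them follows; it assumes the mapping shape produced by the pairing function above, and the tool IDs in the sample are made up for illustration.
+
+```python
+def unmatched_tool_calls(pending):
+    """Return the tool_use blocks that have no paired tool_result."""
+    return [
+        entry['use']
+        for entry in pending.values()
+        if entry.get('result') is None
+    ]
+
+
+# Hypothetical pairing output: one completed call, one still pending
+sample = {
+    'toolu_a': {'use': {'id': 'toolu_a', 'name': 'Read'},
+                'result': {'tool_use_id': 'toolu_a', 'content': 'ok'}},
+    'toolu_b': {'use': {'id': 'toolu_b', 'name': 'Bash'}, 'result': None},
+}
+print([block['name'] for block in unmatched_tool_calls(sample)])  # ['Bash']
+```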
+
+## Common Gotchas
+
+| Gotcha | Solution |
+|--------|----------|
+| `content` can be string or array | Check `isinstance(content, str)` first |
+| `usage` may be missing | Use `.get('usage', {})` |
+| `bool` is a subclass of `int` in Python | Check `isinstance(v, bool)` before `isinstance(v, int)` |
+| First line may be partial after seek | Call `readline()` to discard |
+| Tool results in user messages | Check for `tool_result` type in array |
+| Codex `arguments` is JSON string | Parse with `json.loads()` |
+| Agent ID vs session ID | Agent ID survives rewrites, session ID is per-run |
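+
+Two of these gotchas are easy to get wrong in practice, so here is a short sketch of each. Both helpers are hypothetical names, and the 1 MB default is an assumption chosen for illustration. The tail reader seeks near the end of a large file and throws away the first line it reads, since that line is probably a fragment.
+
+```python
+import json
+import os
+
+
+def is_real_int(value):
+    """True for ints, False for bools (bool is a subclass of int)."""
+    return isinstance(value, int) and not isinstance(value, bool)
+
+
+def read_tail_events(path, max_bytes=1_000_000):
+    """Parse only the last max_bytes of a JSONL file."""
+    events = []
+    size = os.path.getsize(path)
+    with open(path, 'rb') as f:
+        if size > max_bytes:
+            f.seek(size - max_bytes)
+            f.readline()  # discard the first, probably partial, line
+        for raw in f:
+            line = raw.strip()
+            if not line:
+                continue
+            try:
+                events.append(json.loads(line))
+            except ValueError:  # covers bad JSON and bad UTF-8
+                continue
+    return events
+```
+
+If the seek happens to land exactly on a line boundary, one complete line is discarded as well. That is usually an acceptable cost for a tail read.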
+
+## Status Values
+
+| Field | Values |
+|-------|--------|
+| `status` | `starting`, `active`, `done` |
+| `stop_reason` | `end_turn`, `max_tokens`, `stop_sequence`, `tool_use`, null |
+| `is_error` | `true`, `false` (tool results) |
+
+## Token Fields
+
+```python
+# All possible token fields to sum
+token_fields = [
+    'input_tokens',
+    'output_tokens',
+    'cache_creation_input_tokens',
+    'cache_read_input_tokens'
+]
+
+# Context window by model
+context_windows = {
+    'claude-opus': 200_000,
+    'claude-sonnet': 200_000,
+    'claude-haiku': 200_000,
+    'claude-2': 100_000
+}
+```
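+
+Summing the four fields for a single `usage` object might look like the sketch below. `total_tokens` is a hypothetical helper; it treats a missing or null field as zero and tolerates a missing `usage` object entirely.
+
+```python
+TOKEN_FIELDS = (
+    'input_tokens',
+    'output_tokens',
+    'cache_creation_input_tokens',
+    'cache_read_input_tokens',
+)
+
+
+def total_tokens(usage):
+    """Sum every known token field of one usage object."""
+    if not isinstance(usage, dict):
+        return 0
+    # `or 0` also covers fields that are present but null
+    return sum(usage.get(field) or 0 for field in TOKEN_FIELDS)
+
+
+usage = {'input_tokens': 10, 'output_tokens': 5, 'cache_read_input_tokens': 100}
+print(total_tokens(usage))  # 115
+```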
+
+## Useful Constants
+
+```python
+import os
+
+# File locations
+CLAUDE_BASE = os.path.expanduser('~/.claude/projects')
+CODEX_BASE = os.path.expanduser('~/.codex/sessions')
+AMC_BASE = os.path.expanduser('~/.local/share/amc')
+
+# Read limits
+MAX_TAIL_BYTES = 1_000_000  # 1MB
+MAX_LINES = 400  # For context extraction
+
+# Timeouts
+SUBPROCESS_TIMEOUT = 5  # seconds
+SPAWN_COOLDOWN = 30  # seconds
+
+# Session ages
+ACTIVE_THRESHOLD_MINUTES = 2
+ORPHAN_CLEANUP_HOURS = 24
+STARTING_CLEANUP_HOURS = 1
+```
+
+## Debugging Commands
+
+```bash
+# Watch session file changes (-q drops the "==> file <==" headers that would break jq)
+tail -q -f ~/.claude/projects/-path-to-project/*.jsonl | jq -c .
+
+# Find latest session
+ls -t ~/.claude/projects/-path-to-project/*.jsonl | head -1
+
+# Count lines in session
+wc -l session.jsonl
+
+# Validate JSON line by line (read -r keeps backslashes intact)
+while IFS= read -r line; do printf '%s\n' "$line" | jq . > /dev/null 2>&1 || echo "Invalid: $line"; done < session.jsonl
+
+# Pretty print last message
+tail -1 session.jsonl | jq .
+```