Create authoritative documentation suite for Claude Code JSONL session log processing, synthesized from codebase analysis, official Anthropic documentation, and community tooling research.

Documentation structure (docs/claude-jsonl-reference/):

01-format-specification.md (214 lines):
- Complete message envelope structure with all fields
- Content block types (text, thinking, tool_use, tool_result)
- Usage object for token reporting
- Model identifiers and version history
- Conversation DAG structure via parentUuid

02-message-types.md (346 lines):
- Every message type with concrete JSON examples
- User messages (string content vs array for tool results)
- Assistant messages with all content block variants
- Progress events (hooks, bash, MCP)
- System, summary, and file-history-snapshot types
- Codex format differences (response_item, function_call)

03-tool-lifecycle.md (341 lines):
- Complete tool invocation to result flow
- Hook input/output formats (PreToolUse, PostToolUse)
- Parallel tool call handling
- Tool-to-result pairing algorithm
- Missing result edge cases
- Codex tool format differences

04-subagent-teams.md (363 lines):
- Task tool invocation and input fields
- Subagent transcript locations and format
- Team coordination (TeamCreate, SendMessage)
- Hook events (SubagentStart, SubagentStop)
- AMC spawn tracking with pending spawn registry
- Worktree isolation for subagents

05-edge-cases.md (475 lines):
- Parsing edge cases (invalid JSON, type ambiguity)
- Type coercion gotchas (bool vs int in Python)
- Session state edge cases (orphans, dead detection)
- Tool call edge cases (missing results, parallel ordering)
- Codex-specific quirks (content injection, buffering)
- File system safety (path traversal, permissions)
- Cache invalidation strategies

06-quick-reference.md (238 lines):
- File locations cheat sheet
- jq recipes for common queries
- Python parsing snippets
- Common gotchas table
- Useful constants
- Debugging commands

Also adds CLAUDE.md at project root linking to documentation and providing project overview for agents working on AMC.

Sources include Claude Code hooks.md, headless.md, Anthropic Messages API reference, and community tools (claude-code-log, claude-JSONL-browser).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
6.4 KiB
Quick Reference
Cheat sheet for common Claude JSONL operations.
File Locations
# Claude sessions
~/.claude/projects/-Users-user-projects-myapp/*.jsonl
# Codex sessions
~/.codex/sessions/**/*.jsonl
# Subagent transcripts
~/.claude/projects/.../session-id/subagents/agent-*.jsonl
# AMC session state
~/.local/share/amc/sessions/*.json
Path Encoding
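# Encode: /Users/dev/myproject -> -Users-dev-myproject
# (the leading '/' itself becomes the leading '-')
encoded = project_path.replace('/', '-')
# Decode: -Users-dev-myproject -> /Users/dev/myproject
# Lossy: a '-' that was part of a directory name also becomes '/'
decoded = encoded.replace('-', '/')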
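The round trip is easiest to see on a concrete path. The following is a minimal sketch, assuming the slash-to-hyphen mapping described above is the whole encoding scheme; the function names are hypothetical and not taken from the codebase.

```python
def encode_project_path(project_path: str) -> str:
    """Replace every '/' with '-'; the leading slash becomes the leading hyphen."""
    return project_path.replace('/', '-')


def decode_project_path(encoded: str) -> str:
    """Best-effort inverse; wrong for paths whose components contain hyphens."""
    return encoded.replace('-', '/')


plain = '/Users/dev/myproject'
assert decode_project_path(encode_project_path(plain)) == plain

# A hyphenated directory name does not survive the round trip.
hyphenated = '/Users/dev/my-project'
print(decode_project_path(encode_project_path(hyphenated)))  # /Users/dev/my/project
```

Because decoding is ambiguous, it is probably safer to treat the decoded path as a hint and confirm it against the file system before relying on it.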
Message Type Quick ID
| If you see... | It's a... |
|---|---|
| "type": "user" + string content | User input |
| "type": "user" + array content | Tool results |
| "type": "assistant" | Claude response |
| "type": "progress" | Hook/tool execution |
| "type": "summary" | Session summary |
| "type": "system" | Metadata/commands |
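The table above can be turned into a small classifier. This is a sketch only: the function name and the returned labels are hypothetical, and any type not listed in the table falls through to 'unknown'.

```python
def classify_event(event: dict) -> str:
    """Map one parsed JSONL line to a label from the quick-ID table."""
    event_type = event.get('type')
    if not isinstance(event_type, str):
        return 'unknown'
    if event_type == 'user':
        # User events carry either typed input (string) or tool results (array).
        content = (event.get('message') or {}).get('content')
        if isinstance(content, str):
            return 'user input'
        if isinstance(content, list):
            return 'tool results'
        return 'unknown'
    labels = {
        'assistant': 'Claude response',
        'progress': 'hook/tool execution',
        'summary': 'session summary',
        'system': 'metadata/commands',
    }
    return labels.get(event_type, 'unknown')
```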
Content Block Quick ID
| Block Type | Key Fields |
|---|---|
| text | text |
| thinking | thinking, signature |
| tool_use | id, name, input |
| tool_result | tool_use_id, content, is_error |
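As an illustration of how these key fields are typically read, here is a hedged sketch of a one-line summary per content block. The helper name and the output format are hypothetical; only the field names come from the table above.

```python
def describe_block(block: dict) -> str:
    """Summarize a content block using the key fields for its type."""
    block_type = block.get('type')
    if block_type == 'text':
        return f"text: {block.get('text', '')[:60]}"
    if block_type == 'thinking':
        return f"thinking: {len(block.get('thinking', ''))} chars"
    if block_type == 'tool_use':
        return f"tool_use {block.get('id')}: {block.get('name')}"
    if block_type == 'tool_result':
        # is_error may be absent; treat a missing flag as success.
        status = 'error' if block.get('is_error') else 'ok'
        return f"tool_result {block.get('tool_use_id')}: {status}"
    return f"unknown block type: {block_type!r}"
```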
jq Recipes
# Count messages by type
jq -s 'group_by(.type) | map({type: .[0].type, count: length})' session.jsonl
# Extract all tool calls
jq -c 'select(.type=="assistant") | .message.content[]? | select(.type=="tool_use")' session.jsonl
# Get user messages only
jq -c 'select(.type=="user" and (.message.content | type)=="string")' session.jsonl
# Sum tokens
jq -s '[.[].message.usage? | select(.) | .input_tokens + .output_tokens] | add' session.jsonl
# List tools used
jq -r 'select(.type=="assistant") | .message.content[]? | select(.type=="tool_use") | .name' session.jsonl | sort | uniq -c
# Find errors
jq -c 'select(.type=="user") | .message.content[]? | select(.type=="tool_result" and .is_error==true)' session.jsonl
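jq stops when it reaches a line it cannot parse, which can happen when a session file is still being written. As a sketch of a more tolerant equivalent of the "count messages by type" recipe, the following counts invalid lines instead of aborting. The function name and the '<invalid>' and '<missing>' labels are hypothetical.

```python
import json
from collections import Counter


def count_by_type(lines):
    """Count events per 'type' value, tallying unparseable lines separately."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            counts['<invalid>'] += 1
            continue
        if isinstance(event, dict):
            counts[str(event.get('type', '<missing>'))] += 1
    return counts
```

A file object can be passed directly, for example `count_by_type(open('session.jsonl', encoding='utf-8'))`.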
Python Snippets
Read JSONL
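import json

def read_jsonl(path):
    """Yield one parsed object per non-empty line, skipping invalid JSON."""
    with open(path, encoding='utf-8') as f:
        for line in f:
            if line.strip():
                try:
                    yield json.loads(line)
                except json.JSONDecodeError:
                    # Partial or corrupt line (e.g. mid-write); skip it
                    continue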
Extract Conversation
def extract_conversation(path):
    messages = []
    for event in read_jsonl(path):
        if event.get('type') == 'user':
            content = event['message']['content']
            # Array content means tool results, not typed user input
            if isinstance(content, str):
                messages.append({'role': 'user', 'content': content})
        elif event.get('type') == 'assistant':
            for block in event['message'].get('content', []):
                if block.get('type') == 'text':
                    messages.append({'role': 'assistant', 'content': block['text']})
    return messages
Get Token Usage
def get_token_usage(path):
    total_input = 0
    total_output = 0
    for event in read_jsonl(path):
        if event.get('type') == 'assistant':
            # usage may be missing, so default to an empty dict
            usage = event.get('message', {}).get('usage', {})
            total_input += usage.get('input_tokens', 0)
            total_output += usage.get('output_tokens', 0)
    return {'input': total_input, 'output': total_output}
Find Tool Calls
def find_tool_calls(path):
    tools = []
    for event in read_jsonl(path):
        if event.get('type') == 'assistant':
            for block in event['message'].get('content', []):
                if block.get('type') == 'tool_use':
                    tools.append({
                        'name': block['name'],
                        'id': block['id'],
                        'input': block['input']
                    })
    return tools
Pair Tools with Results
def pair_tools_results(path):
    # Maps tool_use id -> {'use': block, 'result': block or None}
    pending = {}
    for event in read_jsonl(path):
        if event.get('type') == 'assistant':
            for block in event['message'].get('content', []):
                if block.get('type') == 'tool_use':
                    pending[block['id']] = {'use': block, 'result': None}
        elif event.get('type') == 'user':
            content = event['message'].get('content', [])
            if isinstance(content, list):
                for block in content:
                    if block.get('type') == 'tool_result':
                        tool_id = block['tool_use_id']
                        if tool_id in pending:
                            pending[tool_id]['result'] = block
    return pending
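A common follow-up is to list the tool calls that never received a result, for example when a session was interrupted. The sketch below works on already-parsed events rather than a file path so it can be tried in isolation; the function name is hypothetical.

```python
def find_unpaired_tool_calls(events):
    """Return tool_use blocks that have no matching tool_result."""
    pending = {}
    for event in events:
        message = event.get('message')
        if not isinstance(message, dict):
            continue
        content = message.get('content')
        if not isinstance(content, list):
            continue  # string content is plain user input
        for block in content:
            if not isinstance(block, dict):
                continue
            if event.get('type') == 'assistant' and block.get('type') == 'tool_use':
                if block.get('id') is not None:
                    pending[block['id']] = block
            elif event.get('type') == 'user' and block.get('type') == 'tool_result':
                # Drop the call once its result arrives; ignore unknown ids.
                pending.pop(block.get('tool_use_id'), None)
    return list(pending.values())
```

Keying on the tool_use id rather than on position means parallel tool calls are matched correctly even if their results arrive in a different order.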
Common Gotchas
| Gotcha | Solution |
|---|---|
| content can be string or array | Check isinstance(content, str) first |
| usage may be missing | Use .get('usage', {}) |
| bool is a subclass of int in Python | Check isinstance(v, bool) before isinstance(v, int) |
| First line may be partial after seek | Call readline() to discard it |
| Tool results arrive in user messages | Check for tool_result type in the content array |
| Codex arguments is a JSON string | Parse with json.loads() |
| Agent ID vs session ID | Agent ID survives rewrites, session ID is per-run |
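Two of these gotchas are easy to demonstrate in a few lines. The following sketch shows the bool-before-int check and a tolerant parser for arguments that may arrive as a JSON string. Both helper names are hypothetical, and returning an empty dict on malformed input is an assumption made for illustration.

```python
import json


def json_kind(value):
    """Name the JSON-level kind of a decoded value."""
    # bool must be tested first: isinstance(True, int) is True in Python.
    if isinstance(value, bool):
        return 'bool'
    if isinstance(value, int):
        return 'int'
    return type(value).__name__


def parse_arguments(arguments):
    """Return tool arguments as a dict, whether given as a dict or a JSON string."""
    if isinstance(arguments, dict):
        return arguments
    if isinstance(arguments, str):
        try:
            parsed = json.loads(arguments)
        except json.JSONDecodeError:
            return {}
        return parsed if isinstance(parsed, dict) else {}
    return {}


print(json_kind(True), json_kind(1))          # bool int
print(parse_arguments('{"command": "ls"}'))   # {'command': 'ls'}
```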
Status Values
| Field | Values |
|---|---|
| status | starting, active, done |
| stop_reason | end_turn, max_tokens, tool_use, null |
| is_error | true, false (tool results) |
Token Fields
# All possible token fields to sum
token_fields = [
'input_tokens',
'output_tokens',
'cache_creation_input_tokens',
'cache_read_input_tokens'
]
# Context window by model
context_windows = {
'claude-opus': 200_000,
'claude-sonnet': 200_000,
'claude-haiku': 200_000,
'claude-2': 100_000
}
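These fields can be combined into a rough context-usage estimate. The sketch below assumes, as in the Anthropic Messages API, that cached tokens are reported separately from input_tokens, so the prompt size is the sum of the three input-side fields. The function names and the 200,000 default window are illustrative assumptions.

```python
TOKEN_FIELDS = (
    'input_tokens',
    'output_tokens',
    'cache_creation_input_tokens',
    'cache_read_input_tokens',
)


def total_tokens(usage: dict) -> int:
    """Sum every known token field, treating missing or null values as 0."""
    return sum(int(usage.get(field) or 0) for field in TOKEN_FIELDS)


def prompt_fraction(usage: dict, window: int = 200_000) -> float:
    """Share of the context window taken by the prompt of one request."""
    prompt = sum(
        int(usage.get(field) or 0)
        for field in TOKEN_FIELDS
        if field != 'output_tokens'
    )
    return prompt / window


usage = {'input_tokens': 1000, 'output_tokens': 500, 'cache_read_input_tokens': 19000}
print(total_tokens(usage))     # 20500
print(prompt_fraction(usage))  # 0.1
```

Summing only input_tokens and output_tokens undercounts sessions that rely on prompt caching, since most of the prompt is then reported in the cache fields.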
Useful Constants
import os

# File locations
CLAUDE_BASE = os.path.expanduser('~/.claude/projects')
CODEX_BASE = os.path.expanduser('~/.codex/sessions')
AMC_BASE = os.path.expanduser('~/.local/share/amc')
# Read limits
MAX_TAIL_BYTES = 1_000_000 # 1MB
MAX_LINES = 400 # For context extraction
# Timeouts
SUBPROCESS_TIMEOUT = 5 # seconds
SPAWN_COOLDOWN = 30 # seconds
# Session ages
ACTIVE_THRESHOLD_MINUTES = 2
ORPHAN_CLEANUP_HOURS = 24
STARTING_CLEANUP_HOURS = 1
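The read limits are typically applied together when only the end of a large session file is needed. Below is a minimal sketch of such a bounded tail read, reusing the MAX_TAIL_BYTES and MAX_LINES values above. The function name is hypothetical, and this is not necessarily how AMC implements it.

```python
import json
import os

MAX_TAIL_BYTES = 1_000_000  # 1MB
MAX_LINES = 400


def read_tail_events(path, max_bytes=MAX_TAIL_BYTES, max_lines=MAX_LINES):
    """Parse at most max_lines events from the last max_bytes of a JSONL file."""
    with open(path, 'rb') as f:
        size = os.fstat(f.fileno()).st_size
        if size > max_bytes:
            f.seek(size - max_bytes)
            f.readline()  # discard the (probably partial) first line after the seek
        raw_lines = f.read().splitlines()
    events = []
    for raw in raw_lines[-max_lines:]:
        if not raw.strip():
            continue
        try:
            events.append(json.loads(raw))
        except ValueError:
            # Covers invalid JSON and undecodable bytes, e.g. a line still being written.
            continue
    return events
```

Opening in binary mode keeps the seek offset exact; json.loads accepts bytes directly, so no separate decode step is needed.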
Debugging Commands
# Watch session file changes
tail -q -f ~/.claude/projects/-path-to-project/*.jsonl | jq -c .
# Find latest session
ls -t ~/.claude/projects/-path-to-project/*.jsonl | head -1
# Count lines in session
wc -l session.jsonl
# Validate JSON
while IFS= read -r line; do printf '%s\n' "$line" | jq . > /dev/null 2>&1 || echo "Invalid: $line"; done < session.jsonl
# Pretty print last message
tail -1 session.jsonl | jq .
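When the shell loop is too slow for a large file, or line numbers are needed, the same validation can be done in Python. This is a sketch with a hypothetical function name; it reports the 1-based line number and the parser's message for each bad line.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, error_message) for every line that is not valid JSON."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are ignored, matching read_jsonl
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            bad.append((number, exc.msg))
    return bad
```

For a file on disk, pass the open file object: `find_invalid_lines(open('session.jsonl', encoding='utf-8'))`.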