teernisse 781e74cda2 docs(jsonl): add comprehensive Claude JSONL session log reference

Create authoritative documentation suite for Claude Code JSONL session
log processing, synthesized from codebase analysis, official Anthropic
documentation, and community tooling research.

Documentation structure (docs/claude-jsonl-reference/):

01-format-specification.md (214 lines):
- Complete message envelope structure with all fields
- Content block types (text, thinking, tool_use, tool_result)
- Usage object for token reporting
- Model identifiers and version history
- Conversation DAG structure via parentUuid

02-message-types.md (346 lines):
- Every message type with concrete JSON examples
- User messages (string content vs array for tool results)
- Assistant messages with all content block variants
- Progress events (hooks, bash, MCP)
- System, summary, and file-history-snapshot types
- Codex format differences (response_item, function_call)

03-tool-lifecycle.md (341 lines):
- Complete tool invocation to result flow
- Hook input/output formats (PreToolUse, PostToolUse)
- Parallel tool call handling
- Tool-to-result pairing algorithm
- Missing result edge cases
- Codex tool format differences

04-subagent-teams.md (363 lines):
- Task tool invocation and input fields
- Subagent transcript locations and format
- Team coordination (TeamCreate, SendMessage)
- Hook events (SubagentStart, SubagentStop)
- AMC spawn tracking with pending spawn registry
- Worktree isolation for subagents

05-edge-cases.md (475 lines):
- Parsing edge cases (invalid JSON, type ambiguity)
- Type coercion gotchas (bool vs int in Python)
- Session state edge cases (orphans, dead detection)
- Tool call edge cases (missing results, parallel ordering)
- Codex-specific quirks (content injection, buffering)
- File system safety (path traversal, permissions)
- Cache invalidation strategies

06-quick-reference.md (238 lines):
- File locations cheat sheet
- jq recipes for common queries
- Python parsing snippets
- Common gotchas table
- Useful constants
- Debugging commands

Also adds CLAUDE.md at project root linking to documentation and
providing project overview for agents working on AMC.

Sources include Claude Code hooks.md, headless.md, Anthropic Messages
API reference, and community tools (claude-code-log, claude-JSONL-browser).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-28 00:48:55 -05:00

Quick Reference

Cheat sheet for common Claude JSONL operations.

File Locations

# Claude sessions
~/.claude/projects/-Users-user-projects-myapp/*.jsonl

# Codex sessions
~/.codex/sessions/**/*.jsonl

# Subagent transcripts
~/.claude/projects/.../session-id/subagents/agent-*.jsonl

# AMC session state
~/.local/share/amc/sessions/*.json

Path Encoding

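# Encode: /Users/dev/myproject -> -Users-dev-myproject
# (the leading '/' already becomes the leading '-')
encoded = project_path.replace('/', '-')

# Decode: -Users-dev-myproject -> /Users/dev/myproject
# (lossy: a '-' that was part of a directory name also becomes '/')
decoded = encoded.replace('-', '/')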
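The round trip can be checked with a short sketch. The helper names `encode_project_path` and `decode_project_path` are hypothetical, and the only rule assumed is the one the example above shows (each `/` becomes `-`). Decoding is best-effort, because a hyphen inside a directory name cannot be told apart from a separator.

```python
def encode_project_path(project_path: str) -> str:
    """Encode an absolute path as in the example above: every '/' becomes '-'."""
    return project_path.replace('/', '-')


def decode_project_path(encoded: str) -> str:
    """Best-effort inverse; wrong for directory names that contain '-'."""
    return encoded.replace('-', '/')


encoded = encode_project_path('/Users/dev/myproject')
print(encoded)                       # -Users-dev-myproject
print(decode_project_path(encoded))  # /Users/dev/myproject

# Lossy case: the hyphen in "my-app" comes back as a path separator.
print(decode_project_path(encode_project_path('/Users/dev/my-app')))  # /Users/dev/my/app
```

When the original path matters, it is safer to keep it alongside the encoded name than to rely on decoding.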

Message Type Quick ID

| If you see... | It's a... |
| --- | --- |
| `"type": "user"` + string content | User input |
| `"type": "user"` + array content | Tool results |
| `"type": "assistant"` | Claude response |
| `"type": "progress"` | Hook/tool execution |
| `"type": "summary"` | Session summary |
| `"type": "system"` | Metadata/commands |
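The table translates directly into a small classifier. This is a sketch only: `classify_event` and its label strings are hypothetical, and it follows the table literally by treating any array-valued user content as tool results.

```python
def classify_event(event: dict) -> str:
    """Map one parsed JSONL line to a label from the table above."""
    event_type = event.get('type')
    if event_type == 'user':
        message = event.get('message')
        content = message.get('content') if isinstance(message, dict) else None
        if isinstance(content, str):
            return 'user input'
        if isinstance(content, list):
            return 'tool results'
        return 'unknown'
    labels = {
        'assistant': 'claude response',
        'progress': 'hook/tool execution',
        'summary': 'session summary',
        'system': 'metadata/commands',
    }
    # Any type not listed in the table falls through to 'unknown'
    return labels.get(event_type, 'unknown')
```

Returning `'unknown'` rather than raising keeps a scan going when a log contains a type this table does not list.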

Content Block Quick ID

| Block Type | Key Fields |
| --- | --- |
| `text` | `text` |
| `thinking` | `thinking`, `signature` |
| `tool_use` | `id`, `name`, `input` |
| `tool_result` | `tool_use_id`, `content`, `is_error` |
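As an illustration of how the key fields are read per block type, the sketch below produces a one-line summary of a content block. `describe_block` is a hypothetical helper, and every field is read with `.get()` on the assumption that any of them may be absent.

```python
def describe_block(block: dict) -> str:
    """Return a one-line summary of a content block, keyed on its type."""
    block_type = block.get('type')
    if block_type == 'text':
        return f"text: {block.get('text', '')[:60]}"
    if block_type == 'thinking':
        return f"thinking: {len(block.get('thinking', ''))} chars"
    if block_type == 'tool_use':
        return f"tool_use {block.get('id')}: {block.get('name')}"
    if block_type == 'tool_result':
        status = 'error' if block.get('is_error') else 'ok'
        return f"tool_result for {block.get('tool_use_id')}: {status}"
    return f"unknown block type: {block_type!r}"
```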

jq Recipes

# Count messages by type
jq -s 'group_by(.type) | map({type: .[0].type, count: length})' session.jsonl

# Extract all tool calls
jq -c 'select(.type=="assistant") | .message.content[]? | select(.type=="tool_use")' session.jsonl

# Get user messages only
jq -c 'select(.type=="user" and (.message.content | type)=="string")' session.jsonl

# Sum tokens
jq -s '[.[].message.usage? | select(.) | .input_tokens + .output_tokens] | add' session.jsonl

# List tools used (-r prints raw names without JSON quotes)
jq -r 'select(.type=="assistant") | .message.content[]? | select(.type=="tool_use") | .name' session.jsonl | sort | uniq -c

# Find errors
jq -c 'select(.type=="user") | .message.content[]? | select(.type=="tool_result" and .is_error==true)' session.jsonl
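Where jq is not available, the first and fifth recipes (count by type, list tools used) can be approximated in Python. This is a rough equivalent under the same field-layout assumptions as the recipes; `count_types_and_tools` is a hypothetical name.

```python
import json
from collections import Counter


def count_types_and_tools(lines):
    """Count events by type and tool_use blocks by tool name.

    `lines` is any iterable of JSONL strings, for example an open file.
    """
    type_counts = Counter()
    tool_counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting
        if not isinstance(event, dict):
            continue
        type_counts[event.get('type')] += 1
        if event.get('type') != 'assistant':
            continue
        message = event.get('message')
        content = message.get('content') if isinstance(message, dict) else None
        if isinstance(content, list):
            for block in content:
                if isinstance(block, dict) and block.get('type') == 'tool_use':
                    tool_counts[block.get('name')] += 1
    return type_counts, tool_counts
```

Unlike `jq -s`, this reads one line at a time, so it does not need to hold the whole session in memory.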

Python Snippets

Read JSONL

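import json

def read_jsonl(path):
    """Yield one parsed object per non-empty line, skipping malformed lines."""
    with open(path, encoding='utf-8') as f:
        for line in f:
            if line.strip():
                try:
                    yield json.loads(line)
                except json.JSONDecodeError:
                    # Tolerate a truncated or corrupt line
                    continue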

Extract Conversation

def extract_conversation(path):
    """Return user prompts and assistant text blocks in file order."""
    messages = []
    for event in read_jsonl(path):
        message = event.get('message') or {}
        if event.get('type') == 'user':
            content = message.get('content')
            # String content is typed user input; array content carries tool results
            if isinstance(content, str):
                messages.append({'role': 'user', 'content': content})
        elif event.get('type') == 'assistant':
            for block in message.get('content', []):
                if block.get('type') == 'text':
                    messages.append({'role': 'assistant', 'content': block['text']})
    return messages

Get Token Usage

def get_token_usage(path):
    """Sum input and output tokens over all assistant events."""
    total_input = 0
    total_output = 0

    for event in read_jsonl(path):
        if event.get('type') == 'assistant':
            # `message` or `usage` may be missing or null
            usage = (event.get('message') or {}).get('usage') or {}
            total_input += usage.get('input_tokens', 0)
            total_output += usage.get('output_tokens', 0)

    return {'input': total_input, 'output': total_output}

Find Tool Calls

def find_tool_calls(path):
    """Collect every tool_use block emitted by the assistant."""
    tools = []
    for event in read_jsonl(path):
        if event.get('type') == 'assistant':
            for block in (event.get('message') or {}).get('content', []):
                if block.get('type') == 'tool_use':
                    tools.append({
                        'name': block['name'],
                        'id': block['id'],
                        'input': block['input']
                    })
    return tools

Pair Tools with Results

def pair_tools_results(path):
    """Map each tool_use id to its block and its matching tool_result (or None)."""
    pending = {}

    for event in read_jsonl(path):
        message = event.get('message') or {}
        if event.get('type') == 'assistant':
            for block in message.get('content', []):
                if block.get('type') == 'tool_use':
                    pending[block['id']] = {'use': block, 'result': None}

        elif event.get('type') == 'user':
            content = message.get('content', [])
            # Tool results arrive in user messages whose content is an array
            if isinstance(content, list):
                for block in content:
                    if block.get('type') == 'tool_result':
                        tool_id = block['tool_use_id']
                        if tool_id in pending:
                            pending[tool_id]['result'] = block

    return pending
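Entries whose `result` is still `None` after the whole file has been read are tool calls that never received a result, which can happen when the log ends mid-call. A small sketch for separating them, assuming the `pending` shape returned by `pair_tools_results`; `split_paired` is a hypothetical helper name.

```python
def split_paired(pending: dict):
    """Split a tool_use-id mapping into completed and unanswered tool calls."""
    completed, missing = {}, {}
    for tool_id, pair in pending.items():
        # A pair without a result was never answered in this log
        target = missing if pair.get('result') is None else completed
        target[tool_id] = pair
    return completed, missing


# Example with hand-written data in the same shape as `pending`
example = {
    'toolu_a': {'use': {'name': 'Bash'}, 'result': {'content': 'ok'}},
    'toolu_b': {'use': {'name': 'Read'}, 'result': None},
}
done, unanswered = split_paired(example)
print(sorted(done), sorted(unanswered))  # ['toolu_a'] ['toolu_b']
```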

Common Gotchas

| Gotcha | Solution |
| --- | --- |
| `content` can be string or array | Check `isinstance(content, str)` first |
| `usage` may be missing | Use `.get('usage', {})` |
| Booleans are ints in Python | Check `isinstance(v, bool)` before `isinstance(v, int)` |
| First line may be partial after seek | Call `readline()` to discard |
| Tool results in user messages | Check for `tool_result` type in array |
| Codex `arguments` is JSON string | Parse with `json.loads()` |
| Agent ID vs session ID | Agent ID survives rewrites, session ID is per-run |
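Two of these gotchas are easy to show in a few lines. The sketch below is illustrative only: `coerce_int` and `parse_arguments` are hypothetical helper names, and the fallback values (`0` and `None`) are arbitrary choices.

```python
import json


def coerce_int(value, default=0):
    """Return value if it is a real int; reject bools, which subclass int."""
    if isinstance(value, bool):  # must be tested before int
        return default
    if isinstance(value, int):
        return value
    return default


def parse_arguments(arguments):
    """Decode a Codex-style `arguments` field that holds a JSON string."""
    if isinstance(arguments, dict):
        return arguments  # already decoded
    if isinstance(arguments, str):
        try:
            return json.loads(arguments)
        except json.JSONDecodeError:
            return None
    return None


print(isinstance(True, int))             # True, hence the bool check first
print(coerce_int(True), coerce_int(42))  # 0 42
print(parse_arguments('{"cmd": "ls"}'))  # {'cmd': 'ls'}
```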

Status Values

| Field | Values |
| --- | --- |
| `status` | `starting`, `active`, `done` |
| `stop_reason` | `end_turn`, `max_tokens`, `stop_sequence`, `tool_use`, null |
| `is_error` | `true`, `false` (tool results) |

Token Fields

# All possible token fields to sum
token_fields = [
    'input_tokens',
    'output_tokens',
    'cache_creation_input_tokens',
    'cache_read_input_tokens'
]

# Context window by model
context_windows = {
    'claude-opus': 200_000,
    'claude-sonnet': 200_000,
    'claude-haiku': 200_000,
    'claude-2': 100_000
}
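One way to use the two structures above is sketched below. The names `total_tokens` and `context_window_for` are hypothetical. Treating the dictionary keys as model-name prefixes is an assumption (the mapping above does not say how keys are matched), and unknown models return `None` rather than a guessed window.

```python
TOKEN_FIELDS = (
    'input_tokens',
    'output_tokens',
    'cache_creation_input_tokens',
    'cache_read_input_tokens',
)

# Same values as the mapping above, repeated so the sketch is self-contained
CONTEXT_WINDOWS = {
    'claude-opus': 200_000,
    'claude-sonnet': 200_000,
    'claude-haiku': 200_000,
    'claude-2': 100_000,
}


def total_tokens(usage: dict) -> int:
    """Sum every known token field, ignoring missing or non-integer values."""
    total = 0
    for field in TOKEN_FIELDS:
        value = usage.get(field, 0)
        if isinstance(value, int) and not isinstance(value, bool):
            total += value
    return total


def context_window_for(model: str):
    """Return the window for the first matching prefix, or None if unknown."""
    for prefix, window in CONTEXT_WINDOWS.items():
        if model.startswith(prefix):  # assumption: keys are prefixes
            return window
    return None
```

Prefix matching will miss identifiers that put the family name later in the string, so callers should be prepared to handle `None`.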

Useful Constants

import os

# File locations
CLAUDE_BASE = os.path.expanduser('~/.claude/projects')
CODEX_BASE = os.path.expanduser('~/.codex/sessions')
AMC_BASE = os.path.expanduser('~/.local/share/amc')

# Read limits
MAX_TAIL_BYTES = 1_000_000  # 1MB
MAX_LINES = 400  # For context extraction

# Timeouts
SUBPROCESS_TIMEOUT = 5  # seconds
SPAWN_COOLDOWN = 30  # seconds

# Session ages
ACTIVE_THRESHOLD_MINUTES = 2
ORPHAN_CLEANUP_HOURS = 24
STARTING_CLEANUP_HOURS = 1
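The read limits pair naturally with the "first line may be partial after seek" gotcha. The following is a minimal sketch of a bounded tail read, not necessarily how AMC implements it; `read_tail_events` is a hypothetical name, and the two limits are repeated so the block runs on its own.

```python
import json
import os

MAX_TAIL_BYTES = 1_000_000  # same value as above
MAX_LINES = 400             # same value as above


def read_tail_events(path, max_bytes=MAX_TAIL_BYTES, max_lines=MAX_LINES):
    """Parse up to max_lines events from the last max_bytes of a JSONL file."""
    size = os.path.getsize(path)
    with open(path, 'rb') as f:
        if size > max_bytes:
            f.seek(size - max_bytes)
            f.readline()  # discard the first line, which is probably partial
        raw_lines = f.read().splitlines()

    events = []
    for raw in raw_lines[-max_lines:]:
        if not raw.strip():
            continue
        try:
            events.append(json.loads(raw.decode('utf-8', errors='replace')))
        except json.JSONDecodeError:
            continue  # skip malformed or half-written lines
    return events
```

Reading in binary mode keeps the seek offset exact. If the seek happens to land on a line boundary, one complete line is dropped, which is usually an acceptable cost for a tail read.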

Debugging Commands

# Watch session file changes (-q suppresses the per-file headers that jq cannot parse)
tail -q -f ~/.claude/projects/-path-to-project/*.jsonl | jq -c .

# Find latest session
ls -t ~/.claude/projects/-path-to-project/*.jsonl | head -1

# Count lines in session
wc -l session.jsonl

# Validate JSON (read -r keeps backslashes intact)
while IFS= read -r line; do printf '%s\n' "$line" | jq . > /dev/null 2>&1 || echo "Invalid: $line"; done < session.jsonl

# Pretty print last message
tail -1 session.jsonl | jq .
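For large sessions, spawning one jq process per line is slow. A Python alternative that also reports line numbers might look like this; `find_invalid_lines` is a hypothetical helper that accepts any iterable of lines, such as an open file.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, error_message) for each non-empty line that is not valid JSON."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are ignored, not reported
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, exc.msg))
    return problems
```

To check a file, pass it an open handle, for example `find_invalid_lines(open('session.jsonl', encoding='utf-8'))`.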
