Create authoritative documentation suite for Claude Code JSONL session log processing, synthesized from codebase analysis, official Anthropic documentation, and community tooling research.

Documentation structure (docs/claude-jsonl-reference/):

01-format-specification.md (214 lines):
- Complete message envelope structure with all fields
- Content block types (text, thinking, tool_use, tool_result)
- Usage object for token reporting
- Model identifiers and version history
- Conversation DAG structure via parentUuid

02-message-types.md (346 lines):
- Every message type with concrete JSON examples
- User messages (string content vs array for tool results)
- Assistant messages with all content block variants
- Progress events (hooks, bash, MCP)
- System, summary, and file-history-snapshot types
- Codex format differences (response_item, function_call)

03-tool-lifecycle.md (341 lines):
- Complete tool invocation to result flow
- Hook input/output formats (PreToolUse, PostToolUse)
- Parallel tool call handling
- Tool-to-result pairing algorithm
- Missing result edge cases
- Codex tool format differences

04-subagent-teams.md (363 lines):
- Task tool invocation and input fields
- Subagent transcript locations and format
- Team coordination (TeamCreate, SendMessage)
- Hook events (SubagentStart, SubagentStop)
- AMC spawn tracking with pending spawn registry
- Worktree isolation for subagents

05-edge-cases.md (475 lines):
- Parsing edge cases (invalid JSON, type ambiguity)
- Type coercion gotchas (bool vs int in Python)
- Session state edge cases (orphans, dead detection)
- Tool call edge cases (missing results, parallel ordering)
- Codex-specific quirks (content injection, buffering)
- File system safety (path traversal, permissions)
- Cache invalidation strategies

06-quick-reference.md (238 lines):
- File locations cheat sheet
- jq recipes for common queries
- Python parsing snippets
- Common gotchas table
- Useful constants
- Debugging commands

Also adds CLAUDE.md at project root linking to documentation and providing project overview for agents working on AMC.

Sources include Claude Code hooks.md, headless.md, Anthropic Messages API reference, and community tools (claude-code-log, claude-JSONL-browser).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude JSONL Format Specification
File Format
- Format: Newline-delimited JSON (NDJSON/JSONL)
- Encoding: UTF-8
- Line terminator: \n (LF)
- One JSON object per line, no array wrapper
Message Envelope (Common Fields)
Most lines in a Claude JSONL file carry the following envelope fields; the Field Reference table marks which are conditional or nullable:
{
  "parentUuid": "uuid-string or null",
  "isSidechain": false,
  "userType": "external",
  "cwd": "/full/path/to/working/directory",
  "sessionId": "session-uuid-v4",
  "version": "2.1.20",
  "gitBranch": "branch-name or empty string",
  "type": "user|assistant|progress|system|summary|file-history-snapshot",
  "message": { ... },
  "uuid": "unique-message-uuid-v4",
  "timestamp": "ISO-8601 timestamp"
}
Field Reference
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Message type identifier |
| uuid | string (uuid) | Yes* | Unique identifier for this event |
| parentUuid | string (uuid) or null | Yes | Links to parent message (null for root) |
| timestamp | string (ISO-8601) | Yes* | When event occurred (UTC) |
| sessionId | string (uuid) | Yes | Session identifier |
| version | string (semver) | Yes | Claude Code version (e.g., "2.1.20") |
| cwd | string (path) | Yes | Working directory at event time |
| gitBranch | string | No | Git branch name (empty if not in repo) |
| isSidechain | boolean | Yes | true for subagent sessions |
| userType | string | Yes | Always "external" for user sessions |
| message | object | Conditional | Message content (user/assistant types) |
| agentId | string | Conditional | Agent identifier (subagent sessions only) |
*May be null in metadata-only entries like file-history-snapshot
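Because several envelope fields are conditional or nullable, a reader typically copies them out with safe defaults instead of indexing directly. A minimal sketch of that idea follows; the `Envelope` dataclass and `parse_envelope` helper are hypothetical names used only for illustration, not part of Claude Code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Envelope:
    """Subset of the envelope fields from the table above (hypothetical container)."""
    type: str
    uuid: Optional[str]
    parent_uuid: Optional[str]
    timestamp: Optional[str]
    session_id: Optional[str]
    is_sidechain: bool
    agent_id: Optional[str]
    message: Optional[dict]


def parse_envelope(event: dict) -> Envelope:
    """Copy envelope fields out of one decoded JSONL line, tolerating gaps."""
    message = event.get("message")
    return Envelope(
        type=event.get("type", ""),
        uuid=event.get("uuid"),                # may be null in metadata-only entries
        parent_uuid=event.get("parentUuid"),   # null for a root message
        timestamp=event.get("timestamp"),      # may be null in metadata-only entries
        session_id=event.get("sessionId"),
        is_sidechain=event.get("isSidechain") is True,
        agent_id=event.get("agentId"),         # subagent sessions only
        message=message if isinstance(message, dict) else None,
    )


snapshot = parse_envelope({"type": "file-history-snapshot", "uuid": None, "timestamp": None})
print(snapshot.uuid, snapshot.is_sidechain, snapshot.message)
```

Using `.get()` everywhere means a metadata-only entry such as file-history-snapshot parses without raising, and the caller decides what to do with the `None` values.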
Content Structure
User Message Content
User messages have message.content as either:
String (direct input):
{
  "message": {
    "role": "user",
    "content": "Your question or instruction"
  }
}
Array (tool results):
{
  "message": {
    "role": "user",
    "content": [
      {
        "type": "tool_result",
        "tool_use_id": "toolu_01XYZ",
        "content": "Tool output text"
      }
    ]
  }
}
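Since user content arrives in two shapes, a consumer usually normalizes it before doing anything else. The sketch below shows one way to do that; `user_text_and_results` is a hypothetical helper name, and the handling of `text` blocks inside a user array is an assumption, since this section only documents `tool_result` blocks there.

```python
def user_text_and_results(message: dict) -> tuple[str, list[dict]]:
    """Return (typed text, tool_result blocks) for a user message of either shape."""
    content = message.get("content")
    if isinstance(content, str):
        # Direct input: the whole content is the user's text.
        return content, []

    text_parts: list[str] = []
    results: list[dict] = []
    if isinstance(content, list):
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "tool_result":
                results.append(block)
            elif block.get("type") == "text":
                # Assumption: text blocks may also appear in the array form.
                text_parts.append(block.get("text", ""))
    return "\n".join(text_parts), results


direct = user_text_and_results({"role": "user", "content": "Your question or instruction"})
tool_msg = {
    "role": "user",
    "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01XYZ", "content": "Tool output text"}
    ],
}
text, results = user_text_and_results(tool_msg)
print(direct[0], len(results))
```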
Assistant Message Content
Assistant messages always have message.content as an array:
{
  "message": {
    "role": "assistant",
    "type": "message",
    "model": "claude-opus-4-5-20251101",
    "id": "msg_bdrk_01Abc123",
    "content": [
      {"type": "thinking", "thinking": "..."},
      {"type": "text", "text": "..."},
      {"type": "tool_use", "id": "toolu_01XYZ", "name": "Read", "input": {...}}
    ],
    "stop_reason": "end_turn",
    "stop_sequence": null,
    "usage": {...}
  }
}
Content Block Types
Text Block
{
  "type": "text",
  "text": "Response text content"
}
Thinking Block
{
  "type": "thinking",
  "thinking": "Internal reasoning (extended thinking mode)",
  "signature": "base64-signature (optional)"
}
Tool Use Block
{
  "type": "tool_use",
  "id": "toolu_01Abc123XYZ",
  "name": "ToolName",
  "input": {
    "param1": "value1",
    "param2": 123
  }
}
Tool Result Block
{
  "type": "tool_result",
  "tool_use_id": "toolu_01Abc123XYZ",
  "content": "Result text or structured output",
  "is_error": false
}
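The `id` on a tool_use block and the `tool_use_id` on a tool_result block are what link a call to its output. As a rough sketch of how the block types above might be consumed together, the hypothetical `pair_tool_calls` helper below indexes calls by id and attaches results as they appear; it assumes events are processed in file order and leaves calls without a result marked as pending.

```python
def pair_tool_calls(events: list[dict]) -> dict[str, dict]:
    """Map each tool_use id to its name, input, and (if seen) its result."""
    calls: dict[str, dict] = {}
    for event in events:
        message = event.get("message")
        if not isinstance(message, dict):
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue  # string content carries no blocks
        for block in content:
            if not isinstance(block, dict):
                continue
            kind = block.get("type")
            if kind == "tool_use" and block.get("id"):
                calls[block["id"]] = {
                    "name": block.get("name"),
                    "input": block.get("input", {}),
                    "result": None,       # filled in when the result arrives
                    "is_error": False,
                }
            elif kind == "tool_result":
                call = calls.get(block.get("tool_use_id"))
                if call is not None:
                    call["result"] = block.get("content")
                    call["is_error"] = block.get("is_error") is True
    return calls


events = [
    {"type": "assistant", "message": {"role": "assistant", "content": [
        {"type": "text", "text": "Response text content"},
        {"type": "tool_use", "id": "toolu_01XYZ", "name": "Read", "input": {"param1": "value1"}},
        {"type": "tool_use", "id": "toolu_02ABC", "name": "ToolName", "input": {"param2": 123}},
    ]}},
    {"type": "user", "message": {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_01XYZ", "content": "Tool output text", "is_error": False},
    ]}},
]
calls = pair_tool_calls(events)
pending = [tool_id for tool_id, call in calls.items() if call["result"] is None]
print(calls["toolu_01XYZ"]["result"], pending)
```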
Usage Object
Token consumption reported in assistant messages:
{
  "usage": {
    "input_tokens": 1000,
    "output_tokens": 500,
    "cache_creation_input_tokens": 200,
    "cache_read_input_tokens": 400,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 200,
      "ephemeral_1h_input_tokens": 0
    },
    "service_tier": "standard"
  }
}
| Field | Type | Description |
|---|---|---|
| input_tokens | int | Input tokens consumed |
| output_tokens | int | Output tokens generated |
| cache_creation_input_tokens | int | Tokens used to create cache |
| cache_read_input_tokens | int | Tokens read from cache |
| service_tier | string | API tier ("standard", etc.) |
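A session total is usually built by accumulating these four counters across assistant messages. The sketch below is one way to do it; `as_count` and `add_usage` are hypothetical helpers. It assumes the nested `cache_creation` object is a breakdown of `cache_creation_input_tokens` (in the example above the two ephemeral values sum to it), so the nested values are not added again.

```python
TOKEN_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def as_count(value: object) -> int:
    """Accept only real integers; bool is a subclass of int in Python, so reject it."""
    if isinstance(value, bool) or not isinstance(value, int):
        return 0
    return value


def add_usage(totals: dict, usage: object) -> dict:
    """Accumulate one usage object into running totals, ignoring missing fields."""
    if isinstance(usage, dict):
        for field in TOKEN_FIELDS:
            totals[field] = totals.get(field, 0) + as_count(usage.get(field))
    return totals


totals: dict = {}
add_usage(totals, {
    "input_tokens": 1000,
    "output_tokens": 500,
    "cache_creation_input_tokens": 200,
    "cache_read_input_tokens": 400,
    "cache_creation": {"ephemeral_5m_input_tokens": 200, "ephemeral_1h_input_tokens": 0},
    "service_tier": "standard",
})
add_usage(totals, {"input_tokens": 10, "output_tokens": True})  # malformed value is ignored
print(totals, sum(totals.values()))
```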
Model Identifiers
Common model names in message.model:
| Model | Identifier |
|---|---|
| Claude Opus 4.5 | claude-opus-4-5-20251101 |
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 |
Version History
| Version | Changes |
|---|---|
| 2.1.20 | Extended thinking, permission modes, todos |
| 2.1.17 | Subagent support with agentId |
| 2.1.x | Progress events, hook metadata |
| 2.0.x | Basic message/tool_use/tool_result |
Conversation Graph
Messages form a DAG (directed acyclic graph) via parent-child relationships:
Root (parentUuid: null)
├── User message (uuid: A)
│   └── Assistant (uuid: B, parentUuid: A)
│       ├── Progress: Tool (uuid: C, parentUuid: A)
│       └── Progress: Hook (uuid: D, parentUuid: A)
└── User message (uuid: E, parentUuid: B)
    └── Assistant (uuid: F, parentUuid: E)
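To reconstruct this structure from a flat file, a reader can index events by `parentUuid` and walk from the roots. The following is a minimal sketch with hypothetical helper names (`build_graph`, `walk`); treating an event whose parent is absent from the file as an extra root is a design choice of this sketch, not documented behavior.

```python
from collections import defaultdict
from typing import Iterator


def build_graph(events: list[dict]) -> tuple[list[str], dict[str, list[str]]]:
    """Return (root uuids, parent uuid -> child uuids) in file order."""
    known = {e.get("uuid") for e in events if e.get("uuid") is not None}
    roots: list[str] = []
    children: dict[str, list[str]] = defaultdict(list)
    for event in events:
        uuid = event.get("uuid")
        if uuid is None:
            continue  # metadata-only entries have no place in the graph
        parent = event.get("parentUuid")
        if parent is None or parent not in known:
            roots.append(uuid)  # true root, or orphan whose parent is missing
        else:
            children[parent].append(uuid)
    return roots, children


def walk(roots: list[str], children: dict[str, list[str]]) -> Iterator[tuple[str, int]]:
    """Depth-first traversal yielding (uuid, depth); iterative to survive long chains."""
    stack = [(uuid, 0) for uuid in reversed(roots)]
    seen: set[str] = set()
    while stack:
        uuid, depth = stack.pop()
        if uuid in seen:
            continue  # guard against duplicate uuids
        seen.add(uuid)
        yield uuid, depth
        for child in reversed(children.get(uuid, [])):
            stack.append((child, depth + 1))


events = [
    {"uuid": "A", "parentUuid": None},
    {"uuid": "B", "parentUuid": "A"},
    {"uuid": "C", "parentUuid": "B"},
    {"uuid": "E", "parentUuid": "B"},
    {"uuid": "F", "parentUuid": "E"},
    {"uuid": "X", "parentUuid": "not-in-file"},
]
roots, children = build_graph(events)
order = list(walk(roots, children))
print(order)
```

An explicit stack is used instead of recursion because a long session can form a parent chain thousands of messages deep.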
Parsing Recommendations
- Line-by-line: Read and parse one line at a time; don't load the entire file into memory
- Skip invalid lines: Wrap JSON.parse (or your language's equivalent) in try/catch
- Handle missing fields: Check existence before access
- Ignore unknown types: The format evolves with new event types
- Check content type: User content can be a string OR an array
- Sum token variants: Cache tokens are reported separately in cache_creation_input_tokens and cache_read_input_tokens
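The first four recommendations can be combined into a single streaming reader. The sketch below is one possible shape; `iter_events` and `KNOWN_TYPES` are hypothetical names, and the set of known types is taken from the envelope `type` field listed earlier in this document.

```python
import json
from typing import Iterable, Iterator

# Types listed in the envelope section; anything else is treated as unknown.
KNOWN_TYPES = {"user", "assistant", "progress", "system", "summary", "file-history-snapshot"}


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per usable line, skipping blanks, bad JSON, and unknown types."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue  # truncated or corrupt line
        if not isinstance(event, dict):
            continue  # valid JSON, but not an envelope object
        if event.get("type") not in KNOWN_TYPES:
            continue  # newer event type this reader does not understand
        yield event


sample = [
    '{"type": "user", "uuid": "A", "message": {"role": "user", "content": "hi"}}',
    '{"type": "assistant", "uuid": "B"',  # truncated line
    "",
    '{"type": "future-event", "uuid": "C"}',
    '{"type": "summary"}',
]
events = list(iter_events(sample))
print([event["type"] for event in events])
```

An open file object is itself an iterable of lines, so `iter_events(open(path, encoding="utf-8"))` streams a session log without loading it whole.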