
teernisse 781e74cda2 docs(jsonl): add comprehensive Claude JSONL session log reference

Create authoritative documentation suite for Claude Code JSONL session
log processing, synthesized from codebase analysis, official Anthropic
documentation, and community tooling research.

Documentation structure (docs/claude-jsonl-reference/):

01-format-specification.md (214 lines):
- Complete message envelope structure with all fields
- Content block types (text, thinking, tool_use, tool_result)
- Usage object for token reporting
- Model identifiers and version history
- Conversation DAG structure via parentUuid

02-message-types.md (346 lines):
- Every message type with concrete JSON examples
- User messages (string content vs array for tool results)
- Assistant messages with all content block variants
- Progress events (hooks, bash, MCP)
- System, summary, and file-history-snapshot types
- Codex format differences (response_item, function_call)

03-tool-lifecycle.md (341 lines):
- Complete tool invocation to result flow
- Hook input/output formats (PreToolUse, PostToolUse)
- Parallel tool call handling
- Tool-to-result pairing algorithm
- Missing result edge cases
- Codex tool format differences

04-subagent-teams.md (363 lines):
- Task tool invocation and input fields
- Subagent transcript locations and format
- Team coordination (TeamCreate, SendMessage)
- Hook events (SubagentStart, SubagentStop)
- AMC spawn tracking with pending spawn registry
- Worktree isolation for subagents

05-edge-cases.md (475 lines):
- Parsing edge cases (invalid JSON, type ambiguity)
- Type coercion gotchas (bool vs int in Python)
- Session state edge cases (orphans, dead detection)
- Tool call edge cases (missing results, parallel ordering)
- Codex-specific quirks (content injection, buffering)
- File system safety (path traversal, permissions)
- Cache invalidation strategies

06-quick-reference.md (238 lines):
- File locations cheat sheet
- jq recipes for common queries
- Python parsing snippets
- Common gotchas table
- Useful constants
- Debugging commands

Also adds CLAUDE.md at project root linking to documentation and
providing project overview for agents working on AMC.

Sources include Claude Code hooks.md, headless.md, Anthropic Messages
API reference, and community tools (claude-code-log, claude-JSONL-browser).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-28 00:48:55 -05:00

5.3 KiB


Claude JSONL Format Specification

File Format

Format: Newline-delimited JSON (NDJSON/JSONL)
Encoding: UTF-8
Line terminator: \n (LF)
One JSON object per line — no array wrapper

Message Envelope (Common Fields)

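Most lines in a Claude JSONL file share the envelope below. Fields marked Conditional or No in the field reference may be absent, and a few may be null in metadata-only entries: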

{
  "parentUuid": "uuid-string or null",
  "isSidechain": false,
  "userType": "external",
  "cwd": "/full/path/to/working/directory",
  "sessionId": "session-uuid-v4",
  "version": "2.1.20",
  "gitBranch": "branch-name or empty string",
  "type": "user|assistant|progress|system|summary|file-history-snapshot",
  "message": { ... },
  "uuid": "unique-message-uuid-v4",
  "timestamp": "ISO-8601 timestamp"
}

Field Reference

Field	Type	Required	Description
`type`	string	Yes	Message type identifier
`uuid`	string (uuid)	Yes*	Unique identifier for this event
`parentUuid`	string (uuid) or null	Yes	Links to parent message (null for root)
`timestamp`	string (ISO-8601)	Yes*	When event occurred (UTC)
`sessionId`	string (uuid)	Yes	Session identifier
`version`	string (semver)	Yes	Claude Code version (e.g., "2.1.20")
`cwd`	string (path)	Yes	Working directory at event time
`gitBranch`	string	No	Git branch name (empty if not in repo)
`isSidechain`	boolean	Yes	`true` for subagent sessions
`userType`	string	Yes	Always "external" for user sessions
`message`	object	Conditional	Message content (user/assistant types)
`agentId`	string	Conditional	Agent identifier (subagent sessions only)

*May be null in metadata-only entries like file-history-snapshot
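The field reference can be turned into a small presence check. The following is a minimal sketch that takes the table at face value; the names `REQUIRED_FIELDS`, `missing_fields`, and `parse_timestamp` are hypothetical helpers, not part of Claude Code, and the check deliberately leaves out `uuid` and `timestamp` because the footnote allows them to be null.

```python
from datetime import datetime
from typing import Any, Optional

# Fields the table marks as required without a footnote.
# Assumption: "required" means the key is present, even if its value is null.
REQUIRED_FIELDS = (
    "type", "parentUuid", "sessionId", "version",
    "cwd", "isSidechain", "userType",
)


def missing_fields(entry: dict[str, Any]) -> list[str]:
    """Return the required envelope keys that are absent from an entry."""
    return [name for name in REQUIRED_FIELDS if name not in entry]


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp, tolerating null and a trailing 'Z'."""
    if not value:
        return None
    # datetime.fromisoformat only accepts a literal 'Z' from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```

Reporting missing keys rather than raising keeps the caller free to decide whether an incomplete entry is fatal or merely skipped.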

Content Structure

User Message Content

User messages have message.content as either:

String (direct input):

{
  "message": {
    "role": "user",
    "content": "Your question or instruction"
  }
}

Array (tool results):

{
  "message": {
    "role": "user",
    "content": [
      {
        "type": "tool_result",
        "tool_use_id": "toolu_01XYZ",
        "content": "Tool output text"
      }
    ]
  }
}
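Because user content arrives in two shapes, it can help to normalize it once at the edge. This is a minimal sketch, assuming a parsed entry is a plain dict; `user_content_blocks` is a hypothetical helper name, and wrapping a string as a text block is a convenience chosen here, not something the log format itself does.

```python
from typing import Any


def user_content_blocks(entry: dict[str, Any]) -> list[dict[str, Any]]:
    """Return user content as a list of blocks, whichever form it took."""
    content = (entry.get("message") or {}).get("content")
    if isinstance(content, str):
        # Direct input: wrap the string so callers only handle one shape.
        return [{"type": "text", "text": content}]
    if isinstance(content, list):
        # Tool results: keep only well-formed (dict) blocks.
        return [block for block in content if isinstance(block, dict)]
    # Missing message, missing content, or an unexpected type.
    return []
```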

Assistant Message Content

Assistant messages always have message.content as an array:

{
  "message": {
    "role": "assistant",
    "type": "message",
    "model": "claude-opus-4-5-20251101",
    "id": "msg_bdrk_01Abc123",
    "content": [
      {"type": "thinking", "thinking": "..."},
      {"type": "text", "text": "..."},
      {"type": "tool_use", "id": "toolu_01XYZ", "name": "Read", "input": {...}}
    ],
    "stop_reason": "end_turn",
    "stop_sequence": null,
    "usage": {...}
  }
}
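One way to consume the content array is to bucket blocks by their `type`. The sketch below assumes the `message` object has already been parsed into a dict; `split_assistant_content` is a hypothetical name, and block types it does not recognize are silently dropped.

```python
from typing import Any


def split_assistant_content(message: dict[str, Any]) -> dict[str, list]:
    """Group an assistant message's content blocks by block type."""
    text: list[str] = []
    thinking: list[str] = []
    tool_uses: list[dict[str, Any]] = []
    for block in message.get("content") or []:
        if not isinstance(block, dict):
            continue
        kind = block.get("type")
        if kind == "text":
            text.append(block.get("text", ""))
        elif kind == "thinking":
            thinking.append(block.get("thinking", ""))
        elif kind == "tool_use":
            tool_uses.append(block)
        # Any other block type is ignored so new types do not break parsing.
    return {"text": text, "thinking": thinking, "tool_uses": tool_uses}
```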

Content Block Types

Text Block

{
  "type": "text",
  "text": "Response text content"
}

Thinking Block

{
  "type": "thinking",
  "thinking": "Internal reasoning (extended thinking mode)",
  "signature": "base64-signature (optional)"
}

Tool Use Block

{
  "type": "tool_use",
  "id": "toolu_01Abc123XYZ",
  "name": "ToolName",
  "input": {
    "param1": "value1",
    "param2": 123
  }
}

Tool Result Block

{
  "type": "tool_result",
  "tool_use_id": "toolu_01Abc123XYZ",
  "content": "Result text or structured output",
  "is_error": false
}
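The `id` of a tool use block and the `tool_use_id` of a tool result block are what tie a call to its output. The following is a rough sketch of that pairing, assuming entries are processed in file order and that a result never precedes its call; `pair_tool_calls` is a hypothetical helper, and calls that never receive a result keep `None`.

```python
from typing import Any, Iterable


def pair_tool_calls(entries: Iterable[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Map each tool_use id to its tool_use block and matching tool_result."""
    calls: dict[str, dict[str, Any]] = {}
    for entry in entries:
        content = (entry.get("message") or {}).get("content")
        if not isinstance(content, list):
            continue  # String content carries no tool blocks.
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "tool_use" and "id" in block:
                calls[block["id"]] = {"use": block, "result": None}
            elif block.get("type") == "tool_result":
                call = calls.get(block.get("tool_use_id"))
                if call is not None:
                    call["result"] = block
    return calls
```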

Usage Object

Token consumption reported in assistant messages:

{
  "usage": {
    "input_tokens": 1000,
    "output_tokens": 500,
    "cache_creation_input_tokens": 200,
    "cache_read_input_tokens": 400,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 200,
      "ephemeral_1h_input_tokens": 0
    },
    "service_tier": "standard"
  }
}

Field	Type	Description
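`input_tokens`	int	Input tokens consumed (not counting cache writes or cache reads)
`output_tokens`	int	Output tokens generated
`cache_creation_input_tokens`	int	Input tokens written to the cache
`cache_read_input_tokens`	int	Input tokens read from the cache
`cache_creation`	object	Breakdown of cache-creation tokens by cache lifetime (`ephemeral_5m_input_tokens`, `ephemeral_1h_input_tokens`)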
`service_tier`	string	API tier ("standard", etc.)
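A total input count has to add the cache fields to `input_tokens`. The sketch below shows one way to do that with a hypothetical `total_input_tokens` helper. It does not add the nested `cache_creation` breakdown, on the assumption (consistent with the example above, where 200 + 0 = 200) that it only subdivides `cache_creation_input_tokens`.

```python
from typing import Any

# Top-level usage fields that each count a disjoint share of the input.
INPUT_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def total_input_tokens(usage: dict[str, Any]) -> int:
    """Sum all input token fields, treating missing or null values as 0."""
    return sum(int(usage.get(field) or 0) for field in INPUT_FIELDS)


usage = {
    "input_tokens": 1000,
    "output_tokens": 500,
    "cache_creation_input_tokens": 200,
    "cache_read_input_tokens": 400,
}
print(total_input_tokens(usage))  # 1600
```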

Model Identifiers

Common model names in message.model:

Model	Identifier
Claude Opus 4.5	`claude-opus-4-5-20251101`
Claude Sonnet 4.5	`claude-sonnet-4-5-20250929`
Claude Haiku 4.5	`claude-haiku-4-5-20251001`

Version History

Version	Changes
2.1.20	Extended thinking, permission modes, todos
2.1.17	Subagent support with agentId
2.1.x	Progress events, hook metadata
2.0.x	Basic message/tool_use/tool_result

Conversation Graph

Messages form a tree, which is a simple case of a DAG (directed acyclic graph): each entry names at most one parent in parentUuid, and the root entry has parentUuid: null. Nesting below follows the parentUuid values:

User message (uuid: A, parentUuid: null)
├── Assistant (uuid: B, parentUuid: A)
│   └── User message (uuid: E, parentUuid: B)
│       └── Assistant (uuid: F, parentUuid: E)
├── Progress: Tool (uuid: C, parentUuid: A)
└── Progress: Hook (uuid: D, parentUuid: A)
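Reconstructing the graph needs only `uuid` and `parentUuid`. The following is a minimal sketch with two hypothetical helpers: `build_children_index` groups entries under their parent, and `ancestor_chain` walks from one entry back to the root. Entries with a null `uuid` are skipped, and the walk stops at an unknown parent or a repeated uuid so that a truncated or malformed log cannot loop forever.

```python
from collections import defaultdict
from typing import Any, Optional


def build_children_index(entries: list[dict[str, Any]]) -> dict[Optional[str], list[str]]:
    """Map each parentUuid (None for roots) to the uuids of its children."""
    children: dict[Optional[str], list[str]] = defaultdict(list)
    for entry in entries:
        uuid = entry.get("uuid")
        if uuid is None:
            continue  # Metadata-only entries may carry no uuid.
        children[entry.get("parentUuid")].append(uuid)
    return dict(children)


def ancestor_chain(entries: list[dict[str, Any]], uuid: Optional[str]) -> list[str]:
    """Return uuids from the given entry up to its root, nearest first."""
    parent_of = {e["uuid"]: e.get("parentUuid") for e in entries if e.get("uuid")}
    chain: list[str] = []
    seen: set[str] = set()
    while uuid is not None and uuid in parent_of and uuid not in seen:
        chain.append(uuid)
        seen.add(uuid)
        uuid = parent_of[uuid]
    return chain
```

With the uuid and parentUuid labels from the diagram, `ancestor_chain(entries, "F")` yields F, E, B, A.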

Parsing Recommendations

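Line-by-line: read and parse one line at a time instead of loading the entire file into memory
Skip invalid lines: wrap each JSON parse (JSON.parse, json.loads) in try/catch and continue on failure
Handle missing fields: check that a field exists and is non-null before using it
Ignore unknown types: the format evolves, so skip type values you do not recognize rather than failing
Check content type: user message.content can be a string or an array
Sum token variants: total input is input_tokens plus cache_creation_input_tokens plus cache_read_input_tokens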
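Several of these recommendations fit in one small generator. The sketch below is an illustration rather than a reference parser: `iter_entries` and `KNOWN_TYPES` are hypothetical names, and the type set is copied from the envelope example in this document.

```python
import json
from typing import Any, Iterator

# Type values listed in the message envelope; anything else is skipped.
KNOWN_TYPES = {
    "user", "assistant", "progress", "system",
    "summary", "file-history-snapshot",
}


def iter_entries(path: str) -> Iterator[dict[str, Any]]:
    """Yield parsed entries one line at a time, skipping unusable lines."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:  # Streams the file; never loads it whole.
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # Invalid or partially written line.
            if isinstance(entry, dict) and entry.get("type") in KNOWN_TYPES:
                yield entry
```

A caller can then write `for entry in iter_entries(path): ...` and apply the per-type handling it needs without holding the whole session in memory.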
