amc/docs/claude-jsonl-reference/01-format-specification.md
teernisse 781e74cda2 docs(jsonl): add comprehensive Claude JSONL session log reference
Create authoritative documentation suite for Claude Code JSONL session
log processing, synthesized from codebase analysis, official Anthropic
documentation, and community tooling research.

Documentation structure (docs/claude-jsonl-reference/):

01-format-specification.md (214 lines):
- Complete message envelope structure with all fields
- Content block types (text, thinking, tool_use, tool_result)
- Usage object for token reporting
- Model identifiers and version history
- Conversation DAG structure via parentUuid

02-message-types.md (346 lines):
- Every message type with concrete JSON examples
- User messages (string content vs array for tool results)
- Assistant messages with all content block variants
- Progress events (hooks, bash, MCP)
- System, summary, and file-history-snapshot types
- Codex format differences (response_item, function_call)

03-tool-lifecycle.md (341 lines):
- Complete tool invocation to result flow
- Hook input/output formats (PreToolUse, PostToolUse)
- Parallel tool call handling
- Tool-to-result pairing algorithm
- Missing result edge cases
- Codex tool format differences

04-subagent-teams.md (363 lines):
- Task tool invocation and input fields
- Subagent transcript locations and format
- Team coordination (TeamCreate, SendMessage)
- Hook events (SubagentStart, SubagentStop)
- AMC spawn tracking with pending spawn registry
- Worktree isolation for subagents

05-edge-cases.md (475 lines):
- Parsing edge cases (invalid JSON, type ambiguity)
- Type coercion gotchas (bool vs int in Python)
- Session state edge cases (orphans, dead detection)
- Tool call edge cases (missing results, parallel ordering)
- Codex-specific quirks (content injection, buffering)
- File system safety (path traversal, permissions)
- Cache invalidation strategies

06-quick-reference.md (238 lines):
- File locations cheat sheet
- jq recipes for common queries
- Python parsing snippets
- Common gotchas table
- Useful constants
- Debugging commands

Also adds CLAUDE.md at project root linking to documentation and
providing project overview for agents working on AMC.

Sources include Claude Code hooks.md, headless.md, Anthropic Messages
API reference, and community tools (claude-code-log, claude-JSONL-browser).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 00:48:55 -05:00

Claude JSONL Format Specification

File Format

  • Format: Newline-delimited JSON (NDJSON/JSONL)
  • Encoding: UTF-8
  • Line terminator: \n (LF)
  • One JSON object per line — no array wrapper
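As a minimal sketch of the framing rules above, the following round-trips a couple of events through JSONL text. The helper names `encode_jsonl` and `decode_jsonl` are hypothetical and not part of Claude Code or AMC; the sample events are illustrative only.

```python
import json


def encode_jsonl(events):
    """Serialize dicts as JSONL text: one compact JSON object per LF-terminated line."""
    # With the default indent=None, json.dumps escapes newlines inside strings,
    # so every event occupies exactly one physical line.
    return "".join(json.dumps(event, ensure_ascii=False) + "\n" for event in events)


def decode_jsonl(text):
    """Parse JSONL text back into a list of objects, ignoring blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Illustrative events only; real entries carry the full envelope.
sample = [
    {"type": "user", "uuid": "a", "note": "multi\nline text"},
    {"type": "assistant", "uuid": "b", "note": "naïve café"},
]
encoded = encode_jsonl(sample)
```

When writing to disk, opening the file with `encoding="utf-8"` and `newline="\n"` keeps the encoding and line terminator consistent with the list above.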

Message Envelope (Common Fields)

Most lines in a Claude JSONL file share this envelope. The field reference that follows marks which fields are required, conditional, or nullable:

{
  "parentUuid": "uuid-string or null",
  "isSidechain": false,
  "userType": "external",
  "cwd": "/full/path/to/working/directory",
  "sessionId": "session-uuid-v4",
  "version": "2.1.20",
  "gitBranch": "branch-name or empty string",
  "type": "user|assistant|progress|system|summary|file-history-snapshot",
  "message": { ... },
  "uuid": "unique-message-uuid-v4",
  "timestamp": "ISO-8601 timestamp"
}

Field Reference

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Message type identifier |
| uuid | string (uuid) | Yes* | Unique identifier for this event |
| parentUuid | string (uuid) or null | Yes | Links to parent message (null for root) |
| timestamp | string (ISO-8601) | Yes* | When event occurred (UTC) |
| sessionId | string (uuid) | Yes | Session identifier |
| version | string (semver) | Yes | Claude Code version (e.g., "2.1.20") |
| cwd | string (path) | Yes | Working directory at event time |
| gitBranch | string | No | Git branch name (empty if not in repo) |
| isSidechain | boolean | Yes | true for subagent sessions |
| userType | string | Yes | Always "external" for user sessions |
| message | object | Conditional | Message content (user/assistant types) |
| agentId | string | Conditional | Agent identifier (subagent sessions only) |

*May be null in metadata-only entries like file-history-snapshot
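The required/nullable rules in the field reference can be checked mechanically. The sketch below is one possible reading of that table. The function name `missing_envelope_fields` is hypothetical, and treating `file-history-snapshot` as the only metadata-only type is an assumption (the footnote says "like", so others may exist).

```python
# Fields the table marks "Yes": the key must be present (parentUuid may still be null).
REQUIRED_FIELDS = (
    "type", "parentUuid", "sessionId", "version", "cwd", "isSidechain", "userType",
)
# Fields marked "Yes*": required, but allowed to be null on metadata-only entries.
NULLABLE_ON_METADATA = ("uuid", "timestamp")
# Assumption: only this type is treated as metadata-only in this sketch.
METADATA_ONLY_TYPES = {"file-history-snapshot"}


def missing_envelope_fields(event):
    """Return the envelope field names that are absent or unexpectedly null."""
    missing = [name for name in REQUIRED_FIELDS if name not in event]
    if event.get("type") not in METADATA_ONLY_TYPES:
        missing += [name for name in NULLABLE_ON_METADATA if event.get(name) is None]
    return missing
```

A parser would typically log the returned names and keep going rather than reject the line, since the format evolves between versions.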

Content Structure

User Message Content

User messages have message.content as either:

String (direct input):

{
  "message": {
    "role": "user",
    "content": "Your question or instruction"
  }
}

Array (tool results):

{
  "message": {
    "role": "user",
    "content": [
      {
        "type": "tool_result",
        "tool_use_id": "toolu_01XYZ",
        "content": "Tool output text"
      }
    ]
  }
}
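Because user content has two shapes, consumers usually normalize it before use. A minimal sketch, assuming a hypothetical helper `split_user_content` that returns the typed text and any tool results separately; handling of `text` blocks inside the array form is an assumption based on the content block types described in this document:

```python
def split_user_content(message):
    """Return (text, tool_results) for a user `message` dict."""
    content = message.get("content")
    if isinstance(content, str):
        # String form: direct user input.
        return content, []
    if not isinstance(content, list):
        # Missing or unexpected content: degrade to empty values.
        return "", []
    blocks = [block for block in content if isinstance(block, dict)]
    tool_results = [b for b in blocks if b.get("type") == "tool_result"]
    # Assumption: text blocks may also appear alongside tool results.
    text = "".join(
        b["text"] for b in blocks
        if b.get("type") == "text" and isinstance(b.get("text"), str)
    )
    return text, tool_results
```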

Assistant Message Content

Assistant messages always have message.content as an array:

{
  "message": {
    "role": "assistant",
    "type": "message",
    "model": "claude-opus-4-5-20251101",
    "id": "msg_bdrk_01Abc123",
    "content": [
      {"type": "thinking", "thinking": "..."},
      {"type": "text", "text": "..."},
      {"type": "tool_use", "id": "toolu_01XYZ", "name": "Read", "input": {...}}
    ],
    "stop_reason": "end_turn",
    "stop_sequence": null,
    "usage": {...}
  }
}
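To illustrate walking the content array, here is a small sketch that pulls the visible text and tool calls out of an assistant message. `summarize_assistant` and the shape of its return value are hypothetical choices for this example, not an AMC API.

```python
def summarize_assistant(message):
    """Collect text, tool calls, and a thinking flag from an assistant `message` dict."""
    text_parts, tool_calls, has_thinking = [], [], False
    for block in message.get("content") or []:
        if not isinstance(block, dict):
            continue
        kind = block.get("type")
        if kind == "text":
            text_parts.append(block.get("text") or "")
        elif kind == "tool_use":
            tool_calls.append({"id": block.get("id"), "name": block.get("name")})
        elif kind == "thinking":
            has_thinking = True
        # Any other block type is skipped so new types do not break parsing.
    return {
        "model": message.get("model"),
        "stop_reason": message.get("stop_reason"),
        "text": "\n".join(text_parts),
        "tool_calls": tool_calls,
        "has_thinking": has_thinking,
    }
```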

Content Block Types

Text Block

{
  "type": "text",
  "text": "Response text content"
}

Thinking Block

{
  "type": "thinking",
  "thinking": "Internal reasoning (extended thinking mode)",
  "signature": "base64-signature (optional)"
}

Tool Use Block

{
  "type": "tool_use",
  "id": "toolu_01Abc123XYZ",
  "name": "ToolName",
  "input": {
    "param1": "value1",
    "param2": 123
  }
}

Tool Result Block

{
  "type": "tool_result",
  "tool_use_id": "toolu_01Abc123XYZ",
  "content": "Result text or structured output",
  "is_error": false
}
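The `id` on a tool use block and the `tool_use_id` on a tool result block are what tie a call to its output. The following is a minimal sketch of that join over already-parsed events; `pair_tool_calls` is a hypothetical name, and this is not necessarily the pairing algorithm AMC itself uses.

```python
def pair_tool_calls(events):
    """Return [(tool_use_block, tool_result_block_or_None), ...] in call order."""
    calls = {}    # tool_use id -> tool_use block (dicts keep insertion order)
    results = {}  # tool_use_id -> tool_result block
    for event in events:
        content = (event.get("message") or {}).get("content")
        if not isinstance(content, list):
            continue  # string content cannot hold tool blocks
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "tool_use":
                calls[block.get("id")] = block
            elif block.get("type") == "tool_result":
                results[block.get("tool_use_id")] = block
    # A call with no recorded result is paired with None.
    return [(call, results.get(call_id)) for call_id, call in calls.items()]
```

Calls paired with `None` are the "missing result" case, for example when a session was interrupted before the tool finished.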

Usage Object

Token consumption reported in assistant messages:

{
  "usage": {
    "input_tokens": 1000,
    "output_tokens": 500,
    "cache_creation_input_tokens": 200,
    "cache_read_input_tokens": 400,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 200,
      "ephemeral_1h_input_tokens": 0
    },
    "service_tier": "standard"
  }
}

| Field | Type | Description |
| --- | --- | --- |
| input_tokens | int | Input tokens consumed |
| output_tokens | int | Output tokens generated |
| cache_creation_input_tokens | int | Tokens used to create cache |
| cache_read_input_tokens | int | Tokens read from cache |
| service_tier | string | API tier ("standard", etc.) |
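A sketch of totalling a usage object follows. It assumes the three input-side counters are disjoint and therefore additive, and that the nested `cache_creation` object is only a breakdown of `cache_creation_input_tokens` (200 = 200 + 0 in the example), so it is not added again. `total_tokens` is a hypothetical helper name.

```python
INPUT_FIELDS = ("input_tokens", "cache_creation_input_tokens", "cache_read_input_tokens")


def _as_count(value):
    """Coerce a usage value to an int count, treating anything unexpected as 0."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return 0
    return value


def total_tokens(usage):
    """Sum input-side and output-side token counts from a `usage` dict."""
    usage = usage or {}
    input_total = sum(_as_count(usage.get(name)) for name in INPUT_FIELDS)
    output_total = _as_count(usage.get("output_tokens"))
    return {"input": input_total, "output": output_total, "total": input_total + output_total}
```

For the example object above this gives 1600 input tokens (1000 + 200 + 400), 500 output tokens, and 2100 in total.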

Model Identifiers

Common model names in message.model:

| Model | Identifier |
| --- | --- |
| Claude Opus 4.5 | claude-opus-4-5-20251101 |
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 |
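The identifiers listed here follow a `claude-<family>-<major>-<minor>-<date>` pattern, which can be split for grouping or display. This is a sketch under that assumption only: older or differently shaped identifiers will not match and yield `None`, and `parse_model_id` is a hypothetical helper.

```python
import re

# Assumption: family is lowercase letters, followed by two numeric version parts
# and an 8-digit date suffix, as in the identifiers listed in this section.
MODEL_ID = re.compile(r"claude-(?P<family>[a-z]+)-(?P<major>\d+)-(?P<minor>\d+)-(?P<date>\d{8})")


def parse_model_id(model_id):
    """Split a model identifier into family, version, and date; None if it does not match."""
    if not isinstance(model_id, str):
        return None
    match = MODEL_ID.fullmatch(model_id)
    if match is None:
        return None
    return {
        "family": match["family"],
        "version": f"{match['major']}.{match['minor']}",
        "date": match["date"],
    }
```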

Version History

| Version | Changes |
| --- | --- |
| 2.1.20 | Extended thinking, permission modes, todos |
| 2.1.17 | Subagent support with agentId |
| 2.1.x | Progress events, hook metadata |
| 2.0.x | Basic message/tool_use/tool_result |
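If a parser wants to gate behavior on the `version` field, comparing versions as integer tuples avoids the string-comparison trap where "2.1.9" sorts after "2.1.20". A minimal sketch; the 2.1.17 threshold is taken from the table above, and `parse_version` and `may_carry_agent_id` are hypothetical names.

```python
def parse_version(version):
    """Turn "2.1.20" into (2, 1, 20); return None for anything unparseable."""
    if not isinstance(version, str):
        return None
    parts = version.split(".")
    if len(parts) != 3:
        return None
    try:
        return tuple(int(part) for part in parts)
    except ValueError:
        return None


# Threshold from the version table: subagent support with agentId.
SUBAGENT_MIN_VERSION = (2, 1, 17)


def may_carry_agent_id(event):
    """True if the event's Claude Code version is new enough to include agentId."""
    version = parse_version(event.get("version"))
    return version is not None and version >= SUBAGENT_MIN_VERSION
```

Version gating is best treated as a hint; checking for the field itself remains the safer test.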

Conversation Graph

Messages form a DAG (directed acyclic graph) via parent-child relationships:

User message (uuid: A, parentUuid: null)
├── Assistant (uuid: B, parentUuid: A)
│   └── User message (uuid: E, parentUuid: B)
│       └── Assistant (uuid: F, parentUuid: E)
├── Progress: Tool (uuid: C, parentUuid: A)
└── Progress: Hook (uuid: D, parentUuid: A)
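The `parentUuid` links can be indexed in one pass and then walked in either direction. The sketch below builds a parent-to-children index and reconstructs the chain from the root to a given message; the function names are hypothetical, and the `seen` set is a defensive guard in case a malformed log contains a cycle.

```python
from collections import defaultdict


def build_children(events):
    """Map each parentUuid (None for roots) to the uuids of its children, in file order."""
    children = defaultdict(list)
    for event in events:
        uuid = event.get("uuid")
        if uuid is not None:
            children[event.get("parentUuid")].append(uuid)
    return children


def ancestry(events, leaf_uuid):
    """Return the uuids from the root down to `leaf_uuid`, following parentUuid links."""
    parents = {e["uuid"]: e.get("parentUuid") for e in events if e.get("uuid") is not None}
    chain, seen = [], set()
    current = leaf_uuid
    # Stop at the root (None), at an unknown uuid (orphan), or on a repeated uuid.
    while current is not None and current in parents and current not in seen:
        seen.add(current)
        chain.append(current)
        current = parents[current]
    chain.reverse()
    return chain
```

With the uuids from the diagram, `ancestry(events, "F")` walks F, E, B, A and returns them root-first.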

Parsing Recommendations

  1. Line-by-line: Stream the file one line at a time instead of loading it whole into memory
  2. Skip invalid lines: Wrap each JSON parse in try/catch (try/except in Python) and move on to the next line
  3. Handle missing fields: Check existence before access
  4. Ignore unknown types: The format evolves with new event types
  5. Check content type: User content can be a string OR an array
  6. Sum token variants: Input tokens are split across input_tokens, cache_creation_input_tokens, and cache_read_input_tokens
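The first few recommendations can be combined into one small generator. This is a sketch rather than AMC's actual reader: `iter_events` is a hypothetical name, and it accepts any iterable of lines so that it works equally on an open file object or a list of strings.

```python
import json


def iter_events(lines):
    """Yield one dict per valid JSONL line, skipping blanks and malformed entries."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # A partially written or corrupted line should not abort the whole file.
            continue
        if isinstance(event, dict):
            # Non-object JSON values (arrays, numbers) are not valid events.
            yield event


def count_by_type(lines):
    """Tally events by their `type`, keeping unknown types rather than failing on them."""
    counts = {}
    for event in iter_events(lines):
        kind = event.get("type", "<missing>")
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Typical use is `with open(path, encoding="utf-8") as handle: counts = count_by_type(handle)`, which reads the file lazily because file objects iterate line by line.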