Implement the pipeline layer that orchestrates discovery, parsing,
caching, and aggregation:
- pipeline/loader.go: Load() discovers session files via ScanDir,
optionally filters out subagent files, then parses all files in
parallel using a bounded worker pool sized to GOMAXPROCS. Workers
read from a pre-filled channel (no contention on dispatch) and
report progress via an atomic counter and callback. LoadResult
tracks total files, parsed files, parse errors, and file errors.
- pipeline/aggregator.go: Five aggregation functions, all operating
on time-filtered session slices:
* Aggregate: computes SummaryStats across all sessions — total
tokens (5 types), estimated cost, cache savings (summed per-model
via config.CalculateCacheSavings), cache hit rate, and per-active-
day rates (cost, tokens, sessions, prompts, minutes).
* AggregateDays: groups sessions by local calendar date, sorted
most-recent-first.
* AggregateModels: groups by normalized model name with share
percentages, sorted by cost descending.
* AggregateProjects: groups by project name, sorted by cost.
* AggregateHourly: distributes prompt/session/token counts across
24 hour buckets (attributed to session start hour).
Also provides FilterByTime, FilterByProject, FilterByModel with
case-insensitive substring matching.
- pipeline/incremental.go: LoadWithCache() implements the incremental
loading strategy — compares discovered files against the cache's
file_tracker (mtime_ns + size), loads unchanged sessions from
SQLite, and only reparses files that changed. Reparsed results
are immediately saved back to cache. CacheDir/CachePath follow
XDG_CACHE_HOME convention (~/.cache/cburn/metrics.db).
- pipeline/bench_test.go: Benchmarks for ScanDir, ParseFile (worst-
case largest file), full Load, and LoadWithCache to measure the
incremental cache speedup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement the caching layer that enables fast subsequent runs by
persisting parsed session data in SQLite:
- store/schema.go: DDL for three tables — sessions (primary metrics
with file_mtime_ns/file_size for change detection), session_models
(per-model breakdown, FK cascade on delete), and file_tracker
(path -> mtime+size mapping for cache invalidation). Indexes on
start_time and project for efficient time-range and filter queries.
- store/cache.go: Cache struct wrapping database/sql with WAL mode
and synchronous=normal for concurrent read safety and write
performance. Key operations:
* Open: creates the cache directory, opens/creates the database,
and ensures the schema is applied (idempotent via IF NOT EXISTS).
* GetTrackedFiles: returns the mtime/size map used by the pipeline
to determine which files need reparsing.
* SaveSession: transactional upsert of session stats + model
breakdown + file tracker entry. Uses INSERT OR REPLACE to handle
both new files and files that changed since last parse.
* LoadAllSessions: batch-loads all cached sessions with a two-pass
strategy — first loads session rows, then batch-loads model data
with an index map for O(1) join, avoiding N+1 queries.
Uses modernc.org/sqlite (pure-Go, no CGO) for zero-dependency
cross-platform builds.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement the bottom of the data pipeline — discovery and parsing of
Claude Code session files:
- source/types.go: Raw JSON deserialization types (RawEntry,
RawMessage, RawUsage, CacheCreation) matching the Claude Code
JSONL schema. DiscoveredFile carries file metadata including
decoded project name, session ID, and subagent relationship info.
- source/scanner.go: ScanDir walks ~/.claude/projects/ to discover
all .jsonl session files. Detects subagent files by the
<project>/<session>/subagents/agent-<id>.jsonl path pattern and
links them to parent sessions. decodeProjectName reverses Claude
Code's path-encoding convention (/-delimited path segments joined
with hyphens) by scanning for known parent markers (projects,
repos, src, code, workspace, dev) and extracting the project name
after the last marker.
- source/parser.go: ParseFile processes a single JSONL session file.
Uses a hybrid parsing strategy for performance:
* "user" and "system" entries: byte-level field extraction for
timestamps, cwd, and turn_duration (avoids JSON allocation).
extractTopLevelType tracks brace depth and string boundaries to
find only the top-level "type" field, early-exiting ~400 bytes
in for O(1) per line cost regardless of line length.
* "assistant" entries: full JSON unmarshal to extract token usage,
model name, and cost data.
Deduplicates API calls by message.id (keeping the last entry per
ID, which holds the final billed usage). Computes per-model cost
breakdown using config.CalculateCost and aggregates cache hit rate.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement the configuration layer that supports the entire cost
estimation pipeline:
- config/config.go: TOML-based config at ~/.config/cburn/config.toml
(XDG-compliant) with sections for general preferences, Admin API
credentials, budget tracking, appearance, and per-model pricing
overrides. Supports Load/Save with sensible defaults (30-day
window, subagents included, Flexoki Dark theme). Admin API key
resolution checks ANTHROPIC_ADMIN_KEY env var first, falling back
to the config file.
- config/pricing.go: Hardcoded pricing table for 8 Claude model
variants (Opus 4/4.1/4.5/4.6, Sonnet 4/4.5/4.6, Haiku 3.5/4.5)
with per-million-token rates across 5 billing dimensions: input,
output, cache_write_5m, cache_write_1h, cache_read, plus long-
context overrides (>200K tokens). NormalizeModelName strips date
suffixes (e.g., "claude-opus-4-5-20251101" -> "claude-opus-4-5").
CalculateCost and CalculateCacheSavings compute per-call USD costs
by multiplying token counts against the pricing table.
- config/plan.go: DetectPlan reads ~/.claude/.claude.json to
determine the billing plan type. Maps "stripe_subscription" to
the Max plan ($200/mo ceiling), everything else to Pro ($100/mo).
Used by the budget tab for plan-relative spend visualization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Define the core data structures that flow through the entire pipeline:
- model/session.go: SessionStats (per-session aggregates including
token counts across 5 categories — input, output, cache_write_5m,
cache_write_1h, cache_read), APICall (deduplicated by message.id,
keyed to the final billed usage), and ModelUsage (per-model
breakdown within a session). Tracks subagent relationships via
IsSubagent/ParentSession fields.
- model/metrics.go: Higher-order aggregate types — SummaryStats
(top-level totals with per-active-day rates for cost, tokens,
sessions, and minutes), DailyStats/HourlyStats/WeeklyStats
(time-bucketed views), ModelStats (cross-session model comparison
with share percentages), ProjectStats (per-project ranking), and
PeriodComparison (current vs previous period for delta display).
- model/budget.go: BudgetStats with plan ceiling, custom budget,
burn rate, and projected monthly spend for the budget tab.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>