Commit Graph

10 Commits

Author SHA1 Message Date
teernisse
c20652924d Extract shared JSONL parsing helpers for parser parity
Introduce three shared helpers in session-parser.ts that both the full
parser and the lightweight metadata extractor can use:

- forEachJsonlLine(content, onLine): Iterates JSONL lines with consistent
  malformed-line handling. Skips invalid JSON lines identically to how
  parseSessionContent handles them. Returns parse error count for diagnostics.

- countMessagesForLine(parsed): Returns the number of messages a single
  JSONL line expands into, using the same classification rules as the
  full parser. User arrays expand tool_result and text blocks; assistant
  arrays expand thinking, text, and tool_use.

- classifyLine(parsed): Classifies a parsed line into one of 8 types
  (user, assistant, system, progress, summary, file_snapshot, queue, other).

The internal extractMessages() function now uses these shared helpers,
ensuring no behavior change while enabling the upcoming metadata extraction
service to reuse the same logic. This guarantees list counts can never drift
from detail-view counts, regardless of future parser changes.

Test coverage includes:
- Malformed line handling parity with full parser
- Parse error counting for truncated/corrupted files
- countMessagesForLine output matches extractMessages().length
- Edge cases: empty files, progress events, array content expansion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 00:52:41 -05:00
3fe8d7d3b5 Add comprehensive test suite for progress tracking system
Test fixture updates:
- Add toolUseId fields (toolu_read1, toolu_edit1) to tool_use blocks
- Add parentToolUseID-linked progress events for read and edit tools
- Add orphaned SessionStart progress event (no parent)
- Update tool_result references to match new toolUseId values
- Add bash_progress and mcp_progress subtypes for subtype derivation

session-parser tests (7 new):
- toolUseId extraction from tool_use blocks with and without id field
- parentToolUseId and progressSubtype extraction from hook_progress
- Subtype derivation for bash_progress, mcp_progress, agent_progress
- Fallback to "hook" for unknown data types
- Undefined parentToolUseId when field is absent

progress-grouper tests (7 new):
- Partition parented progress into toolProgress map
- Remove parented progress from filtered messages array
- Keep orphaned progress (no parentToolUseId) in main stream
- Keep progress with invalid parentToolUseId (no matching tool_call)
- Empty input handling
- Sort each group by rawIndex
- Multiple tool_call parents tracked independently

agent-progress-parser tests (full suite):
- Parse user text events with prompt/agentId metadata extraction
- Parse tool_use blocks into AgentToolCall events
- Parse tool_result blocks with content extraction
- Parse text content as text_response with line counting
- Handle multiple content blocks in single turn
- Post-pass tool_result→tool_call linking (sourceTool, language)
- Empty input and malformed JSON → raw_content fallback
- stripLineNumbers for cat-n prefixed output
- summarizeToolCall for Read, Grep, Glob, Bash, Task, WarpGrep, etc.

ProgressBadge component tests:
- Collapsed state shows pill counts, hides content
- Expanded state shows all event content via markdown
- Subtype counting accuracy
- Agent-only events route to AgentProgressView

AgentProgressView component tests:
- Prompt banner rendering with truncation
- Agent ID and turn count display
- Summary rows with timestamps and tool names
- Click-to-expand drill-down content

html-exporter tests (8 new):
- Collapsible rendering for thinking, tool_call, tool_result
- Toggle button and JavaScript inclusion
- Non-collapsible messages lack collapse attributes
- Diff content detection and highlighting
- Progress badge rendering with toolProgress data

filters tests (2 new):
- hook_progress included/excluded by category toggle

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 23:05:01 -05:00
6681f07fc0 Add countSensitiveMessages for pre-scan sensitive content detection
Export a new countSensitiveMessages() function that returns how many
messages in an array contain at least one sensitive pattern match.
Checks both content and toolInput fields, counting each message at
most once regardless of how many matches it contains.

Tests verify zero counts for clean messages, correct counting with
mixed sensitive/clean messages, and the single-count-per-message
invariant when multiple secrets appear in one message.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 13:35:15 -05:00
4027dd65be Persist filter and auto-redact preferences to localStorage
Category toggles and the auto-redact checkbox now survive page
reloads. On mount, useFilters reads from localStorage keys
session-viewer:enabledCategories and session-viewer:autoRedact,
falling back to defaults when storage is empty, corrupted, or
contains invalid category names. Each state change writes back
to localStorage in a useEffect.

Tests cover round-trip persistence, invalid data recovery, corrupted
JSON fallback, and the boolean coercion for auto-redact.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 13:35:08 -05:00
54f909c80c Revise default hidden categories to reduce noise in session view
Change the default-hidden message categories from [thinking,
hook_progress] to [tool_result, system_message, hook_progress,
file_snapshot]. This hides the verbose machine-oriented categories
by default while keeping thinking blocks visible — they contain
useful reasoning context that users typically want to see.

Also rename the "summary" category label from "Summaries" to
"Compactions" to better reflect what Claude's summary messages
actually represent (context-window compaction artifacts).

Tests updated to match the new defaults: the filter test now
asserts that tool_result, system_message, hook_progress, and
file_snapshot are all excluded, producing 5 visible messages
instead of the previous 7.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 10:42:21 -05:00
69857fa825 Fix session discovery tests to use dynamic paths and add containment test
Replace hardcoded absolute paths in test assertions with dynamically
constructed paths matching the temp directory. This makes tests portable
across environments where path.resolve() produces different results.

Add test verifying that absolute paths pointing outside the projects
directory (e.g. /etc/shadow.jsonl) are rejected by the discovery filter.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 01:10:55 -05:00
0e89924685 Redesign HTML exporter with dark theme, timestamps, and performance fixes
Visual overhaul of exported HTML to match the new client dark design:
- Replace category-specific CSS classes with inline border/dot/text styles
  from a CATEGORY_STYLES map matching client-side colors
- Add message header layout with category dot, label, and timestamp
- Add Inter font family, refined prose typography, and proper code styling
- Add print-friendly media query
- Redesign redacted divider with SVG eye-slash icon and red accent
- Add SVG icons to session header metadata (project, date, message count)
- Fix singular/plural for '1 message' vs 'N messages'

Performance: Skip markdown parsing for hook_progress, tool_result, and
file_snapshot categories (structured data). Render as preformatted text
instead, avoiding expensive marked.parse() on large JSON blobs (~300ms each).

Replace local escapeHtml with shared/escape-html module. Add formatTimestamp
helper. Add cast safety comment for marked.parse() sync usage.

Update test to verify singular message count ('1 message' not '1 messages').

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 01:10:35 -05:00
0e5a36f0d1 Fix sensitive redactor keyword matching for case-insensitive patterns
The keyword pre-filter used case-sensitive string matching for all patterns,
but several regex patterns use the /i flag (e.g. generic_api_key). This meant
inputs like 'ApiKey = "secret"' would skip the keyword check for 'api_key'
and miss the redaction entirely.

Changes:
- Add caseInsensitive parameter to hasKeyword() that lowercases both content
  and keywords before comparison
- Detect /i flag on pattern regex and pass it through automatically
- Narrow IP address keywords from ["."] to ["0.", "1.", ..., "9."] to reduce
  false-positive regex invocations on content containing periods
- Fix email regex character class [A-Z|a-z] → [A-Za-z] (the pipe was literal)
- Add clarifying comment on url_with_creds pattern
- Add test cases for mixed-case and UPPER_CASE key assignments
- Relax SECRET_KEY test assertion to accept either redaction label

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 01:09:11 -05:00
eb8001dbf1 Harden session discovery with path validation and parallel I/O
Security: Reject session paths containing '..' traversal segments or
non-.jsonl extensions before resolving them. This prevents a malicious
sessions-index.json from tricking the viewer into reading arbitrary files.

Performance: Process all project directories concurrently with Promise.all
instead of sequentially awaiting each one. Each directory's stat + readFile
is independent I/O that benefits from parallelism.

Add test case verifying that traversal paths and non-JSONL paths are rejected
while valid paths pass through.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 01:08:57 -05:00
a1d54e84c7 Add test suite: unit tests for parser, discovery, redactor, exporter, and filters
Comprehensive test coverage for all server services and shared modules:

tests/unit/session-parser.test.ts (16 tests):
- Parses every message type: user (string + array content), assistant
  (text, thinking, tool_use blocks), progress/hook events (from data
  field), file-history-snapshot, summary (from summary field)
- Verifies system metadata (turn_duration) and queue-operation lines
  are silently skipped
- Detects <system-reminder> tags and reclassifies as system_message
- Resilience: skips malformed JSONL lines, returns empty for empty
  files, preserves UUIDs from source
- Integration: parses full sample-session.jsonl fixture verifying all
  9 categories are represented, handles edge-cases.jsonl with corrupt
  lines

tests/unit/session-discovery.test.ts (6 tests):
- Discovers sessions from {version, entries} index format
- Handles legacy raw array format
- Gracefully returns empty for missing directories and corrupt JSON
- Aggregates across multiple project directories
- Uses fullPath from index entries when available

tests/unit/sensitive-redactor.test.ts (40+ tests):
- Tier 1 secrets: AWS access keys (AKIA/ASIA), Bedrock keys, GitHub
  PATs (ghp_, github_pat_, ghu_, ghs_), GitLab (glpat-, glrt-),
  OpenAI (sk-proj-, legacy sk-), Anthropic (api03, admin01),
  HuggingFace, Perplexity, Stripe (sk_live/test/prod, rk_*), Slack
  (bot token, webhook URL), SendGrid, GCP, Heroku, npm, PyPI, Sentry,
  JWT, PEM private keys (RSA + generic), generic API key assignments
- Tier 2 PII: home directory paths (Linux/macOS/Windows), connection
  strings (PostgreSQL/MongoDB/Redis), URLs with embedded credentials,
  email addresses, IPv4 addresses, Bearer tokens, env var secrets
- False positive resistance: normal code, markdown, short strings,
  non-home file paths, version numbers
- Allowlists: example.com/test.com emails, noreply@anthropic.com,
  127.0.0.1, 0.0.0.0, RFC 5737 documentation IPs
- Edge cases: empty strings, multiple secrets, category tracking
- redactMessage: preserves uuid/category/timestamp, redacts content
  and toolInput, leaves toolName unchanged, doesn't mutate original

tests/unit/html-exporter.test.ts (8 tests):
- Valid DOCTYPE HTML output with no external URL references
- Includes visible messages, excludes filtered and redacted ones
- Inserts redacted dividers at correct positions
- Renders markdown to HTML with syntax highlighting CSS
- Includes session metadata header

tests/unit/filters.test.ts (12 tests):
- Category filtering: include/exclude by enabled set
- Default state: thinking and hook_progress hidden
- All-on/all-off edge cases
- Redacted UUID exclusion
- Search match counting with empty query edge case
- Preset validation (conversation, debug)
- Category count computation with and without redactions

src/client/components/SessionList.test.tsx (7 tests):
- Two-phase navigation: project list → session list → back
- Auto-select project when selectedId matches
- Loading and empty states
- onSelect callback on session click

tests/fixtures/:
- sample-session.jsonl: representative session with all message types
- edge-cases.jsonl: corrupt lines interspersed with valid messages
- sessions-index.json: sample index file for discovery tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 22:57:02 -05:00