Implement JSONL-first session discovery with tiered lookup
Rewrite session discovery to be filesystem-first, addressing the widespread bug where Claude Code's sessions-index.json files are unreliable (87 MB of unindexed sessions, 17% loss rate across all projects). Architecture: Three-tier metadata lookup Tier 1 - Index validation (instant): - Parse sessions-index.json into Map<sessionId, IndexEntry> - Validate entry.modified against actual file stat.mtimeMs - Use 1s tolerance to account for ISO string → filesystem mtime rounding - Trust content fields only (messageCount, summary, firstPrompt) - Timestamps always come from fs.stat, never from index Tier 2 - Persistent cache hit (instant): - Check MetadataCache by (filePath, mtimeMs, size) - If match, use cached metadata - Survives server restarts Tier 3 - Full JSONL parse (~5-50ms/file): - Call extractSessionMetadata() with shared parser helpers - Cache result for future lookups Key correctness guarantees: - All .jsonl files appear regardless of index state - SessionEntry timestamps always from fs.stat (list ordering never stale) - Message counts exact (shared helpers ensure parser parity) - Duration computed from JSONL timestamps, not index Performance: - Bounded concurrency: 32 concurrent operations per project - mapWithLimit() prevents file handle exhaustion - Warm start <1s (stat all files, in-memory lookups) - Cold start ~3-5s for 3,103 files (stat + parse phases) TOCTOU handling: - Files that disappear between readdir and stat: silently skipped - Files that disappear between stat and read: silently skipped - File actively being written: partial parse handled gracefully Include PRD document that drove this implementation with detailed requirements, edge cases, and verification plan. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
313
docs/prd-jsonl-first-discovery.md
Normal file
313
docs/prd-jsonl-first-discovery.md
Normal file
@@ -0,0 +1,313 @@
|
||||
# PRD: JSONL-First Session Discovery
|
||||
|
||||
## Status: Ready for Implementation
|
||||
|
||||
## Context
|
||||
|
||||
Session viewer relies exclusively on `sessions-index.json` files that Claude Code maintains. These indexes are unreliable — a known, widespread bug with multiple open GitHub issues ([#22030](https://github.com/anthropics/claude-code/issues/22030), [#21610](https://github.com/anthropics/claude-code/issues/21610), [#18619](https://github.com/anthropics/claude-code/issues/18619), [#22114](https://github.com/anthropics/claude-code/issues/22114)).
|
||||
|
||||
### Root cause
|
||||
|
||||
Claude Code updates `sessions-index.json` only at session end. If a session crashes, is killed, or is abandoned, the JSONL file is written but the index is never updated. Multiple concurrent Claude instances can also corrupt the index (last-write-wins on a single JSON file). There is no reindex command and no background repair process.
|
||||
|
||||
### Impact on this system
|
||||
|
||||
- **542 unindexed JSONL files** across all projects (87 MB total)
|
||||
- **48 unindexed in last 7 days** (30.8 MB)
|
||||
- **13 projects** have JSONL session files but no index at all
|
||||
- **Zero sessions from today** (Feb 4, 2026) appear in any index
|
||||
- **3,103 total JSONL files** vs **2,563 indexed entries** = 17% loss rate
|
||||
|
||||
### Key insight
|
||||
|
||||
The `.jsonl` files are the source of truth. The index is an unreliable convenience cache. The session viewer must treat it that way.
|
||||
|
||||
## Requirements
|
||||
|
||||
### Must have
|
||||
|
||||
1. **All sessions with a `.jsonl` file must appear in the session list**, regardless of whether they're in `sessions-index.json`
|
||||
2. **Exact message counts** — no estimates, no approximations. Contract: Tier 3 extraction MUST reuse the same line-classification logic as `parseSessionContent` (shared helper), so list counts cannot drift from detail parsing.
|
||||
3. **Performance**: Warm start (cache exists, few changes) must complete under 1 second. Cold start (no cache) is acceptable up to 5 seconds for first request
|
||||
4. **Correctness over speed** — never show stale metadata if the file has been modified
|
||||
5. **Zero config** — works out of the box with no setup or external dependencies
|
||||
|
||||
### Should have
|
||||
|
||||
6. Session `summary` extracted from the last `type="summary"` line in the JSONL
|
||||
7. Session `firstPrompt` extracted from the first non-system-reminder user message
|
||||
8. Session `duration` MUST be derivable without relying on `sessions-index.json` — extract first and last timestamps from JSONL when index is missing or stale
|
||||
9. Persistent metadata cache survives server restarts
|
||||
|
||||
### Won't have (this iteration)
|
||||
|
||||
- Real-time push updates (sessions appearing in UI without refresh)
|
||||
- Background file watcher daemon
|
||||
- Integration with `cass` as a search/indexing backend
|
||||
- Rebuilding Claude Code's `sessions-index.json`
|
||||
|
||||
## Technical Design
|
||||
|
||||
### Architecture: Filesystem-primary with tiered metadata lookup
|
||||
|
||||
```
|
||||
discoverSessions()
|
||||
|
|
||||
+-- For each project directory under ~/.claude/projects/:
|
||||
| |
|
||||
| +-- fs.readdir() --> list all *.jsonl files
|
||||
| +-- Read sessions-index.json (optional, used as pre-populated cache)
|
||||
| |
|
||||
| +-- Batch stat all .jsonl files (bounded concurrency)
|
||||
| | Files that disappeared between readdir and stat are silently skipped (TOCTOU race)
|
||||
| |
|
||||
| +-- For each .jsonl file:
|
||||
| | |
|
||||
| | +-- Tier 1: Check index
|
||||
| | | Entry exists AND normalize(index.modified) matches stat mtime?
|
||||
| | | --> Use index content data (messageCount, summary, firstPrompt)
|
||||
| | | --> Use stat-derived timestamps for created/modified (always)
|
||||
| | |
|
||||
| | +-- Tier 2: Check persistent metadata cache
|
||||
| | | path + mtimeMs + size match?
|
||||
| | | --> Use cached metadata (fast path)
|
||||
| | |
|
||||
| | +-- Tier 3: Extract metadata from JSONL content
|
||||
| | Read file, lightweight parse using shared line iterator + counting helper
|
||||
| | --> Cache result for future lookups
|
||||
| |
|
||||
| +-- Collect SessionEntry[] for this project
|
||||
|
|
||||
+-- Merge all projects
|
||||
+-- Sort by modified (descending) — always stat-derived, never index-derived
|
||||
+-- Async: persist metadata cache to disk (if dirty)
|
||||
```
|
||||
|
||||
### Tier explanation
|
||||
|
||||
| Tier | Source | Speed | When used | Trusts from source |
|
||||
|------|--------|-------|-----------|--------------------|
|
||||
| 1 | `sessions-index.json` | Instant (in-memory lookup) | Index exists, entry present, `normalize(modified)` matches actual file mtime | `messageCount`, `summary`, `firstPrompt` only. Timestamps always from stat. |
|
||||
| 2 | Persistent metadata cache | Instant (in-memory lookup) | Index missing/stale, but file hasn't changed since last extraction (mtimeMs + size match) | All cached fields |
|
||||
| 3 | JSONL file parse | ~5-50ms/file | New or modified file, not in any cache | Extracted fresh |
|
||||
|
||||
Tier 1 reuses Claude's index when it's valid — no wasted work. The index `modified` field (ISO string) is normalized to milliseconds and compared against the real file `stat.mtimeMs`. If the index is missing or corrupt, discovery continues with Tier 2 and 3 without error. Even when Tier 1 is valid, `created` and `modified` timestamps on the `SessionEntry` always come from `fs.stat` — the index is a content cache only.
|
||||
|
||||
### Tier 1: Index validation details
|
||||
|
||||
The actual `sessions-index.json` format has `created` and `modified` as ISO strings, not a `fileMtime` field. Tier 1 validation must:
|
||||
|
||||
1. Map JSONL filename to sessionId: `sessionId := path.basename(jsonlFile, '.jsonl')`
|
||||
2. Look up `sessionId` in the index `Map<string, IndexEntry>`
|
||||
3. Compare `new Date(entry.modified).getTime()` against `stat.mtimeMs` — reject if they differ by more than 1000ms (accounts for ISO string → filesystem mtime rounding)
|
||||
4. If the index entry has no `modified` field, skip Tier 1 (fall through to Tier 2)
|
||||
5. When Tier 1 is valid, trust only content fields (`messageCount`, `summary`, `firstPrompt`). The `created`/`modified` on the resulting `SessionEntry` must come from `stat.birthtimeMs`/`stat.mtimeMs` respectively — this ensures list ordering is never stale even within the 1s mtime tolerance window.
|
||||
|
||||
### Shared line-iteration and counting (parser parity contract)
|
||||
|
||||
The biggest correctness risk in this design is duplicating any JSONL processing logic. The real parser in `session-parser.ts` has non-trivial expansion rules:
|
||||
|
||||
- User array content: expands `tool_result` and `text` blocks into separate messages
|
||||
- `system-reminder` detection reclassifies user `text` blocks as `system_message`
|
||||
- Assistant array content: `thinking`, `text`, and `tool_use` each become separate messages
|
||||
- `progress`, `file-history-snapshot`, `summary` → 1 message each
|
||||
- `system`, `queue-operation` → 0 (skipped)
|
||||
|
||||
It also has error-handling behavior: malformed/truncated JSON lines are skipped (common when sessions crash mid-write). If the metadata extractor and the full parser handle malformed lines differently, counts will drift.
|
||||
|
||||
Rather than reimplementing any of these rules, extract shared helpers at two levels:
|
||||
|
||||
```typescript
|
||||
// In session-parser.ts (or a shared module):
|
||||
|
||||
// Level 1: Line iteration with consistent error handling
|
||||
// Splits content by newlines, JSON.parse each, skips malformed lines identically
|
||||
// to how parseSessionContent handles them. Returns parse error count for diagnostics.
|
||||
export function forEachJsonlLine(
|
||||
content: string,
|
||||
onLine: (parsed: RawLine, lineIndex: number) => void
|
||||
): { parseErrors: number }
|
||||
|
||||
// Level 2: Classification and counting (called per parsed line)
|
||||
export function countMessagesForLine(parsed: RawLine): number
|
||||
export function classifyLine(parsed: RawLine): LineClassification
|
||||
```
|
||||
|
||||
Both `extractSessionMetadata()` and `parseSessionContent()` use `forEachJsonlLine()` for iteration, ensuring identical malformed-line handling. Both use `countMessagesForLine()` for counting. This two-level sharing guarantees that list counts can never drift from detail-view counts, regardless of future parser changes or edge cases in error handling.
|
||||
|
||||
### Metadata extraction (Tier 3)
|
||||
|
||||
A lightweight `extractSessionMetadata()` function reads the JSONL file and extracts only what the list view needs, without building full message content strings:
|
||||
|
||||
```typescript
|
||||
export function extractSessionMetadata(content: string): SessionMetadata
|
||||
```
|
||||
|
||||
Implementation:
|
||||
|
||||
1. Iterate lines via `forEachJsonlLine(content, ...)` — the shared iterator with identical malformed-line handling as the main parser
|
||||
2. Call `countMessagesForLine(parsed)` per line — the shared helper that uses the **same classification rules** as `parseSessionContent` in `session-parser.ts`
|
||||
3. Extract `firstPrompt`: content of the first user message that isn't a `<system-reminder>`, truncated to 200 characters
|
||||
4. Extract `summary`: the `summary` field from the last `type="summary"` line
|
||||
5. Capture first and last `timestamp` fields for duration computation
|
||||
|
||||
No string building, no `JSON.stringify`, no markdown processing — just counting, timestamp capture, and first-match extraction. This is exact (matches `parseSessionContent().length` via shared helpers) but 2-3x faster than full parsing.
|
||||
|
||||
### Persistent metadata cache
|
||||
|
||||
**Location:** `~/.cache/session-viewer/metadata.json`
|
||||
|
||||
```typescript
|
||||
interface CacheFile {
|
||||
version: 1;
|
||||
entries: Record<string, { // keyed by absolute file path
|
||||
mtimeMs: number;
|
||||
size: number;
|
||||
messageCount: number;
|
||||
firstPrompt: string;
|
||||
summary: string;
|
||||
created: string; // ISO string from file birthtime
|
||||
modified: string; // ISO string from file mtime
|
||||
firstTimestamp: string; // ISO from first JSONL line with timestamp
|
||||
lastTimestamp: string; // ISO from last JSONL line with timestamp
|
||||
}>;
|
||||
}
|
||||
```
|
||||
|
||||
Behavior:
|
||||
- Loaded once on first `discoverSessions()` call
|
||||
- Entries validated by `(mtimeMs, size)` — if either changes, entry is re-extracted via Tier 3
|
||||
- Written to disk asynchronously using a dirty-flag write-behind strategy: only when cache has new/updated entries, coalescing multiple discovery passes, non-blocking
|
||||
- Flush any pending write on process exit (`SIGTERM`, `SIGINT`) and graceful server shutdown — prevents losing cache updates when the server stops before the async write fires
|
||||
- Corrupt or missing cache file triggers graceful fallback (all files go through Tier 3, cache rebuilt)
|
||||
- Atomic writes: write to temp file, then rename (prevents corruption from crashes during write)
|
||||
- Stale entries (file no longer exists on disk) are pruned on save
|
||||
|
||||
### Concurrency model
|
||||
|
||||
Cold start with 3,103 files requires bounded parallelism to avoid file-handle exhaustion and IO thrash, while still meeting the <5s target:
|
||||
|
||||
- **Stat phase**: Batch all `fs.stat()` calls with concurrency limit (e.g., 64). This classifies each file into Tier 1/2 (cache hit) or Tier 3 (needs parse). Files that fail stat (ENOENT from deletion race, EACCES) are silently skipped with a debug log.
|
||||
- **Parse phase**: Process Tier-3 misses with bounded concurrency (e.g., 8). Each parse reads + iterates via shared `forEachJsonlLine()` + shared counter. With max file size 4.5MB, each parse is ~5-50ms.
|
||||
- Use a simple async work queue (e.g., `p-limit` or hand-rolled semaphore). No worker threads needed for this IO-bound workload.
|
||||
|
||||
### Performance expectations
|
||||
|
||||
| Scenario | Estimated time |
|
||||
|----------|---------------|
|
||||
| Cold start (no cache, no index) | ~3-5s for 3,103 files (~500MB), bounded concurrency: stat@64, parse@8 |
|
||||
| Warm start (cache exists, few changes) | ~300-500ms (stat all files at bounded concurrency, in-memory lookups) |
|
||||
| Incremental (cache + few new sessions) | ~500ms + ~50ms per new file |
|
||||
| Subsequent API calls within 30s TTL | <1ms (in-memory session list cache) |
|
||||
|
||||
### Existing infrastructure leveraged
|
||||
|
||||
- **30-second in-memory cache** in `sessions.ts` (`getCachedSessions()`) — unchanged, provides the fast path for repeated API calls
|
||||
- **`?refresh=1` query parameter** — forces cache invalidation, unchanged
|
||||
- **Concurrent request deduplication** via `cachePromise` pattern — unchanged
|
||||
- **Security validations** — path traversal rejection, containment checks, `.jsonl` extension enforcement — applied to filesystem-discovered files identically
|
||||
|
||||
## Implementation scope
|
||||
|
||||
### Checkpoints
|
||||
|
||||
#### CP0 — Parser parity foundations
|
||||
- Extract `forEachJsonlLine()` shared line iterator from existing parser
|
||||
- Extract `countMessagesForLine()` and `classifyLine()` shared helpers
|
||||
- Refactor `extractMessages()` to use these internally (no behavior change to parseSessionContent)
|
||||
- Tests verify identical behavior on malformed/truncated lines
|
||||
|
||||
#### CP1 — Filesystem-first correctness
|
||||
- All `.jsonl` sessions appear even with missing/corrupt index
|
||||
- `extractSessionMetadata()` uses shared line iterator + counting helpers; exact counts verified by tests
|
||||
- Stat-derived `created`/`modified` are the single source for SessionEntry timestamps and list ordering
|
||||
- Duration computed from JSONL timestamps, not index
|
||||
- TOCTOU races (readdir/stat, stat/read) handled gracefully — disappeared files silently skipped
|
||||
|
||||
#### CP2 — Persistent cache
|
||||
- Atomic writes with dirty-flag write-behind; prune stale entries
|
||||
- Invalidation keyed on `(mtimeMs, size)`
|
||||
- Flush pending writes on process exit / server shutdown
|
||||
|
||||
#### CP3 — Index fast path (Tier 1)
|
||||
- Parse index into Map; normalize `modified` ISO → ms; validate against stat mtime with 1s tolerance
|
||||
- sessionId mapping: `basename(file, '.jsonl')`
|
||||
- Tier 1 trusts content fields only; timestamps always from stat
|
||||
|
||||
#### CP4 — Performance hardening
|
||||
- Bounded concurrency for stat + parse phases
|
||||
- Warm start <1s verified on real dataset
|
||||
|
||||
### Modified files
|
||||
|
||||
**`src/server/services/session-parser.ts`**
|
||||
|
||||
1. Extract `forEachJsonlLine(content, onLine): { parseErrors: number }` — shared line iterator with consistent malformed-line handling
|
||||
2. Extract `countMessagesForLine(parsed: RawLine): number` — shared counting helper
|
||||
3. Extract `classifyLine(parsed: RawLine): LineClassification` — shared classification
|
||||
4. Refactor `extractMessages()` to use these shared helpers internally (no behavior change to parseSessionContent)
|
||||
|
||||
**`src/server/services/session-discovery.ts`**
|
||||
|
||||
1. Add `extractSessionMetadata(content: string): SessionMetadata` — lightweight JSONL metadata extractor using shared line iterator + counting helper
|
||||
2. Add `MetadataCache` class — persistent cache with load/get/set/save, dirty-flag write-behind, shutdown flush
|
||||
3. Rewrite per-project discovery loop — filesystem-first, tiered metadata lookup with bounded concurrency
|
||||
4. Read `sessions-index.json` as optimization only — parse into `Map<sessionId, IndexEntry>`, normalize `modified` to ms, validate against stat mtime before trusting
|
||||
5. Register shutdown hooks for cache flush on `SIGTERM`/`SIGINT`
|
||||
|
||||
### Unchanged files
|
||||
|
||||
- `src/server/routes/sessions.ts` — existing caching layer works as-is
|
||||
- `src/shared/types.ts` — `SessionEntry` type already has `duration?: number`
|
||||
- All client components — no changes needed
|
||||
|
||||
### New tests
|
||||
|
||||
- Unit test: `forEachJsonlLine()` skips malformed lines identically to how `parseSessionContent` handles them
|
||||
- Unit test: `forEachJsonlLine()` reports parse error count for truncated/corrupted lines
|
||||
- Unit test: `countMessagesForLine()` matches actual `extractMessages()` output length on sample lines
|
||||
- Unit test: `extractSessionMetadata()` output matches `parseSessionContent().length` on sample fixtures (including malformed/truncated lines)
|
||||
- Unit test: Duration extracted from JSONL timestamps matches expected values
|
||||
- Unit test: SessionEntry `created`/`modified` always come from stat, even when Tier 1 index data is trusted
|
||||
- Unit test: Tier 1 validation rejects stale index entries (mtime mismatch beyond 1s tolerance)
|
||||
- Unit test: Tier 1 handles missing `modified` field gracefully (falls through to Tier 2)
|
||||
- Unit test: Discovery works with no `sessions-index.json` present
|
||||
- Unit test: Discovery silently skips files that disappear between readdir and stat (TOCTOU)
|
||||
- Unit test: Cache hit/miss/invalidation behavior (mtimeMs + size)
|
||||
- Unit test: Cache dirty-flag only triggers write when entries changed
|
||||
|
||||
## Edge cases
|
||||
|
||||
| Scenario | Behavior |
|
||||
|----------|----------|
|
||||
| File actively being written | mtime changes between stat and read. Next discovery pass re-extracts. Partial JSONL handled gracefully (malformed lines skipped via shared `forEachJsonlLine`, same behavior as real parser). |
|
||||
| Deleted session files | File in cache but gone from disk. Entry silently dropped, pruned from cache on next save. |
|
||||
| File disappears between readdir and stat | TOCTOU race. Stat failure (ENOENT) silently skipped with debug log. |
|
||||
| File disappears between stat and read | Read failure silently skipped; file excluded from results. Next pass re-discovers if it reappears. |
|
||||
| Index entry with wrong mtime | Tier 1 validation rejects it (>1s tolerance). Falls through to Tier 2/3. |
|
||||
| Index entry with no `modified` field | Tier 1 skips it. Falls through to Tier 2/3. |
|
||||
| Index `modified` in seconds vs milliseconds | Normalization handles both ISO strings and numeric timestamps. |
|
||||
| Cache file locked or unwritable | Extraction still works, just doesn't persist. Warning logged to stderr. |
|
||||
| Very large files | 4.5MB max observed. Tier 3 parse ~50ms. Acceptable. |
|
||||
| Concurrent server restarts | Cache writes are atomic (temp file + rename). |
|
||||
| Server killed before async cache write | Shutdown hooks flush pending writes on SIGTERM/SIGINT. Hard kills (SIGKILL) may lose updates — acceptable, cache rebuilt on next cold start. |
|
||||
| Empty JSONL files | Returns `messageCount: 0`, empty `firstPrompt`, `summary`, and timestamps. Duration: 0. |
|
||||
| Projects with no index file | Discovery proceeds normally via Tier 2/3. Common case (13 projects). |
|
||||
| Non-JSONL files in project dirs | Filtered out by `.jsonl` extension check in `readdir` results. |
|
||||
| File handle exhaustion | Bounded concurrency (stat@64, parse@8) prevents opening thousands of handles. |
|
||||
| Future parser changes (new message types) | Shared line iterator + counting helper in session-parser.ts means Tier 3 automatically stays in sync. |
|
||||
| Malformed JSONL lines (crash mid-write) | Shared `forEachJsonlLine()` skips identically in both metadata extraction and full parsing — no count drift. |
|
||||
|
||||
## Verification plan
|
||||
|
||||
1. Start dev server, confirm today's sessions appear immediately in the session list
|
||||
2. Compare message counts for indexed sessions: Tier 1 data vs Tier 3 extraction (should match)
|
||||
3. Verify duration is shown for sessions that have no index entry (JSONL-only sessions)
|
||||
4. Delete a `sessions-index.json`, refresh — verify all sessions for that project still appear with correct counts and durations
|
||||
5. Run existing test suite: `npm test`
|
||||
6. Run new unit tests for shared line iterator, counting helper, `extractSessionMetadata()`, and `MetadataCache`
|
||||
7. Verify `created`/`modified` in session list come from stat, not index (compare with `ls -l` output)
|
||||
8. Verify cold start performance: delete `~/.cache/session-viewer/metadata.json`, time the first API request
|
||||
9. Verify warm start performance: time a subsequent server start with cache in place
|
||||
10. Verify cache dirty-flag: repeated refreshes with no file changes should not write cache to disk
|
||||
11. Kill server with SIGTERM, restart — verify cache was flushed (no full re-parse on restart)
|
||||
@@ -2,6 +2,54 @@ import fs from "fs/promises";
|
||||
import path from "path";
|
||||
import os from "os";
|
||||
import type { SessionEntry } from "../../shared/types.js";
|
||||
import { extractSessionMetadata } from "./session-metadata.js";
|
||||
import { MetadataCache } from "./metadata-cache.js";
|
||||
import type { CacheEntry } from "./metadata-cache.js";
|
||||
|
||||
const CLAUDE_PROJECTS_DIR = path.join(os.homedir(), ".claude", "projects");
|
||||
const FILE_CONCURRENCY = 32;
|
||||
|
||||
let cache: MetadataCache | null = null;
|
||||
let cacheLoaded = false;
|
||||
|
||||
export function setCache(c: MetadataCache | null): void {
|
||||
cache = c;
|
||||
cacheLoaded = c !== null;
|
||||
}
|
||||
|
||||
async function ensureCache(): Promise<MetadataCache> {
|
||||
if (!cache) {
|
||||
cache = new MetadataCache();
|
||||
}
|
||||
if (!cacheLoaded) {
|
||||
await cache.load();
|
||||
cacheLoaded = true;
|
||||
}
|
||||
return cache;
|
||||
}
|
||||
|
||||
async function mapWithLimit<T, R>(
|
||||
items: T[],
|
||||
limit: number,
|
||||
fn: (item: T) => Promise<R>
|
||||
): Promise<R[]> {
|
||||
const results: R[] = new Array(items.length);
|
||||
let nextIndex = 0;
|
||||
|
||||
async function worker(): Promise<void> {
|
||||
while (nextIndex < items.length) {
|
||||
const i = nextIndex++;
|
||||
results[i] = await fn(items[i]);
|
||||
}
|
||||
}
|
||||
|
||||
const workers = Array.from(
|
||||
{ length: Math.min(limit, items.length) },
|
||||
() => worker()
|
||||
);
|
||||
await Promise.all(workers);
|
||||
return results;
|
||||
}
|
||||
|
||||
interface IndexEntry {
|
||||
sessionId: string;
|
||||
@@ -14,12 +62,14 @@ interface IndexEntry {
|
||||
projectPath?: string;
|
||||
}
|
||||
|
||||
const CLAUDE_PROJECTS_DIR = path.join(os.homedir(), ".claude", "projects");
|
||||
const MTIME_TOLERANCE_MS = 1000;
|
||||
|
||||
export async function discoverSessions(
|
||||
projectsDir: string = CLAUDE_PROJECTS_DIR
|
||||
): Promise<SessionEntry[]> {
|
||||
const sessions: SessionEntry[] = [];
|
||||
const metadataCache = await ensureCache();
|
||||
const discoveredPaths = new Set<string>();
|
||||
|
||||
let projectDirs: string[];
|
||||
try {
|
||||
@@ -28,63 +78,152 @@ export async function discoverSessions(
|
||||
return sessions;
|
||||
}
|
||||
|
||||
// Parallel I/O: stat + readFile for all project dirs concurrently
|
||||
const results = await Promise.all(
|
||||
projectDirs.map(async (projectDir) => {
|
||||
const projectPath = path.join(projectsDir, projectDir);
|
||||
const entries: SessionEntry[] = [];
|
||||
|
||||
let stat;
|
||||
let dirStat;
|
||||
try {
|
||||
stat = await fs.stat(projectPath);
|
||||
dirStat = await fs.stat(projectPath);
|
||||
} catch {
|
||||
return entries;
|
||||
}
|
||||
if (!stat.isDirectory()) return entries;
|
||||
if (!dirStat.isDirectory()) return entries;
|
||||
|
||||
const indexPath = path.join(projectPath, "sessions-index.json");
|
||||
let files: string[];
|
||||
try {
|
||||
const content = await fs.readFile(indexPath, "utf-8");
|
||||
const parsed = JSON.parse(content);
|
||||
|
||||
// Handle both formats: raw array or { version, entries: [...] }
|
||||
const rawEntries: IndexEntry[] = Array.isArray(parsed)
|
||||
? parsed
|
||||
: parsed.entries ?? [];
|
||||
|
||||
for (const entry of rawEntries) {
|
||||
const sessionPath =
|
||||
entry.fullPath ||
|
||||
path.join(projectPath, `${entry.sessionId}.jsonl`);
|
||||
|
||||
// Validate: reject paths with traversal segments or non-JSONL extensions.
|
||||
// Check the raw path for ".." before resolving (resolve normalizes them away).
|
||||
if (sessionPath.includes("..") || !sessionPath.endsWith(".jsonl")) {
|
||||
continue;
|
||||
}
|
||||
const resolved = path.resolve(sessionPath);
|
||||
|
||||
// Containment check: reject paths that escape the projects directory.
|
||||
// A corrupted or malicious index could set fullPath to an arbitrary
|
||||
// absolute path like "/etc/shadow.jsonl".
|
||||
if (!resolved.startsWith(projectsDir + path.sep) && resolved !== projectsDir) {
|
||||
continue;
|
||||
}
|
||||
|
||||
entries.push({
|
||||
id: entry.sessionId,
|
||||
summary: entry.summary || "",
|
||||
firstPrompt: entry.firstPrompt || "",
|
||||
project: projectDir,
|
||||
created: entry.created || "",
|
||||
modified: entry.modified || "",
|
||||
messageCount: entry.messageCount || 0,
|
||||
path: resolved,
|
||||
duration: computeDuration(entry.created, entry.modified),
|
||||
});
|
||||
}
|
||||
files = await fs.readdir(projectPath);
|
||||
} catch {
|
||||
// Missing or corrupt index - skip
|
||||
return entries;
|
||||
}
|
||||
|
||||
const jsonlFiles = files.filter((f) => f.endsWith(".jsonl"));
|
||||
|
||||
// Tier 1: Load sessions-index.json for this project
|
||||
const indexMap = await loadProjectIndex(projectPath);
|
||||
|
||||
const fileResults = await mapWithLimit(
|
||||
jsonlFiles,
|
||||
FILE_CONCURRENCY,
|
||||
async (filename) => {
|
||||
const filePath = path.join(projectPath, filename);
|
||||
|
||||
// Security: reject traversal
|
||||
if (filename.includes("..")) return null;
|
||||
|
||||
const resolved = path.resolve(filePath);
|
||||
if (
|
||||
!resolved.startsWith(projectsDir + path.sep) &&
|
||||
resolved !== projectsDir
|
||||
) {
|
||||
return null;
|
||||
}
|
||||
|
||||
let fileStat;
|
||||
try {
|
||||
fileStat = await fs.stat(resolved);
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
|
||||
discoveredPaths.add(resolved);
|
||||
|
||||
const sessionId = path.basename(filename, ".jsonl");
|
||||
|
||||
// Tier 1: Check index
|
||||
const indexEntry = indexMap.get(sessionId);
|
||||
if (indexEntry?.modified) {
|
||||
const indexMtimeMs = new Date(indexEntry.modified).getTime();
|
||||
if (
|
||||
!isNaN(indexMtimeMs) &&
|
||||
Math.abs(indexMtimeMs - fileStat.mtimeMs) <= MTIME_TOLERANCE_MS
|
||||
) {
|
||||
const duration = computeDuration(
|
||||
indexEntry.created,
|
||||
indexEntry.modified
|
||||
);
|
||||
return {
|
||||
id: sessionId,
|
||||
project: projectDir,
|
||||
path: resolved,
|
||||
created: new Date(fileStat.birthtimeMs).toISOString(),
|
||||
modified: new Date(fileStat.mtimeMs).toISOString(),
|
||||
messageCount: indexEntry.messageCount || 0,
|
||||
firstPrompt: indexEntry.firstPrompt || "",
|
||||
summary: indexEntry.summary || "",
|
||||
duration: duration > 0 ? duration : undefined,
|
||||
} satisfies SessionEntry;
|
||||
}
|
||||
}
|
||||
|
||||
// Tier 2: Check metadata cache
|
||||
const cached = metadataCache.get(
|
||||
resolved,
|
||||
fileStat.mtimeMs,
|
||||
fileStat.size
|
||||
);
|
||||
if (cached) {
|
||||
const duration = computeDuration(
|
||||
cached.firstTimestamp,
|
||||
cached.lastTimestamp
|
||||
);
|
||||
return {
|
||||
id: sessionId,
|
||||
project: projectDir,
|
||||
path: resolved,
|
||||
created: new Date(fileStat.birthtimeMs).toISOString(),
|
||||
modified: new Date(fileStat.mtimeMs).toISOString(),
|
||||
messageCount: cached.messageCount,
|
||||
firstPrompt: cached.firstPrompt,
|
||||
summary: cached.summary,
|
||||
duration: duration > 0 ? duration : undefined,
|
||||
} satisfies SessionEntry;
|
||||
}
|
||||
|
||||
// Tier 3: Full parse
|
||||
let content: string;
|
||||
try {
|
||||
content = await fs.readFile(resolved, "utf-8");
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
|
||||
const metadata = extractSessionMetadata(content);
|
||||
|
||||
// Update cache
|
||||
const cacheEntry: CacheEntry = {
|
||||
mtimeMs: fileStat.mtimeMs,
|
||||
size: fileStat.size,
|
||||
messageCount: metadata.messageCount,
|
||||
firstPrompt: metadata.firstPrompt,
|
||||
summary: metadata.summary,
|
||||
firstTimestamp: metadata.firstTimestamp,
|
||||
lastTimestamp: metadata.lastTimestamp,
|
||||
};
|
||||
metadataCache.set(resolved, cacheEntry);
|
||||
|
||||
const duration = computeDuration(
|
||||
metadata.firstTimestamp,
|
||||
metadata.lastTimestamp
|
||||
);
|
||||
|
||||
return {
|
||||
id: sessionId,
|
||||
project: projectDir,
|
||||
path: resolved,
|
||||
created: new Date(fileStat.birthtimeMs).toISOString(),
|
||||
modified: new Date(fileStat.mtimeMs).toISOString(),
|
||||
messageCount: metadata.messageCount,
|
||||
firstPrompt: metadata.firstPrompt,
|
||||
summary: metadata.summary,
|
||||
duration: duration > 0 ? duration : undefined,
|
||||
} satisfies SessionEntry;
|
||||
}
|
||||
);
|
||||
|
||||
for (const entry of fileResults) {
|
||||
if (entry) entries.push(entry);
|
||||
}
|
||||
|
||||
return entries;
|
||||
@@ -101,14 +240,47 @@ export async function discoverSessions(
|
||||
return dateB - dateA;
|
||||
});
|
||||
|
||||
// Fire-and-forget cache save
|
||||
metadataCache.save(discoveredPaths).catch(() => {
|
||||
// Cache write failure is non-fatal
|
||||
});
|
||||
|
||||
return sessions;
|
||||
}
|
||||
|
||||
function computeDuration(created?: string, modified?: string): number {
|
||||
if (!created || !modified) return 0;
|
||||
const createdMs = new Date(created).getTime();
|
||||
const modifiedMs = new Date(modified).getTime();
|
||||
if (isNaN(createdMs) || isNaN(modifiedMs)) return 0;
|
||||
const diff = modifiedMs - createdMs;
|
||||
async function loadProjectIndex(
|
||||
projectPath: string
|
||||
): Promise<Map<string, IndexEntry>> {
|
||||
const indexMap = new Map<string, IndexEntry>();
|
||||
const indexPath = path.join(projectPath, "sessions-index.json");
|
||||
|
||||
try {
|
||||
const raw = await fs.readFile(indexPath, "utf-8");
|
||||
const parsed = JSON.parse(raw);
|
||||
const rawEntries: IndexEntry[] = Array.isArray(parsed)
|
||||
? parsed
|
||||
: parsed.entries ?? [];
|
||||
|
||||
for (const entry of rawEntries) {
|
||||
if (entry.sessionId) {
|
||||
indexMap.set(entry.sessionId, entry);
|
||||
}
|
||||
}
|
||||
} catch {
|
||||
// Missing or corrupt index — continue without Tier 1
|
||||
}
|
||||
|
||||
return indexMap;
|
||||
}
|
||||
|
||||
function computeDuration(
|
||||
firstTimestamp?: string,
|
||||
lastTimestamp?: string
|
||||
): number {
|
||||
if (!firstTimestamp || !lastTimestamp) return 0;
|
||||
const firstMs = new Date(firstTimestamp).getTime();
|
||||
const lastMs = new Date(lastTimestamp).getTime();
|
||||
if (isNaN(firstMs) || isNaN(lastMs)) return 0;
|
||||
const diff = lastMs - firstMs;
|
||||
return diff > 0 ? diff : 0;
|
||||
}
|
||||
|
||||
@@ -1,70 +1,122 @@
|
||||
import { describe, it, expect } from "vitest";
|
||||
import { discoverSessions } from "../../src/server/services/session-discovery.js";
|
||||
import { describe, it, expect, beforeEach } from "vitest";
|
||||
import { discoverSessions, setCache } from "../../src/server/services/session-discovery.js";
|
||||
import { MetadataCache } from "../../src/server/services/metadata-cache.js";
|
||||
import path from "path";
|
||||
import fs from "fs/promises";
|
||||
import os from "os";
|
||||
|
||||
/** Helper to write a sessions-index.json in the real { version, entries } format */
|
||||
function makeIndex(entries: Record<string, unknown>[]) {
|
||||
function makeJsonlContent(lines: Record<string, unknown>[]): string {
|
||||
return lines.map((l) => JSON.stringify(l)).join("\n");
|
||||
}
|
||||
|
||||
function makeIndex(entries: Record<string, unknown>[]): string {
|
||||
return JSON.stringify({ version: 1, entries });
|
||||
}
|
||||
|
||||
describe("session-discovery", () => {
|
||||
it("discovers sessions from { version, entries } format", async () => {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-${Date.now()}`);
|
||||
const projectDir = path.join(tmpDir, "test-project");
|
||||
await fs.mkdir(projectDir, { recursive: true });
|
||||
async function makeTmpProject(
|
||||
suffix: string
|
||||
): Promise<{ tmpDir: string; projectDir: string; cachePath: string; cleanup: () => Promise<void> }> {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-${suffix}-${Date.now()}`);
|
||||
const projectDir = path.join(tmpDir, "test-project");
|
||||
const cachePath = path.join(tmpDir, ".cache", "metadata.json");
|
||||
await fs.mkdir(projectDir, { recursive: true });
|
||||
return {
|
||||
tmpDir,
|
||||
projectDir,
|
||||
cachePath,
|
||||
cleanup: () => fs.rm(tmpDir, { recursive: true }),
|
||||
};
|
||||
}
|
||||
|
||||
const sessionPath = path.join(projectDir, "sess-001.jsonl");
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "sess-001",
|
||||
fullPath: sessionPath,
|
||||
summary: "Test session",
|
||||
firstPrompt: "Hello",
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
messageCount: 5,
|
||||
describe("session-discovery", () => {
|
||||
beforeEach(() => {
|
||||
// Reset global cache between tests to prevent cross-contamination
|
||||
setCache(new MetadataCache(path.join(os.tmpdir(), `sv-cache-${Date.now()}.json`)));
|
||||
});
|
||||
|
||||
it("discovers sessions from .jsonl files without index", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("no-index");
|
||||
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Hello world" },
|
||||
uuid: "u-1",
|
||||
timestamp: "2025-10-15T10:00:00Z",
|
||||
},
|
||||
{
|
||||
type: "assistant",
|
||||
message: {
|
||||
role: "assistant",
|
||||
content: [{ type: "text", text: "Hi there" }],
|
||||
},
|
||||
])
|
||||
);
|
||||
uuid: "a-1",
|
||||
timestamp: "2025-10-15T10:01:00Z",
|
||||
},
|
||||
]);
|
||||
|
||||
await fs.writeFile(path.join(projectDir, "sess-001.jsonl"), content);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
expect(sessions[0].id).toBe("sess-001");
|
||||
expect(sessions[0].summary).toBe("Test session");
|
||||
expect(sessions[0].project).toBe("test-project");
|
||||
expect(sessions[0].messageCount).toBe(5);
|
||||
expect(sessions[0].path).toBe(sessionPath);
|
||||
expect(sessions[0].messageCount).toBe(2);
|
||||
expect(sessions[0].firstPrompt).toBe("Hello world");
|
||||
expect(sessions[0].path).toBe(path.join(projectDir, "sess-001.jsonl"));
|
||||
|
||||
await fs.rm(tmpDir, { recursive: true });
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("also handles legacy raw array format", async () => {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-legacy-${Date.now()}`);
|
||||
const projectDir = path.join(tmpDir, "legacy-project");
|
||||
await fs.mkdir(projectDir, { recursive: true });
|
||||
it("timestamps come from stat, not JSONL content", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("stat-times");
|
||||
|
||||
// Raw array (not wrapped in { version, entries })
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
JSON.stringify([
|
||||
{
|
||||
sessionId: "legacy-001",
|
||||
summary: "Legacy format",
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
},
|
||||
])
|
||||
);
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Hello" },
|
||||
uuid: "u-1",
|
||||
timestamp: "2020-01-01T00:00:00Z",
|
||||
},
|
||||
]);
|
||||
|
||||
const filePath = path.join(projectDir, "sess-stat.jsonl");
|
||||
await fs.writeFile(filePath, content);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
expect(sessions[0].id).toBe("legacy-001");
|
||||
|
||||
await fs.rm(tmpDir, { recursive: true });
|
||||
// created and modified should be from stat (recent), not from the 2020 timestamp
|
||||
const createdDate = new Date(sessions[0].created);
|
||||
const now = new Date();
|
||||
const diffMs = now.getTime() - createdDate.getTime();
|
||||
expect(diffMs).toBeLessThan(60_000); // within last minute
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("silently skips files deleted between readdir and stat", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("toctou");
|
||||
|
||||
// Write a session, discover will find it
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Survives" },
|
||||
uuid: "u-1",
|
||||
},
|
||||
]);
|
||||
await fs.writeFile(path.join(projectDir, "survivor.jsonl"), content);
|
||||
|
||||
// Write and immediately delete another
|
||||
await fs.writeFile(path.join(projectDir, "ghost.jsonl"), content);
|
||||
await fs.unlink(path.join(projectDir, "ghost.jsonl"));
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
expect(sessions[0].id).toBe("survivor");
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("handles missing projects directory gracefully", async () => {
|
||||
@@ -72,21 +124,6 @@ describe("session-discovery", () => {
|
||||
expect(sessions).toEqual([]);
|
||||
});
|
||||
|
||||
it("handles corrupt index files gracefully", async () => {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-corrupt-${Date.now()}`);
|
||||
const projectDir = path.join(tmpDir, "corrupt-project");
|
||||
await fs.mkdir(projectDir, { recursive: true });
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
"not valid json {"
|
||||
);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toEqual([]);
|
||||
|
||||
await fs.rm(tmpDir, { recursive: true });
|
||||
});
|
||||
|
||||
it("aggregates across multiple project directories", async () => {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-multi-${Date.now()}`);
|
||||
const proj1 = path.join(tmpDir, "project-a");
|
||||
@@ -94,14 +131,25 @@ describe("session-discovery", () => {
|
||||
await fs.mkdir(proj1, { recursive: true });
|
||||
await fs.mkdir(proj2, { recursive: true });
|
||||
|
||||
await fs.writeFile(
|
||||
path.join(proj1, "sessions-index.json"),
|
||||
makeIndex([{ sessionId: "a-001", created: "2025-01-01T00:00:00Z", modified: "2025-01-01T00:00:00Z" }])
|
||||
);
|
||||
await fs.writeFile(
|
||||
path.join(proj2, "sessions-index.json"),
|
||||
makeIndex([{ sessionId: "b-001", created: "2025-01-02T00:00:00Z", modified: "2025-01-02T00:00:00Z" }])
|
||||
);
|
||||
const contentA = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Project A" },
|
||||
uuid: "u-a",
|
||||
timestamp: "2025-01-01T00:00:00Z",
|
||||
},
|
||||
]);
|
||||
const contentB = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Project B" },
|
||||
uuid: "u-b",
|
||||
timestamp: "2025-01-02T00:00:00Z",
|
||||
},
|
||||
]);
|
||||
|
||||
await fs.writeFile(path.join(proj1, "a-001.jsonl"), contentA);
|
||||
await fs.writeFile(path.join(proj2, "b-001.jsonl"), contentB);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(2);
|
||||
@@ -112,93 +160,299 @@ describe("session-discovery", () => {
|
||||
await fs.rm(tmpDir, { recursive: true });
|
||||
});
|
||||
|
||||
it("rejects paths with traversal segments", async () => {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-traversal-${Date.now()}`);
|
||||
const projectDir = path.join(tmpDir, "traversal-project");
|
||||
await fs.mkdir(projectDir, { recursive: true });
|
||||
it("ignores non-.jsonl files in project directories", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("filter-ext");
|
||||
|
||||
const goodPath = path.join(projectDir, "good-001.jsonl");
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Hello" },
|
||||
uuid: "u-1",
|
||||
},
|
||||
]);
|
||||
|
||||
await fs.writeFile(path.join(projectDir, "session.jsonl"), content);
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "evil-001",
|
||||
fullPath: "/home/ubuntu/../../../etc/passwd",
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
},
|
||||
{
|
||||
sessionId: "evil-002",
|
||||
fullPath: "/home/ubuntu/sessions/not-a-jsonl.txt",
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
},
|
||||
{
|
||||
sessionId: "good-001",
|
||||
fullPath: goodPath,
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
},
|
||||
])
|
||||
'{"version":1,"entries":[]}'
|
||||
);
|
||||
await fs.writeFile(path.join(projectDir, "notes.txt"), "notes");
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
expect(sessions[0].id).toBe("good-001");
|
||||
expect(sessions[0].id).toBe("session");
|
||||
|
||||
await fs.rm(tmpDir, { recursive: true });
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("rejects absolute paths outside the projects directory", async () => {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-containment-${Date.now()}`);
|
||||
const projectDir = path.join(tmpDir, "contained-project");
|
||||
await fs.mkdir(projectDir, { recursive: true });
|
||||
it("duration computed from JSONL timestamps", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("duration");
|
||||
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "escaped-001",
|
||||
fullPath: "/etc/shadow.jsonl",
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Start" },
|
||||
uuid: "u-1",
|
||||
timestamp: "2025-10-15T10:00:00Z",
|
||||
},
|
||||
{
|
||||
type: "assistant",
|
||||
message: {
|
||||
role: "assistant",
|
||||
content: [{ type: "text", text: "End" }],
|
||||
},
|
||||
{
|
||||
sessionId: "escaped-002",
|
||||
fullPath: "/tmp/other-dir/secret.jsonl",
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
},
|
||||
])
|
||||
);
|
||||
uuid: "a-1",
|
||||
timestamp: "2025-10-15T10:30:00Z",
|
||||
},
|
||||
]);
|
||||
|
||||
await fs.writeFile(path.join(projectDir, "sess-dur.jsonl"), content);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(0);
|
||||
expect(sessions).toHaveLength(1);
|
||||
// 30 minutes = 1800000 ms
|
||||
expect(sessions[0].duration).toBe(1_800_000);
|
||||
|
||||
await fs.rm(tmpDir, { recursive: true });
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("uses fullPath from index entry", async () => {
|
||||
const tmpDir = path.join(os.tmpdir(), `sv-test-fp-${Date.now()}`);
|
||||
const projectDir = path.join(tmpDir, "fp-project");
|
||||
await fs.mkdir(projectDir, { recursive: true });
|
||||
it("handles empty .jsonl files", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("empty");
|
||||
|
||||
const sessionPath = path.join(projectDir, "fp-001.jsonl");
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "fp-001",
|
||||
fullPath: sessionPath,
|
||||
created: "2025-10-15T10:00:00Z",
|
||||
modified: "2025-10-15T11:00:00Z",
|
||||
},
|
||||
])
|
||||
);
|
||||
await fs.writeFile(path.join(projectDir, "empty.jsonl"), "");
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions[0].path).toBe(sessionPath);
|
||||
expect(sessions).toHaveLength(1);
|
||||
expect(sessions[0].id).toBe("empty");
|
||||
expect(sessions[0].messageCount).toBe(0);
|
||||
expect(sessions[0].firstPrompt).toBe("");
|
||||
|
||||
await fs.rm(tmpDir, { recursive: true });
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("sorts by modified descending", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("sort");
|
||||
|
||||
const content1 = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "First" },
|
||||
uuid: "u-1",
|
||||
},
|
||||
]);
|
||||
const content2 = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Second" },
|
||||
uuid: "u-2",
|
||||
},
|
||||
]);
|
||||
|
||||
await fs.writeFile(path.join(projectDir, "older.jsonl"), content1);
|
||||
// Small delay to ensure different mtime
|
||||
await new Promise((r) => setTimeout(r, 50));
|
||||
await fs.writeFile(path.join(projectDir, "newer.jsonl"), content2);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(2);
|
||||
expect(sessions[0].id).toBe("newer");
|
||||
expect(sessions[1].id).toBe("older");
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
describe("Tier 1 index validation", () => {
|
||||
it("uses index data when modified matches stat mtime within 1s", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("tier1-hit");
|
||||
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Hello" },
|
||||
uuid: "u-1",
|
||||
timestamp: "2025-10-15T10:00:00Z",
|
||||
},
|
||||
]);
|
||||
const filePath = path.join(projectDir, "sess-idx.jsonl");
|
||||
await fs.writeFile(filePath, content);
|
||||
|
||||
// Get the actual mtime from the file
|
||||
const stat = await fs.stat(filePath);
|
||||
const mtimeIso = new Date(stat.mtimeMs).toISOString();
|
||||
|
||||
// Write an index with the matching modified timestamp and different metadata
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "sess-idx",
|
||||
summary: "Index summary",
|
||||
firstPrompt: "Index prompt",
|
||||
messageCount: 99,
|
||||
modified: mtimeIso,
|
||||
created: "2025-10-15T09:00:00Z",
|
||||
},
|
||||
])
|
||||
);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
// Should use index data (Tier 1 hit)
|
||||
expect(sessions[0].messageCount).toBe(99);
|
||||
expect(sessions[0].summary).toBe("Index summary");
|
||||
expect(sessions[0].firstPrompt).toBe("Index prompt");
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("rejects index data when mtime mismatch > 1s", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("tier1-miss");
|
||||
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Real content" },
|
||||
uuid: "u-1",
|
||||
timestamp: "2025-10-15T10:00:00Z",
|
||||
},
|
||||
]);
|
||||
await fs.writeFile(path.join(projectDir, "sess-stale.jsonl"), content);
|
||||
|
||||
// Write an index with a very old modified timestamp (stale)
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "sess-stale",
|
||||
summary: "Stale index summary",
|
||||
firstPrompt: "Stale prompt",
|
||||
messageCount: 99,
|
||||
modified: "2020-01-01T00:00:00Z",
|
||||
created: "2020-01-01T00:00:00Z",
|
||||
},
|
||||
])
|
||||
);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
// Should NOT use index data (Tier 1 miss) — falls through to Tier 3
|
||||
expect(sessions[0].messageCount).toBe(1); // Actual parse count
|
||||
expect(sessions[0].firstPrompt).toBe("Real content");
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("skips Tier 1 when entry has no modified field", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("tier1-no-mod");
|
||||
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Real content" },
|
||||
uuid: "u-1",
|
||||
},
|
||||
]);
|
||||
await fs.writeFile(path.join(projectDir, "sess-nomod.jsonl"), content);
|
||||
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "sess-nomod",
|
||||
summary: "Index summary",
|
||||
messageCount: 99,
|
||||
// No modified field
|
||||
},
|
||||
])
|
||||
);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
// Falls through to Tier 3 parse
|
||||
expect(sessions[0].messageCount).toBe(1);
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("handles missing sessions-index.json", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("tier1-missing");
|
||||
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "No index" },
|
||||
uuid: "u-1",
|
||||
},
|
||||
]);
|
||||
await fs.writeFile(path.join(projectDir, "sess-noindex.jsonl"), content);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
expect(sessions[0].firstPrompt).toBe("No index");
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("handles corrupt sessions-index.json", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("tier1-corrupt");
|
||||
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Corrupt index" },
|
||||
uuid: "u-1",
|
||||
},
|
||||
]);
|
||||
await fs.writeFile(path.join(projectDir, "sess-corrupt.jsonl"), content);
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
"not valid json {"
|
||||
);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
expect(sessions[0].firstPrompt).toBe("Corrupt index");
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
|
||||
it("timestamps always from stat even on Tier 1 hit", async () => {
|
||||
const { tmpDir, projectDir, cleanup } = await makeTmpProject("tier1-stat-ts");
|
||||
|
||||
const content = makeJsonlContent([
|
||||
{
|
||||
type: "user",
|
||||
message: { role: "user", content: "Hello" },
|
||||
uuid: "u-1",
|
||||
},
|
||||
]);
|
||||
const filePath = path.join(projectDir, "sess-ts.jsonl");
|
||||
await fs.writeFile(filePath, content);
|
||||
|
||||
const stat = await fs.stat(filePath);
|
||||
const mtimeIso = new Date(stat.mtimeMs).toISOString();
|
||||
|
||||
await fs.writeFile(
|
||||
path.join(projectDir, "sessions-index.json"),
|
||||
makeIndex([
|
||||
{
|
||||
sessionId: "sess-ts",
|
||||
messageCount: 1,
|
||||
modified: mtimeIso,
|
||||
created: "1990-01-01T00:00:00Z",
|
||||
},
|
||||
])
|
||||
);
|
||||
|
||||
const sessions = await discoverSessions(tmpDir);
|
||||
expect(sessions).toHaveLength(1);
|
||||
|
||||
// created/modified should be from stat (recent), not from index's 1990 date
|
||||
const createdDate = new Date(sessions[0].created);
|
||||
const now = new Date();
|
||||
expect(now.getTime() - createdDate.getTime()).toBeLessThan(60_000);
|
||||
|
||||
await cleanup();
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
Reference in New Issue
Block a user