Commit Graph

3 Commits

Author SHA1 Message Date
teernisse
cc04772792 Core: eliminate double-parse in normalize_to_json, harden SSRF, optimize search
normalize_to_json now returns (Vec<u8>, serde_json::Value) — callers get
the parsed Value for free instead of re-parsing the bytes they just
produced. Eliminates a redundant serde_json::from_slice on every fetch,
sync, and external-ref resolution path.

Format detection switches from trial JSON parse to first-byte inspection
({/[ = JSON, else YAML) — roughly 300x faster for the common case.

SSRF protection expanded: block CGNAT range 100.64.0.0/10 (RFC 6598,
common cloud-internal SSRF target) and IPv6 unique-local fc00::/7.

Alias validation simplified: the regex ^[A-Za-z0-9][A-Za-z0-9._-]{0,63}$
already rejects path separators, traversal, and leading dots — remove
redundant explicit checks.

Search performance: pre-lowercase query terms once and pre-lowercase each
field once per endpoint (not once per term x field). Removes the
contains_term helper entirely. safe_snippet rewritten with char-based
search to avoid byte-position mismatches on multi-byte Unicode characters
(e.g. U+0130 which expands during lowercasing).
2026-02-12 16:56:12 -05:00
teernisse
4ac8659ebd Wave 7: Phase 2 features - sync --all, external refs, cross-alias discovery, CI/CD, reliability tests (bd-1ky, bd-1bp, bd-1rk, bd-1lj, bd-gvr, bd-1x5)
- Sync --all with async concurrency, per-host throttling, failure budgets, resumable execution
- External ref bundling at fetch time with origin tracking
- Cross-alias discovery (--all-aliases) for list and search commands
- CI/CD pipeline (.gitlab-ci.yml), cargo-deny config, Dockerfile, install script
- Reliability test suite: crash consistency (8 tests), lock contention (3 tests), property-based (4 tests)
- Criterion performance benchmarks (5 benchmarks)
- Bug fix: doctor --fix now repairs missing index.json when raw.json exists
- Bug fix: shared $ref references no longer incorrectly flagged as circular (refs.rs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 15:56:42 -05:00
teernisse
faa6281790 Wave 4: Full CLI command implementations - fetch, list, show, search, tags, aliases, doctor, cache lifecycle (bd-16o, bd-3km, bd-1dj, bd-acf, bd-3bl, bd-30a, bd-2s6, bd-1d4) 2026-02-12 14:25:30 -05:00