No `## Rejected Recommendations` section exists in `prd-swagger-cli.md`, so all suggestions below are net-new.

**1. Add a canonical ingest pipeline (JSON + YAML + gzip) with streaming limits**

Current plan is effectively JSON-first and buffer-oriented. In practice, many OpenAPI specs are YAML and/or compressed; forcing JSON-only ingestion creates adoption friction and unnecessary memory pressure on large specs. A canonical ingest stage (detect format, decode, parse, normalize) gives one robust path for URL/file/stdin and makes future behaviors (ref resolution, provenance) cleaner.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-1: Spec Fetching and Caching
-- ✓ Validate JSON is parseable before caching (no full OpenAPI structural validation)
+- ✓ Validate spec is parseable JSON or YAML before caching (no full OpenAPI structural validation)
+- ✓ Support compressed inputs (`.json.gz`, `.yaml.gz`, `Content-Encoding: gzip`)
+- ✓ Enforce `--max-bytes` during streaming download (fail before full buffering)
@@ OPTIONS:
+    --format    Input format hint: auto (default), json, yaml
@@ Cache directory layout
-    ├── raw.json      # Exact upstream bytes (lossless)
+    ├── raw.source    # Exact upstream bytes (json|yaml|gz as fetched)
+    ├── raw.json      # Canonical normalized JSON for pointers/show
@@ Core dependencies
+serde_yaml = "0.9"
+flate2 = "1.0"
```

**2. Add explicit fetch-time external-ref bundling (opt-in)**

You already correctly avoid external network fetches during `show --expand-refs`. The missing piece is specs that rely on external refs for core operations. Add an explicit fetch-time bundling mode with strict allowlists/limits. Default remains offline-safe and unchanged.

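
To make the allowlist/limit idea in recommendation 2 concrete, here is a minimal sketch of a policy check that could gate each external `$ref` fetch. The type and function names (`RefPolicy`, `RefDecision`, `check`) are hypothetical and not part of the PRD; only the three limits correspond to the proposed `--ref-allow-host`, `--ref-max-depth`, and `--ref-max-bytes` flags. It assumes the caller has already parsed the host out of the ref URL.

```rust
use std::collections::HashSet;

/// Hypothetical policy object mirroring the proposed fetch-time flags.
#[derive(Debug)]
struct RefPolicy {
    allow_hosts: HashSet<String>, // --ref-allow-host (repeatable), stored lowercase
    max_depth: u32,               // --ref-max-depth
    max_total_bytes: u64,         // --ref-max-bytes
}

/// Outcome of checking a single external ref against the policy.
#[derive(Debug, PartialEq)]
enum RefDecision {
    Allow,
    DenyHost,
    DenyDepth,
    DenyBytes,
}

impl RefPolicy {
    fn new(hosts: &[&str], max_depth: u32, max_total_bytes: u64) -> Self {
        RefPolicy {
            allow_hosts: hosts.iter().map(|h| h.to_ascii_lowercase()).collect(),
            max_depth,
            max_total_bytes,
        }
    }

    /// `depth` is the position in the ref chain (1 = ref found in the root spec).
    /// `bytes_so_far` is the running total for all external refs already fetched;
    /// `next_len` is the size of the candidate document.
    fn check(&self, host: &str, depth: u32, bytes_so_far: u64, next_len: u64) -> RefDecision {
        if !self.allow_hosts.contains(&host.to_ascii_lowercase()) {
            return RefDecision::DenyHost;
        }
        if depth > self.max_depth {
            return RefDecision::DenyDepth;
        }
        // saturating_add avoids overflow when a server reports an absurd length.
        if bytes_so_far.saturating_add(next_len) > self.max_total_bytes {
            return RefDecision::DenyBytes;
        }
        RefDecision::Allow
    }
}
```

Checking the host first means a disallowed host is rejected before any depth or byte accounting happens, which keeps the default (an empty allowlist) fully offline-safe.
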
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-1 OPTIONS:
+    --resolve-external-refs    Resolve and bundle external $ref targets at fetch-time (opt-in)
+    --ref-allow-host           Allowlist host for external ref resolution (repeatable)
+    --ref-max-depth            Max external ref chain depth (default: 3)
+    --ref-max-bytes            Total bytes cap for all external ref fetches
@@ FR-3 Decision rationale
-- External refs are NOT fetched (no network).
+- Query-time external refs are NOT fetched (no network).
+- Optional fetch-time bundling can resolve external refs under explicit policy flags.
+- Bundled snapshots preserve offline guarantees for all later commands.
```

**3. Add a global network policy switch for deterministic agent runs**

Manual `sync` is good, but a global network policy is better for CI/agent reproducibility. This prevents accidental network behavior in restricted environments and makes failure mode explicit.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ src/cli/mod.rs
+    /// Network policy: auto (default), offline, online-only
+    #[arg(long, global = true, default_value = "auto", value_parser = ["auto","offline","online-only"])]
+    pub network: String,
@@ Goals
+6. **Determinism:** Global network policy control for reproducible offline/CI execution
@@ Appendix B: Exit Code Reference
+| 15 | Offline mode blocked network operation | `OFFLINE_MODE` | No | Retry without `--network offline` |
```

**4. Strengthen integrity checks with raw-hash verification + pointer validation + safe auto-rebuild**

Current generation/index-hash integrity is good but incomplete for raw corruption and stale pointers. Add raw hash verification when raw is used, and validate every pointer at fetch/sync/doctor. Also allow safe index auto-rebuild from valid raw under lock to reduce operational toil.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Read protocol:
-  - Read index.json. Validate ALL THREE match meta.json:
+  - Read index.json. Validate ALL FOUR match meta.json:
     1. meta.index_version == index.index_version
     2. meta.generation == index.generation
     3. meta.index_hash == sha256(index.json bytes)
+    4. meta.content_hash == sha256(raw.json bytes) (commands that require raw)
+  - Validate every `operation_ptr` / `schema_ptr` resolves during fetch/sync; doctor re-checks all pointers.
+  - If index integrity fails but raw is valid: auto-rebuild index under alias lock (unless `--strict-integrity`).
@@ FR-9 doctor
+- ✓ Verify index pointers resolve to existing JSON nodes
+- ✓ Repair path prefers deterministic index rebuild before surfacing CACHE_INTEGRITY
```

**5. Make `sync --all` scalable and polite (bounded concurrency + host throttling + Retry-After)**

As alias count grows, sequential sync is slow; unconstrained parallel sync is abusive and unreliable. Add bounded concurrency plus per-host caps and Retry-After handling for robust large-team usage.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-7 OPTIONS:
+    --jobs        Parallel aliases to sync (default: 4, bounded)
+    --per-host    Max concurrent requests per host (default: 2)
@@ Decision rationale:
+- `sync --all` uses bounded concurrency with per-host throttling.
+- Retries honor `Retry-After` when present; otherwise exponential backoff + jitter.
+- Robot output reports partial failures per alias without aborting the entire run.
```

**6. Upgrade search to a precomputed token index + deterministic fuzzy fallback**

Current contains-based scoring will degrade on larger specs and misses common misspellings. A small postings index in `index.json` keeps search fast and makes ranking better without adding a heavy FTS dependency.

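
As a rough illustration of recommendation 6, the sketch below builds a token postings map and falls back to bounded edit distance when there is no exact token hit. All names (`tokenize`, `build_postings`, `edit_distance`, `lookup`) and the choice of `BTreeMap` are assumptions for illustration, not the PRD's data model. The fuzzy path here is a plain linear scan over the lexicon, so it shows the matching rule rather than the final lookup complexity.

```rust
use std::collections::BTreeMap;

/// Split on non-alphanumeric characters and lowercase each token.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_ascii_lowercase())
        .collect()
}

/// Map each token to the sorted, de-duplicated ids of the documents containing it.
/// BTreeMap keeps iteration order deterministic across runs.
fn build_postings(docs: &[&str]) -> BTreeMap<String, Vec<u32>> {
    let mut postings: BTreeMap<String, Vec<u32>> = BTreeMap::new();
    for (id, doc) in docs.iter().enumerate() {
        for tok in tokenize(doc) {
            let list = postings.entry(tok).or_default();
            // Ids arrive in ascending order, so checking the tail is enough to dedupe.
            if list.last() != Some(&(id as u32)) {
                list.push(id as u32);
            }
        }
    }
    postings
}

/// Classic Levenshtein distance using two rolling rows.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        let mut cur = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Exact token lookup first; otherwise collect postings of every token
/// within `max_edits` of the query term. Results are sorted and de-duplicated.
fn lookup(postings: &BTreeMap<String, Vec<u32>>, term: &str, max_edits: usize) -> Vec<u32> {
    let term = term.to_ascii_lowercase();
    if let Some(ids) = postings.get(&term) {
        return ids.clone();
    }
    let mut ids: Vec<u32> = postings
        .iter()
        .filter(|(tok, _)| edit_distance(tok.as_str(), &term) <= max_edits)
        .flat_map(|(_, ids)| ids.iter().copied())
        .collect();
    ids.sort_unstable();
    ids.dedup();
    ids
}
```

Passing `max_edits = 0` gives exact-only behavior, which would correspond to running without `--fuzzy`; sorted integer ids also make tie-breaking stable without any floating-point scores.
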
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-5 Acceptance Criteria
+- ✓ Use precomputed token postings for O(query_terms + matches) lookup
+- ✓ Optional typo-tolerant matching (`--fuzzy`) with bounded edit distance
+- ✓ Deterministic fixed-point scoring (integer), stable tie-breaking retained
@@ Command: swagger-cli search
+    --fuzzy        Enable bounded typo-tolerant token matching
+    --min-score    Filter low-relevance matches
@@ Data Models: SpecIndex
+    pub search_lexicon_version: u32,
+    pub search_postings: Vec,
```

**7. Add cross-alias discovery mode for `list` and `search`**

Single-alias operation is clean, but discovery across many APIs is a common real-world workflow. `--all-aliases` gives immediate utility to both humans and agents while preserving existing default behavior.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-2 Command: swagger-cli list
+    --all-aliases    Run query across all aliases; include `alias` per result
@@ FR-5 Command: swagger-cli search
+    --all-aliases    Search across all aliases; include `alias` per result
@@ Robot output (list/search)
+    "alias": "petstore",
@@ Open Questions Q3
-Decision: Single alias per query in MVP; revisit if requested
+Decision: Default remains single alias; `--all-aliases` added for explicit federated discovery.
```

**8. Add cache lifecycle management (`cache` command)**

You report disk usage but don’t provide lifecycle controls. Add prune/compact/stats to avoid long-term cache bloat and improve operational hygiene.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Functional Requirements
+### FR-10: Cache Lifecycle Management
+**Description:** Manage cache growth and retention.
+**Command:** swagger-cli cache [OPTIONS]
+OPTIONS:
+    --stats           Show per-alias and total cache usage
+    --prune-stale     Delete aliases older than stale threshold
+    --max-total-mb    Enforce global cache cap via LRU eviction
+    --robot           Machine-readable output
```

**9. Harden release/install supply chain (checksums + signatures)**

Current installer downloads and executes binaries without verification. Add checksum/signature artifacts and enforce verification in installer by default.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Release stage
+    - sha256sum swagger-cli-* > SHA256SUMS
+    - minisign -Sm SHA256SUMS
+    - upload SHA256SUMS and SHA256SUMS.minisig with binaries
@@ install.sh
-# Download binary
+# Download binary + checksum manifest + signature
+# Verify signature + checksum before chmod +x
+VERIFY="${VERIFY:-true}"
+if [ "$VERIFY" = "true" ]; then
+  # fail closed on verification mismatch
+fi
```

**10. Add adversarial reliability tests (fault injection + concurrency stress + property tests)**

The plan has good tests, but not enough proof for crash consistency and lock behavior under contention. Add targeted destructive tests to validate the core reliability claims.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Testing Strategy
+### Reliability Stress Tests
+- Fault-injection tests at each write step (before/after fsync, before/after rename) to prove recoverability.
+- Multi-process lock contention tests (N>=32) validating bounded lock timeout and no deadlocks.
+- Property-based tests for deterministic ordering, stable tie-breaking, and pointer validity.
@@ Success Metrics Phase 1
+- Crash-consistency claim is validated by automated fault-injection test suite (not only unit tests).
```

If you want, I can now consolidate these into a single full unified patch for `prd-swagger-cli.md` with section-by-section wording ready to paste.