No `## Rejected Recommendations` section exists in `prd-swagger-cli.md`, so all suggestions below are net-new.

**1. Add a canonical ingest pipeline (JSON + YAML + gzip) with streaming limits**
The current plan is effectively JSON-first and buffer-oriented. In practice, many OpenAPI specs are YAML and/or compressed; forcing JSON-only ingestion creates adoption friction and unnecessary memory pressure on large specs. A canonical ingest stage (detect format, decode, parse, normalize) gives one robust path for URL/file/stdin and makes future behaviors (ref resolution, provenance) cleaner.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-1: Spec Fetching and Caching
-- ✓ Validate JSON is parseable before caching (no full OpenAPI structural validation)
+- ✓ Validate spec is parseable JSON or YAML before caching (no full OpenAPI structural validation)
+- ✓ Support compressed inputs (`.json.gz`, `.yaml.gz`, `Content-Encoding: gzip`)
+- ✓ Enforce `--max-bytes` during streaming download (fail before full buffering)
@@ OPTIONS:
+    --format <FORMAT>    Input format hint: auto (default), json, yaml
@@ Cache directory layout
-    ├── raw.json      # Exact upstream bytes (lossless)
+    ├── raw.source    # Exact upstream bytes (json|yaml|gz as fetched)
+    ├── raw.json      # Canonical normalized JSON for pointers/show
@@ Core dependencies
+serde_yaml = "0.9"
+flate2 = "1.0"
```
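To make the detect step of the proposed ingest stage concrete, here is a minimal std-only sketch. The names `SpecFormat`, `sniff_format`, and `read_capped` are hypothetical and do not come from the PRD, and real decoding would still go through `flate2` and a YAML parser. The sketch recognizes gzip by its magic bytes, treats a leading `{` or `[` as JSON, and assumes YAML otherwise. The capped reader buffers at most `max_bytes + 1` bytes before failing, which is one way to honor `--max-bytes` without reading the whole input first.

```rust
use std::io::{self, Read};

/// Input formats the hypothetical ingest stage distinguishes.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SpecFormat {
    Gzip,
    Json,
    Yaml,
}

/// Guess the format from the leading bytes (the `--format auto` case).
pub fn sniff_format(bytes: &[u8]) -> SpecFormat {
    // gzip streams start with the magic bytes 0x1f 0x8b.
    if bytes.starts_with(&[0x1f, 0x8b]) {
        return SpecFormat::Gzip;
    }
    // Skip an optional UTF-8 BOM, then look at the first non-whitespace byte.
    let body = bytes.strip_prefix(&[0xef_u8, 0xbb, 0xbf]).unwrap_or(bytes);
    match body.iter().copied().find(|b| !b.is_ascii_whitespace()) {
        Some(b'{') | Some(b'[') => SpecFormat::Json,
        // Assumption: anything else is handed to the YAML parser.
        _ => SpecFormat::Yaml,
    }
}

/// Read at most `max_bytes` from `reader`; fail instead of buffering more.
pub fn read_capped<R: Read>(reader: R, max_bytes: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Allow one extra byte so an oversized input can be detected.
    reader
        .take(max_bytes.saturating_add(1))
        .read_to_end(&mut buf)?;
    if buf.len() as u64 > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "input exceeds --max-bytes",
        ));
    }
    Ok(buf)
}
```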

**2. Add explicit fetch-time external-ref bundling (opt-in)**
You already correctly avoid external network fetches during `show --expand-refs`. The missing piece is specs that rely on external refs for core operations. Add an explicit fetch-time bundling mode with strict allowlists/limits. Default remains offline-safe and unchanged.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-1 OPTIONS:
+    --resolve-external-refs    Resolve and bundle external $ref targets at fetch-time (opt-in)
+    --ref-allow-host <HOST>    Allowlist host for external ref resolution (repeatable)
+    --ref-max-depth <N>        Max external ref chain depth (default: 3)
+    --ref-max-bytes <N>        Total bytes cap for all external ref fetches
@@ FR-3 Decision rationale
-- External refs are NOT fetched (no network).
+- Query-time external refs are NOT fetched (no network).
+- Optional fetch-time bundling can resolve external refs under explicit policy flags.
+- Bundled snapshots preserve offline guarantees for all later commands.
```
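The policy flags above could be enforced by a single check before each external fetch. The following is a sketch only: `RefPolicy`, `RefDenied`, and `check` are hypothetical names, and it assumes the allowlist is stored lowercase and that the caller tracks the current chain depth and the bytes fetched so far.

```rust
use std::collections::HashSet;

/// Reasons an external $ref fetch is refused (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
pub enum RefDenied {
    HostNotAllowed,
    TooDeep,
    ByteBudgetExceeded,
}

/// Mirrors --ref-allow-host, --ref-max-depth and --ref-max-bytes.
pub struct RefPolicy {
    /// Assumption: hosts are stored in lowercase.
    pub allow_hosts: HashSet<String>,
    pub max_depth: u32,
    pub max_total_bytes: u64,
}

impl RefPolicy {
    /// Decide whether one more external fetch is permitted.
    pub fn check(
        &self,
        host: &str,
        depth: u32,
        fetched_so_far: u64,
        next_len: u64,
    ) -> Result<(), RefDenied> {
        if !self.allow_hosts.contains(&host.to_ascii_lowercase()) {
            return Err(RefDenied::HostNotAllowed);
        }
        if depth > self.max_depth {
            return Err(RefDenied::TooDeep);
        }
        // The cap applies to the sum of all external fetches, not per file.
        if fetched_so_far.saturating_add(next_len) > self.max_total_bytes {
            return Err(RefDenied::ByteBudgetExceeded);
        }
        Ok(())
    }
}
```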

**3. Add a global network policy switch for deterministic agent runs**
Manual `sync` is good, but a global network policy is better for CI/agent reproducibility. This prevents accidental network behavior in restricted environments and makes the failure mode explicit.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ src/cli/mod.rs
+    /// Network policy: auto (default), offline, online-only
+    #[arg(long, global = true, default_value = "auto", value_parser = ["auto", "offline", "online-only"])]
+    pub network: String,
@@ Goals
+6. **Determinism:** Global network policy control for reproducible offline/CI execution
@@ Appendix B: Exit Code Reference
+| 15 | Offline mode blocked network operation | `OFFLINE_MODE` | No | Retry without `--network offline` |
```
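One way to keep the policy check in a single place is to parse the flag into an enum and call a guard before any network operation. This is a sketch, not the PRD's design: `NetworkPolicy`, `guard_network`, and `EXIT_OFFLINE_MODE` are hypothetical names, and because the proposal does not define `online-only` precisely, the sketch simply treats it as "network permitted".

```rust
use std::str::FromStr;

/// Typed form of the proposed `--network` flag (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetworkPolicy {
    Auto,
    Offline,
    OnlineOnly,
}

impl FromStr for NetworkPolicy {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(NetworkPolicy::Auto),
            "offline" => Ok(NetworkPolicy::Offline),
            "online-only" => Ok(NetworkPolicy::OnlineOnly),
            other => Err(format!("unknown network policy: {}", other)),
        }
    }
}

/// Exit code proposed for `OFFLINE_MODE` in Appendix B.
pub const EXIT_OFFLINE_MODE: i32 = 15;

/// Call before any network operation; the error is the exit code to use.
pub fn guard_network(policy: NetworkPolicy) -> Result<(), i32> {
    match policy {
        NetworkPolicy::Offline => Err(EXIT_OFFLINE_MODE),
        // Assumption: both remaining modes permit network access.
        NetworkPolicy::Auto | NetworkPolicy::OnlineOnly => Ok(()),
    }
}
```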

**4. Strengthen integrity checks with raw-hash verification + pointer validation + safe auto-rebuild**
Current generation/index-hash integrity is good but incomplete for raw corruption and stale pointers. Add raw hash verification when raw is used, and validate every pointer at fetch/sync/doctor. Also allow safe index auto-rebuild from valid raw under lock to reduce operational toil.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Read protocol:
-   - Read index.json. Validate ALL THREE match meta.json:
+   - Read index.json. Validate ALL FOUR match meta.json:
      1. meta.index_version == index.index_version
      2. meta.generation == index.generation
      3. meta.index_hash == sha256(index.json bytes)
+     4. meta.content_hash == sha256(raw.json bytes) (commands that require raw)
+   - Validate every `operation_ptr` / `schema_ptr` resolves during fetch/sync; doctor re-checks all pointers.
+   - If index integrity fails but raw is valid: auto-rebuild index under alias lock (unless `--strict-integrity`).
@@ FR-9 doctor
+- ✓ Verify index pointers resolve to existing JSON nodes
+- ✓ Repair path prefers deterministic index rebuild before surfacing CACHE_INTEGRITY
```
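Pointer validation can be sketched as plain RFC 6901 resolution over the parsed document. The `Json` enum below is a deliberately tiny stand-in for a real parser's value type, and `resolve` and `dangling_pointers` are hypothetical names; the assumption is that `operation_ptr` and `schema_ptr` are JSON Pointers into `raw.json`.

```rust
use std::collections::BTreeMap;

/// Minimal JSON tree, standing in for a real parser's value type.
#[derive(Debug, PartialEq)]
pub enum Json {
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Accept only canonical array indices ("0", "7", "12"), as RFC 6901 does.
fn array_index(token: &str) -> Option<usize> {
    let canonical = token == "0"
        || (!token.starts_with('0') && token.bytes().all(|b| b.is_ascii_digit()));
    if canonical { token.parse().ok() } else { None }
}

/// Resolve an RFC 6901 JSON Pointer such as "/paths/~1pets/get".
pub fn resolve<'a>(root: &'a Json, pointer: &str) -> Option<&'a Json> {
    if pointer.is_empty() {
        return Some(root); // the empty pointer names the whole document
    }
    // Any non-empty pointer must start with '/'.
    let rest = pointer.strip_prefix('/')?;
    let mut node = root;
    for raw in rest.split('/') {
        // Unescape "~1" before "~0", in the order RFC 6901 prescribes.
        let token = raw.replace("~1", "/").replace("~0", "~");
        node = match node {
            Json::Object(map) => map.get(&token)?,
            Json::Array(items) => items.get(array_index(&token)?)?,
            Json::Str(_) => return None,
        };
    }
    Some(node)
}

/// Return every pointer that no longer resolves (what doctor would report).
pub fn dangling_pointers(root: &Json, pointers: &[&str]) -> Vec<String> {
    pointers
        .iter()
        .filter(|p| resolve(root, p).is_none())
        .map(|p| p.to_string())
        .collect()
}
```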

**5. Make `sync --all` scalable and polite (bounded concurrency + host throttling + Retry-After)**
As alias count grows, sequential sync is slow; unconstrained parallel sync is abusive and unreliable. Add bounded concurrency plus per-host caps and Retry-After handling for robust large-team usage.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-7 OPTIONS:
+    --jobs <N>        Parallel aliases to sync (default: 4, bounded)
+    --per-host <N>    Max concurrent requests per host (default: 2)
@@ Decision rationale:
+- `sync --all` uses bounded concurrency with per-host throttling.
+- Retries honor `Retry-After` when present; otherwise exponential backoff + jitter.
+- Robot output reports partial failures per alias without aborting the entire run.
```
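The retry rule in the rationale can be written as one pure function. This is an illustrative sketch: `retry_delay`, the 500 ms base, and the 30 s cap are assumptions, not values from the PRD. It handles only the delay-in-seconds form of `Retry-After` (the HTTP-date form falls through to backoff), and it takes the jitter as a parameter in `[0, 1]` so the result stays deterministic and testable; a caller would supply a random value.

```rust
use std::time::Duration;

/// Compute how long to wait before retry number `attempt` (0-based).
pub fn retry_delay(attempt: u32, retry_after: Option<&str>, jitter: f64) -> Duration {
    // Honor Retry-After when it is a delay in whole seconds.
    if let Some(secs) = retry_after.and_then(|v| v.trim().parse::<u64>().ok()) {
        return Duration::from_secs(secs);
    }
    // Otherwise: base * 2^attempt, capped, then scaled into [50%, 100%].
    let base_ms: u64 = 500; // assumed base delay
    let cap_ms: u64 = 30_000; // assumed upper bound
    let exp_ms = base_ms
        .saturating_mul(1u64 << attempt.min(16))
        .min(cap_ms);
    let factor = 0.5 + 0.5 * jitter.clamp(0.0, 1.0);
    Duration::from_millis((exp_ms as f64 * factor) as u64)
}
```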

**6. Upgrade search to a precomputed token index + deterministic fuzzy fallback**
Current contains-based scoring will degrade on larger specs and misses common misspellings. A small postings index in `index.json` keeps search fast and makes ranking better without adding a heavy FTS dependency.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-5 Acceptance Criteria
+- ✓ Use precomputed token postings for O(query_terms + matches) lookup
+- ✓ Optional typo-tolerant matching (`--fuzzy`) with bounded edit distance
+- ✓ Deterministic fixed-point scoring (integer), stable tie-breaking retained
@@ Command: swagger-cli search
+    --fuzzy            Enable bounded typo-tolerant token matching
+    --min-score <N>    Filter low-relevance matches
@@ Data Models: SpecIndex
+    pub search_lexicon_version: u32,
+    pub search_postings: Vec<SearchPosting>,
```
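A postings index with a fuzzy fallback fits in a few dozen lines of std-only Rust. The sketch below is an assumption about shape, not the PRD's `SearchPosting` layout: `tokenize`, `build_postings`, `edit_distance`, and `lookup` are hypothetical names, and documents are identified by their position in the input slice. Exact lookups are a single map access; the fuzzy path scans the lexicon, so it is linear in the number of distinct tokens rather than in the number of endpoints. Ordered maps and sets keep the output deterministic.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Lowercase alphanumeric tokens of `text`.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Map each token to the sorted ids of the documents containing it.
pub fn build_postings(docs: &[&str]) -> BTreeMap<String, BTreeSet<usize>> {
    let mut postings: BTreeMap<String, BTreeSet<usize>> = BTreeMap::new();
    for (id, doc) in docs.iter().enumerate() {
        for token in tokenize(doc) {
            postings.entry(token).or_default().insert(id);
        }
    }
    postings
}

/// Levenshtein distance using two rows of the DP table.
pub fn edit_distance(a: &str, b: &str) -> usize {
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let subst = prev[j] + if ca == *cb { 0 } else { 1 };
            cur.push(subst.min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Exact match first; otherwise every token within `max_edits` of `term`.
pub fn lookup(
    postings: &BTreeMap<String, BTreeSet<usize>>,
    term: &str,
    max_edits: usize,
) -> BTreeSet<usize> {
    let term = term.to_lowercase();
    if let Some(ids) = postings.get(&term) {
        return ids.clone();
    }
    postings
        .iter()
        .filter(|(token, _)| edit_distance(token.as_str(), &term) <= max_edits)
        .flat_map(|(_, ids)| ids.iter().copied())
        .collect()
}
```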

**7. Add cross-alias discovery mode for `list` and `search`**
Single-alias operation is clean, but discovery across many APIs is a common real-world workflow. `--all-aliases` gives immediate utility to both humans and agents while preserving existing default behavior.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-2 Command: swagger-cli list
+    --all-aliases    Run query across all aliases; include `alias` per result
@@ FR-5 Command: swagger-cli search
+    --all-aliases    Search across all aliases; include `alias` per result
@@ Robot output (list/search)
+    "alias": "petstore",
@@ Open Questions Q3
-Decision: Single alias per query in MVP; revisit if requested
+Decision: Default remains single alias; `--all-aliases` added for explicit federated discovery.
```
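Federated results need a total order so that robot output stays stable across runs. A minimal sketch, assuming a hypothetical `Hit` record with an integer score: merge the per-alias result lists and sort by score descending, then by alias and path as tie-breakers.

```rust
/// One search or list result tagged with its alias (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Hit {
    pub alias: String,
    pub path: String,
    pub score: u32,
}

/// Merge per-alias results into one deterministically ordered list.
pub fn merge_hits(per_alias: Vec<Vec<Hit>>) -> Vec<Hit> {
    let mut all: Vec<Hit> = per_alias.into_iter().flatten().collect();
    // Highest score first; alias, then path, break ties.
    all.sort_by(|a, b| {
        b.score
            .cmp(&a.score)
            .then_with(|| a.alias.cmp(&b.alias))
            .then_with(|| a.path.cmp(&b.path))
    });
    all
}
```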

**8. Add cache lifecycle management (`cache` command)**
You report disk usage but don’t provide lifecycle controls. Add prune/compact/stats to avoid long-term cache bloat and improve operational hygiene.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Functional Requirements
+### FR-10: Cache Lifecycle Management
+**Description:** Manage cache growth and retention.
+**Command:** swagger-cli cache [OPTIONS]
+OPTIONS:
+    --stats               Show per-alias and total cache usage
+    --prune-stale         Delete aliases older than stale threshold
+    --max-total-mb <N>    Enforce global cache cap via LRU eviction
+    --robot               Machine-readable output
```
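The LRU cap can be computed as a plan before anything is deleted, which also makes it easy to print in `--robot` mode. This sketch assumes each alias records a size and a last-access timestamp; `CacheEntry` and `plan_eviction` are hypothetical names, and the cap is expressed in bytes here rather than in megabytes as on the command line.

```rust
/// Per-alias cache accounting (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CacheEntry {
    pub alias: String,
    pub size_bytes: u64,
    /// Assumption: seconds since the epoch of the last read.
    pub last_accessed: u64,
}

/// Aliases to evict, least recently used first, until the total fits the cap.
pub fn plan_eviction(entries: &[CacheEntry], max_total_bytes: u64) -> Vec<String> {
    let mut total: u64 = entries.iter().map(|e| e.size_bytes).sum();
    let mut by_age: Vec<&CacheEntry> = entries.iter().collect();
    // Oldest access first; alias name breaks ties so the plan is deterministic.
    by_age.sort_by(|a, b| {
        a.last_accessed
            .cmp(&b.last_accessed)
            .then_with(|| a.alias.cmp(&b.alias))
    });
    let mut evict = Vec::new();
    for entry in by_age {
        if total <= max_total_bytes {
            break;
        }
        total -= entry.size_bytes;
        evict.push(entry.alias.clone());
    }
    evict
}
```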

**9. Harden release/install supply chain (checksums + signatures)**
Current installer downloads and executes binaries without verification. Add checksum/signature artifacts and enforce verification in installer by default.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Release stage
+  - sha256sum swagger-cli-* > SHA256SUMS
+  - minisign -Sm SHA256SUMS
+  - upload SHA256SUMS and SHA256SUMS.minisig with binaries
@@ install.sh
-# Download binary
+# Download binary + checksum manifest + signature
+# Verify signature + checksum before chmod +x
+VERIFY="${VERIFY:-true}"
+if [ "$VERIFY" = "true" ]; then
+  # fail closed on verification mismatch
+fi
```
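The checksum half of the verification amounts to a manifest lookup plus a comparison. The sketch below shows that logic in Rust for consistency with the other examples, although the installer itself is a shell script. `expected_digest` and `verify_download` are hypothetical names, the actual SHA-256 digest is assumed to be computed elsewhere and passed in as hex, and the signature check on `SHA256SUMS` must already have succeeded before the manifest is trusted. Both failure cases return an error, so the caller fails closed.

```rust
/// Find the expected hex digest for `file_name` in `sha256sum` output.
pub fn expected_digest<'a>(manifest: &'a str, file_name: &str) -> Option<&'a str> {
    manifest.lines().find_map(|line| {
        // Each line is "<digest>  <name>" (text mode) or "<digest> *<name>" (binary mode).
        let mut parts = line.splitn(2, ' ');
        let digest = parts.next()?;
        let rest = parts.next()?;
        let name = rest.strip_prefix(' ').or_else(|| rest.strip_prefix('*'))?;
        if name == file_name { Some(digest) } else { None }
    })
}

/// Fail closed: a missing entry and a mismatching digest are both errors.
pub fn verify_download(manifest: &str, file_name: &str, actual_hex: &str) -> Result<(), String> {
    match expected_digest(manifest, file_name) {
        Some(expected) if expected.eq_ignore_ascii_case(actual_hex) => Ok(()),
        Some(_) => Err(format!("checksum mismatch for {}", file_name)),
        None => Err(format!("{} is not listed in the manifest", file_name)),
    }
}
```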

**10. Add adversarial reliability tests (fault injection + concurrency stress + property tests)**
The plan has good tests, but not enough proof for crash consistency and lock behavior under contention. Add targeted destructive tests to validate the core reliability claims.

```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Testing Strategy
+### Reliability Stress Tests
+- Fault-injection tests at each write step (before/after fsync, before/after rename) to prove recoverability.
+- Multi-process lock contention tests (N>=32) validating bounded lock timeout and no deadlocks.
+- Property-based tests for deterministic ordering, stable tie-breaking, and pointer validity.
@@ Success Metrics Phase 1
+- Crash-consistency claim is validated by automated fault-injection test suite (not only unit tests).
```
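The fault-injection idea can be prototyped against an in-memory model before touching a real filesystem. This is a simplified sketch under stated assumptions: `Fault`, `FakeDir`, `commit`, and `recover` are hypothetical names, rename is modelled as atomic, and fsync is not modelled at all. The property to check is that after a crash at any step followed by recovery, the target holds either the complete old bytes or the complete new bytes and no temp file remains.

```rust
use std::collections::HashMap;

/// Points at which the simulated process crashes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fault {
    None,
    BeforeTempWrite,
    AfterTempWrite,
    AfterRename,
}

/// In-memory stand-in for a cache directory: file name -> contents.
pub type FakeDir = HashMap<String, Vec<u8>>;

/// Write-temp-then-rename commit that "crashes" at the chosen step.
pub fn commit(
    dir: &mut FakeDir,
    name: &str,
    data: &[u8],
    fault: Fault,
) -> Result<(), &'static str> {
    if fault == Fault::BeforeTempWrite {
        return Err("crash before temp write");
    }
    let tmp = format!("{}.tmp", name);
    dir.insert(tmp.clone(), data.to_vec());
    if fault == Fault::AfterTempWrite {
        return Err("crash after temp write");
    }
    // Rename is modelled as atomic: the temp entry replaces the target.
    let bytes = dir.remove(&tmp).expect("temp file was just written");
    dir.insert(name.to_string(), bytes);
    if fault == Fault::AfterRename {
        return Err("crash after rename");
    }
    Ok(())
}

/// Recovery step: discard leftover temp files.
pub fn recover(dir: &mut FakeDir) {
    dir.retain(|name, _| !name.ends_with(".tmp"));
}
```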

If you want, I can now consolidate these into a single full unified patch for `prd-swagger-cli.md` with section-by-section wording ready to paste.