bd-a7e: Bootstrap Rust project and directory structure

teernisse
2026-02-12 12:33:05 -05:00
commit 24739cb270
18 changed files with 8024 additions and 0 deletions

@@ -0,0 +1,171 @@
No `## Rejected Recommendations` section exists in `prd-swagger-cli.md`, so all suggestions below are net-new.
**1. Add a canonical ingest pipeline (JSON + YAML + gzip) with streaming limits**
The current plan is effectively JSON-first and buffer-oriented. In practice, many OpenAPI specs are YAML and/or compressed; forcing JSON-only ingestion creates adoption friction and unnecessary memory pressure on large specs. A canonical ingest stage (detect format, decode, parse, normalize) gives one robust path for URL/file/stdin and makes future behaviors (ref resolution, provenance) cleaner.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-1: Spec Fetching and Caching
-- ✓ Validate JSON is parseable before caching (no full OpenAPI structural validation)
+- ✓ Validate spec is parseable JSON or YAML before caching (no full OpenAPI structural validation)
+- ✓ Support compressed inputs (`.json.gz`, `.yaml.gz`, `Content-Encoding: gzip`)
+- ✓ Enforce `--max-bytes` during streaming download (fail before full buffering)
@@ OPTIONS:
+ --format <FORMAT> Input format hint: auto (default), json, yaml
@@ Cache directory layout
- ├── raw.json # Exact upstream bytes (lossless)
+ ├── raw.source # Exact upstream bytes (json|yaml|gz as fetched)
+ ├── raw.json # Canonical normalized JSON for pointers/show
@@ Core dependencies
+serde_yaml = "0.9"
+flate2 = "1.0"
```
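The detect step and the `--max-bytes` cap could be sketched as follows. This is a minimal illustration using only the standard library, not the project's implementation: `InputFormat`, `sniff_format`, and `read_capped` are hypothetical names, and the "anything that is not JSON is treated as YAML" fallback is an assumption.

```rust
use std::io::{self, Read};

/// Hypothetical input kinds for the ingest stage (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputFormat {
    Gzip,
    Json,
    Yaml,
}

/// Guess the format from the leading bytes. Returns `None` for empty or
/// whitespace-only input.
pub fn sniff_format(bytes: &[u8]) -> Option<InputFormat> {
    // gzip members start with the magic bytes 0x1f 0x8b.
    if bytes.starts_with(&[0x1f, 0x8b]) {
        return Some(InputFormat::Gzip);
    }
    // Skip a UTF-8 byte order mark if one is present.
    let body = bytes.strip_prefix(&[0xEFu8, 0xBB, 0xBF]).unwrap_or(bytes);
    let first = body.iter().copied().find(|b| !b.is_ascii_whitespace())?;
    match first {
        // A document that opens with `{` or `[` is handed to the JSON parser.
        b'{' | b'[' => Some(InputFormat::Json),
        // Assumption: everything else is attempted as YAML.
        _ => Some(InputFormat::Yaml),
    }
}

/// Read at most `max_bytes` from `reader`, failing as soon as the cap is
/// exceeded instead of buffering an oversized body in full.
pub fn read_capped<R: Read>(reader: R, max_bytes: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // Allow one byte past the cap so an oversized input is detectable.
    reader
        .take(max_bytes.saturating_add(1))
        .read_to_end(&mut buf)?;
    if buf.len() as u64 > max_bytes {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "input exceeds max bytes",
        ));
    }
    Ok(buf)
}
```

A gzip result would be decoded first (for example with `flate2`) and the decoded bytes sniffed again to choose between the JSON and YAML parsers.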
**2. Add explicit fetch-time external-ref bundling (opt-in)**
You already correctly avoid external network fetches during `show --expand-refs`. What is missing is support for specs that rely on external refs for core operations. Add an explicit fetch-time bundling mode with strict allowlists/limits. The default remains offline-safe and unchanged.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-1 OPTIONS:
+ --resolve-external-refs Resolve and bundle external $ref targets at fetch-time (opt-in)
+ --ref-allow-host <HOST> Allowlist host for external ref resolution (repeatable)
+ --ref-max-depth <N> Max external ref chain depth (default: 3)
+ --ref-max-bytes <N> Total bytes cap for all external ref fetches
@@ FR-3 Decision rationale
-- External refs are NOT fetched (no network).
+- Query-time external refs are NOT fetched (no network).
+- Optional fetch-time bundling can resolve external refs under explicit policy flags.
+- Bundled snapshots preserve offline guarantees for all later commands.
```
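One way to picture the policy gate behind these flags is a single check that runs before every external ref fetch. The sketch below is hypothetical: `RefPolicy`, `RefDenied`, and `check` are illustrative names, and the exact comparison rules (case-insensitive host match, depth counted from 1) are assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical policy assembled from `--ref-allow-host`, `--ref-max-depth`
/// and `--ref-max-bytes`.
pub struct RefPolicy {
    pub allowed_hosts: HashSet<String>,
    pub max_depth: u32,
    pub max_total_bytes: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RefDenied {
    HostNotAllowed,
    TooDeep,
    ByteBudgetExceeded,
}

impl RefPolicy {
    /// Decide whether one more external ref may be fetched.
    /// `depth` is the position in the ref chain (assumed to start at 1);
    /// `bytes_so_far` is the total already fetched for external refs.
    pub fn check(
        &self,
        host: &str,
        depth: u32,
        bytes_so_far: u64,
        next_bytes: u64,
    ) -> Result<(), RefDenied> {
        // Host names are compared case-insensitively (an assumption); the
        // allowlist is expected to hold lowercase entries.
        if !self.allowed_hosts.contains(host.to_ascii_lowercase().as_str()) {
            return Err(RefDenied::HostNotAllowed);
        }
        if depth > self.max_depth {
            return Err(RefDenied::TooDeep);
        }
        if bytes_so_far.saturating_add(next_bytes) > self.max_total_bytes {
            return Err(RefDenied::ByteBudgetExceeded);
        }
        Ok(())
    }
}
```

Keeping the decision in one pure function makes the deny paths easy to unit test without any network access.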
**3. Add a global network policy switch for deterministic agent runs**
Manual `sync` is good, but a global network policy is better for CI/agent reproducibility. This prevents accidental network behavior in restricted environments and makes the failure mode explicit.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ src/cli/mod.rs
+ /// Network policy: auto (default), offline, online-only
+ #[arg(long, global = true, default_value = "auto", value_parser = ["auto","offline","online-only"])]
+ pub network: String,
@@ Goals
+6. **Determinism:** Global network policy control for reproducible offline/CI execution
@@ Appendix B: Exit Code Reference
+| 15 | Offline mode blocked network operation | `OFFLINE_MODE` | No | Retry without `--network offline` |
```
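The diff above stores the policy as a `String`; a typed variant is one possible refinement. The sketch below is an assumption-laden illustration: `NetworkPolicy`, `guard_network`, and `EXIT_OFFLINE_MODE` are hypothetical names, and since the semantics of `online-only` are not spelled out in the proposal, it is treated here as simply permitting network access.

```rust
use std::str::FromStr;

/// Hypothetical typed form of the `--network` flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetworkPolicy {
    Auto,
    Offline,
    OnlineOnly,
}

impl FromStr for NetworkPolicy {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(NetworkPolicy::Auto),
            "offline" => Ok(NetworkPolicy::Offline),
            "online-only" => Ok(NetworkPolicy::OnlineOnly),
            other => Err(format!("unknown network policy: {other}")),
        }
    }
}

/// Exit code proposed for `OFFLINE_MODE` in the exit code table.
pub const EXIT_OFFLINE_MODE: i32 = 15;

/// Call before any network operation; returns the exit code to surface
/// when the policy forbids it.
pub fn guard_network(policy: NetworkPolicy) -> Result<(), i32> {
    match policy {
        NetworkPolicy::Offline => Err(EXIT_OFFLINE_MODE),
        // Assumption: both remaining modes allow the request to proceed.
        NetworkPolicy::Auto | NetworkPolicy::OnlineOnly => Ok(()),
    }
}
```

Routing every fetch through one guard keeps the offline guarantee in a single place rather than scattered across commands.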
**4. Strengthen integrity checks with raw-hash verification + pointer validation + safe auto-rebuild**
The current generation/index-hash integrity check is good but does not cover raw corruption or stale pointers. Add raw hash verification when raw is used, and validate every pointer at fetch/sync/doctor. Also allow safe index auto-rebuild from valid raw under lock to reduce operational toil.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Read protocol:
- - Read index.json. Validate ALL THREE match meta.json:
+ - Read index.json. Validate ALL FOUR match meta.json:
1. meta.index_version == index.index_version
2. meta.generation == index.generation
3. meta.index_hash == sha256(index.json bytes)
+ 4. meta.content_hash == sha256(raw.json bytes) (commands that require raw)
+ - Validate every `operation_ptr` / `schema_ptr` resolves during fetch/sync; doctor re-checks all pointers.
+ - If index integrity fails but raw is valid: auto-rebuild index under alias lock (unless `--strict-integrity`).
@@ FR-9 doctor
+- ✓ Verify index pointers resolve to existing JSON nodes
+- ✓ Repair path prefers deterministic index rebuild before surfacing CACHE_INTEGRITY
```
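The read-protocol decision can be expressed as a small pure function. The following is only a sketch: `Meta`, `Observed`, `Integrity`, and `check_integrity` are hypothetical names, hashes are assumed to be computed elsewhere and passed in as hex strings, and a rebuild is only offered when the raw hash was actually checked and matched.

```rust
/// Values recorded in meta.json (field names follow the diff above).
pub struct Meta {
    pub index_version: u32,
    pub generation: u64,
    pub index_hash: String,
    pub content_hash: String,
}

/// Values observed on disk. `raw_hash` is `None` when the command does not
/// need raw.json and the hash was therefore not computed.
pub struct Observed {
    pub index_version: u32,
    pub generation: u64,
    pub index_hash: String,
    pub raw_hash: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Integrity {
    Ok,
    RebuildIndex,
    Fail,
}

/// `strict` models the proposed `--strict-integrity` flag.
pub fn check_integrity(meta: &Meta, seen: &Observed, strict: bool) -> Integrity {
    let index_ok = meta.index_version == seen.index_version
        && meta.generation == seen.generation
        && meta.index_hash == seen.index_hash;
    let raw_checked = seen.raw_hash.is_some();
    let raw_valid = seen.raw_hash.as_deref() == Some(meta.content_hash.as_str());
    match (index_ok, raw_checked, raw_valid) {
        // Index matches, and raw either was not needed or also matches.
        (true, false, _) | (true, true, true) => Integrity::Ok,
        // Index is bad but raw is proven good: rebuild unless strict.
        (false, true, true) if !strict => Integrity::RebuildIndex,
        // Everything else surfaces as an integrity failure.
        _ => Integrity::Fail,
    }
}
```

In this sketch the caller would take the alias lock before acting on `RebuildIndex`, as the proposal describes.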
**5. Make `sync --all` scalable and polite (bounded concurrency + host throttling + Retry-After)**
As alias count grows, sequential sync is slow; unconstrained parallel sync is abusive and unreliable. Add bounded concurrency plus per-host caps and Retry-After handling for robust large-team usage.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-7 OPTIONS:
+ --jobs <N> Parallel aliases to sync (default: 4, bounded)
+ --per-host <N> Max concurrent requests per host (default: 2)
@@ Decision rationale:
+- `sync --all` uses bounded concurrency with per-host throttling.
+- Retries honor `Retry-After` when present; otherwise exponential backoff + jitter.
+- Robot output reports partial failures per alias without aborting the entire run.
```
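The retry rule (honor `Retry-After`, otherwise exponential backoff plus jitter) might look like the sketch below. The base delay, cap, and function names are assumptions chosen for illustration, the jitter value is supplied by the caller so the function stays deterministic, and only the delay-in-seconds form of `Retry-After` is handled (the header may also carry an HTTP date).

```rust
use std::time::Duration;

/// Parse the delay-seconds form of a `Retry-After` header value.
/// The HTTP-date form is not handled in this sketch and yields `None`.
pub fn parse_retry_after(value: &str) -> Option<u64> {
    value.trim().parse().ok()
}

/// Compute how long to wait before retry number `attempt` (0-based).
/// `jitter_ms` is a caller-supplied random value; it is reduced modulo the
/// base delay so jitter never exceeds one base interval.
pub fn retry_delay(attempt: u32, retry_after_secs: Option<u64>, jitter_ms: u64) -> Duration {
    const BASE_MS: u64 = 500; // assumed base delay
    const CAP_MS: u64 = 30_000; // assumed upper bound on the backoff

    // A server-provided delay takes precedence over local backoff.
    if let Some(secs) = retry_after_secs {
        return Duration::from_secs(secs);
    }
    // Exponential growth: 500 ms, 1 s, 2 s, ... capped at CAP_MS.
    let backoff = BASE_MS.saturating_mul(1u64 << attempt.min(16));
    Duration::from_millis(backoff.min(CAP_MS) + jitter_ms % BASE_MS)
}
```

Passing jitter in as an argument, rather than sampling inside the function, keeps the delay logic testable; the caller would draw the value from whatever random source the project already uses.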
**6. Upgrade search to a precomputed token index + deterministic fuzzy fallback**
Current contains-based scoring will degrade on larger specs and misses common misspellings. A small postings index in `index.json` keeps search fast and makes ranking better without adding a heavy FTS dependency.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-5 Acceptance Criteria
+- ✓ Use precomputed token postings for O(query_terms + matches) lookup
+- ✓ Optional typo-tolerant matching (`--fuzzy`) with bounded edit distance
+- ✓ Deterministic fixed-point scoring (integer), stable tie-breaking retained
@@ Command: swagger-cli search
+ --fuzzy Enable bounded typo-tolerant token matching
+ --min-score <N> Filter low-relevance matches
@@ Data Models: SpecIndex
+ pub search_lexicon_version: u32,
+ pub search_postings: Vec<SearchPosting>,
```
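A rough sketch of a token postings map with a bounded edit-distance fallback follows. It is not the proposed `SearchPosting` layout: the `BTreeMap` shape, the ASCII tokenizer, and the function names are illustrative assumptions, and scoring is left out entirely.

```rust
use std::collections::BTreeMap;

/// Lowercase ASCII-alphanumeric tokens; everything else is a separator.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_ascii_lowercase())
        .collect()
}

/// Map each token to the ascending list of document ids containing it.
/// A BTreeMap keeps iteration order deterministic.
pub fn build_postings(docs: &[&str]) -> BTreeMap<String, Vec<usize>> {
    let mut postings: BTreeMap<String, Vec<usize>> = BTreeMap::new();
    for (id, doc) in docs.iter().enumerate() {
        for token in tokenize(doc) {
            let ids = postings.entry(token).or_default();
            // Ids arrive in ascending order, so checking the tail dedups.
            if ids.last() != Some(&id) {
                ids.push(id);
            }
        }
    }
    postings
}

/// True when the Levenshtein distance between `a` and `b` is at most `max`.
pub fn within_distance(a: &str, b: &str, max: usize) -> bool {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    if a.len().abs_diff(b.len()) > max {
        return false;
    }
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()] <= max
}

/// Exact lookup first; fall back to tokens within `fuzzy_max` edits.
pub fn lookup(postings: &BTreeMap<String, Vec<usize>>, term: &str, fuzzy_max: usize) -> Vec<usize> {
    let term = term.to_ascii_lowercase();
    if let Some(ids) = postings.get(&term) {
        return ids.clone();
    }
    let mut out: Vec<usize> = postings
        .iter()
        .filter(|(tok, _)| within_distance(tok.as_str(), &term, fuzzy_max))
        .flat_map(|(_, ids)| ids.iter().copied())
        .collect();
    out.sort_unstable();
    out.dedup();
    out
}
```

The fuzzy fallback here scans every token, which is fine for a sketch; a real index would likely narrow candidates first (for example by token length) to stay within the stated lookup bound.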
**7. Add cross-alias discovery mode for `list` and `search`**
Single-alias operation is clean, but discovery across many APIs is a common real-world workflow. `--all-aliases` gives immediate utility to both humans and agents while preserving existing default behavior.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ FR-2 Command: swagger-cli list
+ --all-aliases Run query across all aliases; include `alias` per result
@@ FR-5 Command: swagger-cli search
+ --all-aliases Search across all aliases; include `alias` per result
@@ Robot output (list/search)
+ "alias": "petstore",
@@ Open Questions Q3
-Decision: Single alias per query in MVP; revisit if requested
+Decision: Default remains single alias; `--all-aliases` added for explicit federated discovery.
```
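For federated results to stay deterministic, the per-alias hits need a fixed merge order. The sketch below assumes an ordering of score descending, then alias, then operation name; `Hit` and `merge_hits` are hypothetical names and the tuple input shape is only for illustration.

```rust
/// One search result tagged with the alias it came from.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Hit {
    pub alias: String,
    pub operation: String,
    pub score: u32,
}

/// Flatten per-alias results and sort them into one deterministic list.
/// Input shape (alias, [(operation, score)]) is an illustrative assumption.
pub fn merge_hits(per_alias: Vec<(String, Vec<(String, u32)>)>) -> Vec<Hit> {
    let mut all: Vec<Hit> = per_alias
        .into_iter()
        .flat_map(|(alias, hits)| {
            hits.into_iter().map(move |(operation, score)| Hit {
                alias: alias.clone(),
                operation,
                score,
            })
        })
        .collect();
    // Highest score first; ties broken by alias, then operation name.
    all.sort_by(|a, b| {
        b.score
            .cmp(&a.score)
            .then_with(|| a.alias.cmp(&b.alias))
            .then_with(|| a.operation.cmp(&b.operation))
    });
    all
}
```

Because the tie-breakers are total, the same set of aliases produces the same output order regardless of the order in which they were queried.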
**8. Add cache lifecycle management (`cache` command)**
You report disk usage but don't provide lifecycle controls. Add prune/compact/stats to avoid long-term cache bloat and improve operational hygiene.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Functional Requirements
+### FR-10: Cache Lifecycle Management
+**Description:** Manage cache growth and retention.
+**Command:** swagger-cli cache [OPTIONS]
+OPTIONS:
+ --stats Show per-alias and total cache usage
+ --prune-stale Delete aliases older than stale threshold
+ --max-total-mb <N> Enforce global cache cap via LRU eviction
+ --robot Machine-readable output
```
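The `--max-total-mb` behavior can be sketched as an eviction plan computed before anything is deleted. This is a hypothetical illustration: `AliasUsage`, `plan_eviction`, and the use of a last-used Unix timestamp as the LRU key are assumptions, and sizes are in bytes rather than megabytes.

```rust
/// Per-alias usage as a cache stats pass might report it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AliasUsage {
    pub alias: String,
    pub bytes: u64,
    /// Last access time as Unix seconds (assumed to be tracked somewhere).
    pub last_used_unix: u64,
}

/// Return the aliases to evict, least recently used first, until the total
/// fits under `max_total_bytes`. Nothing is deleted here; the caller acts
/// on the returned plan.
pub fn plan_eviction(entries: &[AliasUsage], max_total_bytes: u64) -> Vec<String> {
    let mut total: u64 = entries.iter().map(|e| e.bytes).sum();
    let mut by_age: Vec<&AliasUsage> = entries.iter().collect();
    // Oldest first; alias name breaks ties so the plan is deterministic.
    by_age.sort_by(|a, b| {
        a.last_used_unix
            .cmp(&b.last_used_unix)
            .then_with(|| a.alias.cmp(&b.alias))
    });
    let mut evict = Vec::new();
    for entry in by_age {
        if total <= max_total_bytes {
            break;
        }
        total -= entry.bytes;
        evict.push(entry.alias.clone());
    }
    evict
}
```

Separating planning from deletion also makes a dry-run mode straightforward, since the plan can be printed in robot output without touching the cache.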
**9. Harden release/install supply chain (checksums + signatures)**
The current installer downloads and executes binaries without verification. Add checksum/signature artifacts and enforce verification in the installer by default.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Release stage
+ - sha256sum swagger-cli-* > SHA256SUMS
+ - minisign -Sm SHA256SUMS
+ - upload SHA256SUMS and SHA256SUMS.minisig with binaries
@@ install.sh
-# Download binary
+# Download binary + checksum manifest + signature
+# Verify signature + checksum before chmod +x
+VERIFY="${VERIFY:-true}"
+if [ "$VERIFY" = "true" ]; then
+ # fail closed on verification mismatch
+fi
```
**10. Add adversarial reliability tests (fault injection + concurrency stress + property tests)**
The plan has good tests, but not enough proof for crash consistency and lock behavior under contention. Add targeted destructive tests to validate the core reliability claims.
```diff
diff --git a/prd-swagger-cli.md b/prd-swagger-cli.md
@@ Testing Strategy
+### Reliability Stress Tests
+- Fault-injection tests at each write step (before/after fsync, before/after rename) to prove recoverability.
+- Multi-process lock contention tests (N>=32) validating bounded lock timeout and no deadlocks.
+- Property-based tests for deterministic ordering, stable tie-breaking, and pointer validity.
@@ Success Metrics Phase 1
+- Crash-consistency claim is validated by automated fault-injection test suite (not only unit tests).
```
If you want, I can now consolidate these into a single full unified patch for `prd-swagger-cli.md` with section-by-section wording ready to paste.