# swagger-cli Refactoring Plan: 5 System Weakness Remediations
## Epic 1: Output Sink Abstraction

### Problem
Every command handler takes `robot: bool`, manually branches on it 29+ times, and contains its own formatting logic. The `output/` module is a hollow shell (`human.rs` is 8 lines). Adding a new output format (YAML is on the roadmap) requires touching every command. Output concerns are tangled into business logic.
### Solution

Introduce an `OutputMode` enum and an `emit()` function that each command calls once with its typed data. Commands produce data; the output layer decides presentation.
### Design

New type in `src/output/mod.rs`:

```rust
pub enum OutputMode {
    Robot,
    Human,
    // Future: Yaml, Table, etc.
}
```

New emit function pattern: each command either implements a `CommandOutput` trait or uses a generic `emit<T: Serialize + HumanDisplay>()` function. The `HumanDisplay` trait provides the human-readable rendering.

```rust
pub trait HumanDisplay {
    fn display_human(&self, w: &mut dyn Write) -> std::io::Result<()>;
}

pub fn emit<T: Serialize + HumanDisplay>(
    data: &T,
    mode: OutputMode,
    command: &str,
    duration: Duration,
) {
    match mode {
        OutputMode::Robot => robot::robot_success(data, command, duration),
        OutputMode::Human => {
            let mut stdout = std::io::stdout().lock();
            data.display_human(&mut stdout).expect("stdout write");
        }
    }
}
```
### Implementation Steps

- Create the `OutputMode` enum and `HumanDisplay` trait in `src/output/mod.rs`
- Create the `emit()` function in `src/output/mod.rs`
- Convert the `tags` command first (simplest command, 168 lines) as proof of concept:
  - Impl `HumanDisplay` for `TagsOutput`
  - Replace `if robot_mode { ... } else { ... }` with `emit(&output, mode, "tags", duration)`
  - Remove the `robot: bool` parameter from `execute()`, replacing it with `OutputMode`
- Propagate `OutputMode` through `main.rs` -- replace `let robot = resolve_robot_mode(&cli)` with `let mode = resolve_output_mode(&cli)` returning `OutputMode`
- Convert the remaining simple commands one at a time (`show`, `schemas`, `diff`, `search`, `cache_cmd`, `aliases`, `doctor`, `list`)
- Convert `fetch` and `sync` last (most complex)
- Delete dead code -- `robot: bool` parameters and inline formatting blocks
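The resolution change in the `main.rs` step is mechanical. A minimal sketch of what `resolve_output_mode()` might look like -- the TTY fallback is an assumption for illustration, not confirmed behavior of the current `resolve_robot_mode`:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum OutputMode { Robot, Human }

/// Hypothetical resolution: the existing --robot flag wins; piped output
/// (no TTY) also falls back to robot mode in this sketch.
fn resolve_output_mode(robot_flag: bool, stdout_is_tty: bool) -> OutputMode {
    if robot_flag || !stdout_is_tty {
        OutputMode::Robot
    } else {
        OutputMode::Human
    }
}

fn main() {
    assert_eq!(resolve_output_mode(true, true), OutputMode::Robot);
    assert_eq!(resolve_output_mode(false, true), OutputMode::Human);
    // piped output defaults to robot mode in this sketch
    assert_eq!(resolve_output_mode(false, false), OutputMode::Robot);
}
```

Keeping the decision in one function means the `Cli` struct's `robot` flag can stay untouched while everything downstream speaks `OutputMode`.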
### TDD Plan

RED tests to write first (in `src/output/mod.rs` tests):

- `test_emit_robot_mode_produces_valid_json` -- call `emit()` with Robot mode, capture stdout, parse as `RobotEnvelope`
- `test_emit_human_mode_produces_readable_text` -- call `emit()` with Human mode, verify no JSON in the output
- `test_human_display_tags_output` -- verify `TagsOutput::display_human()` produces a table with the correct headers

GREEN: Implement `OutputMode`, `HumanDisplay`, and `emit()` to pass the tests.

REFACTOR: Convert each command one at a time, running the full test suite between conversions.
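These RED tests need to observe the rendered output. Because `HumanDisplay` takes `&mut dyn Write`, tests can render into a `Vec<u8>` instead of capturing real stdout. A self-contained sketch -- the `TagsOutput` fields here are assumed for illustration:

```rust
use std::io::Write;

trait HumanDisplay {
    fn display_human(&self, w: &mut dyn Write) -> std::io::Result<()>;
}

// Hypothetical shape of TagsOutput, for illustration only.
struct TagsOutput { tags: Vec<String> }

impl HumanDisplay for TagsOutput {
    fn display_human(&self, w: &mut dyn Write) -> std::io::Result<()> {
        for t in &self.tags {
            writeln!(w, "{t}")?;
        }
        Ok(())
    }
}

fn main() {
    // Render into an in-memory buffer instead of stdout.
    let out = TagsOutput { tags: vec!["pets".into(), "store".into()] };
    let mut buf: Vec<u8> = Vec::new();
    out.display_human(&mut buf).unwrap();
    assert_eq!(String::from_utf8(buf).unwrap(), "pets\nstore\n");
}
```

The same writer-injection trick works for the robot-mode test if `emit()` grows a testable variant that takes a sink.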
### Acceptance Criteria

- Zero `if robot` / `if !robot` branches in command handlers
- `output/human.rs` contains all human formatting logic (not 8 lines)
- Adding a new output format requires zero changes to command handlers
- All existing integration tests and golden tests pass unchanged
### Files Changed

- `src/output/mod.rs` -- add `OutputMode`, `HumanDisplay`, `emit()`
- `src/output/human.rs` -- move all human formatting here
- `src/main.rs` -- `resolve_output_mode()` replaces `resolve_robot_mode()`
- `src/cli/tags.rs` -- proof-of-concept conversion
- `src/cli/show.rs`, `src/cli/schemas.rs`, `src/cli/diff.rs`, `src/cli/search.rs`, `src/cli/cache_cmd.rs`, `src/cli/aliases.rs`, `src/cli/doctor.rs`, `src/cli/list.rs` -- convert each
- `src/cli/fetch.rs`, `src/cli/sync_cmd.rs` -- convert last (most complex)
- `src/cli/mod.rs` -- no changes needed (the `Cli` struct keeps the `robot` flag; `OutputMode` is resolved from it)
### Dependency Note
This epic should be completed before Epic 2 (sync split) since sync_cmd conversion is part of this work.
## Epic 2: Split sync_cmd.rs

### Problem

`sync_cmd.rs` is 1,843 lines spanning eight distinct responsibilities: CLI args, diff computation, checkpoint/resume, per-host rate limiting, a concurrent job pool, single-alias sync, batch sync, and output formatting. It is the most complex module and the hardest to reason about.
### Solution

Split it into a `sync/` directory with focused modules while preserving the existing public API (the `execute()` function).
### Design

New directory: `src/cli/sync/`

| File | Responsibility | Lines (est.) |
|---|---|---|
| `mod.rs` | Re-exports, `Args` struct, `execute()` entry point | ~100 |
| `types.rs` | All `Serialize` structs: `SyncOutput`, `AliasSyncResult`, `SyncAllOutput`, `ChangeSummary`, `ChangeDetails`, `EndpointKey`, `SchemaDiff`, `EndpointDiff` | ~100 |
| `diff.rs` | `compute_diff()`, `endpoint_key()`, `endpoint_fingerprint()` | ~120 |
| `checkpoint.rs` | `SyncCheckpoint`, `load_checkpoint()`, `save_checkpoint()`, `remove_checkpoint()` | ~60 |
| `throttle.rs` | `PerHostThrottle`, `extract_host()` | ~50 |
| `single.rs` | `sync_inner()`, `sync_one_alias()`, `sync_one_alias_inner()`, `output_no_changes()`, `output_changes()` | ~250 |
| `batch.rs` | `sync_all_inner()` with concurrent stream logic | ~300 |
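As a concrete sketch of the `throttle.rs` surface: `extract_host()` plausibly pulls the host out of a spec URL so requests can be rate-limited per host. The stdlib-only parsing below is an assumption for illustration; the real implementation may use a URL crate:

```rust
/// Hypothetical sketch: extract the host from a URL for per-host throttling.
/// Returns None when no scheme is present rather than guessing.
fn extract_host(url: &str) -> Option<String> {
    let rest = url.split("://").nth(1)?;      // drop the scheme
    let authority = rest.split('/').next()?;  // keep "user@host:port"
    let host = authority.rsplit('@').next()?; // drop userinfo, keep "host:port"
    let host = host.split(':').next()?;       // drop the port
    if host.is_empty() { None } else { Some(host.to_string()) }
}

fn main() {
    assert_eq!(
        extract_host("https://api.example.com:8443/v2/openapi.json"),
        Some("api.example.com".to_string())
    );
    assert_eq!(
        extract_host("http://user@internal.host/spec"),
        Some("internal.host".to_string())
    );
    assert_eq!(extract_host("not a url"), None);
}
```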
### Implementation Steps

- Create the `src/cli/sync/` directory
- Move types first -- extract all `#[derive(Serialize)]` structs into `types.rs`
- Extract `diff.rs` -- `compute_diff()`, `endpoint_key()`, `endpoint_fingerprint()`, `MAX_DETAIL_ITEMS`
- Extract `checkpoint.rs` -- checkpoint load/save/remove + the `CHECKPOINT_FILE` const
- Extract `throttle.rs` -- `PerHostThrottle` and `extract_host()`
- Extract `single.rs` -- single-alias sync logic and output helpers
- Extract `batch.rs` -- `sync_all_inner()` concurrent execution
- Create `mod.rs` -- `Args` struct, `execute()`, re-exports
- Update `src/cli/mod.rs` -- replace `pub mod sync_cmd` with `pub mod sync`
- Update `src/main.rs` -- `Commands::Sync` references
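The `diff.rs` extraction hinges on the split between endpoint identity and endpoint content. A stdlib sketch of how `endpoint_key()` and `endpoint_fingerprint()` could divide that work -- the key says *which* endpoint, the fingerprint says *whether it changed*; field choices and the hash algorithm are assumptions, not the real implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stable identity: method + path, normalized.
fn endpoint_key(method: &str, path: &str) -> String {
    format!("{} {}", method.to_ascii_uppercase(), path)
}

/// Content fingerprint: hashes everything that counts as a "modification".
fn endpoint_fingerprint(method: &str, path: &str, params: &[&str]) -> u64 {
    let mut h = DefaultHasher::new();
    method.hash(&mut h);
    path.hash(&mut h);
    params.hash(&mut h);
    h.finish()
}

fn main() {
    // Same endpoint regardless of method casing: same key.
    assert_eq!(endpoint_key("get", "/pets"), endpoint_key("GET", "/pets"));
    // Changed params: same key, different fingerprint -> reported as modified.
    let fp_a = endpoint_fingerprint("GET", "/pets", &["limit"]);
    let fp_b = endpoint_fingerprint("GET", "/pets", &["limit", "offset"]);
    assert_ne!(fp_a, fp_b);
    // Fingerprints are deterministic within a run.
    assert_eq!(fp_a, endpoint_fingerprint("GET", "/pets", &["limit"]));
}
```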
### TDD Plan

This is a pure refactor -- no new behavior. The TDD approach:

- Run the full test suite before starting: `cargo test`
- After each extraction step, verify: `cargo test && cargo clippy --all-targets -- -D warnings`
- No new tests needed (the existing tests in `sync_cmd::tests` move to `sync/diff.rs` tests)

Specific test verification after each step:

- `test_diff_no_changes`, `test_diff_added_endpoint`, `test_diff_removed_endpoint`, `test_diff_modified_endpoint`, `test_diff_added_schema`, `test_diff_removed_schema`, `test_diff_endpoint_modified_by_params` -- move to `sync/diff.rs`
- Integration tests (`tests/integration_test.rs`) that exercise the `sync` command -- must pass unchanged
### Acceptance Criteria

- `sync_cmd.rs` is deleted, replaced by the `sync/` directory
- No file in `sync/` exceeds 350 lines
- All existing tests pass with zero modifications
- The `pub async fn execute()` signature is unchanged
- `cargo clippy --all-targets -- -D warnings` passes
### Files Changed

- Delete: `src/cli/sync_cmd.rs`
- Create: `src/cli/sync/mod.rs`, `types.rs`, `diff.rs`, `checkpoint.rs`, `throttle.rs`, `single.rs`, `batch.rs`
- Modify: `src/cli/mod.rs` (module declaration), `src/main.rs` (import path if needed)
### Dependency Note

Should be done after Epic 1 (output sink), since Epic 1 will already have simplified the output formatting within sync. If done before Epic 1, the output formatting moves to `single.rs` and `batch.rs` and then needs to be re-extracted during Epic 1.
## Epic 3: spawn_blocking for Cache I/O

### Problem

`CacheManager` uses `std::fs` operations and `fs2::FileExt` (flock) -- all blocking I/O. But command handlers are `async fn`s running on tokio. Under `sync --all --jobs=4`, multiple concurrent tasks hit blocking cache I/O, potentially starving the tokio runtime.
### Solution

Wrap cache operations in `tokio::task::spawn_blocking()`. Create async wrapper methods on a new `AsyncCache` struct, or add async variants directly to `CacheManager`.
### Design

Option A (recommended): an async wrapper struct.

```rust
pub struct AsyncCache {
    inner: CacheManager,
}

impl AsyncCache {
    pub async fn load_index(&self, alias: &str) -> Result<(SpecIndex, CacheMetadata), SwaggerCliError> {
        let inner = self.inner.clone(); // CacheManager is just a PathBuf, cheap to clone
        let alias = alias.to_string();
        tokio::task::spawn_blocking(move || inner.load_index(&alias))
            .await
            .map_err(|e| SwaggerCliError::Cache(format!("task join error: {e}")))?
    }

    // ... same pattern for write_cache, load_raw, list_aliases, etc.
}
```
This preserves the existing sync CacheManager for tests and simple cases while providing an async-safe interface for the runtime.
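The move-clone-join shape of the wrapper is easy to see with plain std threads. This sketch mirrors what `spawn_blocking` does, minus the tokio runtime and its dedicated blocking pool; the `CacheManager` here is a stand-in with an assumed field and a fake `load_index`, not the real type:

```rust
use std::path::PathBuf;
use std::thread;

#[derive(Clone)]
struct CacheManager {
    root: PathBuf, // the real CacheManager is "just a PathBuf wrapper"
}

impl CacheManager {
    // Stand-in for the real blocking read; just builds the path it would open.
    fn load_index(&self, alias: &str) -> Result<String, String> {
        Ok(format!("{}/{}/index.json", self.root.display(), alias))
    }
}

fn main() {
    let cm = CacheManager { root: PathBuf::from("/tmp/cache") };

    // 1. Clone cheaply and move owned data into the closure...
    let inner = cm.clone();
    let alias = "petstore".to_string();
    // 2. ...run the blocking call off the calling thread...
    let handle = thread::spawn(move || inner.load_index(&alias));
    // 3. ...and join for the result (spawn_blocking uses .await instead).
    let res = handle.join().expect("join").expect("load");
    assert_eq!(res, "/tmp/cache/petstore/index.json");
}
```

The `JoinHandle::join` error maps onto the `JoinError` that the real wrapper converts into `SwaggerCliError::Cache`.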
### Implementation Steps

- Derive `Clone` for `CacheManager` (it's just a `PathBuf` wrapper -- trivial)
- Create `AsyncCache` in `src/core/cache.rs` with async wrappers for each public method: `load_index()`, `load_raw()`, `write_cache()`, `list_aliases()`, `ensure_dirs()`, `alias_dir()`, `delete_alias_dir()`, `update_last_accessed()`
- Write tests for `AsyncCache` verifying it produces identical results to the sync `CacheManager`
- Convert command handlers to use `AsyncCache` instead of `CacheManager`:
  - Start with `tags.rs` (simplest)
  - Then `list.rs`, `show.rs`, `search.rs`, `schemas.rs`
  - Then `sync_cmd.rs` / `sync/` (most critical -- this is where contention happens)
  - Then `fetch.rs`, `doctor.rs`, `cache_cmd.rs`, `aliases.rs`
### TDD Plan

RED tests first:

```rust
#[tokio::test]
async fn test_async_cache_load_index_matches_sync() {
    // Setup: write a spec to the cache using the sync CacheManager
    // Act: load via AsyncCache
    // Assert: result matches sync CacheManager::load_index()
}

#[tokio::test]
async fn test_async_cache_write_then_load_roundtrip() {
    // Write via AsyncCache, load via AsyncCache, verify integrity
}

#[tokio::test]
async fn test_async_cache_concurrent_reads_no_panic() {
    // Spawn 10 concurrent load_index tasks on the same alias
    // All should succeed (shared read lock)
}
```

GREEN: Implement the `AsyncCache` wrapper.

REFACTOR: Convert command handlers one at a time.
### Acceptance Criteria

- All cache I/O from async command handlers goes through `spawn_blocking`
- The existing sync `CacheManager` is preserved for non-async tests
- `cargo test` passes (including `lock_contention_test.rs`)
- No direct `std::fs` calls from async contexts in command handlers
### Files Changed

- `src/core/cache.rs` -- add the `Clone` derive, add the `AsyncCache` struct
- All `src/cli/*.rs` files -- replace `CacheManager::new()` with `AsyncCache::new()`
### Dependency Note
Independent of Epics 1 and 2. Can be done in parallel.
## Epic 4: Property Tests for Search Engine and Parser

### Problem

Property tests currently cover only 4 trivial properties (hash determinism, JSON roundtrip, index ordering, hash format). The search engine (722 LOC with scoring, tokenization, and Unicode snippet handling) and the OpenAPI parser/indexer (659 LOC) have zero property test coverage.
### Solution

Add targeted property tests using `proptest` for the search engine and for format detection/parsing.
### Design

New test file: `tests/property_search_test.rs`

Property tests for the search engine:

- Score monotonicity: adding a matching field to an endpoint should never decrease its score
- Deterministic ordering: same query + same index = same result order, always
- Limit respected: result count <= `opts.limit` for any query
- Coverage boost property: matching N/N terms scores >= matching 1/N terms
- Case insensitivity: a lowercased query on lowercased data = same results as mixed case
- Empty query safety: any whitespace-only string returns empty results
- Unicode safety: search never panics on arbitrary Unicode input (including emoji, RTL, zero-width chars)
- Snippet bounds: snippet length never exceeds 50 chars + ellipsis markers
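The snippet-bounds and Unicode-safety properties pair naturally: naive byte slicing (`&text[..50]`) panics mid-codepoint, which is exactly the class of bug the property test should catch. A deterministic sketch of the invariant, using a hypothetical `make_snippet` helper in place of the real snippet code (proptest would generate the inputs instead of this fixed sample list):

```rust
/// Hypothetical snippet helper: truncate to at most `max` chars,
/// always on char boundaries, appending an ellipsis when truncated.
fn make_snippet(text: &str, max: usize) -> String {
    let mut s: String = text.chars().take(max).collect();
    if text.chars().count() > max {
        s.push('…');
    }
    s
}

fn main() {
    // Tricky Unicode samples: emoji, RTL text, zero-width characters.
    let samples = ["plain ascii", "emoji 🎉🎉🎉", "שלום עולם", "zero\u{200b}width"];
    for t in &samples {
        let snip = make_snippet(t, 50);
        // Invariant: at most 50 chars plus one ellipsis marker, no panic.
        assert!(snip.chars().count() <= 51);
    }
    // A long emoji run still truncates safely on char boundaries.
    let long: String = "🦀".repeat(100);
    assert_eq!(make_snippet(&long, 50).chars().count(), 51);
}
```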
New test file: `tests/property_indexer_test.rs`

Property tests for the indexer:

- Format detection idempotent: `detect_format(bytes, hint, ct)` called twice = same result
- JSON sniffing correct: bytes starting with `{` or `[` are always detected as JSON
- Normalize roundtrip: `normalize_to_json(bytes, Json)` preserves semantic content
- `build_index` never panics on valid OpenAPI: generate random valid-ish OpenAPI structures, verify `build_index` produces a valid `SpecIndex`
- Index endpoint count matches paths: the number of `IndexedEndpoint`s = sum of methods across all paths
- Content hash deterministic: same bytes -> same hash, every time
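The JSON-sniffing property is easy to state concretely. A simplified stdlib sniffer that the property constrains -- the real `detect_format` also weighs the extension hint and content type, both elided here:

```rust
#[derive(Debug, PartialEq)]
enum Format { Json, Yaml }

/// Simplified sniffer illustrating the property under test: after skipping
/// leading whitespace, a '{' or '[' always means JSON; anything else is
/// treated as YAML in this sketch.
fn detect_format(bytes: &[u8]) -> Format {
    match bytes.iter().copied().find(|b| !b.is_ascii_whitespace()) {
        Some(b'{') | Some(b'[') => Format::Json,
        _ => Format::Yaml,
    }
}

fn main() {
    assert_eq!(detect_format(b"  {\"openapi\":\"3.0.0\"}"), Format::Json);
    assert_eq!(detect_format(b"[1, 2]"), Format::Json);
    assert_eq!(detect_format(b"openapi: 3.0.0"), Format::Yaml);
    // Idempotence: calling twice on the same bytes gives the same result.
    assert_eq!(detect_format(b"{}"), detect_format(b"{}"));
}
```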
### Implementation Steps

- Create `tests/property_search_test.rs` with proptest strategies for generating `SpecIndex` and `SearchOptions`
- Implement the search property tests (the 8 above)
- Create `tests/property_indexer_test.rs` with proptest strategies for generating OpenAPI-like JSON
- Implement the indexer property tests (the 6 above)
- Run with an extended case count to verify: `cargo test -- --ignored` or `PROPTEST_CASES=1000 cargo test`
### TDD Plan

These ARE the tests. Write them, then verify they pass against the existing implementation. If any fail, that's a real bug to fix.

Proptest strategy for `SpecIndex` generation:

```rust
fn arb_indexed_endpoint() -> impl Strategy<Value = IndexedEndpoint> {
    (
        "/[a-z]{1,5}(/[a-z]{1,5}){0,3}",                      // path
        prop_oneof!["GET", "POST", "PUT", "DELETE", "PATCH"], // method
        proptest::option::of("[A-Za-z ]{1,50}"),              // summary
        proptest::option::of("[A-Za-z ]{1,100}"),             // description
    )
        .prop_map(|(path, method, summary, description)| {
            IndexedEndpoint { path, method, summary, description, /* ... defaults ... */ }
        })
}

fn arb_spec_index() -> impl Strategy<Value = SpecIndex> {
    proptest::collection::vec(arb_indexed_endpoint(), 0..50)
        .prop_map(|endpoints| SpecIndex { endpoints, /* ... */ })
}
```
### Acceptance Criteria

- 14 new property tests (8 search + 6 indexer)
- All pass with the default proptest case count (256)
- No new dependencies (`proptest` is already in dev-deps)
- Tests catch at least one real edge case (likely Unicode snippet handling)
### Files Changed

- Create: `tests/property_search_test.rs`, `tests/property_indexer_test.rs`
- Possibly modify: `src/core/search.rs` if the property tests reveal bugs
### Dependency Note
Fully independent. Can be done in parallel with all other epics.
## Epic 5: Shared Command Pipeline

### Problem

Every command handler independently captures `Instant::now()`, creates `CacheManager::new(cache_dir())`, loads the index with `cm.load_index(alias)`, does its work, formats output, and passes the duration along. That is ~15-20 lines of identical scaffolding per command, and adding a cross-cutting concern (telemetry, logging, caching headers) requires touching every command.
### Solution

Create a `CommandContext` struct that encapsulates the common setup, and a `run_command()` helper that handles timing and output.
Design
/// Shared context created once per command invocation.
pub struct CommandContext {
pub cache: AsyncCache, // or CacheManager
pub mode: OutputMode,
pub start: Instant,
pub network_policy: NetworkPolicy,
pub config_override: Option<PathBuf>,
}
impl CommandContext {
pub fn new(mode: OutputMode, network_flag: &str, config_override: Option<&Path>) -> Result<Self, SwaggerCliError> {
Ok(Self {
cache: AsyncCache::new(cache_dir()),
mode,
start: Instant::now(),
network_policy: resolve_policy(network_flag)?,
config_override: config_override.map(PathBuf::from),
})
}
pub fn load_index(&self, alias: &str) -> Result<(SpecIndex, CacheMetadata), SwaggerCliError> {
self.cache.load_index(alias)
}
pub fn elapsed(&self) -> Duration {
self.start.elapsed()
}
}
Usage in commands:
pub async fn execute(args: &Args, ctx: &CommandContext) -> Result<(), SwaggerCliError> {
let (index, meta) = ctx.load_index(&args.alias)?;
let output = build_output(&index);
emit(&output, ctx.mode, "tags", ctx.elapsed());
Ok(())
}
### Implementation Steps

- Define the `CommandContext` struct in `src/cli/context.rs`
- Write tests for `CommandContext::new()` and its helper methods
- Update `main.rs` to create a `CommandContext` and pass it to commands
- Convert `tags` first (simplest) as proof of concept
- Convert all other commands to accept `&CommandContext`
- Remove the duplicated `CacheManager::new(cache_dir())` + `Instant::now()` from each handler
### TDD Plan

RED tests first:

```rust
#[test]
fn test_command_context_creates_valid_cache() {
    let ctx = CommandContext::new(OutputMode::Robot, "auto", None).unwrap();
    // cache dir should be the platform default
    assert!(ctx.cache.cache_dir().exists() || true); // dir may not exist in test
}

#[test]
fn test_command_context_elapsed_increases() {
    let ctx = CommandContext::new(OutputMode::Human, "auto", None).unwrap();
    std::thread::sleep(Duration::from_millis(10));
    assert!(ctx.elapsed().as_millis() >= 10);
}

#[test]
fn test_command_context_offline_policy() {
    let ctx = CommandContext::new(OutputMode::Robot, "offline", None).unwrap();
    assert_eq!(ctx.network_policy, NetworkPolicy::Offline);
}
```
### Acceptance Criteria

- `CommandContext` is created once in `main.rs` and passed to all commands
- No command handler creates its own `CacheManager` or `Instant`
- Each command's `execute()` signature includes `&CommandContext`
- All tests pass unchanged
### Files Changed

- Create: `src/cli/context.rs`
- Modify: `src/cli/mod.rs` (add `pub mod context`)
- Modify: `src/main.rs` (create the `CommandContext`, pass it to dispatch)
- Modify: all `src/cli/*.rs` execute functions (accept `&CommandContext`)
### Dependency Note
Depends on Epic 1 (OutputMode) and Epic 3 (AsyncCache). Should be done last.
## Execution Order

```
Epic 4 (property tests) ─────────────────────────────────────┐
Epic 3 (spawn_blocking) ─────────────────────────────────────┤
Epic 1 (output sink) ──> Epic 2 (split sync) ──> Epic 5 (pipeline)
```
Epics 3 and 4 are fully independent and can run in parallel with everything. Epic 1 should precede Epic 2 (simplifies sync output formatting before split). Epic 5 depends on Epics 1 and 3 (needs OutputMode and AsyncCache).