No `## Rejected Recommendations` section was present in the plan you shared, so the proposals below are all net-new. 1. **Make scheduled runs explicitly target a single service instance** Analysis: right now `service run` has no selector, but the plan supports multiple installed services. That creates ambiguity and incorrect manifest/status selection. This is the most important architectural fix. ```diff @@ `lore service install` What it does - runs `lore --robot service run` at the specified interval + runs `lore --robot service run --service-id ` at the specified interval @@ Robot output (`install`) - "sync_command": "/usr/local/bin/lore --robot service run", + "sync_command": "/usr/local/bin/lore --robot service run --service-id a1b2c3d4", @@ `ServiceCommand` enum - #[command(hide = true)] - Run, + #[command(hide = true)] + Run { + /// Internal selector injected by scheduler backend + #[arg(long, hide = true)] + service_id: String, + }, @@ `handle_service_run` signature -pub fn handle_service_run(start: std::time::Instant) -> Result<(), Box> +pub fn handle_service_run(service_id: &str, start: std::time::Instant) -> Result<(), Box> @@ run flow step 1 - Read install manifest + Read install manifest for `service_id` ``` 2. **Strengthen `service_id` derivation to avoid cross-workspace collisions** Analysis: hashing config path alone can collide when many workspaces share one global config. Identity should represent what is being synced, not only where config lives. ```diff @@ Key Design Principles / Project-Scoped Service Identity - derive from a stable hash of the config file path + derive from a stable fingerprint of: + - canonical workspace root + - normalized configured GitLab project URLs + - canonical config path + then take first 12 hex chars of SHA-256 @@ `compute_service_id` - Returns first 8 hex chars of SHA-256 of the canonical config path. + Returns first 12 hex chars of SHA-256 of a canonical identity tuple + (workspace_root + sorted project URLs + config_path). ``` 3. **Introduce a service-state machine with a dedicated admin lock** Analysis: install/uninstall/pause/resume/repair/status can race each other. A lock and explicit transition table prevents invalid states and file races. ```diff @@ New section: Service State Model + All state mutations are serialized by `AppLock("service-admin-{service_id}")`. + Legal transitions: + - idle -> running -> success|degraded|backoff|paused + - backoff -> running|paused + - paused -> half_open|running (resume) + - half_open -> running|paused + Any invalid transition is rejected with `ServiceCorruptState`. @@ `handle_install`, `handle_uninstall`, `handle_pause`, `handle_resume`, `handle_repair` + Acquire `service-admin-{service_id}` before mutating manifest/status/service files. ``` 4. **Unify manual and scheduled sync execution behind one orchestrator** Analysis: the plan currently duplicates stage logic and error classification in `service run`, increasing drift risk. A shared orchestrator gives one authoritative pipeline behavior. ```diff @@ Key Design Principles + #### 6. Single Sync Orchestrator + Both `lore sync` and `lore service run` call `SyncOrchestrator`. + Service mode adds policy (backoff/circuit-breaker); manual mode bypasses policy. @@ Service Run Implementation - execute_sync_stages(&sync_args) + SyncOrchestrator::run(SyncMode::Service { profile, policy }) @@ manual sync - separate pipeline path + SyncOrchestrator::run(SyncMode::Manual { flags }) ``` 5. **Add bounded in-run retries for transient core-stage failures** Analysis: single-shot failure handling will over-trigger backoff on temporary network blips. One short retry per core stage significantly improves freshness without much extra runtime. ```diff @@ Stage-aware execution + Core stages (`issues`, `mrs`) get up to 1 immediate retry on transient errors + (jittered 1-5s). Permanent errors are never retried. + Optional stages keep best-effort semantics. @@ Acceptance criteria (`service run`) + Retries transient core stage failures once before counting run as failed. ``` 6. **Harden persistence with full crash-safety semantics** Analysis: current atomic write description is good but incomplete for power-loss durability. You should fsync parent directory after rename and include lightweight integrity metadata. ```diff @@ `write_atomic` - tmp file + fsync + rename + tmp file + fsync(file) + rename + fsync(parent_dir) @@ `ServiceManifest` and `SyncStatusFile` + pub write_seq: u64, + pub content_sha256: String, // optional integrity guard for repair/doctor ``` 7. **Fix token handling to avoid shell/env injection and add secure-store mode** Analysis: sourcing env files in shell is brittle if token contains special chars/newlines. Also, secure OS credential stores should be first-class for production reliability/security. ```diff @@ Token storage strategies -| `env-file` (default) ... +| `auto` (default) | use secure-store when available, else env-file | +| `secure-store` | macOS Keychain / libsecret / Windows Credential Manager | +| `env-file` | explicit fallback | @@ macOS wrapper script -. "{data_dir}/service-env-{service_id}" -export {token_env_var} +TOKEN_VALUE="$(cat "{data_dir}/service-token-{service_id}" )" +export {token_env_var}="$TOKEN_VALUE" @@ Acceptance criteria + Reject token values containing `\0` or newline for env-file mode. + Never eval/source untrusted token content. ``` 8. **Correct platform/runtime implementation hazards** Analysis: there are a few correctness risks that should be fixed in-plan now. ```diff @@ macOS install steps - Get UID via `unsafe { libc::getuid() }` + Get UID via safe API (`nix::unistd::Uid::current()` or equivalent safe helper) @@ Command Runner Helper - poll try_wait and read stdout/stderr after exit + avoid potential pipe backpressure deadlock: + use wait-with-timeout + concurrent stdout/stderr draining @@ Linux timer - OnUnitActiveSec={interval_seconds}s + OnUnitInactiveSec={interval_seconds}s + AccuracySec=1min ``` 9. **Make logs fully service-scoped** Analysis: you already scoped manifest/status by `service_id`; logs are still global in several places. Multi-service installs will overwrite each other’s logs. ```diff @@ Paths Module Additions -pub fn get_service_log_path() -> PathBuf +pub fn get_service_log_path(service_id: &str, stream: LogStream) -> PathBuf @@ log filenames - logs/service-stderr.log - logs/service-stdout.log + logs/service-{service_id}-stderr.log + logs/service-{service_id}-stdout.log @@ `service logs` - default path: `{data_dir}/logs/service-stderr.log` + default path: `{data_dir}/logs/service-{service_id}-stderr.log` ``` 10. **Resolve internal spec contradictions and rollback gaps** Analysis: there are a few contradictory statements and incomplete rollback behavior that will cause implementation churn. ```diff @@ `service logs` behavior - default (no flags): open in editor (human) + default (no flags): print last 100 lines (human and robot metadata mode) + `--open` is explicit opt-in @@ install rollback - On failure: removes generated service files + On failure: removes generated service files, env file, wrapper script, and temp manifest @@ `handle_service_run` sample code - let manifest_path = get_service_manifest_path(); + let manifest_path = get_service_manifest_path(service_id); ``` If you want, I can take these revisions and produce a single consolidated “Iteration 4” replacement plan block with all sections rewritten coherently so it’s ready to hand to an implementer.