gitlore/plans/lore-service.feedback-4.md

No `## Rejected Recommendations` section was present in the plan you shared, so the proposals below are all net-new.

1. **Make scheduled runs explicitly target a single service instance**
Analysis: right now `service run` has no selector, but the plan supports multiple installed services. That creates ambiguity and incorrect manifest/status selection. This is the most important architectural fix.

```diff
@@ `lore service install` What it does
- runs `lore --robot service run` at the specified interval
+ runs `lore --robot service run --service-id <service_id>` at the specified interval

@@ Robot output (`install`)
- "sync_command": "/usr/local/bin/lore --robot service run",
+ "sync_command": "/usr/local/bin/lore --robot service run --service-id a1b2c3d4",

@@ `ServiceCommand` enum
- #[command(hide = true)]
- Run,
+ #[command(hide = true)]
+ Run {
+     /// Internal selector injected by scheduler backend
+     #[arg(long, hide = true)]
+     service_id: String,
+ },

@@ `handle_service_run` signature
-pub fn handle_service_run(start: std::time::Instant) -> Result<(), Box<dyn std::error::Error>>
+pub fn handle_service_run(service_id: &str, start: std::time::Instant) -> Result<(), Box<dyn std::error::Error>>

@@ run flow step 1
- Read install manifest
+ Read install manifest for `service_id`
```

2. **Strengthen `service_id` derivation to avoid cross-workspace collisions**
Analysis: hashing config path alone can collide when many workspaces share one global config. Identity should represent what is being synced, not only where config lives.

```diff
@@ Key Design Principles / Project-Scoped Service Identity
- derive from a stable hash of the config file path
+ derive from a stable fingerprint of:
+ - canonical workspace root
+ - normalized configured GitLab project URLs
+ - canonical config path
+ then take first 12 hex chars of SHA-256

@@ `compute_service_id`
- Returns first 8 hex chars of SHA-256 of the canonical config path.
+ Returns first 12 hex chars of SHA-256 of a canonical identity tuple
+ (workspace_root + sorted project URLs + config_path).
```

3. **Introduce a service-state machine with a dedicated admin lock**
Analysis: install/uninstall/pause/resume/repair/status can race each other. A lock and explicit transition table prevents invalid states and file races.

```diff
@@ New section: Service State Model
+ All state mutations are serialized by `AppLock("service-admin-{service_id}")`.
+ Legal transitions:
+ - idle -> running -> success|degraded|backoff|paused
+ - backoff -> running|paused
+ - paused -> half_open|running (resume)
+ - half_open -> running|paused
+ Any invalid transition is rejected with `ServiceCorruptState`.

@@ `handle_install`, `handle_uninstall`, `handle_pause`, `handle_resume`, `handle_repair`
+ Acquire `service-admin-{service_id}` before mutating manifest/status/service files.
```

4. **Unify manual and scheduled sync execution behind one orchestrator**
Analysis: the plan currently duplicates stage logic and error classification in `service run`, increasing drift risk. A shared orchestrator gives one authoritative pipeline behavior.

```diff
@@ Key Design Principles
+ #### 6. Single Sync Orchestrator
+ Both `lore sync` and `lore service run` call `SyncOrchestrator`.
+ Service mode adds policy (backoff/circuit-breaker); manual mode bypasses policy.

@@ Service Run Implementation
- execute_sync_stages(&sync_args)
+ SyncOrchestrator::run(SyncMode::Service { profile, policy })

@@ manual sync
- separate pipeline path
+ SyncOrchestrator::run(SyncMode::Manual { flags })
```

5. **Add bounded in-run retries for transient core-stage failures**
Analysis: single-shot failure handling will over-trigger backoff on temporary network blips. One short retry per core stage significantly improves freshness without much extra runtime.

```diff
@@ Stage-aware execution
+ Core stages (`issues`, `mrs`) get up to 1 immediate retry on transient errors
+ (jittered 1-5s). Permanent errors are never retried.
+ Optional stages keep best-effort semantics.

@@ Acceptance criteria (`service run`)
+ Retries transient core stage failures once before counting run as failed.
```

6. **Harden persistence with full crash-safety semantics**
Analysis: current atomic write description is good but incomplete for power-loss durability. You should fsync parent directory after rename and include lightweight integrity metadata.

```diff
@@ `write_atomic`
- tmp file + fsync + rename
+ tmp file + fsync(file) + rename + fsync(parent_dir)

@@ `ServiceManifest` and `SyncStatusFile`
+ pub write_seq: u64,
+ pub content_sha256: String, // optional integrity guard for repair/doctor
```

7. **Fix token handling to avoid shell/env injection and add secure-store mode**
Analysis: sourcing env files in shell is brittle if token contains special chars/newlines. Also, secure OS credential stores should be first-class for production reliability/security.

```diff
@@ Token storage strategies
-| `env-file` (default) ...
+| `auto` (default) | use secure-store when available, else env-file |
+| `secure-store` | macOS Keychain / libsecret / Windows Credential Manager |
+| `env-file` | explicit fallback |

@@ macOS wrapper script
-. "{data_dir}/service-env-{service_id}"
-export {token_env_var}
+TOKEN_VALUE="$(cat "{data_dir}/service-token-{service_id}" )"
+export {token_env_var}="$TOKEN_VALUE"

@@ Acceptance criteria
+ Reject token values containing `\0` or newline for env-file mode.
+ Never eval/source untrusted token content.
```

8. **Correct platform/runtime implementation hazards**
Analysis: there are a few correctness risks that should be fixed in-plan now.

```diff
@@ macOS install steps
- Get UID via `unsafe { libc::getuid() }`
+ Get UID via safe API (`nix::unistd::Uid::current()` or equivalent safe helper)

@@ Command Runner Helper
- poll try_wait and read stdout/stderr after exit
+ avoid potential pipe backpressure deadlock:
+ use wait-with-timeout + concurrent stdout/stderr draining

@@ Linux timer
- OnUnitActiveSec={interval_seconds}s
+ OnUnitInactiveSec={interval_seconds}s
+ AccuracySec=1min
```

9. **Make logs fully service-scoped**
Analysis: you already scoped manifest/status by `service_id`; logs are still global in several places. Multi-service installs will overwrite each other’s logs.

```diff
@@ Paths Module Additions
-pub fn get_service_log_path() -> PathBuf
+pub fn get_service_log_path(service_id: &str, stream: LogStream) -> PathBuf

@@ log filenames
- logs/service-stderr.log
- logs/service-stdout.log
+ logs/service-{service_id}-stderr.log
+ logs/service-{service_id}-stdout.log

@@ `service logs`
- default path: `{data_dir}/logs/service-stderr.log`
+ default path: `{data_dir}/logs/service-{service_id}-stderr.log`
```

10. **Resolve internal spec contradictions and rollback gaps**
Analysis: there are a few contradictory statements and incomplete rollback behavior that will cause implementation churn.

```diff
@@ `service logs` behavior
- default (no flags): open in editor (human)
+ default (no flags): print last 100 lines (human and robot metadata mode)
+ `--open` is explicit opt-in

@@ install rollback
- On failure: removes generated service files
+ On failure: removes generated service files, env file, wrapper script, and temp manifest

@@ `handle_service_run` sample code
- let manifest_path = get_service_manifest_path();
+ let manifest_path = get_service_manifest_path(service_id);
```

If you want, I can take these revisions and produce a single consolidated “Iteration 4” replacement plan block with all sections rewritten coherently so it’s ready to hand to an implementer.