gitlore/plans/time-decay-expert-scoring.feedback-1.md

Your plan is strong directionally, but I’d revise it in 8 key places to avoid regressions and make it significantly more useful in production.

1. **Split reviewer signals into “participated” vs “assigned-only”**
Reason: today’s inflation problem is often assignment noise. Treating `mr_reviewers` equal to real review activity still over-ranks passive reviewers.

```diff
@@ Per-signal contributions
-| Reviewer (reviewed MR touching path) | 10 | 90 days |
+| ReviewerParticipated (left DiffNote on MR/path) | 10 | 90 days |
+| ReviewerAssignedOnly (in mr_reviewers, no DiffNote by that user on MR/path) | 3 | 45 days |
```

```diff
@@ Scoring Formula
-score = reviewer_mr * reviewer_weight + ...
+score = reviewer_participated * reviewer_weight
+      + reviewer_assigned_only * reviewer_assignment_weight
+      + ...
```

2. **Cap/saturate note intensity per MR**
Reason: raw per-note addition can still reward “comment storms.” Use diminishing returns.

```diff
@@ Rust-Side Aggregation
-- Notes: Vec<i64> (timestamps) from diffnote_reviewer
+-- Notes grouped per (username, mr_id): note_count + max_ts
+-- Note contribution per MR uses diminishing returns:
+-- note_score_mr = note_bonus * ln(1 + note_count) * decay(now - ts, note_hl)
```

3. **Use better event timestamps than `m.updated_at` for file-change signals**
Reason: `updated_at` is noisy (title edits, metadata touches) and creates false recency.

```diff
@@ SQL Restructure
- signal 3/4 seen_at = m.updated_at
+ signal 3/4 activity_ts = COALESCE(m.merged_at, m.closed_at, m.created_at, m.updated_at)
```

4. **Don’t stream raw note rows to Rust; pre-aggregate in SQL**
Reason: current plan removes SQL grouping and can blow up memory/latency on large repos.

```diff
@@ SQL Restructure
-SELECT username, signal, mr_id, note_id, ts FROM signals
+WITH raw_signals AS (...),
+aggregated AS (
+  -- 1 row per (username, signal_class, mr_id) for MR-level signals
+  -- 1 row per (username, mr_id) for note_count + max_ts
+)
+SELECT username, signal_class, mr_id, qty, ts FROM aggregated
```

5. **Replace fixed `"24m"` with model-driven cutoff**
Reason: hardcoded 24m is arbitrary and tied to current weights/half-lives only.

```diff
@@ Default --since Change
-Expert mode: "6m" -> "24m"
+Expert mode default window derived from scoring.max_age_days (default 1095 days / 36m).
+Formula guidance: choose max_age where max possible single-event contribution < epsilon (e.g. 0.25 points).
+Add `--all-history` to disable cutoff for diagnostics.
```

6. **Validate scoring config explicitly**
Reason: silent bad configs (`half_life_days = 0`, negative weights) create undefined behavior.

```diff
@@ ScoringConfig (config.rs)
 pub struct ScoringConfig {
   pub author_weight: i64,
   pub reviewer_weight: i64,
   pub note_bonus: i64,
+  pub reviewer_assignment_weight: i64,   // default: 3
   pub author_half_life_days: u32,
   pub reviewer_half_life_days: u32,
   pub note_half_life_days: u32,
+  pub reviewer_assignment_half_life_days: u32, // default: 45
+  pub max_age_days: u32, // default: 1095
 }
@@ Config::load_from_path
+validate_scoring(&config.scoring)?; // weights >= 0, half_life_days > 0, max_age_days >= 30
```

7. **Keep raw float score internally; round only for display**
Reason: rounding before sort causes avoidable ties/rank instability.

```diff
@@ Rust-Side Aggregation
-Round to i64 for Expert.score field
+Compute `raw_score: f64`, sort by raw_score DESC.
+Expose integer `score` for existing UX.
+Optionally expose `score_raw` and `score_components` in robot JSON when `--explain-score`.
```

8. **Add confidence + data-completeness metadata**
Reason: rankings are misleading if `mr_file_changes` coverage is poor.

```diff
@@ ExpertResult / Output
+confidence: "high" | "medium" | "low"
+coverage: { mrs_with_file_changes, total_mrs_in_window, percent }
+warning when coverage < threshold (e.g. 70%)
```

```diff
@@ Verification
 4. cargo test
+5. ubs src/cli/commands/who.rs src/core/config.rs
+6. Benchmark query_expert on representative DB (latency + rows scanned before/after)
```

If you want, I can rewrite your full plan document into a clean “v2” version that already incorporates these diffs end-to-end.