gitlore/plans/time-decay-expert-scoring.feedback-4.md at trace

Files

Taylor Eernisse 4185abe05d docs: add feature ideas catalog, time-decay scoring plan, and timeline issue doc

Ideas catalog (docs/ideas/): 25 feature concept documents covering future
lore capabilities including bottleneck detection, churn analysis, expert
scoring, collaboration patterns, milestone risk, knowledge silos, and more.
Each doc includes motivation, implementation sketch, data requirements, and
dependencies on existing infrastructure. README.md provides an overview and
SYSTEM-PROPOSAL.md presents the unified analytics vision.

Plans (plans/): Time-decay expert scoring design with four rounds of review
feedback exploring decay functions, scoring algebra, and integration points
with the existing who-expert pipeline.

Issue doc (docs/issues/001): Documents the timeline pipeline bug where
EntityRef was missing project context, causing ambiguous cross-project
references during the EXPAND stage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-09 10:16:48 -05:00

5.7 KiB

Raw Permalink Blame History

Your plan is already strong. The biggest remaining gaps are temporal correctness, indexability at scale, and ranking reliability under sparse/noisy evidence. These are the revisions I’d make.

Fix temporal correctness for --as-of (critical) Analysis: Right now the plan describes --as-of, but the SQL only enforces lower bounds (>= since). If as_of is in the past, “future” events can still enter and get full weight (because elapsed is clamped). This breaks reproducibility.

@@ 3. SQL Restructure
- AND n.created_at >= ?2
+ AND n.created_at BETWEEN ?2 AND ?4
@@ Signal 3/4a/4b
- AND {state_aware_ts} >= ?2
+ AND {state_aware_ts} BETWEEN ?2 AND ?4

@@ 5a. Reproducible Scoring via --as-of
- All decay computations use as_of_ms instead of SystemTime::now()
+ All event selection and decay computations are bounded by as_of_ms.
+ Query window is [since_ms, as_of_ms], never [since_ms, now_ms].
+ Add test: test_as_of_excludes_future_events.

Resolve closed-state inconsistency Analysis: The CASE handles closed, but all signal queries filter to ('opened','merged'), making the closed_at branch dead code. Either include closed MRs intentionally or remove that logic. I’d include closed with a reduced multiplier.

@@ ScoringConfig (config.rs)
+ pub closed_mr_multiplier: f64, // default: 0.5

@@ 3. SQL Restructure
- AND m.state IN ('opened','merged')
+ AND m.state IN ('opened','merged','closed')

@@ 4. Rust-Side Aggregation
+ if state == "closed" { contribution *= closed_mr_multiplier; }

Replace OR path predicates with index-friendly UNION ALL branches Analysis: (new_path ... OR old_path ...) often degrades index usage in SQLite. Split into two indexed branches and dedupe once. This improves planner stability and latency on large datasets.

@@ 3. SQL Restructure
-WITH matched_notes AS (
-  ... AND (n.position_new_path {path_op} OR n.position_old_path {path_op})
-),
+WITH matched_notes AS (
+  SELECT ... FROM notes n WHERE ... AND n.position_new_path {path_op}
+  UNION ALL
+  SELECT ... FROM notes n WHERE ... AND n.position_old_path {path_op}
+),
+matched_notes_dedup AS (
+  SELECT DISTINCT id, discussion_id, author_username, created_at, project_id
+  FROM matched_notes
+),
@@
- JOIN matched_notes mn ...
+ JOIN matched_notes_dedup mn ...

Add canonical path identity (rename-chain support) Analysis: Direct old_path/new_path matching only handles one-hop rename scenarios. A small alias graph/table built at ingest time gives robust expertise continuity across A→B→C chains and avoids repeated SQL complexity.

@@ Files to Modify
- 3. src/core/db.rs — Add migration for indexes...
+ 3. src/core/db.rs — Add migration for indexes + path_identity table
+ 4. src/core/ingest/*.rs — populate path_identity on rename events
+ 5. src/cli/commands/who.rs — resolve query path to canonical path_id first

@@ Context
- The fix has three parts:
+ The fix has four parts:
+ - Introduce canonical path identity so multi-hop renames preserve expertise

Split scoring engine into a versioned core module Analysis: who.rs is becoming a mixed CLI/query/math/output surface. Move scoring math and event normalization into src/core/scoring/ with explicit model versions. This reduces regression risk and enables future model experiments.

@@ Files to Modify
+ 4. src/core/scoring/mod.rs — model interface + shared types
+ 5. src/core/scoring/model_v2_decay.rs — current implementation
+ 6. src/cli/commands/who.rs — orchestration only

@@ 5b. Score Explainability
+ resolved_input includes scoring_model_version and scoring_model_name

Add evidence confidence to reduce sparse-data rank spikes Analysis: One recent MR can outrank broader, steadier expertise. Add a confidence factor derived from number of distinct evidence MRs and expose both score_raw and score_adjusted.

@@ Scoring Formula
+ confidence(user) = 1 - exp(-evidence_mr_count / 6.0)
+ score_adjusted = score_raw * confidence

@@ 4. Rust-Side Aggregation
+ compute evidence_mr_count from unique MR ids across all signals
+ sort by score_adjusted DESC, then score_raw DESC, then last_seen DESC

@@ 5b. --explain-score
+ include confidence and evidence_mr_count

Add first-class bot/service-account filtering Analysis: Reviewer inflation is not just assignment; bots and automation users can still pollute rankings. Make exclusion explicit and configurable.

@@ ScoringConfig (config.rs)
+ pub excluded_username_patterns: Vec<String>, // defaults include "*bot*", "renovate", "dependabot"

@@ 3. SQL Restructure
+ AND username NOT MATCHES excluded patterns (applied in Rust post-query or SQL where feasible)

@@ CLI
+ --include-bots (override exclusions)

Tighten reviewer “participated” with substantive-note threshold Analysis: A single “LGTM” note shouldn’t classify someone as engaged reviewer equivalent to real inline review. Use a minimum substantive threshold.

@@ ScoringConfig (config.rs)
+ pub reviewer_min_note_chars: u32, // default: 20

@@ reviewer_participation CTE
- SELECT DISTINCT ... FROM matched_notes
+ SELECT DISTINCT ... FROM matched_notes
+ WHERE LENGTH(TRIM(body)) >= ?reviewer_min_note_chars

Add rollout safety: model compare mode + rank-delta diagnostics Analysis: This is a scoring-model migration. You need safe rollout mechanics, not just tests. Add a compare mode so you can inspect rank deltas before forcing v2.

@@ CLI (who)
+ --scoring-model v1|v2|compare   (default: v2)
+ --max-rank-delta-report N       (compare mode diagnostics)

@@ Robot output
+ include v1_score, v2_score, rank_delta when --scoring-model compare

If you want, I can produce a single consolidated “plan v4” document that applies all nine diffs cleanly into your original markdown.

5.7 KiB Raw Permalink Blame History Unescape Escape

5.7 KiB

Raw Permalink Blame History