docs: add feature ideas catalog, time-decay scoring plan, and timeline issue doc

Ideas catalog (docs/ideas/): 25 feature concept documents covering future
lore capabilities including bottleneck detection, churn analysis, expert
scoring, collaboration patterns, milestone risk, knowledge silos, and more.
Each doc includes motivation, implementation sketch, data requirements, and
dependencies on existing infrastructure. README.md provides an overview and
SYSTEM-PROPOSAL.md presents the unified analytics vision.

Plans (plans/): Time-decay expert scoring design with four rounds of review
feedback exploring decay functions, scoring algebra, and integration points
with the existing who-expert pipeline.

Issue doc (docs/issues/001): Documents the timeline pipeline bug where
EntityRef was missing project context, causing ambiguous cross-project
references during the EXPAND stage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Taylor Eernisse
2026-02-09 10:16:48 -05:00
parent d54f669c5e
commit 4185abe05d
32 changed files with 4170 additions and 0 deletions

docs/ideas/silos.md Normal file

@@ -0,0 +1,90 @@
# Knowledge Silo Detection
- **Command:** `lore silos [--min-changes <N>]`
- **Confidence:** 87%
- **Tier:** 2
- **Status:** proposed
- **Effort:** medium — requires mr_file_changes population (Gate 4)
## What
For each file path (or directory), count the distinct authors of merged MRs that
touched it. Flag paths where only one person has ever authored changes
(bus factor = 1). Aggregate by directory to show silo areas.
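The counting step described above can be sketched in memory as follows. This is a minimal illustration, not the proposed implementation: the function name `find_silos`, the `(path, author)` input shape, and the sample file names are assumptions made for the example.

```python
from collections import defaultdict


def find_silos(changes, min_changes=1):
    """Return {directory: (author, change_count)} for single-author directories.

    `changes` is an iterable of (path, author) pairs, assumed to come from
    merged MRs only.
    """
    authors = defaultdict(set)
    counts = defaultdict(int)
    for path, author in changes:
        # Directory = everything up to and including the last '/',
        # or '.' for files in the repository root.
        directory = path[: path.rfind("/") + 1] or "."
        authors[directory].add(author)
        counts[directory] += 1
    return {
        directory: (next(iter(names)), counts[directory])
        for directory, names in authors.items()
        if len(names) == 1 and counts[directory] >= min_changes
    }


# Hypothetical sample data for illustration.
changes = [
    ("src/auth/login.rs", "alice"),
    ("src/auth/token.rs", "alice"),
    ("src/api/routes.rs", "alice"),
    ("src/api/routes.rs", "bob"),
    ("README.md", "bob"),
]
silos = find_silos(changes, min_changes=2)
```

With this sample, only `src/auth/` qualifies: `src/api/` has two authors, and the root directory falls below the `min_changes` threshold.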
## Why
Bus factor analysis is critical for team resilience. If only one person has ever
touched the auth module, that's a risk. Once `mr_file_changes` is populated (Gate 4),
the ingested MR data is enough to surface knowledge concentration that is otherwise
invisible.
## Data Required
- `mr_file_changes` (new_path, merge_request_id) — needs Gate 4 ingestion
- `merge_requests` (author_username, state='merged')
- `projects` (path_with_namespace)
## Implementation Sketch
```sql
-- Find directories with bus factor = 1
WITH file_authors AS (
    SELECT
        mfc.new_path,
        mr.author_username,
        p.path_with_namespace,
        mfc.project_id
    FROM mr_file_changes mfc
    JOIN merge_requests mr ON mfc.merge_request_id = mr.id
    JOIN projects p ON mfc.project_id = p.id
    WHERE mr.state = 'merged'
),
directory_authors AS (
    SELECT
        project_id,
        path_with_namespace,
        -- Extract directory: everything up to and including the last '/'.
        -- RTRIM strips trailing characters found in its second argument (here,
        -- every non-'/' character of the path), so it stops at the last '/'.
        CASE
            WHEN INSTR(new_path, '/') > 0
            THEN RTRIM(new_path, REPLACE(new_path, '/', ''))
            ELSE '.'
        END AS directory,
        COUNT(DISTINCT author_username) AS unique_authors,
        COUNT(*) AS total_changes,
        GROUP_CONCAT(DISTINCT author_username) AS authors
    FROM file_authors
    GROUP BY project_id, path_with_namespace, directory
)
SELECT * FROM directory_authors
WHERE unique_authors = 1
  AND total_changes >= ?1  -- min-changes threshold
ORDER BY total_changes DESC;
```
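The `RTRIM(path, REPLACE(path, '/', ''))` idiom for locating the last `/` is easy to get wrong, so it may be worth checking in isolation. The sketch below runs the expression against an in-memory SQLite database using Python's standard `sqlite3` module. The helper name `directory_of` and the sample paths are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def directory_of(path):
    """Evaluate the directory-extraction expression in SQLite for one path."""
    row = conn.execute(
        """
        SELECT CASE
                 WHEN INSTR(:p, '/') > 0
                 -- Strip trailing non-'/' characters, leaving 'dir/sub/'.
                 THEN RTRIM(:p, REPLACE(:p, '/', ''))
                 ELSE '.'
               END
        """,
        {"p": path},
    ).fetchone()
    return row[0]


# Hypothetical paths for illustration.
nested = directory_of("src/auth/login.rs")
root = directory_of("README.md")
deep = directory_of("a/b/c/d.txt")
```

The extracted directory keeps its trailing slash, which matches how directories are shown in the human-readable output.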
## Human Output
```
Knowledge Silos (bus factor = 1, min 3 changes)
  group/backend
    src/auth/          alice    (8 changes)   HIGH RISK
    src/billing/       bob      (5 changes)   HIGH RISK
    src/utils/cache/   charlie  (3 changes)   MODERATE RISK
  group/frontend
    src/admin/         dave     (12 changes)  HIGH RISK
```
## Downsides
- Historical authors may have left the team; needs recency weighting
- Requires `mr_file_changes` to be populated (Gate 4)
- Single-author directories may be intentional (ownership model)
- Directory aggregation heuristic is imperfect for deep nesting: grouping by immediate parent directory can split one logical area into many small silos
## Extensions
- `lore silos --since 180d` — only count recent activity
- `lore silos --depth 2` — aggregate at directory depth N
- Combine with `lore experts` to show both silos and experts in one view
- Risk scoring: weight by directory size, change frequency, recency
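One possible reading of the `--depth` extension is to truncate each path to its first N directory components before grouping. The source does not define the flag's semantics, so this interpretation, the function name `truncate_to_depth`, and the sample paths are all assumptions.

```python
def truncate_to_depth(path, depth):
    """Return the first `depth` directory components of `path`, with a
    trailing '/', or '.' when the file sits in the repository root."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    parts = path.split("/")[:-1]  # drop the file name
    if not parts:
        return "."
    return "/".join(parts[:depth]) + "/"


# Hypothetical paths for illustration.
deep = truncate_to_depth("src/utils/cache/lru.rs", 2)
shallow = truncate_to_depth("src/main.rs", 2)
root = truncate_to_depth("README.md", 2)
```

Paths shallower than the requested depth keep their full directory, so `src/main.rs` still groups under `src/` at depth 2.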