# Label Hygiene Audit

- **Command:** `lore label-audit`
- **Confidence:** 82%
- **Tier:** 2
- **Status:** proposed
- **Effort:** low (straightforward aggregation queries)

## What

Report on label health:
- Labels used only once (may be typos or abandoned experiments)
- Labels applied and removed within 1 hour (likely mistakes)
- Labels with no active issues/MRs (orphaned)
- Label name collisions across projects (same name, different meaning)
- Labels never used at all (defined but not applied)

## Why

Label sprawl makes filtering progressively less useful: teams create labels
ad hoc and never clean them up. This simple audit surfaces the resulting
maintenance tasks.

## Data Required

All of the required data exists today:
- `labels` (name, project_id)
- `issue_labels` / `mr_labels` (usage counts)
- `resource_label_events` (add/remove pairs for mistake detection)
- `issues` / `merge_requests` (state for "active" filtering)

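The add/remove pairing used for mistake detection can be sketched outside SQL as well. The following is a minimal illustration, not the project's implementation: `LabelEvent`, its `resource` key (for example `"issue:90"`), and `flash_labels` are hypothetical names, and timestamps are assumed to be epoch milliseconds. Each add is paired with the first later removal of the same label on the same resource, so one add never matches several removals.

```python
from typing import Iterable, List, NamedTuple, Tuple

HOUR_MS = 3_600_000  # one hour in milliseconds


class LabelEvent(NamedTuple):
    """Hypothetical in-memory view of one resource_label_events row."""
    resource: str    # assumed key such as "issue:90" or "mr:12"
    label_name: str
    action: str      # "add" or "remove"
    created_at: int  # assumed to be epoch milliseconds


def flash_labels(
    events: Iterable[LabelEvent], window_ms: int = HOUR_MS
) -> List[Tuple[str, str, int]]:
    """Return (resource, label, minutes_active) for short-lived labels."""
    open_adds = {}  # (resource, label) -> timestamp of the pending add
    flashes = []
    for ev in sorted(events, key=lambda e: e.created_at):
        key = (ev.resource, ev.label_name)
        if ev.action == "add":
            open_adds[key] = ev.created_at
        elif ev.action == "remove" and key in open_adds:
            # Pair the removal with the most recent pending add, then clear it.
            added_at = open_adds.pop(key)
            active_ms = ev.created_at - added_at
            if active_ms < window_ms:
                flashes.append((ev.resource, ev.label_name, active_ms // 60_000))
    return flashes
```

Keying on a single `resource` string lets the same loop cover issues and MRs; how that key maps onto the real event columns is left open here.
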
## Implementation Sketch

```sql
-- Labels used only once (exactly one issue or MR in total)
SELECT
  l.name,
  p.path_with_namespace,
  COUNT(DISTINCT il.issue_id) + COUNT(DISTINCT ml.merge_request_id) AS usage
FROM labels l
JOIN projects p ON l.project_id = p.id
LEFT JOIN issue_labels il ON il.label_id = l.id
LEFT JOIN mr_labels ml ON ml.label_id = l.id
GROUP BY l.id, l.name, p.path_with_namespace
-- DISTINCT guards against row fan-out from joining both label tables
HAVING COUNT(DISTINCT il.issue_id) + COUNT(DISTINCT ml.merge_request_id) = 1;

-- Flash labels (applied and removed within 1 hour)
-- The constants imply millisecond timestamps: 60000 ms = 1 min, 3600000 ms = 1 hr.
-- This join matches issue events only; MR events would need an analogous join.
SELECT
  rle1.label_name,
  rle1.created_at AS added_at,
  rle2.created_at AS removed_at,
  (rle2.created_at - rle1.created_at) / 60000 AS minutes_active
FROM resource_label_events rle1
JOIN resource_label_events rle2
  ON rle1.issue_id = rle2.issue_id
  AND rle1.label_name = rle2.label_name
  AND rle1.action = 'add'
  AND rle2.action = 'remove'
  AND rle2.created_at > rle1.created_at
  AND (rle2.created_at - rle1.created_at) < 3600000;

-- Unused labels (defined but never applied)
SELECT l.name, p.path_with_namespace
FROM labels l
JOIN projects p ON l.project_id = p.id
LEFT JOIN issue_labels il ON il.label_id = l.id
LEFT JOIN mr_labels ml ON ml.label_id = l.id
WHERE il.issue_id IS NULL AND ml.merge_request_id IS NULL;
```

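The sketch covers three of the five checks listed under "What"; cross-project collisions and orphaned labels are not shown. One possible shape for those two queries follows, run against an in-memory SQLite database so it can be tried in isolation. Everything beyond the table and column names already used in the sketch is an assumption: the minimal schema, the `state` column, the `'opened'` value for active items, and the `demo` helper are all hypothetical.

```python
import sqlite3

# Hypothetical minimal schema; the real tables carry more columns.
SCHEMA = """
CREATE TABLE projects (id INTEGER PRIMARY KEY, path_with_namespace TEXT);
CREATE TABLE labels (id INTEGER PRIMARY KEY, project_id INTEGER, name TEXT);
CREATE TABLE issues (id INTEGER PRIMARY KEY, project_id INTEGER, state TEXT);
CREATE TABLE merge_requests (id INTEGER PRIMARY KEY, project_id INTEGER, state TEXT);
CREATE TABLE issue_labels (issue_id INTEGER, label_id INTEGER);
CREATE TABLE mr_labels (merge_request_id INTEGER, label_id INTEGER);
"""

# Same label name defined in more than one project.
COLLISIONS_SQL = """
SELECT l.name, COUNT(DISTINCT l.project_id) AS project_count
FROM labels l
GROUP BY l.name
HAVING COUNT(DISTINCT l.project_id) > 1
ORDER BY l.name;
"""

# Labels that were applied at least once but sit on no open issue or MR.
# Assumes 'opened' marks an active item.
ORPHANED_SQL = """
SELECT l.name, p.path_with_namespace
FROM labels l
JOIN projects p ON l.project_id = p.id
WHERE (EXISTS (SELECT 1 FROM issue_labels il WHERE il.label_id = l.id)
       OR EXISTS (SELECT 1 FROM mr_labels ml WHERE ml.label_id = l.id))
  AND NOT EXISTS (
        SELECT 1 FROM issue_labels il
        JOIN issues i ON i.id = il.issue_id
        WHERE il.label_id = l.id AND i.state = 'opened')
  AND NOT EXISTS (
        SELECT 1 FROM mr_labels ml
        JOIN merge_requests m ON m.id = ml.merge_request_id
        WHERE ml.label_id = l.id AND m.state = 'opened')
ORDER BY p.path_with_namespace, l.name;
"""


def demo():
    """Load a tiny fixture and return (collisions, orphaned) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript("""
    INSERT INTO projects VALUES (1, 'group/backend'), (2, 'group/frontend');
    INSERT INTO labels VALUES
      (1, 1, 'needs-review'), (2, 2, 'needs-review'),
      (3, 1, 'blocked'), (4, 1, 'deprecated-v1');
    INSERT INTO issues VALUES (10, 1, 'opened'), (11, 1, 'closed');
    INSERT INTO merge_requests VALUES (20, 2, 'merged');
    INSERT INTO issue_labels VALUES (10, 1), (11, 3);
    INSERT INTO mr_labels VALUES (20, 2);
    """)
    collisions = conn.execute(COLLISIONS_SQL).fetchall()
    orphaned = conn.execute(ORPHANED_SQL).fetchall()
    conn.close()
    return collisions, orphaned
```

The orphaned query deliberately excludes never-applied labels so that the "orphaned" and "unused" report sections do not overlap; that is a design choice of this sketch rather than something the proposal specifies.
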
## Human Output

```
Label Audit

Unused Labels (4):
  group/backend: deprecated-v1, needs-triage, wontfix-maybe
  group/frontend: old-design

Single-Use Labels (3):
  group/backend: perf-regression (1 issue)
  group/frontend: ux-debt (1 MR), mobile-only (1 issue)

Flash Labels (applied < 1hr, 2):
  group/backend #90: +priority::critical then -priority::critical (12 min)
  group/backend #85: +blocked then -blocked (5 min)

Cross-Project Collisions (1):
  "needs-review" used in group/backend (32 uses) AND group/frontend (8 uses)
```

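As an illustration of how query rows could be turned into one section of this report, here is a minimal sketch for the "Unused Labels" block. The function name `format_unused`, the `(label_name, project_path)` row shape, and the two-space indent are assumptions made for the example.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def format_unused(rows: Iterable[Tuple[str, str]]) -> str:
    """Render (label_name, project_path) rows as an 'Unused Labels' section."""
    by_project = defaultdict(list)
    for name, project in rows:
        by_project[project].append(name)

    total = sum(len(names) for names in by_project.values())
    lines = [f"Unused Labels ({total}):"]
    # Sort projects and label names so the output is stable across runs.
    for project in sorted(by_project):
        names = ", ".join(sorted(by_project[project]))
        lines.append(f"  {project}: {names}")
    return "\n".join(lines)
```

The other sections would follow the same group-by-project pattern with a different per-item suffix, such as a usage count or a duration in minutes.
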
## Downsides

- Low glamour; this is janitorial work
- Single-use labels may be legitimate (one-off categorization)
- Cross-project collisions may be intentional (shared vocabulary)

## Extensions

- `lore label-audit --fix`: suggest deletions for unused labels
- Trend: label count over time (is sprawl increasing?)