Files
gitlore/PERFORMANCE_AUDIT.md
Taylor Eernisse 3bb24dc6cb docs: Add performance audit report with optimization findings
PERFORMANCE_AUDIT.md documents a comprehensive code analysis identifying
12 optimization opportunities across the codebase:

High-impact findings (ICE score > 8):
1. Triple-EXISTS change detection -> LEFT JOIN (DONE)
2. N+1 label/assignee inserts during ingestion
3. Clone in embedding batch loop
4. Correlated GROUP_CONCAT in list queries
5. Multiple EXISTS per label filter (DONE)

Medium-impact findings (ICE 5-7):
6. String allocation in chunking
7. Multiple COUNT queries -> conditional aggregation (DONE)
8. Collect-then-concat in truncation (DONE)
9. Box<dyn ToSql> allocations in filters
10. Missing Vec::with_capacity hints (DONE)
11. FTS token collect-join pattern (DONE)
12. Transformer string clones

Report includes:
- Methodology section explaining code-analysis approach
- ICE (Impact x Confidence / Effort) scoring matrix
- Detailed SQL query transformations with isomorphism proofs
- Before/after code samples for each optimization
- Test verification notes

Status: 6 of 12 optimizations implemented in this session.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:23:06 -05:00

468 lines
14 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Gitlore Performance Audit Report
**Date**: 2026-02-05
**Auditor**: Claude Code (Opus 4.5)
**Scope**: Core system performance - ingestion, embedding, search, and document regeneration
## Executive Summary
This audit identifies 12 high-impact optimization opportunities across the Gitlore codebase. The most significant findings center on:
1. **SQL query patterns** with N+1 issues and inefficient correlated subqueries
2. **Memory allocation patterns** in hot paths (embedding, chunking, ingestion)
3. **Change detection queries** using triple-EXISTS patterns instead of JOINs
**Estimated overall improvement potential**: 30-50% reduction in latency for filtered searches, 2-5x improvement in ingestion throughput for issues/MRs with many labels.
---
## Methodology
- **Codebase analysis**: Full read of all modules in `src/`
- **SQL pattern analysis**: All queries checked for N+1, missing indexes, unbounded results
- **Memory allocation analysis**: Clone patterns, unnecessary collections, missing capacity hints
- **Test baseline**: All tests pass (`cargo test --release`)
Note: Without access to a live GitLab instance or populated database, profiling is code-analysis based rather than runtime measured.
---
## Opportunity Matrix
| ID | Issue | Location | Impact | Confidence | Effort | ICE Score | Status |
|----|-------|----------|--------|------------|--------|-----------|--------|
| 1 | Triple-EXISTS change detection | `change_detector.rs:19-46` | HIGH | 95% | LOW | **9.5** | **DONE** |
| 2 | N+1 label/assignee inserts | `issues.rs:270-285`, `merge_requests.rs:242-272` | HIGH | 95% | MEDIUM | **9.0** | Pending |
| 3 | Clone in embedding batch loop | `pipeline.rs:165` | HIGH | 90% | LOW | **9.0** | Pending |
| 4 | Correlated GROUP_CONCAT in list | `list.rs:341-348` | HIGH | 90% | MEDIUM | **8.5** | Pending |
| 5 | Multiple EXISTS per label filter | `filters.rs:100-107` | HIGH | 85% | MEDIUM | **8.0** | **DONE** |
| 6 | String allocation in chunking | `chunking.rs:7-49` | MEDIUM | 95% | MEDIUM | **7.5** | Pending |
| 7 | Multiple COUNT queries | `count.rs:44-56` | MEDIUM | 95% | LOW | **7.0** | **DONE** |
| 8 | Collect-then-concat pattern | `truncation.rs:60-61` | MEDIUM | 90% | LOW | **7.0** | **DONE** |
| 9 | Box<dyn ToSql> allocations | `filters.rs:67-135` | MEDIUM | 80% | HIGH | **6.0** | Pending |
| 10 | Missing Vec::with_capacity | `pipeline.rs:106`, multiple | LOW | 95% | LOW | **5.5** | **DONE** |
| 11 | FTS token collect-join | `fts.rs:26-41` | LOW | 90% | LOW | **5.0** | **DONE** |
| 12 | Transformer string clones | `merge_request.rs:51-77` | MEDIUM | 85% | HIGH | **5.0** | Pending |
ICE Score = (Impact x Confidence) / Effort, scaled 1-10
---
## Detailed Findings
### 1. Triple-EXISTS Change Detection Query (ICE: 9.5)
**Location**: `src/embedding/change_detector.rs:19-46`
**Current Code**:
```sql
SELECT d.id, d.content_text, d.content_hash
FROM documents d
WHERE d.id > ?1
AND (
NOT EXISTS (SELECT 1 FROM embedding_metadata em WHERE em.document_id = d.id AND em.chunk_index = 0)
OR EXISTS (SELECT 1 FROM embedding_metadata em WHERE em.document_id = d.id AND em.chunk_index = 0 AND em.document_hash != d.content_hash)
OR EXISTS (SELECT 1 FROM embedding_metadata em WHERE em.document_id = d.id AND em.chunk_index = 0 AND (...))
)
ORDER BY d.id
LIMIT ?2
```
**Problem**: Three separate EXISTS subqueries, each scanning `embedding_metadata`. SQLite cannot short-circuit across OR'd EXISTS efficiently.
**Proposed Fix**:
```sql
SELECT d.id, d.content_text, d.content_hash
FROM documents d
LEFT JOIN embedding_metadata em
ON em.document_id = d.id AND em.chunk_index = 0
WHERE d.id > ?1
AND (
em.document_id IS NULL -- no embedding
OR em.document_hash != d.content_hash -- hash mismatch
OR em.chunk_max_bytes IS NULL
OR em.chunk_max_bytes != ?3
OR em.model != ?4
OR em.dims != ?5
)
ORDER BY d.id
LIMIT ?2
```
**Isomorphism Proof**: Both queries return documents needing embedding when:
- No embedding exists for chunk_index=0 (NULL check)
- Hash changed (direct comparison)
- Config mismatch (model/dims/chunk_max_bytes)
The LEFT JOIN + NULL check is semantically identical to NOT EXISTS. The OR conditions inside WHERE match the EXISTS predicates exactly.
**Expected Impact**: 2-3x faster for large document sets. Single scan of embedding_metadata instead of three.
---
### 2. N+1 Label/Assignee Inserts (ICE: 9.0)
**Location**:
- `src/ingestion/issues.rs:270-285`
- `src/ingestion/merge_requests.rs:242-272`
**Current Code**:
```rust
for label_name in label_names {
let label_id = upsert_label_tx(tx, project_id, label_name, &mut labels_created)?;
link_issue_label_tx(tx, local_issue_id, label_id)?;
}
```
**Problem**: Each label triggers 2+ SQL statements. With 20 labels × 100 issues = 4000+ queries per batch.
**Proposed Fix**: Batch insert using prepared statements with multi-row VALUES:
```rust
// Build batch: INSERT INTO issue_labels VALUES (?, ?), (?, ?), ...
let mut values = String::new();
let mut params: Vec<Box<dyn ToSql>> = Vec::with_capacity(label_ids.len() * 2);
for (i, label_id) in label_ids.iter().enumerate() {
if i > 0 { values.push_str(","); }
values.push_str("(?,?)");
params.push(Box::new(local_issue_id));
params.push(Box::new(*label_id));
}
let sql = format!("INSERT OR IGNORE INTO issue_labels (issue_id, label_id) VALUES {}", values);
```
Or use `prepare_cached()` pattern from `events_db.rs`.
**Isomorphism Proof**: Both approaches insert identical rows. OR IGNORE handles duplicates identically.
**Expected Impact**: 5-10x faster ingestion for issues/MRs with many labels.
---
### 3. Clone in Embedding Batch Loop (ICE: 9.0)
**Location**: `src/embedding/pipeline.rs:165`
**Current Code**:
```rust
let texts: Vec<String> = batch.iter().map(|c| c.text.clone()).collect();
```
**Problem**: Every batch iteration clones all chunk texts. With BATCH_SIZE=32 and thousands of chunks, this doubles memory allocation in the hot path.
**Proposed Fix**: Transfer ownership instead of cloning:
```rust
// Option A: Drain chunks from all_chunks instead of iterating
let texts: Vec<String> = batch.into_iter().map(|c| c.text).collect();
// Option B: Store references in ChunkWork, clone only at API boundary
struct ChunkWork<'a> {
text: &'a str,
// ...
}
```
**Isomorphism Proof**: Same texts sent to Ollama, same embeddings returned. Order and content identical.
**Expected Impact**: 30-50% reduction in embedding pipeline memory allocation.
---
### 4. Correlated GROUP_CONCAT in List Queries (ICE: 8.5)
**Location**: `src/cli/commands/list.rs:341-348`
**Current Code**:
```sql
SELECT i.*,
(SELECT GROUP_CONCAT(l.name, X'1F') FROM issue_labels il JOIN labels l ... WHERE il.issue_id = i.id) AS labels_csv,
(SELECT COUNT(*) FROM discussions WHERE issue_id = i.id) as discussion_count
FROM issues i
```
**Problem**: Each correlated subquery executes per row. With LIMIT 50, that's 100+ subquery executions.
**Proposed Fix**: Use window functions or pre-aggregated CTEs:
```sql
WITH label_agg AS (
SELECT il.issue_id, GROUP_CONCAT(l.name, X'1F') AS labels_csv
FROM issue_labels il JOIN labels l ON il.label_id = l.id
GROUP BY il.issue_id
),
discussion_agg AS (
SELECT issue_id, COUNT(*) AS cnt
FROM discussions WHERE issue_id IS NOT NULL
GROUP BY issue_id
)
SELECT i.*, la.labels_csv, da.cnt
FROM issues i
LEFT JOIN label_agg la ON la.issue_id = i.id
LEFT JOIN discussion_agg da ON da.issue_id = i.id
WHERE ...
LIMIT 50
```
**Isomorphism Proof**: Same data returned - labels concatenated, discussion counts accurate. JOIN preserves NULL when no labels/discussions exist.
**Expected Impact**: 3-5x faster list queries with discussion/label data.
---
### 5. Multiple EXISTS Per Label Filter (ICE: 8.0)
**Location**: `src/search/filters.rs:100-107`
**Current Code**:
```sql
WHERE EXISTS (SELECT 1 ... AND label_name = ?)
AND EXISTS (SELECT 1 ... AND label_name = ?)
AND EXISTS (SELECT 1 ... AND label_name = ?)
```
**Problem**: Filtering by 3 labels generates 3 EXISTS subqueries. Each scans document_labels.
**Proposed Fix**: Single EXISTS with GROUP BY/HAVING:
```sql
WHERE EXISTS (
SELECT 1 FROM document_labels dl
WHERE dl.document_id = d.id
AND dl.label_name IN (?, ?, ?)
GROUP BY dl.document_id
HAVING COUNT(DISTINCT dl.label_name) = 3
)
```
**Isomorphism Proof**: Both return documents with ALL specified labels. AND of EXISTS = document has label1 AND label2 AND label3. GROUP BY + HAVING COUNT(DISTINCT) = 3 is mathematically equivalent.
**Expected Impact**: 2-4x faster filtered search with multiple labels.
---
### 6. String Allocation in Chunking (ICE: 7.5)
**Location**: `src/embedding/chunking.rs:7-49`
**Current Code**:
```rust
chunks.push((chunk_index, remaining.to_string()));
```
**Problem**: Converts `&str` slices to owned `String` for every chunk. The input is already a `&str`.
**Proposed Fix**: Return borrowed slices or use `Cow`:
```rust
pub fn split_into_chunks(content: &str) -> Vec<(usize, &str)> {
// Return slices into original content
}
```
Or if ownership is needed later:
```rust
pub fn split_into_chunks(content: &str) -> Vec<(usize, Cow<'_, str>)>
```
**Isomorphism Proof**: Same chunk boundaries, same text content. Only allocation behavior changes.
**Expected Impact**: Reduces allocations by ~50% in chunking hot path.
---
### 7. Multiple COUNT Queries (ICE: 7.0)
**Location**: `src/cli/commands/count.rs:44-56`
**Current Code**:
```rust
let count = conn.query_row("SELECT COUNT(*) FROM issues", ...)?;
let opened = conn.query_row("SELECT COUNT(*) FROM issues WHERE state = 'opened'", ...)?;
let closed = conn.query_row("SELECT COUNT(*) FROM issues WHERE state = 'closed'", ...)?;
```
**Problem**: 5 separate queries for MR state breakdown, 3 for issues.
**Proposed Fix**: Single query with CASE aggregation:
```sql
SELECT
COUNT(*) AS total,
SUM(CASE WHEN state = 'opened' THEN 1 ELSE 0 END) AS opened,
SUM(CASE WHEN state = 'closed' THEN 1 ELSE 0 END) AS closed
FROM issues
```
**Isomorphism Proof**: Identical counts returned. CASE WHEN with SUM is standard SQL for conditional counting.
**Expected Impact**: 3-5x fewer round trips for count command.
---
### 8. Collect-then-Concat Pattern (ICE: 7.0)
**Location**: `src/documents/truncation.rs:60-61`
**Current Code**:
```rust
let formatted: Vec<String> = notes.iter().map(format_note).collect();
let total: String = formatted.concat();
```
**Problem**: Allocates intermediate Vec<String>, then allocates again for concat.
**Proposed Fix**: Use fold or format directly:
```rust
let total = notes.iter().fold(String::new(), |mut acc, note| {
acc.push_str(&format_note(note));
acc
});
```
Or with capacity hint:
```rust
let total_len: usize = notes.iter().map(|n| estimate_note_len(n)).sum();
let mut total = String::with_capacity(total_len);
for note in notes {
total.push_str(&format_note(note));
}
```
**Isomorphism Proof**: Same concatenated string output. Order preserved.
**Expected Impact**: 50% reduction in allocations for document regeneration.
---
### 9. Box<dyn ToSql> Allocations (ICE: 6.0)
**Location**: `src/search/filters.rs:67-135`
**Current Code**:
```rust
let mut params: Vec<Box<dyn rusqlite::types::ToSql>> = vec![Box::new(ids_json)];
// ... more Box::new() calls
let param_refs: Vec<&dyn rusqlite::types::ToSql> = params.iter().map(|p| p.as_ref()).collect();
```
**Problem**: Boxing each parameter, then collecting references. Two allocations per parameter.
**Proposed Fix**: Use rusqlite's params! macro or typed parameter arrays:
```rust
// For known parameter counts, use arrays
let params: [&dyn ToSql; 4] = [&ids_json, &author, &state, &limit];
// Or build SQL with named parameters and use params! directly
```
**Expected Impact**: Eliminates ~15 allocations per filtered search.
---
### 10. Missing Vec::with_capacity (ICE: 5.5)
**Locations**:
- `src/embedding/pipeline.rs:106`
- `src/embedding/pipeline.rs:162`
- Multiple other locations
**Current Code**:
```rust
let mut all_chunks: Vec<ChunkWork> = Vec::new();
```
**Proposed Fix**:
```rust
// Estimate: average 3 chunks per document
let mut all_chunks = Vec::with_capacity(pending.len() * 3);
```
**Expected Impact**: Eliminates reallocation overhead during vector growth.
---
### 11. FTS Token Collect-Join (ICE: 5.0)
**Location**: `src/search/fts.rs:26-41`
**Current Code**:
```rust
let tokens: Vec<String> = trimmed.split_whitespace().map(...).collect();
tokens.join(" ")
```
**Proposed Fix**: Use itertools or avoid intermediate vec:
```rust
use itertools::Itertools;
trimmed.split_whitespace().map(...).join(" ")
```
**Expected Impact**: Minor - search queries are typically short.
---
### 12. Transformer String Clones (ICE: 5.0)
**Location**: `src/gitlab/transformers/merge_request.rs:51-77`
**Problem**: Multiple `.clone()` calls on String fields during transformation.
**Proposed Fix**: Use `std::mem::take()` where possible, or restructure to avoid cloning.
**Expected Impact**: Moderate - depends on MR volume.
---
## Regression Guardrails
For any optimization implemented:
1. **Test Coverage**: All existing tests must pass
2. **Output Equivalence**: For SQL changes, verify identical result sets with test data
3. **Benchmark Suite**: Add benchmarks for affected paths before/after
Suggested benchmark targets:
```rust
#[bench] fn bench_change_detection_1k_docs(b: &mut Bencher) { ... }
#[bench] fn bench_label_insert_50_labels(b: &mut Bencher) { ... }
#[bench] fn bench_hybrid_search_filtered(b: &mut Bencher) { ... }
```
---
## Implementation Priority
**Phase 1 (Quick Wins)** - COMPLETE:
1. ~~Change detection query rewrite (#1)~~ **DONE**
2. ~~Multiple COUNT consolidation (#7)~~ **DONE**
3. ~~Collect-concat pattern (#8)~~ **DONE**
4. ~~Vec::with_capacity hints (#10)~~ **DONE**
5. ~~FTS token collect-join (#11)~~ **DONE**
6. ~~Multiple EXISTS per label (#5)~~ **DONE**
**Phase 2 (Medium Effort)**:
5. Embedding batch clone removal (#3)
6. Label filter EXISTS consolidation (#5)
7. Chunking string allocation (#6)
**Phase 3 (Higher Effort)**:
8. N+1 batch inserts (#2)
9. List query CTEs (#4)
10. Parameter boxing (#9)
---
## Appendix: Test Baseline
```
cargo test --release
running 127 tests
test result: ok. 127 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```
All tests pass. Any optimization must maintain this baseline.