gitlore/PERFORMANCE_AUDIT.md
Taylor Eernisse 3bb24dc6cb docs: Add performance audit report with optimization findings
PERFORMANCE_AUDIT.md documents a comprehensive code analysis identifying
12 optimization opportunities across the codebase:

High-impact findings (ICE score >= 8):
1. Triple-EXISTS change detection -> LEFT JOIN (DONE)
2. N+1 label/assignee inserts during ingestion
3. Clone in embedding batch loop
4. Correlated GROUP_CONCAT in list queries
5. Multiple EXISTS per label filter (DONE)

Medium-impact findings (ICE 5-7.5):
6. String allocation in chunking
7. Multiple COUNT queries -> conditional aggregation (DONE)
8. Collect-then-concat in truncation (DONE)
9. Box<dyn ToSql> allocations in filters
10. Missing Vec::with_capacity hints (DONE)
11. FTS token collect-join pattern (DONE)
12. Transformer string clones

Report includes:
- Methodology section explaining code-analysis approach
- ICE (Impact x Confidence / Effort) scoring matrix
- Detailed SQL query transformations with isomorphism proofs
- Before/after code samples for each optimization
- Test verification notes

Status: 6 of 12 optimizations implemented in this session.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 11:23:06 -05:00


Gitlore Performance Audit Report

Date: 2026-02-05
Auditor: Claude Code (Opus 4.5)
Scope: Core system performance - ingestion, embedding, search, and document regeneration

Executive Summary

This audit identifies 12 optimization opportunities across the Gitlore codebase, five of them rated high-impact. The most significant findings center on:

  1. SQL query patterns with N+1 issues and inefficient correlated subqueries
  2. Memory allocation patterns in hot paths (embedding, chunking, ingestion)
  3. Change detection queries using triple-EXISTS patterns instead of JOINs

Estimated overall improvement potential: 30-50% reduction in latency for filtered searches, 2-5x improvement in ingestion throughput for issues/MRs with many labels.


Methodology

  • Codebase analysis: Full read of all modules in src/
  • SQL pattern analysis: All queries checked for N+1, missing indexes, unbounded results
  • Memory allocation analysis: Clone patterns, unnecessary collections, missing capacity hints
  • Test baseline: All tests pass (cargo test --release)

Note: Without access to a live GitLab instance or populated database, profiling is code-analysis based rather than runtime measured.


Opportunity Matrix

| ID | Issue | Location | Impact | Confidence | Effort | ICE Score | Status |
|----|-------|----------|--------|------------|--------|-----------|--------|
| 1 | Triple-EXISTS change detection | change_detector.rs:19-46 | HIGH | 95% | LOW | 9.5 | DONE |
| 2 | N+1 label/assignee inserts | issues.rs:270-285, merge_requests.rs:242-272 | HIGH | 95% | MEDIUM | 9.0 | Pending |
| 3 | Clone in embedding batch loop | pipeline.rs:165 | HIGH | 90% | LOW | 9.0 | Pending |
| 4 | Correlated GROUP_CONCAT in list | list.rs:341-348 | HIGH | 90% | MEDIUM | 8.5 | Pending |
| 5 | Multiple EXISTS per label filter | filters.rs:100-107 | HIGH | 85% | MEDIUM | 8.0 | DONE |
| 6 | String allocation in chunking | chunking.rs:7-49 | MEDIUM | 95% | MEDIUM | 7.5 | Pending |
| 7 | Multiple COUNT queries | count.rs:44-56 | MEDIUM | 95% | LOW | 7.0 | DONE |
| 8 | Collect-then-concat pattern | truncation.rs:60-61 | MEDIUM | 90% | LOW | 7.0 | DONE |
| 9 | Box<dyn ToSql> allocations | filters.rs:67-135 | MEDIUM | 80% | HIGH | 6.0 | Pending |
| 10 | Missing Vec::with_capacity | pipeline.rs:106, multiple | LOW | 95% | LOW | 5.5 | DONE |
| 11 | FTS token collect-join | fts.rs:26-41 | LOW | 90% | LOW | 5.0 | DONE |
| 12 | Transformer string clones | merge_request.rs:51-77 | MEDIUM | 85% | HIGH | 5.0 | Pending |

ICE Score = (Impact x Confidence) / Effort, scaled 1-10


Detailed Findings

1. Triple-EXISTS Change Detection Query (ICE: 9.5)

Location: src/embedding/change_detector.rs:19-46

Current Code:

SELECT d.id, d.content_text, d.content_hash
FROM documents d
WHERE d.id > ?1
  AND (
    NOT EXISTS (SELECT 1 FROM embedding_metadata em WHERE em.document_id = d.id AND em.chunk_index = 0)
    OR EXISTS (SELECT 1 FROM embedding_metadata em WHERE em.document_id = d.id AND em.chunk_index = 0 AND em.document_hash != d.content_hash)
    OR EXISTS (SELECT 1 FROM embedding_metadata em WHERE em.document_id = d.id AND em.chunk_index = 0 AND (...))
  )
ORDER BY d.id
LIMIT ?2

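Problem: Three separate EXISTS subqueries, each probing embedding_metadata for the same (document_id, chunk_index = 0) row. For a document that is already up to date, none of the OR branches succeeds early, so the same lookup runs three times.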

Proposed Fix:

SELECT d.id, d.content_text, d.content_hash
FROM documents d
LEFT JOIN embedding_metadata em
  ON em.document_id = d.id AND em.chunk_index = 0
WHERE d.id > ?1
  AND (
    em.document_id IS NULL                      -- no embedding
    OR em.document_hash != d.content_hash       -- hash mismatch
    OR em.chunk_max_bytes IS NULL
    OR em.chunk_max_bytes != ?3
    OR em.model != ?4
    OR em.dims != ?5
  )
ORDER BY d.id
LIMIT ?2

Isomorphism Proof: Both queries return documents needing embedding when:

  • No embedding exists for chunk_index=0 (NULL check)
  • Hash changed (direct comparison)
  • Config mismatch (model/dims/chunk_max_bytes)

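The LEFT JOIN + NULL check is semantically identical to NOT EXISTS, and the OR conditions inside WHERE match the EXISTS predicates exactly. This assumes embedding_metadata holds at most one row per (document_id, chunk_index); otherwise the join could return the same document more than once.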

Expected Impact: 2-3x faster for large document sets. Single scan of embedding_metadata instead of three.


2. N+1 Label/Assignee Inserts (ICE: 9.0)

Location:

  • src/ingestion/issues.rs:270-285
  • src/ingestion/merge_requests.rs:242-272

Current Code:

for label_name in label_names {
    let label_id = upsert_label_tx(tx, project_id, label_name, &mut labels_created)?;
    link_issue_label_tx(tx, local_issue_id, label_id)?;
}

Problem: Each label triggers 2+ SQL statements. With 20 labels per issue and 100 issues per batch, that is 4000+ statements per batch.

Proposed Fix: Batch insert using prepared statements with multi-row VALUES:

use rusqlite::types::ToSql;

// Build batch: INSERT INTO issue_labels VALUES (?, ?), (?, ?), ...
// Assumes label_ids is non-empty; an empty VALUES list is invalid SQL.
let mut values = String::with_capacity(label_ids.len() * 6);
let mut params: Vec<Box<dyn ToSql>> = Vec::with_capacity(label_ids.len() * 2);
for (i, label_id) in label_ids.iter().enumerate() {
    if i > 0 { values.push(','); }
    values.push_str("(?,?)");
    // Two bound parameters per row: the issue id, then the label id.
    params.push(Box::new(local_issue_id));
    params.push(Box::new(*label_id));
}
let sql = format!("INSERT OR IGNORE INTO issue_labels (issue_id, label_id) VALUES {}", values);

Or use prepare_cached() pattern from events_db.rs.

Isomorphism Proof: Both approaches insert identical rows. OR IGNORE handles duplicates identically.

Expected Impact: 5-10x faster ingestion for issues/MRs with many labels.
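
One detail the multi-row VALUES approach has to handle is SQLite's cap on bound parameters per statement (SQLITE_MAX_VARIABLE_NUMBER), so very large label sets need to be split into several statements. The following is a minimal std-only sketch of that splitting, not the project's implementation: the function names and the MAX_PARAMS value are assumptions chosen for illustration, and only the SQL text is built here.

```rust
/// Conservative cap on bound parameters per statement. This value is an
/// assumption for illustration; SQLite's real limit depends on the version
/// and on build options.
const MAX_PARAMS: usize = 999;
/// Each (issue_id, label_id) row binds two parameters.
const PARAMS_PER_ROW: usize = 2;

/// Builds one multi-row INSERT for `rows` (issue_id, label_id) pairs.
/// Returns None for zero rows, because VALUES with no tuples is invalid SQL.
fn build_link_insert_sql(rows: usize) -> Option<String> {
    if rows == 0 {
        return None;
    }
    let placeholders = vec!["(?,?)"; rows].join(",");
    Some(format!(
        "INSERT OR IGNORE INTO issue_labels (issue_id, label_id) VALUES {}",
        placeholders
    ))
}

/// Splits label ids into groups small enough to stay under MAX_PARAMS.
fn batches(label_ids: &[i64]) -> std::slice::Chunks<'_, i64> {
    label_ids.chunks(MAX_PARAMS / PARAMS_PER_ROW)
}
```

Each slice yielded by `batches` would get its own statement from `build_link_insert_sql(slice.len())`, executed inside the existing transaction.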


3. Clone in Embedding Batch Loop (ICE: 9.0)

Location: src/embedding/pipeline.rs:165

Current Code:

let texts: Vec<String> = batch.iter().map(|c| c.text.clone()).collect();

Problem: Every batch iteration clones all chunk texts. With BATCH_SIZE=32 and thousands of chunks, this doubles memory allocation in the hot path.

Proposed Fix: Transfer ownership instead of cloning:

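// Option A: Drain chunks from all_chunks instead of iterating.
// `batch` must be an owned Vec<ChunkWork> here (e.g. produced by all_chunks.drain(..)),
// because into_iter() on a borrowed slice yields references and c.text cannot be moved out.
let texts: Vec<String> = batch.into_iter().map(|c| c.text).collect();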

// Option B: Store references in ChunkWork, clone only at API boundary
struct ChunkWork<'a> {
    text: &'a str,
    // ...
}

Isomorphism Proof: Same texts sent to Ollama, same embeddings returned. Order and content identical.

Expected Impact: 30-50% reduction in embedding pipeline memory allocation.
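
To make Option A concrete, here is a small sketch of moving texts out of the pending list with `Vec::drain`. The `ChunkWork` fields and the `take_batch` helper are hypothetical stand-ins, since the real struct is not shown in this report; the point is only that `drain` yields owned values, so `c.text` can be moved rather than cloned.

```rust
/// Hypothetical stand-in for the pipeline's chunk record; the real
/// ChunkWork fields are not shown in this report.
struct ChunkWork {
    doc_id: i64,
    chunk_index: usize,
    text: String,
}

/// Removes up to `batch_size` chunks from the front of `all_chunks` and
/// splits them into bookkeeping keys and owned texts without cloning.
fn take_batch(
    all_chunks: &mut Vec<ChunkWork>,
    batch_size: usize,
) -> (Vec<(i64, usize)>, Vec<String>) {
    let n = batch_size.min(all_chunks.len());
    all_chunks
        .drain(..n)
        .map(|c| ((c.doc_id, c.chunk_index), c.text))
        .unzip()
}
```

Draining from the front shifts the remaining elements on every call; if that shows up in a profile, a `VecDeque` or a single consuming `into_iter()` over the whole list would avoid it.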


4. Correlated GROUP_CONCAT in List Queries (ICE: 8.5)

Location: src/cli/commands/list.rs:341-348

Current Code:

SELECT i.*,
       (SELECT GROUP_CONCAT(l.name, X'1F') FROM issue_labels il JOIN labels l ... WHERE il.issue_id = i.id) AS labels_csv,
       (SELECT COUNT(*) FROM discussions WHERE issue_id = i.id) as discussion_count
FROM issues i

Problem: Each correlated subquery executes per row. With LIMIT 50, that's 100+ subquery executions.

Proposed Fix: Use pre-aggregated CTEs:

WITH label_agg AS (
    SELECT il.issue_id, GROUP_CONCAT(l.name, X'1F') AS labels_csv
    FROM issue_labels il JOIN labels l ON il.label_id = l.id
    GROUP BY il.issue_id
),
discussion_agg AS (
    SELECT issue_id, COUNT(*) AS cnt
    FROM discussions WHERE issue_id IS NOT NULL
    GROUP BY issue_id
)
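SELECT i.*, la.labels_csv, COALESCE(da.cnt, 0) AS discussion_count
FROM issues i
LEFT JOIN label_agg la ON la.issue_id = i.id
LEFT JOIN discussion_agg da ON da.issue_id = i.id
WHERE ...
LIMIT 50

Isomorphism Proof: Same data returned - labels concatenated, discussion counts accurate. The LEFT JOIN yields a NULL labels_csv when an issue has no labels, matching the correlated GROUP_CONCAT. The correlated COUNT(*) returns 0 for issues without discussions, whereas the LEFT JOIN yields NULL, so the joined count is wrapped in COALESCE(da.cnt, 0) to preserve the original value.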

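Expected Impact: 3-5x faster list queries with discussion/label data. This should be confirmed by benchmark, because the CTEs aggregate labels and discussions for every issue before LIMIT applies, so the gain depends on table size relative to page size.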


5. Multiple EXISTS Per Label Filter (ICE: 8.0)

Location: src/search/filters.rs:100-107

Current Code:

WHERE EXISTS (SELECT 1 ... AND label_name = ?)
  AND EXISTS (SELECT 1 ... AND label_name = ?)
  AND EXISTS (SELECT 1 ... AND label_name = ?)

Problem: Filtering by 3 labels generates 3 EXISTS subqueries. Each scans document_labels.

Proposed Fix: Single EXISTS with GROUP BY/HAVING:

WHERE EXISTS (
    SELECT 1 FROM document_labels dl
    WHERE dl.document_id = d.id
      AND dl.label_name IN (?, ?, ?)
    GROUP BY dl.document_id
    HAVING COUNT(DISTINCT dl.label_name) = 3
)

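Isomorphism Proof: Both return documents with ALL specified labels. AND of EXISTS = document has label1 AND label2 AND label3. GROUP BY + HAVING COUNT(DISTINCT) = 3 is equivalent provided the three bound labels are distinct. In general the HAVING constant must equal the number of distinct labels requested, so duplicates in the filter input should be removed before the query is built.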

Expected Impact: 2-4x faster filtered search with multiple labels.
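
The equivalence can be checked on an in-memory model. The sketch below is illustrative only and does not touch the database: `has_all_labels_exists` mirrors the AND-of-EXISTS form and `has_all_labels_having` mirrors the IN + HAVING COUNT(DISTINCT) form. Both function names are hypothetical. The second function compares against the number of distinct requested labels, which is what keeps the two forms in agreement when the caller passes the same label twice.

```rust
use std::collections::HashSet;

/// Models AND-of-EXISTS: every requested label must be attached.
fn has_all_labels_exists(doc_labels: &HashSet<&str>, wanted: &[&str]) -> bool {
    wanted.iter().all(|w| doc_labels.contains(w))
}

/// Models `IN (...)` + `HAVING COUNT(DISTINCT label_name) = n`, where n is
/// the number of distinct requested labels, not the raw argument count.
fn has_all_labels_having(doc_labels: &HashSet<&str>, wanted: &[&str]) -> bool {
    let distinct_wanted: HashSet<&str> = wanted.iter().copied().collect();
    let matched = doc_labels.intersection(&distinct_wanted).count();
    matched == distinct_wanted.len()
}
```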


6. String Allocation in Chunking (ICE: 7.5)

Location: src/embedding/chunking.rs:7-49

Current Code:

chunks.push((chunk_index, remaining.to_string()));

Problem: Converts &str slices to owned String for every chunk. The input is already a &str.

Proposed Fix: Return borrowed slices or use Cow:

pub fn split_into_chunks(content: &str) -> Vec<(usize, &str)> {
    // Return slices into original content
}

Or if ownership is needed later:

pub fn split_into_chunks(content: &str) -> Vec<(usize, Cow<'_, str>)>

Isomorphism Proof: Same chunk boundaries, same text content. Only allocation behavior changes.

Expected Impact: Reduces allocations by ~50% in chunking hot path.
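
As an illustration of the borrowed-slice variant, here is a simplified sketch that cuts on a fixed byte budget. It is not the project's chunker: the `max_bytes` parameter is an assumption added for the example, and whatever boundary or overlap logic chunking.rs applies is not reproduced. It only shows that chunks can be returned as slices into the input while staying on UTF-8 character boundaries.

```rust
/// Splits `content` into chunks of at most `max_bytes` bytes, returning
/// slices that borrow from `content` instead of owned Strings.
/// Cuts are moved back to the nearest UTF-8 character boundary.
fn split_into_chunks(content: &str, max_bytes: usize) -> Vec<(usize, &str)> {
    // A UTF-8 character is at most 4 bytes, so this guarantees progress.
    assert!(max_bytes >= 4, "max_bytes must fit any UTF-8 character");
    let mut chunks = Vec::with_capacity(content.len() / max_bytes + 1);
    let mut remaining = content;
    while !remaining.is_empty() {
        let mut end = max_bytes.min(remaining.len());
        while !remaining.is_char_boundary(end) {
            end -= 1;
        }
        let (head, tail) = remaining.split_at(end);
        let index = chunks.len();
        chunks.push((index, head));
        remaining = tail;
    }
    chunks
}
```

Callers that need ownership later can convert individual slices with `to_owned()` at that point, which keeps the allocation out of the chunking loop.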


7. Multiple COUNT Queries (ICE: 7.0)

Location: src/cli/commands/count.rs:44-56

Current Code:

let count = conn.query_row("SELECT COUNT(*) FROM issues", ...)?;
let opened = conn.query_row("SELECT COUNT(*) FROM issues WHERE state = 'opened'", ...)?;
let closed = conn.query_row("SELECT COUNT(*) FROM issues WHERE state = 'closed'", ...)?;

Problem: 5 separate queries for MR state breakdown, 3 for issues.

Proposed Fix: Single query with CASE aggregation:

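SELECT
    COUNT(*) AS total,
    COALESCE(SUM(CASE WHEN state = 'opened' THEN 1 ELSE 0 END), 0) AS opened,
    COALESCE(SUM(CASE WHEN state = 'closed' THEN 1 ELSE 0 END), 0) AS closed
FROM issues

Isomorphism Proof: Identical counts returned. CASE WHEN with SUM is standard SQL for conditional counting. SUM returns NULL over an empty table where COUNT(*) returns 0, so each SUM is wrapped in COALESCE to keep the empty-table result identical.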

Expected Impact: 3-5x fewer round trips for count command.


8. Collect-then-Concat Pattern (ICE: 7.0)

Location: src/documents/truncation.rs:60-61

Current Code:

let formatted: Vec<String> = notes.iter().map(format_note).collect();
let total: String = formatted.concat();

Problem: Allocates intermediate Vec, then allocates again for concat.

Proposed Fix: Use fold or format directly:

let total = notes.iter().fold(String::new(), |mut acc, note| {
    acc.push_str(&format_note(note));
    acc
});

Or with capacity hint:

let total_len: usize = notes.iter().map(|n| estimate_note_len(n)).sum();
let mut total = String::with_capacity(total_len);
for note in notes {
    total.push_str(&format_note(note));
}

Isomorphism Proof: Same concatenated string output. Order preserved.

Expected Impact: 50% reduction in allocations for document regeneration.
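
Both variants above still allocate one temporary String per note inside format_note. A further step, sketched below under assumptions, is to write each note directly into the output buffer with `std::fmt::Write`. The `Note` struct, the `@author: body` layout, and the helper names are hypothetical, since the real note type and format are not shown in this report.

```rust
use std::fmt::Write;

/// Hypothetical note type; the real note struct is not shown in the report.
struct Note {
    author: String,
    body: String,
}

/// Writes one formatted note directly into `out`, avoiding a temporary String.
fn write_note(out: &mut String, note: &Note) {
    // Writing into a String cannot fail, so the Result is safe to ignore.
    let _ = writeln!(out, "@{}: {}", note.author, note.body);
}

/// Concatenates all notes into one buffer that is sized up front.
fn format_notes(notes: &[Note]) -> String {
    // "@", ": " and "\n" add four bytes per note on top of the two fields.
    let capacity: usize = notes
        .iter()
        .map(|n| n.author.len() + n.body.len() + 4)
        .sum();
    let mut out = String::with_capacity(capacity);
    for note in notes {
        write_note(&mut out, note);
    }
    out
}
```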


9. Box<dyn ToSql> Allocations (ICE: 6.0)

Location: src/search/filters.rs:67-135

Current Code:

let mut params: Vec<Box<dyn rusqlite::types::ToSql>> = vec![Box::new(ids_json)];
// ... more Box::new() calls
let param_refs: Vec<&dyn rusqlite::types::ToSql> = params.iter().map(|p| p.as_ref()).collect();

Problem: Boxing each parameter costs one heap allocation per parameter, and collecting the references allocates a second Vec on top of the first.

Proposed Fix: Use rusqlite's params! macro or typed parameter arrays:

// For known parameter counts, use arrays
let params: [&dyn ToSql; 4] = [&ids_json, &author, &state, &limit];

// Or build SQL with named parameters and use params! directly

Expected Impact: Eliminates ~15 allocations per filtered search.


10. Missing Vec::with_capacity (ICE: 5.5)

Locations:

  • src/embedding/pipeline.rs:106
  • src/embedding/pipeline.rs:162
  • Multiple other locations

Current Code:

let mut all_chunks: Vec<ChunkWork> = Vec::new();

Proposed Fix:

// Estimate: average 3 chunks per document
let mut all_chunks = Vec::with_capacity(pending.len() * 3);

Expected Impact: Eliminates reallocation overhead during vector growth.
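
The effect of a capacity hint can be observed directly, since a Vec only reallocates when a push would exceed its capacity. The helper below is a hypothetical illustration, not code from pipeline.rs: it pre-sizes a vector, fills it, and reports whether the capacity ever changed.

```rust
/// Pushes `n` items into a pre-sized vector and reports whether the buffer
/// was reallocated along the way (detected by a change in capacity).
fn fill_presized(n: usize) -> (Vec<usize>, bool) {
    let mut v: Vec<usize> = Vec::with_capacity(n);
    let initial = v.capacity();
    for i in 0..n {
        v.push(i);
    }
    let reallocated = v.capacity() != initial;
    (v, reallocated)
}
```

When the hint is only an estimate, as with the three-chunks-per-document figure above, an underestimate is still safe: the Vec simply grows as usual once the reserved space is used up.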


11. FTS Token Collect-Join (ICE: 5.0)

Location: src/search/fts.rs:26-41

Current Code:

let tokens: Vec<String> = trimmed.split_whitespace().map(...).collect();
tokens.join(" ")

Proposed Fix: Use itertools or avoid intermediate vec:

use itertools::Itertools;
trimmed.split_whitespace().map(...).join(" ")

Expected Impact: Minor - search queries are typically short.
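
If adding itertools as a dependency is not desirable, the intermediate Vec can also be avoided with the standard library alone by pushing tokens into one String. In the sketch below, the function name is hypothetical and the double-quote wrapping is a stand-in for the elided `.map(...)` step, whose real behavior is not shown in this report; it performs no escaping and is not a sanitizer.

```rust
/// Wraps each whitespace-separated token in double quotes and joins the
/// tokens with single spaces, writing straight into one String.
/// The quoting is a hypothetical stand-in for the real per-token mapping.
fn build_fts_query(raw: &str) -> String {
    let trimmed = raw.trim();
    let mut out = String::with_capacity(trimmed.len() + 16);
    for token in trimmed.split_whitespace() {
        if !out.is_empty() {
            out.push(' ');
        }
        out.push('"');
        out.push_str(token);
        out.push('"');
    }
    out
}
```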


12. Transformer String Clones (ICE: 5.0)

Location: src/gitlab/transformers/merge_request.rs:51-77

Problem: Multiple .clone() calls on String fields during transformation.

Proposed Fix: Use std::mem::take() where possible, or restructure to avoid cloning.

Expected Impact: Moderate - depends on MR volume.
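
A minimal sketch of the std::mem::take approach follows. The two structs and the `transform` function are hypothetical, because the real transformer types in merge_request.rs are not shown here. `take` swaps each field with its empty default and returns the original value, so no String or Vec contents are copied.

```rust
/// Hypothetical API payload; the real transformer input is not shown here.
struct ApiMergeRequest {
    title: String,
    description: String,
    labels: Vec<String>,
}

/// Hypothetical local row produced by the transformer.
struct MergeRequestRow {
    title: String,
    description: String,
    labels: Vec<String>,
}

/// Moves the fields out of `src`, leaving empty defaults behind.
fn transform(src: &mut ApiMergeRequest) -> MergeRequestRow {
    MergeRequestRow {
        title: std::mem::take(&mut src.title),
        description: std::mem::take(&mut src.description),
        labels: std::mem::take(&mut src.labels),
    }
}
```

This only applies where the source value is not read again after transformation; if the transformer can take its input by value instead, a plain move achieves the same result without leaving emptied fields behind.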


Regression Guardrails

For any optimization implemented:

  1. Test Coverage: All existing tests must pass
  2. Output Equivalence: For SQL changes, verify identical result sets with test data
  3. Benchmark Suite: Add benchmarks for affected paths before/after

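Suggested benchmark targets (the #[bench] attribute requires a nightly toolchain; a stable benchmark harness such as Criterion could host the same targets):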

#[bench] fn bench_change_detection_1k_docs(b: &mut Bencher) { ... }
#[bench] fn bench_label_insert_50_labels(b: &mut Bencher) { ... }
#[bench] fn bench_hybrid_search_filtered(b: &mut Bencher) { ... }

Implementation Priority

Phase 1 (Quick Wins) - COMPLETE:

  1. Change detection query rewrite (#1) DONE
  2. Multiple COUNT consolidation (#7) DONE
  3. Collect-concat pattern (#8) DONE
  4. Vec::with_capacity hints (#10) DONE
  5. FTS token collect-join (#11) DONE
  6. Multiple EXISTS per label (#5) DONE

Phase 2 (Medium Effort):

  7. Embedding batch clone removal (#3)
  8. Chunking string allocation (#6)

Phase 3 (Higher Effort):

  9. N+1 batch inserts (#2)
  10. List query CTEs (#4)
  11. Parameter boxing (#9)


Appendix: Test Baseline

cargo test --release
running 127 tests
test result: ok. 127 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

All tests pass. Any optimization must maintain this baseline.