fix(search): cap vector search k-value and add rowid assertion
The vector search multiplier could grow unbounded on documents with many chunks, producing enormous k values that cause SQLite to scan far more rows than necessary. Clamp the multiplier to [8, 200] and cap k at 10,000 to prevent degenerate performance on large corpora.

Also adds a debug_assert in decode_rowid to catch negative rowids early: these indicate a bug in the encoding pipeline and should fail fast rather than silently produce garbage document IDs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
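The clamping described above can be sketched as a small helper. This is a hypothetical illustration, not the code from this commit: the function name `effective_k`, the parameter names, and the constant names are assumptions; only the bounds [8, 200] and the 10,000 cap come from the commit message.

```rust
// Hypothetical sketch of the k-value clamping described in the commit
// message. Names are illustrative; only the constants come from the
// commit: multiplier clamped to [8, 200], final k capped at 10,000.
const MIN_MULTIPLIER: i64 = 8;
const MAX_MULTIPLIER: i64 = 200;
const MAX_K: i64 = 10_000;

fn effective_k(base_k: i64, chunk_multiplier: i64) -> i64 {
    // Clamp the per-document multiplier so many-chunk documents
    // cannot inflate it without bound.
    let multiplier = chunk_multiplier.clamp(MIN_MULTIPLIER, MAX_MULTIPLIER);
    // Cap the resulting k to bound the number of rows SQLite scans.
    (base_k * multiplier).min(MAX_K)
}
```

With these bounds, a document with thousands of chunks still yields at most k = 10,000, while small documents get a floor of 8 on the multiplier.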
@@ -14,6 +14,10 @@ pub fn encode_rowid(document_id: i64, chunk_index: i64) -> i64 {
 }
 
 pub fn decode_rowid(rowid: i64) -> (i64, i64) {
+    debug_assert!(
+        rowid >= 0,
+        "decode_rowid called with negative rowid: {rowid}"
+    );
     let document_id = rowid / CHUNK_ROWID_MULTIPLIER;
     let chunk_index = rowid % CHUNK_ROWID_MULTIPLIER;
     (document_id, chunk_index)
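To see why a negative rowid must mean an encoding bug, consider the round-trip. The sketch below is an assumption-laden illustration: the real value of CHUNK_ROWID_MULTIPLIER is defined elsewhere in the crate and the 1,000 used here is a placeholder.

```rust
// Round-trip sketch of the rowid packing scheme. The multiplier value
// is a placeholder assumption; the real constant lives in the crate.
const CHUNK_ROWID_MULTIPLIER: i64 = 1_000;

fn encode_rowid(document_id: i64, chunk_index: i64) -> i64 {
    // Non-negative inputs always produce a non-negative rowid, so a
    // negative rowid at decode time can only come from a bug upstream.
    document_id * CHUNK_ROWID_MULTIPLIER + chunk_index
}

fn decode_rowid(rowid: i64) -> (i64, i64) {
    debug_assert!(rowid >= 0, "decode_rowid called with negative rowid: {rowid}");
    (rowid / CHUNK_ROWID_MULTIPLIER, rowid % CHUNK_ROWID_MULTIPLIER)
}
```

Note that for negative operands Rust's `/` truncates toward zero and `%` keeps the dividend's sign, so decoding a negative rowid would yield nonsense pairs silently; failing fast in debug builds is the safer choice.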