fix: document upsert project_id, truncation budget, and Ollama model matching
- regenerator: Include project_id in the ON CONFLICT UPDATE clause for
document upserts. Previously, if a document moved between projects
(e.g., during re-ingestion), the project_id would remain stale.
- truncation: Compute the omission marker ("N notes omitted") before
checking whether first+last notes fit in the budget. The old order
computed the marker after the budget check, meaning the marker's byte
cost was unaccounted for and could cause over-budget output.
- ollama: Tighten model name matching to require either an exact match
or a colon-delimited tag prefix (model == name or name starts with
"model:"). The prior starts_with check would false-positive on
"nomic-embed-text-v2" when looking for "nomic-embed-text". Tests
updated to cover exact match, tagged, wrong model, and prefix
false-positive cases.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -145,6 +145,7 @@ fn upsert_document_inner(conn: &Connection, doc: &DocumentData) -> Result<bool>
|
||||
is_truncated, truncated_reason)
|
||||
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11, ?12, ?13, ?14, ?15)
|
||||
ON CONFLICT(source_type, source_id) DO UPDATE SET
|
||||
project_id = excluded.project_id,
|
||||
author_username = excluded.author_username,
|
||||
label_names = excluded.label_names,
|
||||
labels_hash = excluded.labels_hash,
|
||||
|
||||
Reference in New Issue
Block a user