Replace .unwrap_or(), .ok(), and .filter_map(|r| r.ok()) patterns with
proper error propagation using ? and rusqlite::OptionalExtension where
the query may legitimately return no rows.
Affected areas:
- events_db::count_events: three count queries now propagate errors
instead of defaulting to (0, 0) on failure
- note_parser::extract_refs_from_system_notes: row iteration errors
are now propagated instead of silently dropped via filter_map
- note_parser::noteable_type_to_entity_type: unknown types now log a
debug warning before defaulting to "issue"
- payloads::store_payload/read_payload: use .optional()? instead of
.ok() to distinguish "no row" from "query failed"
- backoff::compute_next_attempt_at: use .clamp(0, 30) to guard against
negative attempt_count, not just .min(30)
- search::vector::max_chunks_per_document: returns Result<i64> with
proper error propagation through .optional()?.flatten()
- embedding::chunk_ids::decode_rowid: promote debug_assert to assert
since negative rowids indicate data corruption worth failing fast on
- ingestion::dirty_tracker::record_dirty_error: use .optional()? to
handle missing dirty_sources row gracefully instead of hard error
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Removes module-level doc comments (//! lines) and excessive inline doc
comments that were duplicating information already evident from:
- Function/struct names (self-documenting code)
- Type signatures (the what is clear from types)
- Implementation context (the how is clear from code)
Affected modules:
- cli/* - Removed command descriptions duplicating clap help text
- core/* - Removed module headers and obvious function docs
- documents/* - Removed extractor/regenerator/truncation docs
- embedding/* - Removed pipeline and chunking docs
- gitlab/* - Removed client and transformer docs (kept type definitions)
- ingestion/* - Removed orchestrator and ingestion docs
- search/* - Removed FTS and vector search docs
Philosophy: Code should be self-documenting. Comments should explain
"why" (business decisions, non-obvious constraints) not "what" (which
the code itself shows). This change reduces noise and maintenance burden
while keeping the codebase just as understandable.
Retains comments for:
- Non-obvious business logic
- Important safety invariants
- Complex algorithm explanations
- Public API boundaries where generated docs matter
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
GitLab returns null for the label/milestone fields on resource_label_events
and resource_milestone_events when the referenced label or milestone has
been deleted. This caused deserialization failures during sync.
- Add migration 012 to recreate both event tables with nullable
label_name, milestone_title, and milestone_id columns (SQLite
requires table recreation to alter NOT NULL constraints)
- Change GitLabLabelEvent.label and GitLabMilestoneEvent.milestone
to Option<> in the Rust types
- Update upsert functions to pass through None values correctly
- Add tests for null label and null milestone deserialization
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
events_db.rs:
- Removed internal savepoints from upsert_state_events,
upsert_label_events, and upsert_milestone_events. Each function
previously created its own savepoint, making it impossible for
callers to wrap all three in a single atomic transaction.
- Changed signatures from &mut Connection to &Connection, since
savepoints are no longer created internally. This makes the
functions compatible with rusqlite::Transaction (which derefs to
Connection), allowing callers to pass a transaction directly.
orchestrator.rs:
- Deleted the three store_*_events_tx() functions (store_state_events_tx,
store_label_events_tx, store_milestone_events_tx) which were
hand-duplicated copies of the events_db upsert functions, created as
a workaround for the &mut Connection requirement. Now that events_db
accepts &Connection, store_resource_events() calls the canonical
upsert functions directly through the unchecked_transaction.
- Replaced the max-iterations guard in drain_resource_events() with a
HashSet-based deduplication of job IDs. The old guard used an
arbitrary 2x multiplier on total_pending which could either terminate
too early (if many retries were legitimate) or too late. The new
approach precisely prevents reprocessing the same job within a single
drain run, which is the actual invariant we need.
Net effect: ~133 lines of duplicated SQL removed, single source of
truth for event upsert logic, and callers control transaction scope.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Automated formatting and lint corrections from parallel agent work:
- cargo fmt: import reordering (alphabetical), line wrapping to respect
max width, trailing comma normalization, destructuring alignment,
function signature reformatting, match arm formatting
- clippy (pedantic): Range::contains() instead of manual comparisons,
i64::from() instead of `as i64` casts, .clamp() instead of
.max().min() chains, let-chain refactors (if-let with &&),
#[allow(clippy::too_many_arguments)] and
#[allow(clippy::field_reassign_with_default)] where warranted
- Removed trailing blank lines and extra whitespace
No behavioral changes. All existing tests pass unmodified.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New module src/core/events_db.rs provides database operations for
resource events:
- upsert_state_events: Batch INSERT OR REPLACE for state change events,
keyed on UNIQUE(gitlab_id, project_id). Wraps in a savepoint for
atomicity per entity batch. Maps GitLabStateEvent fields including
optional user, source_commit, and source_merge_request_iid.
- upsert_label_events: Same pattern for label add/remove events,
extracting label.name for denormalized storage.
- upsert_milestone_events: Same pattern for milestone assignment events,
storing both milestone.title and milestone.id.
All three upsert functions:
- Take &mut Connection (required for savepoint creation)
- Use prepare_cached for statement reuse across batch iterations
- Convert ISO timestamps via iso_to_ms_strict for ms-epoch storage
- Propagate rusqlite errors via the #[from] LoreError::Database path
- Return the count of events processed
Supporting functions:
- resolve_entity_ids: Maps entity_type string to (issue_id, MR_id) pair
with exactly-one-non-NULL invariant matching the CHECK constraints
- count_events: Queries all three event tables with conditional COUNT
aggregations, returning EventCounts struct. Uses unwrap_or((0, 0))
for graceful degradation when tables don't exist (pre-migration 011).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>