Commit Graph

6 Commits

Author SHA1 Message Date
Taylor Eernisse
53ef21d653 fix: propagate DB errors instead of silently swallowing them
Replace .unwrap_or(), .ok(), and .filter_map(|r| r.ok()) patterns with
proper error propagation using ? and rusqlite::OptionalExtension where
the query may legitimately return no rows.

Affected areas:
- events_db::count_events: three count queries now propagate errors
  instead of defaulting to (0, 0) on failure
- note_parser::extract_refs_from_system_notes: row iteration errors
  are now propagated instead of silently dropped via filter_map
- note_parser::noteable_type_to_entity_type: unknown types now log a
  debug warning before defaulting to "issue"
- payloads::store_payload/read_payload: use .optional()? instead of
  .ok() to distinguish "no row" from "query failed"
- backoff::compute_next_attempt_at: use .clamp(0, 30) to guard against
  negative attempt_count, not just .min(30)
- search::vector::max_chunks_per_document: returns Result<i64> with
  proper error propagation through .optional()?.flatten()
- embedding::chunk_ids::decode_rowid: promote debug_assert to assert
  since negative rowids indicate data corruption worth failing fast on
- ingestion::dirty_tracker::record_dirty_error: use .optional()? to
  handle missing dirty_sources row gracefully instead of hard error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 10:15:36 -05:00
Taylor Eernisse
65583ed5d6 refactor: Remove redundant doc comments throughout codebase
Removes module-level doc comments (//! lines) and excessive inline doc
comments that were duplicating information already evident from:
- Function/struct names (self-documenting code)
- Type signatures (the what is clear from types)
- Implementation context (the how is clear from code)

Affected modules:
- cli/* - Removed command descriptions duplicating clap help text
- core/* - Removed module headers and obvious function docs
- documents/* - Removed extractor/regenerator/truncation docs
- embedding/* - Removed pipeline and chunking docs
- gitlab/* - Removed client and transformer docs (kept type definitions)
- ingestion/* - Removed orchestrator and ingestion docs
- search/* - Removed FTS and vector search docs

Philosophy: Code should be self-documenting. Comments should explain
"why" (business decisions, non-obvious constraints) not "what" (which
the code itself shows). This change reduces noise and maintenance burden
while keeping the codebase just as understandable.

Retains comments for:
- Non-obvious business logic
- Important safety invariants
- Complex algorithm explanations
- Public API boundaries where generated docs matter

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:04:32 -05:00
Taylor Eernisse
a92e176bb6 fix(events): Handle nullable label and milestone in resource events
GitLab returns null for the label/milestone fields on resource_label_events
and resource_milestone_events when the referenced label or milestone has
been deleted. This caused deserialization failures during sync.

- Add migration 012 to recreate both event tables with nullable
  label_name, milestone_title, and milestone_id columns (SQLite
  requires table recreation to alter NOT NULL constraints)
- Change GitLabLabelEvent.label and GitLabMilestoneEvent.milestone
  to Option<> in the Rust types
- Update upsert functions to pass through None values correctly
- Add tests for null label and null milestone deserialization

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:36:17 -05:00
Taylor Eernisse
880ad1d3fa refactor(events): Lift transaction control to callers, eliminate duplicated store functions
events_db.rs:
- Removed internal savepoints from upsert_state_events,
  upsert_label_events, and upsert_milestone_events. Each function
  previously created its own savepoint, making it impossible for
  callers to wrap all three in a single atomic transaction.
- Changed signatures from &mut Connection to &Connection, since
  savepoints are no longer created internally. This makes the
  functions compatible with rusqlite::Transaction (which derefs to
  Connection), allowing callers to pass a transaction directly.

orchestrator.rs:
- Deleted the three store_*_events_tx() functions (store_state_events_tx,
  store_label_events_tx, store_milestone_events_tx) which were
  hand-duplicated copies of the events_db upsert functions, created as
  a workaround for the &mut Connection requirement. Now that events_db
  accepts &Connection, store_resource_events() calls the canonical
  upsert functions directly through the unchecked_transaction.
- Replaced the max-iterations guard in drain_resource_events() with a
  HashSet-based deduplication of job IDs. The old guard used an
  arbitrary 2x multiplier on total_pending which could either terminate
  too early (if many retries were legitimate) or too late. The new
  approach precisely prevents reprocessing the same job within a single
  drain run, which is the actual invariant we need.

Net effect: ~133 lines of duplicated SQL removed, single source of
truth for event upsert logic, and callers control transaction scope.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 14:09:35 -05:00
Taylor Eernisse
a50fc78823 style: Apply cargo fmt and clippy fixes across codebase
Automated formatting and lint corrections from parallel agent work:

- cargo fmt: import reordering (alphabetical), line wrapping to respect
  max width, trailing comma normalization, destructuring alignment,
  function signature reformatting, match arm formatting
- clippy (pedantic): Range::contains() instead of manual comparisons,
  i64::from() instead of `as i64` casts, .clamp() instead of
  .max().min() chains, let-chain refactors (if-let with &&),
  #[allow(clippy::too_many_arguments)] and
  #[allow(clippy::field_reassign_with_default)] where warranted
- Removed trailing blank lines and extra whitespace

No behavioral changes. All existing tests pass unmodified.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 13:01:59 -05:00
Taylor Eernisse
c34ed3007e feat(db): Add event upsert functions and count queries in events_db module
New module src/core/events_db.rs provides database operations for
resource events:

- upsert_state_events: Batch INSERT OR REPLACE for state change events,
  keyed on UNIQUE(gitlab_id, project_id). Wraps in a savepoint for
  atomicity per entity batch. Maps GitLabStateEvent fields including
  optional user, source_commit, and source_merge_request_iid.

- upsert_label_events: Same pattern for label add/remove events,
  extracting label.name for denormalized storage.

- upsert_milestone_events: Same pattern for milestone assignment events,
  storing both milestone.title and milestone.id.

All three upsert functions:
- Take &mut Connection (required for savepoint creation)
- Use prepare_cached for statement reuse across batch iterations
- Convert ISO timestamps via iso_to_ms_strict for ms-epoch storage
- Propagate rusqlite errors via the #[from] LoreError::Database path
- Return the count of events processed

Supporting functions:
- resolve_entity_ids: Maps entity_type string to (issue_id, MR_id) pair
  with exactly-one-non-NULL invariant matching the CHECK constraints
- count_events: Queries all three event tables with conditional COUNT
  aggregations, returning EventCounts struct. Uses unwrap_or((0, 0))
  for graceful degradation when tables don't exist (pre-migration 011).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 12:07:34 -05:00