Commit Graph

14 Commits

Author SHA1 Message Date
Taylor Eernisse
f1cb45a168 style: format perf_benchmark.rs with cargo fmt
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:49:53 -05:00
Taylor Eernisse
7eadae75f0 test(timeline): add integration tests for full seed-expand-collect pipeline
Adds tests/timeline_pipeline_tests.rs with end-to-end integration tests
that exercise the complete timeline pipeline against an in-memory SQLite
database with realistic data:

- pipeline_seed_expand_collect_end_to_end: Full scenario with an issue
  closed by an MR, state changes, and label events. Verifies that seed
  finds entities via FTS, expand discovers the closing MR through the
  entity_references graph, and collect assembles a chronologically sorted
  event stream containing Created, StateChanged, LabelAdded, and Merged
  events.

- pipeline_empty_query_produces_empty_result: Validates graceful
  degradation when FTS returns zero matches -- all three stages should
  produce empty results without errors.

- pipeline_since_filter_excludes_old_events: Verifies that the since
  timestamp filter propagates correctly through collect, excluding events
  before the cutoff while retaining newer ones.

- pipeline_unresolved_refs_have_optional_iid: Tests the Option<i64>
  target_iid on UnresolvedRef by creating cross-project references both
  with and without known IIDs.

- shared_resolve_entity_ref_scoping: Unit tests for the new shared
  resolve_entity_ref helper covering project-scoped lookup, unscoped
  lookup, wrong-project rejection, unknown entity types, and nonexistent
  entity IDs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:38:34 -05:00
Taylor Eernisse
e8845380e9 test: add performance regression benchmarks
Add tests/perf_benchmark.rs with three side-by-side benchmarks that
compare old vs new approaches for the optimizations introduced in the
preceding commits:

- bench_label_insert_individual_vs_batch: measures N individual INSERTs
  vs single multi-row INSERT (5k iterations, ~1.6x speedup)
- bench_string_building_old_vs_new: measures format!+push_str vs
  writeln! (50k iterations, ~1.9x speedup)
- bench_prepare_vs_prepare_cached: measures prepare vs prepare_cached
  (10k iterations, ~1.6x speedup)

Each benchmark verifies correctness (both approaches produce identical
output) and uses std::hint::black_box to prevent dead-code
elimination. Run with: cargo test --test perf_benchmark -- --nocapture

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 17:36:01 -05:00
Taylor Eernisse
233eb546af feat: Add commit SHAs, closes_issues watermark, and PRD alignment
Migration 015 adds merge_commit_sha/squash_commit_sha to merge_requests
(Gate 4/5 prerequisites), closes_issues_synced_for_updated_at watermark
for incremental sync, and the missing idx_label_events_label index.

The MR transformer and ingestion pipeline now populate commit SHAs during
sync. The orchestrator uses watermark-based filtering for closes_issues
jobs instead of re-enqueuing all MRs every sync.

The Phase B PRD is updated to match the actual codebase: corrected
migration numbering (011-015), documented nullable label/milestone
fields (migration 012), watermark patterns (013), observability
infrastructure (014), simplified source_method values, and updated
entity_references schema to match implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 15:29:51 -05:00
Taylor Eernisse
dd2869fd98 test: Remove redundant comments from test files
Applies the same doc comment cleanup to test files:
- Removes test module headers (//! lines)
- Removes obvious test function comments
- Retains comments explaining non-obvious test scenarios

Test names should be descriptive enough to convey intent without
additional comments. Complex test setup or assertions that need
explanation retain their comments.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:04:39 -05:00
Taylor Eernisse
976ad92ef0 test(gitlab): Add GitLabIssueRef deserialization tests
Adds test coverage for the new GitLabIssueRef type used by the
MR closes_issues API endpoint:

- deserializes_gitlab_issue_ref: Single object with all fields
- deserializes_gitlab_issue_ref_array: Array of refs (typical API response)

Validates that cross-project references (different project_id values)
deserialize correctly, which is important for cross-project close links.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:03:47 -05:00
Taylor Eernisse
a92e176bb6 fix(events): Handle nullable label and milestone in resource events
GitLab returns null for the label/milestone fields on resource_label_events
and resource_milestone_events when the referenced label or milestone has
been deleted. This caused deserialization failures during sync.

- Add migration 012 to recreate both event tables with nullable
  label_name, milestone_title, and milestone_id columns (SQLite
  requires table recreation to alter NOT NULL constraints)
- Change GitLabLabelEvent.label and GitLabMilestoneEvent.milestone
  to Option<> in the Rust types
- Update upsert functions to pass through None values correctly
- Add tests for null label and null milestone deserialization

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:36:17 -05:00
Taylor Eernisse
a50fc78823 style: Apply cargo fmt and clippy fixes across codebase
Automated formatting and lint corrections from parallel agent work:

- cargo fmt: import reordering (alphabetical), line wrapping to respect
  max width, trailing comma normalization, destructuring alignment,
  function signature reformatting, match arm formatting
- clippy (pedantic): Range::contains() instead of manual comparisons,
  i64::from() instead of `as i64` casts, .clamp() instead of
  .max().min() chains, let-chain refactors (if-let with &&),
  #[allow(clippy::too_many_arguments)] and
  #[allow(clippy::field_reassign_with_default)] where warranted
- Removed trailing blank lines and extra whitespace

No behavioral changes. All existing tests pass unmodified.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 13:01:59 -05:00
Taylor Eernisse
92ff255909 feat(types): Add GitLab Resource Event serde types with deserialization tests
Adds six new types for deserializing responses from GitLab's three
Resource Events API endpoints (state, label, milestone):

- GitLabStateEvent: State transitions with optional user, source_commit,
  and source_merge_request reference
- GitLabLabelEvent: Label add/remove events with nested GitLabLabelRef
- GitLabMilestoneEvent: Milestone assignment changes with nested
  GitLabMilestoneRef
- GitLabMergeRequestRef: Lightweight MR reference (iid, title, web_url)
- GitLabLabelRef: Label metadata (id, name, color, description)
- GitLabMilestoneRef: Milestone metadata (id, iid, title)

All types derive Deserialize + Serialize and use Option<T> for nullable
fields (user, source_commit, color, description) to match GitLab's API
contract where these fields may be null.

Includes 8 new test cases covering:
- State events with/without user, with/without source_merge_request
- Label events for add and remove actions, including null color handling
- Milestone event deserialization
- Standalone ref type deserialization (MR, label, milestone)

Uses r##"..."## raw string delimiters where JSON contains hex color
codes (#FF0000) that would conflict with r#"..."# delimiters.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 12:06:56 -05:00
Taylor Eernisse
f560e6bc00 test(embedding): Add regression tests for pipeline hardening bugs
Three targeted regression tests covering bugs fixed in the embedding
pipeline hardening:

- overflow_doc_with_error_sentinel_not_re_detected_as_pending: verifies
  that documents skipped for producing too many chunks have their
  sentinel error recorded in embedding_metadata and are NOT returned by
  find_pending_documents or count_pending_documents on subsequent runs
  (prevents infinite re-processing loop).

- count_and_find_pending_agree: exercises four states (empty DB, new
  document, fully-embedded document, config-drifted document) and
  asserts that count_pending_documents and find_pending_documents
  produce consistent results across all of them.

- full_embed_delete_is_atomic: confirms the --full flag's two DELETE
  statements (embedding_metadata + embeddings) execute atomically
  within a transaction.

Also updates test DB creation to apply migration 010.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:35:34 -05:00
Taylor Eernisse
d235f2b4dd test: Add test suites for embedding, FTS, hybrid search, and golden queries
Four new test modules covering the search infrastructure:

- tests/embedding.rs: Unit tests for the embedding pipeline including
  chunk ID encoding/decoding, change detection, and document chunking
  with overlap verification.

- tests/fts_search.rs: Integration tests for FTS5 search including
  safe query sanitization, multi-term queries, prefix matching, and
  the raw FTS mode for power users.

- tests/hybrid_search.rs: End-to-end tests for hybrid search mode
  including RRF fusion correctness, graceful degradation when
  embeddings are unavailable, and filter application.

- tests/golden_query_tests.rs: Golden query tests using fixtures
  from tests/fixtures/golden_queries.json to verify search quality
  against known-good query/result pairs. Ensures ranking stability
  across implementation changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:47:19 -05:00
teernisse
55b895a2eb Update name to gitlore instead of gitlab-inbox 2026-01-28 15:49:14 -05:00
Taylor Eernisse
d338d68191 test: Add comprehensive test suite for MR ingestion
Introduces thorough test coverage for merge request functionality,
following the established testing patterns from issue ingestion.

New test files:
- mr_transformer_tests.rs: NormalizedMergeRequest transformation tests
  covering full MR with all fields, minimal MR, draft detection via
  title prefix and work_in_progress field, label/assignee/reviewer
  extraction, and timestamp conversion

- mr_discussion_tests.rs: MR discussion normalization tests including
  polymorphic noteable binding, DiffNote position extraction with
  line ranges and SHA triplet, and resolvable note handling

- diffnote_position_tests.rs: Exhaustive DiffNote position scenarios
  covering text/image/file types, single-line vs multi-line comments,
  added/removed/modified lines, and missing position handling

New fixtures:
- fixtures/gitlab_merge_request.json: Representative MR API response
  with nested structures for integration testing

Updated tests:
- gitlab_types_tests.rs: Add MR type deserialization tests
- migration_tests.rs: Update expected schema version to 6

Test design follows property-based patterns where feasible, with
explicit edge case coverage for nullable fields and API variants
across different GitLab versions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:47:17 -05:00
Taylor Eernisse
f53645790a test: Add comprehensive test suite with fixtures
Establishes testing infrastructure for reliable development.

tests/fixtures/ - GitLab API response samples:
- gitlab_issue.json: Single issue with full metadata
- gitlab_issues_page.json: Paginated issue list response
- gitlab_discussion.json: Discussion thread with notes
- gitlab_discussions_page.json: Paginated discussions response
All fixtures captured from real GitLab API responses with
sensitive data redacted, ensuring tests match actual behavior.

tests/gitlab_types_tests.rs - Type deserialization tests:
- Validates serde parsing of all GitLab API types
- Tests edge cases: null fields, empty arrays, nested objects
- Ensures GitLabIssue, GitLabDiscussion, GitLabNote parse correctly
- Verifies optional fields handle missing data gracefully
- Tests author/assignee extraction from various formats

tests/fixture_tests.rs - Integration with fixtures:
- Loads fixture files and validates parsing
- Tests transformer functions produce correct database rows
- Verifies IssueWithMetadata extracts labels and assignees
- Tests NormalizedDiscussion/NormalizedNote structure
- Validates raw payload preservation logic

tests/migration_tests.rs - Database schema tests:
- Creates in-memory SQLite for isolation
- Runs all migrations and verifies schema
- Tests table creation with expected columns
- Validates foreign key constraints
- Tests index creation for query performance
- Verifies idempotent migration behavior

Test infrastructure uses:
- tempfile for isolated database instances
- wiremock for HTTP mocking (available for future API tests)
- Standard Rust #[test] attributes

Run with: cargo test
Run single: cargo test test_name
Run with output: cargo test -- --nocapture

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:29:06 -05:00