Adds a complete merge request ingestion pipeline with a two-phase
discussion sync strategy optimized for throughput.
New modules:
- merge_requests.rs: MR upsert with labels/assignees/reviewers handling,
  stale MR cleanup, and watermark-based incremental sync
- mr_discussions.rs: Parallel prefetch strategy for MR discussions
Two-phase MR discussion sync:
1. PREFETCH PHASE: Spawn concurrent tasks to fetch discussions for
   multiple MRs simultaneously (configurable concurrency, default 8).
   Transform and validate in parallel, storing results in memory.
2. WRITE PHASE: Serial database writes to avoid lock contention.
   Each MR's discussions are written in a single transaction, with
   proper stale discussion cleanup.
This approach achieves roughly 4-8x the throughput of serial fetching
while maintaining database consistency. Transform errors are tracked
per MR so that a partial write cannot corrupt that MR's watermark.
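
The two phases can be sketched roughly as follows. This is an
illustration only, not the code in mr_discussions.rs: it uses std
threads in fixed-size batches instead of async tasks, a HashMap in
place of the database, and the names Discussion, Prefetched,
fetch_and_transform, prefetch, and write_serial are all hypothetical.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical stand-in for a fetched-and-transformed discussion.
#[derive(Debug, Clone, PartialEq)]
struct Discussion {
    mr_id: i64,
    body: String,
}

/// Outcome of prefetching one MR: all of its discussions, or an error
/// that must block the write for that MR.
struct Prefetched {
    mr_id: i64,
    result: Result<Vec<Discussion>, String>,
}

/// Hypothetical fetch + transform step. A real one would call the API;
/// here a negative id simulates a transform failure.
fn fetch_and_transform(mr_id: i64) -> Result<Vec<Discussion>, String> {
    if mr_id < 0 {
        return Err(format!("transform failed for MR {mr_id}"));
    }
    Ok(vec![Discussion { mr_id, body: format!("note on MR {mr_id}") }])
}

/// Phase 1: fetch in batches of at most `concurrency` threads,
/// keeping every result in memory.
fn prefetch(mr_ids: &[i64], concurrency: usize) -> Vec<Prefetched> {
    let mut out = Vec::with_capacity(mr_ids.len());
    for batch in mr_ids.chunks(concurrency.max(1)) {
        let handles: Vec<_> = batch
            .iter()
            .map(|&mr_id| {
                thread::spawn(move || Prefetched {
                    mr_id,
                    result: fetch_and_transform(mr_id),
                })
            })
            .collect();
        for handle in handles {
            out.push(handle.join().expect("prefetch thread panicked"));
        }
    }
    out
}

/// Phase 2: serial writes. Returns the ids of MRs that were written
/// and whose watermark may therefore advance.
fn write_serial(
    prefetched: Vec<Prefetched>,
    db: &mut HashMap<i64, Vec<Discussion>>,
) -> Vec<i64> {
    let mut synced = Vec::new();
    for p in prefetched {
        match p.result {
            Ok(discussions) => {
                // Replacing the whole entry stands in for
                // "upsert + stale cleanup in one transaction".
                db.insert(p.mr_id, discussions);
                synced.push(p.mr_id);
            }
            // Failed MR: write nothing and do not report it as synced.
            Err(_) => {}
        }
    }
    synced
}

fn main() {
    let mut db = HashMap::new();
    let synced = write_serial(prefetch(&[1, 2, -3, 4], 8), &mut db);
    println!("synced MRs: {synced:?}");
}
```

In this sketch the simulated failure for MR -3 means only MRs 1, 2,
and 4 are written and reported as synced, which mirrors the per-MR
error tracking described above.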
Orchestrator updates:
- ingest_merge_requests(): Coordinates MR fetch -> discussion sync flow
- Progress callbacks emit MR-specific events for UI feedback
- Respects --full flag to reset discussion watermarks for full resync
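
One way the --full reset could interact with watermark-based sync is
sketched below. The struct, its field names, and both functions are
hypothetical and only illustrate the idea: clearing a watermark forces
the next run to treat that MR's discussions as never synced.

```rust
/// Hypothetical per-MR sync state; field names are illustrative.
struct MrSyncState {
    /// Last update time of the MR (for example, epoch milliseconds).
    updated_at: i64,
    /// Watermark: the `updated_at` value discussions were last synced for.
    discussions_synced_for: Option<i64>,
}

/// What a --full run might do: clear every discussion watermark.
fn reset_watermarks(states: &mut [MrSyncState]) {
    for state in states.iter_mut() {
        state.discussions_synced_for = None;
    }
}

/// An MR needs a discussion sync if it has no watermark or has been
/// updated since the watermark was recorded.
fn needs_discussion_sync(state: &MrSyncState) -> bool {
    match state.discussions_synced_for {
        None => true,
        Some(watermark) => state.updated_at > watermark,
    }
}

fn main() {
    let mut states = vec![
        MrSyncState { updated_at: 100, discussions_synced_for: Some(100) },
        MrSyncState { updated_at: 200, discussions_synced_for: Some(150) },
    ];
    let before = states.iter().filter(|s| needs_discussion_sync(s)).count();
    reset_watermarks(&mut states);
    let after = states.iter().filter(|s| needs_discussion_sync(s)).count();
    println!("MRs needing sync: {before} before reset, {after} after");
}
```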
The prefetch strategy is critical for MRs, which typically have more
discussions than issues do and where API latency dominates sync time.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This is a P0 fix from the CP1-CP2 alignment audit. The original
NormalizedDiscussion struct had issue_id as a non-optional i64 and
hardcoded noteable_type to "Issue", making it incompatible with merge
request discussions even though the database schema already supports
both via nullable columns and a CHECK constraint.
Changes:
- Add NoteableRef enum with Issue(i64) and MergeRequest(i64) variants
  to provide compile-time safety against mixing up issue vs MR IDs
- Change NormalizedDiscussion.issue_id from i64 to Option<i64>
- Add NormalizedDiscussion.merge_request_id: Option<i64>
- Update transform_discussion() signature to take NoteableRef instead
  of local_issue_id, deriving issue_id/merge_request_id/noteable_type
  from the enum variant
- Update upsert_discussion() SQL to include merge_request_id column
  (now 12 parameters instead of 11)
- Export NoteableRef from transformers module
- Add test for MergeRequest discussion transformation
- Update all existing tests to use NoteableRef::Issue(id)
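
A reduced sketch of the shape these changes describe is shown below.
It keeps only the three derived fields, and transform_discussion()
here is a simplified stand-in that takes nothing but the NoteableRef.
The "MergeRequest" string for noteable_type is an assumption; only
"Issue" is named in this message.

```rust
/// Which parent a discussion belongs to, carrying the local row id.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NoteableRef {
    Issue(i64),
    MergeRequest(i64),
}

/// Only the fields touched by this change; the real struct has more.
#[derive(Debug, PartialEq)]
struct NormalizedDiscussion {
    issue_id: Option<i64>,
    merge_request_id: Option<i64>,
    noteable_type: String,
}

/// Simplified stand-in for transform_discussion(): all three fields
/// are derived from the enum variant, so exactly one id is set.
fn transform_discussion(noteable: NoteableRef) -> NormalizedDiscussion {
    let (issue_id, merge_request_id, noteable_type) = match noteable {
        NoteableRef::Issue(id) => (Some(id), None, "Issue"),
        // "MergeRequest" is an assumed value for illustration.
        NoteableRef::MergeRequest(id) => (None, Some(id), "MergeRequest"),
    };
    NormalizedDiscussion {
        issue_id,
        merge_request_id,
        noteable_type: noteable_type.to_string(),
    }
}

fn main() {
    println!("{:?}", transform_discussion(NoteableRef::Issue(7)));
    println!("{:?}", transform_discussion(NoteableRef::MergeRequest(42)));
}
```

Because the id only exists inside a variant, a caller cannot pass an
issue id where an MR id is expected without naming the wrong variant
explicitly.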
The database schema (migration 002) was forward-thinking and already
supports both issue_id and merge_request_id as nullable columns with
a CHECK constraint. This change prepares the application layer for
CP2 merge request support without requiring any migrations.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>