gitlore

Author	SHA1	Message	Date
Taylor Eernisse	880ad1d3fa	refactor(events): Lift transaction control to callers, eliminate duplicated store functions events_db.rs: - Removed internal savepoints from upsert_state_events, upsert_label_events, and upsert_milestone_events. Each function previously created its own savepoint, making it impossible for callers to wrap all three in a single atomic transaction. - Changed signatures from &mut Connection to &Connection, since savepoints are no longer created internally. This makes the functions compatible with rusqlite::Transaction (which derefs to Connection), allowing callers to pass a transaction directly. orchestrator.rs: - Deleted the three store_*_events_tx() functions (store_state_events_tx, store_label_events_tx, store_milestone_events_tx) which were hand-duplicated copies of the events_db upsert functions, created as a workaround for the &mut Connection requirement. Now that events_db accepts &Connection, store_resource_events() calls the canonical upsert functions directly through the unchecked_transaction. - Replaced the max-iterations guard in drain_resource_events() with a HashSet-based deduplication of job IDs. The old guard used an arbitrary 2x multiplier on total_pending which could either terminate too early (if many retries were legitimate) or too late. The new approach precisely prevents reprocessing the same job within a single drain run, which is the actual invariant we need. Net effect: ~133 lines of duplicated SQL removed, single source of truth for event upsert logic, and callers control transaction scope. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 14:09:35 -05:00
Taylor Eernisse	bb75a9d228	fix(events): Resource events now run on incremental syncs, fix output and progress bar Three bugs fixed: 1. Early return in orchestrator when no discussions needed sync also skipped resource event enqueue+drain. On incremental syncs (the most common case), resource events were never fetched. Restructured to use if/else instead of early return so Step 4 always executes. 2. Ingest command JSON and human-readable output silently dropped resource_events_fetched/failed counts. Added to IngestJsonData and print_ingest_summary. 3. Progress bar reuse after finish_and_clear caused indicatif to silently ignore subsequent set_position/set_length calls. Added reset() call before reconfiguring the bar for resource events. Also removed stale comment referencing "unsafe" that didn't reflect the actual unchecked_transaction approach. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 13:06:35 -05:00
Taylor Eernisse	2bcd8db0e9	feat(events): Wire resource event fetching into sync pipeline (bd-1ep) Integrate resource event fetching as Step 4 of both issue and MR ingestion, gated behind the fetch_resource_events config flag. Orchestrator changes: - Add ProgressEvent variants: ResourceEventsFetchStarted, ResourceEventFetched, ResourceEventsFetchComplete - Add resource_events_fetched/failed fields to IngestProjectResult and IngestMrProjectResult - New enqueue_resource_events_for_entity_type() queries all issues/MRs for a project and enqueues resource_events jobs via the dependent queue (INSERT OR IGNORE for idempotency) - New drain_resource_events() claims jobs in batches, fetches state/label/milestone events from GitLab API, stores them atomically via unchecked_transaction, and handles failures with exponential backoff via fail_job() - Max-iterations guard prevents infinite retry loops within a single drain run - New store_resource_events() + per-type _tx helpers write events using prepared statements inside a single transaction - DrainResult struct tracks fetched/failed counts CLI ingest changes: - IngestResult gains resource_events_fetched/failed fields - Progress bar repurposed for resource event fetch phase (reuses discussion bar with updated template) - Accumulates event counts from both issue and MR ingestion CLI sync changes: - SyncResult gains resource_events_fetched/failed fields - Accumulates counts from both ingest stages - print_sync() conditionally displays event counts - Structured logging includes event counts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 13:02:15 -05:00
Taylor Eernisse	cd44e516e3	feat(ingestion): Implement MR sync with parallel discussion prefetch Adds complete merge request ingestion pipeline with a novel two-phase discussion sync strategy optimized for throughput. New modules: - merge_requests.rs: MR upsert with labels/assignees/reviewers handling, stale MR cleanup, and watermark-based incremental sync - mr_discussions.rs: Parallel prefetch strategy for MR discussions Two-phase MR discussion sync: 1. PREFETCH PHASE: Spawn concurrent tasks to fetch discussions for multiple MRs simultaneously (configurable concurrency, default 8). Transform and validate in parallel, storing results in memory. 2. WRITE PHASE: Serial database writes to avoid lock contention. Each MR's discussions written in a single transaction, with proper stale discussion cleanup. This approach achieves ~4-8x throughput vs serial fetching while maintaining database consistency. Transform errors are tracked per-MR to prevent partial writes from corrupting watermarks. Orchestrator updates: - ingest_merge_requests(): Coordinates MR fetch -> discussion sync flow - Progress callbacks emit MR-specific events for UI feedback - Respects --full flag to reset discussion watermarks for full resync The prefetch strategy is critical for MRs which typically have more discussions than issues, and where API latency dominates sync time. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 22:45:48 -05:00
Taylor Eernisse	cd60350c6d	feat(ingestion): Implement cursor-based incremental sync from GitLab Provides efficient data synchronization with minimal API calls. src/ingestion/issues.rs - Issue sync logic: - Cursor-based incremental sync using updated_at timestamp - Fetches only issues modified since last sync - Configurable cursor rewind for overlap safety (default 2s) - Batched database writes with transaction wrapping - Upserts issues, labels, milestones, and assignees - Maintains issue_labels and issue_assignees junction tables - Returns IngestIssuesResult with counts and issues needing discussion sync - Identifies issues where discussion count changed src/ingestion/discussions.rs - Discussion sync logic: - Fetches discussions for issues that need sync - Compares discussion count vs stored to detect changes - Batched note insertion with raw payload preservation - Updates discussion metadata (resolved state, note counts) - Tracks sync state per discussion to enable incremental updates - Returns IngestDiscussionsResult with fetched/skipped counts src/ingestion/orchestrator.rs - Sync coordination: - Two-phase sync: issues first, then discussions - Progress callback support for CLI progress bars - ProgressEvent enum for fine-grained status updates: - IssueFetch, IssueProcess, DiscussionFetch, DiscussionSkip - Acquires sync lock before starting - Updates sync watermark on successful completion - Handles partial failures gracefully (watermark not updated) - Returns IngestProjectResult with detailed statistics The architecture supports future additions: - Merge request ingestion (parallel to issues) - Full-text search indexing hooks - Vector embedding pipeline integration Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 11:28:34 -05:00

5 Commits