Commit Graph

5 Commits

Author SHA1 Message Date
Taylor Eernisse
98907ac666 feat(events): Implement Gate 1 resource events infrastructure
Add complete infrastructure for ingesting GitLab Resource Events
(state, label, milestone) into local SQLite tables. This enables
temporal queries (timeline, file-history, trace) in later gates.

- Add migration 011: resource_state_events, resource_label_events, and
  resource_milestone_events tables, entity_references table, and
  pending_dependent_fetches queue
- Add 6 serde types for GitLab Resource Events API responses
- Add fetchResourceEvents config flag with --no-events CLI override
- Add fetch_all_pages<T> generic paginator and 6 API endpoint methods
- Add DB upsert functions with savepoint atomicity (events_db.rs)
- Add dependent fetch queue with exponential backoff (dependent_queue.rs)
- Add 'lore count events' command with human table and robot JSON output
- Extend 'lore stats --check' with event FK integrity and queue health
- Add 8 unit tests for resource event type deserialization
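
To illustrate the exponential backoff mentioned for the dependent fetch queue, here is a minimal sketch. It is not the code in dependent_queue.rs: the function name `backoff_delay` and the base and cap values are assumptions chosen for the example.

```rust
use std::time::Duration;

// Hypothetical parameters; the commit does not state the real values.
const BASE_DELAY: Duration = Duration::from_secs(30);
const MAX_DELAY: Duration = Duration::from_secs(60 * 60);

/// Delay before retry number `attempt` (0-based): BASE_DELAY * 2^attempt,
/// capped at MAX_DELAY.
fn backoff_delay(attempt: u32) -> Duration {
    // checked_pow and checked_mul return None on overflow, so very large
    // attempt counts fall back to the cap instead of panicking.
    2u32.checked_pow(attempt)
        .and_then(|factor| BASE_DELAY.checked_mul(factor))
        .map_or(MAX_DELAY, |delay| delay.min(MAX_DELAY))
}
```

With these assumed values, a queue entry that has failed three times would wait 240 seconds before its next attempt, and anything past the seventh failure waits the full hour.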

Closes: bd-hu3, bd-2e8, bd-2fm, bd-sqw, bd-1uc, bd-tir, bd-3sh, bd-1m8

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 11:23:44 -05:00
Taylor Eernisse
2a52594a60 feat(db): Add migration 010 for chunk config tracking columns
Add chunk_max_bytes and chunk_count columns to embedding_metadata to
support config drift detection and adaptive dedup sizing. Includes a
partial index on sentinel rows (chunk_index=0) to accelerate the drift
detection and max-chunk queries.

Also exports LATEST_SCHEMA_VERSION as a public constant derived from
the MIGRATIONS array length, replacing the previously hardcoded magic
number in the health check.
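
Deriving the version constant from the migration list could look roughly like the sketch below. The placeholder migration entries, the `i32` type, and the `schema_is_current` helper are assumptions for illustration; only the names MIGRATIONS and LATEST_SCHEMA_VERSION come from this commit.

```rust
/// Ordered list of embedded migrations as (name, SQL) pairs.
/// These entries are placeholders, not the project's real migrations.
const MIGRATIONS: &[(&str, &str)] = &[
    ("001_initial", "CREATE TABLE projects (id INTEGER PRIMARY KEY);"),
    ("002_issues", "CREATE TABLE issues (id INTEGER PRIMARY KEY);"),
    ("003_indexes", "CREATE INDEX idx_issues_id ON issues (id);"),
];

/// Computed at compile time from the array length, so appending a
/// migration bumps the version without touching any other code.
pub const LATEST_SCHEMA_VERSION: i32 = MIGRATIONS.len() as i32;

/// Hypothetical health-check helper: is the database on the newest schema?
fn schema_is_current(db_version: i32) -> bool {
    db_version == LATEST_SCHEMA_VERSION
}
```

The point of the pattern is that the health check can no longer drift out of date when a migration is added, which is the failure mode a hardcoded number invites.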

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:34:48 -05:00
Taylor Eernisse
4270603da4 feat(db): Add migrations for documents, FTS5, and embeddings
Three new migrations establish the search infrastructure:

- 007_documents: Creates the `documents` table as the central search
  unit. Each document is a rendered text blob derived from an issue,
  MR, or discussion. Includes `dirty_queue` table for tracking which
  entities need document regeneration after ingestion changes.

- 008_fts5: Creates FTS5 virtual table `documents_fts` with content
  sync triggers. Uses `unicode61` tokenizer with `remove_diacritics=2`
  for broad language support. Automatic insert/update/delete triggers
  keep the FTS index synchronized with the documents table.

- 009_embeddings: Creates `embeddings` table for storing vector
  chunks produced by Ollama. Uses `doc_id * 1000 + chunk_index`
  rowid encoding to support multi-chunk documents while enabling
  efficient doc-level deduplication in vector search results.
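
The rowid encoding described for 009_embeddings can be sketched as follows. Only the formula `doc_id * 1000 + chunk_index` comes from the commit message. The function names are hypothetical, and the limit of 1000 chunks per document is an inference from the formula rather than a documented constraint.

```rust
use std::collections::HashSet;

/// Multiplier taken from the formula rowid = doc_id * 1000 + chunk_index.
const CHUNK_ROWID_MULTIPLIER: i64 = 1000;

/// Pack a document id and chunk index into one rowid.
/// Returns None when the pair cannot be encoded unambiguously.
fn encode_rowid(doc_id: i64, chunk_index: i64) -> Option<i64> {
    if doc_id < 0 || !(0..CHUNK_ROWID_MULTIPLIER).contains(&chunk_index) {
        return None;
    }
    doc_id
        .checked_mul(CHUNK_ROWID_MULTIPLIER)?
        .checked_add(chunk_index)
}

/// Recover (doc_id, chunk_index) from an encoded rowid.
fn decode_rowid(rowid: i64) -> (i64, i64) {
    (rowid / CHUNK_ROWID_MULTIPLIER, rowid % CHUNK_ROWID_MULTIPLIER)
}

/// Collapse chunk-level hits (best match first) to distinct document ids,
/// keeping the position of each document's best-ranked chunk.
fn dedup_doc_ids(ranked_rowids: &[i64]) -> Vec<i64> {
    let mut seen = HashSet::new();
    ranked_rowids
        .iter()
        .map(|&rowid| decode_rowid(rowid).0)
        .filter(|doc_id| seen.insert(*doc_id))
        .collect()
}
```

Because the document id is recoverable by integer division alone, deduplicating vector search hits at the document level needs no join back to another table.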

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:45:41 -05:00
Taylor Eernisse
39a71d8b85 feat(db): Add schema migration v6 for merge request support
Introduces comprehensive database schema for merge request ingestion
(CP2), designed with forward compatibility for future features.

New tables:
- merge_requests: Core MR metadata with draft status, branch info,
  detailed_merge_status (modern API field), and sync health telemetry
  columns for debuggability
- mr_labels: Junction table linking MRs to shared labels table
- mr_assignees: MR assignee usernames (same pattern as issues)
- mr_reviewers: MR-specific reviewer tracking (not applicable to issues)

Additional indexes:
- discussions: Add merge_request_id and resolved status indexes
- notes: Add composite indexes for DiffNote file/line queries

DiffNote position enhancements:
- position_type: 'text' | 'image' | 'file' for diff comment semantics
- position_line_range_start/end: Multi-line comment range support
- position_base_sha/start_sha/head_sha: Commit context for diff notes

The schema captures CP3-ready fields (head_sha, references_short/full,
SHA triplet) at zero additional API cost, preparing for file-context
and cross-project reference features.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:44:37 -05:00
Taylor Eernisse
d15f457a58 feat(db): Add SQLite database migrations for GitLab data model
Implements a comprehensive relational schema for storing GitLab data
with full audit trail and raw payload preservation.

Migration 001_initial.sql establishes core metadata tables:
- projects: Tracked GitLab projects with paths and namespace
- sync_watermarks: Cursor-based incremental sync state per project
- schema_migrations: Migration tracking with checksums for integrity

Migration 002_issues.sql creates the issues data model:
- issues: Core issue data with timestamps, author, state, counts
- labels: Project-specific label definitions with colors/descriptions
- issue_labels: Many-to-many junction for issue-label relationships
- milestones: Project milestones with state and due dates
- discussions: Threaded discussions linked to issues/MRs
- notes: Individual notes within discussions with full metadata
- raw_payloads: Compressed original API responses keyed by entity

Migration 003_indexes.sql adds performance indexes:
- Covering indexes for common query patterns (state, updated_at)
- Composite indexes for filtered queries (project + state)

Migration 004_discussions_payload.sql extends discussions:
- Adds raw_payload column for discussion-level API preservation
- Enables debugging and data recovery from original responses

Migration 005_assignees_milestone_duedate.sql completes the model:
- issue_assignees: Many-to-many for multiple assignees per issue
- Adds milestone_id, due_date columns to issues table
- Indexes for assignee and milestone filtering

Schema supports both incremental sync and full historical queries.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:27:51 -05:00