Commit Graph

168 Commits

Author SHA1 Message Date
Taylor Eernisse
39a71d8b85 feat(db): Add schema migration v6 for merge request support
Introduces comprehensive database schema for merge request ingestion
(CP2), designed with forward compatibility for future features.

New tables:
- merge_requests: Core MR metadata with draft status, branch info,
  detailed_merge_status (modern API field), and sync health telemetry
  columns for debuggability
- mr_labels: Junction table linking MRs to shared labels table
- mr_assignees: MR assignee usernames (same pattern as issues)
- mr_reviewers: MR-specific reviewer tracking (not applicable to issues)

Additional indexes:
- discussions: Add merge_request_id and resolved status indexes
- notes: Add composite indexes for DiffNote file/line queries

DiffNote position enhancements:
- position_type: 'text' | 'image' | 'file' for diff comment semantics
- position_line_range_start/end: Multi-line comment range support
- position_base_sha/start_sha/head_sha: Commit context for diff notes

The schema captures CP3-ready fields (head_sha, references_short/full,
SHA triplet) at zero additional API cost, preparing for file-context
and cross-project reference features.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:44:37 -05:00
Taylor Eernisse
8afb2c2e75 docs: Expand README with comprehensive CLI and config documentation
Significantly expand the README to serve as complete user documentation
for the CLI tool, reflecting the full CP1 implementation.

Configuration section:
- Add missing config options: heartbeatIntervalSeconds, primaryConcurrency,
  dependentConcurrency, backupDir, embedding provider settings
- Document config file resolution order (CLI flag, env var, XDG, local)
- Add environment variables table with GITLAB_TOKEN, GI_CONFIG_PATH,
  XDG_CONFIG_HOME, XDG_DATA_HOME, RUST_LOG

Commands section:
- Document --full flag for complete re-sync (resets cursors and watermarks)
- Add output descriptions for list, show, and count commands
- Document assignee filter with @ prefix normalization
- Add gi doctor checks explanation (config, db, GitLab auth, Ollama)
- Add gi sync-status output description
- Add placeholder documentation for backup and reset commands

Database schema section:
- Reformat as table with descriptions
- Add sync_runs, sync_cursors, app_locks, schema_version tables
- Note WAL mode and foreign keys enabled

Development section:
- Add RUST_LOG=gi=trace example for detailed logging

Current status section:
- Document CP1 scope (issues, discussions, incremental sync)
- List not-yet-implemented features (MRs, embeddings, backup/reset)
- Reference SPEC.md for full roadmap

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:01:37 -05:00
Taylor Eernisse
0952d21a90 docs(prd): Add CP2 PRD and CP1-CP2 alignment audit
Add comprehensive planning documentation for Checkpoint 2 (Merge Request
support) and document the results of the CP1 implementation audit.

checkpoint-2.md (2093 lines):
- Complete PRD for adding merge request ingestion, querying, and display
- Detailed user stories with acceptance criteria
- ASCII wireframes for CLI output formats
- Database schema extensions (migrations 006-007)
- API integration specifications for MR endpoints
- State transition diagrams for MR lifecycle
- Performance requirements and test specifications
- Risk assessment and mitigation strategies

cp1-cp2-alignment-audit.md (344 lines):
- Gap analysis between CP1 PRD and actual implementation
- Identified issues prioritized by severity (P0/P1/P2/P3)
- P0: NormalizedDiscussion struct incompatible with MR discussions
- P1: --full flag not resetting discussion watermarks
- P2: Missing Link header pagination fallback
- P3: Missing sync health telemetry and selective payload storage
- Each issue includes root cause, recommended fix, and affected files

The P0 and P1 issues have been fixed in accompanying commits. P2 and P3
items are deferred to CP2 implementation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:01:20 -05:00
Taylor Eernisse
4abbe2a226 fix(ingest): Reset discussion watermarks when --full flag is used
This is a P1 fix from the CP1-CP2 alignment audit. The --full flag was
designed to enable complete data re-synchronization, but it only reset
sync_cursors for issues—it failed to reset the per-issue
discussions_synced_for_updated_at watermark.

The result was an inconsistent state: issues would be re-fetched from
GitLab (because sync_cursors were cleared), but their discussions would
NOT be re-synced (because the watermark comparison prevented it). This
was a subtle bug because the watermark check uses:

  WHERE updated_at > COALESCE(discussions_synced_for_updated_at, 0)

When discussions_synced_for_updated_at is already set to the issue's
updated_at, the comparison fails and discussions are skipped.

Fix: Before clearing sync_cursors, set discussions_synced_for_updated_at
to NULL for all issues in the project. This makes COALESCE return 0,
ensuring all issues become eligible for discussion sync.

The ordering is important: watermarks must be reset BEFORE cursors to
ensure the full sync behaves consistently.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:01:04 -05:00
Taylor Eernisse
d9d749ac57 fix(discussion): Make NormalizedDiscussion polymorphic for MR support
This is a P0 fix from the CP1-CP2 alignment audit. The original
NormalizedDiscussion struct had issue_id as a non-optional i64 and
hardcoded noteable_type to "Issue", making it incompatible with merge
request discussions even though the database schema already supports
both via nullable columns and a CHECK constraint.

Changes:
- Add NoteableRef enum with Issue(i64) and MergeRequest(i64) variants
  to provide compile-time safety against mixing up issue vs MR IDs
- Change NormalizedDiscussion.issue_id from i64 to Option<i64>
- Add NormalizedDiscussion.merge_request_id: Option<i64>
- Update transform_discussion() signature to take NoteableRef instead
  of local_issue_id, deriving issue_id/merge_request_id/noteable_type
  from the enum variant
- Update upsert_discussion() SQL to include merge_request_id column
  (now 12 parameters instead of 11)
- Export NoteableRef from transformers module
- Add test for MergeRequest discussion transformation
- Update all existing tests to use NoteableRef::Issue(id)

The database schema (migration 002) was forward-thinking and already
supports both issue_id and merge_request_id as nullable columns with
a CHECK constraint. This change prepares the application layer for
CP2 merge request support without requiring any migrations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 17:00:49 -05:00
teernisse
fbdfd8f4cb beads 2026-01-26 11:34:04 -05:00
Taylor Eernisse
f53645790a test: Add comprehensive test suite with fixtures
Establishes testing infrastructure for reliable development.

tests/fixtures/ - GitLab API response samples:
- gitlab_issue.json: Single issue with full metadata
- gitlab_issues_page.json: Paginated issue list response
- gitlab_discussion.json: Discussion thread with notes
- gitlab_discussions_page.json: Paginated discussions response
All fixtures captured from real GitLab API responses with
sensitive data redacted, ensuring tests match actual behavior.

tests/gitlab_types_tests.rs - Type deserialization tests:
- Validates serde parsing of all GitLab API types
- Tests edge cases: null fields, empty arrays, nested objects
- Ensures GitLabIssue, GitLabDiscussion, GitLabNote parse correctly
- Verifies optional fields handle missing data gracefully
- Tests author/assignee extraction from various formats

tests/fixture_tests.rs - Integration with fixtures:
- Loads fixture files and validates parsing
- Tests transformer functions produce correct database rows
- Verifies IssueWithMetadata extracts labels and assignees
- Tests NormalizedDiscussion/NormalizedNote structure
- Validates raw payload preservation logic

tests/migration_tests.rs - Database schema tests:
- Creates in-memory SQLite for isolation
- Runs all migrations and verifies schema
- Tests table creation with expected columns
- Validates foreign key constraints
- Tests index creation for query performance
- Verifies idempotent migration behavior

Test infrastructure uses:
- tempfile for isolated database instances
- wiremock for HTTP mocking (available for future API tests)
- Standard Rust #[test] attributes

Run with: cargo test
Run single: cargo test test_name
Run with output: cargo test -- --nocapture

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:29:06 -05:00
Taylor Eernisse
8fb890c528 feat(cli): Implement complete command-line interface
Provides a user-friendly CLI for all GitLab Inbox operations.

src/cli/mod.rs - Clap command definitions:
- Global --config flag for alternate config path
- Subcommands: init, auth-test, doctor, version, backup, reset,
  migrate, sync-status, ingest, list, count, show
- Ingest supports --type (issues/merge_requests), --project filter,
  --force lock override, --full resync
- List supports rich filtering: --state, --author, --assignee,
  --label, --milestone, --since, --due-before, --has-due-date
- List supports --sort (updated/created/iid), --order (asc/desc)
- List supports --open to launch browser, --json for scripting

src/cli/commands/ - Command implementations:

init.rs: Interactive configuration wizard
- Prompts for GitLab URL, token env var, projects to track
- Creates config file and initializes database
- Supports --force overwrite and --non-interactive mode

auth_test.rs: Verify GitLab authentication
- Calls /api/v4/user to validate token
- Displays username and GitLab instance URL

doctor.rs: Environment health check
- Validates config file exists and parses correctly
- Checks database connectivity and migration state
- Verifies GitLab authentication
- Reports token environment variable status
- Supports --json output for CI integration

ingest.rs: Data synchronization from GitLab
- Acquires sync lock with stale detection
- Shows progress bars for issues and discussions
- Reports sync statistics on completion
- Supports --full flag to reset cursors and refetch all data

list.rs: Query local database
- Formatted table output with comfy-table
- Filters build dynamic SQL with parameterized queries
- Username filters normalize @ prefix automatically
- --open flag uses 'open' crate for cross-platform browser launch
- --json outputs array of issue objects

show.rs: Detailed entity view
- Displays issue metadata in structured format
- Shows full description with markdown
- Lists labels, assignees, milestone
- Shows discussion threads with notes

count.rs: Entity statistics
- Counts issues, discussions, or notes
- Supports --type filter for discussions/notes

sync_status.rs: Display sync watermarks
- Shows last sync time per project
- Displays cursor positions for debugging

src/main.rs - Application entry point:
- Initializes tracing subscriber with env-filter
- Parses CLI arguments via clap
- Dispatches to appropriate command handler
- Consistent error formatting for all failure modes

src/lib.rs - Library entry point:
- Exports cli, core, gitlab, ingestion modules
- Re-exports Config, GiError, Result for convenience

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:28:52 -05:00
Taylor Eernisse
cd60350c6d feat(ingestion): Implement cursor-based incremental sync from GitLab
Provides efficient data synchronization with minimal API calls.

src/ingestion/issues.rs - Issue sync logic:
- Cursor-based incremental sync using updated_at timestamp
- Fetches only issues modified since last sync
- Configurable cursor rewind for overlap safety (default 2s)
- Batched database writes with transaction wrapping
- Upserts issues, labels, milestones, and assignees
- Maintains issue_labels and issue_assignees junction tables
- Returns IngestIssuesResult with counts and issues needing discussion sync
- Identifies issues where discussion count changed

src/ingestion/discussions.rs - Discussion sync logic:
- Fetches discussions for issues that need sync
- Compares discussion count vs stored to detect changes
- Batched note insertion with raw payload preservation
- Updates discussion metadata (resolved state, note counts)
- Tracks sync state per discussion to enable incremental updates
- Returns IngestDiscussionsResult with fetched/skipped counts

src/ingestion/orchestrator.rs - Sync coordination:
- Two-phase sync: issues first, then discussions
- Progress callback support for CLI progress bars
- ProgressEvent enum for fine-grained status updates:
  - IssueFetch, IssueProcess, DiscussionFetch, DiscussionSkip
- Acquires sync lock before starting
- Updates sync watermark on successful completion
- Handles partial failures gracefully (watermark not updated)
- Returns IngestProjectResult with detailed statistics

The architecture supports future additions:
- Merge request ingestion (parallel to issues)
- Full-text search indexing hooks
- Vector embedding pipeline integration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:28:34 -05:00
Taylor Eernisse
dd5eb04953 feat(gitlab): Implement GitLab REST API client and type definitions
Provides a typed interface to the GitLab API with pagination support.

src/gitlab/types.rs - API response type definitions:
- GitLabIssue: Full issue payload with author, assignees, labels
- GitLabDiscussion: Discussion thread with notes array
- GitLabNote: Individual note with author, timestamps, body
- GitLabAuthor/GitLabUser: User information with avatar URLs
- GitLabProject: Project metadata from /api/v4/projects
- GitLabVersion: GitLab instance version from /api/v4/version
- GitLabNotePosition: Line-level position for diff notes
- All types derive Deserialize for JSON parsing

src/gitlab/client.rs - HTTP client with authentication:
- Bearer token authentication from config
- Base URL configuration for self-hosted instances
- Paginated iteration via keyset or offset pagination
- Automatic Link header parsing for next page URLs
- Per-page limit control (default 100)
- Methods: get_user(), get_version(), get_project()
- Async stream for issues: list_issues_paginated()
- Async stream for discussions: list_issue_discussions_paginated()
- Respects GitLab rate limiting via response headers

src/gitlab/transformers/ - API to database mapping:

transformers/issue.rs - Issue transformation:
- Maps GitLabIssue to IssueRow for database insert
- Extracts milestone ID and due date
- Normalizes author/assignee usernames
- Preserves label IDs for junction table
- Returns IssueWithMetadata including label/assignee lists

transformers/discussion.rs - Discussion transformation:
- Maps GitLabDiscussion to NormalizedDiscussion
- Extracts thread metadata (resolvable, resolved)
- Flattens notes to NormalizedNote with foreign keys
- Handles system notes vs user notes
- Preserves note position for diff discussions

transformers/mod.rs - Re-exports all transformer types

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:28:21 -05:00
Taylor Eernisse
7aaa51f645 feat(core): Implement infrastructure layer for CLI operations
Establishes foundational modules that all other components depend on.

src/core/config.rs - Configuration management:
- JSON-based config file with Zod-like validation via serde
- GitLab settings: base URL, token environment variable
- Project list with paths to track
- Sync settings: backfill days, stale lock timeout, cursor rewind
- Storage settings: database path, payload compression toggle
- XDG-compliant config path resolution via dirs crate
- Loads GITLAB_TOKEN from configured environment variable

src/core/db.rs - Database connection and migrations:
- Opens or creates SQLite database with WAL mode for concurrency
- Embeds migration SQL as const strings (001-005)
- Runs migrations idempotently with checksum verification
- Provides thread-safe connection management

src/core/error.rs - Unified error handling:
- GiError enum with variants for all failure modes
- Config, Database, GitLab, Ingestion, Lock, IO, Parse errors
- thiserror derive for automatic Display/Error impls
- Result type alias for ergonomic error propagation

src/core/lock.rs - Distributed sync locking:
- File-based locks to prevent concurrent syncs
- Stale lock detection with configurable timeout
- Force override for recovery scenarios
- Lock file contains PID and timestamp for debugging

src/core/paths.rs - Path resolution:
- XDG Base Directory Specification compliance
- Config: ~/.config/gi/config.json
- Data: ~/.local/share/gi/gi.db
- Creates parent directories on first access

src/core/payloads.rs - Raw payload storage:
- Optional gzip compression for storage efficiency
- SHA-256 content addressing for deduplication
- Type-prefixed keys (issue:, discussion:, note:)
- Batch insert with UPSERT for idempotent ingestion

src/core/time.rs - Timestamp utilities:
- Relative time parsing (7d, 2w, 1m) for --since flag
- ISO 8601 date parsing for absolute dates
- Human-friendly relative time formatting

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:28:07 -05:00
Taylor Eernisse
d15f457a58 feat(db): Add SQLite database migrations for GitLab data model
Implements a comprehensive relational schema for storing GitLab data
with full audit trail and raw payload preservation.

Migration 001_initial.sql establishes core metadata tables:
- projects: Tracked GitLab projects with paths and namespace
- sync_watermarks: Cursor-based incremental sync state per project
- schema_migrations: Migration tracking with checksums for integrity

Migration 002_issues.sql creates the issues data model:
- issues: Core issue data with timestamps, author, state, counts
- labels: Project-specific label definitions with colors/descriptions
- issue_labels: Many-to-many junction for issue-label relationships
- milestones: Project milestones with state and due dates
- discussions: Threaded discussions linked to issues/MRs
- notes: Individual notes within discussions with full metadata
- raw_payloads: Compressed original API responses keyed by entity

Migration 003_indexes.sql adds performance indexes:
- Covering indexes for common query patterns (state, updated_at)
- Composite indexes for filtered queries (project + state)

Migration 004_discussions_payload.sql extends discussions:
- Adds raw_payload column for discussion-level API preservation
- Enables debugging and data recovery from original responses

Migration 005_assignees_milestone_duedate.sql completes the model:
- issue_assignees: Many-to-many for multiple assignees per issue
- Adds milestone_id, due_date columns to issues table
- Indexes for assignee and milestone filtering

Schema supports both incremental sync and full historical queries.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:27:51 -05:00
Taylor Eernisse
986bc59f6a docs: Add comprehensive documentation and planning artifacts
README.md provides complete user documentation:
- Installation via cargo install or build from source
- Quick start guide with example commands
- Configuration file format with all options documented
- Full command reference for init, auth-test, doctor, ingest,
  list, show, count, sync-status, migrate, and version
- Database schema overview covering projects, issues, milestones,
  assignees, labels, discussions, notes, and raw payloads
- Development setup with test, lint, and debug commands

SPEC.md updated from original TypeScript planning document:
- Added note clarifying this is historical (implementation uses Rust)
- Updated sqlite-vss references to sqlite-vec (deprecated library)
- Added architecture overview with Technology Choices rationale
- Expanded project structure showing all planned modules

docs/prd/ contains detailed checkpoint planning:
- checkpoint-0.md: Initial project vision and requirements
- checkpoint-1.md: Revised planning after technology decisions

These documents capture the evolution from initial concept through
the decision to use Rust for performance and type safety.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:27:40 -05:00
Taylor Eernisse
e065862f81 feat: Initialize Rust project with dependencies and tooling
Set up the GitLab Inbox (gi) CLI tool as a Rust 2024 edition project.

Dependencies organized by purpose:
- Database: rusqlite (bundled SQLite), sqlite-vec for vector search
- Serialization: serde/serde_json for GitLab API responses
- CLI: clap for argument parsing, dialoguer for interactive prompts,
  comfy-table for formatted output, indicatif for progress bars
- HTTP: reqwest with tokio async runtime for GitLab API calls
- Async: async-stream and futures for paginated API iteration
- Utilities: thiserror for error types, chrono for timestamps,
  flate2 for payload compression, sha2 for content hashing
- Logging: tracing with env-filter for structured debug output

Release profile optimized for small binary size (LTO, strip symbols).

Project structure follows standard Rust conventions with src/lib.rs
exposing modules and src/main.rs as CLI entry point.

Added .gitignore for Rust/Cargo artifacts and local database files.
Added AGENTS.md with TDD workflow guidance and beads issue tracking
integration instructions for AI-assisted development.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 11:27:30 -05:00
teernisse
e846a39ce6 More planning 2026-01-23 15:31:52 -05:00
teernisse
1f36fe6a21 more planning 2026-01-21 15:56:11 -05:00
teernisse
97a303eca9 Spec iterations 2026-01-20 16:43:39 -05:00
teernisse
7702d2a493 initial 2026-01-20 13:11:40 -05:00