Taylor Eernisse d33f24c91b feat(transformers): Add MR transformer and polymorphic discussion support
Introduces NormalizedMergeRequest transformer and updates discussion
normalization to handle both issue and MR discussions polymorphically.

New transformers:
- NormalizedMergeRequest: Transforms API MergeRequest to database row,
  extracting labels/assignees/reviewers into separate collections for
  junction table insertion. Handles draft detection, detailed_merge_status
  preference over deprecated merge_status, and merge_user over merged_by.

Discussion transformer updates:
- NormalizedDiscussion now takes noteable_type ("Issue" | "MergeRequest")
  and noteable_id for polymorphic FK binding
- normalize_discussions_for_issue(): Convenience wrapper for issues
- normalize_discussions_for_mr(): Convenience wrapper for MRs
- DiffNote position fields (type, line_range, SHA triplet) now extracted
  from API position object for code review context

Design decisions:
- Transformer returns (normalized_item, labels, assignees, reviewers)
  tuple for efficient batch insertion without re-querying
- Timestamps converted to ms epoch for SQLite storage consistency
- Optional fields use map() chains for clean null handling

The polymorphic discussion approach allows reusing the same discussions
and notes tables for both issues and MRs, with noteable_type + FK
determining the parent relationship.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 22:45:29 -05:00

gi - GitLab Inbox

A command-line tool for managing GitLab issues locally. Syncs issues, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying and filtering.

Features

  • Local-first: All data stored in SQLite for instant queries
  • Incremental sync: Cursor-based sync only fetches changes since last sync
  • Full re-sync: Reset cursors and fetch all data from scratch when needed
  • Multi-project: Track issues across multiple GitLab projects
  • Rich filtering: Filter by state, author, assignee, labels, milestone, due date
  • Raw payload storage: Preserves original GitLab API responses for debugging

Installation

cargo install --path .

Or build from source:

cargo build --release
./target/release/gi --help

Quick Start

# Initialize configuration (interactive)
gi init

# Verify authentication
gi auth-test

# Sync issues from GitLab
gi ingest --type issues

# List recent issues
gi list issues --limit 10

# Show issue details
gi show issue 123 --project group/repo

Configuration

Configuration is stored in ~/.config/gi/config.json (or $XDG_CONFIG_HOME/gi/config.json).

Example Configuration

{
  "gitlab": {
    "baseUrl": "https://gitlab.com",
    "tokenEnvVar": "GITLAB_TOKEN"
  },
  "projects": [
    { "path": "group/project" },
    { "path": "other-group/other-project" }
  ],
  "sync": {
    "backfillDays": 14,
    "staleLockMinutes": 10,
    "heartbeatIntervalSeconds": 30,
    "cursorRewindSeconds": 2,
    "primaryConcurrency": 4,
    "dependentConcurrency": 2
  },
  "storage": {
    "compressRawPayloads": true
  }
}

Configuration Options

| Section   | Field                    | Default                   | Description                                      |
|-----------|--------------------------|---------------------------|--------------------------------------------------|
| gitlab    | baseUrl                  |                           | GitLab instance URL (required)                   |
| gitlab    | tokenEnvVar              | GITLAB_TOKEN              | Environment variable containing API token        |
| projects  | path                     |                           | Project path (e.g., group/project)               |
| sync      | backfillDays             | 14                        | Days to backfill on initial sync                 |
| sync      | staleLockMinutes         | 10                        | Minutes before sync lock considered stale        |
| sync      | heartbeatIntervalSeconds | 30                        | Frequency of lock heartbeat updates              |
| sync      | cursorRewindSeconds      | 2                         | Seconds to rewind cursor for overlap safety      |
| sync      | primaryConcurrency       | 4                         | Concurrent GitLab requests for primary resources |
| sync      | dependentConcurrency     | 2                         | Concurrent requests for dependent resources      |
| storage   | dbPath                   | ~/.local/share/gi/gi.db   | Database file path                               |
| storage   | backupDir                | ~/.local/share/gi/backups | Backup directory                                 |
| storage   | compressRawPayloads      | true                      | Compress stored API responses with gzip          |
| embedding | provider                 | ollama                    | Embedding provider                               |
| embedding | model                    | nomic-embed-text          | Model name for embeddings                        |
| embedding | baseUrl                  | http://localhost:11434    | Ollama server URL                                |
| embedding | concurrency              | 4                         | Concurrent embedding requests                    |
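
Among the sync settings, backfillDays and cursorRewindSeconds together decide how far back a sync run looks. The following is a minimal sketch of that calculation, not gi's actual code: the function name `sync_window_start` and the use of Unix seconds are assumptions made for illustration. The result would typically be passed as the lower bound of an "updated after" filter on the GitLab API.

```rust
/// Hypothetical helper: compute the lower time bound for a sync run.
/// All timestamps are Unix seconds (an assumption for this sketch).
fn sync_window_start(
    cursor: Option<i64>,
    now: i64,
    backfill_days: i64,
    cursor_rewind_seconds: i64,
) -> i64 {
    match cursor {
        // Incremental sync: step back slightly so items updated at the same
        // instant as the stored cursor are fetched again rather than skipped.
        Some(last_seen) => (last_seen - cursor_rewind_seconds).max(0),
        // Initial sync: no cursor yet, so look back `backfill_days` days.
        None => (now - backfill_days * 86_400).max(0),
    }
}

fn main() {
    let now = 1_700_000_000;
    // First run with the documented defaults (14 days, 2 seconds).
    println!("initial:     {}", sync_window_start(None, now, 14, 2));
    // Later run, with a cursor stored one minute ago.
    println!("incremental: {}", sync_window_start(Some(now - 60), now, 14, 2));
}
```

Because the rewind re-fetches a small overlap, whatever writes the results needs to be idempotent (for example, an upsert keyed on the GitLab ID).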

Config File Resolution

The config file is resolved in this order:

  1. --config CLI flag
  2. GI_CONFIG_PATH environment variable
  3. ~/.config/gi/config.json (XDG default)
  4. ./gi.config.json (local fallback for development)
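
The resolution order above can be sketched as a single function. This is an illustration under stated assumptions rather than gi's implementation: `resolve_config_path` is a hypothetical name, the environment and filesystem lookups are injected as closures so the logic is testable, and the sketch assumes that an explicit flag or GI_CONFIG_PATH is used as given while the two default locations are only used if the file exists.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical resolver following the four-step order listed above.
fn resolve_config_path(
    cli_flag: Option<&str>,
    env: impl Fn(&str) -> Option<String>,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // 1. An explicit --config flag always wins.
    if let Some(path) = cli_flag {
        return Some(PathBuf::from(path));
    }
    // 2. GI_CONFIG_PATH environment variable.
    if let Some(path) = env("GI_CONFIG_PATH") {
        return Some(PathBuf::from(path));
    }
    // 3. XDG default: $XDG_CONFIG_HOME/gi/config.json, else ~/.config/gi/config.json.
    let config_home = match env("XDG_CONFIG_HOME") {
        Some(dir) if !dir.is_empty() => Some(PathBuf::from(dir)),
        _ => env("HOME").map(|home| PathBuf::from(home).join(".config")),
    };
    if let Some(base) = config_home {
        let candidate = base.join("gi").join("config.json");
        if exists(&candidate) {
            return Some(candidate);
        }
    }
    // 4. Local fallback for development.
    let local = PathBuf::from("./gi.config.json");
    if exists(&local) {
        Some(local)
    } else {
        None
    }
}

fn main() {
    let found = resolve_config_path(None, |key| std::env::var(key).ok(), |path| path.exists());
    match found {
        Some(path) => println!("using config at {}", path.display()),
        None => println!("no config found; run `gi init`"),
    }
}
```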

GitLab Token

Create a personal access token with read_api scope:

  1. Go to GitLab → Settings → Access Tokens
  2. Create token with read_api scope
  3. Export it: export GITLAB_TOKEN=glpat-xxxxxxxxxxxx
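
Because the variable name is configurable through gitlab.tokenEnvVar, the token lookup is indirect: the config names a variable, and the variable holds the secret. A minimal sketch of that lookup follows; `read_token` is a hypothetical helper, and the error messages and whitespace trimming are assumptions made for illustration.

```rust
/// Hypothetical lookup of the API token. `token_env_var` is the value of
/// gitlab.tokenEnvVar; `env` is injected to keep the function testable.
fn read_token(
    token_env_var: &str,
    env: impl Fn(&str) -> Option<String>,
) -> Result<String, String> {
    match env(token_env_var) {
        // Trim stray whitespace or newlines picked up from shell profiles.
        Some(value) if !value.trim().is_empty() => Ok(value.trim().to_string()),
        Some(_) => Err(format!("{} is set but empty", token_env_var)),
        None => Err(format!(
            "{} is not set; export it before running gi",
            token_env_var
        )),
    }
}

fn main() {
    match read_token("GITLAB_TOKEN", |key| std::env::var(key).ok()) {
        // Never print the token itself; report only that it was found.
        Ok(token) => println!("token found ({} characters)", token.len()),
        Err(message) => eprintln!("{}", message),
    }
}
```

Keeping only the variable name in config.json means the file can be committed or shared without exposing the secret.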

Environment Variables

| Variable        | Purpose                                                                     | Required |
|-----------------|-----------------------------------------------------------------------------|----------|
| GITLAB_TOKEN    | GitLab API authentication token (name configurable via gitlab.tokenEnvVar) | Yes      |
| GI_CONFIG_PATH  | Override config file location                                               | No       |
| XDG_CONFIG_HOME | XDG Base Directory for config (fallback: ~/.config)                         | No       |
| XDG_DATA_HOME   | XDG Base Directory for data (fallback: ~/.local/share)                      | No       |
| RUST_LOG        | Logging level filter (e.g., gi=debug)                                       | No       |

Commands

gi init

Initialize configuration and database interactively.

gi init                    # Interactive setup
gi init --force            # Overwrite existing config
gi init --non-interactive  # Fail if prompts needed

gi auth-test

Verify GitLab authentication is working.

gi auth-test
# Authenticated as @username (Full Name)
# GitLab: https://gitlab.com

gi doctor

Check environment health and configuration.

gi doctor          # Human-readable output
gi doctor --json   # JSON output for scripting

Checks performed:

  • Config file existence and validity
  • Database existence and pragmas (WAL mode, foreign keys)
  • GitLab authentication
  • Project accessibility
  • Ollama connectivity (optional)

gi ingest

Sync data from GitLab to local database.

gi ingest --type issues                       # Sync all projects
gi ingest --type issues --project group/repo  # Single project
gi ingest --type issues --force               # Override stale lock
gi ingest --type issues --full                # Full re-sync (reset cursors)

The --full flag resets sync cursors and fetches all data from scratch. This is useful when:

  • Assignee data or other fields were missing from earlier syncs
  • You want to ensure complete data after schema changes
  • You are troubleshooting sync issues
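
The --force flag shown earlier in this section refers to a stale lock. This README does not spell out the exact rule, so the following is only a sketch of one plausible staleness test based on the sync.staleLockMinutes setting: a lock counts as stale once its holder has not recorded a heartbeat for longer than the threshold. The function name and the Unix-seconds representation are assumptions.

```rust
/// Hypothetical staleness test for a sync lock.
/// `last_heartbeat` and `now` are Unix seconds (an assumption for this sketch).
fn is_lock_stale(last_heartbeat: i64, now: i64, stale_lock_minutes: i64) -> bool {
    // A live holder refreshes its heartbeat periodically, so a long silence
    // suggests the process crashed without releasing the lock.
    now - last_heartbeat > stale_lock_minutes * 60
}

fn main() {
    let now = 1_700_000_000;
    // Heartbeat 30 seconds ago: the holder is presumably still running.
    println!("recent heartbeat stale? {}", is_lock_stale(now - 30, now, 10));
    // Heartbeat 15 minutes ago: past the documented 10-minute default.
    println!("old heartbeat stale?    {}", is_lock_stale(now - 900, now, 10));
}
```

With the documented defaults (a heartbeat every 30 seconds and a 10-minute threshold), a healthy holder would refresh roughly 20 times before its lock could be mistaken for a stale one.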

gi list issues

Query issues from local database.

gi list issues                              # Recent issues (default 50)
gi list issues --limit 100                  # More results
gi list issues --state opened               # Only open issues
gi list issues --state closed               # Only closed issues
gi list issues --author username            # By author (@ prefix optional)
gi list issues --assignee username          # By assignee (@ prefix optional)
gi list issues --label bug                  # By label
gi list issues --label bug --label urgent   # Multiple labels (AND logic)
gi list issues --milestone "v1.0"           # By milestone title
gi list issues --since 7d                   # Updated in last 7 days
gi list issues --since 2w                   # Updated in last 2 weeks
gi list issues --since 2024-01-01           # Updated since date
gi list issues --due-before 2024-12-31      # Due before date
gi list issues --has-due-date               # Only issues with due dates
gi list issues --project group/repo         # Filter by project
gi list issues --sort created --order asc   # Sort options
gi list issues --open                       # Open first result in browser
gi list issues --json                       # JSON output

Output includes: IID, title, state, author, assignee, labels, and update time.
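
The --since examples accept two shapes: a relative window such as 7d or 2w, and an absolute date such as 2024-01-01. A minimal parser for those documented forms might look like the sketch below. The `Since` type and `parse_since` function are hypothetical, only the d and w units shown in the examples are handled, and the date check is a basic range check rather than full calendar validation.

```rust
/// Parsed form of a `--since` value (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Since {
    /// Relative window, in seconds before "now" (e.g. `7d`, `2w`).
    Relative(u64),
    /// Absolute calendar date (`YYYY-MM-DD`).
    Date { year: u32, month: u32, day: u32 },
}

fn parse_since(input: &str) -> Option<Since> {
    let input = input.trim();
    // Relative form: digits followed by a single unit letter.
    let seconds_per_unit: Option<u64> = match input.chars().last() {
        Some('d') => Some(86_400),
        Some('w') => Some(7 * 86_400),
        _ => None,
    };
    if let Some(per_unit) = seconds_per_unit {
        // The unit is one ASCII byte, so slicing it off is safe.
        let count: u64 = input[..input.len() - 1].parse().ok()?;
        return count.checked_mul(per_unit).map(Since::Relative);
    }
    // Absolute form: YYYY-MM-DD, with basic range checks only.
    let parts: Vec<&str> = input.split('-').collect();
    if parts.len() != 3 {
        return None;
    }
    let year: u32 = parts[0].parse().ok()?;
    let month: u32 = parts[1].parse().ok()?;
    let day: u32 = parts[2].parse().ok()?;
    if !(1..=12).contains(&month) || !(1..=31).contains(&day) {
        return None;
    }
    Some(Since::Date { year, month, day })
}

fn main() {
    for raw in ["7d", "2w", "2024-01-01", "soon"] {
        println!("{:>12} -> {:?}", raw, parse_since(raw));
    }
}
```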

gi show issue

Display detailed issue information.

gi show issue 123                      # Show issue #123
gi show issue 123 --project group/repo # Disambiguate if needed

Shows: title, description, state, author, assignees, labels, milestone, due date, web URL, and threaded discussions.

gi count

Count entities in local database.

gi count issues                    # Total issues
gi count discussions               # Total discussions
gi count discussions --type issue  # Issue discussions only
gi count notes                     # Total notes (shows system vs user breakdown)

gi sync-status

Show current sync state and watermarks.

gi sync-status

Displays:

  • Last sync run details (status, timing)
  • Cursor positions per project and resource type
  • Data summary counts

gi migrate

Run pending database migrations.

gi migrate

Shows current schema version and applies any pending migrations.

gi version

Show version information.

gi version

gi backup

Create timestamped database backup.

gi backup

Note: Not yet implemented.

gi reset

Delete database and reset all state.

gi reset --confirm

Note: Not yet implemented.

Database Schema

Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:

| Table           | Purpose                                                     |
|-----------------|-------------------------------------------------------------|
| projects        | Tracked GitLab projects with metadata                       |
| issues          | Issue metadata (title, state, author, due date, milestone)  |
| milestones      | Project milestones with state and due dates                 |
| labels          | Project labels with colors                                  |
| issue_labels    | Many-to-many issue-label relationships                      |
| issue_assignees | Many-to-many issue-assignee relationships                   |
| discussions     | Issue/MR discussion threads                                 |
| notes           | Individual notes within discussions (with system note flag) |
| sync_runs       | Audit trail of sync operations                              |
| sync_cursors    | Cursor positions for incremental sync                       |
| app_locks       | Crash-safe single-flight lock                               |
| raw_payloads    | Compressed original API responses                           |
| schema_version  | Migration version tracking                                  |
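
The issue_labels junction table is what allows a repeated --label filter to use AND logic: an issue matches only if it is linked to every requested label. One common way to express that in SQL is to group by issue and require the number of distinct matched labels to equal the number requested. The sketch below builds such a query string; the column names (`issue_id`, `label_id`, `id`, `name`) are assumptions, since this README lists the tables but not their columns.

```rust
/// Hypothetical query builder for an AND-logic label filter.
/// Returns None when no labels were requested (no filter needed).
fn label_filter_sql(label_count: usize) -> Option<String> {
    if label_count == 0 {
        return None;
    }
    // One positional placeholder per label, bound by the caller.
    let placeholders = vec!["?"; label_count].join(", ");
    Some(format!(
        "SELECT il.issue_id FROM issue_labels il \
         JOIN labels l ON l.id = il.label_id \
         WHERE l.name IN ({}) \
         GROUP BY il.issue_id \
         HAVING COUNT(DISTINCT l.name) = {}",
        placeholders, label_count
    ))
}

fn main() {
    // Equivalent of: gi list issues --label bug --label urgent
    if let Some(sql) = label_filter_sql(2) {
        println!("{}", sql);
    }
}
```

Binding the label names as parameters, rather than interpolating them into the string, keeps the query safe for label names that contain quotes.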

The database is stored at ~/.local/share/gi/gi.db by default (XDG compliant).
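
The default location follows from the XDG_DATA_HOME fallback described under Environment Variables. As a sketch only, the lookup could be written as below. `resolve_db_path` is a hypothetical name, the precedence of an explicit storage.dbPath value over the XDG default is an assumption, and tilde expansion of a configured path is not handled here.

```rust
use std::path::PathBuf;

/// Hypothetical resolver for the database location.
/// `configured` stands in for an explicit storage.dbPath value;
/// `env` is injected so the function can be tested without real variables.
fn resolve_db_path(
    configured: Option<&str>,
    env: impl Fn(&str) -> Option<String>,
) -> Option<PathBuf> {
    // An explicit storage.dbPath is assumed to take precedence.
    if let Some(path) = configured {
        return Some(PathBuf::from(path));
    }
    // Otherwise use $XDG_DATA_HOME, falling back to ~/.local/share.
    let data_home = match env("XDG_DATA_HOME") {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(env("HOME")?).join(".local").join("share"),
    };
    Some(data_home.join("gi").join("gi.db"))
}

fn main() {
    match resolve_db_path(None, |key| std::env::var(key).ok()) {
        Some(path) => println!("database: {}", path.display()),
        None => println!("cannot determine a data directory (HOME is unset)"),
    }
}
```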

Global Options

gi --config /path/to/config.json <command>  # Use alternate config

Development

# Run tests
cargo test

# Run with debug logging
RUST_LOG=gi=debug gi list issues

# Run with trace logging
RUST_LOG=gi=trace gi ingest --type issues

# Check formatting
cargo fmt --check

# Lint
cargo clippy

Tech Stack

  • Rust (2024 edition)
  • SQLite via rusqlite (bundled)
  • clap for CLI parsing
  • reqwest for HTTP
  • tokio for async runtime
  • serde for serialization
  • tracing for logging
  • indicatif for progress bars

Current Status

This is Checkpoint 1 (CP1) of the GitLab Knowledge Engine project. Currently implemented:

  • Issue ingestion with cursor-based incremental sync
  • Discussion and note syncing for issues
  • Rich filtering and querying
  • Full re-sync capability

Not yet implemented:

  • Merge request support (CP2)
  • Semantic search with embeddings (CP3+)
  • Backup and reset commands

See SPEC.md for the full project roadmap and architecture.

License

MIT
