docs: Expand README with comprehensive CLI and config documentation

Significantly expand the README to serve as complete user documentation
for the CLI tool, reflecting the full CP1 implementation.

Configuration section:
- Add missing config options: heartbeatIntervalSeconds, primaryConcurrency,
  dependentConcurrency, backupDir, embedding provider settings
- Document config file resolution order (CLI flag, env var, XDG, local)
- Add environment variables table with GITLAB_TOKEN, GI_CONFIG_PATH,
  XDG_CONFIG_HOME, XDG_DATA_HOME, RUST_LOG

Commands section:
- Document --full flag for complete re-sync (resets cursors and watermarks)
- Add output descriptions for list, show, and count commands
- Document assignee filter with @ prefix normalization
- Add gi doctor checks explanation (config, db, GitLab auth, Ollama)
- Add gi sync-status output description
- Add placeholder documentation for backup and reset commands

Database schema section:
- Reformat as table with descriptions
- Add sync_runs, sync_cursors, app_locks, schema_version tables
- Note WAL mode and foreign keys enabled

Development section:
- Add RUST_LOG=gi=trace example for detailed logging

Current status section:
- Document CP1 scope (issues, discussions, incremental sync)
- List not-yet-implemented features (MRs, embeddings, backup/reset)
- Reference SPEC.md for full roadmap

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Taylor Eernisse
2026-01-26 17:01:37 -05:00
parent 0952d21a90
commit 8afb2c2e75

133
README.md

@@ -6,6 +6,7 @@ A command-line tool for managing GitLab issues locally. Syncs issues, discussion
- **Local-first**: All data stored in SQLite for instant queries
- **Incremental sync**: Cursor-based sync only fetches changes since last sync
- **Full re-sync**: Reset cursors and fetch all data from scratch when needed
- **Multi-project**: Track issues across multiple GitLab projects
- **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date
- **Raw payload storage**: Preserves original GitLab API responses for debugging
@@ -60,7 +61,11 @@ Configuration is stored in `~/.config/gi/config.json` (or `$XDG_CONFIG_HOME/gi/c
],
"sync": {
"backfillDays": 14,
"staleLockMinutes": 10
"staleLockMinutes": 10,
"heartbeatIntervalSeconds": 30,
"cursorRewindSeconds": 2,
"primaryConcurrency": 4,
"dependentConcurrency": 2
},
"storage": {
"compressRawPayloads": true
@@ -77,9 +82,25 @@ Configuration is stored in `~/.config/gi/config.json` (or `$XDG_CONFIG_HOME/gi/c
| `projects` | `path` | — | Project path (e.g., `group/project`) |
| `sync` | `backfillDays` | `14` | Days to backfill on initial sync |
| `sync` | `staleLockMinutes` | `10` | Minutes before sync lock considered stale |
| `sync` | `heartbeatIntervalSeconds` | `30` | Frequency of lock heartbeat updates |
| `sync` | `cursorRewindSeconds` | `2` | Seconds to rewind cursor for overlap safety |
| `sync` | `primaryConcurrency` | `4` | Concurrent GitLab requests for primary resources |
| `sync` | `dependentConcurrency` | `2` | Concurrent requests for dependent resources |
| `storage` | `dbPath` | `~/.local/share/gi/gi.db` | Database file path |
| `storage` | `compressRawPayloads` | `true` | Compress stored API responses |
| `storage` | `backupDir` | `~/.local/share/gi/backups` | Backup directory |
| `storage` | `compressRawPayloads` | `true` | Compress stored API responses with gzip |
| `embedding` | `provider` | `ollama` | Embedding provider |
| `embedding` | `model` | `nomic-embed-text` | Model name for embeddings |
| `embedding` | `baseUrl` | `http://localhost:11434` | Ollama server URL |
| `embedding` | `concurrency` | `4` | Concurrent embedding requests |
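
To make the timing options in the table above concrete, here is a minimal sketch of how `cursorRewindSeconds` and `staleLockMinutes` might be applied. Both function names and the exact semantics (Unix-second timestamps, a strict "older than" comparison) are assumptions for illustration and are not taken from the `gi` source.

```rust
/// Hypothetical helper: start the next fetch slightly before the stored
/// cursor so that records updated in the same instant are not missed.
/// `saturating_sub` avoids underflow for very small cursor values.
fn rewound_cursor(cursor_unix_secs: u64, rewind_secs: u64) -> u64 {
    cursor_unix_secs.saturating_sub(rewind_secs)
}

/// Hypothetical helper: treat a lock as stale once its last heartbeat is
/// older than `stale_lock_minutes` (assumed semantics).
fn lock_is_stale(now_unix_secs: u64, last_heartbeat_unix_secs: u64, stale_lock_minutes: u64) -> bool {
    let age_secs = now_unix_secs.saturating_sub(last_heartbeat_unix_secs);
    age_secs > stale_lock_minutes * 60
}
```

With the defaults shown above, a heartbeat every 30 seconds keeps a live lock well inside the 10-minute staleness window, and a 2-second rewind re-fetches a small overlap on each incremental sync.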
### Config File Resolution
The config file is resolved in this order:
1. `--config` CLI flag
2. `GI_CONFIG_PATH` environment variable
3. `~/.config/gi/config.json` (XDG default)
4. `./gi.config.json` (local fallback for development)
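
The resolution order above can be sketched as a single function. This is an illustration under stated assumptions, not the tool's actual code: the name `resolve_config_path`, its parameters, and the injected `exists` check are hypothetical, and it assumes that explicit paths (flag or environment variable) are returned without an existence check.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: return the first config path that applies,
/// following the documented order. `exists` is injected so the logic
/// can be exercised without touching the filesystem.
fn resolve_config_path(
    cli_flag: Option<&str>,
    env_override: Option<&str>,
    xdg_config_home: Option<&str>,
    home: &str,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // 1. An explicit --config flag wins.
    if let Some(p) = cli_flag {
        return Some(PathBuf::from(p));
    }
    // 2. Then the GI_CONFIG_PATH environment variable.
    if let Some(p) = env_override {
        return Some(PathBuf::from(p));
    }
    // 3. Then the XDG default, falling back to ~/.config when
    //    XDG_CONFIG_HOME is unset or empty.
    let base = match xdg_config_home {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => Path::new(home).join(".config"),
    };
    let xdg_path = base.join("gi").join("config.json");
    if exists(&xdg_path) {
        return Some(xdg_path);
    }
    // 4. Finally the local development fallback.
    let local = PathBuf::from("./gi.config.json");
    if exists(&local) {
        return Some(local);
    }
    None
}
```

A caller would typically pass `std::env::var("GI_CONFIG_PATH").ok().as_deref()` for the override and `|p| p.exists()` for the existence check.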
### GitLab Token
@@ -89,6 +110,16 @@ Create a personal access token with `read_api` scope:
2. Create token with `read_api` scope
3. Export it: `export GITLAB_TOKEN=glpat-xxxxxxxxxxxx`
## Environment Variables
| Variable | Purpose | Required |
|----------|---------|----------|
| `GITLAB_TOKEN` | GitLab API authentication token (name configurable via `gitlab.tokenEnvVar`) | Yes |
| `GI_CONFIG_PATH` | Override config file location | No |
| `XDG_CONFIG_HOME` | XDG Base Directory for config (fallback: `~/.config`) | No |
| `XDG_DATA_HOME` | XDG Base Directory for data (fallback: `~/.local/share`) | No |
| `RUST_LOG` | Logging level filter (e.g., `gi=debug`) | No |
## Commands
### `gi init`
@@ -120,6 +151,13 @@ gi doctor # Human-readable output
gi doctor --json # JSON output for scripting
```
Checks performed:
- Config file existence and validity
- Database existence and pragmas (WAL mode, foreign keys)
- GitLab authentication
- Project accessibility
- Ollama connectivity (optional)
### `gi ingest`
Sync data from GitLab to local database.
@@ -128,8 +166,14 @@ Sync data from GitLab to local database.
gi ingest --type issues # Sync all projects
gi ingest --type issues --project group/repo # Single project
gi ingest --type issues --force # Override stale lock
gi ingest --type issues --full # Full re-sync (reset cursors)
```
The `--full` flag resets sync cursors and fetches all data from scratch, useful when:
- Assignee data or other fields were missing from earlier syncs
- You want to ensure complete data after schema changes
- Troubleshooting sync issues
### `gi list issues`
Query issues from local database.
@@ -139,8 +183,8 @@ gi list issues # Recent issues (default 50)
gi list issues --limit 100 # More results
gi list issues --state opened # Only open issues
gi list issues --state closed # Only closed issues
gi list issues --author username # By author
gi list issues --assignee username # By assignee
gi list issues --author username # By author (@ prefix optional)
gi list issues --assignee username # By assignee (@ prefix optional)
gi list issues --label bug # By label (AND logic)
gi list issues --label bug --label urgent # Multiple labels
gi list issues --milestone "v1.0" # By milestone title
@@ -155,6 +199,8 @@ gi list issues --open # Open first result in browser
gi list issues --json # JSON output
```
Output includes: IID, title, state, author, assignee, labels, and update time.
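
The optional `@` prefix accepted by `--author` and `--assignee` implies that the value is normalized before the query runs. A minimal sketch of such a step follows; the function name is hypothetical, and trimming surrounding whitespace is an added assumption rather than documented behavior.

```rust
/// Hypothetical helper: strip one leading '@' so that `@alice` and
/// `alice` match the same stored username.
fn normalize_username(input: &str) -> &str {
    // Assumption: surrounding whitespace is not significant.
    let trimmed = input.trim();
    trimmed.strip_prefix('@').unwrap_or(trimmed)
}
```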
### `gi show issue`
Display detailed issue information.
@@ -164,6 +210,8 @@ gi show issue 123 # Show issue #123
gi show issue 123 --project group/repo # Disambiguate if needed
```
Shows: title, description, state, author, assignees, labels, milestone, due date, web URL, and threaded discussions.
### `gi count`
Count entities in local database.
@@ -172,7 +220,7 @@ Count entities in local database.
gi count issues # Total issues
gi count discussions # Total discussions
gi count discussions --type issue # Issue discussions only
gi count notes # Total notes
gi count notes # Total notes (shows system vs user breakdown)
```
### `gi sync-status`
@@ -183,6 +231,11 @@ Show current sync state and watermarks.
gi sync-status
```
Displays:
- Last sync run details (status, timing)
- Cursor positions per project and resource type
- Data summary counts
### `gi migrate`
Run pending database migrations.
@@ -191,6 +244,8 @@ Run pending database migrations.
gi migrate
```
Shows current schema version and applies any pending migrations.
### `gi version`
Show version information.
@@ -199,21 +254,47 @@ Show version information.
gi version
```
### `gi backup`
Create timestamped database backup.
```bash
gi backup
```
*Note: Not yet implemented.*
### `gi reset`
Delete database and reset all state.
```bash
gi reset --confirm
```
*Note: Not yet implemented.*
## Database Schema
Data is stored in SQLite with the following main tables:
Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
- **projects**: Tracked GitLab projects
- **issues**: Issue metadata (title, state, author, assignee info, due date, milestone)
- **milestones**: Project milestones with state and due dates
- **issue_assignees**: Many-to-many issue-assignee relationships
- **labels**: Project labels with colors
- **issue_labels**: Many-to-many issue-label relationships
- **discussions**: Issue/MR discussions
- **notes**: Individual notes within discussions
- **raw_payloads**: Compressed original API responses
| Table | Purpose |
|-------|---------|
| `projects` | Tracked GitLab projects with metadata |
| `issues` | Issue metadata (title, state, author, due date, milestone) |
| `milestones` | Project milestones with state and due dates |
| `labels` | Project labels with colors |
| `issue_labels` | Many-to-many issue-label relationships |
| `issue_assignees` | Many-to-many issue-assignee relationships |
| `discussions` | Issue/MR discussion threads |
| `notes` | Individual notes within discussions (with system note flag) |
| `sync_runs` | Audit trail of sync operations |
| `sync_cursors` | Cursor positions for incremental sync |
| `app_locks` | Crash-safe single-flight lock |
| `raw_payloads` | Compressed original API responses |
| `schema_version` | Migration version tracking |
The database is stored at `~/.local/share/gi/gi.db` by default.
The database is stored at `~/.local/share/gi/gi.db` by default (XDG compliant).
## Global Options
@@ -230,6 +311,9 @@ cargo test
# Run with debug logging
RUST_LOG=gi=debug gi list issues
# Run with trace logging
RUST_LOG=gi=trace gi ingest --type issues
# Check formatting
cargo fmt --check
@@ -246,6 +330,23 @@ cargo clippy
- **tokio** for async runtime
- **serde** for serialization
- **tracing** for logging
- **indicatif** for progress bars
## Current Status
This is Checkpoint 1 (CP1) of the GitLab Knowledge Engine project. Currently implemented:
- Issue ingestion with cursor-based incremental sync
- Discussion and note syncing for issues
- Rich filtering and querying
- Full re-sync capability
Not yet implemented:
- Merge request support (CP2)
- Semantic search with embeddings (CP3+)
- Backup and reset commands
See [SPEC.md](SPEC.md) for the full project roadmap and architecture.
## License