SPEC.md Revisions - First-Time User Experience
Note: The project was renamed from "gitlab-inbox" to "gitlore" and the CLI from "gi" to "lore". References to "gi" in this document should be read as "lore".
Date: 2026-01-21
Purpose: Document all changes adding installation, setup, and user flow documentation to SPEC.md
Summary of Changes
| Change | Location | Description |
|---|---|---|
| 1. Quick Start | After Executive Summary | Prerequisites, installation, first-run walkthrough |
| 2. gi init Command | Checkpoint 0 | Interactive setup wizard with GitLab validation |
| 3. CLI Command Reference | Before Future Work | Unified table of all commands |
| 4. Error Handling | After CLI Reference | Common errors with recovery guidance |
| 5. Database Management | After Error Handling | Location, backup, reset, migrations |
| 6. Empty State Handling | Checkpoint 4 scope | Behavior when no data indexed |
| 7. Resolved Decisions | Resolved Decisions table | New decisions from this revision |
Change 1: Quick Start Section
Location: Insert after line 6 (after Executive Summary), before Discovery Summary
A self-hosted tool to extract, index, and semantically search 2+ years of GitLab data (issues, MRs, and discussion threads) from 2 main repositories (~50-100K documents including threaded discussions). The MVP delivers semantic search as a foundational capability that enables future specialized views (file history, personal tracking, person context). Discussion threads are preserved as first-class entities to maintain conversational context essential for decision traceability.
---
+## Quick Start
+
+### Prerequisites
+
+| Requirement | Version | Notes |
+|-------------|---------|-------|
+| Node.js | 20+ | LTS recommended |
+| npm | 10+ | Comes with Node.js |
+| Ollama | Latest | Optional; required only for semantic search (lexical search works without it) |
+
+### Installation
+
+```bash
+# Clone and install
+git clone https://github.com/your-org/gitlab-inbox.git
+cd gitlab-inbox
+npm install
+npm run build
+npm link # Makes `gi` available globally
+```
+
+### First Run
+
+1. **Set your GitLab token** (create at GitLab > Settings > Access Tokens with `read_api` scope):
+ ```bash
+ export GITLAB_TOKEN="glpat-xxxxxxxxxxxxxxxxxxxx"
+ ```
+
+2. **Run the setup wizard:**
+ ```bash
+ gi init
+ ```
+ This creates `gi.config.json` with your GitLab URL and project paths.
+
+3. **Verify your environment:**
+ ```bash
+ gi doctor
+ ```
+ All checks should pass (an Ollama warning is OK if you only need lexical search).
+
+4. **Sync your data:**
+ ```bash
+ gi sync
+ ```
+ Initial sync takes 10-20 minutes depending on repo size and rate limits.
+
+5. **Search:**
+ ```bash
+ gi search "authentication redesign"
+ ```
+
+### Troubleshooting First Run
+
+| Symptom | Solution |
+|---------|----------|
+| `Config file not found` | Run `gi init` first |
+| `GITLAB_TOKEN not set` | Export the environment variable |
+| `401 Unauthorized` | Token is invalid or expired; generate a new one with `read_api` scope |
+| `Project not found: group/project` | Verify the project path matches the path in the GitLab URL (case-sensitive) |
+| `Ollama connection refused` | Start Ollama or use `--mode=lexical` for search |
+
+---
+
## Discovery Summary
Change 2: gi init Command in Checkpoint 0
Location: Insert in Checkpoint 0 Manual CLI Smoke Tests table and Scope section
2a: Add to Manual CLI Smoke Tests table (after line 193)
| `GITLAB_TOKEN=invalid gi auth-test` | Error message | Non-zero exit code, clear error about auth failure |
+| `gi init` | Interactive prompts | Creates valid gi.config.json |
+| `gi init` (config exists) | Confirmation prompt | Warns before overwriting |
+| `gi --help` | Command list | Shows all available commands |
+| `gi version` | Version number | Shows installed version |
2b: Add Automated Tests for init (after line 185)
tests/integration/app-lock.test.ts
✓ acquires lock successfully
✓ updates heartbeat during operation
✓ detects stale lock and recovers
✓ refuses concurrent acquisition
+
+tests/integration/init.test.ts
+ ✓ creates config file with valid structure
+ ✓ validates GitLab URL format
+ ✓ validates GitLab connection before writing config
+ ✓ validates each project path exists in GitLab
+ ✓ fails if token not set
+ ✓ fails if GitLab auth fails
+ ✓ fails if any project path not found
+ ✓ prompts before overwriting existing config
+ ✓ respects --force to skip confirmation
2c: Add to Checkpoint 0 Scope (after line 209)
- Rate limit handling with exponential backoff + jitter
+- `gi init` command for guided setup:
+ - Prompts for GitLab base URL
+ - Prompts for project paths (comma-separated or multiple prompts)
+ - Prompts for token environment variable name (default: GITLAB_TOKEN)
+ - **Validates before writing config:**
+ - Token must be set in environment
+ - Tests auth with `GET /user` endpoint
+ - Validates each project path with `GET /projects/:path`
+ - Only writes config after all validations pass
+ - Generates `gi.config.json` with sensible defaults
+- `gi --help` shows all available commands
+- `gi <command> --help` shows command-specific help
+- `gi version` shows installed version
+- First-run detection: if no config exists, suggest `gi init`
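The validate-before-write order in the scope list above could look roughly like the following. This is a minimal sketch, not the project's implementation: `validateInit`, `Probe`, and the returned config shape are hypothetical names, and the HTTP call is injected so the ordering can be exercised without a network. The `/api/v4` prefix and the URL-encoded project path follow GitLab's REST API conventions.

```typescript
// Hypothetical types; only the validation order is taken from the spec.
interface InitInput {
  baseUrl: string;
  projectPaths: string[];
  tokenEnvVar: string; // spec default: GITLAB_TOKEN
}

type Env = Record<string, string | undefined>;

// Injected probe: performs an authenticated GET and resolves to the HTTP status.
type Probe = (url: string, token: string) => Promise<number>;

type InitResult =
  | { ok: true; config: { baseUrl: string; projects: string[]; tokenEnvVar: string } }
  | { ok: false; error: string };

async function validateInit(input: InitInput, env: Env, probe: Probe): Promise<InitResult> {
  // 1. Token must be set in the environment.
  const token = env[input.tokenEnvVar];
  if (!token) return { ok: false, error: `${input.tokenEnvVar} not set` };

  const baseUrl = input.baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const api = `${baseUrl}/api/v4`;

  // 2. Test auth with GET /user.
  if ((await probe(`${api}/user`, token)) !== 200) {
    return { ok: false, error: "GitLab auth failed" };
  }

  // 3. Validate each project path with GET /projects/:path (path URL-encoded).
  for (const path of input.projectPaths) {
    const status = await probe(`${api}/projects/${encodeURIComponent(path)}`, token);
    if (status !== 200) return { ok: false, error: `Project not found: ${path}` };
  }

  // 4. Only after every check passes is a config produced for the caller to write.
  return { ok: true, config: { baseUrl, projects: input.projectPaths, tokenEnvVar: input.tokenEnvVar } };
}
```

Returning a result object instead of writing the file inside the function keeps the "only writes config after all validations pass" rule easy to test: the caller writes `gi.config.json` only when `ok` is true.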
Change 3: CLI Command Reference Section
Location: Insert before "## Future Work (Post-MVP)" (before line 1174)
+## CLI Command Reference
+
+All commands support `--help` for detailed usage information.
+
+### Setup & Diagnostics
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi init` | 0 | Interactive setup wizard; creates gi.config.json |
+| `gi auth-test` | 0 | Verify GitLab authentication |
+| `gi doctor` | 0 | Check environment (GitLab, Ollama, DB) |
+| `gi doctor --json` | 0 | JSON output for scripting |
+| `gi version` | 0 | Show installed version |
+
+### Data Ingestion
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi ingest --type=issues` | 1 | Fetch issues from GitLab |
+| `gi ingest --type=merge_requests` | 2 | Fetch MRs and discussions |
+| `gi embed --all` | 3 | Generate embeddings for all documents |
+| `gi embed --retry-failed` | 3 | Retry failed embeddings |
+| `gi sync` | 5 | Full sync orchestration (ingest + docs + embed) |
+| `gi sync --full` | 5 | Force complete re-sync (reset cursors) |
+| `gi sync --force` | 5 | Override stale lock after operator review |
+| `gi sync --no-embed` | 5 | Sync without embedding (faster) |
+
+### Data Inspection
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi list issues [--limit=N] [--project=PATH]` | 1 | List issues |
+| `gi list mrs [--limit=N]` | 2 | List merge requests |
+| `gi count issues` | 1 | Count issues |
+| `gi count mrs` | 2 | Count merge requests |
+| `gi count discussions` | 2 | Count discussions |
+| `gi count notes` | 2 | Count notes |
+| `gi show issue <iid>` | 1 | Show issue details |
+| `gi show mr <iid>` | 2 | Show MR details with discussions |
+| `gi stats` | 3 | Embedding coverage statistics |
+| `gi stats --json` | 3 | JSON stats for scripting |
+| `gi sync-status` | 1 | Show cursor positions and last sync |
+
+### Search
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi search "query"` | 4 | Hybrid semantic + lexical search |
+| `gi search "query" --mode=lexical` | 3 | Lexical-only search (no Ollama required) |
+| `gi search "query" --type=issue\|mr\|discussion` | 4 | Filter by document type |
+| `gi search "query" --author=USERNAME` | 4 | Filter by author |
+| `gi search "query" --after=YYYY-MM-DD` | 4 | Filter by date |
+| `gi search "query" --label=NAME` | 4 | Filter by label (repeatable) |
+| `gi search "query" --project=PATH` | 4 | Filter by project |
+| `gi search "query" --path=FILE` | 4 | Filter by file path |
+| `gi search "query" --json` | 4 | JSON output for scripting |
+| `gi search "query" --explain` | 4 | Show ranking breakdown |
+
+### Database Management
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi backup` | 0 | Create timestamped database backup |
+| `gi reset --confirm` | 0 | Delete database and reset cursors |
+
+---
+
## Future Work (Post-MVP)
Change 4: Error Handling Section
Location: Insert after CLI Command Reference, before Future Work
+## Error Handling
+
+Common errors and their resolutions:
+
+### Configuration Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `Config file not found` | No gi.config.json | Run `gi init` to create configuration |
+| `Invalid config: missing baseUrl` | Malformed config | Re-run `gi init` or fix gi.config.json manually |
+| `Invalid config: no projects defined` | Empty projects array | Add at least one project path to config |
+
+### Authentication Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `GITLAB_TOKEN environment variable not set` | Token not exported | `export GITLAB_TOKEN="glpat-xxx"` |
+| `401 Unauthorized` | Invalid or expired token | Generate new token with `read_api` scope |
+| `403 Forbidden` | Token lacks permissions | Ensure token has `read_api` scope |
+
+### GitLab API Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `Project not found: group/project` | Invalid project path | Verify path matches GitLab URL (case-sensitive) |
+| `429 Too Many Requests` | Rate limited | Wait for Retry-After period; sync will auto-retry |
+| `Connection refused` | GitLab unreachable | Check GitLab URL and network connectivity |
+
+### Data Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `No documents indexed` | Sync not run | Run `gi sync` first |
+| `No results found` | Query too specific | Try broader search terms |
+| `Database locked` | Concurrent access | Wait for the other process to finish; use `gi sync --force` if the lock is stale |
+
+### Embedding Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `Ollama connection refused` | Ollama not running | Start Ollama or use `--mode=lexical` |
+| `Model not found: nomic-embed-text` | Model not pulled | Run `ollama pull nomic-embed-text` |
+| `Embedding failed for N documents` | Transient failures | Run `gi embed --retry-failed` |
+
+### Operational Behavior
+
+| Scenario | Behavior |
+|----------|----------|
+| **Ctrl+C during sync** | Graceful shutdown: finishes current page, commits cursor, exits cleanly. Resume with `gi sync`. |
+| **Disk full during write** | Fails with clear error. Cursor preserved at last successful commit. Free space and resume. |
+| **Stale lock detected** | Lock held > 10 minutes without heartbeat is considered stale. Next sync auto-recovers. |
+| **Network interruption** | Retries with exponential backoff. After max retries, sync fails but cursor is preserved. |
+
+---
+
## Future Work (Post-MVP)
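Two rules from the Operational Behavior table, the 10-minute stale-lock threshold and retry with exponential backoff, are small enough to sketch as pure functions. The names, the base and cap values, and the choice of "full jitter" (a random delay between zero and the exponential ceiling) are assumptions; the spec states only the 10-minute threshold, backoff with jitter, and that `Retry-After` is respected.

```typescript
// Sketch only: constant and function names are hypothetical.
const STALE_LOCK_MS = 10 * 60 * 1000; // "held > 10 minutes without heartbeat"

// A lock is stale when its last heartbeat is more than 10 minutes old.
function isLockStale(lastHeartbeatMs: number, nowMs: number): boolean {
  return nowMs - lastHeartbeatMs > STALE_LOCK_MS;
}

// Delay before retry number `attempt` (0-based).
// A server-provided Retry-After (seconds) takes precedence; otherwise the delay
// is drawn uniformly from [0, min(capMs, baseMs * 2^attempt)).
// baseMs and capMs are assumed values, not from the spec.
function retryDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  random: () => number = Math.random,
  baseMs = 1000,
  capMs = 60_000,
): number {
  if (retryAfterSeconds !== undefined) return retryAfterSeconds * 1000;
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

Passing `random` as a parameter makes the jitter deterministic in tests while defaulting to `Math.random` in normal use.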
Change 5: Database Management Section
Location: Insert after Error Handling, before Future Work
+## Database Management
+
+### Database Location
+
+The SQLite database is stored at an XDG-compliant location:
+
+```
+~/.local/share/gi/data.db
+```
+
+This can be overridden in `gi.config.json`:
+
+```json
+{
+ "storage": {
+ "dbPath": "/custom/path/to/data.db"
+ }
+}
+```
+
+### Backup
+
+Create a timestamped backup of the database:
+
+```bash
+gi backup
+# Creates: ~/.local/share/gi/backups/data-2026-01-21T14-30-00.db
+```
+
+Backups are taken with SQLite's `.backup` command, which uses the online backup API and is safe to run while a sync is writing (the database runs in WAL mode).
+
+### Reset
+
+To completely reset the database and all sync cursors:
+
+```bash
+gi reset --confirm
+```
+
+This deletes:
+- The database file
+- All sync cursors
+- All embeddings
+
+You'll need to run `gi sync` again to repopulate.
+
+### Schema Migrations
+
+Database schema is version-tracked and migrations auto-apply on startup:
+
+1. On first run, schema is created at latest version
+2. On subsequent runs, pending migrations are applied automatically
+3. Migration version is stored in `schema_version` table
+4. Migrations are idempotent and reversible where possible
+
+**Manual migration check:**
+```bash
+gi doctor --json | jq '.checks.database'
+# Shows: { "status": "ok", "schemaVersion": 5, "pendingMigrations": 0 }
+```
+
+---
+
## Future Work (Post-MVP)
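As an illustration of the location and backup naming described in this change, the sketch below resolves the database path and formats a backup file name. It assumes the tool honors `XDG_DATA_HOME` (falling back to `~/.local/share`, per the XDG Base Directory rules) and that the backup timestamp is UTC; neither detail is stated in the spec, and both function names are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// Resolve the database path: an explicit storage.dbPath from gi.config.json wins,
// otherwise fall back to the XDG data directory.
// Assumption: a relative XDG_DATA_HOME is ignored, as the XDG spec requires.
function resolveDbPath(env: Env, configuredDbPath?: string): string {
  if (configuredDbPath) return configuredDbPath;
  const xdg = env.XDG_DATA_HOME;
  const dataHome =
    xdg !== undefined && xdg.startsWith("/") ? xdg : `${env.HOME ?? ""}/.local/share`;
  return `${dataHome}/gi/data.db`;
}

// Build a backup file name such as data-2026-01-21T14-30-00.db.
// Assumption: the timestamp is UTC; colons are replaced for filesystem safety.
function backupFileName(now: Date): string {
  const stamp = now.toISOString().slice(0, 19).replace(/:/g, "-");
  return `data-${stamp}.db`;
}
```

Taking the environment as a parameter rather than reading `process.env` directly keeps both functions pure, so they can be covered by unit tests without touching the real home directory.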
Change 6: Empty State Handling in Checkpoint 4
Location: Add to Checkpoint 4 scope section (around line 885, after "Graceful degradation")
- Graceful degradation: if Ollama is unreachable, fall back to FTS5-only search with warning
+- Empty state handling:
+ - No documents indexed: `No data indexed. Run 'gi sync' first.`
+ - Query returns no results: `No results found for "query".`
+ - Filters exclude all results: `No results match the specified filters.`
+ - Helpful hints shown in non-JSON mode (e.g., "Try broadening your search")
Location: Add to Manual CLI Smoke Tests table (after gi search "xyznonexistent123" row)
| `gi search "xyznonexistent123"` | No results message | Graceful empty state |
+| `gi search "auth"` (no data synced) | No data message | Shows "Run gi sync first" |
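The three empty states above are checked in a fixed order, which a small function can make explicit. This is a sketch under assumptions: `SearchOutcome` and its fields are hypothetical, and only the message strings come from the spec.

```typescript
// Hypothetical summary of a search run; field names are illustrative.
interface SearchOutcome {
  indexedDocuments: number;     // documents in the index overall
  matchesBeforeFilters: number; // hits for the query alone
  matchesAfterFilters: number;  // hits remaining after --type/--author/... filters
}

// Returns the empty-state message to show, or null when there are results.
// Order matters: "nothing indexed" is the most actionable and is checked first.
function emptyStateMessage(query: string, o: SearchOutcome): string | null {
  if (o.indexedDocuments === 0) return "No data indexed. Run 'gi sync' first.";
  if (o.matchesBeforeFilters === 0) return `No results found for "${query}".`;
  if (o.matchesAfterFilters === 0) return "No results match the specified filters.";
  return null;
}
```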
Change 7: Update Resolved Decisions Table
Location: Add new rows to Resolved Decisions table (around line 1280)
| JSON output | **Stable documented schema** | Enables reliable agent/MCP consumption |
+| Database location | **XDG compliant: `~/.local/share/gi/`** | Standard location, user-configurable |
+| `gi init` validation | **Validate GitLab before writing config** | Fail fast, better UX |
+| Ctrl+C handling | **Graceful shutdown** | Finish page, commit cursor, exit cleanly |
+| Empty state UX | **Actionable messages** | Guide user to next step |
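The graceful-shutdown decision (finish the page, commit the cursor, exit cleanly) can be sketched as a page loop that checks a stop flag only between pages. All names here are hypothetical, and the fetch and commit steps are injected; the sketch shows the control flow, not the project's sync code.

```typescript
// Hypothetical page shape: items plus the cursor for the next page (null at the end).
interface Page {
  items: string[];
  nextCursor: string | null;
}

interface SyncResult {
  cursor: string | null; // where a later `gi sync` would resume
  completed: boolean;    // false when stopped early by a shutdown request
}

async function syncPages(
  fetchPage: (cursor: string | null) => Promise<Page>,
  commitPage: (page: Page) => Promise<void>, // persists items and cursor together
  stopRequested: () => boolean,
  startCursor: string | null = null,
): Promise<SyncResult> {
  let cursor = startCursor;
  while (true) {
    const page = await fetchPage(cursor);
    await commitPage(page); // the current page always finishes and commits
    cursor = page.nextCursor;
    if (cursor === null) return { cursor, completed: true };
    // The stop flag is checked only between pages, never mid-page.
    if (stopRequested()) return { cursor, completed: false };
  }
}
```

In a Node.js CLI the flag would typically be set from a `process.on("SIGINT", ...)` handler, so Ctrl+C requests a stop instead of killing the process mid-write.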
Files Modified
| File | Action |
|---|---|
| SPEC.md | 7 changes applied |
| SPEC-REVISIONS-3.md | Created (this file) |
Verification Checklist
After applying changes:
- Quick Start section provides clear 5-step onboarding
- `gi init` fully specified with validation behavior
- All CLI commands documented in reference table
- Error scenarios have recovery guidance
- Database location and management documented
- Empty states have helpful messages
- Resolved Decisions updated with new choices
- No orphaned command references