gitlore/SPEC-REVISIONS-3.md
2026-01-28 15:49:14 -05:00

SPEC.md Revisions - First-Time User Experience

Note: The project was renamed from "gitlab-inbox" to "gitlore" and the CLI from "gi" to "lore". References to "gi" in this document should be read as "lore".

Date: 2026-01-21

Purpose: Document all changes that add installation, setup, and user-flow documentation to SPEC.md


Summary of Changes

| Change | Location | Description |
|--------|----------|-------------|
| 1. Quick Start | After Executive Summary | Prerequisites, installation, first-run walkthrough |
| 2. gi init Command | Checkpoint 0 | Interactive setup wizard with GitLab validation |
| 3. CLI Command Reference | Before Future Work | Unified table of all commands |
| 4. Error Handling | After CLI Reference | Common errors with recovery guidance |
| 5. Database Management | After Error Handling | Location, backup, reset, migrations |
| 6. Empty State Handling | Checkpoint 4 scope | Behavior when no data indexed |
| 7. Resolved Decisions | Resolved Decisions table | New decisions from this revision |

Change 1: Quick Start Section

Location: Insert after line 6 (after Executive Summary), before Discovery Summary

 A self-hosted tool to extract, index, and semantically search 2+ years of GitLab data (issues, MRs, and discussion threads) from 2 main repositories (~50-100K documents including threaded discussions). The MVP delivers semantic search as a foundational capability that enables future specialized views (file history, personal tracking, person context). Discussion threads are preserved as first-class entities to maintain conversational context essential for decision traceability.

 ---

+## Quick Start
+
+### Prerequisites
+
+| Requirement | Version | Notes |
+|-------------|---------|-------|
+| Node.js | 20+ | LTS recommended |
+| npm | 10+ | Comes with Node.js |
+| Ollama | Latest | Optional; needed only for semantic search (lexical search works without it) |
+
+### Installation
+
+```bash
+# Clone and install
+git clone https://github.com/your-org/gitlab-inbox.git
+cd gitlab-inbox
+npm install
+npm run build
+npm link  # Makes `gi` available globally
+```
+
+### First Run
+
+1. **Set your GitLab token** (create at GitLab > Settings > Access Tokens with `read_api` scope):
+   ```bash
+   export GITLAB_TOKEN="glpat-xxxxxxxxxxxxxxxxxxxx"
+   ```
+
+2. **Run the setup wizard:**
+   ```bash
+   gi init
+   ```
+   This creates `gi.config.json` with your GitLab URL and project paths.
+
+3. **Verify your environment:**
+   ```bash
+   gi doctor
+   ```
+   All checks should pass (Ollama warning is OK if you only need lexical search).
+
+4. **Sync your data:**
+   ```bash
+   gi sync
+   ```
+   Initial sync takes 10-20 minutes depending on repo size and rate limits.
+
+5. **Search:**
+   ```bash
+   gi search "authentication redesign"
+   ```
+
+### Troubleshooting First Run
+
+| Symptom | Solution |
+|---------|----------|
+| `Config file not found` | Run `gi init` first |
+| `GITLAB_TOKEN not set` | Export the environment variable |
+| `401 Unauthorized` | Check that the token is valid, unexpired, and has `read_api` scope |
+| `Project not found: group/project` | Verify project path in GitLab URL |
+| `Ollama connection refused` | Start Ollama or use `--mode=lexical` for search |
+
+---
+
 ## Discovery Summary

Change 2: gi init Command in Checkpoint 0

Location: Insert in Checkpoint 0 Manual CLI Smoke Tests table and Scope section

2a: Add to Manual CLI Smoke Tests table (after line 193)

 | `GITLAB_TOKEN=invalid gi auth-test` | Error message | Non-zero exit code, clear error about auth failure |
+| `gi init` | Interactive prompts | Creates valid gi.config.json |
+| `gi init` (config exists) | Confirmation prompt | Warns before overwriting |
+| `gi --help` | Command list | Shows all available commands |
+| `gi version` | Version number | Shows installed version |

2b: Add Automated Tests for init (after line 185)

 tests/integration/app-lock.test.ts
   ✓ acquires lock successfully
   ✓ updates heartbeat during operation
   ✓ detects stale lock and recovers
   ✓ refuses concurrent acquisition
+
+tests/integration/init.test.ts
+  ✓ creates config file with valid structure
+  ✓ validates GitLab URL format
+  ✓ validates GitLab connection before writing config
+  ✓ validates each project path exists in GitLab
+  ✓ fails if token not set
+  ✓ fails if GitLab auth fails
+  ✓ fails if any project path not found
+  ✓ prompts before overwriting existing config
+  ✓ respects --force to skip confirmation
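
The init tests above include "validates GitLab URL format", but the spec does not say which rules apply. The following is a minimal sketch under assumed rules (http or https only, trailing slashes stripped); the function name `normalizeGitLabBaseUrl` is hypothetical.

```typescript
// Hypothetical helper: the accepted protocols and the trailing-slash
// normalization are assumptions, not rules taken from SPEC.md.
function normalizeGitLabBaseUrl(input: string): string {
  let parsed: URL;
  try {
    parsed = new URL(input.trim());
  } catch {
    throw new Error(`Invalid GitLab URL: ${input}`);
  }
  if (parsed.protocol !== "https:" && parsed.protocol !== "http:") {
    throw new Error(`Unsupported protocol in GitLab URL: ${parsed.protocol}`);
  }
  // Keep the origin plus any sub-path, since a self-hosted instance
  // may be served under a prefix such as /gitlab.
  const path = parsed.pathname.replace(/\/+$/, "");
  return `${parsed.origin}${path}`;
}
```

Normalizing once at init time means later code can append API paths without checking for a doubled slash.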

2c: Add to Checkpoint 0 Scope (after line 209)

 - Rate limit handling with exponential backoff + jitter
+- `gi init` command for guided setup:
+  - Prompts for GitLab base URL
+  - Prompts for project paths (comma-separated or multiple prompts)
+  - Prompts for token environment variable name (default: GITLAB_TOKEN)
+  - **Validates before writing config:**
+    - Token must be set in environment
+    - Tests auth with `GET /user` endpoint
+    - Validates each project path with `GET /projects/:path` (path URL-encoded, e.g. `group%2Fproject`)
+    - Only writes config after all validations pass
+  - Generates `gi.config.json` with sensible defaults
+- `gi --help` shows all available commands
+- `gi <command> --help` shows command-specific help
+- `gi version` shows installed version
+- First-run detection: if no config exists, suggest `gi init`
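
The validate-before-write order in the scope above can be sketched roughly as follows. This is an illustration, not the actual implementation: the `HttpGet` abstraction, the `InitAnswers` and `GiConfig` shapes, and the function name `validateInit` are assumptions. The `baseUrl` and `projects` field names are borrowed from the config error messages in Change 4. The `/api/v4` prefix and URL-encoded project path follow the GitLab REST API.

```typescript
// Minimal HTTP abstraction so the sketch runs without a network.
// Real code would wrap fetch() and send the token in a PRIVATE-TOKEN header.
type HttpGet = (url: string, token: string) => Promise<{ status: number }>;

interface InitAnswers {
  baseUrl: string;         // e.g. "https://gitlab.example.com"
  projectPaths: string[];  // e.g. ["group/project"]
  tokenEnvVar: string;     // spec default: GITLAB_TOKEN
}

// Hypothetical config shape; SPEC.md defines the real one.
interface GiConfig {
  baseUrl: string;
  projects: string[];
  tokenEnvVar: string;
}

async function validateInit(
  answers: InitAnswers,
  env: Record<string, string | undefined>,
  httpGet: HttpGet,
): Promise<GiConfig> {
  // 1. Token must be set in the environment.
  const token = env[answers.tokenEnvVar];
  if (!token) {
    throw new Error(`${answers.tokenEnvVar} environment variable not set`);
  }
  const api = `${answers.baseUrl}/api/v4`;

  // 2. Test auth with GET /user.
  const user = await httpGet(`${api}/user`, token);
  if (user.status !== 200) {
    throw new Error(`GitLab auth failed (HTTP ${user.status})`);
  }

  // 3. Validate each project; "group/project" becomes "group%2Fproject".
  for (const path of answers.projectPaths) {
    const res = await httpGet(`${api}/projects/${encodeURIComponent(path)}`, token);
    if (res.status !== 200) {
      throw new Error(`Project not found: ${path}`);
    }
  }

  // 4. Only now hand back a config for the caller to write to disk.
  return {
    baseUrl: answers.baseUrl,
    projects: answers.projectPaths,
    tokenEnvVar: answers.tokenEnvVar,
  };
}
```

Because the function returns the config instead of writing it, a failed validation cannot leave a partial `gi.config.json` behind, and the integration tests can inject a fake `httpGet`.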

Change 3: CLI Command Reference Section

Location: Insert before "## Future Work (Post-MVP)" (before line 1174)

+## CLI Command Reference
+
+All commands support `--help` for detailed usage information.
+
+### Setup & Diagnostics
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi init` | 0 | Interactive setup wizard; creates gi.config.json |
+| `gi auth-test` | 0 | Verify GitLab authentication |
+| `gi doctor` | 0 | Check environment (GitLab, Ollama, DB) |
+| `gi doctor --json` | 0 | JSON output for scripting |
+| `gi version` | 0 | Show installed version |
+
+### Data Ingestion
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi ingest --type=issues` | 1 | Fetch issues from GitLab |
+| `gi ingest --type=merge_requests` | 2 | Fetch MRs and discussions |
+| `gi embed --all` | 3 | Generate embeddings for all documents |
+| `gi embed --retry-failed` | 3 | Retry failed embeddings |
+| `gi sync` | 5 | Full sync orchestration (ingest + docs + embed) |
+| `gi sync --full` | 5 | Force complete re-sync (reset cursors) |
+| `gi sync --force` | 5 | Override stale lock after operator review |
+| `gi sync --no-embed` | 5 | Sync without embedding (faster) |
+
+### Data Inspection
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi list issues [--limit=N] [--project=PATH]` | 1 | List issues |
+| `gi list mrs [--limit=N]` | 2 | List merge requests |
+| `gi count issues` | 1 | Count issues |
+| `gi count mrs` | 2 | Count merge requests |
+| `gi count discussions` | 2 | Count discussions |
+| `gi count notes` | 2 | Count notes |
+| `gi show issue <iid>` | 1 | Show issue details |
+| `gi show mr <iid>` | 2 | Show MR details with discussions |
+| `gi stats` | 3 | Embedding coverage statistics |
+| `gi stats --json` | 3 | JSON stats for scripting |
+| `gi sync-status` | 1 | Show cursor positions and last sync |
+
+### Search
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi search "query"` | 4 | Hybrid semantic + lexical search |
+| `gi search "query" --mode=lexical` | 3 | Lexical-only search (no Ollama required) |
+| `gi search "query" --type=issue\|mr\|discussion` | 4 | Filter by document type |
+| `gi search "query" --author=USERNAME` | 4 | Filter by author |
+| `gi search "query" --after=YYYY-MM-DD` | 4 | Filter by date |
+| `gi search "query" --label=NAME` | 4 | Filter by label (repeatable) |
+| `gi search "query" --project=PATH` | 4 | Filter by project |
+| `gi search "query" --path=FILE` | 4 | Filter by file path |
+| `gi search "query" --json` | 4 | JSON output for scripting |
+| `gi search "query" --explain` | 4 | Show ranking breakdown |
+
+### Database Management
+
+| Command | CP | Description |
+|---------|-----|-------------|
+| `gi backup` | 0 | Create timestamped database backup |
+| `gi reset --confirm` | 0 | Delete database and reset cursors |
+
+---
+
 ## Future Work (Post-MVP)
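
To illustrate how the search flags in the reference table might map onto a single options object, here is a hedged sketch. The `SearchOptions` shape, the `parseSearchArgs` name, and the `"hybrid"` mode value are assumptions; only the flag names come from the table.

```typescript
interface SearchOptions {
  query: string;
  mode: "hybrid" | "lexical"; // "hybrid" as the default name is an assumption
  type?: "issue" | "mr" | "discussion";
  author?: string;
  after?: string;             // YYYY-MM-DD
  labels: string[];           // --label is repeatable
  project?: string;
  path?: string;
  json: boolean;
  explain: boolean;
}

function parseSearchArgs(argv: string[]): SearchOptions {
  const opts: SearchOptions = { query: "", mode: "hybrid", labels: [], json: false, explain: false };
  for (const arg of argv) {
    if (arg.indexOf("--") !== 0) {
      // The first bare argument is treated as the query string.
      if (opts.query === "") opts.query = arg;
      continue;
    }
    const eq = arg.indexOf("=");
    const name = eq === -1 ? arg.slice(2) : arg.slice(2, eq);
    const value = eq === -1 ? "" : arg.slice(eq + 1);
    switch (name) {
      case "json": opts.json = true; break;
      case "explain": opts.explain = true; break;
      case "label": opts.labels.push(value); break;
      case "author": opts.author = value; break;
      case "project": opts.project = value; break;
      case "path": opts.path = value; break;
      case "after":
        if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) throw new Error(`--after expects YYYY-MM-DD, got "${value}"`);
        opts.after = value;
        break;
      case "mode":
        if (value === "lexical") opts.mode = "lexical";
        else if (value === "hybrid") opts.mode = "hybrid";
        else throw new Error(`Unknown mode: ${value}`);
        break;
      case "type":
        if (value === "issue") opts.type = "issue";
        else if (value === "mr") opts.type = "mr";
        else if (value === "discussion") opts.type = "discussion";
        else throw new Error(`Unknown type: ${value}`);
        break;
      default:
        throw new Error(`Unknown flag: --${name}`);
    }
  }
  if (opts.query === "") throw new Error("Missing search query");
  return opts;
}
```

Rejecting unknown flags and malformed dates up front keeps the "No orphaned command references" check in the verification list easy to test.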

Change 4: Error Handling Section

Location: Insert after CLI Command Reference, before Future Work

+## Error Handling
+
+Common errors and their resolutions:
+
+### Configuration Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `Config file not found` | No gi.config.json | Run `gi init` to create configuration |
+| `Invalid config: missing baseUrl` | Malformed config | Re-run `gi init` or fix gi.config.json manually |
+| `Invalid config: no projects defined` | Empty projects array | Add at least one project path to config |
+
+### Authentication Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `GITLAB_TOKEN environment variable not set` | Token not exported | `export GITLAB_TOKEN="glpat-xxx"` |
+| `401 Unauthorized` | Invalid or expired token | Generate new token with `read_api` scope |
+| `403 Forbidden` | Token lacks permissions | Ensure token has `read_api` scope |
+
+### GitLab API Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `Project not found: group/project` | Invalid project path | Verify path matches GitLab URL (case-sensitive) |
+| `429 Too Many Requests` | Rate limited | Wait for Retry-After period; sync will auto-retry |
+| `Connection refused` | GitLab unreachable | Check GitLab URL and network connectivity |
+
+### Data Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `No documents indexed` | Sync not run | Run `gi sync` first |
+| `No results found` | Query too specific | Try broader search terms |
+| `Database locked` | Concurrent access | Wait for other process; use `gi sync --force` if stale |
+
+### Embedding Errors
+
+| Error | Cause | Resolution |
+|-------|-------|------------|
+| `Ollama connection refused` | Ollama not running | Start Ollama or use `--mode=lexical` |
+| `Model not found: nomic-embed-text` | Model not pulled | Run `ollama pull nomic-embed-text` |
+| `Embedding failed for N documents` | Transient failures | Run `gi embed --retry-failed` |
+
+### Operational Behavior
+
+| Scenario | Behavior |
+|----------|----------|
+| **Ctrl+C during sync** | Graceful shutdown: finishes current page, commits cursor, exits cleanly. Resume with `gi sync`. |
+| **Disk full during write** | Fails with clear error. Cursor preserved at last successful commit. Free space and resume. |
+| **Stale lock detected** | Lock held > 10 minutes without heartbeat is considered stale. Next sync auto-recovers. |
+| **Network interruption** | Retries with exponential backoff. After max retries, sync fails but cursor is preserved. |
+
+---
+
 ## Future Work (Post-MVP)
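
The retry behavior described in the `429 Too Many Requests` and network-interruption rows can be sketched as a delay calculation. The base delay, cap, and the choice of "full jitter" are assumptions for illustration; the spec only states exponential backoff with jitter and honoring `Retry-After`.

```typescript
// Assumed tuning values; SPEC.md does not state them in this section.
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 60000;

// "Full jitter": pick a random delay in [0, min(cap, base * 2^attempt)).
// A Retry-After value (in seconds) from a 429 response takes precedence.
function retryDelayMs(
  attempt: number,                    // 0 for the first retry
  retryAfterSeconds?: number,
  random: () => number = Math.random, // injectable so tests are deterministic
): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

Randomizing the whole interval, as opposed to adding a small offset, spreads out retries from concurrent requests so they do not hit the rate limiter in lockstep.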

Change 5: Database Management Section

Location: Insert after Error Handling, before Future Work

+## Database Management
+
+### Database Location
+
+The SQLite database is stored at an XDG-compliant location:
+
+```
+~/.local/share/gi/data.db
+```
+
+This can be overridden in `gi.config.json`:
+
+```json
+{
+  "storage": {
+    "dbPath": "/custom/path/to/data.db"
+  }
+}
+```
+
+### Backup
+
+Create a timestamped backup of the database:
+
+```bash
+gi backup
+# Creates: ~/.local/share/gi/backups/data-2026-01-21T14-30-00.db
+```
+
+Backups are SQLite `.backup` command copies (safe even during active writes due to WAL mode).
+
+### Reset
+
+To completely reset the database and all sync cursors:
+
+```bash
+gi reset --confirm
+```
+
+This deletes:
+- The database file
+- All sync cursors
+- All embeddings
+
+You'll need to run `gi sync` again to repopulate.
+
+### Schema Migrations
+
+Database schema is version-tracked and migrations auto-apply on startup:
+
+1. On first run, schema is created at latest version
+2. On subsequent runs, pending migrations are applied automatically
+3. Migration version is stored in `schema_version` table
+4. Migrations are idempotent and reversible where possible
+
+**Manual migration check:**
+```bash
+gi doctor --json | jq '.checks.database'
+# Shows: { "status": "ok", "schemaVersion": 5, "pendingMigrations": 0 }
+```
+
+---
+
 ## Future Work (Post-MVP)
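
As a rough sketch of the path rules above: a config override wins, otherwise the default location applies, and backups get a timestamped name. Honoring `XDG_DATA_HOME`, using UTC for the timestamp, and placing backups next to a custom `dbPath` are assumptions; the function names are hypothetical.

```typescript
interface StorageConfig {
  storage?: { dbPath?: string };
}

// Config override first, then $XDG_DATA_HOME/gi/data.db (assumed),
// then the documented default ~/.local/share/gi/data.db.
function resolveDbPath(config: StorageConfig, env: Record<string, string | undefined>): string {
  if (config.storage && config.storage.dbPath) {
    return config.storage.dbPath;
  }
  const home = env["HOME"];
  const dataHome = env["XDG_DATA_HOME"] || (home ? `${home}/.local/share` : undefined);
  if (!dataHome) {
    throw new Error("Cannot determine data directory: set HOME or XDG_DATA_HOME");
  }
  return `${dataHome}/gi/data.db`;
}

// Builds a name like backups/data-2026-01-21T14-30-00.db beside the database.
// Colons are replaced because they are awkward in file names.
function backupPath(dbPath: string, now: Date): string {
  const stamp = now.toISOString().slice(0, 19).replace(/:/g, "-");
  const slash = dbPath.lastIndexOf("/");
  const dir = slash === -1 ? "." : dbPath.slice(0, slash);
  return `${dir}/backups/data-${stamp}.db`;
}
```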

Change 6: Empty State Handling in Checkpoint 4

Location: Add to Checkpoint 4 scope section (around line 885, after "Graceful degradation")

 - Graceful degradation: if Ollama is unreachable, fall back to FTS5-only search with warning
+- Empty state handling:
+  - No documents indexed: `No data indexed. Run 'gi sync' first.`
+  - Query returns no results: `No results found for "query".`
+  - Filters exclude all results: `No results match the specified filters.`
+  - Helpful hints shown in non-JSON mode (e.g., "Try broadening your search")
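
The three empty states above could be selected with a small function like the following sketch. The message strings are taken from the list; the `SearchOutcome` shape, the function name, and the order of the checks are assumptions.

```typescript
interface SearchOutcome {
  indexedDocuments: number; // total documents in the index
  unfilteredHits: number;   // hits for the query before filters
  filteredHits: number;     // hits remaining after filters
}

// Returns the empty-state message, or null when there are results to show.
function emptyStateMessage(query: string, outcome: SearchOutcome): string | null {
  if (outcome.indexedDocuments === 0) {
    return "No data indexed. Run 'gi sync' first.";
  }
  if (outcome.unfilteredHits === 0) {
    return `No results found for "${query}".`;
  }
  if (outcome.filteredHits === 0) {
    return "No results match the specified filters.";
  }
  return null;
}
```

Checking the index size first means a user who has never synced is pointed at `gi sync` and is not told to broaden a query that could never match.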

Location: Add to Manual CLI Smoke Tests table (after gi search "xyznonexistent123" row)

 | `gi search "xyznonexistent123"` | No results message | Graceful empty state |
+| `gi search "auth"` (no data synced) | No data message | Shows "Run gi sync first" |

Change 7: Update Resolved Decisions Table

Location: Add new rows to Resolved Decisions table (around line 1280)

 | JSON output | **Stable documented schema** | Enables reliable agent/MCP consumption |
+| Database location | **XDG compliant: `~/.local/share/gi/`** | Standard location, user-configurable |
+| `gi init` validation | **Validate GitLab before writing config** | Fail fast, better UX |
+| Ctrl+C handling | **Graceful shutdown** | Finish page, commit cursor, exit cleanly |
+| Empty state UX | **Actionable messages** | Guide user to next step |
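
The Ctrl+C decision (finish page, commit cursor, exit cleanly) can be illustrated with a page loop that only checks for a stop request between pages. This is a sketch under assumptions: the `Page` and `SyncDeps` shapes and the `syncPages` name are hypothetical, and `storePage` is assumed to write the items and commit the cursor together.

```typescript
interface Page {
  items: string[];
  nextCursor: string | null; // null when there are no more pages
}

interface SyncDeps {
  fetchPage: (cursor: string | null) => Promise<Page>;
  storePage: (page: Page) => Promise<void>; // assumed to commit the cursor too
  stopRequested: () => boolean;             // e.g. a flag set by a SIGINT handler
}

// Resolves to "done" when every page was processed, or "interrupted" when a
// stop request was seen at a page boundary.
async function syncPages(
  startCursor: string | null,
  deps: SyncDeps,
): Promise<"done" | "interrupted"> {
  let cursor = startCursor;
  for (;;) {
    const page = await deps.fetchPage(cursor);
    await deps.storePage(page); // the current page always finishes
    cursor = page.nextCursor;
    if (cursor === null) return "done";
    // Safe point: the cursor is committed, so a later sync resumes here.
    if (deps.stopRequested()) return "interrupted";
  }
}
```

In a Node.js CLI the flag could be set with `process.on("SIGINT", ...)`, so the signal never interrupts a page that is partly written.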

Files Modified

| File | Action |
|------|--------|
| SPEC.md | 7 changes applied |
| SPEC-REVISIONS-3.md | Created (this file) |

Verification Checklist

After applying changes:

  • Quick Start section provides clear 5-step onboarding
  • gi init fully specified with validation behavior
  • All CLI commands documented in reference table
  • Error scenarios have recovery guidance
  • Database location and management documented
  • Empty states have helpful messages
  • Resolved Decisions updated with new choices
  • No orphaned command references