gitlore/docs/prd/checkpoint-0.md
2026-01-28 15:49:14 -05:00

Checkpoint 0: Project Setup - PRD

Note: The project was renamed from "gitlab-inbox" to "gitlore" and the CLI from "gi" to "lore". References to "gi" in this document should be read as "lore".

Version: 1.0
Status: Ready for Implementation
Depends On: None (first checkpoint)
Enables: Checkpoint 1 (Issue Ingestion)


Overview

Objective

Scaffold the gi CLI tool with verified GitLab API connectivity, database infrastructure, and foundational CLI commands. This checkpoint establishes the project foundation that all subsequent checkpoints build upon.

Success Criteria

| Criterion | Validation |
|---|---|
| gi init writes config and validates against GitLab | gi doctor shows GitLab OK |
| gi auth-test succeeds with real PAT | Shows username and display name |
| Database migrations apply correctly | gi doctor shows DB OK |
| SQLite pragmas set correctly | WAL, FK, busy_timeout verified |
| App lock mechanism works | Concurrent runs blocked |
| Config resolves from XDG paths | Works from any directory |

Deliverables

1. Project Structure

Create the following directory structure:

gitlab-inbox/
├── src/
│   ├── cli/
│   │   ├── index.ts              # CLI entry point (Commander.js)
│   │   └── commands/
│   │       ├── init.ts           # gi init
│   │       ├── auth-test.ts      # gi auth-test
│   │       ├── doctor.ts         # gi doctor
│   │       ├── sync-status.ts    # gi sync-status (stub for CP0)
│   │       ├── backup.ts         # gi backup
│   │       └── reset.ts          # gi reset
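│   ├── core/
│   │   ├── config.ts             # Config loading/validation (Zod)
│   │   ├── db.ts                 # Database connection + migrations
│   │   ├── errors.ts             # Custom error classes
│   │   ├── lock.ts               # App lock (single-flight, heartbeat)
│   │   ├── logger.ts             # pino logger setup
│   │   ├── paths.ts              # XDG path resolution
│   │   ├── payloads.ts           # Raw payload compression + dedupe
│   │   └── time.ts               # Timestamp conversion utilities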
│   ├── gitlab/
│   │   ├── client.ts             # GitLab API client with rate limiting
│   │   └── types.ts              # GitLab API response types
│   └── types/
│       └── index.ts              # Shared TypeScript types
├── tests/
│   ├── unit/
│   │   ├── config.test.ts
│   │   ├── db.test.ts
│   │   ├── paths.test.ts
│   │   └── errors.test.ts
│   ├── integration/
│   │   ├── gitlab-client.test.ts
│   │   ├── app-lock.test.ts
│   │   └── init.test.ts
│   ├── live/                     # Gated by GITLAB_LIVE_TESTS=1
│   │   └── gitlab-client.live.test.ts
│   └── fixtures/
│       └── mock-responses/
├── migrations/
│   └── 001_initial.sql
├── package.json
├── tsconfig.json
├── vitest.config.ts
├── eslint.config.js
└── .gitignore

2. Config + Data Locations (XDG Compliant)

| Location | Default Path | Override |
|---|---|---|
| Config | ~/.config/gi/config.json | GI_CONFIG_PATH env var or --config flag |
| Database | ~/.local/share/gi/data.db | storage.dbPath in config |
| Backups | ~/.local/share/gi/backups/ | storage.backupDir in config |
| Logs | stderr (not persisted) | LOG_PATH env var |

Config Resolution Order:

  1. --config /path/to/config.json (explicit CLI flag)
  2. GI_CONFIG_PATH environment variable
  3. ~/.config/gi/config.json (XDG default)
  4. ./gi.config.json (local development fallback, used only when the XDG file does not exist)

Implementation (src/core/paths.ts):

import { homedir } from 'node:os';
import { join } from 'node:path';
import { existsSync } from 'node:fs';

export function getConfigPath(cliOverride?: string): string {
  // 1. CLI flag override
  if (cliOverride) return cliOverride;

  // 2. Environment variable
  if (process.env.GI_CONFIG_PATH) return process.env.GI_CONFIG_PATH;

  // 3. XDG default
  const xdgConfig = process.env.XDG_CONFIG_HOME || join(homedir(), '.config');
  const xdgPath = join(xdgConfig, 'gi', 'config.json');
  if (existsSync(xdgPath)) return xdgPath;

  // 4. Local fallback (for development)
  const localPath = join(process.cwd(), 'gi.config.json');
  if (existsSync(localPath)) return localPath;

  // Return XDG path (will trigger not-found error if missing)
  return xdgPath;
}

export function getDataDir(): string {
  const xdgData = process.env.XDG_DATA_HOME || join(homedir(), '.local', 'share');
  return join(xdgData, 'gi');
}

export function getDbPath(configOverride?: string): string {
  if (configOverride) return configOverride;
  return join(getDataDir(), 'data.db');
}

export function getBackupDir(configOverride?: string): string {
  if (configOverride) return configOverride;
  return join(getDataDir(), 'backups');
}

3. Timestamp Convention (Global)

All *_at integer columns are milliseconds since Unix epoch (UTC).

| Context | Format | Example |
|---|---|---|
| Database columns | INTEGER (ms epoch) | 1706313600000 |
| GitLab API responses | ISO 8601 string | "2024-01-27T00:00:00.000Z" |
| CLI display | ISO 8601 or relative | 2024-01-27 or 3 days ago |
| Config durations | Unit named in the field suffix (minutes or seconds) | staleLockMinutes: 10, heartbeatIntervalSeconds: 30 |

Conversion utilities (src/core/time.ts):

// GitLab API → Database
export function isoToMs(isoString: string): number {
  return new Date(isoString).getTime();
}

// Database → Display
export function msToIso(ms: number): string {
  return new Date(ms).toISOString();
}

// Current time for database storage
export function nowMs(): number {
  return Date.now();
}
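
The table above lists a relative format ("3 days ago") for CLI display, but the utilities only cover ISO conversion. A minimal sketch of a relative formatter follows; the name formatRelative and the choice of units are assumptions, not part of the specified src/core/time.ts API.

```typescript
// Hypothetical helper (not in the original spec): formats a ms-epoch
// timestamp relative to `now`, e.g. "3 days ago".
function formatRelative(ms: number, now: number = Date.now()): string {
  // Clamp future timestamps to zero so clock skew never yields negatives.
  const diffSec = Math.max(0, Math.floor((now - ms) / 1000));
  const units: Array<[string, number]> = [
    ['day', 86_400],
    ['hour', 3_600],
    ['minute', 60],
  ];
  for (const [label, seconds] of units) {
    const n = Math.floor(diffSec / seconds);
    if (n >= 1) return `${n} ${label}${n === 1 ? '' : 's'} ago`;
  }
  return 'just now';
}
```

Passing `now` explicitly keeps the function deterministic in unit tests.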

Dependencies

Runtime Dependencies

{
  "dependencies": {
    "better-sqlite3": "^11.0.0",
    "sqlite-vec": "^0.1.0",
    "commander": "^12.0.0",
    "zod": "^3.23.0",
    "pino": "^9.0.0",
    "pino-pretty": "^11.0.0",
    "ora": "^8.0.0",
    "chalk": "^5.3.0",
    "cli-table3": "^0.6.0",
    "inquirer": "^9.0.0"
  }
}

Dev Dependencies

{
  "devDependencies": {
    "typescript": "^5.4.0",
    "@types/better-sqlite3": "^7.6.0",
    "@types/node": "^20.0.0",
    "vitest": "^1.6.0",
    "msw": "^2.3.0",
    "eslint": "^9.0.0",
    "@typescript-eslint/eslint-plugin": "^7.0.0",
    "@typescript-eslint/parser": "^7.0.0",
    "tsx": "^4.0.0"
  }
}

Configuration Schema

Config File Structure

// src/core/config.ts
import { z } from 'zod';

export const ConfigSchema = z.object({
  gitlab: z.object({
    baseUrl: z.string().url(),
    tokenEnvVar: z.string().default('GITLAB_TOKEN'),
  }),
  projects: z.array(z.object({
    path: z.string().min(1),
  })).min(1),
  sync: z.object({
    backfillDays: z.number().int().positive().default(14),
    staleLockMinutes: z.number().int().positive().default(10),
    heartbeatIntervalSeconds: z.number().int().positive().default(30),
    cursorRewindSeconds: z.number().int().nonnegative().default(2),
    primaryConcurrency: z.number().int().positive().default(4),
    dependentConcurrency: z.number().int().positive().default(2),
  }).default({}),
  storage: z.object({
    dbPath: z.string().optional(),
    backupDir: z.string().optional(),
    compressRawPayloads: z.boolean().default(true),
  }).default({}),
  embedding: z.object({
    provider: z.literal('ollama').default('ollama'),
    model: z.string().default('nomic-embed-text'),
    baseUrl: z.string().url().default('http://localhost:11434'),
    concurrency: z.number().int().positive().default(4),
  }).default({}),
});

export type Config = z.infer<typeof ConfigSchema>;

Example Config File

{
  "gitlab": {
    "baseUrl": "https://gitlab.example.com",
    "tokenEnvVar": "GITLAB_TOKEN"
  },
  "projects": [
    { "path": "group/project-one" },
    { "path": "group/project-two" }
  ],
  "sync": {
    "backfillDays": 14,
    "staleLockMinutes": 10,
    "heartbeatIntervalSeconds": 30,
    "cursorRewindSeconds": 2,
    "primaryConcurrency": 4,
    "dependentConcurrency": 2
  },
  "storage": {
    "compressRawPayloads": true
  },
  "embedding": {
    "provider": "ollama",
    "model": "nomic-embed-text",
    "baseUrl": "http://localhost:11434",
    "concurrency": 4
  }
}
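
The config stores only the name of the environment variable (tokenEnvVar), so the token itself has to be looked up at runtime. The sketch below shows one way to do that lookup; resolveToken is a hypothetical helper, and the error class is a local stand-in for the TokenNotSetError specified in src/core/errors.ts.

```typescript
// Stand-in for TokenNotSetError from src/core/errors.ts (assumption:
// the real class extends GiError; this one is kept standalone).
class TokenNotSetError extends Error {
  readonly code = 'TOKEN_NOT_SET';
  constructor(envVar: string) {
    super(`GitLab token not set. Export ${envVar} environment variable.`);
    this.name = 'TokenNotSetError';
  }
}

// Hypothetical helper: reads the PAT from the variable named in config.
// `env` is injectable so tests do not need to mutate process.env.
function resolveToken(
  tokenEnvVar: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const token = env[tokenEnvVar]?.trim();
  // Treat an unset and an empty/whitespace-only variable the same way.
  if (!token) throw new TokenNotSetError(tokenEnvVar);
  return token;
}
```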

Database Schema

Migration 001_initial.sql

-- Schema version tracking
CREATE TABLE schema_version (
  version INTEGER PRIMARY KEY,
  applied_at INTEGER NOT NULL,      -- ms epoch UTC
  description TEXT
);

INSERT INTO schema_version (version, applied_at, description)
VALUES (1, strftime('%s', 'now') * 1000, 'Initial schema');

-- Projects table (configured targets)
CREATE TABLE projects (
  id INTEGER PRIMARY KEY,
  gitlab_project_id INTEGER UNIQUE NOT NULL,
  path_with_namespace TEXT NOT NULL,
  default_branch TEXT,
  web_url TEXT,
  created_at INTEGER,               -- ms epoch UTC
  updated_at INTEGER,               -- ms epoch UTC
  raw_payload_id INTEGER REFERENCES raw_payloads(id)
);
CREATE INDEX idx_projects_path ON projects(path_with_namespace);

-- Sync tracking for reliability
CREATE TABLE sync_runs (
  id INTEGER PRIMARY KEY,
  started_at INTEGER NOT NULL,      -- ms epoch UTC
  heartbeat_at INTEGER NOT NULL,    -- ms epoch UTC
  finished_at INTEGER,              -- ms epoch UTC
  status TEXT NOT NULL,             -- 'running' | 'succeeded' | 'failed'
  command TEXT NOT NULL,            -- 'init' | 'ingest issues' | 'sync' | etc.
  error TEXT,
  metrics_json TEXT                 -- JSON blob of per-run counters/timing
);

-- metrics_json schema (informational, not enforced):
-- {
--   "apiCalls": number,
--   "rateLimitHits": number,
--   "pagesFetched": number,
--   "entitiesUpserted": number,
--   "discussionsFetched": number,
--   "notesUpserted": number,
--   "docsRegenerated": number,
--   "embeddingsCreated": number,
--   "durationMs": number
-- }

-- Crash-safe single-flight lock (DB-enforced)
CREATE TABLE app_locks (
  name TEXT PRIMARY KEY,            -- 'sync'
  owner TEXT NOT NULL,              -- random run token (UUIDv4)
  acquired_at INTEGER NOT NULL,     -- ms epoch UTC
  heartbeat_at INTEGER NOT NULL     -- ms epoch UTC
);

-- Sync cursors for primary resources only
CREATE TABLE sync_cursors (
  project_id INTEGER NOT NULL REFERENCES projects(id),
  resource_type TEXT NOT NULL,      -- 'issues' | 'merge_requests'
  updated_at_cursor INTEGER,        -- ms epoch UTC, last fully processed
  tie_breaker_id INTEGER,           -- last fully processed gitlab_id
  PRIMARY KEY(project_id, resource_type)
);

-- Raw payload storage (decoupled from entity tables)
CREATE TABLE raw_payloads (
  id INTEGER PRIMARY KEY,
  source TEXT NOT NULL,             -- 'gitlab'
  project_id INTEGER REFERENCES projects(id),
  resource_type TEXT NOT NULL,      -- 'project' | 'issue' | 'mr' | 'note' | 'discussion'
  gitlab_id TEXT NOT NULL,          -- TEXT: discussion IDs are strings
  fetched_at INTEGER NOT NULL,      -- ms epoch UTC
  content_encoding TEXT NOT NULL DEFAULT 'identity',  -- 'identity' | 'gzip'
  payload_hash TEXT NOT NULL,       -- SHA-256 of decoded JSON bytes (pre-compression)
  payload BLOB NOT NULL             -- raw JSON or gzip-compressed JSON
);
CREATE INDEX idx_raw_payloads_lookup ON raw_payloads(project_id, resource_type, gitlab_id);
CREATE INDEX idx_raw_payloads_history ON raw_payloads(project_id, resource_type, gitlab_id, fetched_at);
CREATE UNIQUE INDEX uq_raw_payloads_dedupe
  ON raw_payloads(project_id, resource_type, gitlab_id, payload_hash);
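
The sync_cursors table pairs updated_at_cursor with tie_breaker_id, and the config exposes cursorRewindSeconds. How later checkpoints consume these is not specified here; the sketch below is one plausible reading, assuming rows are processed in (updated_at, gitlab_id) order. All names in it are hypothetical.

```typescript
// Assumed in-memory shape of a sync_cursors row.
interface SyncCursor {
  updatedAtCursor: number | null; // ms epoch UTC, last fully processed
  tieBreakerId: number | null;    // gitlab_id of the last processed row
}

// True when (updatedAt, gitlabId) sorts strictly after the cursor,
// i.e. the row has not been processed yet.
function isAfterCursor(updatedAt: number, gitlabId: number, cursor: SyncCursor): boolean {
  if (cursor.updatedAtCursor === null) return true; // nothing processed yet
  if (updatedAt !== cursor.updatedAtCursor) return updatedAt > cursor.updatedAtCursor;
  // Same timestamp: fall back to the ID to break the tie.
  return gitlabId > (cursor.tieBreakerId ?? -1);
}

// Lower bound for the next fetch, rewound by cursorRewindSeconds so rows
// sharing a timestamp near the cursor are re-fetched and then filtered.
function fetchSince(cursor: SyncCursor, cursorRewindSeconds: number): number | null {
  if (cursor.updatedAtCursor === null) return null;
  return Math.max(0, cursor.updatedAtCursor - cursorRewindSeconds * 1000);
}
```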

SQLite Runtime Pragmas

Set on every database connection:

// src/core/db.ts
import Database from 'better-sqlite3';

export function createConnection(dbPath: string): Database.Database {
  const db = new Database(dbPath);

  // Production-grade defaults for single-user CLI
  db.pragma('journal_mode = WAL');
  db.pragma('synchronous = NORMAL');      // Safe for WAL on local disk
  db.pragma('foreign_keys = ON');
  db.pragma('busy_timeout = 5000');       // 5s wait on lock contention
  db.pragma('temp_store = MEMORY');       // Small speed win

  return db;
}
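
src/core/db.ts is also responsible for applying migrations in order and tracking the schema version. The selection step can be kept as a pure function, sketched below. It assumes migration files are named NNN_description.sql as in migrations/001_initial.sql; the function names are hypothetical.

```typescript
interface Migration {
  version: number;
  file: string;
}

// Parses names like "001_initial.sql"; returns null for anything else.
function parseMigrationName(file: string): Migration | null {
  const match = /^(\d+)_[\w-]+\.sql$/.exec(file);
  if (!match) return null;
  return { version: Number.parseInt(match[1] ?? '', 10), file };
}

// Returns migrations newer than currentVersion, sorted ascending.
// currentVersion would come from MAX(version) in schema_version (0 if empty).
function pendingMigrations(files: string[], currentVersion: number): Migration[] {
  return files
    .map(parseMigrationName)
    .filter((m): m is Migration => m !== null && m.version > currentVersion)
    .sort((a, b) => a.version - b.version);
}
```

Sorting by the parsed number rather than by filename keeps the order correct even if zero-padding is inconsistent.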

Error Classes

// src/core/errors.ts

export class GiError extends Error {
  constructor(
    message: string,
    public readonly code: string,
    public readonly cause?: Error
  ) {
    super(message);
    this.name = 'GiError';
  }
}

// Config errors
export class ConfigNotFoundError extends GiError {
  constructor(searchedPath: string) {
    super(
      `Config file not found at ${searchedPath}. Run "gi init" first.`,
      'CONFIG_NOT_FOUND'
    );
  }
}

export class ConfigValidationError extends GiError {
  constructor(details: string) {
    super(`Invalid config: ${details}`, 'CONFIG_INVALID');
  }
}

// GitLab API errors
export class GitLabAuthError extends GiError {
  constructor() {
    super(
      'GitLab authentication failed. Check your token has read_api scope.',
      'GITLAB_AUTH_FAILED'
    );
  }
}

export class GitLabNotFoundError extends GiError {
  constructor(resource: string) {
    super(`GitLab resource not found: ${resource}`, 'GITLAB_NOT_FOUND');
  }
}

export class GitLabRateLimitError extends GiError {
  constructor(public readonly retryAfter: number) {
    super(`Rate limited. Retry after ${retryAfter}s`, 'GITLAB_RATE_LIMITED');
  }
}

export class GitLabNetworkError extends GiError {
  constructor(baseUrl: string, cause?: Error) {
    super(
      `Cannot connect to GitLab at ${baseUrl}`,
      'GITLAB_NETWORK_ERROR',
      cause
    );
  }
}

// Database errors
export class DatabaseLockError extends GiError {
  constructor(owner: string, acquiredAt: number) {
    super(
      `Another sync is running (owner: ${owner}, started: ${new Date(acquiredAt).toISOString()}). Use --force to override if stale.`,
      'DB_LOCKED'
    );
  }
}

export class MigrationError extends GiError {
  constructor(version: number, cause: Error) {
    super(
      `Migration ${version} failed: ${cause.message}`,
      'MIGRATION_FAILED',
      cause
    );
  }
}

// Token errors
export class TokenNotSetError extends GiError {
  constructor(envVar: string) {
    super(
      `GitLab token not set. Export ${envVar} environment variable.`,
      'TOKEN_NOT_SET'
    );
  }
}

Logging Configuration

// src/core/logger.ts
import pino from 'pino';

// Logs go to stderr, results to stdout (allows clean JSON piping)
export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'production' ? undefined : {
    target: 'pino-pretty',
    options: {
      colorize: true,
      destination: 2,  // stderr
      translateTime: 'SYS:standard',
      ignore: 'pid,hostname'
    }
  }
}, pino.destination(2));

// Create child loggers for components
export const dbLogger = logger.child({ component: 'db' });
export const gitlabLogger = logger.child({ component: 'gitlab' });
export const configLogger = logger.child({ component: 'config' });

Log Levels:

| Level | When to use |
|---|---|
| debug | Detailed API calls, SQL queries, config resolution |
| info | Sync start/complete, project counts, major milestones |
| warn | Rate limits hit, retries, Ollama unavailable |
| error | Failures that stop operations |

CLI Commands (Checkpoint 0)

gi init

Interactive setup wizard that creates config at XDG path.

Flow:

  1. Check if config already exists → prompt to overwrite
  2. Prompt for GitLab base URL
  3. Prompt for project paths (comma-separated or one at a time)
  4. Prompt for token env var name (default: GITLAB_TOKEN)
  5. Validate before writing:
    • Token must be set in environment
    • Test auth with GET /api/v4/user
    • Validate each project path with GET /api/v4/projects/:path (path URL-encoded, e.g. group%2Fproject-one)
  6. Write config file
  7. Initialize database with migrations
  8. Insert validated projects into projects table

Flags:

  • --config <path>: Write config to specific path
  • --force: Skip overwrite confirmation
  • --non-interactive: Fail if prompts would be shown (for scripting)

Exit codes:

  • 0: Success
  • 1: Validation failed (token, auth, project not found)
  • 2: User cancelled
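
Step 3 of the flow accepts project paths as a comma-separated string. A small normalizer, sketched below, could turn that input into the list that is later validated against GitLab. parseProjectPaths is a hypothetical name, and stripping surrounding slashes and duplicates is an assumption about desired behavior.

```typescript
// Hypothetical helper: turns "group/a, group/b" into ["group/a", "group/b"].
function parseProjectPaths(input: string): string[] {
  const seen = new Set<string>();
  for (const raw of input.split(',')) {
    // Trim whitespace, then strip leading/trailing slashes.
    const path = raw.trim().replace(/^\/+|\/+$/g, '');
    if (path) seen.add(path); // a Set drops duplicates, keeps first-seen order
  }
  return Array.from(seen);
}
```

An empty result would fail the schema's projects minimum of one entry, so init can reject it before making any API calls.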

gi auth-test

Verify GitLab authentication.

Output:

Authenticated as @johndoe (John Doe)
GitLab: https://gitlab.example.com (v16.8.0)

Exit codes:

  • 0: Auth successful
  • 1: Auth failed

gi doctor

Check environment health.

Output:

gi doctor

  Config     ✓  Loaded from ~/.config/gi/config.json
  Database   ✓  ~/.local/share/gi/data.db (schema v1)
  GitLab     ✓  https://gitlab.example.com (authenticated as @johndoe)
  Projects   ✓  2 configured, 2 resolved
  Ollama     ⚠  Not running (semantic search unavailable)

Status: Ready (lexical search available, semantic search requires Ollama)

Flags:

  • --json: Output as JSON for scripting

JSON output schema:

interface DoctorResult {
  success: boolean;       // All required checks passed
  checks: {
    config: { status: 'ok' | 'error'; path?: string; error?: string };
    database: { status: 'ok' | 'error'; path?: string; schemaVersion?: number; error?: string };
    gitlab: { status: 'ok' | 'error'; url?: string; username?: string; error?: string };
    projects: { status: 'ok' | 'error'; configured?: number; resolved?: number; error?: string };
    ollama: { status: 'ok' | 'warning' | 'error'; url?: string; model?: string; error?: string };
  };
}
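
The success field is described as "all required checks passed". The sketch below shows one way to derive it, assuming every check except ollama is required (the sample output treats a missing Ollama as a warning only). The types are simplified and the names are hypothetical.

```typescript
type CheckStatus = 'ok' | 'warning' | 'error';

interface Check {
  status: CheckStatus;
  error?: string;
}

interface DoctorChecks {
  config: Check;
  database: Check;
  gitlab: Check;
  projects: Check;
  ollama: Check;
}

// Assumption: Ollama is optional, everything else is required.
const REQUIRED_CHECKS: Array<keyof DoctorChecks> = ['config', 'database', 'gitlab', 'projects'];

function summarize(checks: DoctorChecks): { success: boolean; checks: DoctorChecks } {
  const success = REQUIRED_CHECKS.every((name) => checks[name].status === 'ok');
  return { success, checks };
}
```

With --json, the command could print JSON.stringify(summarize(checks)) to stdout and exit non-zero when success is false.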

gi version

Show version information.

Output:

gi version 0.1.0

gi backup

Create timestamped database backup.

Output:

Created backup: ~/.local/share/gi/backups/data-2026-01-24T10-30-00.db
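
The backup name in the sample output looks like an ISO 8601 timestamp with colons replaced so that it is safe as a filename. A sketch of that derivation follows; it assumes UTC, which the spec does not state, and backupFileName is a hypothetical name.

```typescript
// Hypothetical helper: builds "data-YYYY-MM-DDTHH-MM-SS.db" from a Date (UTC).
function backupFileName(date: Date = new Date()): string {
  // toISOString() gives "2026-01-24T10:30:00.000Z"; keep the first 19 chars
  // (drops milliseconds and the Z) and replace ":" with "-".
  const stamp = date.toISOString().slice(0, 19).replace(/:/g, '-');
  return `data-${stamp}.db`;
}
```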

gi reset --confirm

Delete database and reset all state.

Output:

This will delete:
  - Database: ~/.local/share/gi/data.db
  - All sync cursors
  - All cached data

Type 'yes' to confirm: yes
Database reset. Run 'gi sync' to repopulate.

gi sync-status

Show sync state (stub in CP0, full implementation in CP1).

Output (CP0 stub):

No sync runs yet. Run 'gi sync' to start.

GitLab Client

Core Client Implementation

// src/gitlab/client.ts
import { GitLabAuthError, GitLabNotFoundError, GitLabRateLimitError, GitLabNetworkError } from '../core/errors';
import { gitlabLogger } from '../core/logger';

interface GitLabClientOptions {
  baseUrl: string;
  token: string;
  requestsPerSecond?: number;
}

interface GitLabUser {
  id: number;
  username: string;
  name: string;
}

interface GitLabProject {
  id: number;
  path_with_namespace: string;
  default_branch: string;
  web_url: string;
  created_at: string;
  updated_at: string;
}

export class GitLabClient {
  private baseUrl: string;
  private token: string;
  private rateLimiter: RateLimiter;

  constructor(options: GitLabClientOptions) {
    this.baseUrl = options.baseUrl.replace(/\/$/, '');
    this.token = options.token;
    this.rateLimiter = new RateLimiter(options.requestsPerSecond ?? 10);
  }

  async getCurrentUser(): Promise<GitLabUser> {
    return this.request<GitLabUser>('/api/v4/user');
  }

  async getProject(pathWithNamespace: string): Promise<GitLabProject> {
    const encoded = encodeURIComponent(pathWithNamespace);
    return this.request<GitLabProject>(`/api/v4/projects/${encoded}`);
  }

  private async request<T>(path: string, options: RequestInit = {}): Promise<T> {
    await this.rateLimiter.acquire();

    const url = `${this.baseUrl}${path}`;
    gitlabLogger.debug({ url }, 'GitLab request');

    let response: Response;
    try {
      response = await fetch(url, {
        ...options,
        headers: {
          'PRIVATE-TOKEN': this.token,
          'Accept': 'application/json',
          ...options.headers,
        },
      });
    } catch (err) {
      throw new GitLabNetworkError(this.baseUrl, err as Error);
    }

    if (response.status === 401) {
      throw new GitLabAuthError();
    }

    if (response.status === 404) {
      throw new GitLabNotFoundError(path);
    }

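    if (response.status === 429) {
      // Retry-After may be absent or an HTTP date; fall back to 60s in either case
      const parsed = Number.parseInt(response.headers.get('Retry-After') ?? '', 10);
      const retryAfter = Number.isNaN(parsed) ? 60 : parsed;
      throw new GitLabRateLimitError(retryAfter);
    }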

    if (!response.ok) {
      throw new Error(`GitLab API error: ${response.status} ${response.statusText}`);
    }

    return response.json() as Promise<T>;
  }
}

// Simple rate limiter with jitter.
// Slots are reserved synchronously so that concurrent callers queue up
// one interval apart instead of all waking at the same instant.
class RateLimiter {
  private nextSlot = 0;
  private minInterval: number;

  constructor(requestsPerSecond: number) {
    this.minInterval = 1000 / requestsPerSecond;
  }

  async acquire(): Promise<void> {
    const now = Date.now();
    const slot = Math.max(now, this.nextSlot);
    this.nextSlot = slot + this.minInterval;

    const wait = slot - now;
    if (wait > 0) {
      const jitter = Math.random() * 50; // 0-50ms jitter
      await new Promise(resolve => setTimeout(resolve, wait + jitter));
    }
  }
}
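
The client surfaces HTTP 429 as GitLabRateLimitError with a retryAfter value but leaves retrying to the caller. One possible wrapper is sketched below. The names withRateLimitRetry and maxRetries are hypothetical, and RateLimitError is a local stand-in for the real error class.

```typescript
// Stand-in for GitLabRateLimitError from src/core/errors.ts.
class RateLimitError extends Error {
  readonly retryAfter: number; // seconds
  constructor(retryAfter: number) {
    super(`Rate limited. Retry after ${retryAfter}s`);
    this.retryAfter = retryAfter;
  }
}

type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries `fn` only on RateLimitError, waiting the server-provided delay.
// Any other error, or exhausting maxRetries, is rethrown unchanged.
async function withRateLimitRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  sleep: Sleep = realSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!(err instanceof RateLimitError) || attempt >= maxRetries) throw err;
      await sleep(err.retryAfter * 1000);
    }
  }
}
```

Injecting `sleep` lets the mocked integration tests run without real delays.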

App Lock Mechanism

Crash-safe single-flight lock using heartbeat pattern.

// src/core/lock.ts
import { randomUUID } from 'node:crypto';
import Database from 'better-sqlite3';
import { DatabaseLockError } from './errors';
import { dbLogger } from './logger';
import { nowMs } from './time';

interface LockOptions {
  name: string;
  staleLockMinutes: number;
  heartbeatIntervalSeconds: number;
}

export class AppLock {
  private db: Database.Database;
  private owner: string;
  private name: string;
  private staleLockMs: number;
  private heartbeatIntervalMs: number;
  private heartbeatTimer?: NodeJS.Timeout;
  private released = false;

  constructor(db: Database.Database, options: LockOptions) {
    this.db = db;
    this.owner = randomUUID();
    this.name = options.name;
    this.staleLockMs = options.staleLockMinutes * 60 * 1000;
    this.heartbeatIntervalMs = options.heartbeatIntervalSeconds * 1000;
  }

  acquire(force = false): boolean {
    const now = nowMs();

    return this.db.transaction(() => {
      const existing = this.db.prepare(
        'SELECT owner, acquired_at, heartbeat_at FROM app_locks WHERE name = ?'
      ).get(this.name) as { owner: string; acquired_at: number; heartbeat_at: number } | undefined;

      if (!existing) {
        // No lock exists, acquire it
        this.db.prepare(
          'INSERT INTO app_locks (name, owner, acquired_at, heartbeat_at) VALUES (?, ?, ?, ?)'
        ).run(this.name, this.owner, now, now);
        this.startHeartbeat();
        dbLogger.info({ owner: this.owner }, 'Lock acquired (new)');
        return true;
      }

      const isStale = (now - existing.heartbeat_at) > this.staleLockMs;

      if (isStale || force) {
        // Lock is stale or force override, take it
        this.db.prepare(
          'UPDATE app_locks SET owner = ?, acquired_at = ?, heartbeat_at = ? WHERE name = ?'
        ).run(this.owner, now, now, this.name);
        this.startHeartbeat();
        dbLogger.info({ owner: this.owner, previousOwner: existing.owner, wasStale: isStale }, 'Lock acquired (override)');
        return true;
      }

      if (existing.owner === this.owner) {
        // Re-entrant, update heartbeat
        this.db.prepare(
          'UPDATE app_locks SET heartbeat_at = ? WHERE name = ?'
        ).run(now, this.name);
        return true;
      }

      // Lock held by another active process
      throw new DatabaseLockError(existing.owner, existing.acquired_at);
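    }).immediate(); // BEGIN IMMEDIATE: take the write lock up front so two processes cannot both observe "no lock"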
  }

  release(): void {
    if (this.released) return;
    this.released = true;

    if (this.heartbeatTimer) {
      clearInterval(this.heartbeatTimer);
    }

    this.db.prepare('DELETE FROM app_locks WHERE name = ? AND owner = ?')
      .run(this.name, this.owner);

    dbLogger.info({ owner: this.owner }, 'Lock released');
  }

  private startHeartbeat(): void {
    this.heartbeatTimer = setInterval(() => {
      if (this.released) return;

      this.db.prepare('UPDATE app_locks SET heartbeat_at = ? WHERE name = ? AND owner = ?')
        .run(nowMs(), this.name, this.owner);

      dbLogger.debug({ owner: this.owner }, 'Heartbeat updated');
    }, this.heartbeatIntervalMs);

    // Don't prevent process from exiting
    this.heartbeatTimer.unref();
  }
}
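
Callers need to release the lock even when the guarded work throws. A small wrapper, sketched below, makes that pattern hard to get wrong. withLock is a hypothetical helper, and the Lock interface only mirrors the acquire/release methods of AppLock shown above.

```typescript
// Minimal surface of AppLock that the wrapper depends on.
interface Lock {
  acquire(force?: boolean): boolean;
  release(): void;
}

// Runs `fn` while holding the lock and always releases afterwards.
// If acquire() throws (lock held by a live owner), `fn` never runs
// and release() is not called.
async function withLock<T>(lock: Lock, fn: () => Promise<T>, force = false): Promise<T> {
  lock.acquire(force);
  try {
    return await fn();
  } finally {
    lock.release();
  }
}
```

A command would then wrap its body, for example `await withLock(new AppLock(db, options), () => runSync())`, where runSync is a placeholder for the real work.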

Raw Payload Handling

Compression and Deduplication

// src/core/payloads.ts
import { createHash } from 'node:crypto';
import { gzipSync, gunzipSync } from 'node:zlib';
import Database from 'better-sqlite3';
import { nowMs } from './time';

interface StorePayloadOptions {
  projectId: number | null;
  resourceType: string;
  gitlabId: string;
  payload: unknown;
  compress: boolean;
}

export function storePayload(
  db: Database.Database,
  options: StorePayloadOptions
): number {
  const jsonBytes = Buffer.from(JSON.stringify(options.payload));
  const payloadHash = createHash('sha256').update(jsonBytes).digest('hex');

  // Check for duplicate (same content already stored)
  const existing = db.prepare(`
    SELECT id FROM raw_payloads
    WHERE project_id IS ? AND resource_type = ? AND gitlab_id = ? AND payload_hash = ?
  `).get(options.projectId, options.resourceType, options.gitlabId, payloadHash) as { id: number } | undefined;

  if (existing) {
    // Duplicate content, return existing ID
    return existing.id;
  }

  const encoding = options.compress ? 'gzip' : 'identity';
  const payloadBytes = options.compress ? gzipSync(jsonBytes) : jsonBytes;

  const result = db.prepare(`
    INSERT INTO raw_payloads
    (source, project_id, resource_type, gitlab_id, fetched_at, content_encoding, payload_hash, payload)
    VALUES ('gitlab', ?, ?, ?, ?, ?, ?, ?)
  `).run(
    options.projectId,
    options.resourceType,
    options.gitlabId,
    nowMs(),
    encoding,
    payloadHash,
    payloadBytes
  );

  // lastInsertRowid is typed number | bigint; convert explicitly instead of casting
  return Number(result.lastInsertRowid);
}

export function readPayload(
  db: Database.Database,
  id: number
): unknown {
  const row = db.prepare(
    'SELECT content_encoding, payload FROM raw_payloads WHERE id = ?'
  ).get(id) as { content_encoding: string; payload: Buffer } | undefined;

  if (!row) return null;

  const jsonBytes = row.content_encoding === 'gzip'
    ? gunzipSync(row.payload)
    : row.payload;

  return JSON.parse(jsonBytes.toString());
}
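
Because payload_hash is computed over the JSON bytes before compression, the same payload hashes identically whether it is stored as gzip or identity, which is what lets deduplication work across both encodings. The database-free sketch below illustrates this; encodePayload and decodePayload are hypothetical names that isolate the encoding logic from the functions above.

```typescript
import { createHash } from 'node:crypto';
import { gzipSync, gunzipSync } from 'node:zlib';

// Encodes a payload the same way storePayload does, without touching the DB.
function encodePayload(payload: unknown, compress: boolean) {
  const jsonBytes = Buffer.from(JSON.stringify(payload));
  return {
    // Hash the uncompressed bytes so the value is encoding-independent.
    payloadHash: createHash('sha256').update(jsonBytes).digest('hex'),
    contentEncoding: compress ? 'gzip' : 'identity',
    bytes: compress ? gzipSync(jsonBytes) : jsonBytes,
  };
}

// Reverses encodePayload based on the stored content_encoding value.
function decodePayload(contentEncoding: string, bytes: Buffer): unknown {
  const jsonBytes = contentEncoding === 'gzip' ? gunzipSync(bytes) : bytes;
  return JSON.parse(jsonBytes.toString('utf8'));
}
```

Note that the hash depends on JSON.stringify key order, so two payloads with the same fields in a different order are stored as distinct rows.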

Automated Tests

Unit Tests

tests/unit/config.test.ts

describe('Config', () => {
  it('loads config from file path');
  it('throws ConfigNotFoundError if file missing');
  it('throws ConfigValidationError if required fields missing');
  it('validates project paths are non-empty strings');
  it('applies default values for optional fields');
  it('loads from XDG path by default');
  it('respects GI_CONFIG_PATH override');
  it('respects --config flag override');
});

tests/unit/db.test.ts

describe('Database', () => {
  it('creates database file if not exists');
  it('applies migrations in order');
  it('sets WAL journal mode');
  it('enables foreign keys');
  it('sets busy_timeout=5000');
  it('sets synchronous=NORMAL');
  it('sets temp_store=MEMORY');
  it('tracks schema version');
});

tests/unit/paths.test.ts

describe('Path Resolution', () => {
  it('uses XDG_CONFIG_HOME if set');
  it('falls back to ~/.config/gi if XDG not set');
  it('prefers --config flag over environment');
  it('prefers environment over XDG default');
  it('falls back to local gi.config.json in dev');
});

Integration Tests

tests/integration/gitlab-client.test.ts (mocked)

describe('GitLab Client', () => {
  it('authenticates with valid PAT');
  it('throws GitLabAuthError on 401 for invalid PAT');
  it('fetches project by path');
  it('handles rate limiting (429) with Retry-After');
  it('respects rate limit (requests per second)');
  it('adds jitter to rate limiting');
});

tests/integration/app-lock.test.ts

describe('App Lock', () => {
  it('acquires lock successfully');
  it('updates heartbeat during operation');
  it('detects stale lock and recovers');
  it('refuses concurrent acquisition');
  it('allows force override');
  it('releases lock on completion');
});

tests/integration/init.test.ts

describe('gi init', () => {
  it('creates config file with valid structure');
  it('validates GitLab URL format');
  it('validates GitLab connection before writing config');
  it('validates each project path exists in GitLab');
  it('fails if token not set');
  it('fails if GitLab auth fails');
  it('fails if any project path not found');
  it('prompts before overwriting existing config');
  it('respects --force to skip confirmation');
  it('generates config with sensible defaults');
  it('creates data directory if missing');
});

Live Tests (Gated)

tests/live/gitlab-client.live.test.ts

// Only runs when GITLAB_LIVE_TESTS=1
describe('GitLab Client (Live)', () => {
  it('authenticates with real PAT');
  it('fetches real project by path');
  it('handles actual rate limiting');
});

Manual Smoke Tests

| Command | Expected Output | Pass Criteria |
|---|---|---|
| gi --help | Command list | Shows all available commands |
| gi version | Version number | Shows installed version |
| gi init | Interactive prompts | Creates valid config |
| gi init (config exists) | Confirmation prompt | Warns before overwriting |
| gi init --force | No prompt | Overwrites without asking |
| gi auth-test | Authenticated as @username | Shows GitLab username |
| GITLAB_TOKEN=invalid gi auth-test | Error message | Non-zero exit, clear error |
| gi doctor | Status table | All required checks pass |
| gi doctor --json | JSON object | Valid JSON, success: true |
| gi backup | Backup path | Creates timestamped backup |
| gi sync-status | No runs message | Stub output works |

Definition of Done

Gate (Must Pass)

  • gi init writes config to XDG path and validates projects against GitLab
  • gi auth-test succeeds with real PAT (live test, can be manual)
  • gi doctor reports DB ok + GitLab ok (Ollama may warn if not running)
  • DB migrations apply; WAL + FK enabled; busy_timeout + synchronous set
  • App lock mechanism works (concurrent runs blocked)
  • All unit tests pass
  • All integration tests pass (mocked)
  • ESLint passes with no errors
  • TypeScript compiles with strict mode

Hardening (Optional Before CP1)

  • Additional negative-path tests (overwrite prompts, JSON outputs)
  • Edge cases: empty project list, invalid URLs, network timeouts
  • Config migration from old paths (if upgrading)
  • Live tests pass against real GitLab instance

Implementation Order

  1. Project scaffold (5 min)
    • package.json, tsconfig.json, vitest.config.ts, eslint.config.js
    • Directory structure
    • .gitignore

  2. Core utilities (30 min)
    • src/core/paths.ts - XDG path resolution
    • src/core/time.ts - Timestamp utilities
    • src/core/errors.ts - Error classes
    • src/core/logger.ts - pino setup

  3. Config loading (30 min)
    • src/core/config.ts - Zod schema, load/validate
    • Unit tests for config

  4. Database (45 min)
    • src/core/db.ts - Connection, pragmas, migrations
    • migrations/001_initial.sql
    • Unit tests for DB
    • App lock mechanism

  5. GitLab client (30 min)
    • src/gitlab/client.ts - API client with rate limiting
    • src/gitlab/types.ts - Response types
    • Integration tests (mocked)

  6. Raw payload handling (20 min)
    • src/core/payloads.ts - Compression, deduplication, storage

  7. CLI commands (60 min)
    • src/cli/index.ts - Commander setup
    • gi init - Full implementation
    • gi auth-test - Simple
    • gi doctor - Health checks
    • gi version - Version display
    • gi backup - Database backup
    • gi reset - Database reset
    • gi sync-status - Stub

  8. Final validation (15 min)
    • Run all tests
    • Manual smoke tests
    • ESLint + TypeScript check

Risks & Mitigations

| Risk | Mitigation |
|---|---|
| sqlite-vec installation fails | Document manual install steps; degrade to FTS-only |
| better-sqlite3 native compilation | Provide prebuilt binaries in package |
| XDG paths not writable | Fall back to cwd; show clear error |
| GitLab API changes | Pin to known API version; document tested version |

References