docs(readme): add timeline pipeline documentation and schema updates
Documents the timeline pipeline feature in the README:

- New feature bullets: timeline pipeline, git history linking, file change tracking
- Updated schema table: merge_requests now includes commit SHAs, added mr_file_changes table
- New "Timeline Pipeline" section explaining the 5-stage architecture (SEED -> HYDRATE -> EXPAND -> COLLECT -> RENDER) with a table of all event types and a note on unresolved cross-project references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
README.md
@@ -1,6 +1,6 @@
 # Gitlore

-Local GitLab data management with semantic search. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, and hybrid search.
+Local GitLab data management with semantic search and temporal intelligence. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, hybrid search, and chronological event reconstruction.

 ## Features

@@ -10,6 +10,9 @@ Local GitLab data management with semantic search. Syncs issues, MRs, discussion
 - **Multi-project**: Track issues and MRs across multiple GitLab projects
 - **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches
 - **Hybrid search**: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
+- **Timeline pipeline**: Reconstructs chronological event histories by combining search, graph traversal, and event aggregation across related entities
+- **Git history linking**: Tracks merge and squash commit SHAs to connect MRs with git history
+- **File change tracking**: Records which files each MR touches, enabling file-level history queries
 - **Raw payload storage**: Preserves original GitLab API responses for debugging
 - **Discussion threading**: Full support for issue and MR discussions including inline code review comments
 - **Cross-reference tracking**: Automatic extraction of "closes", "mentioned" relationships between MRs and issues
@@ -518,7 +521,7 @@ Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
 |-------|---------|
 | `projects` | Tracked GitLab projects with metadata |
 | `issues` | Issue metadata (title, state, author, due date, milestone) |
-| `merge_requests` | MR metadata (title, state, draft, branches, merge status) |
+| `merge_requests` | MR metadata (title, state, draft, branches, merge status, commit SHAs) |
 | `milestones` | Project milestones with state and due dates |
 | `labels` | Project labels with colors |
 | `issue_labels` | Many-to-many issue-label relationships |
@@ -526,6 +529,7 @@ Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
 | `mr_labels` | Many-to-many MR-label relationships |
 | `mr_assignees` | Many-to-many MR-assignee relationships |
 | `mr_reviewers` | Many-to-many MR-reviewer relationships |
+| `mr_file_changes` | Files touched by each MR (path, change type, renames) |
 | `discussions` | Issue/MR discussion threads |
 | `notes` | Individual notes within discussions (with system note flag and DiffNote position data) |
 | `resource_state_events` | Issue/MR state change history (opened, closed, merged, reopened) |
@@ -545,6 +549,42 @@ Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:

 The database is stored at `~/.local/share/lore/lore.db` by default (XDG compliant).

+## Timeline Pipeline
+
+The timeline pipeline reconstructs chronological event histories for GitLab entities by combining full-text search, cross-reference graph traversal, and resource event aggregation. Given a search query, it identifies relevant issues and MRs, discovers related entities through their reference graph, and assembles a unified, time-ordered event stream.
+
+### Stages
+
+The pipeline executes in five stages:
+
+1. **SEED** -- Full-text search identifies the most relevant issues and MRs matching the query. Documents (issue bodies, MR descriptions, discussion notes) are ranked by BM25 relevance.
+
+2. **HYDRATE** -- Evidence notes are extracted from the seed results: the top FTS-matched discussion notes with 200-character snippets that explain *why* each entity was surfaced.
+
+3. **EXPAND** -- Breadth-first traversal over the `entity_references` graph discovers related entities. Starting from seed entities, the pipeline follows "closes", "related", and optionally "mentioned" references up to a configurable depth, tracking provenance (which entity referenced which, via what method).
+
+4. **COLLECT** -- Events are gathered for all discovered entities (seeds + expanded). Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, and evidence notes. Events are sorted chronologically with stable tiebreaking (timestamp, then entity ID, then event type).
+
+5. **RENDER** -- Events are formatted for output as human-readable text or structured JSON.
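
The EXPAND stage described above can be illustrated with a small breadth-first traversal. This is a minimal sketch in Python, not the project's implementation: the `Reference` record, the `expand` function, the string entity keys, and the choice to follow edges only from source to target are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """One edge of a hypothetical entity_references graph."""
    source: str   # entity that contains the reference, e.g. "mr:10"
    target: str   # entity being referenced, e.g. "issue:1"
    kind: str     # "closes", "related", or "mentioned"
    method: str   # how the reference was detected (illustrative values)


def expand(seeds, references, max_depth=1, include_mentioned=False):
    """Breadth-first expansion from seed entities up to max_depth hops.

    Returns a dict mapping each newly discovered entity to the Reference
    that first reached it, which serves as its provenance.
    """
    allowed = {"closes", "related"}
    if include_mentioned:
        allowed.add("mentioned")

    # Adjacency index restricted to the allowed reference kinds.
    # Assumption: edges are followed from source to target only.
    outgoing = {}
    for ref in references:
        if ref.kind in allowed:
            outgoing.setdefault(ref.source, []).append(ref)

    provenance = {}
    visited = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth budget exhausted for this branch
        for ref in outgoing.get(entity, []):
            if ref.target not in visited:
                visited.add(ref.target)
                provenance[ref.target] = ref
                queue.append((ref.target, depth + 1))
    return provenance
```

Because the traversal is breadth-first, each discovered entity records the reference on a shortest path from a seed, which keeps the provenance trail short and deterministic for a fixed input order.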
+
+### Event Types
+
+| Event | Description |
+|-------|-------------|
+| `Created` | Entity creation |
+| `StateChanged` | State transitions (opened, closed, reopened) |
+| `LabelAdded` | Label applied to entity |
+| `LabelRemoved` | Label removed from entity |
+| `MilestoneSet` | Milestone assigned |
+| `MilestoneRemoved` | Milestone removed |
+| `Merged` | MR merged (deduplicated against state events) |
+| `NoteEvidence` | Discussion note matched by FTS, with snippet |
+| `CrossReferenced` | Reference to another entity |
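
The ordering and deduplication rules for these events can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the `Event` record and `collect` function are hypothetical, timestamps are assumed to be integers, event types are assumed to tie-break alphabetically by name, and the dedup rule (drop a `StateChanged` event whose detail is "merged" when the same entity also has a `Merged` event) is one plausible reading of "deduplicated against state events".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical timeline event record."""
    timestamp: int    # assumed to be an integer such as epoch milliseconds
    entity_id: int
    event_type: str   # one of the event names in the table above
    detail: str = ""  # e.g. the new state or the label name


def collect(events):
    """Drop redundant merge state events, then sort deterministically."""
    merged_entities = {e.entity_id for e in events if e.event_type == "Merged"}

    # Assumed dedup rule: a "merged" state change is redundant when the
    # same entity already carries a dedicated Merged event.
    deduped = [
        e for e in events
        if not (
            e.event_type == "StateChanged"
            and e.detail == "merged"
            and e.entity_id in merged_entities
        )
    ]

    # Composite key: timestamp, then entity ID, then event type, so that
    # events sharing a timestamp always come out in the same order.
    return sorted(deduped, key=lambda e: (e.timestamp, e.entity_id, e.event_type))
```

A composite sort key makes the output independent of the order in which events were gathered, which matters when several resource events share a timestamp.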
+
+### Unresolved References
+
+When the graph expansion encounters cross-project references to entities not yet synced locally, these are collected as unresolved references in the pipeline output. This enables discovery of external dependencies and can inform future sync targets.
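
One way to picture the handling of unresolved references is a simple partition of reference targets into those present locally and those that are not. The sketch below is an assumption-laden illustration in Python: `Target`, `partition_targets`, `suggest_sync_projects`, and the example project paths are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical identifier for a referenced GitLab entity."""
    project_path: str  # e.g. "group/app" (illustrative value)
    kind: str          # "issue" or "merge_request"
    iid: int


def partition_targets(targets, synced):
    """Split reference targets into locally synced and unresolved ones."""
    resolved, unresolved = [], []
    for target in targets:
        if target in synced:
            resolved.append(target)
        else:
            unresolved.append(target)
    return resolved, unresolved


def suggest_sync_projects(unresolved):
    """List the distinct projects that unresolved references point into."""
    return sorted({target.project_path for target in unresolved})
```

Keeping the unresolved targets in the output, rather than silently dropping them, is what lets a caller decide which additional projects might be worth syncing.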
+
 ## Development

 ```bash