5 Commits

Author SHA1 Message Date
Taylor Eernisse
b005edb7f2 docs(readme): add timeline pipeline documentation and schema updates
Documents the timeline pipeline feature in the README:
- New feature bullets: timeline pipeline, git history linking, file
  change tracking
- Updated schema table: merge_requests now includes commit SHAs,
  added mr_file_changes table
- New "Timeline Pipeline" section explaining the 5-stage architecture
  (SEED -> HYDRATE -> EXPAND -> COLLECT -> RENDER) with a table of all
  event types and a note on unresolved cross-project references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:38:48 -05:00
Taylor Eernisse
03d9f8cce5 docs(db): document safety invariants for sqlite-vec transmute
Adds a SAFETY comment explaining why the transmute of sqlite3_vec_init
to the sqlite3_auto_extension callback type is sound. The three
invariants (stable C-ABI signature, single-call-per-connection contract,
idempotency) were previously undocumented, which left the lone unsafe
block without justification for future readers.
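The round-trip invariant the SAFETY comment relies on can be shown in isolation: a function pointer transmuted to an ABI-compatible erased type and back is the same pointer. A toy sketch, not the sqlite-vec API; `init` and `round_trip` are stand-ins:

```rust
// Illustrative only: erase and restore a C-ABI function-pointer signature,
// the same pattern the commit documents for sqlite3_vec_init.
unsafe extern "C" fn init() -> i32 {
    7
}

fn round_trip() -> i32 {
    // Erase the concrete signature, as `sqlite3_auto_extension`'s
    // `Option<unsafe extern "C" fn()>` parameter forces callers to do.
    let erased: unsafe extern "C" fn() =
        unsafe { std::mem::transmute(init as unsafe extern "C" fn() -> i32) };
    // Restore the true signature before calling; invoking through the
    // erased type would be undefined behavior.
    let restored: unsafe extern "C" fn() -> i32 = unsafe { std::mem::transmute(erased) };
    unsafe { restored() }
}

fn main() {
    assert_eq!(round_trip(), 7);
}
```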

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:38:41 -05:00
Taylor Eernisse
7eadae75f0 test(timeline): add integration tests for full seed-expand-collect pipeline
Adds tests/timeline_pipeline_tests.rs with end-to-end integration tests
that exercise the complete timeline pipeline against an in-memory SQLite
database with realistic data:

- pipeline_seed_expand_collect_end_to_end: Full scenario with an issue
  closed by an MR, state changes, and label events. Verifies that seed
  finds entities via FTS, expand discovers the closing MR through the
  entity_references graph, and collect assembles a chronologically sorted
  event stream containing Created, StateChanged, LabelAdded, and Merged
  events.

- pipeline_empty_query_produces_empty_result: Validates graceful
  degradation when FTS returns zero matches -- all three stages should
  produce empty results without errors.

- pipeline_since_filter_excludes_old_events: Verifies that the since
  timestamp filter propagates correctly through collect, excluding events
  before the cutoff while retaining newer ones.

- pipeline_unresolved_refs_have_optional_iid: Tests the Option<i64>
  target_iid on UnresolvedRef by creating cross-project references both
  with and without known IIDs.

- shared_resolve_entity_ref_scoping: Unit tests for the new shared
  resolve_entity_ref helper covering project-scoped lookup, unscoped
  lookup, wrong-project rejection, unknown entity types, and nonexistent
  entity IDs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:38:34 -05:00
Taylor Eernisse
9b23d91378 refactor(timeline): harden pipeline stages with shared resolver and exhaustive error handling
Follows up on the resolve_entity_ref extraction by updating all three
pipeline stages to consume the shared helper and removing their local
duplicates (~75 lines of dead code eliminated).

timeline_seed.rs:
- Switch from local resolve_entity to shared resolve_entity_ref with
  explicit Some(proj_id) scoping
- Add tracing::debug for orphaned discussion parents instead of silently
  skipping them, aiding debugging when evidence notes go missing
- Use saturating_mul for the over-fetch multiplier to prevent overflow on
  pathological max_seeds values

timeline_expand.rs:
- Switch from local resolve_entity_ref to shared version with None
  project scoping (cross-project traversal)
- Pass Option<i64> for target_iid in UnresolvedRef construction instead
  of unwrap_or(0) sentinel
- Update test assertion to compare against Some(42)

timeline_collect.rs:
- Make entity_id_column return Result instead of silently defaulting to
  issue_id for unknown entity types. The previous fallback could produce
  incorrect SQL queries that return wrong results rather than failing
- Replace if-let chains in collect_merged_event with exhaustive match
  blocks that propagate real DB errors while gracefully handling expected
  missing-data cases (QueryReturnedNoRows, NULL merged_at)
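The `saturating_mul` change can be illustrated standalone; `over_fetch_limit` is a hypothetical name, not the crate's function:

```rust
// Why saturating_mul: plain `*` panics in debug builds and wraps in
// release builds on overflow; saturating_mul clamps to usize::MAX.
fn over_fetch_limit(max_seeds: usize) -> usize {
    max_seeds.saturating_mul(3)
}

fn main() {
    assert_eq!(over_fetch_limit(50), 150);
    // A pathological max_seeds clamps instead of overflowing.
    assert_eq!(over_fetch_limit(usize::MAX / 2), usize::MAX);
}
```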

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:38:24 -05:00
Taylor Eernisse
a324fa26e1 refactor(timeline): extract shared resolve_entity_ref and make target_iid optional
The seed, expand, and collect stages each had their own near-identical
resolve_entity_ref helper that converted internal DB IDs to full EntityRef
structs. This duplication made it easy for bug fixes to land in one copy
but not the others.

Extract a single public resolve_entity_ref into timeline.rs with an
optional project_id parameter:
- Some(project_id): scopes the lookup (used by seed, which knows the
  project from the FTS result)
- None: unscoped lookup (used by expand, which traverses cross-project
  references)

Also changes UnresolvedRef.target_iid from i64 to Option<i64>. Cross-
project references parsed from descriptions may not always carry an IID
(e.g. when the reference is malformed or the target was deleted). The
previous sentinel value of 0 was semantically incorrect since GitLab IIDs
start at 1.
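The sentinel problem can be sketched with a toy parser (hypothetical, not the crate's actual reference parsing): with `Option<i64>`, a malformed reference yields `None` rather than a 0 that looks like data.

```rust
// GitLab IIDs start at 1, so a 0 sentinel conflated "unknown" with a
// real-looking value. `parse_iid` is illustrative only.
fn parse_iid(reference: &str) -> Option<i64> {
    // "other/repo#42" -> Some(42); malformed "other/repo#" -> None.
    reference.rsplit('#').next()?.parse().ok()
}

fn main() {
    assert_eq!(parse_iid("other/repo#42"), Some(42));
    assert_eq!(parse_iid("other/repo#"), None); // no IID recoverable
}
```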

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:38:12 -05:00
7 changed files with 518 additions and 119 deletions

View File

@@ -1,6 +1,6 @@
 # Gitlore
-Local GitLab data management with semantic search. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, and hybrid search.
+Local GitLab data management with semantic search and temporal intelligence. Syncs issues, MRs, discussions, and notes from GitLab to a local SQLite database for fast, offline-capable querying, filtering, hybrid search, and chronological event reconstruction.
 ## Features
@@ -10,6 +10,9 @@ Local GitLab data management with semantic search. Syncs issues, MRs, discussion
- **Multi-project**: Track issues and MRs across multiple GitLab projects
- **Rich filtering**: Filter by state, author, assignee, labels, milestone, due date, draft status, reviewer, branches
- **Hybrid search**: Combines FTS5 lexical search with Ollama-powered vector embeddings via Reciprocal Rank Fusion
- **Timeline pipeline**: Reconstructs chronological event histories by combining search, graph traversal, and event aggregation across related entities
- **Git history linking**: Tracks merge and squash commit SHAs to connect MRs with git history
- **File change tracking**: Records which files each MR touches, enabling file-level history queries
- **Raw payload storage**: Preserves original GitLab API responses for debugging
- **Discussion threading**: Full support for issue and MR discussions including inline code review comments
- **Cross-reference tracking**: Automatic extraction of "closes", "mentioned" relationships between MRs and issues
@@ -518,7 +521,7 @@ Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
|-------|---------|
| `projects` | Tracked GitLab projects with metadata |
| `issues` | Issue metadata (title, state, author, due date, milestone) |
-| `merge_requests` | MR metadata (title, state, draft, branches, merge status) |
+| `merge_requests` | MR metadata (title, state, draft, branches, merge status, commit SHAs) |
| `milestones` | Project milestones with state and due dates |
| `labels` | Project labels with colors |
| `issue_labels` | Many-to-many issue-label relationships |
@@ -526,6 +529,7 @@ Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
| `mr_labels` | Many-to-many MR-label relationships |
| `mr_assignees` | Many-to-many MR-assignee relationships |
| `mr_reviewers` | Many-to-many MR-reviewer relationships |
| `mr_file_changes` | Files touched by each MR (path, change type, renames) |
| `discussions` | Issue/MR discussion threads |
| `notes` | Individual notes within discussions (with system note flag and DiffNote position data) |
| `resource_state_events` | Issue/MR state change history (opened, closed, merged, reopened) |
@@ -545,6 +549,42 @@ Data is stored in SQLite with WAL mode and foreign keys enabled. Main tables:
The database is stored at `~/.local/share/lore/lore.db` by default (XDG compliant).
## Timeline Pipeline
The timeline pipeline reconstructs chronological event histories for GitLab entities by combining full-text search, cross-reference graph traversal, and resource event aggregation. Given a search query, it identifies relevant issues and MRs, discovers related entities through their reference graph, and assembles a unified, time-ordered event stream.
### Stages
The pipeline executes in five stages:
1. **SEED** -- Full-text search identifies the most relevant issues and MRs matching the query. Documents (issue bodies, MR descriptions, discussion notes) are ranked by BM25 relevance.
2. **HYDRATE** -- Evidence notes are extracted from the seed results: the top FTS-matched discussion notes with 200-character snippets that explain *why* each entity was surfaced.
3. **EXPAND** -- Breadth-first traversal over the `entity_references` graph discovers related entities. Starting from seed entities, the pipeline follows "closes", "related", and optionally "mentioned" references up to a configurable depth, tracking provenance (which entity referenced which, via what method).
4. **COLLECT** -- Events are gathered for all discovered entities (seeds + expanded). Event types include: creation, state changes, label adds/removes, milestone assignments, merge events, and evidence notes. Events are sorted chronologically with stable tiebreaking (timestamp, then entity ID, then event type).
5. **RENDER** -- Events are formatted for output as human-readable text or structured JSON.
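The COLLECT ordering above (timestamp, then entity ID, then event type) amounts to a chained comparator; a toy sketch with illustrative types, not the crate's event model:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Ev {
    ts: i64,        // event timestamp (ms)
    entity_id: i64, // first tiebreak
    kind: u8,       // second tiebreak: event-type discriminant
}

// Stable chronological sort with the tiebreak chain described above.
fn sort_events(events: &mut [Ev]) {
    events.sort_by(|a, b| {
        a.ts.cmp(&b.ts)
            .then(a.entity_id.cmp(&b.entity_id))
            .then(a.kind.cmp(&b.kind))
    });
}

fn main() {
    let mut evs = vec![
        Ev { ts: 2000, entity_id: 1, kind: 0 },
        Ev { ts: 1000, entity_id: 2, kind: 1 },
        Ev { ts: 1000, entity_id: 2, kind: 0 },
    ];
    sort_events(&mut evs);
    assert_eq!(evs[0], Ev { ts: 1000, entity_id: 2, kind: 0 });
    assert_eq!(evs[2].ts, 2000);
}
```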
### Event Types
| Event | Description |
|-------|-------------|
| `Created` | Entity creation |
| `StateChanged` | State transitions (opened, closed, reopened) |
| `LabelAdded` | Label applied to entity |
| `LabelRemoved` | Label removed from entity |
| `MilestoneSet` | Milestone assigned |
| `MilestoneRemoved` | Milestone removed |
| `Merged` | MR merged (deduplicated against state events) |
| `NoteEvidence` | Discussion note matched by FTS, with snippet |
| `CrossReferenced` | Reference to another entity |
### Unresolved References
When the graph expansion encounters cross-project references to entities not yet synced locally, these are collected as unresolved references in the pipeline output. This enables discovery of external dependencies and can inform future sync targets.
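Deriving sync candidates from unresolved references might look like the following toy sketch; field and function names are illustrative, mirroring but not identical to the crate's types:

```rust
struct UnresolvedRef {
    target_project: Option<String>, // None when the project path is unknown
    target_iid: Option<i64>,        // None when no IID could be parsed
}

// Collect the distinct external projects referenced but not yet synced.
fn sync_candidates(refs: &[UnresolvedRef]) -> Vec<&str> {
    let mut projects: Vec<&str> = refs
        .iter()
        .filter_map(|r| r.target_project.as_deref())
        .collect();
    projects.sort();
    projects.dedup();
    projects
}

fn main() {
    let refs = vec![
        UnresolvedRef { target_project: Some("other/repo".to_owned()), target_iid: Some(42) },
        UnresolvedRef { target_project: Some("other/repo".to_owned()), target_iid: None },
        UnresolvedRef { target_project: None, target_iid: Some(7) },
    ];
    assert_eq!(sync_candidates(&refs), vec!["other/repo"]);
    // References without an IID are still useful as project-level targets.
    let unknown_iids = refs.iter().filter(|r| r.target_iid.is_none()).count();
    assert_eq!(unknown_iids, 1);
}
```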
## Development
```bash

View File

@@ -55,6 +55,13 @@ const MIGRATIONS: &[(&str, &str)] = &[
];
pub fn create_connection(db_path: &Path) -> Result<Connection> {
// SAFETY: `sqlite3_vec_init` is an extern "C" function provided by the sqlite-vec
// crate with the exact signature expected by `sqlite3_auto_extension`. The transmute
// converts the concrete function pointer to the `Option<unsafe extern "C" fn()>` type
// that the FFI expects. This is safe because:
// 1. The function is a C-ABI init callback with a stable signature.
// 2. SQLite calls it once per new connection, matching sqlite-vec's contract.
// 3. `sqlite3_auto_extension` is idempotent for the same function pointer.
#[allow(clippy::missing_transmute_annotations)]
unsafe {
rusqlite::ffi::sqlite3_auto_extension(Some(std::mem::transmute(

View File

@@ -1,7 +1,10 @@
use std::cmp::Ordering;
use rusqlite::Connection;
use serde::Serialize;
use super::error::Result;
/// The core timeline event. All pipeline stages produce or consume these.
/// Spec ref: Section 3.3 "Event Model"
#[derive(Debug, Clone, Serialize)]
@@ -121,7 +124,7 @@ pub struct UnresolvedRef {
pub source: EntityRef,
pub target_project: Option<String>,
pub target_type: String,
-pub target_iid: i64,
+pub target_iid: Option<i64>,
pub reference_type: String,
}
@@ -135,6 +138,45 @@ pub struct TimelineResult {
pub unresolved_references: Vec<UnresolvedRef>,
}
/// Resolve an entity's internal DB id to a full [`EntityRef`] with iid and project path.
///
/// When `project_id` is `Some`, the query is scoped to that project.
/// Returns `Ok(None)` for unknown entity types or when no matching row exists.
pub fn resolve_entity_ref(
conn: &Connection,
entity_type: &str,
entity_id: i64,
project_id: Option<i64>,
) -> Result<Option<EntityRef>> {
let table = match entity_type {
"issue" => "issues",
"merge_request" => "merge_requests",
_ => return Ok(None),
};
let sql = format!(
"SELECT e.iid, p.path_with_namespace
FROM {table} e
JOIN projects p ON p.id = e.project_id
WHERE e.id = ?1 AND (?2 IS NULL OR e.project_id = ?2)"
);
let result = conn.query_row(&sql, rusqlite::params![entity_id, project_id], |row| {
Ok((row.get::<_, i64>(0)?, row.get::<_, String>(1)?))
});
match result {
Ok((iid, project_path)) => Ok(Some(EntityRef {
entity_type: entity_type.to_owned(),
entity_id,
entity_iid: iid,
project_path,
})),
Err(rusqlite::Error::QueryReturnedNoRows) => Ok(None),
Err(e) => Err(e.into()),
}
}
#[cfg(test)]
mod tests {
use super::*;

View File

@@ -1,6 +1,6 @@
use rusqlite::Connection;
-use crate::core::error::Result;
+use crate::core::error::{LoreError, Result};
use crate::core::timeline::{EntityRef, ExpandedEntityRef, TimelineEvent, TimelineEventType};
/// Collect all events for seed and expanded entities, interleave chronologically.
@@ -118,7 +118,7 @@ fn collect_state_events(
is_seed: bool,
events: &mut Vec<TimelineEvent>,
) -> Result<()> {
-let (id_col, id_val) = entity_id_column(entity);
+let (id_col, id_val) = entity_id_column(entity)?;
let sql = format!(
"SELECT state, actor_username, created_at FROM resource_state_events
@@ -169,7 +169,7 @@ fn collect_label_events(
is_seed: bool,
events: &mut Vec<TimelineEvent>,
) -> Result<()> {
-let (id_col, id_val) = entity_id_column(entity);
+let (id_col, id_val) = entity_id_column(entity)?;
let sql = format!(
"SELECT action, label_name, actor_username, created_at FROM resource_label_events
@@ -231,7 +231,7 @@ fn collect_milestone_events(
is_seed: bool,
events: &mut Vec<TimelineEvent>,
) -> Result<()> {
-let (id_col, id_val) = entity_id_column(entity);
+let (id_col, id_val) = entity_id_column(entity)?;
let sql = format!(
"SELECT action, milestone_title, actor_username, created_at FROM resource_milestone_events
@@ -311,20 +311,25 @@ fn collect_merged_event(
},
);
-if let Ok((Some(merged_at), merge_user, url)) = mr_result {
-    events.push(TimelineEvent {
-        timestamp: merged_at,
-        entity_type: entity.entity_type.clone(),
-        entity_id: entity.entity_id,
-        entity_iid: entity.entity_iid,
-        project_path: entity.project_path.clone(),
-        event_type: TimelineEventType::Merged,
-        summary: format!("MR !{} merged", entity.entity_iid),
-        actor: merge_user,
-        url,
-        is_seed,
-    });
-    return Ok(());
-}
+match mr_result {
+    Ok((Some(merged_at), merge_user, url)) => {
+        events.push(TimelineEvent {
+            timestamp: merged_at,
+            entity_type: entity.entity_type.clone(),
+            entity_id: entity.entity_id,
+            entity_iid: entity.entity_iid,
+            project_path: entity.project_path.clone(),
+            event_type: TimelineEventType::Merged,
+            summary: format!("MR !{} merged", entity.entity_iid),
+            actor: merge_user,
+            url,
+            is_seed,
+        });
+        return Ok(());
+    }
+    Ok((None, _, _)) => {} // merged_at is NULL, try fallback
+    Err(rusqlite::Error::QueryReturnedNoRows) => {} // entity not found, try fallback
+    Err(e) => return Err(e.into()),
+}
// Fallback: check resource_state_events for state='merged'
@@ -336,30 +341,37 @@ fn collect_merged_event(
|row| Ok((row.get::<_, Option<String>>(0)?, row.get::<_, i64>(1)?)),
);
-if let Ok((actor, created_at)) = fallback_result {
-    events.push(TimelineEvent {
-        timestamp: created_at,
-        entity_type: entity.entity_type.clone(),
-        entity_id: entity.entity_id,
-        entity_iid: entity.entity_iid,
-        project_path: entity.project_path.clone(),
-        event_type: TimelineEventType::Merged,
-        summary: format!("MR !{} merged", entity.entity_iid),
-        actor,
-        url: None,
-        is_seed,
-    });
-}
+match fallback_result {
+    Ok((actor, created_at)) => {
+        events.push(TimelineEvent {
+            timestamp: created_at,
+            entity_type: entity.entity_type.clone(),
+            entity_id: entity.entity_id,
+            entity_iid: entity.entity_iid,
+            project_path: entity.project_path.clone(),
+            event_type: TimelineEventType::Merged,
+            summary: format!("MR !{} merged", entity.entity_iid),
+            actor,
+            url: None,
+            is_seed,
+        });
+    }
+    Err(rusqlite::Error::QueryReturnedNoRows) => {} // no merged state event, MR wasn't merged
+    Err(e) => return Err(e.into()),
+}
Ok(())
}
/// Return the correct column name and value for querying resource event tables.
-fn entity_id_column(entity: &EntityRef) -> (&'static str, i64) {
-    match entity.entity_type.as_str() {
-        "issue" => ("issue_id", entity.entity_id),
-        "merge_request" => ("merge_request_id", entity.entity_id),
-        _ => ("issue_id", entity.entity_id), // shouldn't happen
-    }
-}
+fn entity_id_column(entity: &EntityRef) -> Result<(&'static str, i64)> {
+    match entity.entity_type.as_str() {
+        "issue" => Ok(("issue_id", entity.entity_id)),
+        "merge_request" => Ok(("merge_request_id", entity.entity_id)),
+        _ => Err(LoreError::Other(format!(
+            "Unknown entity type for event collection: {}",
+            entity.entity_type
+        ))),
+    }
+}

View File

@@ -3,7 +3,7 @@ use std::collections::{HashSet, VecDeque};
use rusqlite::Connection;
use crate::core::error::Result;
-use crate::core::timeline::{EntityRef, ExpandedEntityRef, UnresolvedRef};
+use crate::core::timeline::{EntityRef, ExpandedEntityRef, UnresolvedRef, resolve_entity_ref};
/// Result of the expand phase.
pub struct ExpandResult {
@@ -167,7 +167,7 @@ fn find_outgoing(
match target_id {
Some(tid) => {
-if let Some(resolved) = resolve_entity_ref(conn, &target_type, tid)? {
+if let Some(resolved) = resolve_entity_ref(conn, &target_type, tid, None)? {
neighbors.push(Neighbor::Resolved {
entity_ref: resolved,
reference_type: ref_type,
@@ -180,7 +180,7 @@ fn find_outgoing(
source: entity.clone(),
target_project: target_project_path,
target_type,
-target_iid: target_iid.unwrap_or(0),
+target_iid,
reference_type: ref_type,
}));
}
@@ -235,7 +235,7 @@ fn find_incoming(
for row_result in rows {
let (source_type, source_id, ref_type, source_method) = row_result?;
-if let Some(resolved) = resolve_entity_ref(conn, &source_type, source_id)? {
+if let Some(resolved) = resolve_entity_ref(conn, &source_type, source_id, None)? {
neighbors.push(Neighbor::Resolved {
entity_ref: resolved,
reference_type: ref_type,
@@ -247,41 +247,6 @@ fn find_incoming(
Ok(())
}
/// Resolve an entity ID to a full EntityRef with iid and project_path.
fn resolve_entity_ref(
conn: &Connection,
entity_type: &str,
entity_id: i64,
) -> Result<Option<EntityRef>> {
let table = match entity_type {
"issue" => "issues",
"merge_request" => "merge_requests",
_ => return Ok(None),
};
let sql = format!(
"SELECT e.iid, p.path_with_namespace
FROM {table} e
JOIN projects p ON p.id = e.project_id
WHERE e.id = ?1"
);
let result = conn.query_row(&sql, rusqlite::params![entity_id], |row| {
Ok((row.get::<_, i64>(0)?, row.get::<_, String>(1)?))
});
match result {
Ok((iid, project_path)) => Ok(Some(EntityRef {
entity_type: entity_type.to_owned(),
entity_id,
entity_iid: iid,
project_path,
})),
Err(rusqlite::Error::QueryReturnedNoRows) => Ok(None),
Err(e) => Err(e.into()),
}
}
#[cfg(test)]
mod tests {
use super::*;
@@ -515,7 +480,7 @@ mod tests {
result.unresolved_references[0].target_project,
Some("other/repo".to_owned())
);
-assert_eq!(result.unresolved_references[0].target_iid, 42);
+assert_eq!(result.unresolved_references[0].target_iid, Some(42));
}
#[test]

View File

@@ -1,9 +1,10 @@
use std::collections::HashSet;
use rusqlite::Connection;
use tracing::debug;
use crate::core::error::Result;
-use crate::core::timeline::{EntityRef, TimelineEvent, TimelineEventType};
+use crate::core::timeline::{EntityRef, TimelineEvent, TimelineEventType, resolve_entity_ref};
use crate::search::{FtsQueryMode, to_fts_query};
/// Result of the seed + hydrate phases.
@@ -67,7 +68,12 @@ fn find_seed_entities(
let mut stmt = conn.prepare(sql)?;
let rows = stmt.query_map(
-rusqlite::params![fts_query, project_id, since_ms, (max_seeds * 3) as i64],
+rusqlite::params![
+    fts_query,
+    project_id,
+    since_ms,
+    max_seeds.saturating_mul(3) as i64
+],
|row| {
Ok((
row.get::<_, String>(0)?,
@@ -105,7 +111,8 @@ fn find_seed_entities(
continue;
}
-if let Some(entity_ref) = resolve_entity(conn, &entity_type, entity_id, proj_id)? {
+if let Some(entity_ref) = resolve_entity_ref(conn, &entity_type, entity_id, Some(proj_id))?
+{
entities.push(entity_ref);
}
@@ -117,42 +124,6 @@ fn find_seed_entities(
Ok(entities)
}
/// Resolve an entity ID to a full EntityRef with iid and project_path.
fn resolve_entity(
conn: &Connection,
entity_type: &str,
entity_id: i64,
project_id: i64,
) -> Result<Option<EntityRef>> {
let (table, id_col) = match entity_type {
"issue" => ("issues", "id"),
"merge_request" => ("merge_requests", "id"),
_ => return Ok(None),
};
let sql = format!(
"SELECT e.iid, p.path_with_namespace
FROM {table} e
JOIN projects p ON p.id = e.project_id
WHERE e.{id_col} = ?1 AND e.project_id = ?2"
);
let result = conn.query_row(&sql, rusqlite::params![entity_id, project_id], |row| {
Ok((row.get::<_, i64>(0)?, row.get::<_, String>(1)?))
});
match result {
Ok((iid, project_path)) => Ok(Some(EntityRef {
entity_type: entity_type.to_owned(),
entity_id,
entity_iid: iid,
project_path,
})),
Err(rusqlite::Error::QueryReturnedNoRows) => Ok(None),
Err(e) => Err(e.into()),
}
}
/// Find evidence notes: FTS5-matched discussion notes that provide context.
fn find_evidence_notes(
conn: &Connection,
@@ -211,10 +182,18 @@ fn find_evidence_notes(
let snippet = truncate_to_chars(body.as_deref().unwrap_or(""), 200);
-let entity_ref = resolve_entity(conn, &parent_type, parent_entity_id, proj_id)?;
+let entity_ref = resolve_entity_ref(conn, &parent_type, parent_entity_id, Some(proj_id))?;
let (iid, project_path) = match entity_ref {
Some(ref e) => (e.entity_iid, e.project_path.clone()),
-None => continue,
+None => {
+    debug!(
+        parent_type,
+        parent_entity_id,
+        proj_id,
+        "Skipping evidence note: parent entity not found (orphaned discussion)"
+    );
+    continue;
+}
};
events.push(TimelineEvent {

View File

@@ -0,0 +1,354 @@
use lore::core::db::{create_connection, run_migrations};
use lore::core::timeline::{TimelineEventType, resolve_entity_ref};
use lore::core::timeline_collect::collect_events;
use lore::core::timeline_expand::expand_timeline;
use lore::core::timeline_seed::seed_timeline;
use rusqlite::Connection;
use std::path::Path;
fn setup_db() -> Connection {
let conn = create_connection(Path::new(":memory:")).unwrap();
run_migrations(&conn).unwrap();
conn
}
fn insert_project(conn: &Connection, path: &str) -> i64 {
conn.execute(
"INSERT INTO projects (gitlab_project_id, path_with_namespace, web_url) VALUES (?1, ?2, ?3)",
rusqlite::params![1, path, format!("https://gitlab.com/{path}")],
)
.unwrap();
conn.last_insert_rowid()
}
fn insert_issue(conn: &Connection, project_id: i64, iid: i64, title: &str) -> i64 {
conn.execute(
"INSERT INTO issues (gitlab_id, project_id, iid, title, state, author_username, created_at, updated_at, last_seen_at, web_url) VALUES (?1, ?2, ?3, ?4, 'opened', 'alice', 1000, 2000, 3000, ?5)",
rusqlite::params![iid * 100, project_id, iid, title, format!("https://gitlab.com/g/p/-/issues/{iid}")],
)
.unwrap();
conn.last_insert_rowid()
}
fn insert_mr(
conn: &Connection,
project_id: i64,
iid: i64,
title: &str,
merged_at: Option<i64>,
) -> i64 {
conn.execute(
"INSERT INTO merge_requests (gitlab_id, project_id, iid, title, state, author_username, created_at, updated_at, last_seen_at, merged_at, merge_user_username, web_url) VALUES (?1, ?2, ?3, ?4, 'merged', 'bob', 1500, 5000, 6000, ?5, 'charlie', ?6)",
rusqlite::params![iid * 100, project_id, iid, title, merged_at, format!("https://gitlab.com/g/p/-/merge_requests/{iid}")],
)
.unwrap();
conn.last_insert_rowid()
}
fn insert_document(
conn: &Connection,
source_type: &str,
source_id: i64,
project_id: i64,
content: &str,
) {
conn.execute(
"INSERT INTO documents (source_type, source_id, project_id, content_text, content_hash) VALUES (?1, ?2, ?3, ?4, ?5)",
rusqlite::params![source_type, source_id, project_id, content, format!("hash_{source_id}")],
)
.unwrap();
}
fn insert_entity_ref(
conn: &Connection,
project_id: i64,
source_type: &str,
source_id: i64,
target_type: &str,
target_id: Option<i64>,
ref_type: &str,
) {
conn.execute(
"INSERT INTO entity_references (project_id, source_entity_type, source_entity_id, target_entity_type, target_entity_id, reference_type, source_method, created_at) VALUES (?1, ?2, ?3, ?4, ?5, ?6, 'api', 1000)",
rusqlite::params![project_id, source_type, source_id, target_type, target_id, ref_type],
)
.unwrap();
}
fn insert_state_event(
conn: &Connection,
project_id: i64,
issue_id: Option<i64>,
mr_id: Option<i64>,
state: &str,
created_at: i64,
) {
let gitlab_id: i64 = rand::random::<u32>().into();
conn.execute(
"INSERT INTO resource_state_events (gitlab_id, project_id, issue_id, merge_request_id, state, actor_username, created_at) VALUES (?1, ?2, ?3, ?4, ?5, 'alice', ?6)",
rusqlite::params![gitlab_id, project_id, issue_id, mr_id, state, created_at],
)
.unwrap();
}
fn insert_label_event(
conn: &Connection,
project_id: i64,
issue_id: Option<i64>,
label: &str,
created_at: i64,
) {
let gitlab_id: i64 = rand::random::<u32>().into();
conn.execute(
"INSERT INTO resource_label_events (gitlab_id, project_id, issue_id, merge_request_id, action, label_name, actor_username, created_at) VALUES (?1, ?2, ?3, NULL, 'add', ?4, 'alice', ?5)",
rusqlite::params![gitlab_id, project_id, issue_id, label, created_at],
)
.unwrap();
}
/// Full pipeline: seed -> expand -> collect for a scenario with an issue
/// that has a closing MR, state changes, and label events.
#[test]
fn pipeline_seed_expand_collect_end_to_end() {
let conn = setup_db();
let project_id = insert_project(&conn, "group/project");
// Issue #5: "authentication error in login"
let issue_id = insert_issue(&conn, project_id, 5, "Authentication error in login");
insert_document(
&conn,
"issue",
issue_id,
project_id,
"authentication error in login flow causing 401",
);
// MR !10 closes issue #5
let mr_id = insert_mr(&conn, project_id, 10, "Fix auth bug", Some(4000));
insert_document(
&conn,
"merge_request",
mr_id,
project_id,
"fix authentication error in login module",
);
insert_entity_ref(
&conn,
project_id,
"merge_request",
mr_id,
"issue",
Some(issue_id),
"closes",
);
// State changes on issue
insert_state_event(&conn, project_id, Some(issue_id), None, "closed", 3000);
// Label added to issue
insert_label_event(&conn, project_id, Some(issue_id), "bug", 1500);
// SEED: find entities matching "authentication"
let seed_result = seed_timeline(&conn, "authentication", None, None, 50, 10).unwrap();
assert!(
!seed_result.seed_entities.is_empty(),
"Seed should find at least one entity"
);
// Verify seeds contain the issue
let has_issue = seed_result
.seed_entities
.iter()
.any(|e| e.entity_type == "issue" && e.entity_iid == 5);
assert!(has_issue, "Seeds should include issue #5");
// EXPAND: discover related entities (MR !10 via closes reference)
let expand_result = expand_timeline(&conn, &seed_result.seed_entities, 1, false, 100).unwrap();
// The MR should appear as an expanded entity (or as a seed if it was also matched)
let total_entities = seed_result.seed_entities.len() + expand_result.expanded_entities.len();
assert!(total_entities >= 2, "Should have at least issue + MR");
// COLLECT: gather all events
let events = collect_events(
&conn,
&seed_result.seed_entities,
&expand_result.expanded_entities,
&seed_result.evidence_notes,
None,
1000,
)
.unwrap();
assert!(!events.is_empty(), "Should have events");
// Verify chronological ordering
for window in events.windows(2) {
assert!(
window[0].timestamp <= window[1].timestamp,
"Events must be chronologically sorted: {} > {}",
window[0].timestamp,
window[1].timestamp
);
}
// Verify expected event types are present
let has_created = events
.iter()
.any(|e| matches!(e.event_type, TimelineEventType::Created));
let has_state_change = events
.iter()
.any(|e| matches!(e.event_type, TimelineEventType::StateChanged { .. }));
let has_label = events
.iter()
.any(|e| matches!(e.event_type, TimelineEventType::LabelAdded { .. }));
let has_merged = events
.iter()
.any(|e| matches!(e.event_type, TimelineEventType::Merged));
assert!(has_created, "Should have Created events");
assert!(has_state_change, "Should have StateChanged events");
assert!(has_label, "Should have LabelAdded events");
assert!(has_merged, "Should have Merged event from MR");
}

/// Verify the pipeline handles an empty FTS result gracefully.
#[test]
fn pipeline_empty_query_produces_empty_result() {
let conn = setup_db();
let _project_id = insert_project(&conn, "group/project");
let seed_result = seed_timeline(&conn, "", None, None, 50, 10).unwrap();
assert!(seed_result.seed_entities.is_empty());
let expand_result = expand_timeline(&conn, &seed_result.seed_entities, 1, false, 100).unwrap();
assert!(expand_result.expanded_entities.is_empty());
let events = collect_events(
&conn,
&seed_result.seed_entities,
&expand_result.expanded_entities,
&seed_result.evidence_notes,
None,
1000,
)
.unwrap();
assert!(events.is_empty());
}

/// Verify the `since` filter propagates through the full pipeline.
#[test]
fn pipeline_since_filter_excludes_old_events() {
let conn = setup_db();
let project_id = insert_project(&conn, "group/project");
let issue_id = insert_issue(&conn, project_id, 1, "Deploy failure");
insert_document(
&conn,
"issue",
issue_id,
project_id,
"deploy failure in staging environment",
);
// Old state change at 2000, recent state change at 8000
insert_state_event(&conn, project_id, Some(issue_id), None, "closed", 2000);
insert_state_event(&conn, project_id, Some(issue_id), None, "reopened", 8000);
let seed_result = seed_timeline(&conn, "deploy", None, None, 50, 10).unwrap();
let expand_result = expand_timeline(&conn, &seed_result.seed_entities, 0, false, 100).unwrap();
// Collect with since=5000: should exclude Created(1000) and closed(2000)
let events = collect_events(
&conn,
&seed_result.seed_entities,
&expand_result.expanded_entities,
&seed_result.evidence_notes,
Some(5000),
1000,
)
.unwrap();
assert_eq!(events.len(), 1, "Only the reopened event should survive");
assert_eq!(events[0].timestamp, 8000);
}

/// Verify unresolved references use `Option<i64>` for `target_iid`.
#[test]
fn pipeline_unresolved_refs_have_optional_iid() {
let conn = setup_db();
let project_id = insert_project(&conn, "group/project");
let issue_id = insert_issue(&conn, project_id, 1, "Cross-project reference");
insert_document(
&conn,
"issue",
issue_id,
project_id,
"cross project reference test",
);
// Unresolved reference with known iid
conn.execute(
    "INSERT INTO entity_references (
         project_id, source_entity_type, source_entity_id,
         target_entity_type, target_entity_id, target_project_path,
         target_entity_iid, reference_type, source_method, created_at
     ) VALUES (?1, 'issue', ?2, 'issue', NULL, 'other/repo', 42, 'closes', 'description_parse', 1000)",
    rusqlite::params![project_id, issue_id],
)
.unwrap();
// Unresolved reference with NULL iid
conn.execute(
    "INSERT INTO entity_references (
         project_id, source_entity_type, source_entity_id,
         target_entity_type, target_entity_id, target_project_path,
         target_entity_iid, reference_type, source_method, created_at
     ) VALUES (?1, 'issue', ?2, 'merge_request', NULL, 'other/repo', NULL, 'related', 'note_parse', 1000)",
    rusqlite::params![project_id, issue_id],
)
.unwrap();
let seed_result = seed_timeline(&conn, "cross project", None, None, 50, 10).unwrap();
let expand_result = expand_timeline(&conn, &seed_result.seed_entities, 1, false, 100).unwrap();
assert_eq!(expand_result.unresolved_references.len(), 2);
let with_iid = expand_result
.unresolved_references
.iter()
.find(|r| r.target_type == "issue")
.unwrap();
assert_eq!(with_iid.target_iid, Some(42));
let without_iid = expand_result
.unresolved_references
.iter()
.find(|r| r.target_type == "merge_request")
.unwrap();
assert_eq!(without_iid.target_iid, None);
}

/// Verify the shared `resolve_entity_ref` works with and without project scoping.
#[test]
fn shared_resolve_entity_ref_scoping() {
let conn = setup_db();
let project_id = insert_project(&conn, "group/project");
let issue_id = insert_issue(&conn, project_id, 42, "Test issue");
// Resolve with project filter
let result = resolve_entity_ref(&conn, "issue", issue_id, Some(project_id)).unwrap();
assert!(result.is_some());
let entity = result.unwrap();
assert_eq!(entity.entity_iid, 42);
assert_eq!(entity.project_path, "group/project");
// Resolve without project filter
let result = resolve_entity_ref(&conn, "issue", issue_id, None).unwrap();
assert!(result.is_some());
// Resolve with wrong project filter
let result = resolve_entity_ref(&conn, "issue", issue_id, Some(9999)).unwrap();
assert!(result.is_none());
// Resolve unknown entity type
let result = resolve_entity_ref(&conn, "unknown_type", issue_id, None).unwrap();
assert!(result.is_none());
// Resolve nonexistent entity
let result = resolve_entity_ref(&conn, "issue", 99999, None).unwrap();
assert!(result.is_none());
}
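
// A further sketch, assuming the same helpers and pipeline API exercised above
// (`setup_db`, `insert_*`, `seed_timeline`, `expand_timeline`). The since-filter
// test passes depth 0 to `expand_timeline`, which implies depth 0 performs no
// graph traversal; this test makes that assumption explicit. The assertion is a
// hypothesis about the intended contract, not behavior documented elsewhere.

/// Verify that an expansion depth of 0 discovers nothing beyond the seeds,
/// even when an `entity_references` edge exists.
#[test]
fn pipeline_expand_depth_zero_yields_no_expansion() {
    let conn = setup_db();
    let project_id = insert_project(&conn, "group/project");
    let issue_id = insert_issue(&conn, project_id, 1, "Depth zero");
    insert_document(
        &conn,
        "issue",
        issue_id,
        project_id,
        "depth zero expansion test",
    );
    // An MR that closes the issue; with depth > 0 this edge would be followed.
    let mr_id = insert_mr(&conn, project_id, 2, "Related MR", None);
    insert_entity_ref(
        &conn,
        project_id,
        "merge_request",
        mr_id,
        "issue",
        Some(issue_id),
        "closes",
    );
    let seed_result = seed_timeline(&conn, "depth zero", None, None, 50, 10).unwrap();
    assert!(
        !seed_result.seed_entities.is_empty(),
        "Seed should match the issue document"
    );
    // With depth 0, the entity_references graph should never be walked.
    let expand_result = expand_timeline(&conn, &seed_result.seed_entities, 0, false, 100).unwrap();
    assert!(expand_result.expanded_entities.is_empty());
}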