Compare commits
2 commits: master...b54d9e7324

| Author | SHA1 | Date |
|---|---|---|
| | b54d9e7324 | |
| | c2bdc01eac | |

File diff suppressed because one or more lines are too long
@@ -1 +1 @@
-bd-1cjx
+bd-1ksf

AGENTS.md (56 lines changed)
@@ -16,31 +16,42 @@ If I tell you to do something, even if it goes against what follows below, YOU M

## Version Control: jj-First (CRITICAL)

**ALWAYS prefer jj (Jujutsu) over git for all VCS operations.** This is a colocated repo with both `.jj/` and `.git/`. When instructed to use git by anything — even later in this file — use the best jj replacement commands instead. Only fall back to raw `git` for things jj cannot do (hooks, LFS, submodules, `gh` CLI interop).

**ALWAYS prefer jj (Jujutsu) over git for VCS mutations** (commit, describe, rebase, push, bookmark, undo). This is a colocated repo with both `.jj/` and `.git/`. Only fall back to raw `git` for things jj cannot do (hooks, LFS, submodules, `gh` CLI interop).

**Exception — read-only inspection:** Use `git status`, `git diff`, `git log` instead of their jj equivalents. In a colocated repo these see accurate data, and unlike jj, they don't create operations that cause divergences when multiple agents run concurrently. See "Parallel Agent VCS Protocol" below.

See `~/.claude/rules/jj-vcs/` for the full command reference, translation table, revsets, patterns, and recovery recipes.

### Multi-Agent VCS Protocol (CRITICAL)

### Parallel Agent VCS Protocol (CRITICAL)

**In a multi-agent session, ONLY THE TEAM LEAD performs jj/git operations.** Worker agents MUST NEVER run `jj` or `git` commands.

Multiple agents often run concurrently in separate terminal panes, sharing the same repo directory. This requires care because jj's auto-snapshot creates operations on EVERY command — even read-only ones like `jj status`. Concurrent jj commands fork from the same parent operation and create **divergent changes**.

**Why:** jj has a single working copy (`@`) per workspace. Every `jj` command — even read-only ones like `jj status` — triggers a working copy snapshot recorded as an operation. When two agents run `jj` commands concurrently, both operations fork from the same parent operation and both rewrite `@`. jj detects this as a **divergent change**: same change ID, two different commits. Resolving divergences requires manual intervention and risks losing work.

**The rule: use git for reads, jj for writes.**

**Rules for worker agents:**

In a colocated repo, git reads see accurate data because jj keeps `.git/` in sync.

- Edit files only via Edit/Write tools — NEVER run `jj`, `git`, or any shell command that triggers jj
- If you need VCS info (status, diff, log), message the team lead
- Do NOT run "Landing the Plane" — the lead handles all VCS for the team
- Treat all file changes on disk as your own (other agents' edits are normal)

| Operation | Use | Why |
|-----------|-----|-----|
| Check status | `git status` | No jj operation created |
| View diff | `git diff` | No jj operation created |
| Browse history | `git log` | No jj operation created |
| Commit work | `jj commit -m "msg"` | jj mutation (better UX) |
| Update description | `jj describe -m "msg"` | jj mutation |
| Rebase | `jj rebase -d trunk()` | jj mutation |
| Push | `jj git push -b <name>` | jj mutation |
| Manage bookmarks | `jj bookmark set ...` | jj mutation |
| Undo a mistake | `jj undo` | jj mutation |

**Rules for the team lead:**

**NEVER run `jj status`, `jj diff`, `jj log`, or `jj show` when other agents may be active** — these trigger snapshots that cause divergences.

- You are the sole VCS operator — all commits, pushes, and rebases go through you
- Run `jj status` / `jj diff` to review all agents' work before committing
- Use `jj split` to separate different agents' work into distinct commits if needed
- Follow "Landing the Plane" when ending the session

**If using Claude Code's built-in agent teams:** Only the team lead runs ANY VCS commands (git or jj). Workers only edit files via Edit/Write tools and do NOT run "Landing the Plane".

**Solo sessions:** When you are the only agent, you handle VCS yourself normally.

**Resolving divergences if they occur:**

```bash
jj log -r 'divergent()'          # Find divergent changes
jj abandon <unwanted-commit-id>  # Keep the version you want
```

---
@@ -776,6 +787,21 @@ lore -J mrs --fields iid,title,state,draft,labels # Custom field list

- Use `lore robot-docs` for response schema discovery
- The `-p` flag supports fuzzy project matching (suffix and substring)

---

## Read/Write Split: lore vs glab

| Operation | Tool | Why |
|-----------|------|-----|
| List issues/MRs | lore | Richer: includes status, discussions, closing MRs |
| View issue/MR detail | lore | Pre-joined discussions, work-item status |
| Search across entities | lore | FTS5 + vector hybrid search |
| Expert/workload analysis | lore | who command — no glab equivalent |
| Timeline reconstruction | lore | Chronological narrative — no glab equivalent |
| Create/update/close | glab | Write operations |
| Approve/merge MR | glab | Write operations |
| CI/CD pipelines | glab | Not in lore scope |
````markdown
## UBS Quick Reference for AI Agents
````

migrations/023_issue_detail_fields.sql (new file, 5 lines)
@@ -0,0 +1,5 @@
ALTER TABLE issues ADD COLUMN closed_at TEXT;
ALTER TABLE issues ADD COLUMN confidential INTEGER NOT NULL DEFAULT 0;

INSERT INTO schema_version (version, applied_at, description)
VALUES (23, strftime('%s', 'now') * 1000, 'Add closed_at and confidential to issues');
@@ -185,6 +185,7 @@ const COMMAND_FLAGS: &[(&str, &[&str])] = &[
            "--no-detail",
        ],
    ),
    ("drift", &["--threshold", "--project"]),
    (
        "init",
        &[
src/cli/commands/drift.rs (new file, 642 lines)
@@ -0,0 +1,642 @@
use std::collections::HashMap;
use std::sync::LazyLock;

use console::style;
use regex::Regex;
use serde::Serialize;

use crate::cli::robot::RobotMeta;
use crate::core::config::Config;
use crate::core::db::create_connection;
use crate::core::error::{LoreError, Result};
use crate::core::paths::get_db_path;
use crate::core::project::resolve_project;
use crate::core::time::ms_to_iso;
use crate::embedding::ollama::{OllamaClient, OllamaConfig};
use crate::embedding::similarity::cosine_similarity;

const BATCH_SIZE: usize = 32;
const WINDOW_SIZE: usize = 3;
const MIN_DESCRIPTION_LEN: usize = 20;
const MAX_NOTES: i64 = 200;
const TOP_TOPICS: usize = 3;
// ---------------------------------------------------------------------------
// Response types
// ---------------------------------------------------------------------------

#[derive(Debug, Serialize)]
pub struct DriftResponse {
    pub entity: DriftEntity,
    pub drift_detected: bool,
    pub threshold: f32,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub drift_point: Option<DriftPoint>,
    pub drift_topics: Vec<String>,
    pub similarity_curve: Vec<SimilarityPoint>,
    pub recommendation: String,
}

#[derive(Debug, Serialize)]
pub struct DriftEntity {
    pub entity_type: String,
    pub iid: i64,
    pub title: String,
}

#[derive(Debug, Serialize)]
pub struct DriftPoint {
    pub note_index: usize,
    pub note_id: i64,
    pub author: String,
    pub created_at: String,
    pub similarity: f32,
}

#[derive(Debug, Serialize)]
pub struct SimilarityPoint {
    pub note_index: usize,
    pub similarity: f32,
    pub author: String,
    pub created_at: String,
}

// ---------------------------------------------------------------------------
// Internal row types
// ---------------------------------------------------------------------------

struct IssueInfo {
    id: i64,
    iid: i64,
    title: String,
    description: Option<String>,
}

struct NoteRow {
    id: i64,
    body: String,
    author_username: String,
    created_at: i64,
}
// ---------------------------------------------------------------------------
// Main entry point
// ---------------------------------------------------------------------------

pub async fn run_drift(
    config: &Config,
    entity_type: &str,
    iid: i64,
    threshold: f32,
    project: Option<&str>,
) -> Result<DriftResponse> {
    if entity_type != "issues" {
        return Err(LoreError::Other(
            "drift currently supports 'issues' only".to_string(),
        ));
    }

    let db_path = get_db_path(config.storage.db_path.as_deref());
    let conn = create_connection(&db_path)?;

    let issue = find_issue(&conn, iid, project)?;

    let description = match &issue.description {
        Some(d) if d.len() >= MIN_DESCRIPTION_LEN => d.clone(),
        _ => {
            return Ok(DriftResponse {
                entity: DriftEntity {
                    entity_type: entity_type.to_string(),
                    iid: issue.iid,
                    title: issue.title,
                },
                drift_detected: false,
                threshold,
                drift_point: None,
                drift_topics: vec![],
                similarity_curve: vec![],
                recommendation: "Description too short for drift analysis.".to_string(),
            });
        }
    };

    let notes = fetch_notes(&conn, issue.id)?;

    if notes.len() < WINDOW_SIZE {
        return Ok(DriftResponse {
            entity: DriftEntity {
                entity_type: entity_type.to_string(),
                iid: issue.iid,
                title: issue.title,
            },
            drift_detected: false,
            threshold,
            drift_point: None,
            drift_topics: vec![],
            similarity_curve: vec![],
            recommendation: format!(
                "Only {} note(s) found; need at least {} for drift detection.",
                notes.len(),
                WINDOW_SIZE
            ),
        });
    }

    // Build texts to embed: description first, then each note body.
    let mut texts: Vec<String> = Vec::with_capacity(1 + notes.len());
    texts.push(description.clone());
    for note in &notes {
        texts.push(note.body.clone());
    }

    let embeddings = embed_texts(config, &texts).await?;

    let desc_embedding = &embeddings[0];
    let note_embeddings = &embeddings[1..];

    // Build similarity curve.
    let similarity_curve: Vec<SimilarityPoint> = note_embeddings
        .iter()
        .enumerate()
        .map(|(i, emb)| SimilarityPoint {
            note_index: i,
            similarity: cosine_similarity(desc_embedding, emb),
            author: notes[i].author_username.clone(),
            created_at: ms_to_iso(notes[i].created_at),
        })
        .collect();

    // Detect drift via sliding window.
    let (drift_detected, drift_point) = detect_drift(&similarity_curve, &notes, threshold);

    // Extract drift topics.
    let drift_topics = if drift_detected {
        let drift_idx = drift_point.as_ref().map_or(0, |dp| dp.note_index);
        extract_drift_topics(&description, &notes, drift_idx)
    } else {
        vec![]
    };

    let recommendation = if drift_detected {
        let dp = drift_point.as_ref().unwrap();
        format!(
            "Discussion drifted at note {} by @{} (similarity {:.2}). Consider splitting into a new issue.",
            dp.note_index, dp.author, dp.similarity
        )
    } else {
        "Discussion remains on topic.".to_string()
    };

    Ok(DriftResponse {
        entity: DriftEntity {
            entity_type: entity_type.to_string(),
            iid: issue.iid,
            title: issue.title,
        },
        drift_detected,
        threshold,
        drift_point,
        drift_topics,
        similarity_curve,
        recommendation,
    })
}
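The similarity curve above is built with `crate::embedding::similarity::cosine_similarity`. As a reference point, here is a minimal std-only sketch of that formula; the real implementation may handle zero-norm vectors or dimension mismatches differently:

```rust
// Std-only sketch of cosine similarity (an assumption about what the
// crate-internal cosine_similarity computes; edge-case handling may differ).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0 // avoid dividing by zero for empty embeddings
    } else {
        dot / (norm_a * norm_b)
    }
}

fn main() {
    // Identical direction gives 1.0; orthogonal vectors give 0.0.
    assert!((cosine_similarity(&[1.0, 2.0], &[1.0, 2.0]) - 1.0).abs() < 1e-6);
    assert!(cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]).abs() < 1e-6);
}
```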
// ---------------------------------------------------------------------------
// DB helpers
// ---------------------------------------------------------------------------

fn find_issue(
    conn: &rusqlite::Connection,
    iid: i64,
    project_filter: Option<&str>,
) -> Result<IssueInfo> {
    let (sql, params): (&str, Vec<Box<dyn rusqlite::ToSql>>) = match project_filter {
        Some(project) => {
            let project_id = resolve_project(conn, project)?;
            (
                "SELECT i.id, i.iid, i.title, i.description
                 FROM issues i
                 WHERE i.iid = ? AND i.project_id = ?",
                vec![Box::new(iid), Box::new(project_id)],
            )
        }
        None => (
            "SELECT i.id, i.iid, i.title, i.description
             FROM issues i
             WHERE i.iid = ?",
            vec![Box::new(iid)],
        ),
    };

    let param_refs: Vec<&dyn rusqlite::ToSql> = params.iter().map(|p| p.as_ref()).collect();

    let mut stmt = conn.prepare(sql)?;
    let rows: Vec<IssueInfo> = stmt
        .query_map(param_refs.as_slice(), |row| {
            Ok(IssueInfo {
                id: row.get(0)?,
                iid: row.get(1)?,
                title: row.get(2)?,
                description: row.get(3)?,
            })
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    match rows.len() {
        0 => Err(LoreError::NotFound(format!("Issue #{iid} not found"))),
        1 => Ok(rows.into_iter().next().unwrap()),
        _ => Err(LoreError::Ambiguous(format!(
            "Issue #{iid} exists in multiple projects. Use --project to specify."
        ))),
    }
}

fn fetch_notes(conn: &rusqlite::Connection, issue_id: i64) -> Result<Vec<NoteRow>> {
    let mut stmt = conn.prepare(
        "SELECT n.id, n.body, n.author_username, n.created_at
         FROM notes n
         JOIN discussions d ON n.discussion_id = d.id
         WHERE d.issue_id = ?
           AND n.is_system = 0
           AND LENGTH(n.body) >= 20
         ORDER BY n.created_at ASC
         LIMIT ?",
    )?;

    let notes: Vec<NoteRow> = stmt
        .query_map(rusqlite::params![issue_id, MAX_NOTES], |row| {
            Ok(NoteRow {
                id: row.get(0)?,
                body: row.get(1)?,
                author_username: row.get(2)?,
                created_at: row.get(3)?,
            })
        })?
        .collect::<std::result::Result<Vec<_>, _>>()?;

    Ok(notes)
}
// ---------------------------------------------------------------------------
// Embedding helper
// ---------------------------------------------------------------------------

async fn embed_texts(config: &Config, texts: &[String]) -> Result<Vec<Vec<f32>>> {
    let ollama = OllamaClient::new(OllamaConfig {
        base_url: config.embedding.base_url.clone(),
        model: config.embedding.model.clone(),
        timeout_secs: 60,
    });

    let mut all_embeddings: Vec<Vec<f32>> = Vec::with_capacity(texts.len());

    for chunk in texts.chunks(BATCH_SIZE) {
        let refs: Vec<&str> = chunk.iter().map(|s| s.as_str()).collect();
        let batch_result = ollama.embed_batch(&refs).await?;
        all_embeddings.extend(batch_result);
    }

    Ok(all_embeddings)
}
// ---------------------------------------------------------------------------
// Drift detection
// ---------------------------------------------------------------------------

fn detect_drift(
    curve: &[SimilarityPoint],
    notes: &[NoteRow],
    threshold: f32,
) -> (bool, Option<DriftPoint>) {
    if curve.len() < WINDOW_SIZE {
        return (false, None);
    }

    for i in 0..=curve.len() - WINDOW_SIZE {
        let window_avg: f32 = curve[i..i + WINDOW_SIZE]
            .iter()
            .map(|p| p.similarity)
            .sum::<f32>()
            / WINDOW_SIZE as f32;

        if window_avg < threshold {
            return (
                true,
                Some(DriftPoint {
                    note_index: i,
                    note_id: notes[i].id,
                    author: notes[i].author_username.clone(),
                    created_at: ms_to_iso(notes[i].created_at),
                    similarity: curve[i].similarity,
                }),
            );
        }
    }

    (false, None)
}
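The core of detect_drift is a sliding-window test: report the first index whose WINDOW_SIZE-point moving average of similarity drops below the threshold. Isolated as a std-only sketch (with the struct fields stripped away):

```rust
// Std-only sketch of the sliding-window test used by detect_drift:
// return the first window-start index whose WINDOW-point moving average
// of similarity falls below `threshold`.
const WINDOW: usize = 3;

fn first_drift(similarities: &[f32], threshold: f32) -> Option<usize> {
    if similarities.len() < WINDOW {
        return None; // too few notes to form a single window
    }
    (0..=similarities.len() - WINDOW).find(|&i| {
        let avg: f32 = similarities[i..i + WINDOW].iter().sum::<f32>() / WINDOW as f32;
        avg < threshold
    })
}

fn main() {
    // Window averages: 0.85, 0.633, 0.417, 0.2; the first below 0.4 starts at index 3.
    let curve = [0.9, 0.85, 0.8, 0.25, 0.2, 0.15];
    assert_eq!(first_drift(&curve, 0.4), Some(3));
    // A consistently on-topic curve never trips the threshold.
    assert_eq!(first_drift(&[0.85, 0.8, 0.75, 0.7], 0.4), None);
}
```

Note that a single off-topic note cannot trigger detection on its own; the average over the window has to fall, which smooths out one-off tangents.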
// ---------------------------------------------------------------------------
// Topic extraction
// ---------------------------------------------------------------------------

static STOPWORDS: LazyLock<std::collections::HashSet<&'static str>> = LazyLock::new(|| {
    [
        "the", "a", "an", "is", "are", "was", "were", "be", "been", "being", "have", "has", "had",
        "do", "does", "did", "will", "would", "could", "should", "may", "might", "shall", "can",
        "need", "dare", "ought", "used", "to", "of", "in", "for", "on", "with", "at", "by", "from",
        "as", "into", "through", "during", "before", "after", "above", "below", "between", "out",
        "off", "over", "under", "again", "further", "then", "once", "here", "there", "when",
        "where", "why", "how", "all", "each", "every", "both", "few", "more", "most", "other",
        "some", "such", "no", "not", "only", "own", "same", "so", "than", "too", "very", "just",
        "because", "but", "and", "or", "if", "while", "about", "up", "it", "its", "this", "that",
        "these", "those", "i", "me", "my", "we", "our", "you", "your", "he", "him", "his", "she",
        "her", "they", "them", "their", "what", "which", "who", "whom", "also", "like", "get",
        "got", "think", "know", "see", "make", "go", "one", "two", "new", "way",
    ]
    .into_iter()
    .collect()
});

fn tokenize(text: &str) -> Vec<String> {
    let cleaned = strip_markdown(text);
    cleaned
        .split(|c: char| !c.is_alphanumeric() && c != '_')
        .filter(|w| w.len() >= 3)
        .map(|w| w.to_lowercase())
        .filter(|w| !STOPWORDS.contains(w.as_str()))
        .collect()
}

fn extract_drift_topics(description: &str, notes: &[NoteRow], drift_idx: usize) -> Vec<String> {
    let desc_terms: std::collections::HashSet<String> = tokenize(description).into_iter().collect();

    let mut freq: HashMap<String, usize> = HashMap::new();
    for note in notes.iter().skip(drift_idx) {
        for term in tokenize(&note.body) {
            if !desc_terms.contains(&term) {
                *freq.entry(term).or_insert(0) += 1;
            }
        }
    }

    let mut sorted: Vec<(String, usize)> = freq.into_iter().collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1));

    sorted
        .into_iter()
        .take(TOP_TOPICS)
        .map(|(t, _)| t)
        .collect()
}
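extract_drift_topics follows a common pattern: count terms that appear after the drift point but not in the description, then keep the most frequent. A self-contained sketch of that pattern, with tokenization reduced to whitespace splitting (the real code also strips markdown and stopwords):

```rust
use std::collections::{HashMap, HashSet};

// Std-only sketch of the "new terms after the drift point" idea:
// count words absent from the description, then keep the most frequent.
// Tokenization is simplified to whitespace splitting for illustration.
fn top_new_terms(description: &str, notes: &[&str], top_n: usize) -> Vec<String> {
    let desc_terms: HashSet<&str> = description.split_whitespace().collect();

    let mut freq: HashMap<&str, usize> = HashMap::new();
    for note in notes {
        for word in note.split_whitespace() {
            if !desc_terms.contains(word) {
                *freq.entry(word).or_insert(0) += 1;
            }
        }
    }

    let mut sorted: Vec<(&str, usize)> = freq.into_iter().collect();
    // Sort by descending count; production code would also want a
    // deterministic tie-break, since HashMap order is unspecified.
    sorted.sort_by(|a, b| b.1.cmp(&a.1));
    sorted.into_iter().take(top_n).map(|(t, _)| t.to_string()).collect()
}

fn main() {
    let topics = top_new_terms(
        "fix login flow",
        &["database migration broken", "database connection pool"],
        1,
    );
    // "database" appears twice and is not in the description.
    assert_eq!(topics, vec!["database".to_string()]);
}
```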
// ---------------------------------------------------------------------------
// Markdown stripping
// ---------------------------------------------------------------------------

static RE_FENCED_CODE: LazyLock<Regex> =
    LazyLock::new(|| Regex::new(r"(?s)```[^\n]*\n.*?```").unwrap());
static RE_INLINE_CODE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"`[^`]+`").unwrap());
static RE_LINK: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"\[([^\]]+)\]\([^)]+\)").unwrap());
static RE_BLOCKQUOTE: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"(?m)^>\s?").unwrap());
static RE_HTML_TAG: LazyLock<Regex> = LazyLock::new(|| Regex::new(r"<[^>]+>").unwrap());

fn strip_markdown(text: &str) -> String {
    let text = RE_FENCED_CODE.replace_all(text, "");
    let text = RE_INLINE_CODE.replace_all(&text, "");
    let text = RE_LINK.replace_all(&text, "$1");
    let text = RE_BLOCKQUOTE.replace_all(&text, "");
    let text = RE_HTML_TAG.replace_all(&text, "");
    text.into_owned()
}
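strip_markdown leans on the regex crate for all five passes. The fenced-code removal it performs can also be approximated with a dependency-free line scan, which is handy for understanding what the `(?s)```...```` pattern is doing (a simplification: the real code additionally strips inline code, links, blockquotes, and HTML tags):

```rust
// Std-only approximation of the fenced-code-block removal in
// strip_markdown: drop every line between (and including) fence lines.
fn strip_fenced_code(text: &str) -> String {
    let mut inside = false;
    let mut kept: Vec<&str> = Vec::new();
    for line in text.lines() {
        if line.trim_start().starts_with("```") {
            inside = !inside; // toggle on every fence line
            continue;
        }
        if !inside {
            kept.push(line);
        }
    }
    kept.join("\n")
}

fn main() {
    let input = "Before\n```rust\nfn main() {}\n```\nAfter";
    let result = strip_fenced_code(input);
    assert!(!result.contains("fn main"));
    assert_eq!(result, "Before\nAfter");
}
```

An unterminated fence makes this sketch drop the rest of the text, which mirrors how the non-greedy regex simply fails to match in that case.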
// ---------------------------------------------------------------------------
// Printers
// ---------------------------------------------------------------------------

pub fn print_drift_human(response: &DriftResponse) {
    let header = format!(
        "Drift Analysis: {} #{}",
        response.entity.entity_type, response.entity.iid
    );
    println!("{}", style(&header).bold());
    println!("{}", "-".repeat(header.len().min(60)));
    println!("Title: {}", response.entity.title);
    println!("Threshold: {:.2}", response.threshold);
    println!("Notes: {}", response.similarity_curve.len());
    println!();

    if response.drift_detected {
        println!("{}", style("DRIFT DETECTED").red().bold());
        if let Some(dp) = &response.drift_point {
            println!(
                "  At note #{} by @{} ({}) - similarity {:.2}",
                dp.note_index, dp.author, dp.created_at, dp.similarity
            );
        }
        if !response.drift_topics.is_empty() {
            println!("  Topics: {}", response.drift_topics.join(", "));
        }
    } else {
        println!("{}", style("No drift detected").green());
    }

    println!();
    println!("{}", response.recommendation);

    if !response.similarity_curve.is_empty() {
        println!();
        println!("{}", style("Similarity Curve:").bold());
        for pt in &response.similarity_curve {
            let bar_len = ((pt.similarity.max(0.0)) * 30.0) as usize;
            let bar: String = "#".repeat(bar_len);
            println!(
                "  {:>3} {:.2} {} @{}",
                pt.note_index, pt.similarity, bar, pt.author
            );
        }
    }
}

pub fn print_drift_json(response: &DriftResponse, elapsed_ms: u64) {
    let meta = RobotMeta { elapsed_ms };
    let output = serde_json::json!({
        "ok": true,
        "data": response,
        "meta": meta,
    });
    match serde_json::to_string(&output) {
        Ok(json) => println!("{json}"),
        Err(e) => eprintln!("Error serializing to JSON: {e}"),
    }
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_detect_drift_when_divergent() {
        let notes: Vec<NoteRow> = (0..6)
            .map(|i| NoteRow {
                id: i as i64,
                body: format!("note {i}"),
                author_username: "user".to_string(),
                created_at: 1000 + i as i64,
            })
            .collect();

        let curve: Vec<SimilarityPoint> = [0.9, 0.85, 0.8, 0.25, 0.2, 0.15]
            .iter()
            .enumerate()
            .map(|(i, &sim)| SimilarityPoint {
                note_index: i,
                similarity: sim,
                author: "user".to_string(),
                created_at: ms_to_iso(1000 + i as i64),
            })
            .collect();

        let (detected, point) = detect_drift(&curve, &notes, 0.4);
        assert!(detected);
        assert!(point.is_some());
    }

    #[test]
    fn test_no_drift_consistent() {
        let notes: Vec<NoteRow> = (0..5)
            .map(|i| NoteRow {
                id: i as i64,
                body: format!("note {i}"),
                author_username: "user".to_string(),
                created_at: 1000 + i as i64,
            })
            .collect();

        let curve: Vec<SimilarityPoint> = [0.85, 0.8, 0.75, 0.7, 0.65]
            .iter()
            .enumerate()
            .map(|(i, &sim)| SimilarityPoint {
                note_index: i,
                similarity: sim,
                author: "user".to_string(),
                created_at: ms_to_iso(1000 + i as i64),
            })
            .collect();

        let (detected, _) = detect_drift(&curve, &notes, 0.4);
        assert!(!detected);
    }

    #[test]
    fn test_drift_point_is_first_divergent() {
        let notes: Vec<NoteRow> = (0..5)
            .map(|i| NoteRow {
                id: (i * 10) as i64,
                body: format!("note {i}"),
                author_username: format!("user{i}"),
                created_at: 1000 + i as i64,
            })
            .collect();

        // Window of 3: indices [0,1,2] avg=0.83, [1,2,3] avg=0.55, [2,3,4] avg=0.30
        let curve: Vec<SimilarityPoint> = [0.9, 0.8, 0.8, 0.05, 0.05]
            .iter()
            .enumerate()
            .map(|(i, &sim)| SimilarityPoint {
                note_index: i,
                similarity: sim,
                author: format!("user{i}"),
                created_at: ms_to_iso(1000 + i as i64),
            })
            .collect();

        let (detected, point) = detect_drift(&curve, &notes, 0.4);
        assert!(detected);
        let dp = point.unwrap();
        // Window [2,3,4] avg = (0.8+0.05+0.05)/3 = 0.3 < 0.4
        // But [1,2,3] avg = (0.8+0.8+0.05)/3 = 0.55 >= 0.4, so first failing is index 2
        assert_eq!(dp.note_index, 2);
        assert_eq!(dp.note_id, 20);
    }

    #[test]
    fn test_extract_drift_topics_excludes_description_terms() {
        let description = "We need to fix the authentication flow for login users";
        let notes = vec![
            NoteRow {
                id: 1,
                body: "The database migration script is broken and needs postgres update"
                    .to_string(),
                author_username: "dev".to_string(),
                created_at: 1000,
            },
            NoteRow {
                id: 2,
                body: "The database connection pool also has migration issues with postgres"
                    .to_string(),
                author_username: "dev".to_string(),
                created_at: 2000,
            },
        ];

        let topics = extract_drift_topics(description, &notes, 0);
        // "database", "migration", "postgres" should appear; "fix" should not (it's in description)
        assert!(!topics.is_empty());
        for t in &topics {
            assert_ne!(t, "fix");
            assert_ne!(t, "authentication");
            assert_ne!(t, "login");
        }
    }

    #[test]
    fn test_strip_markdown_code_blocks() {
        let input = "Before\n```rust\nfn main() {}\n```\nAfter";
        let result = strip_markdown(input);
        assert!(!result.contains("fn main"));
        assert!(result.contains("Before"));
        assert!(result.contains("After"));
    }

    #[test]
    fn test_strip_markdown_preserves_text() {
        let input = "Check [this link](https://example.com) and `inline code` for details";
        let result = strip_markdown(input);
        assert!(result.contains("this link"));
        assert!(!result.contains("https://example.com"));
        assert!(!result.contains("inline code"));
        assert!(result.contains("details"));
    }

    #[test]
    fn test_too_few_notes() {
        let notes: Vec<NoteRow> = (0..2)
            .map(|i| NoteRow {
                id: i as i64,
                body: format!("note {i}"),
                author_username: "user".to_string(),
                created_at: 1000 + i as i64,
            })
            .collect();

        let curve: Vec<SimilarityPoint> = [0.1, 0.1]
            .iter()
            .enumerate()
            .map(|(i, &sim)| SimilarityPoint {
                note_index: i,
                similarity: sim,
                author: "user".to_string(),
                created_at: ms_to_iso(1000 + i as i64),
            })
            .collect();

        let (detected, _) = detect_drift(&curve, &notes, 0.4);
        assert!(!detected);
    }
}
@@ -1,6 +1,7 @@
pub mod auth_test;
pub mod count;
pub mod doctor;
pub mod drift;
pub mod embed;
pub mod generate_docs;
pub mod ingest;

@@ -20,6 +21,7 @@ pub use count::{
    run_count_events,
};
pub use doctor::{DoctorChecks, print_doctor_results, run_doctor};
pub use drift::{DriftResponse, print_drift_human, print_drift_json, run_drift};
pub use embed::{print_embed, print_embed_json, run_embed};
pub use generate_docs::{print_generate_docs, print_generate_docs_json, run_generate_docs};
pub use ingest::{
@@ -1,3 +1,5 @@
use std::collections::HashMap;

use console::style;
use serde::Serialize;

@@ -8,9 +10,10 @@ use crate::core::paths::get_db_path;
use crate::core::project::resolve_project;
use crate::core::time::{ms_to_iso, parse_since};
use crate::documents::SourceType;
use crate::embedding::ollama::{OllamaClient, OllamaConfig};
use crate::search::{
    FtsQueryMode, PathFilter, SearchFilters, apply_filters, get_result_snippet, rank_rrf,
    search_fts,
    FtsQueryMode, HybridResult, PathFilter, SearchFilters, SearchMode, get_result_snippet,
    search_fts, search_hybrid,
};

#[derive(Debug, Serialize)]

@@ -58,7 +61,7 @@ pub struct SearchCliFilters {
    pub limit: usize,
}

pub fn run_search(
pub async fn run_search(
    config: &Config,
    query: &str,
    cli_filters: SearchCliFilters,

@@ -71,15 +74,18 @@ pub fn run_search(

    let mut warnings: Vec<String> = Vec::new();

    // Determine actual mode: vector search requires embeddings, which need async + Ollama.
    // Until hybrid/semantic are wired up, we run lexical and warn if the user asked for more.
    let actual_mode = "lexical";
    if requested_mode != "lexical" {
        warnings.push(format!(
            "Requested mode '{}' is not yet available; falling back to lexical search.",
            requested_mode
        ));
    }
    let actual_mode = SearchMode::parse(requested_mode).unwrap_or(SearchMode::Hybrid);

    let client = if actual_mode != SearchMode::Lexical {
        let ollama_cfg = &config.embedding;
        Some(OllamaClient::new(OllamaConfig {
            base_url: ollama_cfg.base_url.clone(),
            model: ollama_cfg.model.clone(),
            ..OllamaConfig::default()
        }))
    } else {
        None
    };

    let doc_count: i64 = conn
        .query_row("SELECT COUNT(*) FROM documents", [], |row| row.get(0))

@@ -89,7 +95,7 @@ pub fn run_search(
        warnings.push("No documents indexed. Run 'lore generate-docs' first.".to_string());
        return Ok(SearchResponse {
            query: query.to_string(),
            mode: actual_mode.to_string(),
            mode: actual_mode.as_str().to_string(),
            total_results: 0,
            results: vec![],
            warnings,

@@ -151,52 +157,54 @@ pub fn run_search(
        limit: cli_filters.limit,
    };

    let requested = filters.clamp_limit();
    let top_k = if filters.has_any_filter() {
        (requested * 50).clamp(200, 1500)
    } else {
        (requested * 10).clamp(50, 1500)
    };

    let fts_results = search_fts(&conn, query, top_k, fts_mode)?;
    let fts_tuples: Vec<(i64, f64)> = fts_results
        .iter()
        .map(|r| (r.document_id, r.bm25_score))
        .collect();

    let snippet_map: std::collections::HashMap<i64, String> = fts_results
    // Run FTS separately for snippet extraction (search_hybrid doesn't return snippets).
    let snippet_top_k = filters
        .clamp_limit()
        .checked_mul(10)
        .unwrap_or(500)
        .clamp(50, 1500);
    let fts_results = search_fts(&conn, query, snippet_top_k, fts_mode)?;
    let snippet_map: HashMap<i64, String> = fts_results
        .iter()
        .map(|r| (r.document_id, r.snippet.clone()))
        .collect();

    let ranked = rank_rrf(&[], &fts_tuples);
    let ranked_ids: Vec<i64> = ranked.iter().map(|r| r.document_id).collect();
    // search_hybrid handles recall sizing, RRF ranking, and filter application internally.
    let (hybrid_results, mut hybrid_warnings) = search_hybrid(
        &conn,
        client.as_ref(),
        query,
        actual_mode,
        &filters,
        fts_mode,
    )
    .await?;
    warnings.append(&mut hybrid_warnings);

    let filtered_ids = apply_filters(&conn, &ranked_ids, &filters)?;

    if filtered_ids.is_empty() {
    if hybrid_results.is_empty() {
        return Ok(SearchResponse {
            query: query.to_string(),
            mode: actual_mode.to_string(),
            mode: actual_mode.as_str().to_string(),
            total_results: 0,
            results: vec![],
            warnings,
        });
    }

    let hydrated = hydrate_results(&conn, &filtered_ids)?;
    let ranked_ids: Vec<i64> = hybrid_results.iter().map(|r| r.document_id).collect();
    let hydrated = hydrate_results(&conn, &ranked_ids)?;

    let rrf_map: std::collections::HashMap<i64, &crate::search::RrfResult> =
        ranked.iter().map(|r| (r.document_id, r)).collect();
    let hybrid_map: HashMap<i64, &HybridResult> =
        hybrid_results.iter().map(|r| (r.document_id, r)).collect();

    let mut results: Vec<SearchResultDisplay> = Vec::with_capacity(hydrated.len());
    for row in &hydrated {
        let rrf = rrf_map.get(&row.document_id);
        let hr = hybrid_map.get(&row.document_id);
        let fts_snippet = snippet_map.get(&row.document_id).map(|s| s.as_str());
        let snippet = get_result_snippet(fts_snippet, &row.content_text);

        let explain_data = if explain {
            rrf.map(|r| ExplainData {
            hr.map(|r| ExplainData {
                vector_rank: r.vector_rank,
                fts_rank: r.fts_rank,
                rrf_score: r.rrf_score,

@@ -217,14 +225,14 @@ pub fn run_search(
            labels: row.labels.clone(),
            paths: row.paths.clone(),
            snippet,
            score: rrf.map(|r| r.normalized_score).unwrap_or(0.0),
||||
score: hr.map(|r| r.score).unwrap_or(0.0),
|
||||
explain: explain_data,
|
||||
});
|
||||
}
|
||||
|
||||
Ok(SearchResponse {
|
||||
query: query.to_string(),
|
||||
mode: actual_mode.to_string(),
|
||||
mode: actual_mode.as_str().to_string(),
|
||||
total_results: results.len(),
|
||||
results,
|
||||
warnings,
|
||||
@@ -360,8 +368,12 @@ pub fn print_search_results(response: &SearchResponse) {
|
||||
|
||||
if let Some(ref explain) = result.explain {
|
||||
println!(
|
||||
" {} fts_rank={} rrf_score={:.6}",
|
||||
" {} vector_rank={} fts_rank={} rrf_score={:.6}",
|
||||
style("[explain]").magenta(),
|
||||
explain
|
||||
.vector_rank
|
||||
.map(|r| r.to_string())
|
||||
.unwrap_or_else(|| "-".into()),
|
||||
explain
|
||||
.fts_rank
|
||||
.map(|r| r.to_string())
|
||||
|
||||
@@ -75,12 +75,17 @@ pub struct IssueDetail {
pub author_username: String,
pub created_at: i64,
pub updated_at: i64,
pub closed_at: Option<String>,
pub confidential: bool,
pub web_url: Option<String>,
pub project_path: String,
pub references_full: String,
pub labels: Vec<String>,
pub assignees: Vec<String>,
pub due_date: Option<String>,
pub milestone: Option<String>,
pub user_notes_count: i64,
pub merge_requests_count: usize,
pub closing_merge_requests: Vec<ClosingMrRef>,
pub discussions: Vec<DiscussionDetail>,
pub status_name: Option<String>,

@@ -122,6 +127,9 @@ pub fn run_show_issue(

let discussions = get_issue_discussions(&conn, issue.id)?;

let references_full = format!("{}#{}", issue.project_path, issue.iid);
let merge_requests_count = closing_mrs.len();

Ok(IssueDetail {
id: issue.id,
iid: issue.iid,

@@ -131,12 +139,17 @@ pub fn run_show_issue(
author_username: issue.author_username,
created_at: issue.created_at,
updated_at: issue.updated_at,
closed_at: issue.closed_at,
confidential: issue.confidential,
web_url: issue.web_url,
project_path: issue.project_path,
references_full,
labels,
assignees,
due_date: issue.due_date,
milestone: issue.milestone_title,
user_notes_count: issue.user_notes_count,
merge_requests_count,
closing_merge_requests: closing_mrs,
discussions,
status_name: issue.status_name,

@@ -156,10 +169,13 @@ struct IssueRow {
author_username: String,
created_at: i64,
updated_at: i64,
closed_at: Option<String>,
confidential: bool,
web_url: Option<String>,
project_path: String,
due_date: Option<String>,
milestone_title: Option<String>,
user_notes_count: i64,
status_name: Option<String>,
status_category: Option<String>,
status_color: Option<String>,

@@ -173,8 +189,12 @@ fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Resu
let project_id = resolve_project(conn, project)?;
(
"SELECT i.id, i.iid, i.title, i.description, i.state, i.author_username,
i.created_at, i.updated_at, i.web_url, p.path_with_namespace,
i.created_at, i.updated_at, i.closed_at, i.confidential,
i.web_url, p.path_with_namespace,
i.due_date, i.milestone_title,
(SELECT COUNT(*) FROM notes n
JOIN discussions d ON n.discussion_id = d.id
WHERE d.noteable_type = 'Issue' AND d.noteable_id = i.id AND n.is_system = 0) AS user_notes_count,
i.status_name, i.status_category, i.status_color,
i.status_icon_name, i.status_synced_at
FROM issues i

@@ -185,8 +205,12 @@ fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Resu
}
None => (
"SELECT i.id, i.iid, i.title, i.description, i.state, i.author_username,
i.created_at, i.updated_at, i.web_url, p.path_with_namespace,
i.created_at, i.updated_at, i.closed_at, i.confidential,
i.web_url, p.path_with_namespace,
i.due_date, i.milestone_title,
(SELECT COUNT(*) FROM notes n
JOIN discussions d ON n.discussion_id = d.id
WHERE d.noteable_type = 'Issue' AND d.noteable_id = i.id AND n.is_system = 0) AS user_notes_count,
i.status_name, i.status_category, i.status_color,
i.status_icon_name, i.status_synced_at
FROM issues i

@@ -201,6 +225,7 @@ fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Resu
let mut stmt = conn.prepare(sql)?;
let issues: Vec<IssueRow> = stmt
.query_map(param_refs.as_slice(), |row| {
let confidential_val: i64 = row.get(9)?;
Ok(IssueRow {
id: row.get(0)?,
iid: row.get(1)?,

@@ -210,15 +235,18 @@ fn find_issue(conn: &Connection, iid: i64, project_filter: Option<&str>) -> Resu
author_username: row.get(5)?,
created_at: row.get(6)?,
updated_at: row.get(7)?,
web_url: row.get(8)?,
project_path: row.get(9)?,
due_date: row.get(10)?,
milestone_title: row.get(11)?,
status_name: row.get(12)?,
status_category: row.get(13)?,
status_color: row.get(14)?,
status_icon_name: row.get(15)?,
status_synced_at: row.get(16)?,
closed_at: row.get(8)?,
confidential: confidential_val != 0,
web_url: row.get(10)?,
project_path: row.get(11)?,
due_date: row.get(12)?,
milestone_title: row.get(13)?,
user_notes_count: row.get(14)?,
status_name: row.get(15)?,
status_category: row.get(16)?,
status_color: row.get(17)?,
status_icon_name: row.get(18)?,
status_synced_at: row.get(19)?,
})
})?
.collect::<std::result::Result<Vec<_>, _>>()?;

@@ -618,6 +646,7 @@ pub fn print_show_issue(issue: &IssueDetail) {
println!("{}", "━".repeat(header.len().min(80)));
println!();

println!("Ref: {}", style(&issue.references_full).dim());
println!("Project: {}", style(&issue.project_path).cyan());

let state_styled = if issue.state == "opened" {

@@ -627,6 +656,10 @@ pub fn print_show_issue(issue: &IssueDetail) {
};
println!("State: {}", state_styled);

if issue.confidential {
println!(" {}", style("CONFIDENTIAL").red().bold());
}

if let Some(status) = &issue.status_name {
println!(
"Status: {}",

@@ -658,6 +691,10 @@ pub fn print_show_issue(issue: &IssueDetail) {
println!("Created: {}", format_date(issue.created_at));
println!("Updated: {}", format_date(issue.updated_at));

if let Some(closed_at) = &issue.closed_at {
println!("Closed: {}", closed_at);
}

if let Some(due) = &issue.due_date {
println!("Due: {}", due);
}

@@ -931,12 +968,17 @@ pub struct IssueDetailJson {
pub author_username: String,
pub created_at: String,
pub updated_at: String,
pub closed_at: Option<String>,
pub confidential: bool,
pub web_url: Option<String>,
pub project_path: String,
pub references_full: String,
pub labels: Vec<String>,
pub assignees: Vec<String>,
pub due_date: Option<String>,
pub milestone: Option<String>,
pub user_notes_count: i64,
pub merge_requests_count: usize,
pub closing_merge_requests: Vec<ClosingMrRefJson>,
pub discussions: Vec<DiscussionDetailJson>,
pub status_name: Option<String>,

@@ -980,12 +1022,17 @@ impl From<&IssueDetail> for IssueDetailJson {
author_username: issue.author_username.clone(),
created_at: ms_to_iso(issue.created_at),
updated_at: ms_to_iso(issue.updated_at),
closed_at: issue.closed_at.clone(),
confidential: issue.confidential,
web_url: issue.web_url.clone(),
project_path: issue.project_path.clone(),
references_full: issue.references_full.clone(),
labels: issue.labels.clone(),
assignees: issue.assignees.clone(),
due_date: issue.due_date.clone(),
milestone: issue.milestone.clone(),
user_notes_count: issue.user_notes_count,
merge_requests_count: issue.merge_requests_count,
closing_merge_requests: issue
.closing_merge_requests
.iter()

@@ -215,6 +215,24 @@ pub enum Commands {
/// People intelligence: experts, workload, active discussions, overlap
Who(WhoArgs),

/// Detect discussion divergence from original intent
Drift {
/// Entity type (currently only "issues" supported)
#[arg(value_parser = ["issues"])]
entity_type: String,

/// Entity IID
iid: i64,

/// Similarity threshold for drift detection (0.0-1.0)
#[arg(long, default_value = "0.4")]
threshold: f32,

/// Scope to project (fuzzy match)
#[arg(short, long)]
project: Option<String>,
},

#[command(hide = true)]
List {
#[arg(value_parser = ["issues", "mrs"])]

@@ -77,6 +77,7 @@ pub fn strip_schemas(commands: &mut serde_json::Value) {
for (_cmd_name, cmd) in map.iter_mut() {
if let Some(obj) = cmd.as_object_mut() {
obj.remove("response_schema");
obj.remove("example_output");
}
}
}

@@ -69,6 +69,10 @@ const MIGRATIONS: &[(&str, &str)] = &[
"021",
include_str!("../../migrations/021_work_item_status.sql"),
),
(
"023",
include_str!("../../migrations/023_issue_detail_fields.sql"),
),
];

pub fn create_connection(db_path: &Path) -> Result<Connection> {

@@ -3,7 +3,9 @@ pub mod chunk_ids;
pub mod chunking;
pub mod ollama;
pub mod pipeline;
pub mod similarity;

pub use change_detector::{PendingDocument, count_pending_documents, find_pending_documents};
pub use chunking::{CHUNK_MAX_BYTES, CHUNK_OVERLAP_CHARS, split_into_chunks};
pub use pipeline::{EmbedResult, embed_documents};
pub use similarity::cosine_similarity;

src/embedding/similarity.rs (new file, 48 lines)
@@ -0,0 +1,48 @@
/// Cosine similarity between two embedding vectors.
/// Returns value in [-1, 1] range; higher = more similar.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    debug_assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_cosine_similarity_identical() {
        let v = [1.0, 2.0, 3.0];
        let sim = cosine_similarity(&v, &v);
        assert!((sim - 1.0).abs() < 1e-6);
    }

    #[test]
    fn test_cosine_similarity_orthogonal() {
        let a = [1.0, 0.0, 0.0];
        let b = [0.0, 1.0, 0.0];
        let sim = cosine_similarity(&a, &b);
        assert!(sim.abs() < 1e-6);
    }

    #[test]
    fn test_cosine_similarity_zero_vector() {
        let a = [1.0, 2.0, 3.0];
        let b = [0.0, 0.0, 0.0];
        let sim = cosine_similarity(&a, &b);
        assert!((sim - 0.0).abs() < 1e-6);
    }

    #[test]
    fn test_cosine_similarity_opposite() {
        let a = [1.0, 2.0, 3.0];
        let b = [-1.0, -2.0, -3.0];
        let sim = cosine_similarity(&a, &b);
        assert!((sim - (-1.0)).abs() < 1e-6);
    }
}

src/main.rs (92 changed lines)
@@ -12,17 +12,17 @@ use lore::cli::autocorrect::{self, CorrectionResult};
use lore::cli::commands::{
IngestDisplay, InitInputs, InitOptions, InitResult, ListFilters, MrListFilters,
SearchCliFilters, SyncOptions, TimelineParams, open_issue_in_browser, open_mr_in_browser,
print_count, print_count_json, print_doctor_results, print_dry_run_preview,
print_dry_run_preview_json, print_embed, print_embed_json, print_event_count,
print_event_count_json, print_generate_docs, print_generate_docs_json, print_ingest_summary,
print_ingest_summary_json, print_list_issues, print_list_issues_json, print_list_mrs,
print_list_mrs_json, print_search_results, print_search_results_json, print_show_issue,
print_show_issue_json, print_show_mr, print_show_mr_json, print_stats, print_stats_json,
print_sync, print_sync_json, print_sync_status, print_sync_status_json, print_timeline,
print_timeline_json_with_meta, print_who_human, print_who_json, run_auth_test, run_count,
run_count_events, run_doctor, run_embed, run_generate_docs, run_ingest, run_ingest_dry_run,
run_init, run_list_issues, run_list_mrs, run_search, run_show_issue, run_show_mr, run_stats,
run_sync, run_sync_status, run_timeline, run_who,
print_count, print_count_json, print_doctor_results, print_drift_human, print_drift_json,
print_dry_run_preview, print_dry_run_preview_json, print_embed, print_embed_json,
print_event_count, print_event_count_json, print_generate_docs, print_generate_docs_json,
print_ingest_summary, print_ingest_summary_json, print_list_issues, print_list_issues_json,
print_list_mrs, print_list_mrs_json, print_search_results, print_search_results_json,
print_show_issue, print_show_issue_json, print_show_mr, print_show_mr_json, print_stats,
print_stats_json, print_sync, print_sync_json, print_sync_status, print_sync_status_json,
print_timeline, print_timeline_json_with_meta, print_who_human, print_who_json, run_auth_test,
run_count, run_count_events, run_doctor, run_drift, run_embed, run_generate_docs, run_ingest,
run_ingest_dry_run, run_init, run_list_issues, run_list_mrs, run_search, run_show_issue,
run_show_mr, run_stats, run_sync, run_sync_status, run_timeline, run_who,
};
use lore::cli::robot::{RobotMeta, strip_schemas};
use lore::cli::{

@@ -178,6 +178,22 @@ async fn main() {
}
Some(Commands::Timeline(args)) => handle_timeline(cli.config.as_deref(), args, robot_mode),
Some(Commands::Who(args)) => handle_who(cli.config.as_deref(), args, robot_mode),
Some(Commands::Drift {
entity_type,
iid,
threshold,
project,
}) => {
handle_drift(
cli.config.as_deref(),
&entity_type,
iid,
threshold,
project.as_deref(),
robot_mode,
)
.await
}
Some(Commands::Stats(args)) => handle_stats(cli.config.as_deref(), args, robot_mode).await,
Some(Commands::Embed(args)) => handle_embed(cli.config.as_deref(), args, robot_mode).await,
Some(Commands::Sync(args)) => {

@@ -1762,7 +1778,8 @@ async fn handle_search(
fts_mode,
&args.mode,
explain,
)?;
)
.await?;
let elapsed_ms = start.elapsed().as_millis() as u64;

if robot_mode {

@@ -2048,6 +2065,7 @@ struct RobotDocsData {
version: String,
description: String,
activation: RobotDocsActivation,
quick_start: serde_json::Value,
commands: serde_json::Value,
/// Deprecated command aliases (old -> new)
aliases: serde_json::Value,

@@ -2151,6 +2169,7 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
"meta": {"elapsed_ms": "int"}
}
},
"example_output": {"list": {"ok":true,"data":{"issues":[{"iid":3864,"title":"Switch Health Card","state":"opened","status_name":"In progress","labels":["customer:BNSF"],"assignees":["teernisse"],"discussion_count":12,"updated_at_iso":"2026-02-12T..."}],"total_count":1,"showing":1},"meta":{"elapsed_ms":42}}},
"fields_presets": {"minimal": ["iid", "title", "state", "updated_at_iso"]}
},
"mrs": {

@@ -2169,6 +2188,7 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
"meta": {"elapsed_ms": "int"}
}
},
"example_output": {"list": {"ok":true,"data":{"mrs":[{"iid":200,"title":"Add throw time chart","state":"opened","draft":false,"author_username":"teernisse","target_branch":"main","source_branch":"feat/throw-time","reviewers":["cseiber"],"discussion_count":5,"updated_at_iso":"2026-02-11T..."}],"total_count":1,"showing":1},"meta":{"elapsed_ms":38}}},
"fields_presets": {"minimal": ["iid", "title", "state", "updated_at_iso"]}
},
"search": {

@@ -2180,6 +2200,7 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
"data": {"results": "[{document_id:int, source_type:string, title:string, snippet:string, score:float, url:string?, author:string?, created_at:string?, updated_at:string?, project_path:string, labels:[string], paths:[string]}]", "total_results": "int", "query": "string", "mode": "string", "warnings": "[string]"},
"meta": {"elapsed_ms": "int"}
},
"example_output": {"ok":true,"data":{"query":"throw time","mode":"hybrid","total_results":3,"results":[{"document_id":42,"source_type":"issue","title":"Switch Health Card","score":0.92,"snippet":"...throw time data from BNSF...","project_path":"vs/typescript-code"}],"warnings":[]},"meta":{"elapsed_ms":85}},
"fields_presets": {"minimal": ["document_id", "title", "source_type", "score"]}
},
"count": {

@@ -2289,6 +2310,7 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
},
"meta": {"elapsed_ms": "int"}
},
"example_output": {"expert": {"ok":true,"data":{"mode":"expert","result":{"experts":[{"username":"teernisse","score":42,"note_count":15,"diff_note_count":8}]}},"meta":{"elapsed_ms":65}}},
"fields_presets": {
"expert_minimal": ["username", "score"],
"workload_minimal": ["entity_type", "iid", "title", "state"],

@@ -2302,7 +2324,28 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
}
});

// --brief: strip response_schema from every command (~60% smaller)
let quick_start = serde_json::json!({
"glab_equivalents": [
{ "glab": "glab issue list", "lore": "lore -J issues -n 50", "note": "Richer: includes labels, status, closing MRs, discussion counts" },
{ "glab": "glab issue view 123", "lore": "lore -J issues 123", "note": "Includes full discussions, work-item status, cross-references" },
{ "glab": "glab issue list -l bug", "lore": "lore -J issues --label bug", "note": "AND logic for multiple --label flags" },
{ "glab": "glab mr list", "lore": "lore -J mrs", "note": "Includes draft status, reviewers, discussion counts" },
{ "glab": "glab mr view 456", "lore": "lore -J mrs 456", "note": "Includes discussions, review threads, source/target branches" },
{ "glab": "glab mr list -s opened", "lore": "lore -J mrs -s opened", "note": "States: opened, merged, closed, locked, all" },
{ "glab": "glab api '/projects/:id/issues'", "lore": "lore -J issues -p project", "note": "Fuzzy project matching (suffix or substring)" }
],
"lore_exclusive": [
"search: FTS5 + vector hybrid search across all entities",
"who: Expert/workload/reviews analysis per file path or person",
"timeline: Chronological event reconstruction across entities",
"stats: Database statistics with document/note/discussion counts",
"count: Entity counts with state breakdowns",
"embed: Generate vector embeddings for semantic search via Ollama"
],
"read_write_split": "lore = ALL reads (issues, MRs, search, who, timeline, intelligence). glab = ALL writes (create, update, approve, merge, CI/CD)."
});

// --brief: strip response_schema and example_output from every command (~60% smaller)
let mut commands = commands;
if brief {
strip_schemas(&mut commands);

@@ -2405,6 +2448,7 @@ fn handle_robot_docs(robot_mode: bool, brief: bool) -> Result<(), Box<dyn std::e
env: "LORE_ROBOT=1".to_string(),
auto: "Non-TTY stdout".to_string(),
},
quick_start,
commands,
aliases,
exit_codes,

@@ -2445,6 +2489,28 @@ fn handle_who(
Ok(())
}

async fn handle_drift(
config_override: Option<&str>,
entity_type: &str,
iid: i64,
threshold: f32,
project: Option<&str>,
robot_mode: bool,
) -> Result<(), Box<dyn std::error::Error>> {
let start = std::time::Instant::now();
let config = Config::load(config_override)?;
let effective_project = config.effective_project(project);
let response = run_drift(&config, entity_type, iid, threshold, effective_project).await?;
let elapsed_ms = start.elapsed().as_millis() as u64;

if robot_mode {
print_drift_json(&response, elapsed_ms);
} else {
print_drift_human(&response);
}
Ok(())
}

#[allow(clippy::too_many_arguments)]
async fn handle_list_compat(
config_override: Option<&str>,
