feat(cli): Add search, stats, embed, sync, health, and robot-docs commands

Extends the CLI with six new commands that complete the search pipeline:

- lore search <QUERY>: Hybrid search with mode selection (lexical,
  hybrid, semantic), rich filtering (--type, --author, --project,
  --label, --path, --after, --updated-after), result limits, and
  optional explain mode showing RRF score breakdowns. Safe FTS mode
  sanitizes user input; raw mode passes through for power users.
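The RRF score breakdown shown by explain mode can be illustrated with a minimal Reciprocal Rank Fusion sketch. This is an assumption about the fusion step, not the command's actual internals; the function name and the conventional k = 60 constant are illustrative.

```rust
use std::collections::HashMap;

// Illustrative sketch of Reciprocal Rank Fusion (RRF): each ranked list
// (e.g. lexical and semantic) contributes 1 / (k + rank) to a document's
// fused score, so documents ranked well by both lists rise to the top.
// `k` (commonly 60) damps the dominance of the very top ranks.
fn rrf_scores(rankings: &[Vec<&str>], k: f64) -> HashMap<String, f64> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, id) in ranking.iter().enumerate() {
            // `rank` is 0-based; RRF uses 1-based ranks.
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    scores
}
```

A document appearing in both lists accumulates two reciprocal-rank terms, which is exactly what an explain breakdown would itemize per mode.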

- lore stats: Document and index statistics with optional --check
  for integrity verification and --repair to fix inconsistencies
  (orphaned documents, missing FTS entries, stale dirty queue items).
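An integrity check of this kind typically reduces to set differences between the documents table and the FTS index. A dependency-free sketch, with assumed names (the real --check presumably runs SQL against SQLite directly):

```rust
use std::collections::BTreeSet;

// Illustrative sketch of an integrity check: document ids missing from
// the FTS index need re-indexing; FTS rows with no backing document are
// orphans to delete. BTreeSet keeps the output order deterministic.
fn find_inconsistencies(
    doc_ids: &BTreeSet<&str>,
    fts_ids: &BTreeSet<&str>,
) -> (Vec<String>, Vec<String>) {
    let missing_fts = doc_ids.difference(fts_ids).map(|s| s.to_string()).collect();
    let orphaned_fts = fts_ids.difference(doc_ids).map(|s| s.to_string()).collect();
    (missing_fts, orphaned_fts)
}
```

--repair would then act on each list: re-index the first, delete the second.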

- lore embed: Generate vector embeddings via Ollama. Supports
  --retry-failed to re-attempt previously failed embeddings.

- lore generate-docs: Drain the dirty queue to regenerate documents.
  --full seeds all entities for complete rebuild. --project scopes
  to a single project.
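The drain loop can be sketched as follows; the queue shape and names are assumptions for illustration:

```rust
use std::collections::VecDeque;

// Illustrative sketch of draining a dirty queue: pop entity ids and
// regenerate each document. A --full run would first seed the queue
// with every known entity id before draining.
fn drain_dirty_queue(queue: &mut VecDeque<u64>, mut regenerate: impl FnMut(u64)) -> usize {
    let mut regenerated = 0;
    while let Some(entity_id) = queue.pop_front() {
        regenerate(entity_id);
        regenerated += 1;
    }
    regenerated
}
```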

- lore sync: Full pipeline orchestration (ingest issues + MRs,
  generate-docs, embed) with --no-embed and --no-docs flags for
  partial runs. Reports per-stage results and total elapsed time.
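The stage orchestration can be sketched as a flag-gated loop with per-stage timing; stage names and the report shape here are assumptions:

```rust
use std::time::Instant;

// Illustrative sketch of the sync orchestration: run each enabled stage
// in order, skip stages disabled by --no-docs / --no-embed, and record
// per-stage elapsed milliseconds for the final report.
fn pipeline_stages(no_docs: bool, no_embed: bool) -> Vec<(&'static str, u128)> {
    let candidates = [
        ("ingest", true),
        ("generate-docs", !no_docs),
        ("embed", !no_embed),
    ];
    let mut report = Vec::new();
    for (name, enabled) in candidates {
        if !enabled {
            continue;
        }
        let start = Instant::now();
        // ... the real stage body would run here ...
        report.push((name, start.elapsed().as_millis()));
    }
    report
}
```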

- lore health: Quick pre-flight check (config exists, DB exists,
  schema current). Returns exit code 1 if unhealthy. Designed for
  agent pre-flight scripts.
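The exit-code contract reduces to all-checks-pass; a minimal sketch (function name assumed):

```rust
// Illustrative sketch of the health contract: all three checks must pass
// for exit code 0; any failure maps to exit code 1 so agent pre-flight
// scripts can gate on `lore health && ...`.
fn health_exit_code(config_exists: bool, db_exists: bool, schema_current: bool) -> u8 {
    if config_exists && db_exists && schema_current {
        0
    } else {
        1
    }
}
```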

- lore robot-docs: Machine-readable command manifest for agent
  self-discovery. Returns all commands, flags, examples, exit codes,
  and recommended workflows as structured JSON.
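A manifest entry of this shape can be sketched as below. The JSON is serialized by hand here only to keep the sketch dependency-free; the real command presumably reuses serde like the rest of the CLI, and the field names are assumptions:

```rust
// Illustrative sketch of a robot-docs manifest entry and its JSON shape.
// Field names ("name", "flags") are assumptions for illustration.
struct CommandDoc {
    name: &'static str,
    flags: &'static [&'static str],
}

fn manifest_json(commands: &[CommandDoc]) -> String {
    let entries: Vec<String> = commands
        .iter()
        .map(|c| {
            let flags: Vec<String> = c.flags.iter().map(|f| format!("\"{}\"", f)).collect();
            format!("{{\"name\":\"{}\",\"flags\":[{}]}}", c.name, flags.join(","))
        })
        .collect();
    format!("[{}]", entries.join(","))
}
```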

Also enhances lore init with --gitlab-url, --token-env-var, and
--projects flags for fully non-interactive robot-mode initialization.
Fixes init's force/non-interactive precedence logic and adds JSON
output for robot mode.

Updates all command files for the GiError -> LoreError rename.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: Taylor Eernisse
Date: 2026-01-30 15:47:10 -05:00
commit daf5a73019
parent 559f0702ad
13 changed files with 1930 additions and 95 deletions

src/cli/commands/embed.rs

@@ -0,0 +1,88 @@
//! Embed command: generate vector embeddings for documents via Ollama.

use console::style;
use serde::Serialize;

use crate::core::db::create_connection;
use crate::core::error::Result;
use crate::core::paths::get_db_path;
use crate::embedding::ollama::{OllamaClient, OllamaConfig};
use crate::embedding::pipeline::embed_documents;
use crate::Config;

/// Result of the embed command.
#[derive(Debug, Default, Serialize)]
pub struct EmbedCommandResult {
    pub embedded: usize,
    pub failed: usize,
    pub skipped: usize,
}

/// Run the embed command.
pub async fn run_embed(
    config: &Config,
    retry_failed: bool,
) -> Result<EmbedCommandResult> {
    let db_path = get_db_path(config.storage.db_path.as_deref());
    let conn = create_connection(&db_path)?;

    // Build Ollama config from user settings
    let ollama_config = OllamaConfig {
        base_url: config.embedding.base_url.clone(),
        model: config.embedding.model.clone(),
        ..OllamaConfig::default()
    };
    let client = OllamaClient::new(ollama_config);

    // Health check — fail fast if Ollama is down or model missing
    client.health_check().await?;

    // If retry_failed, clear errors so they become pending again
    if retry_failed {
        conn.execute(
            "UPDATE embedding_metadata SET last_error = NULL, attempt_count = 0
             WHERE last_error IS NOT NULL",
            [],
        )?;
    }

    let model_name = &config.embedding.model;
    let result = embed_documents(&conn, &client, model_name, None).await?;

    Ok(EmbedCommandResult {
        embedded: result.embedded,
        failed: result.failed,
        skipped: result.skipped,
    })
}

/// Print human-readable output.
pub fn print_embed(result: &EmbedCommandResult) {
    println!("{} Embedding complete", style("done").green().bold());
    println!("  Embedded: {}", result.embedded);
    if result.failed > 0 {
        println!("  Failed: {}", style(result.failed).red());
    }
    if result.skipped > 0 {
        println!("  Skipped: {}", result.skipped);
    }
}

/// JSON output.
#[derive(Serialize)]
struct EmbedJsonOutput<'a> {
    ok: bool,
    data: &'a EmbedCommandResult,
}

/// Print JSON robot-mode output.
pub fn print_embed_json(result: &EmbedCommandResult) {
    let output = EmbedJsonOutput {
        ok: true,
        data: result,
    };
    println!("{}", serde_json::to_string(&output).unwrap());
}