- Fix MENTIONED_RE/CLOSED_BY_RE to match real GitLab format
('mentioned in issue #N' / 'mentioned in merge request !N')
- Add GITLAB_URL_RE + parse_url_refs() for full URL extraction
- Add extract_refs_from_descriptions() -> source_method='description_parse'
- Add extract_refs_from_user_notes() -> source_method='note_parse'
- Wire both into orchestrator after system note extraction
- 36 tests: regex fix, URL parsing, integration, idempotency
Move inline #[cfg(test)] mod tests { ... } blocks from 22 source files
into dedicated _tests.rs companion files, wired via:
#[cfg(test)]
#[path = "module_tests.rs"]
mod tests;
This keeps implementation-focused source files leaner and more scannable
while preserving full access to private items through `use super::*;`.
Modules extracted:
core: db, note_parser, payloads, project, references, sync_run,
timeline_collect, timeline_expand, timeline_seed
cli: list (55 tests), who (75 tests)
documents: extractor (43 tests), regenerator
embedding: change_detector, chunking
gitlab: graphql (wiremock async tests), transformers/issue
ingestion: dirty_tracker, discussions, issues, mr_diffs
Also adds conflicts_with("explain_score") to the --detail flag in the
who command to prevent mutually exclusive flags from being combined.
All 629 unit tests pass. No behavior changes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace .unwrap_or(), .ok(), and .filter_map(|r| r.ok()) patterns with
proper error propagation using ? and rusqlite::OptionalExtension where
the query may legitimately return no rows.
Affected areas:
- events_db::count_events: three count queries now propagate errors
instead of defaulting to (0, 0) on failure
- note_parser::extract_refs_from_system_notes: row iteration errors
are now propagated instead of silently dropped via filter_map
- note_parser::noteable_type_to_entity_type: unknown types now log a
debug warning before defaulting to "issue"
- payloads::store_payload/read_payload: use .optional()? instead of
.ok() to distinguish "no row" from "query failed"
- backoff::compute_next_attempt_at: use .clamp(0, 30) to guard against
negative attempt_count, not just .min(30)
- search::vector::max_chunks_per_document: returns Result<i64> with
proper error propagation through .optional()?.flatten()
- embedding::chunk_ids::decode_rowid: promote debug_assert to assert
since negative rowids indicate data corruption worth failing fast on
- ingestion::dirty_tracker::record_dirty_error: use .optional()? to
handle missing dirty_sources row gracefully instead of hard error
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces two new modules for extracting and storing entity cross-references
from GitLab data:
note_parser.rs:
- Parses system notes for "mentioned in" and "closed by" patterns
- Extracts cross-project references (group/project#42, group/project!123)
- Uses lazy-compiled regexes for performance
- Handles both issue (#) and MR (!) sigils
- Provides extract_refs_from_system_notes() for batch processing
references.rs:
- Extracts refs from resource_state_events table (API-sourced closes links)
- Provides insert_entity_reference() for storing discovered references
- Includes resolution helpers: resolve_issue_local_id, resolve_mr_local_id,
resolve_project_path for converting iids to internal IDs
- Enables cross-project reference resolution
These modules power the entity_references table, enabling features like
"find all MRs that close this issue" and "find all issues mentioned in this MR".
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>