refactor: Remove redundant doc comments throughout codebase

Removes module-level doc comments (//! lines) and excessive inline doc
comments that were duplicating information already evident from:
- Function/struct names (self-documenting code)
- Type signatures (the what is clear from types)
- Implementation context (the how is clear from code)

Affected modules:
- cli/* - Removed command descriptions duplicating clap help text
- core/* - Removed module headers and obvious function docs
- documents/* - Removed extractor/regenerator/truncation docs
- embedding/* - Removed pipeline and chunking docs
- gitlab/* - Removed client and transformer docs (kept type definitions)
- ingestion/* - Removed orchestrator and ingestion docs
- search/* - Removed FTS and vector search docs

Philosophy: Code should be self-documenting. Comments should explain
"why" (business decisions, non-obvious constraints) not "what" (which
the code itself shows). This change reduces noise and maintenance burden
while keeping the codebase just as understandable.

Retains comments for:
- Non-obvious business logic
- Important safety invariants
- Complex algorithm explanations
- Public API boundaries where generated docs matter

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Taylor Eernisse
2026-02-05 00:04:32 -05:00
parent 976ad92ef0
commit 65583ed5d6
57 changed files with 143 additions and 1693 deletions

View File

@@ -1,24 +1,10 @@
use rand::Rng;
/// Compute next_attempt_at with exponential backoff and jitter.
///
/// Formula: now + min(3600000, 1000 * 2^attempt_count) * (0.9 to 1.1)
/// - Capped at 1 hour to prevent runaway delays
/// - ±10% jitter prevents synchronized retries after outages
///
/// Used by:
/// - `dirty_sources` retry scheduling (document regeneration failures)
/// - `pending_discussion_fetches` retry scheduling (API fetch failures)
///
/// Having one implementation prevents subtle divergence between queues
/// (e.g., different caps or jitter ranges).
pub fn compute_next_attempt_at(now: i64, attempt_count: i64) -> i64 {
// Cap attempt_count to prevent overflow (2^30 > 1 hour anyway)
let capped_attempts = attempt_count.min(30) as u32;
let base_delay_ms = 1000_i64.saturating_mul(1 << capped_attempts);
let capped_delay_ms = base_delay_ms.min(3_600_000); // 1 hour cap
let capped_delay_ms = base_delay_ms.min(3_600_000);
// Add ±10% jitter
let jitter_factor = rand::thread_rng().gen_range(0.9..=1.1);
let delay_with_jitter = (capped_delay_ms as f64 * jitter_factor) as i64;
@@ -34,7 +20,6 @@ mod tests {
#[test]
fn test_exponential_curve() {
let now = 1_000_000_000_i64;
// Each attempt should roughly double the delay (within jitter)
for attempt in 1..=10 {
let result = compute_next_attempt_at(now, attempt);
let delay = result - now;
@@ -65,7 +50,7 @@ mod tests {
#[test]
fn test_jitter_range() {
let now = 1_000_000_000_i64;
let attempt = 5; // base = 32000
let attempt = 5;
let base = 1000_i64 * (1 << attempt);
let min_delay = (base as f64 * 0.89) as i64;
let max_delay = (base as f64 * 1.11) as i64;
@@ -85,7 +70,6 @@ mod tests {
let now = 1_000_000_000_i64;
let result = compute_next_attempt_at(now, 1);
let delay = result - now;
// attempt 1: base = 2000ms, with jitter: 1800-2200ms
assert!(
(1800..=2200).contains(&delay),
"first retry delay: {delay}ms"
@@ -95,7 +79,6 @@ mod tests {
#[test]
fn test_overflow_safety() {
let now = i64::MAX / 2;
// Should not panic even with very large attempt_count
let result = compute_next_attempt_at(now, i64::MAX);
assert!(result > now);
}