3 Commits

Author SHA1 Message Date
eb2328b768 Enhance filter pipeline with synonym-aware categories and deal sorting
- extend filter.Options with sort mode support and keep Apply as a single-pass pipeline with limit behavior preserved for unsorted flows
- add sort normalization and two ordering strategies:
  * savings: rank by computed DealScore with deterministic title tie-break
  * ending: rank by earliest parsed end date, then DealScore fallback
- introduce DealScore heuristics that combine BOGO weighting, dollar-off extraction, and percentage extraction from savings/deal-info text
- add category synonym matcher that supports:
  * direct case-insensitive matches
  * canonical group synonym expansion (e.g. veggies -> produce)
  * normalized fallback for hyphen/underscore/plural variants without breaking exact unknown-category matching
- include explicit tests for synonym matching, hyphenated category handling, unknown plural exact matching, and sort ordering behavior
- keep allocation-sensitive behavior intact while adding matcher precomputation and fast-path checks
2026-02-23 00:26:55 -05:00
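
As an illustration of the "savings" ordering described in eb2328b768, here is a minimal sketch. The Deal type, the regular expressions, and the weights (50 for BOGO, 10 per dollar, 1 per percentage point) are assumptions made for this example and are not taken from the commit; only the idea of scoring from BOGO, dollar-off, and percent-off text with a title tie-break comes from the message.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
	"strings"
)

// Deal is a hypothetical stand-in for the real item type.
type Deal struct {
	Title   string
	Savings string // free-form text such as "Save $2.50" or "20% off"
	IsBOGO  bool
}

var (
	dollarRe  = regexp.MustCompile(`\$(\d+(?:\.\d+)?)`)
	percentRe = regexp.MustCompile(`(\d+(?:\.\d+)?)\s*%`)
)

// firstNumber returns the first captured number of re in s, or 0 if none.
func firstNumber(re *regexp.Regexp, s string) float64 {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return 0
	}
	v, err := strconv.ParseFloat(m[1], 64)
	if err != nil {
		return 0
	}
	return v
}

// DealScore combines a BOGO bonus with dollar-off and percent-off amounts.
// The weights are illustrative assumptions, not the commit's values.
func DealScore(d Deal) float64 {
	score := 0.0
	if d.IsBOGO {
		score += 50
	}
	score += firstNumber(dollarRe, d.Savings) * 10
	score += firstNumber(percentRe, d.Savings)
	return score
}

// SortBySavings orders deals by descending score and breaks ties by
// case-insensitive title so equal scores sort predictably.
func SortBySavings(deals []Deal) {
	sort.SliceStable(deals, func(i, j int) bool {
		si, sj := DealScore(deals[i]), DealScore(deals[j])
		if si != sj {
			return si > sj
		}
		return strings.ToLower(deals[i].Title) < strings.ToLower(deals[j].Title)
	})
}

func main() {
	deals := []Deal{
		{"Milk", "Save $1.00", false},
		{"Apples", "10% off", false},
		{"Chips", "", true},
	}
	SortBySavings(deals)
	for _, d := range deals {
		fmt.Println(d.Title)
	}
}
```

Milk and Apples score the same under these weights, so the title tie-break decides their relative order.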
df0af4a5f8 Rewrite filter.Apply as single-pass with early-exit and pre-allocation
Replace the multi-pass where() chain in Apply() with a single loop that
evaluates all filter predicates per item and skips immediately on first
mismatch. This eliminates N intermediate slice allocations (one per
active filter) and avoids re-scanning the full dataset for each filter
dimension.

Key changes in filter.go:
- Single loop with continue-on-mismatch for BOGO, category, department,
  and query filters. The combined categories check scans item.Categories
  once for both BOGO and category instead of twice
- Pre-allocate result slice capped at min(len(items), opts.Limit) to
  avoid grow-and-copy churn
- Fast-path bypass when no filters are active (just apply limit)
- Break early once limit is reached instead of filtering everything
  and truncating after
- Remove the now-unused where() helper function
- Add early-return fast paths to CleanText() for the common case where
  input contains no HTML entities or newlines, avoiding unnecessary
  html.UnescapeString and ReplaceAll calls
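
The loop structure listed above can be sketched roughly as follows. Item and Options are simplified, hypothetical types (the real SavingItem fields are not shown in this log), and treating Limit == 0 as "no limit" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Item and Options are simplified, hypothetical stand-ins for the real types.
type Item struct {
	Title, Description, Department string
	Categories                     []string
}

type Options struct {
	BOGO                        bool
	Category, Department, Query string
	Limit                       int // assumption: 0 means "no limit"
}

// Apply filters items in one pass and stops as soon as the limit is reached.
func Apply(items []Item, opts Options) []Item {
	limit := len(items)
	if opts.Limit > 0 && opts.Limit < limit {
		limit = opts.Limit
	}
	// Fast path: no active filters, so only the limit applies.
	// Note that the result shares the input's backing array.
	if !opts.BOGO && opts.Category == "" && opts.Department == "" && opts.Query == "" {
		return items[:limit]
	}
	dept := strings.ToLower(opts.Department)
	query := strings.ToLower(opts.Query)
	out := make([]Item, 0, limit) // pre-allocated, capped at min(len(items), limit)
	for _, it := range items {
		// One scan of Categories answers both the BOGO and category checks.
		needBOGO, needCat := opts.BOGO, opts.Category != ""
		for _, c := range it.Categories {
			if needBOGO && strings.EqualFold(c, "bogo") {
				needBOGO = false
			}
			if needCat && strings.EqualFold(c, opts.Category) {
				needCat = false
			}
			if !needBOGO && !needCat {
				break
			}
		}
		if needBOGO || needCat {
			continue
		}
		if dept != "" && !strings.Contains(strings.ToLower(it.Department), dept) {
			continue
		}
		if query != "" &&
			!strings.Contains(strings.ToLower(it.Title), query) &&
			!strings.Contains(strings.ToLower(it.Description), query) {
			continue
		}
		out = append(out, it)
		if len(out) == limit {
			break // early exit instead of filtering everything and truncating
		}
	}
	return out
}

func main() {
	items := []Item{
		{Title: "Chips", Department: "Snacks", Categories: []string{"BOGO", "snacks"}},
		{Title: "Apples", Department: "Produce", Categories: []string{"produce"}},
		{Title: "Salsa", Department: "Snacks", Categories: []string{"bogo"}},
	}
	for _, it := range Apply(items, Options{BOGO: true, Limit: 1}) {
		fmt.Println(it.Title)
	}
}
```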

Test coverage:
- filter_equivalence_test.go (new): Reference implementation of the
  original multi-pass algorithm with 500 randomized test cases verifying
  behavioral equivalence. Includes allocation budget guardrail (<=80
  allocs/op for 1k items) to catch accidental regression to multi-pass.
  Benchmarks for new vs legacy reference on identical workload.
- filter_test.go: Benchmarks comparing the new and legacy CleanText on
  plain text (fast path) and on escaped HTML (full path).
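
For context on what those CleanText benchmarks compare, a fast-path version might look like the sketch below. Replacing newlines with a single space is an assumption; the commit only says that the html.UnescapeString and ReplaceAll calls are skipped when the input has no entities or newlines.

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// CleanText unescapes HTML entities and flattens newlines.
// Assumption: newlines become single spaces; the real replacement may differ.
func CleanText(s string) string {
	// Fast path: no '&' means no entities, and no '\n' means nothing to flatten,
	// so the input can be returned without any allocation.
	if !strings.ContainsAny(s, "&\n") {
		return s
	}
	s = html.UnescapeString(s)
	s = strings.ReplaceAll(s, "\r\n", " ")
	s = strings.ReplaceAll(s, "\n", " ")
	return s
}

func main() {
	fmt.Println(CleanText("Plain title"))
	fmt.Println(CleanText("Fresh &amp; Easy\nDeals"))
}
```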

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 00:11:38 -05:00
12eb55f4b8 Add deal filtering engine with BOGO, category, department, and keyword support
Composable filter pipeline that processes SavingItem slices through
chained predicates: BOGO detection (category match), exact category
match, substring department match, and keyword search across title
and description fields. All text matching is case-insensitive.

Includes utility functions for HTML entity unescaping (CleanText),
nil-safe string pointer dereferencing (Deref), and case-insensitive
slice membership (ContainsIgnoreCase). An optional limit truncates
results after all filters are applied. Tests cover each filter in
isolation, combined filters, nil field safety, and the Categories
aggregation helper.
2026-02-22 21:41:46 -05:00
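
A minimal sketch of the two smaller helpers named in 12eb55f4b8 follows. The signatures are assumptions inferred from the descriptions ("nil-safe string pointer dereferencing" and "case-insensitive slice membership"); the actual ones are not shown in this log.

```go
package main

import (
	"fmt"
	"strings"
)

// Deref returns the pointed-to string, or "" when the pointer is nil.
func Deref(s *string) string {
	if s == nil {
		return ""
	}
	return *s
}

// ContainsIgnoreCase reports whether target is in list, ignoring case.
func ContainsIgnoreCase(list []string, target string) bool {
	for _, v := range list {
		if strings.EqualFold(v, target) {
			return true
		}
	}
	return false
}

func main() {
	var missing *string
	dept := "Produce"
	fmt.Printf("%q %q\n", Deref(missing), Deref(&dept))
	fmt.Println(ContainsIgnoreCase([]string{"BOGO", "Snacks"}, "bogo"))
}
```

With helpers like these, a predicate can read optional fields without a nil check at every call site.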