Open
Labels
- area: prefilter (SIMD prefilters: memchr, memmem, Teddy)
- bench: vs-rust (Comparison with Rust regex crate)
- priority: high (Important for next release)
- type: performance (Speed/memory improvement or regression)
Description
Tracking issue for Aho-Corasick performance gap.
coregx/ahocorasick is about 2x slower than Rust's aho-corasick crate (38 ms vs 18 ms on a 6 MB input with 35 patterns). This affects:
- Prefilter for UseDFA with >32 case-fold literals
- UseAhoCorasick strategy (>64 patterns)
Upstream issue: coregx/ahocorasick#1
Impact on coregex
LangArena patterns using AC path:
- bots (?i): 39 literals, AC prefilter for DFA
- auth_attempts (?i): 128 → 18 literals (trimmed), now Teddy prefilter
- Any >64 complete literal alternation
Closing the AC gap would improve all AC-dependent patterns by ~2x.
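
For reference, a minimal sketch of which patterns end up on the AC path, based only on the thresholds quoted above (>32 case-fold literals, >64 patterns). The function and constant names are hypothetical and are not coregex's actual API; the handling of 33 to 64 complete literals is an assumption.

```go
package main

import "fmt"

// Thresholds quoted in this issue; the constant names are hypothetical.
const (
	maxTeddyLiterals = 32 // above this, case-fold literal sets get an AC prefilter
	maxDFAPatterns   = 64 // above this, complete alternations use UseAhoCorasick
)

// acPath returns which path a pattern with n extracted literals would take.
// complete means the literals cover the whole pattern (pure alternation).
// Assumption: 33..64 complete literals fall back to UseDFA with an AC prefilter.
func acPath(n int, complete bool) string {
	switch {
	case complete && n > maxDFAPatterns:
		return "UseAhoCorasick strategy"
	case n > maxTeddyLiterals:
		return "UseDFA with AC prefilter"
	default:
		return "no AC (e.g. Teddy prefilter)"
	}
}

func main() {
	fmt.Println(acPath(39, false)) // bots: UseDFA with AC prefilter
	fmt.Println(acPath(18, false)) // auth_attempts after trimming: no AC (e.g. Teddy prefilter)
	fmt.Println(acPath(100, true)) // large alternation: UseAhoCorasick strategy
}
```

Under these assumptions, only the first and third cases would benefit from closing the gap; auth_attempts already avoids AC after trimming.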