perf: tracking — Aho-Corasick 2x performance gap (coregx/ahocorasick#1) #144

@kolkov

Tracking issue for Aho-Corasick performance gap.

coregx/ahocorasick is about 2x slower than Rust's aho-corasick crate (38 ms vs 18 ms on a 6 MB input with 35 patterns). This affects:

  • Prefilter for UseDFA with >32 case-fold literals
  • UseAhoCorasick strategy (>64 patterns)

Upstream issue: coregx/ahocorasick#1

Impact on coregex

LangArena patterns using AC path:

  • bots (?i) — 39 literals, AC prefilter for DFA
  • auth_attempts (?i) — 128 → 18 literals (trimmed), now Teddy prefilter
  • Any alternation of more than 64 complete literals
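
The thresholds mentioned in this issue can be summarized with a rough sketch like the one below. This is not coregex's actual selection code: the function `pickPath`, its parameters, and the returned labels are hypothetical, and the real logic almost certainly checks more conditions. It only encodes the three cases listed above (more than 64 complete literals, more than 32 prefilter literals, and everything else).

```go
package main

import "fmt"

const (
	// Thresholds as quoted in this issue; the constant names are hypothetical.
	maxTeddyLiterals  = 32 // above this, the DFA prefilter falls back to AC
	maxDFAAlternation = 64 // above this, complete literal alternations use AC directly
)

// pickPath is an illustrative guess at how a pattern ends up on the AC path.
// numLiterals is the literal count after any trimming and case-fold expansion;
// complete reports whether the literals cover the whole pattern.
func pickPath(numLiterals int, complete bool) string {
	switch {
	case complete && numLiterals > maxDFAAlternation:
		return "UseAhoCorasick"
	case numLiterals > maxTeddyLiterals:
		return "UseDFA + AC prefilter"
	default:
		return "UseDFA + Teddy prefilter"
	}
}

func main() {
	fmt.Println("bots:", pickPath(39, false))          // 39 literals
	fmt.Println("auth_attempts:", pickPath(18, false)) // trimmed from 128 to 18
	fmt.Println("100-way alternation:", pickPath(100, true))
}
```

With these assumptions, only the first and third example depend on AC speed; a pattern trimmed down to Teddy's range, like auth_attempts, no longer does.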

Closing the AC gap could speed up the AC stage of every AC-dependent pattern by up to ~2x.

Metadata

Labels

  • area: prefilter (SIMD prefilters: memchr, memmem, Teddy)
  • bench: vs-rust (comparison with Rust regex crate)
  • priority: high (important for next release)
  • type: performance (speed/memory improvement or regression)
