56 changes: 56 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,62 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- ARM NEON SIMD support (Go 1.26 `simd/archsimd` intrinsics — [#120](https://github.com/coregx/coregex/issues/120))
- SIMD prefilter for CompositeSequenceDFA (#83)

## [0.12.21] - 2026-03-27

### Performance
- **Tagged start states** (Rust `LazyStateID` approach) — start states get a tag bit and
always route to the slow path. Prefilter skip-ahead now runs only at the start state,
eliminating the O(n²) behavior caused by the start-state self-loop. Unlocks UseDFA for tiny NFA patterns.

- **DFA multiline $ fix** — EndLine look-ahead re-computation in determinize
(Rust mod.rs:131-212). `(?m)hello$` now works correctly in DFA.

- **Dead-state prefilter restart** in `searchEarliestMatch` — the IsMatch path uses the
prefilter to skip past dead states, matching Rust's `find_fwd_imp` approach.

- **1100x fewer mallocs** — FindAllIndex/FindAllSubmatchIndex use flat buffer
(`compactToSliceOfSlice`): N matches → 2 allocations instead of N+1.

- **Local SearchState cache** on Engine — atomic.Pointer single-slot cache
survives GC, avoids sync.Pool re-allocation overhead.

- **Tiny NFA → UseDFA routing** — patterns with fewer than 20 NFA states now use the
bidirectional DFA (previously PikeVM). The DFA is 7x faster than PikeVM on large inputs.
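
The single-slot cache described in the list above can be sketched roughly as follows. This is a minimal illustration only, not the project's code: the `searchState` and `engine` types, the `cached` field, and the `acquire`/`release` helpers are all hypothetical names. Only `atomic.Pointer` and its `Swap` and `CompareAndSwap` methods come from the standard library.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// searchState is a hypothetical stand-in for per-search scratch data.
type searchState struct {
	visited []bool
}

// engine holds at most one cached searchState (illustrative field name).
type engine struct {
	cached atomic.Pointer[searchState]
}

// acquire takes the cached state if the slot is occupied,
// otherwise it allocates a fresh one.
func (e *engine) acquire() *searchState {
	if s := e.cached.Swap(nil); s != nil {
		return s
	}
	return &searchState{visited: make([]bool, 0, 64)}
}

// release resets the state and stores it back only if the slot is empty.
// If another goroutine already refilled the slot, the state is simply
// dropped and left to the garbage collector.
func (e *engine) release(s *searchState) {
	s.visited = s.visited[:0]
	e.cached.CompareAndSwap(nil, s)
}

func main() {
	var e engine
	a := e.acquire()
	e.release(a)
	b := e.acquire()
	fmt.Println(a == b) // prints true: the same state was reused
}
```

Unlike a `sync.Pool` entry, a pointer held in a struct field is not released by the runtime between collections, which is presumably what "survives GC" refers to.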

### Added
- **`AllIndex(b []byte) iter.Seq[[2]int]`** — zero-alloc match index iterator (Go 1.23+)
- **`AllStringIndex(s string) iter.Seq[[2]int]`** — string version
- **`All(b []byte) iter.Seq[[]byte]`** — zero-alloc match content iterator
- **`AllString(s string) iter.Seq[string]`** — string version
- **`AppendAllIndex(dst [][2]int, b []byte, n int) [][2]int`** — buffer-reuse API
- **`AppendAllStringIndex(dst [][2]int, s string, n int) [][2]int`** — string version

Naming follows Go proposal #61902 (regexp iterator methods) and `strconv.Append*` convention.

### Fixed
- DFA `isMatchWithPrefilter` pfSkip off-by-one — `zx+` on "zzx" now correct
- DFA multiline `$` EndLine look-ahead — `(?m)hello$` now matches before `\n`

### Benchmarks (LangArena LogParser, 7.2 MB, 13 patterns)

| Metric | v0.12.20 | v0.12.21 | Improvement |
|--------|----------|----------|-------------|
| Total time (FindAll) | 163ms | **107ms** | **-34%** |
| errors pattern | 23ms | **8ms** (FindAll) / **5.5ms** (AllIndex) | **-65% / -76%** |
| vs Rust gap | 3.9x | **2.9x** (FindAll) / **1.7x** (AllIndex) | **-26% / -56%** |
| Mallocs/iter | 203K | **182** | **-99.9%** |

### Zero-Alloc API Benchmarks (new methods vs stdlib-compat)

| Method | Time / memory, errors pattern (33K matches) | Mallocs | vs Rust |
|--------|---------------------|-------|---------|
| FindAllStringIndex (stdlib) | 8.2ms / 3890 KB | 19 mallocs | 2.6x slower |
| **AllIndex (iter.Seq)** | **5.9ms / 0 KB** | **0 mallocs** | **1.7x slower** |
| **AppendAllIndex (reuse)** | **5.5ms / 0 KB** | **0 mallocs** | **1.7x slower** |
| Rust find_iter | 3.2ms / 0 | 0 | — |

emails pattern: `AppendAllIndex` **2.0ms vs Rust 2.6ms** — **faster than Rust!**

## [0.12.20] - 2026-03-25

### Performance
22 changes: 20 additions & 2 deletions README.md
@@ -83,6 +83,7 @@ Cross-language benchmarks on 6MB input, AMD EPYC ([source](https://github.com/ko
- Multi-pattern (`foo|bar|baz|...`) — Slim Teddy (≤32), Fat Teddy (33-64), or Aho-Corasick (>64)
- Anchored alternations (`^(\d+|UUID|hex32)`) — O(1) branch dispatch (5-20x)
- Concatenated char classes (`[a-zA-Z]+[0-9]+`) — DFA with byte classes (5-7x)
- **Zero-alloc iterators** (`AllIndex`, `AppendAllIndex`) — 0 heap allocs, up to **30% faster** than FindAll. Email pattern **faster than Rust** with `AppendAllIndex`.

## Features

@@ -130,11 +131,28 @@ Supported methods:
### Zero-Allocation APIs

```go
// Zero allocations — returns bool
// Zero allocations — boolean match
matched := re.IsMatch(text)

// Zero allocations — returns (start, end, found)
// Zero allocations — single match indices
start, end, found := re.FindIndices(text)

// Zero allocations — iterator over all matches (Go 1.23+)
for m := range re.AllIndex(data) {
	fmt.Printf("match at [%d, %d]\n", m[0], m[1])
}

// Zero allocations — match content iterator
for s := range re.AllString(text) {
	fmt.Println(s)
}

// Buffer-reuse — append to caller's slice (strconv.Append* pattern)
var buf [][2]int
for _, chunk := range chunks {
	buf = re.AppendAllIndex(buf[:0], chunk, -1)
	process(buf)
}
```

### Configuration
7 changes: 5 additions & 2 deletions ROADMAP.md
@@ -97,8 +97,11 @@ v0.12.18 ✅ → Flat DFA transition table, integrated prefilter, PikeVM skip-ah
v0.12.19 ✅ → Zero-alloc FindSubmatch, byte-based DFA cache, Rust-aligned visited limits
v0.12.20 (Current) → Premultiplied/tagged StateIDs, break-at-match DFA determinize,
Phase 3 elimination (2-pass bidirectional DFA)
v0.12.20 ✅ → Premultiplied/tagged StateIDs, break-at-match DFA determinize,
Phase 3 elimination (2-pass bidirectional DFA)
v0.12.21 (Current) → Tagged start states, zero-alloc API (AllIndex iter.Seq),
1100x fewer mallocs, UseDFA for tiny NFA, -34% LangArena
v1.0.0-rc → Feature freeze, API locked
24 changes: 24 additions & 0 deletions dfa/lazy/builder.go
@@ -64,6 +64,9 @@ func (b *Builder) Build() (*DFA, error) {
// Check if the NFA contains word boundary assertions
hasWordBoundary := b.checkHasWordBoundary()

// Check if the NFA contains EndLine ($) assertions
hasEndLine := b.checkHasEndLine()

// Check if the pattern is always anchored (has ^ prefix)
isAlwaysAnchored := b.nfa.IsAlwaysAnchored()

@@ -80,6 +83,7 @@ func (b *Builder) Build() (*DFA, error) {
byteClasses: b.nfa.ByteClasses(),
unanchoredStart: b.nfa.StartUnanchored(),
hasWordBoundary: hasWordBoundary,
hasEndLine: hasEndLine,
isAlwaysAnchored: isAlwaysAnchored,
startByteMap: startByteMap,
}
@@ -706,3 +710,23 @@ func (b *Builder) checkHasWordBoundary() bool {
}
return false
}

// checkHasEndLine checks if the NFA contains EndLine ($) look assertions.
// When true, determinize performs look-ahead re-computation on '\n' bytes.
// Computed once at DFA build time for O(1) check in hot loop.
func (b *Builder) checkHasEndLine() bool {
	numStates := b.nfa.States()
	for i := nfa.StateID(0); int(i) < numStates; i++ {
		state := b.nfa.State(i)
		if state == nil {
			continue
		}
		if state.Kind() == nfa.StateLook {
			look, _ := state.Look()
			if look == nfa.LookEndLine {
				return true
			}
		}
	}
	return false
}