perf: add benchmarks and speed up variant generation #11

Open
andrew wants to merge 3 commits into rustfoundation:main from andrew:feat/benchmarks

@andrew andrew commented May 11, 2026

Adds a criterion benchmark suite and uses it to find and fix the main bottleneck.

The benchmarks run each check (and the full harness) against the 200 most-downloaded crates.io packages, with input names of 3, 10, 16 and 22 characters. Run with cargo bench --bench checks.

The slow part turned out to be util::rebuild_name, which builds every candidate name with format!(). That allocates a new String for every candidate. Omitted alone generates ~870 candidates for a 22-character name, so that's ~870 allocations per call.

The fix adds rebuild_name_into, which writes into a buffer the caller owns. Omitted, Typos and SwappedWords now reuse one buffer for the whole loop and only clone it when a candidate actually matches.
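The buffer-reuse pattern described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the signature of `rebuild_name_into`, the `ALPHABET` constant, and the `omitted_matches` helper are all hypothetical. It also assumes that Omitted generates candidates by inserting each allowed character at each position, which would be consistent with the ~870 figure (23 positions × 38 characters) but is not stated here.

```rust
/// Hypothetical alphabet of characters allowed in a package name.
const ALPHABET: &str = "abcdefghijklmnopqrstuvwxyz0123456789-_";

/// Hypothetical stand-in for util::rebuild_name_into: writes the candidate
/// into a caller-owned buffer instead of returning a fresh String.
fn rebuild_name_into(buf: &mut String, prefix: &str, inserted: char, suffix: &str) {
    buf.clear(); // keeps the existing capacity, so no reallocation is needed
    buf.push_str(prefix);
    buf.push(inserted);
    buf.push_str(suffix);
}

/// Returns (number of candidates generated, candidates found in `known`).
fn omitted_matches(name: &str, known: &[&str]) -> (usize, Vec<String>) {
    // One buffer for the whole loop, sized for the longest candidate.
    let mut buf = String::with_capacity(name.len() + 1);
    let mut candidates = 0;
    let mut matches = Vec::new();

    // Every character boundary, plus the position after the last character.
    let positions = name
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(name.len()));

    for i in positions {
        let (prefix, suffix) = name.split_at(i);
        for c in ALPHABET.chars() {
            rebuild_name_into(&mut buf, prefix, c, suffix);
            candidates += 1;
            // Only allocate when a candidate actually matches.
            if known.iter().any(|k| *k == buf.as_str()) {
                matches.push(buf.clone());
            }
        }
    }
    (candidates, matches)
}

fn main() {
    let known = ["serde", "tokio", "rand"];
    let (candidates, matches) = omitted_matches("tkio", &known);
    println!("{candidates} candidates, matches: {matches:?}");
}
```

With this shape, the number of allocations scales with the number of matches rather than the number of candidates, which is where the saving would come from.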

Results on aarch64 macOS, full harness with all checks:

| input    | before   | after   |
|----------|----------|---------|
| 3 chars  | 10.5 µs  | 3.9 µs  |
| 10 chars | 36.2 µs  | 13.9 µs |
| 16 chars | 67.6 µs  | 22.3 µs |
| 22 chars | 104.5 µs | 35.6 µs |

Roughly 2.6x to 3x faster, depending on input length. No API changes.
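For anyone who wants to check the speedup figure against the timings above, the arithmetic can be reproduced with a few lines. The `RESULTS` constant and the helper names are made up for this sketch; the numbers are copied from the full-harness results.

```rust
/// (input length in chars, before in µs, after in µs), from the results above.
const RESULTS: [(u32, f64, f64); 4] = [
    (3, 10.5, 3.9),
    (10, 36.2, 13.9),
    (16, 67.6, 22.3),
    (22, 104.5, 35.6),
];

/// How many times faster the new code is.
fn speedup(before: f64, after: f64) -> f64 {
    before / after
}

/// Percentage of the original run time that was removed.
fn reduction_percent(before: f64, after: f64) -> f64 {
    (1.0 - after / before) * 100.0
}

fn main() {
    for (len, before, after) in RESULTS {
        println!(
            "{len:>2} chars: {:.2}x faster, {:.1}% less time",
            speedup(before, after),
            reduction_percent(before, after)
        );
    }
}
```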

This will conflict with #9 in omitted.rs, swapped.rs, typos.rs and with #10 in util.rs. Happy to rebase once those land.

Related to the discussion in #2, though it takes a different approach.

andrew added 3 commits May 11, 2026 16:18
Per-check and full-harness timings against the top 200 crates.io
packages by downloads, across four input lengths (3-22 chars).

Run with:

    cargo bench --bench checks

util::rebuild_name went through format!() for every generated
variant, and each check allocated a fresh String per candidate.
Add rebuild_name_into which writes into a caller-provided buffer,
and convert the hot checks (Omitted, Typos, SwappedWords) to reuse
one buffer across the loop, only cloning when a match is found.

SwappedWords additionally allocated a delimiter String via
format!() and a joined String per permutation; both are now
written into the shared buffer.

Against the saved baseline on aarch64-apple-darwin:

    omitted             -58% to -66%
    swapped_words       -37% to -57%
    typos               -43% to -64%
    swapped_characters  -16% to -42% (from rebuild_name change only)
    harness             -58% to -67%

For the longest input (22 chars) the full harness drops from
~104us to ~36us.