
perf: use runtime maphash for sharded-cache key hashing#362

Merged
jfberry merged 1 commit into
mainfrom
perf/maphash-shardkey
May 24, 2026
Conversation

@jfberry
Collaborator

@jfberry jfberry commented May 24, 2026

Summary

StringKeyToShard is called on every Get / Set / Delete / GetOrSetFunc call against pokestopCache, gymCache, and stationCache, which easily amounts to tens of thousands of calls per second under typical GMO ingest load.

Switch from hash/fnv to hash/maphash, which is ~5× faster on the hot path.
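As a rough illustration of the switch, the sketch below puts an FNV-1a shard function next to a maphash one. The function names, the `shardCount` value, and the placeholder key are assumptions made for this example; the actual StringKeyToShard signature and shard count are not shown in this PR.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"hash/maphash"
)

// shardCount is a hypothetical value; the real cache's shard count is not shown in this PR.
const shardCount = 64

// shardSeed is created once per process, so shard assignments are stable
// within a run but differ between runs.
var shardSeed = maphash.MakeSeed()

// shardFNV sketches the old approach: FNV-1a over the key bytes.
func shardFNV(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key)) // Write on an fnv hash never returns an error
	return h.Sum64() % shardCount
}

// shardMaphash sketches the new approach: the seeded string hash from hash/maphash.
func shardMaphash(key string) uint64 {
	return maphash.String(shardSeed, key) % shardCount
}

func main() {
	key := "0123456789abcdef0123456789abcdef.16" // 35-char placeholder, not a real fort ID
	fmt.Println("fnv shard:    ", shardFNV(key))
	fmt.Println("maphash shard:", shardMaphash(key))
}
```

The two functions will generally disagree on which shard a key belongs to, and the maphash result changes from one process to the next, so only one of them should be in use for a given cache at any time.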

Benchmark

Single-key and many-distinct-keys variants, 5 runs each, Go 1.26 on Apple M3 Pro, 35-char fort IDs (typical):

| Implementation | Time | Allocs |
| --- | --- | --- |
| fnv.New64a() + Write([]byte(key)) | ~17.5 ns/op | 0 alloc/op |
| maphash.String(seed, key) | ~3.4 ns/op | 0 alloc/op |

Both are alloc-free under Go 1.26 — escape analysis keeps the fnv hash struct and the []byte(key) cast on the stack. The win is purely CPU.
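A comparison along these lines can be reproduced with a small standalone program. This is a minimal sketch, not the benchmark used for the numbers above: the helper names and the placeholder key are assumptions, it covers only the single-key variant, and results will vary with Go version and hardware.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"hash/maphash"
	"testing"
)

var benchSeed = maphash.MakeSeed()

// benchKey is a 35-char placeholder standing in for a fort ID.
var benchKey = "0123456789abcdef0123456789abcdef.16"

// sink keeps the compiler from discarding the hash results.
var sink uint64

func hashFNV(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func hashMaphash(key string) uint64 {
	return maphash.String(benchSeed, key)
}

func benchFNV(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = hashFNV(benchKey)
	}
}

func benchMaphash(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sink = hashMaphash(benchKey)
	}
}

func main() {
	// testing.Benchmark runs each function for roughly one second by default.
	for _, c := range []struct {
		name string
		fn   func(*testing.B)
	}{{"fnv", benchFNV}, {"maphash", benchMaphash}} {
		r := testing.Benchmark(c.fn)
		fmt.Println(c.name, r.String(), r.MemString())
	}
}
```

Running it with `go run` prints ns/op and allocs/op for both hashes. Inside a repository, the same two functions would more commonly live in a `_test.go` file as `BenchmarkXxx` functions and be run with `go test -bench`.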

Safety notes

  • maphash.MakeSeed() is captured once at package init. Shard assignments are in-memory only and never persisted, so a random per-process seed is fine.
  • After this change, items hash to different shards across restarts. Not visible externally — sharding is purely about lock-contention distribution inside the cache layer.
  • Distribution properties are at least as good as FNV-1a (maphash is the same hash the Go runtime uses for map).
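The seed behaviour described in these notes can be checked with a short program. The sketch below is an illustration under assumptions: `numShards`, the helper names, and the synthetic key format are made up for the example and do not come from the cache code.

```go
package main

import (
	"fmt"
	"hash/maphash"
)

// numShards is a hypothetical shard count chosen for the example.
const numShards = 16

// demoSeed plays the role of the package-level seed captured once at init.
var demoSeed = maphash.MakeSeed()

// shardOf maps a key to a shard index for the given seed.
func shardOf(seed maphash.Seed, key string) int {
	return int(maphash.String(seed, key) % numShards)
}

// histogram counts how many of n synthetic keys land in each shard.
func histogram(seed maphash.Seed, n int) [numShards]int {
	var counts [numShards]int
	for i := 0; i < n; i++ {
		counts[shardOf(seed, fmt.Sprintf("fort-%030d", i))]++
	}
	return counts
}

func main() {
	key := "example-fort-id"

	// Same seed, same key: the shard is stable for the life of the process.
	fmt.Println(shardOf(demoSeed, key) == shardOf(demoSeed, key))

	// A second seed stands in for a restarted process; the shard may differ.
	otherSeed := maphash.MakeSeed()
	fmt.Println(shardOf(demoSeed, key), shardOf(otherSeed, key))

	// Per-shard counts should come out roughly even.
	fmt.Println(histogram(demoSeed, 160000))
}
```

Because nothing outside the process ever sees a shard index, the only property the cache relies on is the first one: a key keeps mapping to the same shard while the process is running.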

Test plan

  • go build ./...
  • go vet ./decoder/
  • go test ./decoder/ -race -count=1
  • Benchmark above

🤖 Generated with Claude Code

StringKeyToShard is on every Get/Set/Delete against pokestopCache,
gymCache, and stationCache — easily tens of thousands of calls per
second under typical GMO load.

Switch from hash/fnv to hash/maphash, which is ~5× faster on the hot
path (~17 ns/op → ~3.4 ns/op, measured on Go 1.26 / Apple M3 Pro with
35-char fort IDs). Both implementations are alloc-free thanks to Go
1.26's escape analysis on the fnv struct and []byte cast, so the win
is pure CPU.

maphash.MakeSeed() is captured once at package init; a per-process
random seed is fine because shard assignments are in-memory only and
never persisted across restarts.
@jfberry jfberry merged commit 81215ff into main May 24, 2026
3 checks passed
@jfberry jfberry deleted the perf/maphash-shardkey branch May 24, 2026 08:51
@lenisko
Contributor

lenisko commented May 24, 2026

Nice to have, but gain close to nothing :)

@jfberry
Collaborator Author

jfberry commented May 24, 2026

It's called hundreds of times per GMO, so it's not nothing, but indeed it's hardly going to be noticeable.

@lenisko
Contributor

lenisko commented May 24, 2026

image

Yeah, overall it was taking about 0.05% of Golbat CPU share. Right now it went down to 0.02% (low on samples).

At this point we might need to plan reuse of memory if we want to optimize Golbat further. Everything else looks tight enough.

@jfberry
Collaborator Author

jfberry commented May 24, 2026

If you can find a way to optimise the proto memory, that would be great, but pooling won't work given the shape of GMOs. This broad section of allocations is outside of our control.


2 participants