B4/rhash#11300

Closed
mykyta5 wants to merge 13 commits into kernel-patches:bpf-next_base from mykyta5:b4/rhash
Conversation

Collaborator

@mykyta5 mykyta5 commented Mar 5, 2026

No description provided.

mykyta5 added 13 commits March 3, 2026 15:33
This patch series introduces BPF_MAP_TYPE_RHASH, a new hash map type that
leverages the kernel's rhashtable to provide a resizable hash map for BPF.

The existing BPF_MAP_TYPE_HASH uses a fixed number of buckets determined at
map creation time. While this works well for many use cases, it presents
challenges when:

1. The number of elements is unknown at creation time
2. The element count varies significantly during runtime
3. Memory efficiency is important (over-provisioning wastes memory,
 under-provisioning hurts performance)

BPF_MAP_TYPE_RHASH addresses these issues by using rhashtable, which
automatically grows and shrinks based on load factor.

The implementation wraps the kernel's rhashtable with BPF map operations:

- Uses bpf_mem_alloc for RCU-safe memory management
- Supports all standard map operations (lookup, update, delete, get_next_key)
- Supports batch operations (lookup_batch, lookup_and_delete_batch)
- Supports BPF iterators for traversal
- Supports BPF_F_LOCK for spin locks in values
- Requires BPF_F_NO_PREALLOC flag (elements allocated on demand)
- max_entries serves as a hard limit, not bucket count
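
As a declarative sketch, a map of this type might be defined from a BPF program as follows. This is hypothetical: the BPF_MAP_TYPE_RHASH enum value and the BPF_F_NO_PREALLOC requirement are taken from this cover letter, not from a released UAPI.

```c
/* Hypothetical map definition; BPF_MAP_TYPE_RHASH is the enum value this
 * series proposes and is not part of any released kernel UAPI. */
struct {
	__uint(type, BPF_MAP_TYPE_RHASH);
	__uint(map_flags, BPF_F_NO_PREALLOC); /* required: elements allocated on demand */
	__uint(max_entries, 10000);           /* hard limit, not a bucket count */
	__type(key, __u32);
	__type(value, __u64);
} rhash_map SEC(".maps");
```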

The series includes comprehensive tests:
- Basic operations in test_maps (lookup, update, delete, get_next_key)
- BPF program tests for lookup/update/delete semantics
- BPF_F_LOCK tests with concurrent access
- Stress tests for get_next_key during concurrent resize operations
- Seq file tests

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>

---
The current implementation of BPF_MAP_TYPE_RHASH does not provide the
same strong value-consistency guarantees under concurrent reads/writes
as BPF_MAP_TYPE_HASH.
BPF_MAP_TYPE_HASH allocates a new element and atomically swaps the
pointer, so RCU readers always see a complete value. BPF_MAP_TYPE_RHASH
does a memcpy in place with no lock held.
rhash trades consistency for speed (a 5x improvement in the update
benchmark): concurrent readers can observe partially updated data, and
two concurrent writers to the same key can interleave, producing mixed
values.
As a solution, users may set BPF_F_LOCK to guarantee consistent reads
and serialized writes.
Summary of the read consistency guarantees:
  map type     |  write mechanism |  read consistency
  -------------+------------------+--------------------------
  htab         |  alloc, swap ptr |  always consistent (RCU)
  htab  F_LOCK |  in-place + lock |  consistent if reader locks
  -------------+------------------+--------------------------
  rhtab        |  in-place memcpy |  torn reads
  rhtab F_LOCK |  in-place + lock |  consistent if reader locks

Changes in v2:
- Added benchmarks
- Link to v1: https://lore.kernel.org/r/20260205-rhash-v1-0-30dd6d63c462@meta.com

--- b4-submit-tracking ---
{
  "series": {
    "revision": 2,
    "change-id": "20251103-rhash-7b70069923d8",
    "prefixes": [
      "RFC bpf-next"
    ],
    "history": {
      "v1": [
        "20260205-rhash-v1-0-30dd6d63c462@meta.com"
      ]
    }
  }
}
Add resizable hash map into enums where it is needed.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Introduce basic operations for BPF_MAP_TYPE_RHASH, a new hash map type
built on top of the kernel's rhashtable.

Key implementation details:
- Uses rhashtable for automatic resizing with RCU-safe operations
- Elements allocated via bpf_mem_alloc for lock-free allocation
- Supports BPF_F_LOCK for spin_lock protected values
- Requires BPF_F_NO_PREALLOC

Implemented map operations:
 * map_alloc/map_free: Initialize and destroy the rhashtable
 * map_lookup_elem: RCU-protected lookup via rhashtable_lookup
 * map_update_elem: Insert or update with BPF_NOEXIST/EXIST/ANY
 * map_delete_elem: Remove element with RCU-deferred freeing
 * map_get_next_key: Returns the next key in the table
 * map_release_uref: Free internal structs (timers, workqueues)

Other operations (batch, seq file) are implemented in the next patch.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Add batch operations and BPF iterator support for BPF_MAP_TYPE_RHASH.

Batch operations:
 * rhtab_map_lookup_batch: Bulk lookup of elements by bucket
 * rhtab_map_lookup_and_delete_batch: Atomic bulk lookup and delete

The batch implementation iterates through buckets under RCU protection,
copying keys and values to userspace buffers. When the buffer fills
mid-bucket, it rolls back to the bucket boundary so the next call can
retry that bucket completely.

BPF iterator:
 * Uses rhashtable_walk_* API for safe iteration
 * Handles -EAGAIN during table resize transparently
 * Tracks skip_elems to resume iteration across read() calls

Also implements rhtab_map_mem_usage() to report memory consumption.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Test basic map operations (lookup, update, delete) for
BPF_MAP_TYPE_RHASH including boundary conditions like duplicate
key insertion and deletion of nonexistent keys.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Add tests validating that the resizable hash map handles the BPF_F_LOCK
flag as expected.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Test get_next_key behavior under concurrent modification:
 * Resize test: verify all elements visited after resize trigger
 * Stress test: concurrent iterators and modifiers to detect races

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Test BPF iterator functionality for BPF_MAP_TYPE_RHASH:
 * Basic iteration verifying all elements are visited
 * Overflow test triggering seq_file restart, validating correct
resume behavior via skip_elems tracking

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Make bpftool documentation aware of the resizable hash map.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Support resizable hashmap in BPF map benchmarks.

Results:
$ sudo ./bench -w3 -d10 -a bpf-rhashmap-full-update
0:hash_map_full_perf 21641414 events per sec

$ sudo ./bench -w3 -d10 -a bpf-hashmap-full-update
0:hash_map_full_perf 4392758 events per sec

$ sudo ./bench -w3 -d10 -a -p8 htab-mem --use-case overwrite --value-size 8
Iter   0 (302.834us): per-prod-op   62.85k/s, memory usage    2.70MiB
Iter   1 (-44.810us): per-prod-op   62.81k/s, memory usage    2.70MiB
Iter   2 (-45.821us): per-prod-op   62.81k/s, memory usage    2.70MiB
Iter   3 (-63.658us): per-prod-op   62.92k/s, memory usage    2.70MiB
Iter   4 ( 32.887us): per-prod-op   62.85k/s, memory usage    2.70MiB
Iter   5 (-76.948us): per-prod-op   62.75k/s, memory usage    2.70MiB
Iter   6 (157.235us): per-prod-op   63.01k/s, memory usage    2.70MiB
Iter   7 (-118.761us): per-prod-op   62.85k/s, memory usage    2.70MiB
Iter   8 (127.139us): per-prod-op   62.92k/s, memory usage    2.70MiB
Iter   9 (-169.908us): per-prod-op   62.99k/s, memory usage    2.70MiB
Iter  10 (101.962us): per-prod-op   62.97k/s, memory usage    2.70MiB
Iter  11 (-64.330us): per-prod-op   63.05k/s, memory usage    2.70MiB
Iter  12 (-20.543us): per-prod-op   62.86k/s, memory usage    2.70MiB
Iter  13 ( 55.382us): per-prod-op   62.95k/s, memory usage    2.70MiB
Summary: per-prod-op   62.92 ±    0.09k/s, memory usage    2.70 ±    0.00MiB, peak memory usage    2.96MiB

$ sudo ./bench -w3 -d10 -a -p8 rhtab-mem --use-case overwrite --value-size 8
Iter   0 (316.805us): per-prod-op   96.40k/s, memory usage    2.71MiB
Iter   1 (-35.225us): per-prod-op   96.54k/s, memory usage    2.71MiB
Iter   2 (-12.431us): per-prod-op   96.54k/s, memory usage    2.71MiB
Iter   3 (-56.537us): per-prod-op   96.58k/s, memory usage    2.71MiB
Iter   4 ( 27.108us): per-prod-op   96.62k/s, memory usage    2.71MiB
Iter   5 (-52.491us): per-prod-op   96.57k/s, memory usage    2.71MiB
Iter   6 ( -2.777us): per-prod-op   96.52k/s, memory usage    2.71MiB
Iter   7 (108.963us): per-prod-op   96.45k/s, memory usage    2.71MiB
Iter   8 (-61.575us): per-prod-op   96.48k/s, memory usage    2.71MiB
Iter   9 (-21.595us): per-prod-op   96.14k/s, memory usage    2.71MiB
Iter  10 (  3.243us): per-prod-op   96.36k/s, memory usage    2.71MiB
Iter  11 (  3.102us): per-prod-op   94.70k/s, memory usage    2.71MiB
Iter  12 (109.102us): per-prod-op   95.77k/s, memory usage    2.71MiB
Iter  13 ( 16.153us): per-prod-op   95.91k/s, memory usage    2.71MiB
Summary: per-prod-op   96.19 ±    0.57k/s, memory usage    2.71 ±    0.00MiB, peak memory usage    2.71MiB

$ sudo ./bench -w3 -d10 -a bpf-hashmap-lookup --key_size 4\
  --max_entries 1000 --nr_entries 500 --nr_loops 1000000
cpu00: lookup 28.603M ± 0.536M events/sec (approximated from 32 samples of ~34ms)

$ sudo ./bench -w3 -d10 -a bpf-rhashmap-lookup --key_size 4\
  --max_entries 1000 --nr_entries 500 --nr_loops 1000000
cpu00: lookup 27.340M ± 0.864M events/sec (approximated from 32 samples of ~36ms)

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
The else-if and else branches in rht_key_get_hash() both compute a hash
using either params.hashfn or jhash, differing only in the source of
key_len (params.key_len vs ht->p.key_len). Merge the two branches into
one by using the ternary `params.key_len ?: ht->p.key_len` to select
the key length, removing the duplicated logic.

This also improves the performance of the else branch which previously
always used jhash and never fell through to jhash2. This branch is going
to be used by BPF resizable hashmap, which wraps rhashtable:
https://lore.kernel.org/bpf/20260205-rhash-v1-0-30dd6d63c462@meta.com/

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 7 times, most recently from b72a510 to ebefa82 Compare March 11, 2026 01:10
@kernel-patches-daemon-bpf

Automatically cleaning up stale PR; feel free to reopen if needed
