lib: cgen: group same-key maps using bitmasks #520
pzmarzly wants to merge 4 commits into facebook:main
Conversation

@claude review

Claude review of PR #520 (a7ebb96): Must fix / Suggestions / Nits
qdeslandes left a comment
Thanks for working on this, I think it's a good first implementation, but there are a few things that could be improved to simplify it.
When grouping sets, using a dedicated structure would be helpful:
```c
struct bf_set_group {
    key;  /* key format shared by the grouped sets */
    sets; /* sets using that key format */
};
```
When looping over sets, you can ignore the empty ones, then for non-empty ones find whether a set group already exists and add the set there. You don't necessarily need a "leader".
Fixups can hold a pointer to a set directly, it will be easier to resolve them, instead of relying on a set index. This is possible as the fixup (and bf_program) aren't serialized, they're only temporary objects (which was not the case before).
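A minimal sketch of that idea (all names hypothetical, not bpfilter's actual API): the fixup stores a `struct bf_set *` directly, so resolving it is a pointer dereference instead of an index lookup into the program's set array.

```c
#include <stddef.h>

/* Hypothetical types, for illustration only -- not bpfilter's actual API. */
struct bf_set {
    size_t elem_size; /* key size, stands in for the real set contents */
};

struct bf_fixup {
    size_t insn_idx;          /* instruction to patch */
    const struct bf_set *set; /* direct pointer: trivially resolved, and
                               * safe because fixups (and bf_program) are
                               * temporary objects, never serialized */
};

/* Resolving a fixup no longer needs the program's set index/array. */
static const struct bf_set *bf_fixup_resolve(const struct bf_fixup *fixup)
{
    return fixup->set;
}
```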
Force-pushed from a49eca0 to 7f30d9d
That's amazing. Trying this. @claude review
```c
if (r)
    return bf_err_r(r, "failed to add set elements to the map");
memset(value, 0, value_size);
(void)bf_bpf_map_lookup_elem(new_map->fd, elem, value);
```
Claude: must-fix: For LPM trie maps, bf_bpf_map_lookup_elem performs longest-prefix matching, not exact-key matching. When inserting an element from set B (e.g. 192.168.1.0/24), the lookup may return the value for a shorter prefix from set A (e.g. 192.168.0.0/16) that was already inserted. The subsequent OR would incorrectly propagate set A's bit into set B's entry.
This means a packet matching 192.168.1.0/24 would appear to belong to both set A and set B, producing false positives. Consider either:
- Skipping the lookup for LPM trie groups and always writing only the current set's bit, or
- Preventing LPM trie sets from being grouped together when their prefixes can overlap.
> This means a packet matching 192.168.1.0/24 would appear to belong to both set A and set B
I mean, logically it does belong to both sets, so your example is not the best. However, thanks for bringing up LPM maps. They have footguns. Let's exclude them from grouping up.
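The footgun is easy to reproduce in a userspace model of longest-prefix matching (a sketch, not bpfilter code): when set B's `192.168.1.0/24` is inserted with a read-modify-write, the lookup matches set A's already-inserted `192.168.0.0/16` and ORs set A's bit into set B's new entry.

```c
#include <stdint.h>
#include <stddef.h>

struct lpm_entry {
    uint32_t prefix;  /* IPv4 prefix as a host-order integer (demo only) */
    int plen;         /* prefix length */
    uint8_t bitmask;  /* which sets of the group the entry belongs to */
};

/* Longest-prefix match, mimicking BPF_FUNC_map_lookup_elem on an LPM trie. */
static const struct lpm_entry *lpm_lookup(const struct lpm_entry *t, size_t n,
                                          uint32_t ip)
{
    const struct lpm_entry *best = NULL;

    for (size_t i = 0; i < n; ++i) {
        uint32_t mask = t[i].plen ? ~0u << (32 - t[i].plen) : 0;

        if ((ip & mask) == (t[i].prefix & mask) &&
            (!best || t[i].plen > best->plen))
            best = &t[i];
    }

    return best;
}
```

With set A (`0x01`) holding `192.168.0.0/16`, looking up `192.168.1.0` before inserting set B's (`0x02`) `/24` returns the `/16` entry, and the OR produces `0x03` instead of `0x02`.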
Force-pushed from 1794c42 to 4629064
Force-pushed from 4629064 to a7ebb96
```diff
 (void)snprintf(name, BPF_OBJ_NAME_LEN, _BF_SET_MAP_PREFIX "%04x",
-               (uint8_t)set_idx++);
+               (uint16_t)map_idx++);
 r = bf_map_new_from_set(&map, name, set);
```
Claude: suggestion: The (void)bf_bpf_map_lookup_elem(...) discards the return value unconditionally. If the syscall fails for a reason other than ENOENT (e.g., EFAULT from a bad fd), the error is silently swallowed and the bitmask is built from zeroed data, which could cause silent incorrect behavior. Consider at least logging at debug level on unexpected failure:
```c
int lr = bf_bpf_map_lookup_elem(new_map->fd, elem->data, value);
if (lr && lr != -ENOENT)
    bf_dbg("unexpected map lookup error: %d", lr);
```

```c
bf_list_foreach (&group->sets, set_node) {
    const struct bf_set *set = bf_list_node_get_data(set_node);
    ...
    bf_hashset_foreach (&set->elems, elem) {
```
Claude: suggestion: The non-owning alias map_ref is stored before ownership transfer and assigned to group->map after the push. This is correct today, but fragile: if future code adds an early return between bf_list_push and group->map = map_ref, the group's map pointer will be left as NULL with no indication of what went wrong. Consider assigning group->map immediately after the successful push, or adding a brief comment noting the dependency.
```c
return r;
continue;
bf_list_foreach (&new_prog->set_groups, group_node) {
    struct bf_set_group *group = bf_list_node_get_data(group_node);
```
Claude: suggestion: total_elems sums all elements across all sets in the group, but keys shared across sets will occupy a single map entry. For groups with heavily overlapping sets this over-estimates the BPF map's max_entries, wasting kernel memory. Safe but worth a brief comment documenting the trade-off (or a cap to prevent requesting an excessively large map from the kernel).
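For a sense of the over-estimate: if the group's keys were deduplicated first (sorted here, for simplicity), the true entry count is the size of the union, not the sum (a sketch; names hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Count distinct keys across two sorted key arrays: the number of map
 * entries a merged map actually needs, versus the na + nb upper bound. */
static size_t count_union(const uint32_t *a, size_t na,
                          const uint32_t *b, size_t nb)
{
    size_t i = 0, j = 0, n = 0;

    while (i < na && j < nb) {
        if (a[i] == b[j]) {
            ++i;
            ++j;
        } else if (a[i] < b[j]) {
            ++i;
        } else {
            ++j;
        }
        ++n; /* one distinct key consumed per iteration */
    }

    return n + (na - i) + (nb - j);
}
```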
```c
@@ -71,10 +71,13 @@ int bf_map_new(struct bf_map **map, const char *name, enum bf_map_type type,
 * also as filename when pinning the map to the system. Can't be NULL or
 * empty.
 * @param set Set to create the map from. Can't be NULL.
```
Claude: nit: The @param set documentation reads "Set to create the map from" but the set is now used only for its key format (elem_size, use_trie), not its elements. Consider updating to: "Set whose key format defines the map's key size and type."
```c
 * the actual entry count may be smaller, but we can err on the
 * safe side. */
bf_list_foreach (&group->sets, set_node) {
    const struct bf_set *set = bf_list_node_get_data(set_node);
```
Claude: suggestion: The read-modify-write loop performs 2 syscalls (lookup + update) per element per set. For single-set groups (the common case and all LPM trie groups), the bitmask is always 0x01, so the lookup is pure overhead. Consider a fast path: when n_sets == 1, skip the lookup and unconditionally write a static 0x01 value, or use bf_bpf_map_update_batch as the old code did. This avoids a per-element syscall regression for chains that do not benefit from grouping.
I think bf_bpf_map_update_batch is the right approach here: we can use bf_hashset to prepare the final bf_map shape on the userspace side. I'll try it later. However, it's not a dealbreaker if we don't get it now — even with 2 syscalls per item, it only slows down load_maps 2-3x in microbenchmarks.
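A sketch of the single-set fast path suggested above (hypothetical helper, not bpfilter's actual code; `prev` stands in for the looked-up value, all zeroes when the key is new):

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Build the value to write for one element of the set at position set_pos
 * within its group of n_sets sets. */
static void build_mask(uint8_t *value, size_t value_size,
                       const uint8_t *prev, size_t set_pos, size_t n_sets)
{
    if (n_sets == 1) {
        /* fast path: the mask is always 0x01, no lookup syscall needed */
        memset(value, 0, value_size);
        value[0] = 0x01;
        return;
    }

    /* read-modify-write path: OR this set's bit into the existing mask */
    memcpy(value, prev, value_size);
    value[set_pos / 8] |= (uint8_t)(1u << (set_pos % 8));
}
```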
Closes #381.
We're grouping all sets in a chain by key format. Then, sets of the same format will turn into a single BPF map, with bitmasks in its values that indicate which sets the element appears in. With the 64 maps per program limit, we should now be able to have 64 unique key formats per program, which is much less likely to hit than just 64 sets.
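On the lookup side, the rule for a given set only has to test its bit in the grouped map's value (a hypothetical C helper standing in for the generated BPF instructions):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the element whose map value is `value` belongs to the set at
 * position set_pos within its group. */
static bool elem_in_set(const uint8_t *value, size_t set_pos)
{
    return value[set_pos / 8] & (1u << (set_pos % 8));
}
```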
There is an exception for LPM tries though.
`BPF_FUNC_map_lookup_elem` on an LPM trie performs longest-prefix match. So for a ruleset like:

If we grouped `a` and `b` together, and got a packet from `192.168.1.1`, `BPF_FUNC_map_lookup_elem` would return set `a` (key: `192.168.1.0/24`, value: `0x01`) and the packet would be accepted. Therefore, LPM-based sets are excluded from merging.