bpf: Optimize call_rcu in non-preallocated hash map.
Doing call_rcu() a million times a second becomes a bottleneck.
Convert the non-preallocated hash map from call_rcu to SLAB_TYPESAFE_BY_RCU.
An RCU grace period is no longer awaited for each individual htab element,
which makes the non-preallocated hash map behave just like the preallocated
hash map. The map elements are still released back to kernel memory only
after an RCU grace period is observed.
This improves 'map_perf_test 4' performance from 100k events per second
to 250k events per second.

bpf_mem_alloc + percpu_counter + typesafe_by_rcu provide a 10x performance
boost to the non-preallocated hash map and bring it within a few % of the
preallocated map while consuming a fraction of the memory.
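For context (not part of the original commit message): with
SLAB_TYPESAFE_BY_RCU, kmem_cache_free() may hand an object's memory to the
very next allocation from the same cache without waiting for a grace period;
only whole slab pages are RCU-deferred back to the page allocator. RCU
readers therefore never touch unmapped memory, but they can observe a
recycled object and must revalidate its identity. A minimal sketch with
hypothetical names (struct obj, obj_cache are illustrative, not from this
patch):

#include <linux/slab.h>
#include <linux/rcupdate.h>

struct obj {
	int key;
};

static struct kmem_cache *obj_cache;

static int obj_cache_init(void)
{
	/* Freed objects may be reused immediately for new allocations of
	 * the same type; only the backing slab page waits for an RCU
	 * grace period before going back to the page allocator.
	 */
	obj_cache = kmem_cache_create("obj_cache", sizeof(struct obj), 8,
				      SLAB_TYPESAFE_BY_RCU, NULL);
	return obj_cache ? 0 : -ENOMEM;
}

/* A reader may dereference a stale pointer safely, but must recheck that
 * the memory still holds the object it looked up.
 */
static bool obj_still_matches(struct obj *o, int key)
{
	bool match;

	rcu_read_lock();
	match = READ_ONCE(o->key) == key;
	rcu_read_unlock();
	return match;
}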

Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Alexei Starovoitov authored and borkmann committed Sep 5, 2022
1 parent 86fe28f commit 0fd7c5d
Showing 3 changed files with 7 additions and 14 deletions.
8 changes: 6 additions & 2 deletions kernel/bpf/hashtab.c
@@ -953,8 +953,12 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
 		__pcpu_freelist_push(&htab->freelist, &l->fnode);
 	} else {
 		dec_elem_count(htab);
-		l->htab = htab;
-		call_rcu(&l->rcu, htab_elem_free_rcu);
+		if (htab->map.map_type == BPF_MAP_TYPE_PERCPU_HASH) {
+			l->htab = htab;
+			call_rcu(&l->rcu, htab_elem_free_rcu);
+		} else {
+			htab_elem_free(htab, l);
+		}
 	}
 }
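Why the immediate htab_elem_free() in the new else branch is safe: the
element cache is now SLAB_TYPESAFE_BY_RCU (see the memalloc.c hunk below),
and the htab lookup path already tolerates recycled elements because it
walks the bucket under RCU on an hlist_nulls list and revalidates the hash
and the full key before returning a match. Simplified sketch of that
existing lookup helper (abridged from kernel/bpf/hashtab.c):

static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head,
					 u32 hash, void *key, u32 key_size)
{
	struct hlist_nulls_node *n;
	struct htab_elem *l;

	/* a recycled element fails the hash/key comparison and is skipped */
	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
		if (l->hash == hash && !memcmp(&l->key, key, key_size))
			return l;

	return NULL;
}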

2 changes: 1 addition & 1 deletion kernel/bpf/memalloc.c
@@ -281,7 +281,7 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size)
 		return -ENOMEM;
 	size += LLIST_NODE_SZ; /* room for llist_node */
 	snprintf(buf, sizeof(buf), "bpf-%u", size);
-	kmem_cache = kmem_cache_create(buf, size, 8, 0, NULL);
+	kmem_cache = kmem_cache_create(buf, size, 8, SLAB_TYPESAFE_BY_RCU, NULL);
 	if (!kmem_cache) {
 		free_percpu(pc);
 		return -ENOMEM;
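For reference, the kmem_cache_create() arguments here are (name, object
size, alignment, slab flags, constructor), so the only change is the flags
going from 0 to SLAB_TYPESAFE_BY_RCU. A hypothetical sketch of the
alloc/free behavior this buys (the "bpf-96" name just mirrors the "bpf-%u"
format above):

#include <linux/slab.h>

static void example_alloc_free(void)
{
	struct kmem_cache *c = kmem_cache_create("bpf-96", 96, 8,
						 SLAB_TYPESAFE_BY_RCU, NULL);
	void *p;

	if (!c)
		return;
	p = kmem_cache_alloc(c, GFP_ATOMIC);
	kmem_cache_free(c, p);
	/* p may be handed out again by the very next kmem_cache_alloc();
	 * there is no per-object grace period, which is exactly what
	 * removes the call_rcu() from the hash map free path.
	 */
	kmem_cache_destroy(c);
}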
11 changes: 0 additions & 11 deletions tools/testing/selftests/bpf/progs/timer.c
@@ -208,17 +208,6 @@ static int timer_cb2(void *map, int *key, struct hmap_elem *val)
 		 */
 		bpf_map_delete_elem(map, key);
 
-		/* in non-preallocated hashmap both 'key' and 'val' are RCU
-		 * protected and still valid though this element was deleted
-		 * from the map. Arm this timer for ~35 seconds. When callback
-		 * finishes the call_rcu will invoke:
-		 *  htab_elem_free_rcu
-		 *    check_and_free_timer
-		 *      bpf_timer_cancel_and_free
-		 * to cancel this 35 second sleep and delete the timer for real.
-		 */
-		if (bpf_timer_start(&val->timer, 1ull << 35, 0) != 0)
-			err |= 256;
 		ok |= 4;
 	}
 	return 0;
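The deleted comment documented the pre-patch free path that this selftest
depended on. Roughly, the call_rcu() callback chain it names looked like
the sketch below (simplified; not the exact kernel code, and
check_and_free_timer() is an internal hashtab.c helper):

static void htab_elem_free_rcu(struct rcu_head *head)
{
	struct htab_elem *l = container_of(head, struct htab_elem, rcu);
	struct bpf_htab *htab = l->htab;

	/* check_and_free_timer() calls bpf_timer_cancel_and_free(), which
	 * cancels the ~35 second timer armed by the test, so the timer is
	 * deleted for real before the element is released.
	 */
	check_and_free_timer(htab, l);
	htab_elem_free(htab, l);
}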
