Merge tag 'mlx5-updates-2022-10-24' of git://git.kernel.org/pub/scm/l…
…inux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2022-10-24

SW steering updates from Yevgeny Kliteynik:

1) First four patches: small fixes / optimizations for SW steering:

 - Patch 1: Don't abort destroy flow if failed to destroy table - continue
   and free everything else.
 - Patches 2 and 3 deal with fast teardown:
    + Skip sync during fast teardown, as PCI device is not there any more.
    + Check device state when polling CQ - otherwise SW steering keeps polling
      the CQ forever, because nobody is there to flush it.
 - Patch 4: Remove an unneeded function argument.
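
For illustration only, the idea behind patch 3 can be sketched in plain userspace C. This is not the driver code: `fake_dev`, `fake_poll_cq_once` and `wait_for_completions` are hypothetical names, and the `max_spins` budget is an extra safety net added for this sketch rather than something the series describes. The point is that the wait loop checks the device state on every iteration, so it stops as soon as nobody is left to flush the CQ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device state and CQ; not the mlx5 API. */
enum dev_state { DEV_STATE_UP, DEV_STATE_INTERNAL_ERROR };

struct fake_dev {
	enum dev_state state;
	int pending_completions; /* completions the "hardware" will still deliver */
};

/* Consume one completion if available: returns 1 if consumed, 0 if CQ is empty. */
static int fake_poll_cq_once(struct fake_dev *dev)
{
	if (dev->state != DEV_STATE_UP || dev->pending_completions == 0)
		return 0;
	dev->pending_completions--;
	return 1;
}

/*
 * Wait for 'want' completions.
 * Returns 0 on success, -1 if the device is in error state or the
 * (illustrative) spin budget runs out before all completions arrive.
 */
static int wait_for_completions(struct fake_dev *dev, int want, long max_spins)
{
	int got = 0;

	assert(dev != NULL && want >= 0);

	while (got < want) {
		/* Without this check the loop would spin forever on a dead device. */
		if (dev->state == DEV_STATE_INTERNAL_ERROR)
			return -1;
		if (max_spins-- <= 0)
			return -1;
		got += fake_poll_cq_once(dev);
	}
	return 0;
}
```

A caller would treat the -1 return as "stop waiting and continue with teardown" rather than as a reason to retry.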

2) Deal with the hiccups that we get during rules insertion/deletion,
which sometimes reach 1/4 of a second. While insertion/deletion rate
improvement was not the focus here, it still is a by-product of removing these
hiccups.

Another by-product is the reduced standard deviation in measuring the duration
of rules insertion/deletion bursts.

In the testing we add K rules (warm-up phase), and then continuously do
insertion/deletion bursts of N rules.
During the test execution, the driver measures hiccups (amount and duration)
and total time for insertion/deletion of a batch of rules.

Here are some numbers, before and after these patches:

+--------------------------------------------+-----------------+----------------+
|                                            |   Create rules  |  Delete rules  |
|                                            +--------+--------+--------+-------+
|                                            | Before |  After | Before | After |
+--------------------------------------------+--------+--------+--------+-------+
| Max hiccup [msec]                          |    253 |     42 |    254 |    68 |
+--------------------------------------------+--------+--------+--------+-------+
| Avg duration of 10K rules add/remove [msec]| 140.07 | 124.32 | 106.99 | 99.51 |
+--------------------------------------------+--------+--------+--------+-------+
| Num of hiccups per 100K rules add/remove   |   7.77 |   7.97 |  12.60 | 11.57 |
+--------------------------------------------+--------+--------+--------+-------+
| Avg hiccup duration [msec]                 |  36.92 |  33.25 |  36.15 | 33.74 |
+--------------------------------------------+--------+--------+--------+-------+
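
As a reading aid for the table above (not part of the original measurements), the relative change of each row can be computed as (before - after) / before. The helper below is a hypothetical sketch; the only data used are the numbers from the table. For the max hiccup row it gives roughly 83% for rule creation (253 -> 42 msec) and roughly 73% for rule deletion (254 -> 68 msec).

```c
#include <assert.h>

/*
 * Relative reduction in percent between a "before" and an "after" value.
 * A negative result means the value increased.
 */
static double reduction_pct(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

A negative result, as for the number of hiccups per 100K created rules (7.77 -> 7.97), means that row got slightly worse rather than better.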

 - Patch 5: Allocate a short array on the stack instead of dynamically, since
   it is destroyed at the end of the function anyway.
 - Patch 6: Rather than cleaning the corresponding chunk's section of
   ste_arrays on chunk deletion, initialize these areas upon chunk creation.
   Chunk destruction tends to come in large batches (during pool syncing),
   so instead of doing huge memory initialization during pool sync,
   we amortize this by doing small initializations on chunk creation.
 - Patch 7: To simplify the error flow and allow cleaner addition
   of new pools, handle creation/destruction of all the domain's memory pools
   and other memory-related fields in separate init/uninit functions.
 - Patch 8: During rehash, write each table row immediately instead of waiting
   for the whole table to be ready and writing it all - saves allocations
   of ste_send_info structures and improves performance.
 - Patch 9: Instead of allocating/freeing send info objects dynamically,
   manage them in a pool. The number of send info objects doesn't depend on
   number of rules, so after pre-populating the pool with an initial batch of
   send info objects, the pool is not expected to grow.
   This way we save alloc/free during writing STEs to ICM, which by itself can
   sometimes take up to 40msec.
 - Patch 10: Allocate icm_chunks from their own slab allocator, which lowered
   the alloc/free "hiccups" frequency.
 - Patch 11: Similar to patch 10, allocate htbl from its own slab allocator.
 - Patch 12: Lower sync threshold for ICM hot memory - set the threshold for
   sync to 1/4 of the pool instead of 1/2 of the pool. Although we will have
   more syncs, each sync will be shorter and will help with insertion rate
   stability. Also, notice that the overall number of hiccups wasn't increased
   due to all the other patches.
 - Patch 13: Keep track of hot ICM chunks in an array instead of list.
   After steering sync, we traverse the hot list and finally free all the
   chunks. It appears that traversing a long list takes an unusually long time
   due to cache misses on many entries, which causes a big "hiccup" during
   rule insertion. This patch replaces the list with a pre-allocated array that
   stores only the bookkeeping information that is needed to later free the
   chunks in their buddy allocator.
 - Patch 14: Remove the unneeded buddy used_list - we don't need to have the
   list of used chunks, we only need the total amount of used memory.
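
To make the pooling idea of patch 9 concrete, here is a minimal userspace sketch, assuming a simple singly linked free list. It is not the driver's implementation: `info_pool`, `send_info`, `pool_fill`, `pool_get`, `pool_put` and the batch size `POOL_BATCH` are all hypothetical. Objects are allocated in batches, handed out from the free list and returned to it, so in steady state no allocator call happens on the hot path.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_BATCH 4 /* hypothetical batch size, chosen only for this sketch */

struct send_info {
	struct send_info *next_free; /* link used only while the object is free */
	unsigned char data[64];      /* placeholder payload */
};

struct info_pool {
	struct send_info *free_list;
	size_t total;  /* objects ever allocated by the pool */
	size_t in_use; /* objects currently handed out */
};

/* Allocate one batch and push it on the free list. Returns 0 or -1. */
static int pool_fill(struct info_pool *pool)
{
	for (int i = 0; i < POOL_BATCH; i++) {
		struct send_info *obj = calloc(1, sizeof(*obj));

		if (!obj)
			return -1;
		obj->next_free = pool->free_list;
		pool->free_list = obj;
		pool->total++;
	}
	return 0;
}

/* Take an object from the pool, growing the pool only if it is empty. */
static struct send_info *pool_get(struct info_pool *pool)
{
	struct send_info *obj;

	if (!pool->free_list)
		pool_fill(pool); /* a partial fill may still leave usable objects */
	if (!pool->free_list)
		return NULL;

	obj = pool->free_list;
	pool->free_list = obj->next_free;
	obj->next_free = NULL;
	pool->in_use++;
	return obj;
}

/* Return an object to the pool; nothing is freed here. */
static void pool_put(struct info_pool *pool, struct send_info *obj)
{
	assert(pool->in_use > 0);
	obj->next_free = pool->free_list;
	pool->free_list = obj;
	pool->in_use--;
}

/* Free every pooled object; all objects must have been returned first. */
static void pool_destroy(struct info_pool *pool)
{
	assert(pool->in_use == 0);
	while (pool->free_list) {
		struct send_info *obj = pool->free_list;

		pool->free_list = obj->next_free;
		free(obj);
	}
	pool->total = 0;
}
```

In this sketch the pool grows only when a `pool_get` finds the free list empty, which matches the expectation stated above that a pre-populated pool does not need to grow when the number of objects in flight is bounded.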

* tag 'mlx5-updates-2022-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: DR, Remove the buddy used_list
  net/mlx5: DR, Keep track of hot ICM chunks in an array instead of list
  net/mlx5: DR, Lower sync threshold for ICM hot memory
  net/mlx5: DR, Allocate htbl from its own slab allocator
  net/mlx5: DR, Allocate icm_chunks from their own slab allocator
  net/mlx5: DR, Manage STE send info objects in pool
  net/mlx5: DR, In rehash write the line in the entry immediately
  net/mlx5: DR, Handle domain memory resources init/uninit separately
  net/mlx5: DR, Initialize chunk's ste_arrays at chunk creation
  net/mlx5: DR, For short chains of STEs, avoid allocating ste_arr dynamically
  net/mlx5: DR, Remove unneeded argument from dr_icm_chunk_destroy
  net/mlx5: DR, Check device state when polling CQ
  net/mlx5: DR, Fix the SMFS sync_steering for fast teardown
  net/mlx5: DR, In destroy flow, free resources even if FW command failed
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
kuba-moo committed Oct 29, 2022
2 parents eb288cb + edaea00 commit 02a97e0
Showing 10 changed files with 406 additions and 138 deletions.
2 changes: 0 additions & 2 deletions drivers/net/ethernet/mellanox/mlx5/core/steering/dr_buddy.c
@@ -15,8 +15,6 @@ int mlx5dr_buddy_init(struct mlx5dr_icm_buddy_mem *buddy,
 	buddy->max_order = max_order;

 	INIT_LIST_HEAD(&buddy->list_node);
-	INIT_LIST_HEAD(&buddy->used_list);
-	INIT_LIST_HEAD(&buddy->hot_list);

 	buddy->bitmap = kcalloc(buddy->max_order + 1,
 				sizeof(*buddy->bitmap),
7 changes: 7 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c
@@ -271,6 +271,13 @@ int mlx5dr_cmd_sync_steering(struct mlx5_core_dev *mdev)
 {
 	u32 in[MLX5_ST_SZ_DW(sync_steering_in)] = {};

+	/* Skip SYNC in case the device is internal error state.
+	 * Besides a device error, this also happens when we're
+	 * in fast teardown
+	 */
+	if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+		return 0;
+
 	MLX5_SET(sync_steering_in, in, opcode, MLX5_CMD_OP_SYNC_STEERING);

 	return mlx5_cmd_exec_in(mdev, sync_steering, in);
89 changes: 71 additions & 18 deletions drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c
@@ -56,6 +56,70 @@ int mlx5dr_domain_get_recalc_cs_ft_addr(struct mlx5dr_domain *dmn,
 	return 0;
 }

+static int dr_domain_init_mem_resources(struct mlx5dr_domain *dmn)
+{
+	int ret;
+
+	dmn->chunks_kmem_cache = kmem_cache_create("mlx5_dr_chunks",
+						   sizeof(struct mlx5dr_icm_chunk), 0,
+						   SLAB_HWCACHE_ALIGN, NULL);
+	if (!dmn->chunks_kmem_cache) {
+		mlx5dr_err(dmn, "Couldn't create chunks kmem_cache\n");
+		return -ENOMEM;
+	}
+
+	dmn->htbls_kmem_cache = kmem_cache_create("mlx5_dr_htbls",
+						  sizeof(struct mlx5dr_ste_htbl), 0,
+						  SLAB_HWCACHE_ALIGN, NULL);
+	if (!dmn->htbls_kmem_cache) {
+		mlx5dr_err(dmn, "Couldn't create hash tables kmem_cache\n");
+		ret = -ENOMEM;
+		goto free_chunks_kmem_cache;
+	}
+
+	dmn->ste_icm_pool = mlx5dr_icm_pool_create(dmn, DR_ICM_TYPE_STE);
+	if (!dmn->ste_icm_pool) {
+		mlx5dr_err(dmn, "Couldn't get icm memory\n");
+		ret = -ENOMEM;
+		goto free_htbls_kmem_cache;
+	}
+
+	dmn->action_icm_pool = mlx5dr_icm_pool_create(dmn, DR_ICM_TYPE_MODIFY_ACTION);
+	if (!dmn->action_icm_pool) {
+		mlx5dr_err(dmn, "Couldn't get action icm memory\n");
+		ret = -ENOMEM;
+		goto free_ste_icm_pool;
+	}
+
+	ret = mlx5dr_send_info_pool_create(dmn);
+	if (ret) {
+		mlx5dr_err(dmn, "Couldn't create send info pool\n");
+		goto free_action_icm_pool;
+	}
+
+	return 0;
+
+free_action_icm_pool:
+	mlx5dr_icm_pool_destroy(dmn->action_icm_pool);
+free_ste_icm_pool:
+	mlx5dr_icm_pool_destroy(dmn->ste_icm_pool);
+free_htbls_kmem_cache:
+	kmem_cache_destroy(dmn->htbls_kmem_cache);
+free_chunks_kmem_cache:
+	kmem_cache_destroy(dmn->chunks_kmem_cache);
+
+	return ret;
+}
+
+static void dr_domain_uninit_mem_resources(struct mlx5dr_domain *dmn)
+{
+	mlx5dr_send_info_pool_destroy(dmn);
+	mlx5dr_icm_pool_destroy(dmn->action_icm_pool);
+	mlx5dr_icm_pool_destroy(dmn->ste_icm_pool);
+	kmem_cache_destroy(dmn->htbls_kmem_cache);
+	kmem_cache_destroy(dmn->chunks_kmem_cache);
+}
+
 static int dr_domain_init_resources(struct mlx5dr_domain *dmn)
 {
 	int ret;
@@ -79,32 +143,22 @@ static int dr_domain_init_resources(struct mlx5dr_domain *dmn)
 		goto clean_pd;
 	}

-	dmn->ste_icm_pool = mlx5dr_icm_pool_create(dmn, DR_ICM_TYPE_STE);
-	if (!dmn->ste_icm_pool) {
-		mlx5dr_err(dmn, "Couldn't get icm memory\n");
-		ret = -ENOMEM;
+	ret = dr_domain_init_mem_resources(dmn);
+	if (ret) {
+		mlx5dr_err(dmn, "Couldn't create domain memory resources\n");
 		goto clean_uar;
 	}

-	dmn->action_icm_pool = mlx5dr_icm_pool_create(dmn, DR_ICM_TYPE_MODIFY_ACTION);
-	if (!dmn->action_icm_pool) {
-		mlx5dr_err(dmn, "Couldn't get action icm memory\n");
-		ret = -ENOMEM;
-		goto free_ste_icm_pool;
-	}
-
 	ret = mlx5dr_send_ring_alloc(dmn);
 	if (ret) {
 		mlx5dr_err(dmn, "Couldn't create send-ring\n");
-		goto free_action_icm_pool;
+		goto clean_mem_resources;
 	}

 	return 0;

-free_action_icm_pool:
-	mlx5dr_icm_pool_destroy(dmn->action_icm_pool);
-free_ste_icm_pool:
-	mlx5dr_icm_pool_destroy(dmn->ste_icm_pool);
+clean_mem_resources:
+	dr_domain_uninit_mem_resources(dmn);
 clean_uar:
 	mlx5_put_uars_page(dmn->mdev, dmn->uar);
 clean_pd:
@@ -116,8 +170,7 @@ static int dr_domain_init_resources(struct mlx5dr_domain *dmn)
 static void dr_domain_uninit_resources(struct mlx5dr_domain *dmn)
 {
 	mlx5dr_send_ring_free(dmn, dmn->send_ring);
-	mlx5dr_icm_pool_destroy(dmn->action_icm_pool);
-	mlx5dr_icm_pool_destroy(dmn->ste_icm_pool);
+	dr_domain_uninit_mem_resources(dmn);
 	mlx5_put_uars_page(dmn->mdev, dmn->uar);
 	mlx5_core_dealloc_pd(dmn->mdev, dmn->pdn);
 }
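
The init/uninit split in dr_domain.c follows the usual kernel "goto unwind" idiom: each acquisition step jumps, on failure, to a label that releases everything acquired before it, in reverse order. The userspace sketch below shows the same shape under stated assumptions: `struct resources`, `try_alloc` and the `fail_at` fault-injection parameter are hypothetical and exist only so the error paths can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct resources {
	void *a;
	void *b;
	void *c;
};

/* Hypothetical fault injection: the allocation at step 'fail_at' fails. */
static void *try_alloc(int step, int fail_at)
{
	return step == fail_at ? NULL : malloc(16);
}

/* Acquire a, b, c in order; on failure release what was already acquired. */
static int resources_init(struct resources *r, int fail_at)
{
	int ret;

	assert(r != NULL);

	r->a = try_alloc(1, fail_at);
	if (!r->a)
		return -ENOMEM;

	r->b = try_alloc(2, fail_at);
	if (!r->b) {
		ret = -ENOMEM;
		goto free_a;
	}

	r->c = try_alloc(3, fail_at);
	if (!r->c) {
		ret = -ENOMEM;
		goto free_b;
	}

	return 0;

free_b:
	free(r->b);
	r->b = NULL;
free_a:
	free(r->a);
	r->a = NULL;
	return ret;
}

/* Release in the reverse order of acquisition. */
static void resources_uninit(struct resources *r)
{
	free(r->c);
	free(r->b);
	free(r->a);
	r->a = r->b = r->c = NULL;
}
```

Keeping the teardown in one uninit function means a caller's own error path needs only a single label for this group of resources, which is the shape `clean_mem_resources` takes in the hunk above.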