slab: remove slub sysfs interface files early for empty memcg caches
With kmem cgroup support enabled, kmem_caches can be created and
destroyed frequently and a great number of nearly empty kmem_caches can
accumulate if there are a lot of transient cgroups and the system is not
under memory pressure.  When memory reclaim starts under such
conditions, it can lead to consecutive deactivation and destruction of
many kmem_caches, easily hundreds of thousands on moderately large
systems, exposing scalability issues in the current slab management
code.  This is one of the patches to address the issue.

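Each cache has a number of sysfs interface files under /sys/kernel/slab.
On a system with a lot of memory and transient memcgs, the number of
interface files which have to be removed once memory reclaim kicks in
can reach millions.  Remove the files at deactivation time for memcg
caches that are empty after shrinking, instead of keeping them around
until the cache is finally destroyed.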

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Jay Vana <[email protected]>
Acked-by: Vladimir Davydov <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
htejun authored and torvalds committed Feb 23, 2017
1 parent 01fb58b commit 50862ce
Showing 1 changed file with 23 additions and 2 deletions.
25 changes: 23 additions & 2 deletions mm/slub.c
@@ -3959,8 +3959,20 @@ int __kmem_cache_shrink(struct kmem_cache *s)
 #ifdef CONFIG_MEMCG
 static void kmemcg_cache_deact_after_rcu(struct kmem_cache *s)
 {
-	/* called with all the locks held after a sched RCU grace period */
-	__kmem_cache_shrink(s);
+	/*
+	 * Called with all the locks held after a sched RCU grace period.
+	 * Even if @s becomes empty after shrinking, we can't know that @s
+	 * doesn't have allocations already in-flight and thus can't
+	 * destroy @s until the associated memcg is released.
+	 *
+	 * However, let's remove the sysfs files for empty caches here.
+	 * Each cache has a lot of interface files which aren't
+	 * particularly useful for empty draining caches; otherwise, we can
+	 * easily end up with millions of unnecessary sysfs files on
+	 * systems which have a lot of memory and transient cgroups.
+	 */
+	if (!__kmem_cache_shrink(s))
+		sysfs_slab_remove(s);
 }

 void __kmemcg_cache_deactivate(struct kmem_cache *s)
@@ -5659,6 +5671,15 @@ static void sysfs_slab_remove(struct kmem_cache *s)
 		 */
 		return;

+	if (!s->kobj.state_in_sysfs)
+		/*
+		 * For a memcg cache, this may be called during
+		 * deactivation and again on shutdown. Remove only once.
+		 * A cache is never shut down before deactivation is
+		 * complete, so no need to worry about synchronization.
+		 */
+		return;
+
 #ifdef CONFIG_MEMCG
 	kset_unregister(s->memcg_kset);
 #endif
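
The interaction between the two hunks can be illustrated with a small user-space model. This is only a sketch, not kernel code: struct toy_cache and every toy_* function are hypothetical stand-ins invented for illustration, and the model assumes, as the patch's use of the return value suggests, that the shrink step returns 0 only when the cache holds no objects. It shows that an empty cache loses its sysfs files at deactivation, a non-empty one keeps them until shutdown, and the state_in_sysfs guard makes the second removal attempt a no-op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for struct kmem_cache (not the kernel type). */
struct toy_cache {
	const char *name;
	int live_objects;	/* objects still allocated from this cache */
	bool state_in_sysfs;	/* mirrors s->kobj.state_in_sysfs in the patch */
	int sysfs_removals;	/* how many times the files were really removed */
};

/* Models __kmem_cache_shrink(): assumed to return 0 only for an empty cache. */
static int toy_cache_shrink(struct toy_cache *s)
{
	assert(s->live_objects >= 0);
	return s->live_objects != 0;
}

/* Models sysfs_slab_remove() with the new "remove only once" guard. */
static void toy_sysfs_remove(struct toy_cache *s)
{
	if (!s->state_in_sysfs)
		return;		/* already removed during deactivation */
	s->state_in_sysfs = false;
	s->sysfs_removals++;
	printf("%s: sysfs files removed\n", s->name);
}

/* Models kmemcg_cache_deact_after_rcu(): drop the files early if empty. */
static void toy_cache_deactivate(struct toy_cache *s)
{
	if (!toy_cache_shrink(s))
		toy_sysfs_remove(s);
}

/* Models the final shutdown path, which always attempts the removal. */
static void toy_cache_shutdown(struct toy_cache *s)
{
	toy_sysfs_remove(s);
}
```

Calling toy_cache_deactivate() and then toy_cache_shutdown() on a cache with live_objects == 0 removes the files during deactivation and leaves sysfs_removals at 1. With live_objects > 0 the files survive deactivation and are removed once, at shutdown.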