mm, slab: extend slab/shrink to shrink all memcg caches
Currently, a value of "1" is written to the /sys/kernel/slab/<slab>/shrink
file to shrink the slab by flushing out all the per-cpu slabs and free
slabs in partial lists.  This can be useful to squeeze out a bit more
memory under extreme conditions as well as to make the active object
counts in /proc/slabinfo more accurate.

This usually applies only to the root caches, as the SLUB_MEMCG_SYSFS_ON
option is usually not enabled and "slub_memcg_sysfs=1" not set.  Even if
memcg sysfs is turned on, it is too cumbersome and impractical to manage
all those per-memcg sysfs files in a real production system.

So there is no practical way to shrink memcg caches.  Fix this by enabling
a proper write to the shrink sysfs file of the root cache to scan all the
available memcg caches and shrink them as well.  For a non-root memcg
cache (when SLUB_MEMCG_SYSFS_ON or slub_memcg_sysfs is on), only that
cache will be shrunk when written.
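
As a minimal user-space sketch (not part of this patch), the write
described above could be issued from C roughly as follows.  The helper
names shrink_path() and request_shrink() are hypothetical; only the
/sys/kernel/slab/<slab>/shrink path and the value "1" come from the
text above, and root privileges are assumed.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Build "/sys/kernel/slab/<cache>/shrink" into buf.
 * Returns 0 on success, or -1 if the path does not fit.
 */
static int shrink_path(char *buf, size_t len, const char *cache)
{
	int n = snprintf(buf, len, "/sys/kernel/slab/%s/shrink", cache);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write "1" to the given shrink file.
 * Returns 0 on success or a negative errno value.
 */
static int request_shrink(const char *path)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;
	int err;

	if (fd < 0)
		return -errno;

	n = write(fd, "1", 1);
	err = (n == 1) ? 0 : (n < 0 ? -errno : -EIO);
	close(fd);
	return err;
}
```

With this patch applied, calling request_shrink() on the path of a root
cache such as task_struct would also shrink its memcg caches; on a
non-root memcg cache it would shrink only that cache.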

On a 2-socket 64-core 256-thread arm64 system with 64k pages, after
a parallel kernel build, the amount of memory occupied by slabs
before shrinking was:

 # grep task_struct /proc/slabinfo
 task_struct        53137  53192   4288   61    4 : tunables    0    0    0 : slabdata    872    872      0
 # grep "^S[lRU]" /proc/meminfo
 Slab:            3936832 kB
 SReclaimable:     399104 kB
 SUnreclaim:      3537728 kB

After shrinking slabs (by echoing "1" to all shrink files):

 # grep "^S[lRU]" /proc/meminfo
 Slab:            1356288 kB
 SReclaimable:     263296 kB
 SUnreclaim:      1092992 kB
 # grep task_struct /proc/slabinfo
 task_struct         2764   6832   4288   61    4 : tunables    0    0    0 : slabdata    112    112      0
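
The task_struct lines above can be cross-checked with a short
calculation.  The sketch below assumes the slabinfo version 2.1 column
layout (name, active_objs, num_objs, objsize, objperslab, pagesperslab,
then the tunables and slabdata groups) and the 64k page size stated
above; the struct and function names are illustrative only.

```c
#include <stdio.h>

/* Fields of interest from one /proc/slabinfo data line. */
struct slab_stat {
	char name[64];
	unsigned long active_objs, num_objs, objsize;
	unsigned long objperslab, pagesperslab;
	unsigned long active_slabs, num_slabs;
};

/*
 * Parse one slabinfo data line (assumed version 2.1 layout).
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_slabinfo_line(const char *line, struct slab_stat *st)
{
	int n = sscanf(line,
		"%63s %lu %lu %lu %lu %lu : tunables %*u %*u %*u : slabdata %lu %lu",
		st->name, &st->active_objs, &st->num_objs, &st->objsize,
		&st->objperslab, &st->pagesperslab,
		&st->active_slabs, &st->num_slabs);

	return n == 8 ? 0 : -1;
}

/* Memory held by the cache's slabs, in kB, for a page size given in kB. */
static unsigned long slab_footprint_kb(const struct slab_stat *st,
				       unsigned long page_kb)
{
	return st->num_slabs * st->pagesperslab * page_kb;
}
```

For the lines above this gives 872 * 4 * 64 kB = 223232 kB held by
task_struct slabs before shrinking and 112 * 4 * 64 kB = 28672 kB
afterwards.  In both lines num_objs equals objperslab * num_slabs
(61 * 872 = 53192 and 61 * 112 = 6832).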

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Waiman Long <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Waiman-Long authored and torvalds committed Sep 24, 2019
1 parent 1c3ce54 commit 04f768a
Showing 4 changed files with 48 additions and 5 deletions.
13 changes: 9 additions & 4 deletions Documentation/ABI/testing/sysfs-kernel-slab
@@ -429,10 +429,15 @@ KernelVersion: 2.6.22
 Contact: Pekka Enberg <[email protected]>,
 		Christoph Lameter <[email protected]>
 Description:
-		The shrink file is written when memory should be reclaimed from
-		a cache. Empty partial slabs are freed and the partial list is
-		sorted so the slabs with the fewest available objects are used
-		first.
+		The shrink file is used to reclaim unused slab cache
+		memory from a cache. Empty per-cpu or partial slabs
+		are freed and the partial list is sorted so the slabs
+		with the fewest available objects are used first.
+		It only accepts a value of "1" on write for shrinking
+		the cache. Other input values are considered invalid.
+		Shrinking slab caches might be expensive and can
+		adversely impact other running applications. So it
+		should be used with care.
 
 What: /sys/kernel/slab/cache/slab_size
 Date: May 2007
1 change: 1 addition & 0 deletions mm/slab.h
@@ -174,6 +174,7 @@ int __kmem_cache_shrink(struct kmem_cache *);
 void __kmemcg_cache_deactivate(struct kmem_cache *s);
 void __kmemcg_cache_deactivate_after_rcu(struct kmem_cache *s);
 void slab_kmem_cache_release(struct kmem_cache *);
+void kmem_cache_shrink_all(struct kmem_cache *s);
 
 struct seq_file;
 struct file;
37 changes: 37 additions & 0 deletions mm/slab_common.c
@@ -981,6 +981,43 @@ int kmem_cache_shrink(struct kmem_cache *cachep)
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
 
+/**
+ * kmem_cache_shrink_all - shrink a cache and all memcg caches for root cache
+ * @s: The cache pointer
+ */
+void kmem_cache_shrink_all(struct kmem_cache *s)
+{
+	struct kmem_cache *c;
+
+	if (!IS_ENABLED(CONFIG_MEMCG_KMEM) || !is_root_cache(s)) {
+		kmem_cache_shrink(s);
+		return;
+	}
+
+	get_online_cpus();
+	get_online_mems();
+	kasan_cache_shrink(s);
+	__kmem_cache_shrink(s);
+
+	/*
+	 * We have to take the slab_mutex to protect from the memcg list
+	 * modification.
+	 */
+	mutex_lock(&slab_mutex);
+	for_each_memcg_cache(c, s) {
+		/*
+		 * Don't need to shrink deactivated memcg caches.
+		 */
+		if (c->flags & SLAB_DEACTIVATED)
+			continue;
+		kasan_cache_shrink(c);
+		__kmem_cache_shrink(c);
+	}
+	mutex_unlock(&slab_mutex);
+	put_online_mems();
+	put_online_cpus();
+}
+
 bool slab_is_available(void)
 {
 	return slab_state >= UP;
2 changes: 1 addition & 1 deletion mm/slub.c
@@ -5298,7 +5298,7 @@ static ssize_t shrink_store(struct kmem_cache *s,
 			const char *buf, size_t length)
 {
 	if (buf[0] == '1')
-		kmem_cache_shrink(s);
+		kmem_cache_shrink_all(s);
 	else
 		return -EINVAL;
 	return length;
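
The traversal that kmem_cache_shrink_all() performs for a root cache can
be modeled in user space as a rough sketch.  Everything here is a
hypothetical stand-in: struct model_cache, MODEL_DEACTIVATED and the
model_* functions are not kernel APIs, and the CPU/memory hotplug
references and the slab_mutex locking of the real function are
deliberately left out.

```c
#include <stddef.h>

#define MODEL_DEACTIVATED 0x1u	/* hypothetical stand-in for SLAB_DEACTIVATED */

struct model_cache {
	unsigned int flags;
	unsigned int free_slabs;	/* slabs a shrink would release */
	struct model_cache *children;	/* per-memcg caches; NULL for a child */
	size_t nr_children;
};

/* Release the free slabs of one cache; returns how many were released. */
static unsigned int model_shrink(struct model_cache *c)
{
	unsigned int freed = c->free_slabs;

	c->free_slabs = 0;
	return freed;
}

/* Shrink the root cache, then every child that is not deactivated. */
static unsigned int model_shrink_all(struct model_cache *root)
{
	unsigned int freed = model_shrink(root);
	size_t i;

	for (i = 0; i < root->nr_children; i++) {
		struct model_cache *c = &root->children[i];

		/* Deactivated children are skipped. */
		if (c->flags & MODEL_DEACTIVATED)
			continue;
		freed += model_shrink(c);
	}
	return freed;
}
```

In this model, a root cache with 10 free slabs and children holding 5, 7
(deactivated) and 2 free slabs releases 17 slabs in one call, and the
deactivated child is left untouched.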
