Revert "mm/vmscan: never demote for memcg reclaim"
This reverts commit 3a23569.

Its premise was that cgroup reclaim cares about freeing memory inside the
cgroup, and demotion just moves pages around within the cgroup limit.
Hence, pages from toptier nodes should be reclaimed directly.

However, with NUMA balancing now doing tier promotions, demotion is part
of the page aging process.  Global reclaim demotes the coldest toptier
pages to secondary memory, where their life continues and from which they
have a chance to get promoted back.  Essentially, tiered memory systems
have an LRU order that spans multiple nodes.

When cgroup reclaims pages coming off the toptier directly, there can be
colder pages on lower tier nodes that were demoted by global reclaim. 
This is an aging inversion, not unlike if cgroups were to reclaim directly
from the active lists while there are inactive pages.
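
To make the aging inversion concrete, here is a minimal user-space sketch. It is not kernel code: `struct page_model`, `coldest_on_tier()` and `toptier_reclaim_inverts_aging()` are hypothetical names invented for illustration, and "idle seconds" is an assumed stand-in for a page's position in the LRU order. It models cgroup reclaim picking its victim from the toptier only, and reports whether a colder page is left behind on a lower tier.

```c
#include <stddef.h>

/* Hypothetical model of a page: which tier it sits on and how long it
 * has been idle. Tier 0 is the toptier; higher numbers are lower tiers. */
struct page_model {
	int tier;
	unsigned int idle_secs;
};

/* Return the index of the coldest (longest idle) page on the given tier,
 * or n if that tier holds no pages. */
size_t coldest_on_tier(const struct page_model *pages, size_t n, int tier)
{
	size_t best = n;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].tier != tier)
			continue;
		if (best == n || pages[i].idle_secs > pages[best].idle_secs)
			best = i;
	}
	return best;
}

/* Return 1 if reclaiming the coldest toptier page directly would free a
 * page while a colder page survives on a lower tier, 0 otherwise. */
int toptier_reclaim_inverts_aging(const struct page_model *pages, size_t n)
{
	size_t victim = coldest_on_tier(pages, n, 0);

	if (victim == n)
		return 0; /* nothing on the toptier to reclaim */

	for (size_t i = 0; i < n; i++) {
		if (pages[i].tier > 0 &&
		    pages[i].idle_secs > pages[victim].idle_secs)
			return 1;
	}
	return 0;
}
```

With two toptier pages idle for 10 and 30 seconds and two demoted pages idle for 90 and 120 seconds, the toptier victim is the 30-second page even though both demoted pages are colder, so the check reports an inversion.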

Proactive reclaim is another factor.  Its goal is to offload
colder pages from expensive RAM to cheaper storage.  When lower tier
memory is available as an intermediate layer, we want offloading to take
advantage of it instead of bypassing to storage.

Revert the patch so that cgroups respect the LRU order spanning the memory
hierarchy.

Of note is a specific undercommit scenario, where all cgroup limits in the
system add up to <= available toptier memory.  In that case, shuffling
pages out to lower tiers first to reclaim them from there is inefficient. 
This is something that could be optimized/short-circuited later on (although
care must be taken not to accidentally recreate the aging inversion). 
Let's ensure correctness first.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Reviewed-by: Muchun Song <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Shakeel Butt <[email protected]>
Acked-by: Tim Chen <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
hnaz authored and akpm00 committed May 25, 2022
1 parent 83d7d04 commit 3f1509c
Showing 1 changed file with 2 additions and 7 deletions.
9 changes: 2 additions & 7 deletions mm/vmscan.c
@@ -528,13 +528,8 @@ static bool can_demote(int nid, struct scan_control *sc)
 {
 	if (!numa_demotion_enabled)
 		return false;
-	if (sc) {
-		if (sc->no_demotion)
-			return false;
-		/* It is pointless to do demotion in memcg reclaim */
-		if (cgroup_reclaim(sc))
-			return false;
-	}
+	if (sc && sc->no_demotion)
+		return false;
 	if (next_demotion_node(nid) == NUMA_NO_NODE)
 		return false;
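
For readers who want to see the resulting logic in one piece, the following is a rough user-space model of `can_demote()` after the revert. Several parts are assumptions made so the snippet compiles on its own: `struct scan_control` is reduced to two fields, the `cgroup_reclaim` field is a hypothetical stand-in for what the kernel's `cgroup_reclaim(sc)` helper reports, `next_demotion_node()` is stubbed with an invented two-node topology, and the trailing `return true;` lies outside the hunk shown above and is assumed here.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Reduced, hypothetical stand-in for the kernel's struct scan_control. */
struct scan_control {
	bool no_demotion;    /* caller asked reclaim not to demote */
	bool cgroup_reclaim; /* stand-in for cgroup_reclaim(sc); no longer
			      * consulted by can_demote() after the revert */
};

/* Stand-in for the kernel's demotion switch. */
bool numa_demotion_enabled = true;

/* Assumed topology: toptier node 0 demotes to node 1, which has no
 * demotion target of its own. */
int next_demotion_node(int nid)
{
	return nid == 0 ? 1 : NUMA_NO_NODE;
}

/* Model of the post-revert check: memcg reclaim is treated like global
 * reclaim, so only the switch, the per-scan opt-out, and the presence
 * of a demotion target matter. */
bool can_demote(int nid, struct scan_control *sc)
{
	if (!numa_demotion_enabled)
		return false;
	if (sc && sc->no_demotion)
		return false;
	if (next_demotion_node(nid) == NUMA_NO_NODE)
		return false;

	return true; /* assumed: not part of the hunk shown */
}
```

In this model, a scan with `cgroup_reclaim` set and `no_demotion` clear is allowed to demote from node 0, which is exactly the case the reverted commit had ruled out.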
