hugetlb, memory_hotplug: prefer to use reserved pages for migration
new_node_page will try to use the origin's next NUMA node as the
migration destination for hugetlb pages.  If such a node doesn't have
any preallocated pool it falls back to __alloc_buddy_huge_page_no_mpol
to allocate a surplus page instead.  This is quite suboptimal for any
configuration where hugetlb pages are not distributed evenly across all
NUMA nodes.  Say we have a hotpluggable node 4 and the spare hugetlb
pages are on node 0:

  /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages:10000
  /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages:0
  /sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages:0
  /sys/devices/system/node/node3/hugepages/hugepages-2048kB/nr_hugepages:0
  /sys/devices/system/node/node4/hugepages/hugepages-2048kB/nr_hugepages:10000
  /sys/devices/system/node/node5/hugepages/hugepages-2048kB/nr_hugepages:0
  /sys/devices/system/node/node6/hugepages/hugepages-2048kB/nr_hugepages:0
  /sys/devices/system/node/node7/hugepages/hugepages-2048kB/nr_hugepages:0
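
A listing like the one above can be gathered from sysfs. The following is a minimal sketch, not part of this commit: the helper name read_node_hugepage_counter is hypothetical, and it assumes the 2048kB hugepage size used in the example paths.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the commit): read one per-node counter
 * for 2048kB huge pages from sysfs, e.g. "nr_hugepages" or
 * "free_hugepages".  Returns -1 if the file is missing or unparsable.
 */
static long read_node_hugepage_counter(int node, const char *counter)
{
	char path[128];
	long value = -1;
	FILE *f;

	assert(counter != NULL);
	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/hugepages/hugepages-2048kB/%s",
		 node, counter);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}
```

Calling it for each node with "free_hugepages" instead of "nr_hugepages" shows how many preallocated pages are still unused, which is the number that matters for the migration described below.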

Now we consume the whole pool on node 4 and try to offline this node.
All the allocated pages should be moved to node 0, which has enough
preallocated pages to hold them.  With the current implementation,
offlining very likely fails because allocating hugetlb pages at runtime
is much less reliable than taking them from the preallocated pool.

Fix this by reusing the nodemask which excludes the migration source:
first try to find a node which has a page in the preallocated pool, and
fall back to __alloc_buddy_huge_page_no_mpol only when the whole pool is
consumed.
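
The difference between the two policies can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the toy_* names and the bitmask representation are hypothetical, and the model ignores hugetlb_lock, reservations and the surplus fallback that the real code handles.

```c
#include <assert.h>

#define MAX_NODES 8

/* Toy model (not kernel state): free preallocated huge pages per node. */
struct toy_pool {
	unsigned long free_pages[MAX_NODES];
};

/* Take one preallocated page from exactly this node; returns 1 on success. */
static int toy_dequeue_exact(struct toy_pool *pool, int node)
{
	assert(node >= 0 && node < MAX_NODES);
	if (pool->free_pages[node] == 0)
		return 0;
	pool->free_pages[node]--;
	return 1;
}

/*
 * Old policy, modeled: look only at the next allowed node after 'nid'
 * (wrapping around).  Returns that node if it had a pool page, else -1.
 */
static int toy_alloc_next_node(struct toy_pool *pool, int nid, unsigned int mask)
{
	for (int i = 1; i <= MAX_NODES; i++) {
		int node = (nid + i) % MAX_NODES;

		if (mask & (1u << node))
			return toy_dequeue_exact(pool, node) ? node : -1;
	}
	return -1;
}

/*
 * New policy, modeled: walk every node allowed by 'mask' (bit n set means
 * node n is allowed) and return the first one with a pool page.  A return
 * of -1 is where the real code would try a runtime allocation instead.
 */
static int toy_alloc_from_mask(struct toy_pool *pool, unsigned int mask)
{
	for (int node = 0; node < MAX_NODES; node++) {
		if (!(mask & (1u << node)))
			continue;
		if (toy_dequeue_exact(pool, node))
			return node;
	}
	return -1;
}
```

With only node 0 holding free pages and node 4 excluded from the mask, toy_alloc_next_node(&pool, 4, mask) looks at node 5 alone and finds nothing, while toy_alloc_from_mask(&pool, mask) finds the page on node 0.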

[[email protected]: remove bogus arg from alloc_huge_page_nodemask() stub]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Xishi Qiu <[email protected]>
Cc: zhong jiang <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Michal Hocko authored and torvalds committed Jul 10, 2017
1 parent 7f252f2 commit 4db9b2e
Showing 3 changed files with 31 additions and 7 deletions.
2 changes: 2 additions & 0 deletions include/linux/hugetlb.h
@@ -349,6 +349,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 struct page *alloc_huge_page_node(struct hstate *h, int nid);
 struct page *alloc_huge_page_noerr(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
+struct page *alloc_huge_page_nodemask(struct hstate *h, const nodemask_t *nmask);
 int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
 			pgoff_t idx);

@@ -524,6 +525,7 @@ static inline void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr
 struct hstate {};
 #define alloc_huge_page(v, a, r) NULL
 #define alloc_huge_page_node(h, nid) NULL
+#define alloc_huge_page_nodemask(h, nmask) NULL
 #define alloc_huge_page_noerr(v, a, r) NULL
 #define alloc_bootmem_huge_page(h) NULL
 #define hstate_file(f) NULL
27 changes: 27 additions & 0 deletions mm/hugetlb.c
@@ -1723,6 +1723,33 @@ struct page *alloc_huge_page_node(struct hstate *h, int nid)
 	return page;
 }

+struct page *alloc_huge_page_nodemask(struct hstate *h, const nodemask_t *nmask)
+{
+	struct page *page = NULL;
+	int node;
+
+	spin_lock(&hugetlb_lock);
+	if (h->free_huge_pages - h->resv_huge_pages > 0) {
+		for_each_node_mask(node, *nmask) {
+			page = dequeue_huge_page_node_exact(h, node);
+			if (page)
+				break;
+		}
+	}
+	spin_unlock(&hugetlb_lock);
+	if (page)
+		return page;
+
+	/* No reservations, try to overcommit */
+	for_each_node_mask(node, *nmask) {
+		page = __alloc_buddy_huge_page_no_mpol(h, node);
+		if (page)
+			return page;
+	}
+
+	return NULL;
+}
+
 /*
  * Increase the hugetlb pool such that it can accommodate a reservation
  * of size 'delta'.
9 changes: 2 additions & 7 deletions mm/memory_hotplug.c
@@ -1446,14 +1446,9 @@ static struct page *new_node_page(struct page *page, unsigned long private,
 	if (nodes_empty(nmask))
 		node_set(nid, nmask);

-	/*
-	 * TODO: allocate a destination hugepage from a nearest neighbor node,
-	 * accordance with memory policy of the user process if possible. For
-	 * now as a simple work-around, we use the next node for destination.
-	 */
 	if (PageHuge(page))
-		return alloc_huge_page_node(page_hstate(compound_head(page)),
-					next_node_in(nid, nmask));
+		return alloc_huge_page_nodemask(
+				page_hstate(compound_head(page)), &nmask);

 	if (PageHighMem(page)
 	    || (zone_idx(page_zone(page)) == ZONE_MOVABLE))