thp: fix splitting of hwpoisoned hugepages
Currently the poisoned THP is split with split_huge_page() in
collect_procs_anon().  If kmalloc() fails in collect_procs(),
split_huge_page() is never called, and the work that follows it,
collecting the processes that use the poisoned page, is not done
either.  As a result, the processes using the poisoned page cannot be
killed.

The situation is worse when CONFIG_DEBUG_VM=y.  Because the poisoned
THP was not split, VM_BUG_ON(PageTransHuge(page)) in try_to_unmap()
panics the system.

This patch:
  1. Moves split_huge_page() ahead of collect_procs(), so that a
     failure to split the THP can only be caused by the split itself.
  2. Stops the subsequent operations when splitting the THP fails,
     which avoids an unexpected system panic and pointless work.

[[email protected]: coding-style fixes]
Signed-off-by: Jin Dongming <[email protected]>
Reviewed-by: Hidetoshi Seto <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Jin Dongming authored and torvalds committed Feb 3, 2011
1 parent b16957c commit efeda7a
Showing 1 changed file with 28 additions and 2 deletions.
30 changes: 28 additions & 2 deletions mm/memory-failure.c
@@ -386,8 +386,6 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 	struct task_struct *tsk;
 	struct anon_vma *av;
 
-	if (!PageHuge(page) && unlikely(split_huge_page(page)))
-		return;
 	read_lock(&tasklist_lock);
 	av = page_lock_anon_vma(page);
 	if (av == NULL)	/* Not actually mapped anymore */
@@ -896,6 +894,34 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
 		}
 	}
 
+	if (PageTransHuge(hpage)) {
+		/*
+		 * Verify that this isn't a hugetlbfs head page, the check for
+		 * PageAnon is just to avoid tripping a split_huge_page
+		 * internal debug check, as split_huge_page refuses to deal with
+		 * anything that isn't an anon page. PageAnon can't go away from
+		 * under us because we hold a refcount on the hpage; without a
+		 * refcount on the hpage, split_huge_page can't be safely called
+		 * in the first place, and having a refcount on the tail isn't
+		 * enough to be safe.
+		 */
+		if (!PageHuge(hpage) && PageAnon(hpage)) {
+			if (unlikely(split_huge_page(hpage))) {
+				/*
+				 * FIXME: if splitting the THP fails, it is
+				 * better to stop the following operation rather
+				 * than causing panic by unmapping. System might
+				 * survive if the page is freed later.
+				 */
+				printk(KERN_INFO
+					"MCE %#lx: failed to split THP\n", pfn);
+
+				BUG_ON(!PageHWPoison(p));
+				return SWAP_FAIL;
+			}
+		}
+	}
+
 	/*
 	 * First collect all the processes that have the page
 	 * mapped in dirty form. This has to be done before try_to_unmap,
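
The ordering change made by this commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct model_page, model_split(), old_flow(), new_flow() and the outcome enum are all hypothetical names invented for the illustration, and the alloc_fails flag stands in for kmalloc() failing in collect_procs().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the poisoned page; not a kernel structure. */
struct model_page {
	bool trans_huge;	/* page is still a transparent huge page */
	bool split_fails;	/* simulate split_huge_page() failing */
};

/* Hypothetical outcomes of the model, named for this sketch only. */
enum outcome {
	OUTCOME_UNMAPPED,	/* page was split, unmap proceeds normally */
	OUTCOME_SPLIT_FAILED,	/* bailed out early, like returning SWAP_FAIL */
	OUTCOME_UNMAP_HUGE_PAGE	/* unmap reached a THP: the VM_BUG_ON case */
};

/* Models split_huge_page(): 0 on success, nonzero on failure. */
static int model_split(struct model_page *pg)
{
	if (pg->split_fails)
		return 1;
	pg->trans_huge = false;
	return 0;
}

/*
 * Old ordering: the split sits inside the collection step, so it is
 * skipped when the allocation fails, and a failed split does not stop
 * the later unmap either.
 */
static enum outcome old_flow(struct model_page *pg, bool alloc_fails)
{
	if (!alloc_fails && pg->trans_huge)
		(void)model_split(pg);
	return pg->trans_huge ? OUTCOME_UNMAP_HUGE_PAGE : OUTCOME_UNMAPPED;
}

/*
 * New ordering: split first and stop on failure. A later allocation
 * failure can no longer leave the page unsplit.
 */
static enum outcome new_flow(struct model_page *pg, bool alloc_fails)
{
	if (pg->trans_huge && model_split(pg))
		return OUTCOME_SPLIT_FAILED;
	(void)alloc_fails;	/* collection may fail, but the page is split */
	return OUTCOME_UNMAPPED;
}
```

In this model, old_flow() with alloc_fails set reaches the unmap step with the page still huge, which corresponds to the VM_BUG_ON(PageTransHuge(page)) panic described in the commit message. new_flow() either splits the page before anything else or returns early when the split fails.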