Skip to content

Commit

Permalink
mm: make PR_SET_THP_DISABLE immediately active
Browse files Browse the repository at this point in the history
PR_SET_THP_DISABLE has a rather subtle semantic.  It doesn't affect any
existing mapping because it only updated mm->def_flags which is a
template for new mappings.

The mappings created after prctl(PR_SET_THP_DISABLE) have VM_NOHUGEPAGE
flag set.  This can be quite surprising for all those applications which
do not do prctl(); fork() & exec() and want to control their own THP
behavior.

Another usecase when the immediate semantic of the prctl might be useful
is a combination of pre- and post-copy migration of containers with
CRIU.  In this case CRIU populates a part of a memory region with data
that was saved during the pre-copy stage.  Afterwards, the region is
registered with userfaultfd and CRIU expects to get page faults for the
parts of the region that were not yet populated.  However, khugepaged
collapses the pages and the expected page faults do not occur.

In more general case, the prctl(PR_SET_THP_DISABLE) could be used as a
temporary mechanism for enabling/disabling THP process wide.

Implementation wise, a new MMF_DISABLE_THP flag is added.  This flag is
tested when decision whether to use huge pages is taken either during
page fault of at the time of THP collapse.

It should be noted, that the new implementation makes PR_SET_THP_DISABLE
master override to any per-VMA setting, which was not the case
previously.

Fixes: a0715cc ("mm, thp: add VM_INIT_DEF_MASK and PRCTL_THP_DISABLE")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
Michal Hocko authored and torvalds committed Jul 10, 2017
1 parent b6bb981 commit 1860033
Show file tree
Hide file tree
Showing 6 changed files with 17 additions and 9 deletions.
1 change: 1 addition & 0 deletions include/linux/huge_mm.h
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,7 @@ extern bool is_vma_temporary_stack(struct vm_area_struct *vma);
(1<<TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG) && \
((__vma)->vm_flags & VM_HUGEPAGE))) && \
!((__vma)->vm_flags & VM_NOHUGEPAGE) && \
!test_bit(MMF_DISABLE_THP, &(__vma)->vm_mm->flags) && \
!is_vma_temporary_stack(__vma))
#define transparent_hugepage_use_zero_page() \
(transparent_hugepage_flags & \
Expand Down
3 changes: 2 additions & 1 deletion include/linux/khugepaged.h
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,8 @@ static inline int khugepaged_enter(struct vm_area_struct *vma,
if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
if ((khugepaged_always() ||
(khugepaged_req_madv() && (vm_flags & VM_HUGEPAGE))) &&
!(vm_flags & VM_NOHUGEPAGE))
!(vm_flags & VM_NOHUGEPAGE) &&
!test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
if (__khugepaged_enter(vma->vm_mm))
return -ENOMEM;
return 0;
Expand Down
5 changes: 4 additions & 1 deletion include/linux/sched/coredump.h
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,10 @@ static inline int get_dumpable(struct mm_struct *mm)
#define MMF_OOM_SKIP 21 /* mm is of no interest for the OOM killer */
#define MMF_UNSTABLE 22 /* mm is unstable for copy_from_user */
#define MMF_HUGE_ZERO_PAGE 23 /* mm has ever used the global huge zero page */
#define MMF_DISABLE_THP 24 /* disable THP for all VMAs */
#define MMF_DISABLE_THP_MASK (1 << MMF_DISABLE_THP)

#define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK)
#define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
MMF_DISABLE_THP_MASK)

#endif /* _LINUX_SCHED_COREDUMP_H */
6 changes: 3 additions & 3 deletions kernel/sys.c
Original file line number Diff line number Diff line change
Expand Up @@ -2360,17 +2360,17 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
case PR_GET_THP_DISABLE:
if (arg2 || arg3 || arg4 || arg5)
return -EINVAL;
error = !!(me->mm->def_flags & VM_NOHUGEPAGE);
error = !!test_bit(MMF_DISABLE_THP, &me->mm->flags);
break;
case PR_SET_THP_DISABLE:
if (arg3 || arg4 || arg5)
return -EINVAL;
if (down_write_killable(&me->mm->mmap_sem))
return -EINTR;
if (arg2)
me->mm->def_flags |= VM_NOHUGEPAGE;
set_bit(MMF_DISABLE_THP, &me->mm->flags);
else
me->mm->def_flags &= ~VM_NOHUGEPAGE;
clear_bit(MMF_DISABLE_THP, &me->mm->flags);
up_write(&me->mm->mmap_sem);
break;
case PR_MPX_ENABLE_MANAGEMENT:
Expand Down
3 changes: 2 additions & 1 deletion mm/khugepaged.c
Original file line number Diff line number Diff line change
Expand Up @@ -816,7 +816,8 @@ khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
static bool hugepage_vma_check(struct vm_area_struct *vma)
{
if ((!(vma->vm_flags & VM_HUGEPAGE) && !khugepaged_always()) ||
(vma->vm_flags & VM_NOHUGEPAGE))
(vma->vm_flags & VM_NOHUGEPAGE) ||
test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
return false;
if (shmem_file(vma->vm_file)) {
if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE))
Expand Down
8 changes: 5 additions & 3 deletions mm/shmem.c
Original file line number Diff line number Diff line change
Expand Up @@ -1977,10 +1977,12 @@ static int shmem_fault(struct vm_fault *vmf)
}

sgp = SGP_CACHE;
if (vma->vm_flags & VM_HUGEPAGE)
sgp = SGP_HUGE;
else if (vma->vm_flags & VM_NOHUGEPAGE)

if ((vma->vm_flags & VM_NOHUGEPAGE) ||
test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
sgp = SGP_NOHUGE;
else if (vma->vm_flags & VM_HUGEPAGE)
sgp = SGP_HUGE;

error = shmem_getpage_gfp(inode, vmf->pgoff, &vmf->page, sgp,
gfp, vma, vmf, &ret);
Expand Down

0 comments on commit 1860033

Please sign in to comment.