mm: brk: downgrade mmap_sem to read when shrinking
Besides munmap(), brk() can also be used to shrink a memory mapping.  When
shrinking a large mapping, it may therefore hold mmap_sem for write for a
long time, as described in commit ("mm: mmap: zap pages with read mmap_sem
in munmap").

In the shrink case, brk() no longer manipulates vmas after the
__do_munmap() call.  However, it may set mm->brk after __do_munmap(),
which requires holding mmap_sem for write.

A simple trick works around this: set mm->brk before calling
__do_munmap(), then restore the original value if __do_munmap() fails.
With this trick, it is safe to downgrade mmap_sem to read.

So the same optimization, which downgrades mmap_sem to read for zapping
pages, is also feasible and reasonable for this case.

With this optimization, the time that mmap_sem is held exclusively while
shrinking a large mapping is reduced significantly.
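
The ordering described above (update mm->brk while the lock is still held
for write, call the unmap helper that may downgrade, restore on failure,
then release in whichever mode is held) can be illustrated with a small
userspace sketch.  This is not kernel code: struct toy_mm, toy_unmap() and
toy_shrink_brk() are hypothetical names, and the "lock" is only a state
variable that records the held mode, assumed here so the control flow can
be followed without a real rw_semaphore.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for mmap_sem: records the held mode only, no real locking. */
enum lock_mode { UNLOCKED, READ_LOCKED, WRITE_LOCKED };

struct toy_mm {
	enum lock_mode sem;	/* hypothetical model of mm->mmap_sem */
	unsigned long brk;	/* models mm->brk */
};

/*
 * Hypothetical stand-in for __do_munmap(..., downgrade = true).
 * Returns <0 on failure (lock stays write-held), 1 on success with the
 * lock downgraded to read, 0 on success without a downgrade.
 */
static int toy_unmap(struct toy_mm *mm, bool fail, bool can_downgrade)
{
	assert(mm->sem == WRITE_LOCKED);
	if (fail)
		return -1;
	if (can_downgrade) {
		mm->sem = READ_LOCKED;	/* models downgrade_write() */
		return 1;
	}
	return 0;
}

/* Shrink path: publish the new brk first, roll back if the unmap fails. */
static unsigned long toy_shrink_brk(struct toy_mm *mm, unsigned long new_brk,
				    bool fail, bool can_downgrade)
{
	unsigned long origbrk;
	bool downgraded = false;
	int ret;

	mm->sem = WRITE_LOCKED;		/* models down_write_killable() */
	origbrk = mm->brk;

	mm->brk = new_brk;		/* written while still write-locked */
	ret = toy_unmap(mm, fail, can_downgrade);
	if (ret < 0) {
		mm->brk = origbrk;	/* restore; lock is still write-held */
		mm->sem = UNLOCKED;	/* models up_write() */
		return origbrk;
	}
	if (ret == 1)
		downgraded = true;

	/* Release in whichever mode the lock is currently held. */
	assert(mm->sem == (downgraded ? READ_LOCKED : WRITE_LOCKED));
	mm->sem = UNLOCKED;		/* models up_read() or up_write() */
	return new_brk;
}
```

Calling toy_shrink_brk() with fail set leaves brk at its original value,
which mirrors the restore-from-origbrk step; the other two cases differ
only in the mode in which the lock is released.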

[[email protected]: tweak comment]
[[email protected]: fix unsigned compare against 0 issue]
  Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yang Shi <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Colin Ian King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Yang Shi authored and torvalds committed Oct 26, 2018
1 parent 85a0683 commit 9bc8039
Showing 1 changed file with 35 additions and 11 deletions.
46 changes: 35 additions & 11 deletions mm/mmap.c
@@ -191,16 +191,19 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long
 SYSCALL_DEFINE1(brk, unsigned long, brk)
 {
 	unsigned long retval;
-	unsigned long newbrk, oldbrk;
+	unsigned long newbrk, oldbrk, origbrk;
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *next;
 	unsigned long min_brk;
 	bool populate;
+	bool downgraded = false;
 	LIST_HEAD(uf);
 
 	if (down_write_killable(&mm->mmap_sem))
 		return -EINTR;
 
+	origbrk = mm->brk;
+
 #ifdef CONFIG_COMPAT_BRK
 	/*
 	 * CONFIG_COMPAT_BRK can still be overridden by setting
@@ -229,14 +232,32 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 
 	newbrk = PAGE_ALIGN(brk);
 	oldbrk = PAGE_ALIGN(mm->brk);
-	if (oldbrk == newbrk)
-		goto set_brk;
+	if (oldbrk == newbrk) {
+		mm->brk = brk;
+		goto success;
+	}
 
-	/* Always allow shrinking brk. */
+	/*
+	 * Always allow shrinking brk.
+	 * __do_munmap() may downgrade mmap_sem to read.
+	 */
 	if (brk <= mm->brk) {
-		if (!do_munmap(mm, newbrk, oldbrk-newbrk, &uf))
-			goto set_brk;
-		goto out;
+		int ret;
+
+		/*
+		 * mm->brk must to be protected by write mmap_sem so update it
+		 * before downgrading mmap_sem. When __do_munmap() fails,
+		 * mm->brk will be restored from origbrk.
+		 */
+		mm->brk = brk;
+		ret = __do_munmap(mm, newbrk, oldbrk-newbrk, &uf, true);
+		if (ret < 0) {
+			mm->brk = origbrk;
+			goto out;
+		} else if (ret == 1) {
+			downgraded = true;
+		}
+		goto success;
 	}
 
 	/* Check against existing mmap mappings. */
@@ -247,18 +268,21 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 	/* Ok, looks good - let it rip. */
 	if (do_brk_flags(oldbrk, newbrk-oldbrk, 0, &uf) < 0)
 		goto out;
-
-set_brk:
 	mm->brk = brk;
+
+success:
 	populate = newbrk > oldbrk && (mm->def_flags & VM_LOCKED) != 0;
-	up_write(&mm->mmap_sem);
+	if (downgraded)
+		up_read(&mm->mmap_sem);
+	else
+		up_write(&mm->mmap_sem);
 	userfaultfd_unmap_complete(mm, &uf);
 	if (populate)
 		mm_populate(oldbrk, newbrk - oldbrk);
 	return brk;
 
 out:
-	retval = mm->brk;
+	retval = origbrk;
 	up_write(&mm->mmap_sem);
 	return retval;
 }
