mm: mremap: downgrade mmap_sem to read when shrinking
Besides munmap(), mremap() can also be used to shrink a memory mapping.
It may therefore hold mmap_sem for write for a long time when shrinking a
large mapping, as described in commit ("mm: mmap: zap pages with read
mmap_sem in munmap").

In the mapping shrink use case, mremap() does not manipulate vmas any
further after the __do_munmap() call, so it is safe to downgrade mmap_sem
to read.

So the same optimization, downgrading mmap_sem to read while zapping
pages, is feasible and reasonable in this case as well.

With this optimization, the time mmap_sem is held exclusively while
shrinking a large mapping is reduced significantly.

MREMAP_FIXED and MREMAP_MAYMOVE are harder to adapt to this optimization
because they need to manipulate vmas after do_munmap(), so downgrading
mmap_sem may open a race window.

Simple mapping shrink is the low-hanging fruit and, together with munmap,
it may cover most unmap cases.
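
For readers who want to see the "simple mapping shrink" case from user space, the following is a minimal sketch, not part of this commit. The helper name shrink_in_place() is hypothetical. It calls mremap() with flags set to 0 (neither MREMAP_MAYMOVE nor MREMAP_FIXED), which is the path this change targets. Whether the running kernel actually downgrades mmap_sem on that path depends on the kernel version and cannot be observed from this program.

```c
#define _GNU_SOURCE		/* needed for mremap() in <sys/mman.h> */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (illustration only): map old_len bytes of anonymous
 * memory, touch them, then shrink the mapping in place to new_len bytes.
 * Returns 0 if the mapping stayed at its original address and the retained
 * data is intact, -1 otherwise.
 */
int shrink_in_place(size_t old_len, size_t new_len)
{
	assert(new_len > 0 && new_len <= old_len);

	void *p = mmap(NULL, old_len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Fault the pages in so the shrink has something to zap. */
	memset(p, 0xaa, old_len);

	/* flags == 0: plain shrink, no MREMAP_MAYMOVE or MREMAP_FIXED. */
	void *q = mremap(p, old_len, new_len, 0);
	if (q == MAP_FAILED) {
		munmap(p, old_len);
		return -1;
	}

	/* A shrinking remap keeps the start address and the leading bytes. */
	int ok = (q == p) && ((unsigned char *)q)[new_len - 1] == 0xaa;

	munmap(q, new_len);
	return ok ? 0 : -1;
}
```

A call such as shrink_in_place((size_t)16 << 20, (size_t)4 << 20) shrinks a 16 MiB mapping to 4 MiB; the tail that is unmapped is the part whose pages are zapped during the call.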

[[email protected]: tweak comment]
[[email protected]: fix unsigned compare against 0 issue]
  Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yang Shi <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Colin Ian King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Yang Shi authored and torvalds committed Oct 26, 2018
1 parent 3c05132 commit 85a0683
Showing 3 changed files with 20 additions and 6 deletions.
2 changes: 2 additions & 0 deletions include/linux/mm.h
@@ -2306,6 +2306,8 @@ extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
 	struct list_head *uf);
+extern int __do_munmap(struct mm_struct *, unsigned long, size_t,
+		       struct list_head *uf, bool downgrade);
 extern int do_munmap(struct mm_struct *, unsigned long, size_t,
 		     struct list_head *uf);

4 changes: 2 additions & 2 deletions mm/mmap.c
@@ -2687,8 +2687,8 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
  * work. This now handles partial unmappings.
  * Jeremy Fitzhardinge <[email protected]>
  */
-static int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
-		       struct list_head *uf, bool downgrade)
+int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len,
+		struct list_head *uf, bool downgrade)
 {
 	unsigned long end;
 	struct vm_area_struct *vma, *prev, *last;
20 changes: 16 additions & 4 deletions mm/mremap.c
@@ -521,6 +521,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 	unsigned long ret = -EINVAL;
 	unsigned long charged = 0;
 	bool locked = false;
+	bool downgraded = false;
 	struct vm_userfaultfd_ctx uf = NULL_VM_UFFD_CTX;
 	LIST_HEAD(uf_unmap_early);
 	LIST_HEAD(uf_unmap);
@@ -557,12 +558,20 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 	/*
 	 * Always allow a shrinking remap: that just unmaps
 	 * the unnecessary pages..
-	 * do_munmap does all the needed commit accounting
+	 * __do_munmap does all the needed commit accounting, and
+	 * downgrades mmap_sem to read if so directed.
 	 */
 	if (old_len >= new_len) {
-		ret = do_munmap(mm, addr+new_len, old_len - new_len, &uf_unmap);
-		if (ret && old_len != new_len)
+		int retval;
+
+		retval = __do_munmap(mm, addr+new_len, old_len - new_len,
+				  &uf_unmap, true);
+		if (retval < 0 && old_len != new_len) {
+			ret = retval;
 			goto out;
+		/* Returning 1 indicates mmap_sem is downgraded to read. */
+		} else if (retval == 1)
+			downgraded = true;
 		ret = addr;
 		goto out;
 	}
@@ -627,7 +636,10 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 		vm_unacct_memory(charged);
 		locked = 0;
 	}
-	up_write(&current->mm->mmap_sem);
+	if (downgraded)
+		up_read(&current->mm->mmap_sem);
+	else
+		up_write(&current->mm->mmap_sem);
 	if (locked && new_len > old_len)
 		mm_populate(new_addr + old_len, new_len - old_len);
 	userfaultfd_unmap_complete(mm, &uf_unmap_early);
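
The return-value convention used in the mremap.c hunks (negative for an error, 0 when the lock is still held for write, 1 when it has been downgraded to read) can be mimicked in user space. The following is a toy sketch under stated assumptions, not kernel code: struct toy_sem only records which mode is held and does no real locking, and toy_unmap() and toy_shrink() are hypothetical stand-ins for __do_munmap() and the shrink branch of mremap(). Treating a zero length as -EINVAL is an assumption of the sketch; it is what makes the `old_len != new_len` check meaningful here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for mmap_sem: tracks the held mode, performs no locking. */
enum toy_mode { TOY_UNLOCKED, TOY_READ, TOY_WRITE };

struct toy_sem {
	enum toy_mode mode;		/* mode currently held */
	enum toy_mode last_release;	/* mode the lock was last released from */
};

static void toy_down_write(struct toy_sem *s)
{
	assert(s->mode == TOY_UNLOCKED);
	s->mode = TOY_WRITE;
}

static void toy_downgrade_write(struct toy_sem *s)
{
	assert(s->mode == TOY_WRITE);
	s->mode = TOY_READ;
}

static void toy_up_read(struct toy_sem *s)
{
	assert(s->mode == TOY_READ);
	s->last_release = TOY_READ;
	s->mode = TOY_UNLOCKED;
}

static void toy_up_write(struct toy_sem *s)
{
	assert(s->mode == TOY_WRITE);
	s->last_release = TOY_WRITE;
	s->mode = TOY_UNLOCKED;
}

/* Hypothetical analogue of __do_munmap(): <0 error, 0 still write, 1 downgraded. */
static int toy_unmap(struct toy_sem *s, long len, bool downgrade)
{
	if (len <= 0)
		return -EINVAL;		/* assumption: nothing to unmap is an error */
	if (!downgrade)
		return 0;
	toy_downgrade_write(s);		/* the expensive zapping would run under read */
	return 1;
}

/* Hypothetical analogue of the shrink branch; expects old_len >= new_len. */
int toy_shrink(struct toy_sem *s, long old_len, long new_len)
{
	bool downgraded = false;
	int ret = 0;

	toy_down_write(s);
	int retval = toy_unmap(s, old_len - new_len, true);
	if (retval < 0 && old_len != new_len)
		ret = retval;
	else if (retval == 1)
		downgraded = true;

	/* Release in whichever mode the callee left the lock. */
	if (downgraded)
		toy_up_read(s);
	else
		toy_up_write(s);
	return ret;
}
```

The point of the sketch is that the caller cannot know statically which mode it holds after the call, so it has to carry the callee's answer in a flag until the unlock site.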
