Skip to content

Commit

Permalink
mm,mremap: bail out earlier in mremap_to under map pressure
Browse files Browse the repository at this point in the history
When using mremap() syscall in addition to MREMAP_FIXED flag, mremap()
calls mremap_to() which does the following:

1) unmaps the destination region where we are going to move the map
2) If the new region is going to be smaller, we unmap the last part
   of the old region

Then, we will eventually call move_vma() to do the actual move.

move_vma() checks whether we are at least 4 maps below max_map_count
before going further, otherwise it bails out with -ENOMEM.  The problem
is that we might have already unmapped the vma's in steps 1) and 2), so
it is not possible for userspace to figure out the state of the vmas
after it gets -ENOMEM, and it gets tricky for userspace to clean up
properly on error path.

While it is true that we can return -ENOMEM for more reasons (e.g: see
may_expand_vm() or move_page_tables()), I think that we can avoid this
scenario if we check early in mremap_to() if the operation has high
chances to succeed map-wise.

Should that not be the case, we can bail out before we even try to unmap
anything, so we make sure the vma's are left untouched in case we are
likely to be short of maps.

The thumb-rule now is to rely on the worst-scenario case we can have.
That is when both vma's (old region and new region) are going to be
split in 3, so we get two more maps to the ones we already hold (one per
each).  If current map count + 2 maps still leads us to 4 maps below the
threshold, we are going to pass the check in move_vma().

Of course, this is not free, as it might generate false positives when
it is true that we are tight map-wise, but the unmap operation can
release several vma's leading us to a good state.

Another approach was also investigated [1], but it may be too much
hassle for what it brings.

[1] https://lore.kernel.org/lkml/[email protected]/

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oscar Salvador <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Cyril Hrubis <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
osalvadorvilardaga authored and torvalds committed Mar 6, 2019
1 parent d778015 commit ea2c3f6
Showing 1 changed file with 17 additions and 0 deletions.
17 changes: 17 additions & 0 deletions mm/mremap.c
Original file line number Diff line number Diff line change
Expand Up @@ -516,6 +516,23 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
if (addr + old_len > new_addr && new_addr + new_len > addr)
goto out;

/*
* move_vma() need us to stay 4 maps below the threshold, otherwise
* it will bail out at the very beginning.
* That is a problem if we have already unmaped the regions here
* (new_addr, and old_addr), because userspace will not know the
* state of the vma's after it gets -ENOMEM.
* So, to avoid such scenario we can pre-compute if the whole
* operation has high chances to success map-wise.
* Worst-scenario case is when both vma's (new_addr and old_addr) get
* split in 3 before unmaping it.
* That means 2 more maps (1 for each) to the ones we already hold.
* Check whether current map count plus 2 still leads us to 4 maps below
* the threshold, otherwise return -ENOMEM here to be more safe.
*/
if ((mm->map_count + 2) >= sysctl_max_map_count - 3)
return -ENOMEM;

ret = do_munmap(mm, new_addr, new_len, uf_unmap_early);
if (ret)
goto out;
Expand Down

0 comments on commit ea2c3f6

Please sign in to comment.