Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
mempolicy: fix show_numa_map() vs exec() + do_set_mempolicy() race
9e78144 "hold task->mempolicy while numa_maps scans." fixed the race with the exiting task but this is not enough. The current code assumes that get_vma_policy(task) should either see task->mempolicy == NULL or it should be equal to ->task_mempolicy saved by hold_task_mempolicy(), so we can never race with __mpol_put(). But this can only work if we can't race with do_set_mempolicy(), and thus we can't race with another do_set_mempolicy() or do_exit() after that. However, do_set_mempolicy()->down_write(mmap_sem) can not prevent this race. This task can exec, change it's ->mm, and call do_set_mempolicy() after that; in this case they take 2 different locks. Change hold_task_mempolicy() to use get_task_policy(), it never returns NULL, and change show_numa_map() to use __get_vma_policy() or fall back to proc_priv->task_mempolicy. Note: this is the minimal fix, we will cleanup this code later. I think hold_task_mempolicy() and release_task_mempolicy() should die, we can move this logic into show_numa_map(). Or we can move get_task_policy() outside of ->mmap_sem and !CONFIG_NUMA code at least. Signed-off-by: Oleg Nesterov <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: David Rientjes <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
- Loading branch information