Skip to content

Commit

Permalink
aio: fix io_destroy() regression by using call_rcu()
Browse files Browse the repository at this point in the history
There was a regression introduced by 36f5588 ("aio: refcounting
cleanup"), reported by Jens Axboe - the refcounting cleanup switched to
using RCU in the shutdown path, but the synchronize_rcu() was done in
the context of the io_destroy() syscall greatly increasing the time it
could block.

This patch switches it to call_rcu() and makes shutdown asynchronous
(more asynchronous than it was originally; before the refcount changes
io_destroy() would still wait on pending kiocbs).

Note that there's a global quota on the max outstanding kiocbs, and that
quota must be manipulated synchronously; otherwise io_setup() could
return -EAGAIN when there isn't quota available, and userspace won't
have any way of waiting until shutdown of the old kioctxs has finished
(besides busy looping).

So we release our quota before kioctx shutdown has finished, which
should be fine since the quota never corresponded to anything real
anyways.

Signed-off-by: Kent Overstreet <[email protected]>
Cc: Zach Brown <[email protected]>
Cc: Felipe Balbi <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Rusty Russell <[email protected]>
Reported-by: Jens Axboe <[email protected]>
Tested-by: Jens Axboe <[email protected]>
Cc: Asai Thambi S P <[email protected]>
Cc: Selvan Mani <[email protected]>
Cc: Sam Bradshaw <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Benjamin LaHaise <[email protected]>
Tested-by: Benjamin LaHaise <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
Kent Overstreet authored and torvalds committed Jun 12, 2013
1 parent bba00e5 commit 4fcc712
Showing 1 changed file with 16 additions and 20 deletions.
36 changes: 16 additions & 20 deletions fs/aio.c
Original file line number Diff line number Diff line change
Expand Up @@ -141,9 +141,6 @@ static void aio_free_ring(struct kioctx *ctx)
for (i = 0; i < ctx->nr_pages; i++)
put_page(ctx->ring_pages[i]);

if (ctx->mmap_size)
vm_munmap(ctx->mmap_base, ctx->mmap_size);

if (ctx->ring_pages && ctx->ring_pages != ctx->internal_pages)
kfree(ctx->ring_pages);
}
Expand Down Expand Up @@ -322,11 +319,6 @@ static void free_ioctx(struct kioctx *ctx)

aio_free_ring(ctx);

spin_lock(&aio_nr_lock);
BUG_ON(aio_nr - ctx->max_reqs > aio_nr);
aio_nr -= ctx->max_reqs;
spin_unlock(&aio_nr_lock);

pr_debug("freeing %p\n", ctx);

/*
Expand Down Expand Up @@ -435,17 +427,24 @@ static void kill_ioctx(struct kioctx *ctx)
{
if (!atomic_xchg(&ctx->dead, 1)) {
hlist_del_rcu(&ctx->list);
/* Between hlist_del_rcu() and dropping the initial ref */
synchronize_rcu();

/*
* We can't punt to workqueue here because put_ioctx() ->
* free_ioctx() will unmap the ringbuffer, and that has to be
* done in the original process's context. kill_ioctx_rcu/work()
* exist for exit_aio(), as in that path free_ioctx() won't do
* the unmap.
* It'd be more correct to do this in free_ioctx(), after all
* the outstanding kiocbs have finished - but by then io_destroy
* has already returned, so io_setup() could potentially return
* -EAGAIN with no ioctxs actually in use (as far as userspace
* could tell).
*/
kill_ioctx_work(&ctx->rcu_work);
spin_lock(&aio_nr_lock);
BUG_ON(aio_nr - ctx->max_reqs > aio_nr);
aio_nr -= ctx->max_reqs;
spin_unlock(&aio_nr_lock);

if (ctx->mmap_size)
vm_munmap(ctx->mmap_base, ctx->mmap_size);

/* Between hlist_del_rcu() and dropping the initial ref */
call_rcu(&ctx->rcu_head, kill_ioctx_rcu);
}
}

Expand Down Expand Up @@ -495,10 +494,7 @@ void exit_aio(struct mm_struct *mm)
*/
ctx->mmap_size = 0;

if (!atomic_xchg(&ctx->dead, 1)) {
hlist_del_rcu(&ctx->list);
call_rcu(&ctx->rcu_head, kill_ioctx_rcu);
}
kill_ioctx(ctx);
}
}

Expand Down

0 comments on commit 4fcc712

Please sign in to comment.