Skip to content

Commit

Permalink
rbd: avoid a deadlock on header_rwsem when flushing notifies
Browse files Browse the repository at this point in the history
rbd_unregister_watch() flushes notifies and therefore cannot be called
under header_rwsem because a header update notify takes header_rwsem to
synchronize with "rbd map".  If mapping an image fails after the watch
is established and a header update notify sneaks in, we deadlock when
erroring out from rbd_dev_image_probe().

Move watch registration and unregistration out of the critical section.
The only reason they were put there was to make header_rwsem management
slightly more obvious.

Fixes: 811c668 ("rbd: fix rbd map vs notify races")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jason Dillaman <[email protected]>
  • Loading branch information
idryomov committed Apr 13, 2020
1 parent 8f3d9f3 commit 0e4e1de
Showing 1 changed file with 13 additions and 4 deletions.
17 changes: 13 additions & 4 deletions drivers/block/rbd.c
Original file line number Diff line number Diff line change
Expand Up @@ -4527,6 +4527,10 @@ static void cancel_tasks_sync(struct rbd_device *rbd_dev)
cancel_work_sync(&rbd_dev->unlock_work);
}

/*
* header_rwsem must not be held to avoid a deadlock with
* rbd_dev_refresh() when flushing notifies.
*/
static void rbd_unregister_watch(struct rbd_device *rbd_dev)
{
cancel_tasks_sync(rbd_dev);
Expand Down Expand Up @@ -6907,6 +6911,9 @@ static void rbd_dev_image_release(struct rbd_device *rbd_dev)
* device. If this image is the one being mapped (i.e., not a
* parent), initiate a watch on its header object before using that
* object to get detailed information about the rbd image.
*
* On success, returns with header_rwsem held for write if called
* with @depth == 0.
*/
static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth)
{
Expand Down Expand Up @@ -6936,6 +6943,9 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth)
}
}

if (!depth)
down_write(&rbd_dev->header_rwsem);

ret = rbd_dev_header_info(rbd_dev);
if (ret) {
if (ret == -ENOENT && !need_watch)
Expand Down Expand Up @@ -6987,6 +6997,8 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth)
err_out_probe:
rbd_dev_unprobe(rbd_dev);
err_out_watch:
if (!depth)
up_write(&rbd_dev->header_rwsem);
if (need_watch)
rbd_unregister_watch(rbd_dev);
err_out_format:
Expand Down Expand Up @@ -7050,12 +7062,9 @@ static ssize_t do_rbd_add(struct bus_type *bus,
goto err_out_rbd_dev;
}

down_write(&rbd_dev->header_rwsem);
rc = rbd_dev_image_probe(rbd_dev, 0);
if (rc < 0) {
up_write(&rbd_dev->header_rwsem);
if (rc < 0)
goto err_out_rbd_dev;
}

if (rbd_dev->opts->alloc_size > rbd_dev->layout.object_size) {
rbd_warn(rbd_dev, "alloc_size adjusted to %u",
Expand Down

0 comments on commit 0e4e1de

Please sign in to comment.