blk-mq: release scheduler resource when request completes
Chuck reported [1] an IO hang on NFS exports that reside on SATA devices and bisected it to commit 615939a ("blk-mq: defer to the normal submission path for post-flush requests").

We analysed the hang and found two postflush requests waiting for each other.

The first postflush request completed the REQ_FSEQ_DATA sequence, so it moved to the REQ_FSEQ_POSTFLUSH sequence and was added to the flush pending list, but blk_kick_flush() failed because the second postflush request was still in flight, waiting in the scheduler queue.

The second postflush request, waiting in the scheduler queue, can't be dispatched because the first postflush request hasn't released its scheduler resource even though it has completed.

Fix it by releasing the scheduler resource when the first postflush request completes, so the second postflush request can be dispatched and completed, which then lets blk_kick_flush() succeed.

While at it, remove the check for e->ops.finish_request, as all schedulers set that. Reaffirm this requirement by adding a WARN_ON_ONCE() at scheduler registration time, just like we do for insert_requests and dispatch_request.

[1] https://lore.kernel.org/all/[email protected]/

Link: https://lore.kernel.org/linux-block/[email protected]/
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-lkp/[email protected]
Fixes: 615939a ("blk-mq: defer to the normal submission path for post-flush requests")
Reported-by: Chuck Lever <[email protected]>
Signed-off-by: Chengming Zhou <[email protected]>
Tested-by: Chuck Lever <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[axboe: folded in incremental fix and added tags]
Signed-off-by: Jens Axboe <[email protected]>
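
The circular wait described above can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: every name in it (toy_sched, toy_rq, toy_simulate() and so on) is hypothetical, and it assumes the "scheduler resource" behaves like a single token that a request takes at dispatch. The commit message does not say what the resource actually is.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names are hypothetical, none of this is kernel API. */
struct toy_sched {
	int free_tokens;	/* assumed stand-in for the scheduler resource */
};

struct toy_rq {
	bool queued;		/* still waiting in the scheduler queue */
	bool holds_token;	/* scheduler resource not yet released */
	bool flush_pending;	/* sits on the flush pending list */
};

/* Dispatch needs a free token; without one the request stays queued. */
static bool toy_try_dispatch(struct toy_sched *s, struct toy_rq *rq)
{
	if (s->free_tokens == 0)
		return false;
	s->free_tokens--;
	rq->holds_token = true;
	rq->queued = false;
	return true;
}

/* Plays the role the commit message gives to ->finish_request(). */
static void toy_finish(struct toy_sched *s, struct toy_rq *rq)
{
	assert(rq->holds_token);
	rq->holds_token = false;
	s->free_tokens++;
}

/* The data step is done: the request moves to the flush pending list. */
static void toy_complete_data(struct toy_sched *s, struct toy_rq *rq,
			      bool release_on_completion)
{
	rq->flush_pending = true;
	if (release_on_completion)
		toy_finish(s, rq);
}

/* The flush cannot be kicked while any request is still queued. */
static bool toy_kick_flush(const struct toy_rq *rqs, int n)
{
	for (int i = 0; i < n; i++)
		if (rqs[i].queued)
			return false;
	return true;
}

/* Returns true if the flush can eventually be issued, false on a hang. */
bool toy_simulate(bool release_on_completion)
{
	struct toy_sched s = { .free_tokens = 1 };
	struct toy_rq rq[2] = { { .queued = true }, { .queued = true } };

	if (!toy_try_dispatch(&s, &rq[0]))
		return false;
	toy_complete_data(&s, &rq[0], release_on_completion);

	if (!toy_kick_flush(rq, 2)) {
		/* The second request must be dispatched before the flush. */
		if (!toy_try_dispatch(&s, &rq[1]))
			return false;	/* no token left: circular wait */
		toy_complete_data(&s, &rq[1], release_on_completion);
	}
	return toy_kick_flush(rq, 2);
}
```

In this model, toy_simulate(false) corresponds to the behaviour before the fix: the first request keeps its token after completing, so the second can never be dispatched. toy_simulate(true) corresponds to releasing the resource at completion, which lets the second request run and the flush proceed.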