block: Do not pull requests from the scheduler when we cannot dispatch them

If the device driver does not implement dispatch budget accounting
(which only SCSI does), the loop in __blk_mq_do_dispatch_sched() pulls
requests from the IO scheduler as long as it is willing to give out any.
That defeats scheduling heuristics inside the scheduler by creating
a false impression that the device can take more IO when it in fact
cannot.

For example, with the BFQ IO scheduler on top of a virtio-blk device,
setting blkio cgroup weight has barely any impact on the observed
throughput of async IO because __blk_mq_do_dispatch_sched() always
drains all the IO queued in BFQ. BFQ first submits IO from higher weight
cgroups, but when that is all dispatched, it gives out IO of lower
weight cgroups as well. We then have to wait for all this IO to be
dispatched to the disk (which means a lot of it actually has to
complete) before the IO scheduler is queried again for more requests.
This completely destroys any service differentiation.

So grab the request tag for a request pulled out of the IO scheduler
already in __blk_mq_do_dispatch_sched() and do not pull any more
requests if we cannot get it, because we are unlikely to be able to
dispatch them. That way only a single request is going to wait in the
dispatch list for some tag to free up.
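
The effect of the change can be sketched with a toy userspace model. This
is only an illustration under simplifying assumptions, not kernel code:
toy_result and toy_dispatch() are hypothetical names, the scheduler is
reduced to a count of queued requests, and the device to a count of free
driver tags. In the old loop tags are only taken later, at dispatch time,
but the number of requests left waiting without a tag comes out the same.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: how many requests left the scheduler and
 * how many of them sit in the dispatch list without a driver tag. */
struct toy_result {
	int pulled;
	int waiting;
};

/*
 * Toy analogue of the dequeue loop. With gate_on_tag == false it keeps
 * pulling until the scheduler is empty or max_dispatch is reached (old
 * behaviour). With gate_on_tag == true it stops at the first request
 * that cannot get a tag (behaviour after this patch).
 */
struct toy_result toy_dispatch(int queued, int free_tags,
			       int max_dispatch, bool gate_on_tag)
{
	struct toy_result r = { 0, 0 };

	assert(queued >= 0 && free_tags >= 0);

	while (queued > 0 && r.pulled < max_dispatch) {
		queued--;
		r.pulled++;

		if (free_tags > 0) {
			free_tags--;	/* request got a driver tag */
			continue;
		}

		r.waiting++;		/* no tag: request has to wait */
		if (gate_on_tag)
			break;		/* stop draining the scheduler */
	}

	return r;
}
```

With 16 queued requests, 4 free tags and a max_dispatch of 64, the ungated
call toy_dispatch(16, 4, 64, false) pulls all 16 requests and leaves 12
waiting, while the gated call toy_dispatch(16, 4, 64, true) pulls 5 and
leaves a single request waiting. The remaining 11 stay in the scheduler,
where its heuristics can still reorder them.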

Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
jankara authored and axboe committed Jun 3, 2021
1 parent 90bf3e2 commit 6134715
Showing 3 changed files with 14 additions and 2 deletions.
12 changes: 11 additions & 1 deletion block/blk-mq-sched.c
@@ -168,9 +168,19 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 		 * in blk_mq_dispatch_rq_list().
 		 */
 		list_add_tail(&rq->queuelist, &rq_list);
+		count++;
 		if (rq->mq_hctx != hctx)
 			multi_hctxs = true;
-	} while (++count < max_dispatch);
+
+		/*
+		 * If we cannot get tag for the request, stop dequeueing
+		 * requests from the IO scheduler. We are unlikely to be able
+		 * to submit them anyway and it creates false impression for
+		 * scheduling heuristics that the device can take more IO.
+		 */
+		if (!blk_mq_get_driver_tag(rq))
+			break;
+	} while (count < max_dispatch);
 
 	if (!count) {
 		if (run_queue)
2 changes: 1 addition & 1 deletion block/blk-mq.c
@@ -1104,7 +1104,7 @@ static bool __blk_mq_get_driver_tag(struct request *rq)
 	return true;
 }
 
-static bool blk_mq_get_driver_tag(struct request *rq)
+bool blk_mq_get_driver_tag(struct request *rq)
 {
 	struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
 
2 changes: 2 additions & 0 deletions block/blk-mq.h
@@ -260,6 +260,8 @@ static inline void blk_mq_put_driver_tag(struct request *rq)
 	__blk_mq_put_driver_tag(rq->mq_hctx, rq);
 }
 
+bool blk_mq_get_driver_tag(struct request *rq);
+
 static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)
 {
 	int cpu;
