blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe

blk-mq uses percpu_ref for its usage counter, which tracks the number
of in-flight commands and is used to synchronously drain the queue on
freeze.  percpu_ref shutdown takes measurable wallclock time as it
involves a sched RCU grace period.  This means that draining a blk-mq
queue takes measurable wallclock time.  One would think that this shouldn't
matter as queue shutdown should be a rare event which takes place
asynchronously w.r.t. userland.
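
The cost described above comes from how a percpu_ref-style counter changes
modes on shutdown. The following is a minimal userspace sketch of that idea,
not kernel code: `model_ref`, `MODEL_NR_CPUS` and the `model_ref_*` helpers
are hypothetical names chosen for illustration, the counter layout is
simplified, and the grace period is represented only by a comment. While the
ref is live, gets and puts touch a per-CPU slot; once it is killed, the
per-CPU deltas are folded into one shared counter, and that fold is only safe
after every CPU that might still be updating its slot has finished.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for this model */

/* Toy stand-in for a percpu_ref; NOT the kernel's struct layout. */
struct model_ref {
	long percpu[MODEL_NR_CPUS]; /* per-CPU deltas; a slot may go negative */
	long shared;                /* shared counter, used once dead */
	bool dead;                  /* plays the role of PCPU_REF_DEAD */
};

static void model_ref_init(struct model_ref *ref)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		ref->percpu[i] = 0;
	ref->shared = 1; /* the initial ref, dropped by kill */
	ref->dead = false;
}

/* Fast path while live: only the caller's per-CPU slot is touched. */
static void model_ref_get(struct model_ref *ref, int cpu)
{
	if (ref->dead)
		ref->shared++;
	else
		ref->percpu[cpu]++;
}

static void model_ref_put(struct model_ref *ref, int cpu)
{
	if (ref->dead)
		ref->shared--;
	else
		ref->percpu[cpu]--;
}

static void model_ref_kill(struct model_ref *ref)
{
	ref->dead = true;
	/*
	 * In the kernel, a sched RCU grace period must elapse here so that
	 * no CPU is still in the middle of a per-CPU update. This
	 * single-threaded model has nothing to wait for.
	 */
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		ref->shared += ref->percpu[i];
		ref->percpu[i] = 0;
	}
	ref->shared--; /* drop the initial ref */
}

/* A drain (freeze) can complete only once this returns true. */
static bool model_ref_is_zero(const struct model_ref *ref)
{
	return ref->dead && ref->shared == 0;
}
```

In this model a get on one CPU and the matching put on another leave two
non-zero slots that cancel out only when summed, which is why the total is
not known until the fold has happened.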

Unfortunately, SCSI probing involves synchronously setting up and then
tearing down a lot of request_queues back-to-back for non-existent
LUNs.  This means that SCSI probing may take more than ten seconds
when scsi-mq is used.

This will be properly fixed by implementing a mechanism to keep
q->mq_usage_counter in atomic mode till genhd registration; however,
that involves rather big updates to percpu_ref which are difficult to
apply late in the devel cycle (v3.17-rc6 at the moment).  As a
stop-gap measure till the proper fix can be implemented in the next
cycle, this patch introduces __percpu_ref_kill_expedited() and makes
blk_mq_freeze_queue() use it.  This is heavy-handed but should work
for testing the experimental SCSI blk-mq implementation.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Christoph Hellwig <[email protected]>
Link: http://lkml.kernel.org/g/[email protected]
Fixes: add703f ("blk-mq: use percpu_ref for mq usage count")
Cc: Kent Overstreet <[email protected]>
Cc: Jens Axboe <[email protected]>
Tested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
htejun authored and axboe committed Sep 24, 2014
1 parent 452b636 commit 0a30288
Showing 3 changed files with 27 additions and 1 deletion.
11 changes: 10 additions & 1 deletion block/blk-mq.c
@@ -119,7 +119,16 @@ void blk_mq_freeze_queue(struct request_queue *q)
 	spin_unlock_irq(q->queue_lock);
 
 	if (freeze) {
-		percpu_ref_kill(&q->mq_usage_counter);
+		/*
+		 * XXX: Temporary kludge to work around SCSI blk-mq stall.
+		 * SCSI synchronously creates and destroys many queues
+		 * back-to-back during probe leading to lengthy stalls.
+		 * This will be fixed by keeping ->mq_usage_counter in
+		 * atomic mode until genhd registration, but, for now,
+		 * let's work around using expedited synchronization.
+		 */
+		__percpu_ref_kill_expedited(&q->mq_usage_counter);
+
 		blk_mq_run_queues(q, false);
 	}
 	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->mq_usage_counter));
1 change: 1 addition & 0 deletions include/linux/percpu-refcount.h
@@ -71,6 +71,7 @@ void percpu_ref_reinit(struct percpu_ref *ref);
 void percpu_ref_exit(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
+void __percpu_ref_kill_expedited(struct percpu_ref *ref);
 
 /**
  * percpu_ref_kill - drop the initial ref
16 changes: 16 additions & 0 deletions lib/percpu-refcount.c
@@ -184,3 +184,19 @@ void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 	call_rcu_sched(&ref->rcu, percpu_ref_kill_rcu);
 }
 EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
+
+/*
+ * XXX: Temporary kludge to work around SCSI blk-mq stall. Used only by
+ * block/blk-mq.c::blk_mq_freeze_queue(). Will be removed during v3.18
+ * devel cycle. Do not use anywhere else.
+ */
+void __percpu_ref_kill_expedited(struct percpu_ref *ref)
+{
+	WARN_ONCE(ref->pcpu_count_ptr & PCPU_REF_DEAD,
+		  "percpu_ref_kill() called more than once on %pf!",
+		  ref->release);
+
+	ref->pcpu_count_ptr |= PCPU_REF_DEAD;
+	synchronize_sched_expedited();
+	percpu_ref_kill_rcu(&ref->rcu);
+}
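
The difference between the two kill paths in lib/percpu-refcount.c can be
sketched in userspace as follows. This is only a control-flow model under
stated assumptions, not the kernel implementation: `kill_model`,
`model_kill_async`, `model_kill_expedited` and
`model_grace_period_elapsed` are hypothetical names, the queued RCU callback
is modelled as a single function pointer, and the expedited wait is modelled
as a counter increment. The point it tries to show is that
percpu_ref_kill_and_confirm() queues percpu_ref_kill_rcu() with
call_rcu_sched() and returns, whereas the expedited variant waits
synchronously and then invokes the same callback directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy state for one ref; field names are hypothetical. */
struct kill_model {
	bool dead;                            /* PCPU_REF_DEAD has been set */
	bool folded;                          /* the kill callback has run */
	void (*pending)(struct kill_model *); /* models a queued RCU callback */
	unsigned int expedited_waits;         /* expedited syncs performed */
};

/* Stands in for percpu_ref_kill_rcu(): folds per-CPU counts. */
static void model_kill_rcu(struct kill_model *ref)
{
	ref->folded = true;
}

/* Models the call_rcu_sched() path: mark dead, queue the callback, return. */
static void model_kill_async(struct kill_model *ref)
{
	ref->dead = true;
	ref->pending = model_kill_rcu;
}

/* Models the end of a normal grace period: run the queued callback, if any. */
static void model_grace_period_elapsed(struct kill_model *ref)
{
	if (ref->pending) {
		void (*cb)(struct kill_model *) = ref->pending;

		ref->pending = NULL;
		cb(ref);
	}
}

/* Models the expedited path: mark dead, wait in place, call directly. */
static void model_kill_expedited(struct kill_model *ref)
{
	ref->dead = true;
	ref->expedited_waits++; /* stands in for synchronize_sched_expedited() */
	model_kill_rcu(ref);
}
```

In the model, a ref killed through `model_kill_async()` stays unfolded until
`model_grace_period_elapsed()` is called, while `model_kill_expedited()`
returns with the fold already done. The model does not capture the cost of
the expedited wait itself, which the commit message describes as
heavy-handed.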
