Commit 06ede5f

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull more block layer updates from Jens Axboe:
 "A followup pull request, with some parts that either needed a bit
  more testing before going in, merge sync, or just later arriving
  fixes. This contains:

   - Timer related updates from Kees. These were purposefully delayed
     since I didn't want to pull in a later v4.14-rc tag to my block
     tree.

   - ide-cd prep sense buffer fix from Bart. Also delayed, as not to
     clash with the late fix we put into 4.14-rc.

   - Small BFQ updates series from Luca and Paolo.

   - Single nvmet fix from James, fixing a non-functional case there.

   - Bio fast clone fix from Michael, which made bcache return the
     wrong data for some cases.

   - Legacy IO path regression hang fix from Ming"

* 'for-linus' of git://git.kernel.dk/linux-block:
  bio: ensure __bio_clone_fast copies bi_partno
  nvmet_fc: fix better length checking
  block: wake up all tasks blocked in get_request()
  block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUP
  block, bfq: update blkio stats outside the scheduler lock
  block, bfq: add missing invocations of bfqg_stats_update_io_add/remove
  doc, block, bfq: update max IOPS sustainable with BFQ
  ide: Make ide_cdrom_prep_fs() initialize the sense buffer pointer
  md: Convert timers to use timer_setup()
  block: swim3: Convert timers to use timer_setup()
  block/aoe: Convert timers to use timer_setup()
  amifloppy: Convert timers to use timer_setup()
  block/floppy: Convert callback to pass timer_list
2 parents a3841f9 + 62530ed commit 06ede5f
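
Several of the commits pulled here ("Convert timers to use timer_setup()") switch block drivers from the old setup_timer()/unsigned-long-data API to timer_setup(). A minimal sketch of that conversion pattern follows, using a hypothetical foo_device driver rather than code from this merge:

#include <linux/timer.h>
#include <linux/jiffies.h>

/* Hypothetical driver state embedding its timer. */
struct foo_device {
        struct timer_list timer;
        int pending;
};

/* New-style callback: it receives the timer_list pointer and recovers
 * the enclosing structure with from_timer(), instead of casting an
 * unsigned long 'data' argument as the old API required. */
static void foo_timeout(struct timer_list *t)
{
        struct foo_device *foo = from_timer(foo, t, timer);

        foo->pending = 0;
}

static void foo_init(struct foo_device *foo)
{
        /* Replaces: setup_timer(&foo->timer, foo_timeout, (unsigned long)foo); */
        timer_setup(&foo->timer, foo_timeout, 0);
}

static void foo_arm(struct foo_device *foo)
{
        foo->pending = 1;
        mod_timer(&foo->timer, jiffies + HZ);   /* fire in about one second */
}

The point of the new API is type safety: the callback gets the timer itself, so the casts to and from unsigned long disappear.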

19 files changed (+311, -166 lines)

Documentation/block/bfq-iosched.txt (+37, -6)
@@ -20,12 +20,27 @@ for that device, by setting low_latency to 0. See Section 3 for
 details on how to configure BFQ for the desired tradeoff between
 latency and throughput, or on how to maximize throughput.
 
-On average CPUs, the current version of BFQ can handle devices
-performing at most ~30K IOPS; at most ~50 KIOPS on faster CPUs. As a
-reference, 30-50 KIOPS correspond to very high bandwidths with
-sequential I/O (e.g., 8-12 GB/s if I/O requests are 256 KB large), and
-to 120-200 MB/s with 4KB random I/O. BFQ is currently being tested on
-multi-queue devices too.
+BFQ has a non-null overhead, which limits the maximum IOPS that a CPU
+can process for a device scheduled with BFQ. To give an idea of the
+limits on slow or average CPUs, here are, first, the limits of BFQ for
+three different CPUs, on, respectively, an average laptop, an old
+desktop, and a cheap embedded system, in case full hierarchical
+support is enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set), but
+CONFIG_DEBUG_BLK_CGROUP is not set (Section 4-2):
+- Intel i7-4850HQ: 400 KIOPS
+- AMD A8-3850: 250 KIOPS
+- ARM CortexTM-A53 Octa-core: 80 KIOPS
+
+If CONFIG_DEBUG_BLK_CGROUP is set (and of course full hierarchical
+support is enabled), then the sustainable throughput with BFQ
+decreases, because all blkio.bfq* statistics are created and updated
+(Section 4-2). For BFQ, this leads to the following maximum
+sustainable throughputs, on the same systems as above:
+- Intel i7-4850HQ: 310 KIOPS
+- AMD A8-3850: 200 KIOPS
+- ARM CortexTM-A53 Octa-core: 56 KIOPS
+
+BFQ works for multi-queue devices too.
 
 The table of contents follow. Impatients can just jump to Section 3.

@@ -500,6 +515,22 @@ BFQ-specific files is "blkio.bfq." or "io.bfq." For example, the group
 parameter to set the weight of a group with BFQ is blkio.bfq.weight
 or io.bfq.weight.
 
+As for cgroups-v1 (blkio controller), the exact set of stat files
+created, and kept up-to-date by bfq, depends on whether
+CONFIG_DEBUG_BLK_CGROUP is set. If it is set, then bfq creates all
+the stat files documented in
+Documentation/cgroup-v1/blkio-controller.txt. If, instead,
+CONFIG_DEBUG_BLK_CGROUP is not set, then bfq creates only the files
+blkio.bfq.io_service_bytes
+blkio.bfq.io_service_bytes_recursive
+blkio.bfq.io_serviced
+blkio.bfq.io_serviced_recursive
+
+The value of CONFIG_DEBUG_BLK_CGROUP greatly influences the maximum
+throughput sustainable with bfq, because updating the blkio.bfq.*
+stats is rather costly, especially for some of the stats enabled by
+CONFIG_DEBUG_BLK_CGROUP.
+
 Parameters to set
 -----------------
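
To make the documentation change above concrete, the four always-created files can be read as plain text from a cgroup-v1 blkio mount. A hedged userspace sketch; the mount point /sys/fs/cgroup/blkio and the per-line format are assumptions about a typical setup, not something this patch defines:

/* Hypothetical reader for one of the always-created bfq stat files. */
#include <stdio.h>

int main(void)
{
        const char *path = "/sys/fs/cgroup/blkio/blkio.bfq.io_serviced";
        char line[256];
        FILE *f = fopen(path, "r");

        if (!f) {
                perror("fopen");
                return 1;
        }
        /* Lines typically look like "8:0 Read 1234", ending with a Total line. */
        while (fgets(line, sizeof(line), f))
                fputs(line, stdout);
        fclose(f);
        return 0;
}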

block/bfq-cgroup.c (+84, -64)
@@ -24,7 +24,7 @@
 
 #include "bfq-iosched.h"
 
-#ifdef CONFIG_BFQ_GROUP_IOSCHED
+#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
 
 /* bfqg stats flags */
 enum bfqg_stats_flags {
@@ -152,6 +152,57 @@ void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg)
 	bfqg_stats_update_group_wait_time(stats);
 }
 
+void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq,
+			      unsigned int op)
+{
+	blkg_rwstat_add(&bfqg->stats.queued, op, 1);
+	bfqg_stats_end_empty_time(&bfqg->stats);
+	if (!(bfqq == ((struct bfq_data *)bfqg->bfqd)->in_service_queue))
+		bfqg_stats_set_start_group_wait_time(bfqg, bfqq_group(bfqq));
+}
+
+void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op)
+{
+	blkg_rwstat_add(&bfqg->stats.queued, op, -1);
+}
+
+void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op)
+{
+	blkg_rwstat_add(&bfqg->stats.merged, op, 1);
+}
+
+void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time,
+				  uint64_t io_start_time, unsigned int op)
+{
+	struct bfqg_stats *stats = &bfqg->stats;
+	unsigned long long now = sched_clock();
+
+	if (time_after64(now, io_start_time))
+		blkg_rwstat_add(&stats->service_time, op,
+				now - io_start_time);
+	if (time_after64(io_start_time, start_time))
+		blkg_rwstat_add(&stats->wait_time, op,
+				io_start_time - start_time);
+}
+
+#else /* CONFIG_BFQ_GROUP_IOSCHED && CONFIG_DEBUG_BLK_CGROUP */
+
+void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq,
+			      unsigned int op) { }
+void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op) { }
+void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op) { }
+void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time,
+				  uint64_t io_start_time, unsigned int op) { }
+void bfqg_stats_update_dequeue(struct bfq_group *bfqg) { }
+void bfqg_stats_set_start_empty_time(struct bfq_group *bfqg) { }
+void bfqg_stats_update_idle_time(struct bfq_group *bfqg) { }
+void bfqg_stats_set_start_idle_time(struct bfq_group *bfqg) { }
+void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg) { }
+
+#endif /* CONFIG_BFQ_GROUP_IOSCHED && CONFIG_DEBUG_BLK_CGROUP */
+
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
+
 /*
  * blk-cgroup policy-related handlers
  * The following functions help in converting between blk-cgroup
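
The structure of this hunk is a common kernel pattern: the helpers keep their signatures, call sites stay unconditional, and a Kconfig symbol decides whether the definitions do real work or are empty stubs the compiler can drop. A self-contained sketch of the idea, with a hypothetical CONFIG_FOO_STATS knob in plain userspace C rather than kernel code:

/* Sketch of the compile-out pattern used above (hypothetical knob). */
#include <stdio.h>

#define CONFIG_FOO_STATS 1      /* comment out to compile the stats away */

struct foo {
        unsigned long nr_events;
};

#ifdef CONFIG_FOO_STATS
/* Real implementations: account and report. */
static void foo_stats_inc(struct foo *f)
{
        f->nr_events++;
}
static void foo_stats_print(const struct foo *f)
{
        printf("events: %lu\n", f->nr_events);
}
#else
/* Empty stubs: call sites need no #ifdefs and the calls optimize away. */
static void foo_stats_inc(struct foo *f) { (void)f; }
static void foo_stats_print(const struct foo *f) { (void)f; }
#endif

int main(void)
{
        struct foo f = { 0 };

        foo_stats_inc(&f);      /* unconditional call site */
        foo_stats_print(&f);
        return 0;
}

With the knob off, the stubbed calls vanish at compile time, which is exactly how moving the bfqg stats behind CONFIG_DEBUG_BLK_CGROUP recovers the throughput cited in the documentation change above.
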
@@ -229,42 +280,10 @@ void bfqg_and_blkg_put(struct bfq_group *bfqg)
 	blkg_put(bfqg_to_blkg(bfqg));
 }
 
-void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq,
-			      unsigned int op)
-{
-	blkg_rwstat_add(&bfqg->stats.queued, op, 1);
-	bfqg_stats_end_empty_time(&bfqg->stats);
-	if (!(bfqq == ((struct bfq_data *)bfqg->bfqd)->in_service_queue))
-		bfqg_stats_set_start_group_wait_time(bfqg, bfqq_group(bfqq));
-}
-
-void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op)
-{
-	blkg_rwstat_add(&bfqg->stats.queued, op, -1);
-}
-
-void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op)
-{
-	blkg_rwstat_add(&bfqg->stats.merged, op, 1);
-}
-
-void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time,
-				  uint64_t io_start_time, unsigned int op)
-{
-	struct bfqg_stats *stats = &bfqg->stats;
-	unsigned long long now = sched_clock();
-
-	if (time_after64(now, io_start_time))
-		blkg_rwstat_add(&stats->service_time, op,
-				now - io_start_time);
-	if (time_after64(io_start_time, start_time))
-		blkg_rwstat_add(&stats->wait_time, op,
-				io_start_time - start_time);
-}
-
 /* @stats = 0 */
 static void bfqg_stats_reset(struct bfqg_stats *stats)
 {
+#ifdef CONFIG_DEBUG_BLK_CGROUP
 	/* queued stats shouldn't be cleared */
 	blkg_rwstat_reset(&stats->merged);
 	blkg_rwstat_reset(&stats->service_time);
@@ -276,6 +295,7 @@ static void bfqg_stats_reset(struct bfqg_stats *stats)
 	blkg_stat_reset(&stats->group_wait_time);
 	blkg_stat_reset(&stats->idle_time);
 	blkg_stat_reset(&stats->empty_time);
+#endif
 }
 
 /* @to += @from */
@@ -284,6 +304,7 @@ static void bfqg_stats_add_aux(struct bfqg_stats *to, struct bfqg_stats *from)
 	if (!to || !from)
 		return;
 
+#ifdef CONFIG_DEBUG_BLK_CGROUP
 	/* queued stats shouldn't be cleared */
 	blkg_rwstat_add_aux(&to->merged, &from->merged);
 	blkg_rwstat_add_aux(&to->service_time, &from->service_time);
@@ -296,6 +317,7 @@ static void bfqg_stats_add_aux(struct bfqg_stats *to, struct bfqg_stats *from)
 	blkg_stat_add_aux(&to->group_wait_time, &from->group_wait_time);
 	blkg_stat_add_aux(&to->idle_time, &from->idle_time);
 	blkg_stat_add_aux(&to->empty_time, &from->empty_time);
+#endif
 }
 
 /*
@@ -342,6 +364,7 @@ void bfq_init_entity(struct bfq_entity *entity, struct bfq_group *bfqg)
 
 static void bfqg_stats_exit(struct bfqg_stats *stats)
 {
+#ifdef CONFIG_DEBUG_BLK_CGROUP
 	blkg_rwstat_exit(&stats->merged);
 	blkg_rwstat_exit(&stats->service_time);
 	blkg_rwstat_exit(&stats->wait_time);
@@ -353,10 +376,12 @@ static void bfqg_stats_exit(struct bfqg_stats *stats)
 	blkg_stat_exit(&stats->group_wait_time);
 	blkg_stat_exit(&stats->idle_time);
 	blkg_stat_exit(&stats->empty_time);
+#endif
 }
 
 static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp)
 {
+#ifdef CONFIG_DEBUG_BLK_CGROUP
 	if (blkg_rwstat_init(&stats->merged, gfp) ||
 	    blkg_rwstat_init(&stats->service_time, gfp) ||
 	    blkg_rwstat_init(&stats->wait_time, gfp) ||
@@ -371,6 +396,7 @@ static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp)
 		bfqg_stats_exit(stats);
 		return -ENOMEM;
 	}
+#endif
 
 	return 0;
 }
@@ -887,6 +913,7 @@ static ssize_t bfq_io_set_weight(struct kernfs_open_file *of,
 	return bfq_io_set_weight_legacy(of_css(of), NULL, weight);
 }
 
+#ifdef CONFIG_DEBUG_BLK_CGROUP
 static int bfqg_print_stat(struct seq_file *sf, void *v)
 {
 	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
@@ -991,6 +1018,7 @@ static int bfqg_print_avg_queue_size(struct seq_file *sf, void *v)
 			  0, false);
 	return 0;
 }
+#endif /* CONFIG_DEBUG_BLK_CGROUP */
 
 struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node)
 {
@@ -1028,15 +1056,6 @@ struct cftype bfq_blkcg_legacy_files[] = {
 	},
 
 	/* statistics, covers only the tasks in the bfqg */
-	{
-		.name = "bfq.time",
-		.private = offsetof(struct bfq_group, stats.time),
-		.seq_show = bfqg_print_stat,
-	},
-	{
-		.name = "bfq.sectors",
-		.seq_show = bfqg_print_stat_sectors,
-	},
 	{
 		.name = "bfq.io_service_bytes",
 		.private = (unsigned long)&blkcg_policy_bfq,
@@ -1047,6 +1066,16 @@ struct cftype bfq_blkcg_legacy_files[] = {
 		.private = (unsigned long)&blkcg_policy_bfq,
 		.seq_show = blkg_print_stat_ios,
 	},
+#ifdef CONFIG_DEBUG_BLK_CGROUP
+	{
+		.name = "bfq.time",
+		.private = offsetof(struct bfq_group, stats.time),
+		.seq_show = bfqg_print_stat,
+	},
+	{
+		.name = "bfq.sectors",
+		.seq_show = bfqg_print_stat_sectors,
+	},
 	{
 		.name = "bfq.io_service_time",
 		.private = offsetof(struct bfq_group, stats.service_time),
@@ -1067,17 +1096,9 @@ struct cftype bfq_blkcg_legacy_files[] = {
 		.private = offsetof(struct bfq_group, stats.queued),
 		.seq_show = bfqg_print_rwstat,
 	},
+#endif /* CONFIG_DEBUG_BLK_CGROUP */
 
 	/* the same statictics which cover the bfqg and its descendants */
-	{
-		.name = "bfq.time_recursive",
-		.private = offsetof(struct bfq_group, stats.time),
-		.seq_show = bfqg_print_stat_recursive,
-	},
-	{
-		.name = "bfq.sectors_recursive",
-		.seq_show = bfqg_print_stat_sectors_recursive,
-	},
 	{
 		.name = "bfq.io_service_bytes_recursive",
 		.private = (unsigned long)&blkcg_policy_bfq,
@@ -1088,6 +1109,16 @@ struct cftype bfq_blkcg_legacy_files[] = {
 		.private = (unsigned long)&blkcg_policy_bfq,
 		.seq_show = blkg_print_stat_ios_recursive,
 	},
+#ifdef CONFIG_DEBUG_BLK_CGROUP
+	{
+		.name = "bfq.time_recursive",
+		.private = offsetof(struct bfq_group, stats.time),
+		.seq_show = bfqg_print_stat_recursive,
+	},
+	{
+		.name = "bfq.sectors_recursive",
+		.seq_show = bfqg_print_stat_sectors_recursive,
+	},
 	{
 		.name = "bfq.io_service_time_recursive",
 		.private = offsetof(struct bfq_group, stats.service_time),
@@ -1132,6 +1163,7 @@ struct cftype bfq_blkcg_legacy_files[] = {
 		.private = offsetof(struct bfq_group, stats.dequeue),
 		.seq_show = bfqg_print_stat,
 	},
+#endif /* CONFIG_DEBUG_BLK_CGROUP */
 	{ } /* terminate */
 };
 
@@ -1147,18 +1179,6 @@ struct cftype bfq_blkg_files[] = {
 
 #else /* CONFIG_BFQ_GROUP_IOSCHED */
 
-void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq,
-			      unsigned int op) { }
-void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op) { }
-void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op) { }
-void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time,
-				  uint64_t io_start_time, unsigned int op) { }
-void bfqg_stats_update_dequeue(struct bfq_group *bfqg) { }
-void bfqg_stats_set_start_empty_time(struct bfq_group *bfqg) { }
-void bfqg_stats_update_idle_time(struct bfq_group *bfqg) { }
-void bfqg_stats_set_start_idle_time(struct bfq_group *bfqg) { }
-void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg) { }
-
 void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		   struct bfq_group *bfqg) {}
 