Merge tag 'dm-4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - Most attention this cycle went to optimizing blk-mq request-based DM
   (dm-mq), which is used exclusively by DM multipath:

     - A stable fix for dm-mq that eliminates excessive context
       switching offers the biggest performance improvement (for both
       IOPs and throughput).

     - But more work is needed, during the next cycle, to reduce
       spinlock contention in DM multipath on large NUMA systems.

 - A stable fix for a NULL pointer dereference seen when DM stats is
   enabled on a DM multipath device that must requeue an IO due to path
   failure.

 - A stable fix for DM snapshot to disallow the COW and origin devices
   from being identical.  This amounts to graceful failure in the face
   of userspace error because these devices shouldn't ever be identical
   (see the first sketch after this list).

 - Stable fixes for DM cache and DM thin provisioning to address crashes
   seen if/when their respective metadata device experiences failures
   that cause the transition to 'fail_io' mode.

 - The DM cache 'mq' policy is now an alias for the 'smq' policy.  The
   'smq' policy proved to be consistently better than 'mq'.  As such,
   'mq', with all its complex user-facing tunables, has been eliminated
   (a sketch of the aliasing follows the cache-policies.txt diff below).

 - Improve DM thin provisioning to consistently return -ENOSPC once the
   thin-pool's data volume is out of space.

 - Improve DM core to properly handle error propagation if
   bio_integrity_clone() fails in clone_bio() (see the second sketch
   after this list).

 - Other small cleanups and improvements to DM core.
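
   As an illustration of the snapshot fix, here is a minimal sketch of
   the kind of constructor-time check involved; the helper name and the
   error string are assumptions for illustration, not the exact code
   from this series:

    #include <linux/device-mapper.h>

    /*
     * Sketch only: fail table construction gracefully when the
     * origin and COW arguments resolve to the same block device.
     */
    static int snapshot_check_devs(struct dm_target *ti,
                                   struct dm_dev *origin,
                                   struct dm_dev *cow)
    {
            if (origin->bdev == cow->bdev) {
                    ti->error = "COW device cannot be the same as origin device";
                    return -EINVAL;
            }

            return 0;
    }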

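   For the bio_integrity_clone() error-propagation fix, a simplified
   sketch of the pattern (close to, but not verbatim, DM core's
   clone_bio(); the bio clipping logic is elided):

    #include <linux/bio.h>
    #include <linux/device-mapper.h>

    /*
     * Sketch: return bio_integrity_clone()'s error instead of ignoring
     * it, so the caller can bail out and fail the original bio.
     */
    static int clone_bio(struct bio *clone, struct bio *bio,
                         sector_t sector, unsigned len)
    {
            __bio_clone_fast(clone, bio);

            if (bio_integrity(bio)) {
                    int r = bio_integrity_clone(clone, bio, GFP_NOIO);

                    if (r < 0)
                            return r;
            }

            clone->bi_iter.bi_sector = sector;
            clone->bi_iter.bi_size = to_bytes(len);

            return 0;
    }
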
* tag 'dm-4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (41 commits)
  dm: fix rq_end_stats() NULL pointer in dm_requeue_original_request()
  dm thin: consistently return -ENOSPC if pool has run out of data space
  dm cache: bump the target version
  dm cache: make sure every metadata function checks fail_io
  dm: add missing newline between DM_DEBUG_BLOCK_STACK_TRACING and DM_BUFIO
  dm cache policy smq: clarify that mq registration failure was for 'mq'
  dm: return error if bio_integrity_clone() fails in clone_bio()
  dm thin metadata: don't issue prefetches if a transaction abort has failed
  dm snapshot: disallow the COW and origin devices from being identical
  dm cache: make the 'mq' policy an alias for 'smq'
  dm: drop unnecessary assignment of md->queue
  dm: reorder 'struct mapped_device' members to fix alignment and holes
  dm: remove dummy definition of 'struct dm_table'
  dm: add 'dm_numa_node' module parameter
  dm thin metadata: remove needless newline from subtree_dec() DMERR message
  dm mpath: cleanup reinstate_path() et al based on code review
  dm mpath: remove __pgpath_busy forward declaration, rename to pgpath_busy
  dm mpath: switch from 'unsigned' to 'bool' for flags where appropriate
  dm round robin: use percpu 'repeat_count' and 'current_path'
  dm path selector: remove 'repeat_count' return from .select_path hook
  ...
torvalds committed Mar 17, 2016
2 parents cae8da0 + 98dbc9c commit 6968e6f
Showing 30 changed files with 849 additions and 2,024 deletions.
39 changes: 2 additions & 37 deletions Documentation/device-mapper/cache-policies.txt
@@ -28,51 +28,16 @@ Overview of supplied cache replacement policies
 multiqueue (mq)
 ---------------
 
-This policy has been deprecated in favor of the smq policy (see below).
+This policy is now an alias for smq (see below).
 
-The multiqueue policy has three sets of 16 queues: one set for entries
-waiting for the cache and another two for those in the cache (a set for
-clean entries and a set for dirty entries).
+The following tunables are accepted, but have no effect:
 
-Cache entries in the queues are aged based on logical time. Entry into
-the cache is based on variable thresholds and queue selection is based
-on hit count on entry. The policy aims to take different cache miss
-costs into account and to adjust to varying load patterns automatically.
-
-Message and constructor argument pairs are:
 	'sequential_threshold <#nr_sequential_ios>'
 	'random_threshold <#nr_random_ios>'
 	'read_promote_adjustment <value>'
 	'write_promote_adjustment <value>'
 	'discard_promote_adjustment <value>'
 
-The sequential threshold indicates the number of contiguous I/Os
-required before a stream is treated as sequential. Once a stream is
-considered sequential it will bypass the cache. The random threshold
-is the number of intervening non-contiguous I/Os that must be seen
-before the stream is treated as random again.
-
-The sequential and random thresholds default to 512 and 4 respectively.
-
-Large, sequential I/Os are probably better left on the origin device
-since spindles tend to have good sequential I/O bandwidth. The
-io_tracker counts contiguous I/Os to try to spot when the I/O is in one
-of these sequential modes. But there are use-cases for wanting to
-promote sequential blocks to the cache (e.g. fast application startup).
-If sequential threshold is set to 0 the sequential I/O detection is
-disabled and sequential I/O will no longer implicitly bypass the cache.
-Setting the random threshold to 0 does _not_ disable the random I/O
-stream detection.
-
-Internally the mq policy determines a promotion threshold. If the hit
-count of a block not in the cache goes above this threshold it gets
-promoted to the cache. The read, write and discard promote adjustment
-tunables allow you to tweak the promotion threshold by adding a small
-value based on the io type. They default to 4, 8 and 1 respectively.
-If you're trying to quickly warm a new cache device you may wish to
-reduce these to encourage promotion. Remember to switch them back to
-their defaults after the cache fills though.
-
 Stochastic multiqueue (smq)
 ---------------------------
 
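The documentation change above is the user-visible half of the mq-to-smq
aliasing; in code, the alias amounts to registering a second
dm_cache_policy_type whose create hook builds an smq policy.  A rough
sketch under that assumption (the version numbers, hint_size, and the
smq constructor name are illustrative, not the exact code):

    #include <linux/module.h>
    #include "dm-cache-policy.h"

    /* assumed smq constructor; the real one lives in dm-cache-policy-smq.c */
    static struct dm_cache_policy *mq_create(dm_cblock_t cache_size,
                                             sector_t origin_size,
                                             sector_t cache_block_size)
    {
            return smq_create(cache_size, origin_size, cache_block_size);
    }

    static struct dm_cache_policy_type mq_policy_type = {
            .name = "mq",
            .version = {1, 4, 0},
            .hint_size = 4,
            .owner = THIS_MODULE,
            .create = mq_create,
    };

    /* registered from the module's init path: */
    /* r = dm_cache_policy_register(&mq_policy_type); */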
2 changes: 1 addition & 1 deletion block/blk-core.c
@@ -2198,7 +2198,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 	if (q->mq_ops) {
 		if (blk_queue_io_stat(q))
 			blk_account_io_start(rq, true);
-		blk_mq_insert_request(rq, false, true, true);
+		blk_mq_insert_request(rq, false, true, false);
 		return 0;
 	}
 
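For orientation: in the v4.5-era block layer, blk_mq_insert_request()'s
final parameter is the async flag, so flipping it from true to false
makes the queue run synchronously from the inserting context rather than
via a deferred kick; this lines up with the dm-mq context-switching fix
called out in the merge message above.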
11 changes: 1 addition & 10 deletions drivers/md/Kconfig
@@ -249,6 +249,7 @@ config DM_DEBUG_BLOCK_STACK_TRACING
 	  block manager locking used by thin provisioning and caching.
 
 	  If unsure, say N.
+
 config DM_BIO_PRISON
 	tristate
 	depends on BLK_DEV_DM
@@ -304,16 +305,6 @@ config DM_CACHE
 	  algorithms used to select which blocks are promoted, demoted,
 	  cleaned etc.  It supports writeback and writethrough modes.
 
-config DM_CACHE_MQ
-	tristate "MQ Cache Policy (EXPERIMENTAL)"
-	depends on DM_CACHE
-	default y
-	---help---
-	  A cache policy that uses a multiqueue ordered by recent hit
-	  count to select which blocks should be promoted and demoted.
-	  This is meant to be a general purpose policy.  It prioritises
-	  reads over writes.
-
 config DM_CACHE_SMQ
 	tristate "Stochastic MQ Cache Policy (EXPERIMENTAL)"
 	depends on DM_CACHE
2 changes: 0 additions & 2 deletions drivers/md/Makefile
@@ -12,7 +12,6 @@ dm-log-userspace-y \
 		+= dm-log-userspace-base.o dm-log-userspace-transfer.o
 dm-thin-pool-y	+= dm-thin.o dm-thin-metadata.o
 dm-cache-y	+= dm-cache-target.o dm-cache-metadata.o dm-cache-policy.o
-dm-cache-mq-y	+= dm-cache-policy-mq.o
 dm-cache-smq-y	+= dm-cache-policy-smq.o
 dm-cache-cleaner-y += dm-cache-policy-cleaner.o
 dm-era-y	+= dm-era-target.o
@@ -55,7 +54,6 @@ obj-$(CONFIG_DM_RAID)		+= dm-raid.o
 obj-$(CONFIG_DM_THIN_PROVISIONING)	+= dm-thin-pool.o
 obj-$(CONFIG_DM_VERITY)		+= dm-verity.o
 obj-$(CONFIG_DM_CACHE)		+= dm-cache.o
-obj-$(CONFIG_DM_CACHE_MQ)	+= dm-cache-mq.o
 obj-$(CONFIG_DM_CACHE_SMQ)	+= dm-cache-smq.o
 obj-$(CONFIG_DM_CACHE_CLEANER)	+= dm-cache-cleaner.o
 obj-$(CONFIG_DM_ERA)		+= dm-era.o
98 changes: 59 additions & 39 deletions drivers/md/dm-cache-metadata.c
@@ -867,19 +867,40 @@ static int blocks_are_unmapped_or_clean(struct dm_cache_metadata *cmd,
 	return 0;
 }
 
-#define WRITE_LOCK(cmd) \
-	if (cmd->fail_io || dm_bm_is_read_only(cmd->bm)) \
-		return -EINVAL; \
-	down_write(&cmd->root_lock)
+#define WRITE_LOCK(cmd) \
+	down_write(&cmd->root_lock); \
+	if (cmd->fail_io || dm_bm_is_read_only(cmd->bm)) { \
+		up_write(&cmd->root_lock); \
+		return -EINVAL; \
+	}
 
 #define WRITE_LOCK_VOID(cmd) \
-	if (cmd->fail_io || dm_bm_is_read_only(cmd->bm)) \
-		return; \
-	down_write(&cmd->root_lock)
+	down_write(&cmd->root_lock); \
+	if (cmd->fail_io || dm_bm_is_read_only(cmd->bm)) { \
+		up_write(&cmd->root_lock); \
+		return; \
+	}
 
 #define WRITE_UNLOCK(cmd) \
 	up_write(&cmd->root_lock)
 
+#define READ_LOCK(cmd) \
+	down_read(&cmd->root_lock); \
+	if (cmd->fail_io || dm_bm_is_read_only(cmd->bm)) { \
+		up_read(&cmd->root_lock); \
+		return -EINVAL; \
+	}
+
+#define READ_LOCK_VOID(cmd) \
+	down_read(&cmd->root_lock); \
+	if (cmd->fail_io || dm_bm_is_read_only(cmd->bm)) { \
+		up_read(&cmd->root_lock); \
+		return; \
+	}
+
+#define READ_UNLOCK(cmd) \
+	up_read(&cmd->root_lock)
+
 int dm_cache_resize(struct dm_cache_metadata *cmd, dm_cblock_t new_cache_size)
 {
 	int r;
@@ -1015,22 +1036,20 @@ int dm_cache_load_discards(struct dm_cache_metadata *cmd,
 {
 	int r;
 
-	down_read(&cmd->root_lock);
+	READ_LOCK(cmd);
 	r = __load_discards(cmd, fn, context);
-	up_read(&cmd->root_lock);
+	READ_UNLOCK(cmd);
 
 	return r;
 }
 
-dm_cblock_t dm_cache_size(struct dm_cache_metadata *cmd)
+int dm_cache_size(struct dm_cache_metadata *cmd, dm_cblock_t *result)
 {
-	dm_cblock_t r;
+	READ_LOCK(cmd);
+	*result = cmd->cache_blocks;
+	READ_UNLOCK(cmd);
 
-	down_read(&cmd->root_lock);
-	r = cmd->cache_blocks;
-	up_read(&cmd->root_lock);
-
-	return r;
+	return 0;
 }
 
 static int __remove(struct dm_cache_metadata *cmd, dm_cblock_t cblock)
@@ -1188,9 +1207,9 @@ int dm_cache_load_mappings(struct dm_cache_metadata *cmd,
 {
 	int r;
 
-	down_read(&cmd->root_lock);
+	READ_LOCK(cmd);
 	r = __load_mappings(cmd, policy, fn, context);
-	up_read(&cmd->root_lock);
+	READ_UNLOCK(cmd);
 
 	return r;
 }
@@ -1215,18 +1234,18 @@ static int __dump_mappings(struct dm_cache_metadata *cmd)
 
 void dm_cache_dump(struct dm_cache_metadata *cmd)
 {
-	down_read(&cmd->root_lock);
+	READ_LOCK_VOID(cmd);
 	__dump_mappings(cmd);
-	up_read(&cmd->root_lock);
+	READ_UNLOCK(cmd);
 }
 
 int dm_cache_changed_this_transaction(struct dm_cache_metadata *cmd)
 {
 	int r;
 
-	down_read(&cmd->root_lock);
+	READ_LOCK(cmd);
 	r = cmd->changed;
-	up_read(&cmd->root_lock);
+	READ_UNLOCK(cmd);
 
 	return r;
 }
@@ -1276,9 +1295,9 @@ int dm_cache_set_dirty(struct dm_cache_metadata *cmd,
 void dm_cache_metadata_get_stats(struct dm_cache_metadata *cmd,
 				 struct dm_cache_statistics *stats)
 {
-	down_read(&cmd->root_lock);
+	READ_LOCK_VOID(cmd);
 	*stats = cmd->stats;
-	up_read(&cmd->root_lock);
+	READ_UNLOCK(cmd);
 }
 
 void dm_cache_metadata_set_stats(struct dm_cache_metadata *cmd,
@@ -1312,9 +1331,9 @@ int dm_cache_get_free_metadata_block_count(struct dm_cache_metadata *cmd,
 {
 	int r = -EINVAL;
 
-	down_read(&cmd->root_lock);
+	READ_LOCK(cmd);
 	r = dm_sm_get_nr_free(cmd->metadata_sm, result);
-	up_read(&cmd->root_lock);
+	READ_UNLOCK(cmd);
 
 	return r;
 }
@@ -1324,9 +1343,9 @@ int dm_cache_get_metadata_dev_size(struct dm_cache_metadata *cmd,
 {
 	int r = -EINVAL;
 
-	down_read(&cmd->root_lock);
+	READ_LOCK(cmd);
 	r = dm_sm_get_nr_blocks(cmd->metadata_sm, result);
-	up_read(&cmd->root_lock);
+	READ_UNLOCK(cmd);
 
 	return r;
 }
@@ -1417,7 +1436,13 @@ int dm_cache_write_hints(struct dm_cache_metadata *cmd, struct dm_cache_policy *
 
 int dm_cache_metadata_all_clean(struct dm_cache_metadata *cmd, bool *result)
 {
-	return blocks_are_unmapped_or_clean(cmd, 0, cmd->cache_blocks, result);
+	int r;
+
+	READ_LOCK(cmd);
+	r = blocks_are_unmapped_or_clean(cmd, 0, cmd->cache_blocks, result);
+	READ_UNLOCK(cmd);
+
+	return r;
 }
 
 void dm_cache_metadata_set_read_only(struct dm_cache_metadata *cmd)
@@ -1440,10 +1465,7 @@ int dm_cache_metadata_set_needs_check(struct dm_cache_metadata *cmd)
 	struct dm_block *sblock;
 	struct cache_disk_superblock *disk_super;
 
-	/*
-	 * We ignore fail_io for this function.
-	 */
-	down_write(&cmd->root_lock);
+	WRITE_LOCK(cmd);
 	set_bit(NEEDS_CHECK, &cmd->flags);
 
 	r = superblock_lock(cmd, &sblock);
@@ -1458,19 +1480,17 @@ int dm_cache_metadata_set_needs_check(struct dm_cache_metadata *cmd)
 	dm_bm_unlock(sblock);
 
 out:
-	up_write(&cmd->root_lock);
+	WRITE_UNLOCK(cmd);
 	return r;
 }
 
-bool dm_cache_metadata_needs_check(struct dm_cache_metadata *cmd)
+int dm_cache_metadata_needs_check(struct dm_cache_metadata *cmd, bool *result)
 {
-	bool needs_check;
+	READ_LOCK(cmd);
+	*result = !!test_bit(NEEDS_CHECK, &cmd->flags);
+	READ_UNLOCK(cmd);
 
-	down_read(&cmd->root_lock);
-	needs_check = !!test_bit(NEEDS_CHECK, &cmd->flags);
-	up_read(&cmd->root_lock);
-
-	return needs_check;
+	return 0;
 }
 
 int dm_cache_metadata_abort(struct dm_cache_metadata *cmd)
4 changes: 2 additions & 2 deletions drivers/md/dm-cache-metadata.h
@@ -66,7 +66,7 @@ void dm_cache_metadata_close(struct dm_cache_metadata *cmd);
  * origin blocks to map to.
  */
 int dm_cache_resize(struct dm_cache_metadata *cmd, dm_cblock_t new_cache_size);
-dm_cblock_t dm_cache_size(struct dm_cache_metadata *cmd);
+int dm_cache_size(struct dm_cache_metadata *cmd, dm_cblock_t *result);
 
 int dm_cache_discard_bitset_resize(struct dm_cache_metadata *cmd,
 				   sector_t discard_block_size,
@@ -137,7 +137,7 @@ int dm_cache_write_hints(struct dm_cache_metadata *cmd, struct dm_cache_policy *
  */
 int dm_cache_metadata_all_clean(struct dm_cache_metadata *cmd, bool *result);
 
-bool dm_cache_metadata_needs_check(struct dm_cache_metadata *cmd);
+int dm_cache_metadata_needs_check(struct dm_cache_metadata *cmd, bool *result);
 int dm_cache_metadata_set_needs_check(struct dm_cache_metadata *cmd);
 void dm_cache_metadata_set_read_only(struct dm_cache_metadata *cmd);
 void dm_cache_metadata_set_read_write(struct dm_cache_metadata *cmd);
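
A quick caller-side sketch of the reworked signatures (a hypothetical
caller, not part of this diff): accessors that used to return a value
directly now return an error code and write through an out-parameter,
which is what lets them fail cleanly once the metadata has entered
fail_io mode:

    #include "dm-cache-metadata.h"

    /* hypothetical caller adapting to the new out-parameter style */
    static int report_cache_size(struct dm_cache_metadata *cmd)
    {
            dm_cblock_t csize;
            bool needs_check;

            /* both calls now fail with -EINVAL once fail_io is set */
            if (dm_cache_size(cmd, &csize))
                    return -EINVAL;

            if (dm_cache_metadata_needs_check(cmd, &needs_check))
                    return -EINVAL;

            DMINFO("cache size %u, needs_check %d",
                   from_cblock(csize), needs_check);
            return 0;
    }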