Skip to content

Commit

Permalink
Merge tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kern…
Browse files Browse the repository at this point in the history
…el/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - initially based on Jens' 'for-4.8/core' (given all the flag churn)
   and later merged with 'for-4.8/core' to pickup the QUEUE_FLAG_DAX
   commits that DM depends on to provide its DAX support

 - clean up the bio-based vs request-based DM core code by moving the
   request-based DM core code out to dm-rq.[hc]

 - reinstate bio-based support in the DM multipath target (done with the
   idea that fast storage like NVMe over Fabrics could benefit) -- while
   preserving support for request_fn and blk-mq request-based DM mpath

 - SCSI and DM multipath persistent reservation fixes that were
   coordinated with Martin Petersen.

 - the DM raid target saw the most extensive change this cycle; it now
   provides reshape and takeover support (by layering ontop of the
   corresponding MD capabilities)

 - DAX support for DM core and the linear, stripe and error targets

 - a DM thin-provisioning block discard vs allocation race fix that
   addresses potential for corruption

 - a stable fix for DM verity-fec's block calculation during decode

 - a few cleanups and fixes to DM core and various targets

* tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (73 commits)
  dm: allow bio-based table to be upgraded to bio-based with DAX support
  dm snap: add fake origin_direct_access
  dm stripe: add DAX support
  dm error: add DAX support
  dm linear: add DAX support
  dm: add infrastructure for DAX support
  dm thin: fix a race condition between discarding and provisioning a block
  dm btree: fix a bug in dm_btree_find_next_single()
  dm raid: fix random optimal_io_size for raid0
  dm raid: address checkpatch.pl complaints
  dm: call PR reserve/unreserve on each underlying device
  sd: don't use the ALL_TG_PT bit for reservations
  dm: fix second blk_delay_queue() parameter to be in msec units not jiffies
  dm raid: change logical functions to actually return bool
  dm raid: use rdev_for_each in status
  dm raid: use rs->raid_disks to avoid memory leaks on free
  dm raid: support delta_disks for raid1, fix table output
  dm raid: enhance reshape check and factor out reshape setup
  dm raid: allow resize during recovery
  dm raid: fix rs_is_recovering() to allow for lvextend
  ...
  • Loading branch information
torvalds committed Jul 27, 2016
2 parents 3fc9d69 + b5ab4a9 commit f7e6816
Show file tree
Hide file tree
Showing 29 changed files with 4,638 additions and 1,975 deletions.
58 changes: 55 additions & 3 deletions Documentation/device-mapper/dm-raid.txt
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,12 @@ The target is named "raid" and it accepts the following parameters:
<#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]

<raid_type>:
raid0 RAID0 striping (no resilience)
raid1 RAID1 mirroring
raid4 RAID4 dedicated parity disk
raid4 RAID4 with dedicated last parity disk
raid5_n RAID5 with dedicated last parity disk suporting takeover
Same as raid4
-Transitory layout
raid5_la RAID5 left asymmetric
- rotating parity 0 with data continuation
raid5_ra RAID5 right asymmetric
Expand All @@ -30,7 +34,19 @@ The target is named "raid" and it accepts the following parameters:
- rotating parity N (right-to-left) with data restart
raid6_nc RAID6 N continue
- rotating parity N (right-to-left) with data continuation
raid6_n_6 RAID6 with dedicate parity disks
- parity and Q-syndrome on the last 2 disks;
laylout for takeover from/to raid4/raid5_n
raid6_la_6 Same as "raid_la" plus dedicated last Q-syndrome disk
- layout for takeover from raid5_la from/to raid6
raid6_ra_6 Same as "raid5_ra" dedicated last Q-syndrome disk
- layout for takeover from raid5_ra from/to raid6
raid6_ls_6 Same as "raid5_ls" dedicated last Q-syndrome disk
- layout for takeover from raid5_ls from/to raid6
raid6_rs_6 Same as "raid5_rs" dedicated last Q-syndrome disk
- layout for takeover from raid5_rs from/to raid6
raid10 Various RAID10 inspired algorithms chosen by additional params
(see raid10_format and raid10_copies below)
- RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
- RAID1E: Integrated Adjacent Stripe Mirroring
- RAID1E: Integrated Offset Stripe Mirroring
Expand Down Expand Up @@ -116,10 +132,41 @@ The target is named "raid" and it accepts the following parameters:
Here we see layouts closely akin to 'RAID1E - Integrated
Offset Stripe Mirroring'.

[delta_disks <N>]
The delta_disks option value (-251 < N < +251) triggers
device removal (negative value) or device addition (positive
value) to any reshape supporting raid levels 4/5/6 and 10.
RAID levels 4/5/6 allow for addition of devices (metadata
and data device tupel), raid10_near and raid10_offset only
allow for device addtion. raid10_far does not support any
reshaping at all.
A minimum of devices have to be kept to enforce resilience,
which is 3 devices for raid4/5 and 4 devices for raid6.

[data_offset <sectors>]
This option value defines the offset into each data device
where the data starts. This is used to provide out-of-place
reshaping space to avoid writing over data whilst
changing the layout of stripes, hence an interruption/crash
may happen at any time without the risk of losing data.
E.g. when adding devices to an existing raid set during
forward reshaping, the out-of-place space will be allocated
at the beginning of each raid device. The kernel raid4/5/6/10
MD personalities supporting such device addition will read the data from
the existing first stripes (those with smaller number of stripes)
starting at data_offset to fill up a new stripe with the larger
number of stripes, calculate the redundancy blocks (CRC/Q-syndrome)
and write that new stripe to offset 0. Same will be applied to all
N-1 other new stripes. This out-of-place scheme is used to change
the RAID type (i.e. the allocation algorithm) as well, e.g.
changing from raid5_ls to raid5_n.

<#raid_devs>: The number of devices composing the array.
Each device consists of two entries. The first is the device
containing the metadata (if any); the second is the one containing the
data.
data. A Maximum of 64 metadata/data device entries are supported
up to target version 1.8.0.
1.9.0 supports up to 253 which is enforced by the used MD kernel runtime.

If a drive has failed or is missing at creation time, a '-' can be
given for both the metadata and data drives for a given position.
Expand Down Expand Up @@ -207,7 +254,6 @@ include:
"recover"- Initiate/continue a recover process.
"check" - Initiate a check (i.e. a "scrub") of the array.
"repair" - Initiate a repair of the array.
"reshape"- Currently unsupported (-EINVAL).


Discard Support
Expand Down Expand Up @@ -257,3 +303,9 @@ Version History
1.5.2 'mismatch_cnt' is zero unless [last_]sync_action is "check".
1.6.0 Add discard support (and devices_handle_discard_safely module param).
1.7.0 Add support for MD RAID0 mappings.
1.8.0 Explictely check for compatible flags in the superblock metadata
and reject to start the raid set if any are set by a newer
target version, thus avoiding data corruption on a raid set
with a reshape in progress.
1.9.0 Add support for RAID level takeover/reshape/region size
and set size reduction.
3 changes: 2 additions & 1 deletion drivers/md/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,8 @@
#

dm-mod-y += dm.o dm-table.o dm-target.o dm-linear.o dm-stripe.o \
dm-ioctl.o dm-io.o dm-kcopyd.o dm-sysfs.o dm-stats.o
dm-ioctl.o dm-io.o dm-kcopyd.o dm-sysfs.o dm-stats.o \
dm-rq.o
dm-multipath-y += dm-path-selector.o dm-mpath.o
dm-snapshot-y += dm-snap.o dm-exception-store.o dm-snap-transient.o \
dm-snap-persistent.o
Expand Down
2 changes: 1 addition & 1 deletion drivers/md/dm-builtin.c
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
#include "dm.h"
#include "dm-core.h"

/*
* The kobject release method must not be placed in the module itself,
Expand Down
149 changes: 149 additions & 0 deletions drivers/md/dm-core.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,149 @@
/*
* Internal header file _only_ for device mapper core
*
* Copyright (C) 2016 Red Hat, Inc. All rights reserved.
*
* This file is released under the LGPL.
*/

#ifndef DM_CORE_INTERNAL_H
#define DM_CORE_INTERNAL_H

#include <linux/kthread.h>
#include <linux/ktime.h>
#include <linux/blk-mq.h>

#include <trace/events/block.h>

#include "dm.h"

#define DM_RESERVED_MAX_IOS 1024

struct dm_kobject_holder {
struct kobject kobj;
struct completion completion;
};

/*
* DM core internal structure that used directly by dm.c and dm-rq.c
* DM targets must _not_ deference a mapped_device to directly access its members!
*/
struct mapped_device {
struct srcu_struct io_barrier;
struct mutex suspend_lock;

/*
* The current mapping (struct dm_table *).
* Use dm_get_live_table{_fast} or take suspend_lock for
* dereference.
*/
void __rcu *map;

struct list_head table_devices;
struct mutex table_devices_lock;

unsigned long flags;

struct request_queue *queue;
int numa_node_id;

unsigned type;
/* Protect queue and type against concurrent access. */
struct mutex type_lock;

atomic_t holders;
atomic_t open_count;

struct dm_target *immutable_target;
struct target_type *immutable_target_type;

struct gendisk *disk;
char name[16];

void *interface_ptr;

/*
* A list of ios that arrived while we were suspended.
*/
atomic_t pending[2];
wait_queue_head_t wait;
struct work_struct work;
spinlock_t deferred_lock;
struct bio_list deferred;

/*
* Event handling.
*/
wait_queue_head_t eventq;
atomic_t event_nr;
atomic_t uevent_seq;
struct list_head uevent_list;
spinlock_t uevent_lock; /* Protect access to uevent_list */

/* the number of internal suspends */
unsigned internal_suspend_count;

/*
* Processing queue (flush)
*/
struct workqueue_struct *wq;

/*
* io objects are allocated from here.
*/
mempool_t *io_pool;
mempool_t *rq_pool;

struct bio_set *bs;

/*
* freeze/thaw support require holding onto a super block
*/
struct super_block *frozen_sb;

/* forced geometry settings */
struct hd_geometry geometry;

struct block_device *bdev;

/* kobject and completion */
struct dm_kobject_holder kobj_holder;

/* zero-length flush that will be cloned and submitted to targets */
struct bio flush_bio;

struct dm_stats stats;

struct kthread_worker kworker;
struct task_struct *kworker_task;

/* for request-based merge heuristic in dm_request_fn() */
unsigned seq_rq_merge_deadline_usecs;
int last_rq_rw;
sector_t last_rq_pos;
ktime_t last_rq_start_time;

/* for blk-mq request-based DM support */
struct blk_mq_tag_set *tag_set;
bool use_blk_mq:1;
bool init_tio_pdu:1;
};

void dm_init_md_queue(struct mapped_device *md);
void dm_init_normal_md_queue(struct mapped_device *md);
int md_in_flight(struct mapped_device *md);
void disable_write_same(struct mapped_device *md);

static inline struct completion *dm_get_completion_from_kobject(struct kobject *kobj)
{
return &container_of(kobj, struct dm_kobject_holder, kobj)->completion;
}

unsigned __dm_get_module_param(unsigned *module_param, unsigned def, unsigned max);

static inline bool dm_message_test_buffer_overflow(char *result, unsigned maxlen)
{
return !maxlen || strlen(result) + 1 >= maxlen;
}

#endif
4 changes: 2 additions & 2 deletions drivers/md/dm-crypt.c
Original file line number Diff line number Diff line change
Expand Up @@ -683,7 +683,7 @@ static int crypt_iv_tcw_whitening(struct crypt_config *cc,
u8 *data)
{
struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
u64 sector = cpu_to_le64((u64)dmreq->iv_sector);
__le64 sector = cpu_to_le64(dmreq->iv_sector);
u8 buf[TCW_WHITENING_SIZE];
SHASH_DESC_ON_STACK(desc, tcw->crc32_tfm);
int i, r;
Expand Down Expand Up @@ -722,7 +722,7 @@ static int crypt_iv_tcw_gen(struct crypt_config *cc, u8 *iv,
struct dm_crypt_request *dmreq)
{
struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
u64 sector = cpu_to_le64((u64)dmreq->iv_sector);
__le64 sector = cpu_to_le64(dmreq->iv_sector);
u8 *src;
int r = 0;

Expand Down
2 changes: 1 addition & 1 deletion drivers/md/dm-io.c
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
* This file is released under the GPL.
*/

#include "dm.h"
#include "dm-core.h"

#include <linux/device-mapper.h>

Expand Down
31 changes: 17 additions & 14 deletions drivers/md/dm-ioctl.c
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
* This file is released under the GPL.
*/

#include "dm.h"
#include "dm-core.h"

#include <linux/module.h>
#include <linux/vmalloc.h>
Expand Down Expand Up @@ -1267,6 +1267,15 @@ static int populate_table(struct dm_table *table,
return dm_table_complete(table);
}

static bool is_valid_type(unsigned cur, unsigned new)
{
if (cur == new ||
(cur == DM_TYPE_BIO_BASED && new == DM_TYPE_DAX_BIO_BASED))
return true;

return false;
}

static int table_load(struct dm_ioctl *param, size_t param_size)
{
int r;
Expand Down Expand Up @@ -1309,7 +1318,7 @@ static int table_load(struct dm_ioctl *param, size_t param_size)
DMWARN("unable to set up device queue for new table.");
goto err_unlock_md_type;
}
} else if (dm_get_md_type(md) != dm_table_get_type(t)) {
} else if (!is_valid_type(dm_get_md_type(md), dm_table_get_type(t))) {
DMWARN("can't change device type after initial table load.");
r = -EINVAL;
goto err_unlock_md_type;
Expand Down Expand Up @@ -1670,19 +1679,16 @@ static int check_version(unsigned int cmd, struct dm_ioctl __user *user)
return r;
}

#define DM_PARAMS_KMALLOC 0x0001 /* Params alloced with kmalloc */
#define DM_PARAMS_VMALLOC 0x0002 /* Params alloced with vmalloc */
#define DM_PARAMS_MALLOC 0x0001 /* Params allocated with kvmalloc() */
#define DM_WIPE_BUFFER 0x0010 /* Wipe input buffer before returning from ioctl */

static void free_params(struct dm_ioctl *param, size_t param_size, int param_flags)
{
if (param_flags & DM_WIPE_BUFFER)
memset(param, 0, param_size);

if (param_flags & DM_PARAMS_KMALLOC)
kfree(param);
if (param_flags & DM_PARAMS_VMALLOC)
vfree(param);
if (param_flags & DM_PARAMS_MALLOC)
kvfree(param);
}

static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kernel,
Expand Down Expand Up @@ -1714,19 +1720,14 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
* Use kmalloc() rather than vmalloc() when we can.
*/
dmi = NULL;
if (param_kernel->data_size <= KMALLOC_MAX_SIZE) {
if (param_kernel->data_size <= KMALLOC_MAX_SIZE)
dmi = kmalloc(param_kernel->data_size, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
if (dmi)
*param_flags |= DM_PARAMS_KMALLOC;
}

if (!dmi) {
unsigned noio_flag;
noio_flag = memalloc_noio_save();
dmi = __vmalloc(param_kernel->data_size, GFP_NOIO | __GFP_HIGH | __GFP_HIGHMEM, PAGE_KERNEL);
memalloc_noio_restore(noio_flag);
if (dmi)
*param_flags |= DM_PARAMS_VMALLOC;
}

if (!dmi) {
Expand All @@ -1735,6 +1736,8 @@ static int copy_params(struct dm_ioctl __user *user, struct dm_ioctl *param_kern
return -ENOMEM;
}

*param_flags |= DM_PARAMS_MALLOC;

if (copy_from_user(dmi, user, param_kernel->data_size))
goto bad;

Expand Down
2 changes: 1 addition & 1 deletion drivers/md/dm-kcopyd.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@
#include <linux/device-mapper.h>
#include <linux/dm-kcopyd.h>

#include "dm.h"
#include "dm-core.h"

#define SUB_JOB_SIZE 128
#define SPLIT_COUNT 8
Expand Down
Loading

0 comments on commit f7e6816

Please sign in to comment.