From a7b2f87b0637a3711e90c0f1d0a99dd5d32bd60f Mon Sep 17 00:00:00 2001 From: Chun Chen Date: Tue, 5 Apr 2016 15:35:24 +0800 Subject: [PATCH] Add docs about how to extend devicemapper thin pool Signed-off-by: Chun Chen Update to device mapper Entering comments Signed-off-by: Mary Anthony --- .../storagedriver/device-mapper-driver.md | 327 ++++++++---------- 1 file changed, 147 insertions(+), 180 deletions(-) diff --git a/docs/userguide/storagedriver/device-mapper-driver.md b/docs/userguide/storagedriver/device-mapper-driver.md index ceef5c22335..757a6471acb 100644 --- a/docs/userguide/storagedriver/device-mapper-driver.md +++ b/docs/userguide/storagedriver/device-mapper-driver.md @@ -16,12 +16,10 @@ leverages the thin provisioning and snapshotting capabilities of this framework for image and container management. This article refers to the Device Mapper storage driver as `devicemapper`, and the kernel framework as `Device Mapper`. - >**Note**: The [Commercially Supported Docker Engine (CS-Engine) running on RHEL and CentOS Linux](https://www.docker.com/compatibility-maintenance) requires that you use the `devicemapper` storage driver. - ## An alternative to AUFS Docker originally ran on Ubuntu and Debian Linux and used AUFS for its storage @@ -61,20 +59,20 @@ With `devicemapper` the high level process for creating images is as follows: 1. The `devicemapper` storage driver creates a thin pool. - The pool is created from block devices or loop mounted sparse files (more -on this later). + The pool is created from block devices or loop mounted sparse files (more + on this later). 2. Next it creates a *base device*. - A base device is a thin device with a filesystem. You can see which -filesystem is in use by running the `docker info` command and checking the -`Backing filesystem` value. + A base device is a thin device with a filesystem. You can see which + filesystem is in use by running the `docker info` command and checking the + `Backing filesystem` value. 3. 
Each new image (and image layer) is a snapshot of this base device. - These are thin provisioned copy-on-write snapshots. This means that they -are initially empty and only consume space from the pool when data is written -to them. + These are thin provisioned copy-on-write snapshots. This means that they + are initially empty and only consume space from the pool when data is written + to them. With `devicemapper`, container layers are snapshots of the image they are created from. Just as with images, container snapshots are thin provisioned @@ -109,9 +107,9 @@ block (`0x44f`) in an example container. 1. An application makes a read request for block `0x44f` in the container. - Because the container is a thin snapshot of an image it does not have the -data. Instead, it has a pointer (PTR) to where the data is stored in the image -snapshot lower down in the image stack. + Because the container is a thin snapshot of an image it does not have the + data. Instead, it has a pointer (PTR) to where the data is stored in the image + snapshot lower down in the image stack. 2. The storage driver follows the pointer to block `0xf33` in the snapshot relating to image layer `a005...`. @@ -121,7 +119,7 @@ snapshot to memory in the container. 4. The storage driver returns the data to the requesting application. -### Write examples +## Write examples With the `devicemapper` driver, writing new data to a container is accomplished by an *allocate-on-demand* operation. Updating existing data uses a @@ -132,7 +130,7 @@ For example, when making a small change to a large file in a container, the `devicemapper` storage driver does not copy the entire file. It only copies the blocks to be modified. Each block is 64KB. -#### Writing new data +### Writing new data To write 56KB of new data to a container: @@ -141,12 +139,12 @@ To write 56KB of new data to a container: 2. The allocate-on-demand operation allocates a single new 64KB block to the container's snapshot. 
- If the write operation is larger than 64KB, multiple new blocks are -allocated to the container's snapshot. + If the write operation is larger than 64KB, multiple new blocks are + allocated to the container's snapshot. 3. The data is written to the newly allocated block. -#### Overwriting existing data +### Overwriting existing data To modify existing data for the first time: @@ -163,7 +161,7 @@ The application in the container is unaware of any of these allocate-on-demand and copy-on-write operations. However, they may add latency to the application's read and write operations. -## Configuring Docker with Device Mapper +## Configure Docker with devicemapper The `devicemapper` is the default Docker storage driver on some Linux distributions. This includes RHEL and most of its forks. Currently, the @@ -182,18 +180,20 @@ deployments should not run under `loop-lvm` mode. You can detect the mode by viewing the `docker info` command: - $ sudo docker info - Containers: 0 - Images: 0 - Storage Driver: devicemapper - Pool Name: docker-202:2-25220302-pool - Pool Blocksize: 65.54 kB - Backing Filesystem: xfs - ... - Data loop file: /var/lib/docker/devicemapper/devicemapper/data - Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata - Library Version: 1.02.93-RHEL7 (2015-01-28) - ... +```bash +$ sudo docker info +Containers: 0 +Images: 0 +Storage Driver: devicemapper + Pool Name: docker-202:2-25220302-pool + Pool Blocksize: 65.54 kB + Backing Filesystem: xfs + [...] + Data loop file: /var/lib/docker/devicemapper/devicemapper/data + Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata + Library Version: 1.02.93-RHEL7 (2015-01-28) + [...] + ``` The output above shows a Docker host running with the `devicemapper` storage driver operating in `loop-lvm` mode. This is indicated by the fact that the @@ -203,175 +203,141 @@ files. ### Configure direct-lvm mode for production -The preferred configuration for production deployments is `direct lvm`. 
This +The preferred configuration for production deployments is `direct-lvm`. This mode uses block devices to create the thin pool. The following procedure shows you how to configure a Docker host to use the `devicemapper` storage driver in a `direct-lvm` configuration. -> **Caution:** If you have already run the Engine daemon on your Docker host +> **Caution:** If you have already run the Docker daemon on your Docker host > and have images you want to keep, `push` them Docker Hub or your private > Docker Trusted Registry before attempting this procedure. The procedure below will create a 90GB data volume and 4GB metadata volume to use as backing for the storage pool. It assumes that you have a spare block -device at `/dev/sdd` with enough free space to complete the task. The device +device at `/dev/xvdf` with enough free space to complete the task. The device identifier and volume sizes may be be different in your environment and you -should substitute your own values throughout the procedure. - -The procedure also assumes that the Engine daemon is in the `stopped` state. -Any existing images or data are lost by this process. - -1. Log in to the Docker host you want to configure. -2. If it is running, stop the Engine daemon. -3. Install the logical volume management version 2. - - ```bash - $ yum install lvm2 - ``` -4. Create a physical volume replacing `/dev/sdd` with your block device. - - ```bash - $ pvcreate /dev/sdd - ``` - -5. Create a 'docker' volume group. - - ```bash - $ vgcreate docker /dev/sdd - ``` - -6. Create a thin pool named `thinpool`. - - In this example, the data logical is 95% of the 'docker' volume group size. - Leaving this free space allows for auto expanding of either the data or - metadata if space runs low as a temporary stopgap. - - ```bash - $ lvcreate --wipesignatures y -n thinpool docker -l 95%VG - $ lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG - ``` - -7. Convert the pool to a thin pool. 
- - ```bash - $ lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta - ``` - -8. Configure autoextension of thin pools via an `lvm` profile. - - ```bash - $ vi /etc/lvm/profile/docker-thinpool.profile - ``` - -9. Specify 'thin_pool_autoextend_threshold' value. - - The value should be the percentage of space used before `lvm` attempts - to autoextend the available space (100 = disabled). +should substitute your own values throughout the procedure. The procedure also +assumes that the Docker daemon is in the `stopped` state. - ``` - thin_pool_autoextend_threshold = 80 - ``` +1. Log in to the Docker host you want to configure and stop the Docker daemon. -10. Modify the `thin_pool_autoextend_percent` for when thin pool autoextension occurs. +2. If it exists, delete your existing image store by removing the +`/var/lib/docker` directory. - The value's setting is the perentage of space to increase the thin pool (100 = - disabled) - - ``` - thin_pool_autoextend_percent = 20 - ``` - -11. Check your work, your `docker-thinpool.profile` file should appear similar to the following: - - An example `/etc/lvm/profile/docker-thinpool.profile` file: + ```bash + $ sudo rm -rf /var/lib/docker + ``` - ``` - activation { - thin_pool_autoextend_threshold=80 - thin_pool_autoextend_percent=20 - } - ``` +3. Create an LVM physical volume (PV) on your spare block device using the +`pvcreate` command. -12. Apply your new lvm profile + ```bash + $ sudo pvcreate /dev/xvdf + Physical volume `/dev/xvdf` successfully created + ``` - ```bash - $ lvchange --metadataprofile docker-thinpool docker/thinpool - ``` + The device identifier may be different on your system. Remember to substitute + your value in the command above. If your host is running on AWS EC2, you may + need to install `lvm2` and attach an EBS device to use this procedure. -13. Verify the `lv` is monitored. +4. 
Create a new volume group (VG) called `vg-docker` using the PV created in +the previous step. - ```bash - $ lvs -o+seg_monitor - ``` + ```bash + $ sudo vgcreate vg-docker /dev/xvdf + Volume group `vg-docker` successfully created + ``` -14. If Engine was previously started, clear your graph driver directory. +5. Create a new 90GB logical volume (LV) called `data` from space in the +`vg-docker` volume group. - Clearing your graph driver removes any images and containers in your Docker - installation. + ```bash + $ sudo lvcreate -L 90G -n data vg-docker + Logical volume `data` created. + ``` - ```bash - $ rm -rf /var/lib/docker/* - ``` + The command creates an LVM logical volume called `data` and an associated + block device file at `/dev/vg-docker/data`. In a later step, you instruct the + `devicemapper` storage driver to use this block device to store image and + container data. -14. Configure the Engine daemon with specific devicemapper options. + If you receive a signature detection warning, make sure you are working on + the correct devices before continuing. Signature warnings indicate that the + device you're working on is currently in use by LVM or has been used by LVM in + the past. - There are two ways to do this. You can set options on the commmand line if you start the daemon there: +6. Create a new logical volume (LV) called `metadata` from space in the +`vg-docker` volume group. - ```bash - --storage-driver=devicemapper --storage-opt=dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt dm.use_deferred_removal=true - ``` + ```bash + $ sudo lvcreate -L 4G -n metadata vg-docker + Logical volume `metadata` created. + ``` - You can also set them for startup in the `daemon.json` configuration, for example: + This creates an LVM logical volume called `metadata` and an associated + block device file at `/dev/vg-docker/metadata`. In the next step you instruct + the `devicemapper` storage driver to use this block device to store image and + container metadata. 
- ```json - { - "storage-driver": "devicemapper", - "storage-opts": [ - "dm.thinpooldev=/dev/mapper/docker-thinpool", - "dm.use_deferred_removal=true" - ] - } - ``` -15. Start the Engine daemon. +7. Start the Docker daemon with the `devicemapper` storage driver and the +`--storage-opt` flags. - ```bash - $ systemctl start docker - ``` + The `data` and `metadata` devices that you pass to the `--storage-opt` + options were created in the previous steps. -After you start the Engine daemon, ensure you monitor your thin pool and volume -group free space. While the volume group will auto-extend, it can still fill -up. To monitor logical volumes, use `lvs` without options or `lvs -a` to see tha -data and metadata sizes. To monitor volume group free space, use the `vgs` command. + ```bash + $ sudo docker daemon --storage-driver=devicemapper --storage-opt dm.datadev=/dev/vg-docker/data --storage-opt dm.metadatadev=/dev/vg-docker/metadata & + [1] 2163 + [root@ip-10-0-0-75 centos]# INFO[0000] Listening for HTTP on unix (/var/run/docker.sock) + INFO[0027] Option DefaultDriver: bridge + INFO[0027] Option DefaultNetwork: bridge + <-- output truncated --> + INFO[0027] Daemon has completed initialization + INFO[0027] Docker daemon commit=1b09a95 graphdriver=aufs version=1.11.0-dev + ``` -Logs can show the auto-extension of the thin pool when it hits the threshold, to -view the logs use: + It is also possible to set the `--storage-driver` and `--storage-opt` flags + in the Docker config file and start the daemon normally using the `service` or + `systemd` commands. -```bash -journalctl -fu dm-event.service -``` +8. Use the `docker info` command to verify that the daemon is using the `data` and +`metadata` devices you created. -If you run into repeated problems with thin pool, you can use the -`dm.min_free_space` option to tune the Engine behavior. This value ensures that -operations fail with a warning when the free space is at or near the minimum. 
-For information, see the storage driver options in the Engine daemon reference. + ```bash + $ sudo docker info + INFO[0180] GET /v1.20/info + Containers: 0 + Images: 0 + Storage Driver: devicemapper + Pool Name: docker-202:1-1032-pool + Pool Blocksize: 65.54 kB + Backing Filesystem: xfs + Data file: /dev/vg-docker/data + Metadata file: /dev/vg-docker/metadata + [...] + ``` + The output of the command above shows the storage driver as `devicemapper`. + The last two lines also confirm that the correct devices are being used for + the `Data file` and the `Metadata file`. ### Examine devicemapper structures on the host You can use the `lsblk` command to see the device files created above and the `pool` that the `devicemapper` storage driver creates on top of them. - $ sudo lsblk - NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT - xvda 202:0 0 8G 0 disk - └─xvda1 202:1 0 8G 0 part / - xvdf 202:80 0 10G 0 disk - ├─vg--docker-data 253:0 0 90G 0 lvm - │ └─docker-202:1-1032-pool 253:2 0 10G 0 dm - └─vg--docker-metadata 253:1 0 4G 0 lvm - └─docker-202:1-1032-pool 253:2 0 10G 0 dm +```bash +$ sudo lsblk +NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT +xvda 202:0 0 8G 0 disk +└─xvda1 202:1 0 8G 0 part / +xvdf 202:80 0 10G 0 disk +├─vg--docker-data 253:0 0 90G 0 lvm +│ └─docker-202:1-1032-pool 253:2 0 10G 0 dm +└─vg--docker-metadata 253:1 0 4G 0 lvm + └─docker-202:1-1032-pool 253:2 0 10G 0 dm +``` The diagram below shows the image from prior examples updated with the detail from the `lsblk` command above. @@ -379,8 +345,8 @@ from the `lsblk` command above. ![](http://farm1.staticflickr.com/703/22116692899_0471e5e160_b.jpg) In the diagram, the pool is named `Docker-202:1-1032-pool` and spans the `data` - and `metadata` devices created earlier. The `devicemapper` constructs the pool - name as follows: +and `metadata` devices created earlier. The `devicemapper` constructs the pool +name as follows: ``` Docker-MAJ:MIN-INO-pool @@ -440,18 +406,18 @@ Logging Driver: json-file [...] 
``` -The `Data Space` values show that the pool is 100GiB total. This example extends the pool to 200GiB. +The `Data Space` values show that the pool is 100GB total. This example extends the pool to 200GB. 1. List the sizes of the devices. ```bash $ sudo ls -lh /var/lib/docker/devicemapper/devicemapper/ - total 1.2G - -rw------- 1 root root 100G Apr 14 08:47 data - -rw------- 1 root root 2.0G Apr 19 13:27 metadata + total 1175492 + -rw------- 1 root root 100G Mar 30 05:22 data + -rw------- 1 root root 2.0G Mar 31 11:17 metadata ``` -2. Truncate `data` file to 200GiB. +2. Truncate the `data` file to 200GB (214748364800 bytes). ```bash $ sudo truncate -s 214748364800 /var/lib/docker/devicemapper/devicemapper/data @@ -460,10 +426,12 @@ The `Data Space` values show that the pool is 100GiB total. This example extends 3. Verify the file size changed. ```bash - $ sudo ls -lh /var/lib/docker/devicemapper/devicemapper/ - total 1.2G - -rw------- 1 root root 200G Apr 14 08:47 data - -rw------- 1 root root 2.0G Apr 19 13:27 metadata + $ sudo ls -al /var/lib/docker/devicemapper/devicemapper/ + total 1175492 + drwx------ 2 root root 4096 Mar 29 02:45 . + drwx------ 5 root root 4096 Mar 29 02:48 .. + -rw------- 1 root root 214748364800 Mar 31 11:20 data + -rw------- 1 root root 2147483648 Mar 31 11:17 metadata ``` 4. Reload data loop device @@ -480,19 +448,19 @@ The `Data Space` values show that the pool is 100GiB total. This example extends a. Get the pool name first. - $ sudo dmsetup status | grep pool - docker-8:1-123141-pool: 0 209715200 thin-pool 91 422/524288 18338/1638400 - rw discard_passdown queue_if_no_space - + $ sudo dmsetup status docker-8:1-123141-pool: 0 209715200 thin-pool 91 + 422/524288 18338/1638400 - rw discard_passdown queue_if_no_space - The name is the string before the colon. - b. Dump the device mapper table first. + b. Dump the device mapper table first. 
$ sudo dmsetup table docker-8:1-123141-pool 0 209715200 thin-pool 7:1 7:0 128 32768 1 skip_block_zeroing c. Calculate the real total sectors of the thin pool now. - Change the second number of the table info (i.e. the number of sectors) to reflect the new number of 512 byte sectors in the disk. For example, as the new loop size is 200GiB, change the second number to 419430400. + Change the second number of the table info (i.e. the disk end sector) to reflect the new number of 512 byte sectors in the disk. For example, as the new loop size is 200GB, change the second number to 419430400. d. Reload the thin pool with the new sector number @@ -514,7 +482,7 @@ $ ./device_tool resize 200GB ### For a direct-lvm mode configuration In this example, you extend the capacity of a running device that uses the -`direct-lvm` configuration. This example assumes you are using the `/dev/sdh1` +`direct-lvm` configuration. This example assumes you are using the `/dev/sdh1` disk partition. 1. Extend the volume group (VG) `vg-docker`. @@ -550,7 +518,7 @@ disk partition. c. Calculate the real total sectors of the thin pool now. we can use `blockdev` to get the real size of data lv. - Change the second number of the table info (i.e. the number of sectors) to + Change the second number of the table info (i.e. the disk end sector) to reflect the new number of 512 byte sectors in the disk. For example, as the new data `lv` size is `264132100096` bytes, change the second number to `515883008`. @@ -562,7 +530,6 @@ disk partition. $ sudo dmsetup suspend docker-253:17-1835016-pool && sudo dmsetup reload docker-253:17-1835016-pool --table '0 515883008 thin-pool 252:0 252:1 128 32768 1 skip_block_zeroing' && sudo dmsetup resume docker-253:17-1835016-pool - ## Device Mapper and Docker performance It is important to understand the impact that allocate-on-demand and