title: Configure Replication Zones
summary: In CockroachDB, you use replication zones to control the number and location of replicas for specific sets of data.
keywords: ttl, time to live, availability zone
toc: false

In CockroachDB, you use replication zones to control the number and location of replicas for specific sets of data. Initially, there is a single, default replication zone for the entire cluster. You can adjust this default zone as well as add zones for individual databases and tables as needed. For example, you might use the default zone to replicate most data in a cluster normally within a single datacenter, while creating a specific zone to more highly replicate a certain database or table across multiple datacenters and geographies.

This page explains how replication zones work and how to use the cockroach zone command to configure them.

{{site.data.alerts.callout_info}}Currently, only the root user can configure replication zones.{{site.data.alerts.end}}

Overview

Replication Zone Levels

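There are three replication zone levels:

  • Cluster: The single, default replication zone (.default), which applies to all data in the cluster that has no more specific zone.

  • Database: A replication zone for a specific database.

  • Table: A replication zone for a specific table.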

When replicating a piece of data, CockroachDB uses the most granular zone available: If there's a replication zone for the table containing the data, CockroachDB uses it; otherwise, it uses the replication zone for the database containing the data. If there's no applicable table or database replication zone, CockroachDB uses the cluster-wide replication zone.
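The lookup order described above can be sketched in shell. This is only an illustration of the fallback rule, not how CockroachDB implements it; resolve_zone is a hypothetical helper, and the zone names passed to it stand in for whatever cockroach zone ls would list.

```shell
# Hypothetical helper: pick the most granular zone that applies to a table.
# $1 = fully qualified table name (for example bank.accounts)
# remaining arguments = names of the zones that currently exist
resolve_zone() {
  target="$1"; shift
  database="${target%%.*}"

  # A table-level zone wins if one exists.
  for zone in "$@"; do
    if [ "$zone" = "$target" ]; then echo "$zone"; return 0; fi
  done

  # Otherwise fall back to the database-level zone.
  for zone in "$@"; do
    if [ "$zone" = "$database" ]; then echo "$zone"; return 0; fi
  done

  # Finally, use the cluster-wide default zone.
  echo ".default"
}

resolve_zone bank.accounts .default bank bank.accounts   # prints bank.accounts
resolve_zone bank.customers .default bank                # prints bank
resolve_zone store.items .default bank                   # prints .default
```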

Replication Zone Format

A replication zone is specified in YAML format and looks like this:

replicas: 
- attrs: [comma-separated attribute list]
- attrs: [comma-separated attribute list]
- attrs: [comma-separated attribute list]
range_min_bytes: <size-in-bytes>
range_max_bytes: <size-in-bytes>
gc:
  ttlseconds: <time-in-seconds>

Alternately, the replicas field can be simplified into a single line:

replicas: [attrs: [attribute list], attrs: [attribute list], attrs: [attribute list]]
range_min_bytes: <size-in-bytes>
range_max_bytes: <size-in-bytes>
gc:
  ttlseconds: <time-in-seconds>

Field Description
replicas The number and location of replicas for the zone. Each attrs item equals one replica. See Node/Replica Recommendations below.

It's normal and sufficient to define the number of replicas by listing attrs without any specific attributes (i.e., - attrs: []). But if you do set specific attributes for a replica (e.g., - attrs: [us-east-1a, ssd]), the replica will be placed on the nodes/stores with the matching attributes.

Node-level and store-level attributes are arbitrary strings specified when starting a node. You must match these strings exactly here in order for replication to work as you intend, so be sure to check carefully. See Start a Node for more details about node and store attributes.

Default: 3 replicas with no specific attributes
range_max_bytes The maximum size, in bytes, for a range of data in the zone. When a range reaches this size, CockroachDB will split it into two ranges.

Default: 67108864 (64MB)
range_min_bytes Not yet implemented.
ttlseconds The number of seconds overwritten values will be retained before garbage collection. Smaller values can save disk space if values are frequently overwritten; larger values increase the range allowed for AS OF SYSTEM TIME queries. It is not recommended to set this below 600 (10 minutes).

Default: 86400 (24 hours)
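Because range_max_bytes and ttlseconds take raw byte and second counts, it can help to generate the YAML from friendlier units. The following is a minimal sketch; make_zone_yaml is a hypothetical helper, not part of the cockroach CLI, and it emits replicas with no specific attributes.

```shell
# Hypothetical helper: print a zone configuration with unconstrained replicas.
# $1 = number of replicas
# $2 = maximum range size in MB
# $3 = garbage collection TTL in hours
make_zone_yaml() {
  replicas="$1"; max_mb="$2"; ttl_hours="$3"

  echo "replicas:"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "- attrs: []"          # one attrs item per replica
    i=$((i + 1))
  done
  echo "range_max_bytes: $((max_mb * 1024 * 1024))"
  echo "gc:"
  echo "  ttlseconds: $((ttl_hours * 3600))"
}

# Reproduces the documented defaults: 3 replicas, 64MB ranges, 24-hour TTL.
make_zone_yaml 3 64 24
```

With these arguments the output contains range_max_bytes: 67108864 and ttlseconds: 86400, matching the defaults listed in the table. The result could be redirected to a file and passed to cockroach zone set with --file.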

Node/Replica Recommendations

When running a cluster with more than one node, each replica will be on a different node and a majority of replicas must remain available for the cluster to make progress. Therefore:

  • When running a cluster with more than one node, you should run at least three to ensure that a majority of replicas (2/3) remains available when a node goes down.

  • Configurations with odd numbers of replicas are more robust than those with even numbers. Clusters of three and four nodes can each tolerate one node failure and still reach a quorum (2/3 and 3/4 respectively), so the fourth replica doesn't add any extra fault-tolerance. To survive two simultaneous failures, you must have five replicas.

  • When replicating across datacenters, you should use datacenters on a single continent to ensure performance (cross-continent scenarios will be better supported in the future). If the average network round-trip latency between your datacenters is greater than 200ms, you should adjust the raft-tick-interval flag on each node.
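The majority arithmetic behind these recommendations can be checked with a few lines of shell. This is a sketch of the arithmetic only; quorum and tolerated_failures are hypothetical helper names.

```shell
# Smallest majority of N replicas.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of replicas that can be lost while a majority remains available.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
  echo "replicas=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# replicas=3 quorum=2 tolerated_failures=1
# replicas=4 quorum=3 tolerated_failures=1
# replicas=5 quorum=3 tolerated_failures=2
```

Three and four replicas both tolerate a single failure, which is why the fourth replica adds no fault tolerance and five are needed to survive two simultaneous failures.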

Subcommands

Subcommand Usage
ls List all replication zones.
get View the YAML contents of a replication zone.
set Create or edit a replication zone.
rm Remove a replication zone.

Synopsis

# List all replication zones:
$ cockroach zone ls <flags>

# View the default replication zone for the cluster:
$ cockroach zone get .default <flags>

# View the replication zone for a database:
$ cockroach zone get <database> <flags>

# View the replication zone for a table:
$ cockroach zone get <database.table> <flags>

# Edit the default replication zone for the cluster:
$ cockroach zone set .default --file=<zone-content.yaml> <flags>

# Create/edit the replication zone for a database:
$ cockroach zone set <database> --file=<zone-content.yaml> <flags>

# Create/edit the replication zone for a table:
$ cockroach zone set <database.table> --file=<zone-content.yaml> <flags>

# Remove the replication zone for a database:
$ cockroach zone rm <database> <flags>

# Remove the replication zone for a table:
$ cockroach zone rm <database.table> <flags>

# View help:
$ cockroach zone --help
$ cockroach zone ls --help
$ cockroach zone get --help
$ cockroach zone set --help
$ cockroach zone rm --help

Flags

The zone command and subcommands support the following flags, as well as logging flags.

Flag Description
--ca-cert The path to the CA certificate. This flag is required if the cluster is secure.

Env Variable: COCKROACH_CA_CERT
--cert The path to the client certificate. This flag is required if the cluster is secure.

Env Variable: COCKROACH_CERT
--database, -d Not currently implemented.
--file, -f The path to the YAML file defining the zone configuration. To pass the zone configuration via the standard input, set this flag to -.

This flag is relevant only for the set subcommand.
--host The server host to connect to. This can be the address of any node in the cluster.

Env Variable: COCKROACH_HOST
Default: localhost
--insecure Set this only if the cluster is insecure and running on multiple machines.

If the cluster is insecure and local, leave this out. If the cluster is secure, leave this out and set the --ca-cert, --cert, and --key flags.

Env Variable: COCKROACH_INSECURE
--key The path to the client key protecting the client certificate. This flag is required if the cluster is secure.

Env Variable: COCKROACH_KEY
--port, -p The server port to connect to.

Env Variable: COCKROACH_PORT
Default: 26257
--url The connection URL. If you use this flag, do not set any other connection flags.

For insecure connections, the URL format is:
--url=postgresql://<user>@<host>:<port>/<database>?sslmode=disable

For secure connections, the URL format is:
--url=postgresql://<user>@<host>:<port>/<database>
with the following parameters in the query string:
sslcert=<path-to-client-crt>
sslkey=<path-to-client-key>
sslmode=verify-full
sslrootcert=<path-to-ca-crt>

Env Variable: COCKROACH_URL
--user, -u The user connecting to the database. Currently, only the root user can configure replication zones.

Env Variable: COCKROACH_USER
Default: root
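The two --url formats listed in the table can be assembled from their parts as follows. This is a minimal sketch: insecure_url and secure_url are hypothetical helpers, not part of the cockroach CLI, and the certificate and key paths in the example are placeholders.

```shell
# Hypothetical helper: connection URL for an insecure cluster.
# Arguments: user host port database
insecure_url() {
  printf 'postgresql://%s@%s:%s/%s?sslmode=disable\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: connection URL for a secure cluster.
# Arguments: user host port database client-cert client-key ca-cert
secure_url() {
  printf 'postgresql://%s@%s:%s/%s?sslcert=%s&sslkey=%s&sslmode=verify-full&sslrootcert=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

insecure_url root localhost 26257 bank
# prints postgresql://root@localhost:26257/bank?sslmode=disable

# Placeholder certificate paths; substitute your own.
secure_url root localhost 26257 bank path/to/client.crt path/to/client.key path/to/ca.crt
```

Quote the result when passing it on the command line, for example --url="$(insecure_url root localhost 26257 bank)", so that the shell does not interpret the ? and & characters.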

Examples

View the Default Replication Zone

The cluster-wide replication zone (.default) is initially set to replicate data to any three nodes in your cluster, with ranges in each replica splitting once they get larger than 67108864 bytes.

To view the default replication zone, use the cockroach zone get .default command with appropriate flags:

$ cockroach zone get .default
.default
replicas:
- attrs: []
- attrs: []
- attrs: []
range_min_bytes: 1048576
range_max_bytes: 67108864
gc:
  ttlseconds: 86400

Edit the Default Replication Zone

Let's say you want to run a three-node cluster across three datacenters, two on the US east coast and one on the US west coast. You want data replicated three times by default, with each replica stored on a specific node in a specific datacenter.

  1. Start each node with the relevant datacenter location specified in the --attrs field:

    # Start node in first US east coast datacenter:
    $ cockroach start --host=node1-hostname --attrs=us-east-1a
    
    # Start node in second US east coast datacenter:
    $ cockroach start --host=node2-hostname --attrs=us-east-1b --join=node1-hostname:26257
    
    # Start node in US west coast datacenter:
    $ cockroach start --host=node3-hostname --attrs=us-west-1a --join=node1-hostname:26257
  2. Create a YAML file with one datacenter attribute set for each replica, and use the file to update the default zone configuration:

    $ cat default_update.yaml
    replicas:
    - attrs: [us-east-1a] 
    - attrs: [us-east-1b]
    - attrs: [us-west-1a]
    
    $ cockroach zone set .default -f default_update.yaml
    UPDATE 1
    replicas:
    - attrs: [us-east-1a]
    - attrs: [us-east-1b]
    - attrs: [us-west-1a]
    range_min_bytes: 1048576
    range_max_bytes: 67108864
    gc:
      ttlseconds: 86400

    Alternately, you can pass the YAML content via the standard input:

    $ cockroach zone set .default -f - <<EOF
    replicas:
    - attrs: [us-east-1a] 
    - attrs: [us-east-1b]
    - attrs: [us-west-1a]
    EOF
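Since attribute strings in the zone file must match the strings passed to cockroach start exactly, a quick check before running cockroach zone set can catch typos. The following is a minimal sketch under stated assumptions: unmatched_attrs is a hypothetical helper, it only understands the one-replica-per-line "- attrs: [...]" layout shown above, and the list of node attributes is supplied by hand.

```shell
# Hypothetical helper: print zone-file attributes that no node was started with.
# $1 = zone YAML file; remaining arguments = attributes used with `cockroach start`
unmatched_attrs() {
  zone_file="$1"; shift
  known=" $* "

  # Extract the text between [ and ] on each "- attrs:" line,
  # split it on commas, and drop spaces and empty entries.
  sed -n 's/^[[:space:]]*- attrs: *\[\(.*\)\].*$/\1/p' "$zone_file" |
    tr ',' '\n' | tr -d ' ' | sed '/^$/d' |
    while read -r attr; do
      case "$known" in
        *" $attr "*) ;;          # attribute matches a node attribute
        *) echo "$attr" ;;       # no node carries this attribute
      esac
    done
}

# Example zone file with a deliberate typo in the last attribute.
zone_file="$(mktemp)"
cat > "$zone_file" <<EOF
replicas:
- attrs: [us-east-1a]
- attrs: [us-east-1b]
- attrs: [us-west-1]
EOF

unmatched_attrs "$zone_file" us-east-1a us-east-1b us-west-1a   # prints us-west-1
```

An empty result means every attribute in the file corresponds to one of the attributes given on the command line.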

Create a Replication Zone for a Database

Let's say you want to run a cluster across five nodes, three of which have ssd storage devices. You want data in the bank database replicated to these ssd devices.

  1. When starting the three nodes that have ssd storage, specify ssd as an attribute of the stores, and when starting the other two nodes, leave the attribute out:

    # Start nodes with ssd storage:
    $ cockroach start --insecure --host=node1-hostname --store=path=node1-data,attrs=ssd
    $ cockroach start --insecure --host=node2-hostname --store=path=node2-data,attrs=ssd --join=node1-hostname:26257
    $ cockroach start --insecure --host=node3-hostname --store=path=node3-data,attrs=ssd --join=node1-hostname:26257
    
    # Start nodes without ssd storage:
    $ cockroach start --insecure --host=node4-hostname --store=path=node4-data --join=node1-hostname:26257
    $ cockroach start --insecure --host=node5-hostname --store=path=node5-data --join=node1-hostname:26257
  2. Create a YAML file with ssd set as the attribute for each replica, and use the file to update the zone configuration for the bank database:

    $ cat bank_zone.yaml
    replicas:
    - attrs: [ssd]
    - attrs: [ssd]
    - attrs: [ssd]
    
    $ cockroach zone set bank -f bank_zone.yaml
    INSERT 1
    replicas:
    - attrs: [ssd]
    - attrs: [ssd]
    - attrs: [ssd]
    range_min_bytes: 1048576
    range_max_bytes: 67108864
    gc:
      ttlseconds: 86400

    Alternately, you can pass the YAML content via the standard input:

    $ cockroach zone set bank -f - <<EOF
    replicas:
    - attrs: [ssd]
    - attrs: [ssd]
    - attrs: [ssd]
    EOF

Create a Replication Zone for a Table

Let's say you want to run a cluster across five nodes, three of which have ssd storage devices. You want data in the bank.accounts table replicated to these ssd devices.

  1. When starting the three nodes that have ssd storage, specify ssd as an attribute of the stores, and when starting the other two nodes, leave the attribute out:

    # Start nodes with ssd storage:
    $ cockroach start --insecure --host=node1-hostname --store=path=node1-data,attrs=ssd
    $ cockroach start --insecure --host=node2-hostname --store=path=node2-data,attrs=ssd --join=node1-hostname:26257
    $ cockroach start --insecure --host=node3-hostname --store=path=node3-data,attrs=ssd --join=node1-hostname:26257
    
    # Start nodes without ssd storage:
    $ cockroach start --insecure --host=node4-hostname --store=path=node4-data --join=node1-hostname:26257
    $ cockroach start --insecure --host=node5-hostname --store=path=node5-data --join=node1-hostname:26257
  2. Create a YAML file with ssd set as the attribute for each replica, and use the file to update the zone configuration for the bank.accounts table:

    $ cat accounts_zone.yaml
    replicas:
    - attrs: [ssd]
    - attrs: [ssd]
    - attrs: [ssd]
    
    $ cockroach zone set bank.accounts -f accounts_zone.yaml
    INSERT 1
    replicas:
    - attrs: [ssd]
    - attrs: [ssd]
    - attrs: [ssd]
    range_min_bytes: 1048576
    range_max_bytes: 67108864
    gc:
      ttlseconds: 86400   

    Alternately, you can pass the YAML content via the standard input:

    $ cockroach zone set bank.accounts -f - <<EOF
    replicas:
    - attrs: [ssd]
    - attrs: [ssd]
    - attrs: [ssd]
    EOF

See Also

Other Cockroach Commands