title | summary | toc |
---|---|---|
BACKUP |
Back up your CockroachDB cluster to a cloud storage services such as AWS S3, Google Cloud Storage, or other NFS. |
true |
CockroachDB's BACKUP
statement allows you to create full or incremental backups of your cluster's schema and data that are consistent as of a given timestamp.
{{site.data.alerts.callout_info}}
The syntax BACKUP ... INTO
adds a backup to a collection within the backup destination. The path to the backup is created using a date-based naming scheme. Versions of CockroachDB prior to v21.1 used the syntax BACKUP ... TO
to back up directly to a specific operator-chosen destination, rather than picking a date-based path. The BACKUP ... TO
syntax will be deprecated in future releases. For more information on this soon-to-be deprecated syntax, see the docs for v20.2 or earlier.
{{site.data.alerts.end}}
{{site.data.alerts.callout_info}} Core users can only take full backups. To use the other backup features, you need an {{ site.data.products.enterprise }} license. You can also use {{ site.data.products.dedicated }}, which runs full backups daily and incremental backups hourly. {{site.data.alerts.end}}
You can backup a full cluster, which includes:
- Relevant system tables
- All databases
- All tables (which automatically includes their indexes)
- All views
- All scheduled jobs
You can also back up:
-
An individual database, which includes all of its tables and views
-
An individual table, which includes its indexes and views
BACKUP
only backs up entire tables; it does not support backing up subsets of a table.
Because CockroachDB is designed with high fault tolerance, these backups are designed primarily for disaster recovery (i.e., if your cluster loses a majority of its nodes) through RESTORE
. Isolated issues (such as small-scale node outages) do not require any intervention.
{{site.data.alerts.callout_success}}
To view the contents of an {{ site.data.products.enterprise }} backup created with the BACKUP
statement, use SHOW BACKUP
.
{{site.data.alerts.end}}
{{site.data.alerts.callout_info}}
BACKUP
is a blocking statement. To run a backup job asynchronously, use the DETACHED
option. See the options below.
{{site.data.alerts.end}}
{{site.data.alerts.callout_info}} Interleaving data is disabled by default, and will be permanently removed from CockroachDB in a future release. CockroachDB versions v21.2 and later will not be able to read or restore backups that include interleaved data.
To backup interleaved data, a BACKUP
statement must include the INCLUDE_DEPRECATED_INTERLEAVES
option.
{{site.data.alerts.end}}
- Full cluster backups can only be run by members of the
admin
role. By default, theroot
user belongs to theadmin
role. - For all other backups, the user must have read access on all objects being backed up. Database and table backups require
SELECT
privileges. Backups of user-defined schemas, or backups containing user-defined types, requireUSAGE
privileges. BACKUP
requires full read and write (including delete and overwrite) permissions to its target destination.
{% include {{ page.version.version }}/backups/destination-file-privileges.md %}
Parameter | Description
-----------+-------------
targets
| Back up the listed targets.
subdirectory
| The name of the specific subdirectory (e.g., 2021/03/23-213101.37
) where you want to add an incremental backup. To view available subdirectories, use SHOW BACKUPS IN destination
. If the subdirectory
is not provided, a full backup will be created in the collection using a date-based naming scheme (i.e., <year>/<month>/<day>-<timestamp>
).
Warning: If you use an arbitrary STRING
as the subdirectory, a new full backup will be created, but it will never be shown in SHOW BACKUPS IN
. We do not recommend using arbitrary strings as subdirectory names.
LATEST
| Append an incremental backup to the latest completed full backup's subdirectory.
destination
| The URL where you want to store the backup.
For information about this URL structure, see Backup File URLs.
timestamp
| Back up data as it existed as of timestamp
. The timestamp
must be more recent than your cluster's last garbage collection (which defaults to occur every 25 hours, but is configurable per table).
backup_options
| Control the backup behavior with a comma-separated list of these options.
Target | Description
-----------------------------------+-------------------------------------------------------------------------
N/A | Backup the cluster. For an example of a full cluster backup, see Backup a cluster.
DATABASE {database_name} [, ...]
| The name of the database(s) you want to back up (i.e., create backups of all tables and views in the database). For an example of backing up a database, see Backup a database.
TABLE {table_name} [, ...]
| The name of the table(s) or view(s) you want to back up. For an example of backing up a table or view, see Backup a table or view.
{% include {{ page.version.version }}/backups/backup-options.md %}
CockroachDB uses the URL provided to construct a secure API call to the service you specify. The URL structure depends on the type of file storage you are using. For more information, see the following:
-
Use Cloud Storage for Bulk Operations
{{site.data.alerts.callout_info}} HTTP storage is not supported for
BACKUP
andRESTORE
. {{site.data.alerts.end}}
Dependent objects must be backed up at the same time as the objects they depend on.
Object | Depends On |
---|---|
Table with foreign key constraints | The table it REFERENCES ; however, this dependency can be removed during the restore. |
Table with a sequence | The sequence it uses; however, this dependency can be removed during the restore. |
Views | The tables used in the view's SELECT statement. |
Interleaved tables | The parent table in the interleaved hierarchy. |
The system.users
table stores your users and their passwords. To restore your users and privilege grants, do a cluster backup and restore the cluster to a fresh cluster with no user data. You can also backup the system.users
table, and then use this procedure.
The BACKUP
process minimizes its impact to the cluster's performance by distributing work to all nodes. Each node backs up only a specific subset of the data it stores (those for which it serves writes; more details about this architectural concept forthcoming), with no two nodes backing up the same data.
BACKUP
, like any read, cannot export a range if the range contains an unresolved intent. While you typically will want bulk, background jobs like BACKUP
to have as little impact on your foreground traffic as possible, it's more important for backups to actually complete (which maintains your recovery point objective (RPO)). Unlike a normal read transaction that will block until any uncommitted writes it encounters are resolved, BACKUP
will block only for a configurable duration before invoking priority to ensure it can complete on-time.
We recommend always starting backups with a specific timestamp at least 10 seconds in the past. For example:
> BACKUP...AS OF SYSTEM TIME '-10s';
This improves performance by decreasing the likelihood that the BACKUP
will be retried because it contends with other statements/transactions. However, because AS OF SYSTEM TIME
returns historical data, your reads might be stale. Taking backups with AS OF SYSTEM TIME '-10s'
is a good best practice to reduce the number of still-running transactions you may encounter, since the backup will take priority and will force still-running transactions to restart after the backup is finished.
BACKUP
will initially ask individual ranges to backup but to skip if they encounter an intent. Any range that is skipped is placed at the end of the queue. When BACKUP
has completed its initial pass and is revisiting ranges, it will ask any range that did not resolve within the given time limit (default 1 minute) to attempt to resolve any intents that it encounters and to not skip. Additionally, the backup's transaction priority is then set to high
, which causes other transactions to abort until the intents are resolved and the backup is finished.
After CockroachDB successfully initiates a backup, it registers the backup as a job, and you can do the following:
Action | SQL Statement
-----------------------+-----------------
View the backup status | SHOW JOBS
Pause the backup | PAUSE JOB
Resume the backup | RESUME JOB
Cancel the backup | CANCEL JOB
You can also visit the Jobs page of the DB Console to view job details. The BACKUP
statement will return when the backup is finished or if it encounters an error.
{{site.data.alerts.callout_info}}
The presence of a BACKUP-CHECKPOINT
file in the backup destination usually means the backup is not complete. This file is created when a backup is initiated, and is replaced with a BACKUP
file once the backup is finished.
{{site.data.alerts.end}}
Per our guidance in the Performance section, we recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME
. Each example below follows this guidance.
{% include {{ page.version.version }}/backups/bulk-auth-options.md %}
{{site.data.alerts.callout_info}}
The examples in this section use the default AUTH=specified
parameter. For more detail on how to use implicit
authentication with Amazon S3 buckets, read Use Cloud Storage for Bulk Operations — Authentication.
{{site.data.alerts.end}}
To take a full backup of a cluster:
{% include copy-clipboard.html %}
> BACKUP INTO \
's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of a single database:
{% include copy-clipboard.html %}
> BACKUP DATABASE bank \
INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of multiple databases:
{% include copy-clipboard.html %}
> BACKUP DATABASE bank, employees \
INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of a single table or view:
{% include copy-clipboard.html %}
> BACKUP bank.customers \
INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of multiple tables:
{% include copy-clipboard.html %}
> BACKUP bank.customers, bank.accounts \
INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';
{{site.data.alerts.callout_danger}} {% include {{page.version.version}}/backups/no-multiregion-table-backups.md %} {{site.data.alerts.end}}
If you backup to a destination already containing a full backup, an incremental backup will be appended to the full backup's path with a date-based name (e.g., 20210324
):
{% include copy-clipboard.html %}
> BACKUP INTO LATEST IN \
's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s';
Use the DETACHED
option to execute the backup job asynchronously:
{% include copy-clipboard.html %}
> BACKUP INTO \
's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' \
AS OF SYSTEM TIME '-10s'
WITH DETACHED;
The job ID is returned immediately without waiting for the job to finish:
job_id
----------------------
592786066399264769
(1 row)
Without the DETACHED
option, BACKUP
will block the SQL connection until the job completes. Once finished, the job status and more detailed job data is returned:
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+--------
652471804772712449 | succeeded | 1 | 50 | 0 | 4911
(1 row)
To take a full backup of a cluster:
{% include copy-clipboard.html %}
> BACKUP INTO \
'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of a single database:
{% include copy-clipboard.html %}
> BACKUP DATABASE bank \
INTO 'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of multiple databases:
{% include copy-clipboard.html %}
> BACKUP DATABASE bank, employees \
INTO 'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of a single table or view:
{% include copy-clipboard.html %}
> BACKUP bank.customers \
INTO 'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of multiple tables:
{% include copy-clipboard.html %}
> BACKUP bank.customers, bank.accounts \
INTO 'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
{{site.data.alerts.callout_danger}} {% include {{page.version.version}}/backups/no-multiregion-table-backups.md %} {{site.data.alerts.end}}
If you backup to a destination already containing a full backup, an incremental backup will be appended to the full backup's path with a date-based name (e.g., 20210324
):
{% include copy-clipboard.html %}
> BACKUP INTO LATEST IN \
'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
Use the DETACHED
option to execute the backup job asynchronously:
{% include copy-clipboard.html %}
> BACKUP INTO \
'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' \
AS OF SYSTEM TIME '-10s'
WITH DETACHED;
The job ID is returned immediately without waiting for the job to finish:
job_id
----------------------
592786066399264769
(1 row)
Without the DETACHED
option, BACKUP
will block the SQL connection until the job completes. Once finished, the job status and more detailed job data is returned:
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+--------
652471804772712449 | succeeded | 1 | 50 | 0 | 4911
(1 row)
{{site.data.alerts.callout_info}}
The examples in this section use the AUTH=specified
parameter, which will be the default behavior in v21.2 and beyond for connecting to Google Cloud Storage. For more detail on how to pass your Google Cloud Storage credentials with this parameter, or, how to use implicit
authentication, read Use Cloud Storage for Bulk Operations — Authentication.
{{site.data.alerts.end}}
To take a full backup of a cluster:
{% include copy-clipboard.html %}
> BACKUP INTO \
'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of a single database:
{% include copy-clipboard.html %}
> BACKUP DATABASE bank \
INTO 'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of multiple databases:
{% include copy-clipboard.html %}
> BACKUP DATABASE bank, employees \
INTO 'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of a single table or view:
{% include copy-clipboard.html %}
> BACKUP bank.customers \
INTO 'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
To take a full backup of multiple tables:
{% include copy-clipboard.html %}
> BACKUP bank.customers, bank.accounts \
INTO 'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
{{site.data.alerts.callout_danger}} {% include {{page.version.version}}/backups/no-multiregion-table-backups.md %} {{site.data.alerts.end}}
If you backup to a destination already containing a full backup, an incremental backup will be appended to the full backup's path with a date-based name (e.g., 20210324
):
{% include copy-clipboard.html %}
> BACKUP INTO LATEST IN \
'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' \
AS OF SYSTEM TIME '-10s';
Use the DETACHED
option to execute the backup job asynchronously:
{% include copy-clipboard.html %}
> BACKUP INTO \
'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' \
AS OF SYSTEM TIME '-10s'
WITH DETACHED;
The job ID is returned immediately without waiting for the job to finish:
job_id
----------------------
592786066399264769
(1 row)
Without the DETACHED
option, BACKUP
will block the SQL connection until the job completes. Once finished, the job status and more detailed job data is returned:
job_id | status | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+--------
652471804772712449 | succeeded | 1 | 50 | 0 | 4911
(1 row)
{% include {{ page.version.version }}/backups/advanced-examples-list.md %}
{% include {{ page.version.version }}/known-limitations/backup-interleaved.md %}
{% include {{ page.version.version }}/known-limitations/write-intent-buildup.md %}
{% include {{page.version.version}}/backups/no-multiregion-table-backups.md %}