IGNITE-18927 Add incremental snapshot docs (apache#10609)
timoninmaxim authored Mar 28, 2023
1 parent e68e3c5 commit 857fe89
Showing 4 changed files with 82 additions and 14 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -41,8 +41,11 @@ void configuration() {
try (IgniteCache<Integer, String> cache = ignite.getOrCreateCache(ccfg)) {
cache.put(1, "Maxim");

// Start snapshot operation.
// Create snapshot operation.
ignite.snapshot().createSnapshot("snapshot_02092020").get();

// Create incremental snapshot operation.
ignite.snapshot().createIncrementalSnapshot("snapshot_02092020").get();
}
finally {
ignite.destroyCache(ccfg.getName());
Expand All @@ -52,6 +55,10 @@ void configuration() {
//tag::restore[]
// Restore cache named "snapshot-cache" from the snapshot "snapshot_02092020".
ignite.snapshot().restoreSnapshot("snapshot_02092020", Collections.singleton("snapshot-cache")).get();

// Restore cache named "snapshot-cache" from the snapshot "snapshot_02092020" and its increment with index 1.
ignite.snapshot().restoreSnapshot("snapshot_02092020", Collections.singleton("snapshot-cache"), 1).get();

//end::restore[]

ignite.close();
3 changes: 3 additions & 0 deletions docs/_docs/monitoring-metrics/system-views.adoc
Expand Up @@ -988,4 +988,7 @@ The SNAPSHOT view exposes information about local snapshots.
| CONSISTENT_ID | VARCHAR | Consistent ID of a node to which snapshot data relates.
| BASELINE_NODES | VARCHAR | Baseline nodes affected by the snapshot.
| CACHE_GROUPS | VARCHAR | Cache group names that were included in the snapshot.
| SNAPSHOT_RECORD_SEGMENT | BIGINT | Index of the WAL segment containing the snapshot's WAL record.
| INCREMENT_INDEX | INTEGER | Incremental snapshot index.
| TYPE | VARCHAR | Type of snapshot - full or incremental.
|===
82 changes: 70 additions & 12 deletions docs/_docs/snapshots/snapshots.adoc
Expand Up @@ -16,7 +16,7 @@

== Overview

Ignite provides an ability to create full cluster snapshots for deployments using
Ignite provides the ability to create full and incremental cluster snapshots for deployments using
link:persistence/native-persistence[Ignite Persistence]. An Ignite snapshot includes a consistent cluster-wide copy of
all data records persisted on disk and some other files needed for a restore procedure.

Expand All @@ -28,15 +28,15 @@ with several exceptions. Let's take this snapshot as an example to review the st
work
└── snapshots
└── backup23012020
├── increments
│ └── 0000000000000001
└── db
├── binary_meta
│ ├── node1
│ ├── node2
│ └── node3
├── marshaller
│ ├── node1
│ ├── node2
│ └── node3
│ └── classname0
├── node1
│ └── my-sample-cache
│ ├── cache_data.dat
Expand All @@ -62,8 +62,10 @@ the nodes are named as `node1`, `node2`, and `node3`, while in practice, the nam
link:https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStoreunderthehood-SubfoldersGeneration[consistent IDs].
* The snapshot keeps a copy of the `my-sample-cache` cache.
* The `db` folder keeps a copy of data records in `part-N.bin` and `cache_data.dat` files. Write-ahead and checkpointing
are not added into the snapshot as long as those are not required for the current restore procedure.
are not added to the snapshot because they are not required for the full snapshot restore procedure.
* The `binary_meta` and `marshaller` directories store metadata and marshaller-specific information.
* The `increments` directory stores incremental snapshots based on the full snapshot `backup23012020`. In this example,
there is a single increment, `0000000000000001`. It contains a `wal` directory that stores the compressed WAL segments, as well as `binary_meta` and `marshaller` directories.
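
As a rough illustration of the layout described above, the following sketch builds the expected location of an increment directory from a snapshots root, a snapshot name, and an increment index. The 16-digit zero-padded directory name is an assumption inferred from the `0000000000000001` entry in the example tree, and the `IncrementPaths` class and its methods are hypothetical helpers, not part of the Ignite API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Ignite): resolves where an increment is expected on disk. */
public class IncrementPaths {
    /**
     * Formats the increment index as a 16-digit zero-padded name.
     * The width is an assumption taken from the example tree (1 -> 0000000000000001).
     */
    public static String incrementDirName(int incIdx) {
        if (incIdx < 1)
            throw new IllegalArgumentException("Increment index must be positive: " + incIdx);

        return String.format("%016d", incIdx);
    }

    /** Builds {snapshotsRoot}/{snpName}/increments/{zero-padded index}. */
    public static Path incrementDir(Path snapshotsRoot, String snpName, int incIdx) {
        return snapshotsRoot
            .resolve(snpName)
            .resolve("increments")
            .resolve(incrementDirName(incIdx));
    }

    public static void main(String[] args) {
        // Uses the directory and snapshot names from the example tree.
        System.out.println(incrementDir(Paths.get("work", "snapshots"), "backup23012020", 1));
    }
}
```

Such a helper could be handy when checking from a script or test that an increment was actually written next to its full snapshot.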

[NOTE]
====
Expand All @@ -76,6 +78,31 @@ snapshot data spread across the cluster. Each node keeps a segment of the snapsh
The link:snapshots/snapshots#restoring-from-snapshot[restore procedure] explains how to tether together all the segments during recovery.
====

== Incremental snapshots

A low RPO (Recovery Point Objective), e.g. a few minutes, can hardly be achieved using full snapshots, because they require additional resources
to create and store all partition data. Instead, you can use incremental snapshots, which:

1. store the data changes that have happened since the previous full or incremental snapshot was created;
2. provide a lightweight creation process that can run concurrently with the runtime load.

[NOTE]
====
Incremental snapshots consist of compressed WAL segments, which are collected in the background without pressure on cluster resources.
====

There are some prerequisites for using incremental snapshots:

* Incremental snapshots are based on an existing full snapshot.
* link:persistence/native-persistence#wal-archive-compaction[WAL archive compaction] has to be enabled.
* Incremental snapshots have to be created on the same media drive where WAL archives are stored.

During the incremental snapshot restore procedure, the full snapshot is restored first, and then all increments are restored sequentially.
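
The restore order can be sketched as follows. This is only an illustration of the sequence described above, not Ignite's restore implementation: the `RestoreOrder` class is hypothetical, and it assumes that increments with indexes 1 through the requested index are applied in ascending order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the restore order; not Ignite's actual restore code. */
public class RestoreOrder {
    /**
     * Lists the steps needed to restore a full snapshot and its increments
     * up to and including {@code incIdx}. An index of 0 means "full snapshot only".
     */
    public static List<String> restoreSteps(String snpName, int incIdx) {
        if (incIdx < 0)
            throw new IllegalArgumentException("Increment index must not be negative: " + incIdx);

        List<String> steps = new ArrayList<>();

        // The full snapshot is always restored first.
        steps.add("full:" + snpName);

        // Increments are then applied one by one, in ascending index order.
        for (int i = 1; i <= incIdx; i++)
            steps.add("increment:" + snpName + ":" + i);

        return steps;
    }

    public static void main(String[] args) {
        restoreSteps("snapshot_02092020", 2).forEach(System.out::println);
    }
}
```

The point of the sketch is that an increment is never applied on its own: each one only carries the changes made since the previous snapshot, so the chain has to start from the full snapshot.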

Please refer to the sections link:snapshots/snapshots#consistency-guarantees[Consistency Guarantees] and
link:snapshots/snapshots#current-limitations[Current Limitations] below for more details about incremental snapshots.

== Configuration

=== Snapshot Directory
Expand Down Expand Up @@ -133,7 +160,10 @@ control.(sh|bat) --snapshot create snapshot_09062021
control.(sh|bat) --snapshot create snapshot_09062021 --sync
# Create a cluster snapshot named "snapshot_09062021" in the "/tmp/ignite/snapshots" folder (the full path to the snapshot files will be /tmp/ignite/snapshots/snapshot_09062021):
control.(sh|bat) --snapshot create snapshot_09062021 -dest /tmp/ignite/snapshots
control.(sh|bat) --snapshot create snapshot_09062021 --dest /tmp/ignite/snapshots
# Create an incremental snapshot based on full snapshot "snapshot_09062021":
control.(sh|bat) --snapshot create snapshot_09062021 --incremental
----

=== Using JMX
Expand All @@ -143,7 +173,8 @@ Use the `SnapshotMXBean` interface to perform the snapshot-specific procedures v
[cols="1,1",opts="header"]
|===
|Method | Description
|createSnapshot(String snpName) | Create a snapshot.
|createSnapshot(String snpName, String snpPath) | Create a snapshot.
|createIncrementalSnapshot(String snpName, String snpPath) | Create an incremental snapshot.
|===

=== Using Java API
Expand Down Expand Up @@ -181,12 +212,12 @@ Both procedures are described below, however, it is preferable to use the restor

=== Manual Snapshot Restore Procedure

The snapshot structure is similar to the layout of the Ignite Native Persistence, so for the manual snapshot restore you must
do a snapshot restore only on the same cluster with the same node `consistentId` and on the same topology on which a snapshot
was taken. If you need to restore a snapshot on a different cluster or on a different cluster topology use the
link:snapshots/snapshots#automatic-snapshot-restore-procedure[Automatic Snapshot Restore Procedure].
The snapshot structure is similar to the layout of the Ignite Native Persistence. Therefore, a manual restore must be
performed only on the same cluster, with the same node `consistentId` and on the same topology on which the snapshot
was taken. Only a full snapshot can be restored manually. If you need to restore a snapshot on a different cluster or on a different
cluster topology, or to restore incremental snapshots, use the link:snapshots/snapshots#automatic-snapshot-restore-procedure[Automatic Snapshot Restore Procedure].

In general, stop the cluster, then replace persistence data and other files with the data from the snapshot, and restart the nodes.
In general, stop the cluster, then replace the persistence data and other files using the data from the snapshot, and restart the nodes.

The detailed procedure looks as follows:

Expand Down Expand Up @@ -228,6 +259,10 @@ tab:CLI[]
----
# Restore cache group "snapshot-cache" from the snapshot "snapshot_02092020".
control.(sh|bat) --snapshot restore snapshot_02092020 --groups snapshot-cache
# Restore cache group "snapshot-cache" from the snapshot "snapshot_02092020" and its increment with index 1.
control.(sh|bat) --snapshot restore snapshot_02092020 --groups snapshot-cache --increment 1
----
--

Expand All @@ -247,6 +282,9 @@ control.(sh|bat) --snapshot restore snapshot_09062021 --src /tmp/ignite/snapshot
# Start restoring only "cache-group1" and "cache-group2" from the snapshot "snapshot_09062021" in the background.
control.(sh|bat) --snapshot restore snapshot_09062021 --groups cache-group1,cache-group2
# Start restoring all user-created cache groups from the snapshot "snapshot_09062021" and its increment with index 1.
control.(sh|bat) --snapshot restore snapshot_09062021 --increment 1
----

== Getting Snapshot Operation Status
Expand Down Expand Up @@ -332,6 +370,14 @@ The consistency between the Ignite Persistence files and their snapshot copies i
files to the destination snapshot directory with tracking all concurrent ongoing changes. The tracking of the changes
might require extra space on the Ignite Persistence storage media (up to the 1x size of the storage media).

=== Incremental snapshot consistency guarantees

Incremental snapshots use a different, non-blocking approach to achieve transactional consistency, based on the Consistent Cut algorithm.
This allows you to start incremental snapshots concurrently with the runtime load without affecting performance. However, it doesn't guarantee consistency
for atomic caches. It's highly recommended to verify these caches after restoring with the `idle_verify`
command. If necessary, inconsistent partitions can be repaired with the `consistency` command. Please check the
link:tools/control-script[Control Script] section for more information about these commands.

== Current Limitations

The snapshot procedure has some limitations that you should be aware of before using the feature in your production environment:
Expand All @@ -347,3 +393,15 @@ The snapshot procedure has some limitations that you should be aware of before u

If any of these limitations prevent you from using Apache Ignite, then select alternate snapshotting implementations for
Ignite provided by enterprise vendors.

=== Incremental snapshot limitations

Incremental snapshots can't be created in the following cases:

* Encrypted caches are present in the cluster.
* Caches were created, changed, or destroyed after the full snapshot was created.
* link:data-rebalancing[Data has been rebalanced] in the cluster.

Ignite automatically monitors these events and prevents incremental snapshot creation. In such cases, you must create a new
full snapshot, after which incremental snapshots can be created again.

Original file line number Diff line number Diff line change
Expand Up @@ -253,7 +253,7 @@
* These major actions available:
* <ul>
* <li>Create snapshot of the whole cluster cache groups by triggering PME to achieve consistency.</li>
* <li>Create incremental snapshot of transactional cache groups by using Consistent Cut algorithm.</li>
* <li>Create incremental snapshot using lightweight, non-blocking Consistent Cut algorithm.</li>
* </ul>
*/
public class IgniteSnapshotManager extends GridCacheSharedManagerAdapter
