archiving backup draft material into draft folder
Sam Kleinman committed Jan 8, 2014
1 parent 8dbff98 commit 09d4a65
Showing 3 changed files with 304 additions and 0 deletions.
76 changes: 76 additions & 0 deletions draft/administration/backup-considerations.txt
@@ -0,0 +1,76 @@
Deployment Considerations
-------------------------

All Deployments
~~~~~~~~~~~~~~~

To facilitate a robust backup strategy, all MongoDB deployments
should:

- use MMS Backups.

- run all production deployments with journaling enabled. By default,
  MongoDB enables journaling. The journal can facilitate snapshots and
  provides robust durability for :program:`mongod` instances.

Replica Sets
~~~~~~~~~~~~

For replica sets, create backups from secondary members to minimize
the impact of backup operations on the :doc:`primary
</core/replica-set-primary>`. Consider using a :doc:`hidden member
</core/replica-set-hidden-member>` as a dedicated backup instance.
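
As a rough illustration, a hidden member is an ordinary secondary whose
entry in the replica set configuration sets ``hidden`` to true and
``priority`` to 0. The following Python sketch edits a plain dictionary
shaped like a replica set configuration document. The host names and
the ``make_hidden`` helper are hypothetical, and the sketch does not
apply the configuration to a running deployment.

.. code-block:: python

   import copy

   def make_hidden(config, member_id):
       """Return a copy of a replica set config with one member hidden."""
       new_config = copy.deepcopy(config)
       for member in new_config["members"]:
           if member["_id"] == member_id:
               member["priority"] = 0   # a hidden member must have priority 0
               member["hidden"] = True  # hide the member from client applications
               return new_config
       raise ValueError("no member with _id %r" % (member_id,))

   # Hypothetical three-member configuration.
   example_config = {
       "_id": "rs0",
       "members": [
           {"_id": 0, "host": "db0.example.net:27017"},
           {"_id": 1, "host": "db1.example.net:27017"},
           {"_id": 2, "host": "db2.example.net:27017"},
       ],
   }
   hidden_config = make_hidden(example_config, 2)

The helper returns a modified copy so that the original configuration
stays available for comparison before any reconfiguration.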

Sharded Clusters
~~~~~~~~~~~~~~~~

Any approach to sharded cluster backups must ensure data consistency
between shards. To create a backup of a sharded cluster you must
:doc:`turn off the balancer
</tutorial/schedule-backup-window-for-sharded-clusters>`.

Additionally, your backup method must synchronize the snapshots of all
shards. MMS uses a synchronization token, but you can accomplish the
same effect by stopping all write operations while capturing backups.

When backing up any sharded cluster, you must also :doc:`back up the
config server metadata </tutorial/backup-sharded-cluster-metadata>`.

Testing and Restoring Backups
-----------------------------

A backup system is only useful if it is possible to restore and
recover data using the backup. *Always* test backups to ensure that
restorations are viable, and include recovery testing as part of your
larger backup strategy.

Consider the following restoration tutorials:

- :doc:`/tutorial/restore-replica-set-from-backup`
- :doc:`/tutorial/restore-single-shard`
- :doc:`/tutorial/restore-sharded-cluster`

Backup Considerations
---------------------

As you develop a backup strategy for your MongoDB deployment, consider
the following factors:

- Geography. Ensure that you move some backups away from your
  primary database infrastructure.

- System errors. Ensure that your backups can survive situations where
  hardware failures or disk errors impact the integrity or
  availability of your backups.

- Production constraints. Backup operations themselves sometimes require
  substantial system resources. It is important to consider the time of
  the backup schedule relative to peak usage and maintenance windows.

- System capabilities. Some of the block-level snapshot tools require
  special support on the operating-system or infrastructure level.

- Database configuration. :term:`Replication` and :term:`sharding
  <shard>` can affect the process and impact of the backup
  implementation. See :ref:`sharded-cluster-backups` and
  :ref:`replica-set-backups`.
160 changes: 160 additions & 0 deletions draft/administration/backup-methods.txt
@@ -0,0 +1,160 @@
======================
MongoDB Backup Methods
======================

When deploying MongoDB in production, you should have a strategy for
capturing and restoring backups in the case of data loss events. There
are a variety of different methods that you can use to back up
MongoDB, depending on your requirements and configuration.

Overview of Methods
-------------------

* Employ MongoDB Management Service (MMS)
* Use mongodump and mongorestore tools
* Copy the Underlying Data Files

Backing up with MongoDB Management Service (MMS)
------------------------------------------------

MongoDB Management Service (MMS) Backup provides a fully managed
backup solution for MongoDB. MMS continually backs up MongoDB replica
sets and sharded systems by reading the oplog data from your MongoDB
cluster.

MMS Backup offers point in time recovery of MongoDB replica sets and a
consistent snapshot of sharded systems.

MMS achieves point in time recovery by storing oplog data so that it
can recreate any moment in time within the last 24 hours for a
particular replica set.

For sharded systems, MMS can’t restore to any arbitrary moment in
time, but it does provide periodic consistent snapshots of the entire
sharded system, something that is difficult to achieve using any other
method of backing up MongoDB.

To restore your MongoDB cluster from an MMS Backup snapshot, you
download a compressed archive of your MongoDB data files and move
those files into position before restarting the mongod processes on
the various MongoDB nodes.

To get started with MMS Backup, see the MMS documentation.

Backing up with mongodump
-------------------------


:program:`mongodump` is used in conjunction with
:program:`mongorestore`. :program:`mongodump` creates backups by
querying the data in each targeted collection and writing that data to
a file. :program:`mongorestore` can take the output of
:program:`mongodump` and write that data to a MongoDB deployment.

:program:`mongodump` and :program:`mongorestore` can operate both
online, against a running mongod process, and offline, by manipulating
the underlying database files.

:program:`mongodump` does not capture the contents of indexes. It
stores only the documents that it backs up, along with the index
definitions. The resulting backup is space efficient, but
:program:`mongorestore` must rebuild the indexes after restoring the
data.

When connected to a running :program:`mongod`, the
:program:`mongodump` ``--oplog`` option creates a consistent
point-in-time backup. Use the corresponding :program:`mongorestore`
``--oplogReplay`` option to restore these backups. Because ``--oplog``
records the operations that occur during the dump, applications can
continue to write to the database while :program:`mongodump` with the
``--oplog`` option runs.

To use the ``--oplog`` option to create a point-in-time backup, you
must run a replica set rather than a standalone :program:`mongod`
instance, because only a replica set produces the required oplog
data. Even a single host can run as a replica set.
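
The pairing of the two options can be sketched as follows. This Python
example only builds the command lines and does not run them. The host
names, the output directory, and the two helper functions are
hypothetical; the sketch assumes the ``--host``, ``--out``,
``--oplog``, and ``--oplogReplay`` options.

.. code-block:: python

   def dump_command(host, out_dir):
       """Build a mongodump command line that also records the oplog."""
       return ["mongodump", "--host", host, "--oplog", "--out", out_dir]

   def restore_command(host, dump_dir):
       """Build the matching mongorestore command line."""
       return ["mongorestore", "--host", host, "--oplogReplay", dump_dir]

   # Hypothetical hosts and backup location.
   dump_argv = dump_command("db1.example.net:27017", "/backups/dump-example")
   restore_argv = restore_command("db9.example.net:27017", "/backups/dump-example")
   print(" ".join(dump_argv))
   print(" ".join(restore_argv))

Building the argument lists separately from running them makes it easy
to review or log the exact commands before a backup job executes them.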

:program:`mongodump` can adversely affect your database performance. When
connected to a MongoDB instance, :program:`mongodump` reads data using
standard query operations. If your data is larger than memory,
:program:`mongodump` can cause a large amount of I/O and force your
working set out of memory.

Because :program:`mongodump` can have an adverse effect on performance, it is
advisable to run it on a secondary node of a replica set so that the
primary node remains responsive. Alternatively, you can bring a
secondary node offline and use :program:`mongodump` in offline mode. However,
make sure that the dump can complete before the oplog turns over;
otherwise, the secondary node will need to perform a complete resync
from the primary.
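
The oplog constraint above comes down to simple arithmetic, sketched
here in Python. The function names, the safety factor, and the example
numbers are all hypothetical; in practice you would measure the oplog
size and growth rate on your own deployment.

.. code-block:: python

   def oplog_window_hours(oplog_size_mb, oplog_growth_mb_per_hour):
       """Estimate how many hours of operations the oplog can hold."""
       if oplog_growth_mb_per_hour <= 0:
           raise ValueError("growth rate must be positive")
       return oplog_size_mb / float(oplog_growth_mb_per_hour)

   def dump_fits_in_window(dump_hours, oplog_size_mb,
                           oplog_growth_mb_per_hour, safety_factor=2.0):
       """Return True if the dump finishes well inside the oplog window."""
       window = oplog_window_hours(oplog_size_mb, oplog_growth_mb_per_hour)
       return dump_hours * safety_factor <= window

   # Hypothetical numbers: a 10240 MB oplog growing by 512 MB per hour
   # holds about 20 hours of operations.
   window = oplog_window_hours(10240, 512)
   ok = dump_fits_in_window(6, 10240, 512)    # 6 * 2 = 12 hours <= 20 hours

The safety factor leaves headroom for write bursts, which shorten the
window compared with an average growth rate.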

:program:`mongodump` does not back up the local database unless explicitly
specified.

To back up a sharded system using :program:`mongodump`, you must back
up a replica set node on every shard. Obtaining a snapshot
corresponding to an exact moment in time across a sharded system is
difficult with :program:`mongodump`. You can approximate a
moment-in-time snapshot by bringing a secondary node offline for each
shard at approximately the same moment and then running
:program:`mongodump` on each offline host.

Despite the performance limitations of :program:`mongodump` and :program:`mongorestore`,
they can be quite useful for backing up and restoring a collection to
a running system.

See Back Up and Restore with MongoDB Tools, Backup a Small Sharded
Cluster with :program:`mongodump`, and Backup a Sharded Cluster with Database
Dumps for more information.


Copying the Underlying Data Files
---------------------------------

You can create a backup of MongoDB by copying the underlying data
files that MongoDB uses to operate and restoring them as needed.

If the volume where MongoDB stores data files supports point-in-time
snapshots, you can use these snapshots to create backups of a MongoDB
system at an exact moment in time. File system snapshots are an
operating system volume manager feature, not a feature specific to
MongoDB.

The mechanics of snapshots depend on the underlying volume management
system. For example, Amazon’s EBS storage system for EC2 supports
snapshots directly. On Linux, the LVM manager can create a snapshot.
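
For the LVM case, a snapshot is typically created with ``lvcreate``.
The following Python sketch only assembles such a command line and
does not run it. The volume group, volume name, snapshot name, and
snapshot size are hypothetical placeholders.

.. code-block:: python

   def lvm_snapshot_command(volume_path, snapshot_name, snapshot_size):
       """Build an lvcreate command line that snapshots a logical volume."""
       return ["lvcreate", "--snapshot", "--size", snapshot_size,
               "--name", snapshot_name, volume_path]

   # Hypothetical volume that holds the MongoDB data files.
   snapshot_argv = lvm_snapshot_command("/dev/vg0/mongodb", "mdb-snap01", "100M")
   print(" ".join(snapshot_argv))

The size given here bounds how much changed data the snapshot can
track, so it needs to cover the writes expected while the snapshot
exists.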

To get a correct snapshot of a running mongod process, you must have
journaling enabled within MongoDB. Without journaling enabled, there
is no guarantee that the snapshot will be consistent; there could be
half-written data.

The MongoDB journal file must reside on the same logical volume as the
other MongoDB data files for the snapshot to be consistent.
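
One way to check this requirement before relying on snapshots is to
compare the device that holds the data directory with the device that
holds the journal. This Python sketch assumes that the journal lives
in a ``journal`` directory under the data directory and that comparing
``st_dev`` values is a good enough proxy for "same logical volume" on
your platform. The helper names are hypothetical.

.. code-block:: python

   import os
   import tempfile

   def on_same_device(path_a, path_b):
       """Return True if both paths resolve to the same device (st_dev)."""
       return os.stat(path_a).st_dev == os.stat(path_b).st_dev

   def journal_shares_volume(dbpath):
       """Check that <dbpath>/journal is on the same device as dbpath."""
       return on_same_device(dbpath, os.path.join(dbpath, "journal"))

   # Demonstration on a throwaway directory rather than a real dbpath.
   demo_dbpath = tempfile.mkdtemp()
   os.mkdir(os.path.join(demo_dbpath, "journal"))
   same = journal_shares_volume(demo_dbpath)

Because ``os.stat`` follows symbolic links, a journal directory that
is linked to another volume reports a different device.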

To get a consistent snapshot of a sharded system, you must snapshot a
replica set node of every shard and a config server at approximately
the same moment in time, while taking care to turn off the balancer
during the backup process.
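
The ordering described above can be sketched as a small Python
routine. Every callable here is a hypothetical stand-in: in a real
deployment they would disable the balancer, trigger a volume snapshot
on each host, and re-enable the balancer.

.. code-block:: python

   def backup_sharded_cluster(stop_balancer, start_balancer, snapshot_targets):
       """Snapshot every target with the balancer off, then re-enable it."""
       results = []
       stop_balancer()
       try:
           for name, take_snapshot in snapshot_targets:
               results.append((name, take_snapshot()))
       finally:
           start_balancer()   # always re-enable balancing, even on failure
       return results

   # Stand-in actions that only record the order of events.
   events = []

   def record(event):
       events.append(event)
       return event

   targets = [(name, lambda name=name: record("snapshot " + name))
              for name in ("config", "shard0", "shard1")]
   results = backup_sharded_cluster(lambda: record("balancer off"),
                                    lambda: record("balancer on"),
                                    targets)

The ``try``/``finally`` block reflects the main design concern: a
failed snapshot should not leave the balancer disabled.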

File system snapshots are efficient and do not impact performance
significantly if the underlying snapshot facility is well
implemented. They can also be space efficient if the snapshotting
feature of the volume manager supports incremental snapshots and
compression.

The primary drawback of using file system snapshots is that they
do not provide arbitrary point in time recovery for replica sets and
can be somewhat complicated to configure and manage for large sharded
systems.

If your storage system does not support snapshots, you can copy the
files directly. Because copying multiple files is not an atomic
operation, you must stop the :program:`mongod` process before you
start copying the files. Otherwise, you will copy the files in an
inconsistent state.
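
A minimal Python sketch of a guarded copy follows. It assumes the
convention that a cleanly stopped :program:`mongod` leaves an empty
``mongod.lock`` file in the data directory, and treats a non-empty
lock file as a sign that the process is running or did not shut down
cleanly. The helper name and the demonstration paths are hypothetical.

.. code-block:: python

   import os
   import shutil
   import tempfile

   def copy_data_files(dbpath, backup_dir):
       """Copy dbpath to backup_dir, refusing if mongod.lock is non-empty."""
       lock_file = os.path.join(dbpath, "mongod.lock")
       if os.path.exists(lock_file) and os.path.getsize(lock_file) > 0:
           raise RuntimeError("mongod.lock is not empty; stop mongod first")
       shutil.copytree(dbpath, backup_dir)   # backup_dir must not exist yet
       return backup_dir

   # Demonstration on a throwaway directory rather than a real dbpath.
   demo_root = tempfile.mkdtemp()
   demo_src = os.path.join(demo_root, "db")
   os.mkdir(demo_src)
   open(os.path.join(demo_src, "example.0"), "w").close()
   open(os.path.join(demo_src, "mongod.lock"), "w").close()   # empty lock file
   copied = copy_data_files(demo_src, os.path.join(demo_root, "backup"))

The lock file check is only a safeguard against an obvious mistake; it
does not replace confirming that the process has actually stopped.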

For complete instructions on using LVM to create snapshots, see Backup
and Restore with Filesystem Snapshots and Backup a Sharded Cluster
with Filesystem Snapshots. Also see Back up and Restore Processes for
MongoDB on Amazon EC2.

A drawback of simply copying and restoring files is that any
underlying fragmentation within the MongoDB data files, due to
documents being updated and deleted, will remain after you restore the
files. :program:`mongodump` and :program:`mongorestore`, in contrast,
have the effect of removing fragmentation.
68 changes: 68 additions & 0 deletions draft/administration/mms-backup.txt
@@ -0,0 +1,68 @@
Replica Sets
------------

MMS Backup for replica sets uses a lightweight backup agent that runs
within a deployment to copy data to the Backup service. Internally,
MMS Backup operates like a secondary in a replica set.

The agent first performs a process that resembles the
:ref:`initial sync for new replica set members
<replica-set-initial-sync>`. The agent *always* collects :term:`oplog`
entries and, just like :doc:`replication
</core/replication-introduction>`, applies those operations to the
backup snapshots. MMS Backup uses this oplog to build point-in-time
snapshots of the replica set.
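
The idea of building a point-in-time state by applying oplog entries
to a snapshot can be illustrated with a toy model in Python. This is
not how MMS Backup is implemented, and the entries below are a
simplified stand-in for real oplog documents: timestamps are plain
integers, and an update replaces the whole document.

.. code-block:: python

   def replay(snapshot, oplog, until_ts):
       """Apply toy oplog entries with ts <= until_ts to a copy of snapshot."""
       state = dict(snapshot)
       for entry in sorted(oplog, key=lambda e: e["ts"]):
           if entry["ts"] > until_ts:
               break
           doc_id = entry["o"]["_id"]
           if entry["op"] in ("i", "u"):   # insert, or whole-document update
               state[doc_id] = entry["o"]
           elif entry["op"] == "d":        # delete
               state.pop(doc_id, None)
       return state

   # Hypothetical snapshot keyed by _id, plus three later operations.
   snapshot = {1: {"_id": 1, "qty": 5}}
   oplog = [
       {"ts": 10, "op": "i", "o": {"_id": 2, "qty": 1}},
       {"ts": 20, "op": "u", "o": {"_id": 1, "qty": 7}},
       {"ts": 30, "op": "d", "o": {"_id": 2}},
   ]
   at_20 = replay(snapshot, oplog, 20)
   at_30 = replay(snapshot, oplog, 30)

Choosing a different ``until_ts`` yields the state at a different
moment, which is the essence of point-in-time recovery from a base
snapshot and a stored oplog.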

If MMS cannot reach the backup agent, MMS automatically sends an
alert. MMS also alerts if the MMS Backup service falls too far behind
the deployment.

See the :mms:`tutorial for enabling backup with MMS
</backup/tutorial/enable-backup-for-replica-set/>` to begin using MMS
Backup.

Sharded Clusters
~~~~~~~~~~~~~~~~

Each shard in a production sharded cluster is a replica set. MMS
Backup treats each shard like all other replica sets.

To create a backup of a sharded cluster, MMS Backup temporarily
disables all :ref:`balancing operations <sharding-balancing>`
and inserts a synchronization token into the oplog of all shards. MMS
Backup uses these tokens to produce snapshots of the cluster.

MMS Backup produces these cluster wide backups on a regular schedule,
and can restore the sharded cluster to one of these points in time.

See the :mms:`tutorial for enabling backup for sharded clusters with MMS
</backup/tutorial/enable-backup-for-sharded-cluster/>` to begin
backing up a sharded cluster with MMS Backup.

Restoration
~~~~~~~~~~~

MMS Backup provides restorations in the form of MongoDB data
files. MMS can either provide these files for download from the
remote MMS servers or push a set of MongoDB data files onto a local
directory on a server.

See :mms:`Restore from a Stored Snapshot
</backup/tutorial/restore-from-snapshot/>` and :mms:`Restore a Replica
set from a Point in the Last 24 Hours
</backup/tutorial/restore-from-point-in-time-snapshot/>`.

Use
~~~

MMS simplifies the set up and configuration of a backup
system. The Backup agent has no dependencies and minimal
configuration. However, the Backup agent must be able to contact the
MMS server. See the :mms:`Backup agent installation
</backup/tutorial/install-and-start-backup-agent/>` and :mms:`Backup
agent configuration </backup/tutorial/configure-backup-agent>`
tutorials for a description of this process.

Because MMS tracks oplog entries, MMS Backup can provide restores for
any point in time within the past 24 hours. Regular snapshots for
replica sets and sharded clusters ensure your ability to restore to
specific earlier states as well.
