A file-backup utility
Creates a directory named after the original file, containing a tarred copy of the file, optionally compressed.
Files are added to the tar archive only if they have changed,
i.e. the modification time is later than that of the last archive and the size (or checksum) is different.
The directory containing tar files is placed in a mirrored directory tree.
Each backup is a separate tar file.
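The layout and the change test described above can be sketched in Python as follows. This is a minimal illustration, not rumar's actual code: the function names and the exact path layout (backup base, then profile, then the file's relative path) are assumptions, and BLAKE2b is used because the settings further down name it as the checksum.

```python
import hashlib
from pathlib import Path


def mirrored_backup_dir(backup_base_dir: Path, profile: str,
                        source_dir: Path, file_path: Path) -> Path:
    """Return a directory named after the file, mirroring its position under source_dir.

    Hypothetical layout: <backup_base_dir>/<profile>/<path relative to source_dir>.
    """
    return backup_base_dir / profile / file_path.relative_to(source_dir)


def blake2b_of(path: Path) -> str:
    """Compute the BLAKE2b hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.blake2b()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_changed(mtime: float, size: int, last_mtime: float, last_size: int,
               checksum=None, last_checksum=None) -> bool:
    """Decide whether a file needs a new backup.

    A file that is not newer than its last backup is never considered changed.
    A newer file is changed if its size differs; if the size is the same,
    the checksums decide, but only when both are supplied.
    """
    if mtime <= last_mtime:
        return False
    if size != last_size:
        return True
    if checksum is not None and last_checksum is not None:
        return checksum != last_checksum
    return False
```

Passing no checksums corresponds to comparing by modification time and size only; a caller would compute `blake2b_of` only for files that are newer but have the same size.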
- Install Python (at least 3.9), if not yet installed
- Download rumar.py
- Download rumar.toml to the same directory as rumar.py
- Edit rumar.toml and adapt it to your needs – see settings details
- Open a console/terminal (e.g. Windows PowerShell) and change to the directory containing rumar.py
- If your installed Python version is below 3.11, install the module tomli, if not yet done:
  python -m pip install tomli
- List the profiles; you should see your profile name(s) printed in the console:
  python rumar.py list-profiles
- Create a backup using the profile "My Documents":
  python rumar.py create --profile "My Documents"
- Add this command to Task Scheduler or cron, to be run at an interval or each day/night
- Do a dry run of the sweep and verify the files to be removed:
  python rumar.py sweep --profile "My Documents" --dry-run
- Remove old backups:
  python rumar.py sweep --profile "My Documents"
- Add this command to Task Scheduler or cron, to be run at an interval or each day/night
Note: when --dry-run is used, rumar.py counts the backup files and selects those to be removed based on settings, but no files are actually deleted.
Unless specified by --toml path/to/your/settings.toml, settings are loaded from rumar.toml in the same directory as rumar.py or located in rumar/rumar.toml inside $XDG_CONFIG_HOME ($HOME/.config if not set) on POSIX, or inside %APPDATA% on NT (MS Windows).
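The lookup described above could be sketched like this. It is only an illustration: the function names are hypothetical, and the order in which the two default locations are tried (script directory first, then the per-user config directory) is an assumption, since the text does not state it.

```python
import os
from pathlib import Path
from typing import Optional


def candidate_settings_paths(script_dir: Path, name: str = "rumar.toml") -> list:
    """List the default locations of the settings file (order is an assumption)."""
    candidates = [script_dir / name]
    if os.name == "nt":
        # MS Windows: %APPDATA%\rumar\rumar.toml
        appdata = os.environ.get("APPDATA")
        if appdata:
            candidates.append(Path(appdata) / "rumar" / name)
    else:
        # POSIX: $XDG_CONFIG_HOME/rumar/rumar.toml, falling back to $HOME/.config
        xdg = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
        candidates.append(Path(xdg) / "rumar" / name)
    return candidates


def find_settings(script_dir: Path, explicit: Optional[str] = None) -> Optional[Path]:
    """Return the explicit --toml path if given, else the first existing default."""
    if explicit is not None:
        return Path(explicit)
    for candidate in candidate_settings_paths(script_dir):
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_settings(Path(__file__).parent)` would return the rumar.toml next to the script when one exists there.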
rumar.toml
# schema version - always 2
version = 2
# settings common for all profiles
backup_base_dir = 'C:\Users\Mac\Backup'
# setting for individual profiles - override any common ones
["My Documents"]
source_dir = 'C:\Users\Mac\Documents'
excluded_top_dirs = ['My Music', 'My Pictures', 'My Videos']
excluded_files_as_glob = ['desktop.ini', 'Thumbs.db']
[Desktop]
source_dir = 'C:\Users\Mac\Desktop'
excluded_files_as_glob = ['desktop.ini', '*.exe', '*.msi']
['# this profile starts with a hash, therefore will be ignored']
source_dir = "this setting won't be loaded"
Profiles which start with a hash # are ignored when rumar.toml is loaded.
version indicates schema version and for now is always 2.
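How such a file might be read can be sketched as follows. This is a minimal illustration rather than rumar's own loader: `load_profiles` is a hypothetical name, and treating every top-level non-table key (except version) as a common setting is an assumption. The import fallback mirrors the installation note about tomli for Python below 3.11.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # python -m pip install tomli

EXAMPLE = r"""
version = 2
backup_base_dir = 'C:\Users\Mac\Backup'

["My Documents"]
source_dir = 'C:\Users\Mac\Documents'

[Desktop]
source_dir = 'C:\Users\Mac\Desktop'

['# this profile starts with a hash, therefore will be ignored']
source_dir = "this setting won't be loaded"
"""


def load_profiles(toml_text: str) -> dict:
    """Parse settings and return {profile name: common settings overridden by the profile}."""
    data = tomllib.loads(toml_text)
    if data.get("version") != 2:
        raise ValueError("expected schema version 2")
    # Top-level scalar/list keys are the settings common to all profiles.
    common = {k: v for k, v in data.items()
              if not isinstance(v, dict) and k != "version"}
    profiles = {}
    for name, overrides in data.items():
        if not isinstance(overrides, dict):
            continue  # a common setting, not a profile table
        if name.startswith("#"):
            continue  # profiles starting with a hash are ignored
        profiles[name] = {**common, **overrides}
    return profiles


profiles = load_profiles(EXAMPLE)
print(sorted(profiles))
```

Each resulting profile carries backup_base_dir from the common section alongside its own source_dir, and the hash-prefixed table is dropped.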
- backup_base_dir: str (used by: create, sweep)
  path to the base directory used for backup; usually set in the global space, common for all profiles
  backup dir for each profile is constructed as backup_base_dir + profile, unless backup_base_dir_for_profile is set, which takes precedence
- backup_base_dir_for_profile: str (used by: create, sweep)
  path to the base dir used for the profile; usually left unset; see backup_base_dir
- archive_format: Literal['tar', 'tar.gz', 'tar.bz2', 'tar.xz'] = 'tar.gz' (used by: create, sweep)
  format of archive files to be created
- compression_level: int = 3 (used by: create)
  for the formats 'tar.gz', 'tar.bz2', 'tar.xz': compression level from 0 to 9
- no_compression_suffixes_default: str = '7z,zip,zipx,jar,rar,tgz,gz,tbz,bz2,xz,zst,zstd,xlsx,docx,pptx,ods,odt,odp,odg,odb,epub,mobi,png,jpg,gif,mp4,mov,avi,mp3,m4a,aac,ogg,ogv,kdbx' (used by: create)
  comma-separated string of lower-case suffixes for which to use uncompressed tar
- no_compression_suffixes: str = '' (used by: create)
  extra lower-case suffixes in addition to no_compression_suffixes_default
- tar_format: Literal[0, 1, 2] = tarfile.GNU_FORMAT (used by: create)
  Double Commander fails to correctly display mtime when PAX is used, therefore GNU is the default
- source_dir: str (used by: create)
  path to the directory which is to be archived
- included_top_dirs: list[str] (used by: create, sweep)
  a list of paths
  if present, only files from those dirs and their descendant subdirs will be considered, together with included_files_as_glob
  the paths can be relative to source_dir or absolute, but always under source_dir
  if missing, source_dir and all its descendant subdirs will be considered
- excluded_top_dirs: list[str] (used by: create, sweep)
  like included_top_dirs, but for exclusion
- included_dirs_as_regex: list[str] (used by: create, sweep)
  a list of regex patterns, applied after ..._top_dirs and dirnames of ..._files_as_glob
  if present, only matching directories will be included
  / must be used as the path separator, also on MS Windows
  the patterns are matched against a path relative to source_dir
  the first segment in the relative path (to match against) also starts with a slash
  e.g. ['/B$',] will match any basename equal to B, at any level
  regex-pattern matching is case-sensitive – use (?i) at each pattern's beginning for case-insensitive matching
  see also https://docs.python.org/3/library/re.html
- excluded_dirs_as_regex: list[str] (used by: create, sweep)
  like included_dirs_as_regex, but for exclusion
- included_files_as_glob: list[str] (used by: create, sweep)
  a list of glob patterns, also known as shell-style wildcards, i.e. * ? [seq] [!seq]
  if present, only matching files will be considered, together with files from included_top_dirs
  the paths/globs can be partial, relative to source_dir or absolute, but always under source_dir
  e.g. ['My Music\*.m3u'] (single quotes, because a backslash in a double-quoted TOML string starts an escape sequence)
  on MS Windows, glob-pattern matching is case-insensitive
  caution: a leading path separator in a path/glob indicates a root directory, e.g. ['\My Music\*'] means C:\My Music\* or D:\My Music\* but not C:\Users\Mac\Documents\My Music\*
  see also https://docs.python.org/3/library/fnmatch.html and https://en.wikipedia.org/wiki/Glob_(programming)
- excluded_files_as_glob: list[str] (used by: create, sweep)
  like included_files_as_glob, but for exclusion
- included_files_as_regex: list[str] (used by: create, sweep)
  like included_dirs_as_regex, but for files
  applied after ..._top_dirs and ..._dirs_as_regex and ..._files_as_glob
- excluded_files_as_regex: list[str] (used by: create, sweep)
  like included_files_as_regex, but for exclusion
- checksum_comparison_if_same_size: bool = False (used by: create)
  when False, a file is considered changed if its mtime is later than the latest backup's mtime and its size changed
  when True, BLAKE2b checksum is calculated to determine if the file changed despite having the same size
  mtime := time of last modification
  see also https://en.wikipedia.org/wiki/File_verification
- file_deduplication: bool = False (used by: create)
  when True, an attempt is made to find and skip duplicates
  a duplicate file has the same suffix and size and part of its name, case-insensitive (suffix, name)
- min_age_in_days_of_backups_to_sweep: int = 2 (used by: sweep)
  only the backups which are older than the specified number of days are considered for removal
- number_of_backups_per_day_to_keep: int = 2 (used by: sweep)
  for each file, the specified number of backups per day is kept, if available
  more backups per day might be kept to satisfy number_of_backups_per_week_to_keep and/or number_of_backups_per_month_to_keep
  oldest backups are removed first
- number_of_backups_per_week_to_keep: int = 14 (used by: sweep)
  for each file, the specified number of backups per week is kept, if available
  more backups per week might be kept to satisfy number_of_backups_per_day_to_keep and/or number_of_backups_per_month_to_keep
  oldest backups are removed first
- number_of_backups_per_month_to_keep: int = 60 (used by: sweep)
  for each file, the specified number of backups per month is kept, if available
  more backups per month might be kept to satisfy number_of_backups_per_day_to_keep and/or number_of_backups_per_week_to_keep
  oldest backups are removed first
- commands_which_use_filters: list[str] = ['create'] (used by: create, sweep)
  determines which commands can use the filters specified in the included_* and excluded_* settings
  by default, filters are used only by create, i.e. sweep considers all created backups (no filter is applied)
  a filter for sweep could be used to e.g. never remove backups from the first day of a month:
  excluded_files_as_regex = ['/\d\d\d\d-\d\d-01_\d\d,\d\d,\d\d\.\d{6}(\+|-)\d\d,\d\d\~\d+(~.+)?.tar(\.(gz|bz2|xz))?$']
  it's best when the setting is part of a separate profile, i.e. a copy made for sweep, otherwise create will also seek such files to be excluded
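The regex settings above can be tried out with a short Python sketch. It assumes the patterns are applied with re.search to a slash-separated relative path that begins with a slash, which is consistent with the '/B$' example but is not confirmed here. The helper name and the sample paths, including the archive file name, are made up for illustration and only constructed to fit the patterns.

```python
import re


def matches_any(patterns, rel_path: str) -> bool:
    """Return True if any pattern is found in the slash-prefixed relative path."""
    return any(re.search(pattern, rel_path) for pattern in patterns)


# '/B$' matches a directory whose basename is exactly B, at any level.
print(matches_any(['/B$'], '/B'))          # top-level B
print(matches_any(['/B$'], '/A/B'))        # nested B
print(matches_any(['/B$'], '/A/AB'))       # basename AB, not B
print(matches_any(['/B$'], '/A/b'))        # case-sensitive by default
print(matches_any(['(?i)/B$'], '/A/b'))    # (?i) makes the pattern case-insensitive

# The first-of-month pattern from the sweep example, copied verbatim.
FIRST_OF_MONTH = r'/\d\d\d\d-\d\d-01_\d\d,\d\d,\d\d\.\d{6}(\+|-)\d\d,\d\d\~\d+(~.+)?.tar(\.(gz|bz2|xz))?$'

# Hypothetical archive paths, shaped only to exercise the pattern.
first_day = '/report.txt/2024-03-01_12,30,45.123456+01,00~1024.tar.gz'
second_day = '/report.txt/2024-03-02_12,30,45.123456+01,00~1024.tar.gz'
print(matches_any([FIRST_OF_MONTH], first_day))
print(matches_any([FIRST_OF_MONTH], second_day))
```

Testing a pattern this way before putting it into a sweep profile is a cheap complement to running sweep with --dry-run.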
Version 1 contained sha256_comparison_if_same_size.
In version 2 it's checksum_comparison_if_same_size.
Logging is controlled by settings located in rumar/rumar.logging.toml inside $XDG_CONFIG_HOME ($HOME/.config if not set) on POSIX, or inside %APPDATA% on NT (MS Windows).
By default, rumar.log is created in the current directory (where rumar.py is executed).
To disable the creation of rumar.log, copy the below to rumar.logging.toml in the appropriate location and put a hash # in front of "to_file", in [loggers.rumar].
version = 1
[formatters.f1]
format = "{levelShort} {asctime}: {funcName:24} {msg}"
style = "{"
validate = true
[handlers.to_console]
class = "logging.StreamHandler"
formatter = "f1"
#level = "DEBUG_14"
[handlers.to_file]
class = "logging.FileHandler"
filename = "rumar.log"
encoding = "UTF-8"
formatter = "f1"
#level = "DEBUG_14"
[loggers.rumar]
handlers = [
"to_console",
"to_file",
]
level = "DEBUG_14"
More information: https://docs.python.org/3/library/logging.config.html#logging-config-dictschema
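Since the file follows the dictionary schema linked above, its effect can be reproduced in plain Python. The sketch below applies a dictionary of the same shape with the "to_file" handler left out of the logger, which is what commenting it out amounts to. It is an approximation under stated assumptions: the standard field names levelname and message and the standard level DEBUG stand in for levelShort, msg and DEBUG_14, which are not defined by the standard library and are presumably set up by rumar itself.

```python
import logging
import logging.config

# Same structure as rumar.logging.toml, expressed as a Python dict.
# Assumption: standard names replace rumar's custom levelShort / DEBUG_14.
CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "f1": {
            "format": "{levelname} {asctime}: {funcName:24} {message}",
            "style": "{",
            "validate": True,
        },
    },
    "handlers": {
        "to_console": {"class": "logging.StreamHandler", "formatter": "f1"},
    },
    "loggers": {
        # "to_file" is not listed, so no log file is created.
        "rumar": {"handlers": ["to_console"], "level": "DEBUG"},
    },
}

logging.config.dictConfig(CONFIG)
logger = logging.getLogger("rumar")
logger.debug("console only")
```

With this configuration, messages go to the console handler only.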
Copyright © 2023, 2024 macmarrum
SPDX-License-Identifier: GPL-3.0-or-later