IT/Software/Backup Programs/Borg Backup

Latest revision as of 14:38, 1 May 2024

About

Borg Backup is a backup program that features deduplication and data compression, and runs nicely over SSH.

https://www.borgbackup.org/

This page is incomplete.

Setup

BorgBackup must be installed on both the client and the server machine to perform remote backups. This is a good thing, because backups then require much less bandwidth and are much less latency-dependent.

To set up the client, simply install borgbackup. It is in the Ubuntu repositories. The packaged version is a little out of date in 22.04 (version 1.2.0; the latest stable at the time of writing is 1.2.7, and 23.04 onward have development version 2.0), but this is not an issue.

To set up the server, there are a few more steps.

It is recommended to create a dedicated user for borg for security.
Set up passwordless ssh login for that user using a key file from the client machine. This is not strictly necessary, but makes things much easier.
For security again, add the following to the beginning of the authorized_keys entry that contains the client public key.

command="borg serve --restrict-to-repository /path/to/repo",restrict

It should look like this:

command="borg serve --restrict-to-repository /path/to/repo",restrict ssh-rsa AbCgnbiuorgurigt743GREG4r43d...B3= username@clienthostname

This forces any login using that private key to run the command borg serve, which disallows any other commands.
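
As a minimal sketch, the entry can also be assembled by a small shell function instead of being edited by hand. The function name make_borg_authorized_key and the key file name in the usage comment are hypothetical; the command= prefix is the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: print a restricted authorized_keys entry for one client key.
# $1 = repository path on the server, $2 = the client's public key line.
make_borg_authorized_key() {
    repo_path=$1
    public_key=$2
    printf 'command="borg serve --restrict-to-repository %s",restrict %s\n' \
        "$repo_path" "$public_key"
}

# Example use on the server, as the dedicated borg user (paths are placeholders):
#   make_borg_authorized_key /path/to/repo "$(cat client_key.pub)" >> ~/.ssh/authorized_keys
```

This sketch assumes the repository path contains no spaces or double quotes; such paths would need extra quoting inside the command= string.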

See the borg serve docs and the borg Hosting repositories docs for more possible configurations.

Finally, run one of the following to initialize the repository.

# On the server
borg init -e=none /path/to/repo 

# On the client
borg init -e=none ssh://username@serverhostname/path/to/repo

The repo is set up and the connection is established between the server and client. Now you just need to set up a cron job to perform a backup.

TODO: cronjob to backup
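
Until this is filled in, a minimal sketch of an hourly cron entry might look like the following. The wrapper script path and log file path are assumptions, not existing files; the wrapper would contain the borg create and borg prune commands described later on this page.

```shell
#!/bin/sh
# Assumption: the backup commands live in a wrapper script at this hypothetical path.
BACKUP_SCRIPT=/usr/local/bin/borg-backup.sh
LOG_FILE=/var/log/borg-backup.log

# Crontab fields: minute hour day-of-month month day-of-week command.
# "0 * * * *" runs at minute 0 of every hour.
CRON_LINE="0 * * * * $BACKUP_SCRIPT >> $LOG_FILE 2>&1"

# Print the line so it can be reviewed before installing it.
printf '%s\n' "$CRON_LINE"

# To install it for the current user (left commented out on purpose):
#   ( crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE" ) | crontab -
```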

Usage

Terms

Repository: A place where backup archives can be stored. From a low-level perspective, it's a folder on a server. It contains one archive per backup, but not one file per archive; the structure of the repository folder can only be understood through borg.
- The files in it actually hold chunks of deduplicated data.
Archive: A single full backup. Can be thought of as a snapshot of a point in time. Use borg list to see archives in a repository. An archive's name is often something of the form "msgcnxfiles 2024-05-01 12:00".
- A single repo can contain archives of completely different folders. It's not a bad idea to do so either, because borg's deduplication can save space between them.

Commands

There are several important commands to understand when using borg. Note that all options (arguments with a - like -s or --progress) MUST come before or after positional arguments such as a repo URL, and not between.

Other borg commands exist; these are only the most commonly used. Use borg help or see the documentation for exhaustive information.

init

This command is used to create a new borg repository. The repository can be created anywhere you have access, such as on a locally mounted disk or on a remote borg instance over ssh. The syntax is simple, but requires the -e flag for encryption settings. Choose none for no encryption, or repokey for standard AES-256 encryption (authenticated with HMAC-SHA256). The examples all use no encryption, as we don't require it.

# At-a-glance syntax
borg init -e=none <repository-location>

# To create a repository in a directory on the local machine. 
# The parent directory should already exist, or supply --make-parent-dirs to borg
borg init -e=none /path/to/repo

# To create a repository on a remote machine
borg init -e=none ssh://username@hostname/path/to/repo 

# To create a repository on a remote machine relative to the user's home directory
borg init -e=none ssh://user@hostname/~/Documents/repository

Creating a local repository from machine A is equivalent to creating a remote repository on machine A from machine B over ssh.
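
For scripted setups, a hedged sketch of an "initialize only if missing" check is shown below. The function name ensure_repo is hypothetical, and the check simply treats any failure of borg info as "no repository yet", which is an assumption: a network or permission problem would also land in the init branch and fail there.

```shell
#!/bin/sh
# Hypothetical helper: initialize a repository only if borg cannot open it yet.
# $1 = repository location (local path or ssh:// URL).
ensure_repo() {
    repo=$1
    if borg info "$repo" >/dev/null 2>&1; then
        echo "repository already exists: $repo"
    else
        borg init --encryption=none "$repo"
    fi
}

# Example (placeholder location):
#   ensure_repo ssh://username@hostname/path/to/repo
```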

create

This command is used to create a new backup inside an existing repository. Specify the intended compression level, the target repository, and the source location. Paths are stored in the repository archive exactly as they are written in the create command. This means that the command

borg create ssh://user@hostname/~/borgrepository::backup-name ./Documents/backupfolder

will take the relative folder Documents/backupfolder and store it in the repository with the name "backup-name" under the path Documents/backupfolder. This means thought should be put into what working directory borg is run from.
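
One way to make the working directory explicit is to wrap the call in a small function that changes directory first. This is only a sketch; the name create_from is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run "borg create" from a chosen base directory so the
# paths stored in the archive are relative to that directory.
# $1 = repo::archive, $2 = base directory, remaining args = paths relative to it.
create_from() {
    archive=$1
    base_dir=$2
    shift 2
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$base_dir" && borg create "$archive" "$@" )
}

# Example (placeholders): files are stored under "Documents/backupfolder/..."
#   create_from 'ssh://user@hostname/~/borgrepository::backup-name' "$HOME" Documents/backupfolder
```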

# At-a-glance syntax
borg create ssh://user@hostname/~/Documents/repository::backupfolder-{now} backupfolder

# Options:
# --list Print out each file as it is processed
# --exclude Exclude a glob. Ex. Do not backup .vdi files: --exclude '*.vdi'
# -C | --compression <compalg>,<level> Compression. Possible algorithms are listed by `borg help compression`:
#     lz4        high speed, low compression
#     zstd       variable; speed and ratio depend on the level given
#     zlib       medium (also has levels)
#     lzma       low speed, high compression
#     auto       decides for each chunk whether compressing is worthwhile (used as auto,<compalg>)
#     obfuscate  useful when using encryption
#   In our experience lzma is the most efficient for our internet speeds and CPU usage.
# --progress Print progress
# --stats Print stats after running on the repository
# --paths-from-stdin Not usually useful but allows specifying specific files to backup. Can be useful with find. See borg docs for info. 
# -n | --dry-run Performs all steps except for actually making the archive. Useful for testing.

# Typical backup. Prints progress information and afterwards statistics regarding how the size of your backup changed
borg create -C auto,lzma --progress --stats ssh://user@hostname/~/Documents/repository::foldertobackup-{now} ./foldertobackup

You can use several placeholders in the archive name (the part after the ::) that borg will fill in.

{hostname} - the hostname of the source server
{user} - username of the source server
{now} - Current time and date
{now:%Y-%m-%dT%H:%M:%S} - Current time and date in format 2024-05-01T12:38:15
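
As a small sketch, the placeholders can be combined into one archive name and kept in a helper so every backup uses the same naming scheme. The function name archive_spec is hypothetical; the placeholders are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: append a placeholder-based archive name to a repository.
# Single quotes keep the braces and % signs literal, so borg (not the shell)
# expands the placeholders when the command runs.
archive_spec() {
    repo=$1
    printf '%s::%s\n' "$repo" '{hostname}-{user}-{now:%Y-%m-%dT%H:%M:%S}'
}

# Example (placeholders):
#   borg create "$(archive_spec ssh://user@hostname/~/Documents/repository)" backupfolder
```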

There are further options for backing up raw devices, for backing up the other direction using an sshfs, etc. See borg docs for this information.

extract

Extract extracts the contents of an archive to the current working directory. Always cd to the intended location before running borg extract. Use --progress for extra information, but keep in mind this requires an extra pass over the metadata and will make the process a little slower.

# At-a-glance syntax
borg extract ssh://user@hostname/~/Documents/repository::backupfolder

# List all files while processing
borg extract --list ssh://user@hostname/~/Documents/repository::backupfolder

# Dry run; do not actually write files
borg extract -n ssh://user@hostname/~/Documents/repository::backupfolder

# Extract only "src" directory
borg extract ssh://user@hostname/~/Documents/repository::backupfolder path/to/src

# Extract "src" directory, excluding vdi files
borg extract ssh://user@hostname/~/Documents/repository::backupfolder path/to/src --exclude '*.vdi'
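
Because extract always writes into the current working directory, it can help to wrap it so the target directory is named explicitly. A minimal sketch, with the hypothetical function name restore_to:

```shell
#!/bin/sh
# Hypothetical helper: extract an archive (or selected paths) into a target
# directory, creating it first if needed.
# $1 = repo::archive, $2 = target directory, remaining args = paths/options.
restore_to() {
    archive=$1
    target_dir=$2
    shift 2
    mkdir -p "$target_dir" || return 1
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$target_dir" && borg extract "$archive" "$@" )
}

# Example (placeholders):
#   restore_to 'ssh://user@hostname/~/Documents/repository::backupfolder' /tmp/restore path/to/src
```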

list

Print out all archives in a repository, or files inside an archive.

# At-a-glance syntax to print archives in a repository
borg list ssh://user@hostname/~/Documents/repository

# At-a-glance syntax to print files in an archive
borg list ssh://user@hostname/~/Documents/repository::backupfolder

# Custom format (something like "-rw-rw-r-- user   user    1416192 Sun, 2015-02-01 11:00:00 code/myproject/file.ext")
#                      permissions - 6 space pad user - 6 space pad group - 8 digits of size - iso time - path - newline
borg list ssh://user@hostname/~/Documents/repository::backupfolder --format="{mode} {user:6} {group:6} {size:8d} {isomtime} {path}{NEWLINE}"
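
A custom format also makes the output easy to post-process. The sketch below, with the hypothetical function name largest_files, prints only size and path and sorts on the size to find the biggest files in an archive; it assumes paths contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: show the largest files in an archive.
# $1 = repo::archive, $2 = how many entries to show (default 10).
largest_files() {
    archive=$1
    count=${2:-10}
    # One "<size> <path>" line per file, sorted numerically, largest first.
    borg list --format '{size} {path}{NEWLINE}' "$archive" | sort -rn | head -n "$count"
}

# Example (placeholders):
#   largest_files 'ssh://user@hostname/~/Documents/repository::backupfolder' 20
```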

prune

Prune the repository, deleting unneeded archives. This is usually used by an automated script. No disk space is actually freed until borg compact is run.

Acceptable time units for --keep-within are

H hours
d days
w weeks
m months
y years

# Keep 7 end of day and 4 additional end of week archives.
# Do a dry-run without actually deleting anything.
borg prune -v --list --dry-run --keep-daily=7 --keep-weekly=4 /path/to/repo

# Same as above but only apply to archive names starting with the hostname
# of the machine followed by a "-" character:
borg prune -v --list --keep-daily=7 --keep-weekly=4 --glob-archives='{hostname}-*' /path/to/repo
# actually free disk space:
borg compact /path/to/repo

# Keep 7 end of day, 4 additional end of week archives,
# and an end of month archive for every month:
borg prune -v --list --keep-daily=7 --keep-weekly=4 --keep-monthly=-1 /path/to/repo

# Keep all backups in the last 10 days, 4 additional end of week archives,
# and an end of month archive for every month:
# (A negative value such as -1 means "no limit", so one archive is kept for every month.)
borg prune -v --list --keep-within=10d --keep-weekly=4 --keep-monthly=-1 /path/to/repo

compact

Shrink the repository and remove unused data. Does not need to be run after every borg command but should be run periodically. Makes sense to run after every prune. Only bothers to compact segments that would free up at least {threshold} (default 10%) of their space if compacted.

# Basic compact (takes a repository, not an archive)
borg compact ssh://user@hostname/~/Documents/repository

# Compact all segments where saved space would be more than 5%, and print an estimate of freed space
borg compact --threshold 5 --verbose ssh://user@hostname/~/Documents/repository

info

Prints out information about a repository. Helpful to see information such as size of all archives combined, size of the latest archive, size on disk, etc.

# Repository info
borg info ssh://user@hostname/~/Documents/repository

# Archive info
borg info ssh://user@hostname/~/Documents/repository::backupfolder

# Info about the most recent archive
borg info ssh://user@hostname/~/Documents/repository --last 1

mount

Mount an archive or a repository as a FUSE filesystem. Very useful for seeing exactly what is in an archive. Note that fuse.borgfs is an alias command usable in fstab to do the same thing.

By default archive file permission usernames will be mapped to system usernames by name. Use --numeric-ids to use existing id numbers instead of text name mapping. Use -o uid=1000,gid=1000 to choose a user id and group id for all files.

By default the path is mounted and the command closes. Use --foreground to keep the command alive until the device is unmounted. Use system umount to unmount the directory.

# Basic usage to mount an archive
borg mount ssh://user@hostname/~/Documents/repository::backupfolder /mnt

# Basic usage to mount a repo. Note the archives are lazyloaded and not processed until requested. Expect a delay when opening each archive's folder. 
borg mount ssh://user@hostname/~/Documents/repository /mnt
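
For a quick look inside an archive, mounting and unmounting can be paired in one function so the mount point is never left behind. This is a sketch only; the function name peek_archive is hypothetical, and it uses borg umount, borg's own counterpart to the system umount.

```shell
#!/bin/sh
# Hypothetical helper: mount an archive on a temporary directory, list its top
# level, then unmount and clean up again.
# $1 = repo::archive
peek_archive() {
    archive=$1
    mount_dir=$(mktemp -d) || return 1
    if ! borg mount "$archive" "$mount_dir"; then
        rmdir "$mount_dir"
        return 1
    fi
    ls -la "$mount_dir"
    # "borg umount" is borg's own unmount command; system umount also works.
    borg umount "$mount_dir"
    rmdir "$mount_dir"
}

# Example (placeholders):
#   peek_archive 'ssh://user@hostname/~/Documents/repository::backupfolder'
```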

serve

This command should never be used manually. It begins a borg server process for storing repositories so that it can be connected to by a client.

See borg docs for more info, but most often it is just

borg serve

# Only allow one repository
borg serve --restrict-to-repository /path/to/repo

See Setup above on how to use this to securely lock a user account to only be able to use borg through ssh.

Creating a backup

When creating a backup we need to specify what kind of compression we want to use, where the repo we want to backup to is, and where the source files we want to backup are located.

borg create -C auto,lzma --progress repo/location/::name-of-backup location/to/be/backed/up

LZMA compression uses more CPU and less storage space.

The name of a backup must be unique, so when automating backups it is better to generate the name with the date command than to use a static name.

... repo/location/::`date '+%Y-%m-%d-%H.%M.%S'` location/to/be/backed/up

Backing up over SSH

In all Borg commands we can use ssh://ip.of.server/repo/location/on/server.

borg create -C auto,lzma --progress ssh://my.backup.server/repo/location/::name-of-backup location/to/be/backed/up

Viewing a repo's backups

To list all the backups in a repo we can run the following:

borg list /path/to/repo

Restoring from a backup

There are two good ways to restore a borg backup. The first is through borg extract, and the second is through borg mount.

All you need is to have borg installed, and filesystem access to the borg repository directory.

borg list path/to/repo

Run borg list to see what archives you have. Now you can take advantage of that archive.

If you're unsure of what you want, use borg mount.

We can mount a Borg backup as if it was a regular drive anywhere in the filesystem.

borg mount /path/to/repo/::archiveName /mnt

We can pull files from the backup as if it were a regular drive.

To unmount the backup we can run:

umount /mnt

This works, but if you know exactly what you want, borg extract is faster. Remember to cd into the directory where you want the files to go beforehand.

cd msgcnxFiles && borg extract /path/to/repo::archiveName msgcnxFiles/ # Extracts just the msgcnxFiles directory
cd allData && borg extract /path/to/repo::archiveName # Extracts everything in the archive

Pruning old backups

By default Borg will keep backups forever.

We can prune backups by running borg prune.

borg prune -v --list --keep-hourly=48 --keep-daily=30 --keep-monthly=12 /path/to/repo/

This example assumes a backup job is running hourly. It keeps 1 backup per hour for the past 48 hours, 1 backup per day for the past 30 days, and 1 backup per month for the past 12 months.

Borg keeps the most recent backup from each time period it is pruning.

In the example we would keep the backup run at 23:00 on each of the past 30 days, and the last backup of each month for the monthly rule.

Backup Scripting

Example:

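#!/bin/bash
# Back up from inside the source directory so archive paths are relative to it
cd /location/to/be/backed/up || exit 1
borg create -C auto,lzma --progress "/path/to/repo/::$(date '+%Y-%m-%d-%H.%M.%S')" .
# Thin out old archives, then actually free the disk space
borg prune -v --list --keep-hourly=48 --keep-daily=30 --keep-monthly=12 /path/to/repo/
borg compact /path/to/repo/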
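
For unattended runs it is useful to know whether any step failed. The sketch below is one possible variant, not a drop-in replacement: the function name run_backup is hypothetical, and it assumes the BORG_REPO environment variable is exported so the repository can be left out of each command. It relies on borg's exit codes (0 success, 1 warning, 2 error) and returns the worst one seen.

```shell
#!/bin/sh
# Hypothetical helper: create, prune and compact, returning the worst exit code.
# Assumption: BORG_REPO is exported, so "::name" refers to that repository.
# $1 = directory to back up.
run_backup() {
    src_dir=$1
    worst_rc=0

    # Back up from inside the source directory so stored paths are relative.
    # A missing directory is reported as an error (2).
    ( cd "$src_dir" || exit 2
      borg create -C auto,lzma --stats '::{hostname}-{now}' . ) || worst_rc=$?

    step_rc=0
    borg prune -v --list --keep-hourly=48 --keep-daily=30 --keep-monthly=12 || step_rc=$?
    if [ "$step_rc" -gt "$worst_rc" ]; then worst_rc=$step_rc; fi

    # prune only marks data as unused; compact actually frees the space.
    step_rc=0
    borg compact || step_rc=$?
    if [ "$step_rc" -gt "$worst_rc" ]; then worst_rc=$step_rc; fi

    return "$worst_rc"
}

# Example (placeholders):
#   export BORG_REPO=/path/to/repo
#   run_backup /location/to/be/backed/up || echo "backup finished with status $?" >&2
```

Returning the highest code means a cron wrapper can distinguish a clean run from one that finished with warnings or errors without parsing borg's output.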