Backups Everywhere!

A few years ago we lost a number of virtual servers due to an irrecoverable hardware issue at a local VPS provider. Those were managed servers, and as it turned out our provider (who shall remain nameless) had neglected to perform the daily backups that were in our service agreement. Whoops.

Fortunately we hadn’t trusted them completely, and we had copies of everything downloaded to one of our work computers. It took us a while, but we did get everything up and running again with minimal loss of data.

Lesson learned: Don’t rely on anyone else to handle your backups for you. Assume that any tools and services you use have flaws, and plan accordingly. This article describes the various backup tools that we use today, on the premise that what we’ve arrived at through trial and error may work for you also.

Linux Backup Tools

We use Linux virtual servers extensively, usually provided by either Linode or DigitalOcean. Our VPSs run either CloudLinux (if we need cPanel on the server) or Ubuntu.

We generally use two different methods to back up the servers that store user data, and at least one of those methods uses offsite storage from a different provider. The tools that we rely on are:

  • VPS snapshots
  • R1Soft
  • rsync
  • JetBackup (for cPanel servers)
  • Borg

None of these is perfect on its own, but by combining them we feel pretty confident that we always have at least one good backup if we need it.

VPS Snapshots

Most VPS vendors (including Linode and DigitalOcean) provide a simple backup solution that stores snapshots of a VPS at regular intervals. Building a new VPS from a snapshot takes on the order of a few minutes depending on the size of the virtual server, so recovering a failed VPS can be done quickly from this type of backup. VPS providers typically charge around 20-25% of the base VPS price for this service.

We use this type of backup on a few virtual servers, primarily application servers with non-standard setups that don’t store data but might take a while to rebuild should they crash. But otherwise we have found VPS snapshot backups to be of limited value to ourselves and our customers. When we need to restore something from backup it is usually individual files or databases, and that would require us to first spin up a new VPS from a snapshot and then extract the files that we need manually. Restoring an entire VPS is not something that we need to do very often.

Another problem is that databases that are backed up in this way are unlikely to be in a consistent state if restored from the snapshot. Some databases (e.g. PostgreSQL) are crash resistant and would probably heal themselves, but MySQL, SQLite and others may not be usable. You can deal with this by using tools such as mysqldump to dump the databases to a file before the backup runs, but in most cases you have limited control over the backup schedule. For example, Linode will let you specify a two-hour interval when the snapshotting should occur, but they will not always hit that window, so your database dump may be several hours out of sync with the file backup.
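
As a rough illustration of the dump-before-backup step, here is a minimal Python sketch. It assumes that mysqldump is on the PATH and that credentials come from an option file such as ~/.my.cnf; the function names and the dump directory are hypothetical, and this is not the exact script we run.

```python
import subprocess
from pathlib import Path


def dump_command(database: str) -> list[str]:
    """Build a mysqldump command line for one database."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent view of InnoDB tables, no long locks
        "--routines",            # include stored procedures and functions
        database,
    ]


def dump_database(database: str, dump_dir: Path) -> Path:
    """Dump `database` to <dump_dir>/<database>.sql and return that path."""
    dump_dir.mkdir(parents=True, exist_ok=True)
    target = dump_dir / f"{database}.sql"
    partial = dump_dir / f"{database}.sql.partial"
    with partial.open("wb") as out:
        # check=True raises CalledProcessError if mysqldump exits non-zero,
        # so a failed dump never replaces the last good one.
        subprocess.run(dump_command(database), stdout=out, check=True)
    partial.replace(target)
    return target
```

Scheduling something like dump_database("shop", Path("/var/backups/db")) from cron shortly before the snapshot window is the part that remains fragile, for the reasons given above.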

While VPS snapshots look tempting at first glance from a disaster recovery perspective, we don’t think that they add much value in our case. Provisioning a new VPS is usually a simple process that can be scripted, and files and databases can then be restored through other means.

Snapshots might have helped us recover in the scenario described at the beginning of this article, where we completely lost several virtual servers. But in reality we think it most likely that the snapshots would have been stored on the same storage component that held the virtual servers themselves, so they would probably have been lost along with everything else.

R1Soft

R1Soft is a proprietary backup system that works on the disk block level, bypassing the file system on the client server. It consists of a backup server that pulls backups from the client servers at regular intervals, and it provides a web-based interface for backup management.

R1Soft licenses can be purchased from a number of resellers, and you can choose to run the backup server and manager on your own hardware or let someone else host it for you, sharing resources with their other customers. The license price for a VPS is around $5/month. We get our licences from https://cdplicenses.com/

We have been using R1Soft for several years on our own hardware and on backup servers shared with others, and overall we have found it to be fast and reliable. R1Soft does incremental backups, and you have a great deal of flexibility in specifying backup and retention schedules. Data compression using zlib or QuickLZ and AES 256 encryption are optional. File restores are also very quick and work as you would expect.

R1Soft is not without problems, however, as you will quickly discover if you perform a search. Some of those issues pertain to older versions and no longer exist, and others are not so relevant to our use case.

The main issue for us is that restoring individual MySQL or MariaDB databases directly from R1Soft backups just doesn’t work. In theory database restores are supported if the backup policy is correctly configured, but in practice they are not. What happens is that R1Soft tries to do a restore for half an hour or so, then logs an error message and aborts.

We’ve given up on this one. Instead of relying on R1Soft’s database backup module we just dump the databases using mysqldump a few minutes before the backup is scheduled to run. If a database restore is required and we must use the R1Soft backup, we just restore the dump file to the server and then import the database from that. The requirement to maintain the timing between the dump on the client and the backup schedule makes this somewhat brittle, but it works for now. Hopefully the underlying issue gets fixed eventually.

The centralised architecture makes backup management and monitoring easy, compared to backup systems where you have to log in to each individual server to get a status. R1Soft can be configured to send a backup status email at regular intervals – we get the report every morning, and we usually only have to read the subject line to verify that everything is working as expected.

There is no direct access from the client VPS to the backups stored on the backup server, which can be an advantage from a security perspective since an attacker that gains access to a server cannot easily wipe or otherwise compromise the backups. On the flip side, a compromised backup server could potentially give an attacker a way into all your servers. Worth considering, especially if you choose to share a backup server with others.

We use R1Soft on almost all our servers. It is not necessarily the one we go to first if we need to do a restore, but it is the one we rely on the most. On each server we then pick one of the other backup systems as our secondary, according to what we think is most appropriate and cost effective.

The R1Soft product has changed owner a number of times over the past few years, most recently to Continuum in 2014. It seems that Continuum is mostly interested in using R1Soft as part of their various managed offerings, so it may not have a bright future as a stand-alone product. Time will tell.

rsync

rsync is a file transfer utility that has been around since the mid-1990s on Unix, Linux and many other operating systems. rsync is designed to maintain synchronised copies of file systems in two different locations (local or remote) in a very efficient way.

The specific behaviour of rsync depends on the command line arguments, but for backup purposes you would initially transfer copies of all your files over SSH to a remote file system. The next time rsync is run, only files that have been modified would be transferred. In our typical use case that means only a few tens or hundreds out of several thousand files, which drastically reduces the bandwidth and time costs of performing a backup.

rsync only gives you the means to create a mirror copy of a file system. It does not “remember” what the file system looked like at earlier points in time, which is something that you usually want from a backup solution (i.e. the ability to restore files or databases from, say, yesterday or a week ago). However, various backup utilities have been built around rsync to add versioning support. One such backup utility is rsnapshot. (Apple’s Time Machine backup system for macOS uses a similar technique.)

rsnapshot and similar tools exploit the fact that file names in Linux and related operating systems are really just hard links to an underlying file system object which is referenced by an inode. Other hard links can be created to the same inode, making it appear as if multiple copies of the same file exist, where in reality there is only a single file system object referenced multiple times.
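
The hard link behaviour is easy to see for yourself. The following small Python sketch (file names are made up for the example) creates a file, adds a second hard link to it, and then removes the original name. It assumes a file system that supports hard links, which is the case for the usual Linux file systems.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "report.txt")
    alias = os.path.join(tmp, "report-link.txt")

    with open(original, "w") as fh:
        fh.write("quarterly numbers\n")

    os.link(original, alias)  # a second name for the same inode

    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    link_count = os.stat(original).st_nlink  # two names, one object

    os.remove(original)  # removes one name; the data is still reachable
    with open(alias) as fh:
        surviving = fh.read()
    remaining_links = os.stat(alias).st_nlink

print(same_inode, link_count, remaining_links, surviving.strip())
```

The data only disappears once the last remaining name is removed, which is the property that the hard-link based backup tools rely on.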

Such tools generally work by first creating a standard rsync mirror copy of the files that are being backed up. On the next run, the tool begins by creating a hard-linked copy of all the files on the backup destination and placing the copy in a different folder, preserving a version of the file system as it was before the new backup is run. When rsync is run again it deals with each file on the backup destination as follows:

  • Unchanged files are left alone, leaving the hard link to the original inode in place
  • Changed files will be created as a new file system object on the target file system, replacing the hard link to the original inode
  • New files will be created as a new file system object
  • Deleted files will cause the hard link to the original inode to be removed. Removing the last hard link (when old backups are pruned) will cause the file system object to be deleted, freeing up the disk space.

The end result is that the previous backup and the current backup are both available on the backup volume, stored in different folders. Because hard links take up very little disk space, this is a very efficient use of backup space.
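
To make the rotation concrete, here is a simplified Python sketch of the same idea. It is not how rsnapshot is implemented: the sync step is a stand-in for rsync written in plain Python, it handles regular files only, it keeps just two generations, and the folder names daily.0 and daily.1 are only modelled on rsnapshot's naming. Changed files are detected by size and modification time.

```python
import os
import shutil
import tempfile


def unchanged(src: str, dst: str) -> bool:
    """Quick check: same size and same modification time (whole seconds)."""
    a, b = os.stat(src), os.stat(dst)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def mirror(source: str, backup: str) -> None:
    """Make `backup` match `source` (regular files only, for brevity)."""
    os.makedirs(backup, exist_ok=True)
    # Deleted files: remove this snapshot's hard link.
    for root, _dirs, files in os.walk(backup):
        rel = os.path.relpath(root, backup)
        for name in files:
            if not os.path.exists(os.path.join(source, rel, name)):
                os.remove(os.path.join(root, name))
    # New and changed files: write a fresh file system object.
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.join(backup, rel), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(backup, rel, name)
            if os.path.exists(dst) and unchanged(src, dst):
                continue  # leave the existing hard link alone
            tmp = dst + ".tmp"
            shutil.copy2(src, tmp)  # copy2 also copies the modification time
            os.replace(tmp, dst)    # the new inode replaces the old link


def rotate_and_backup(source: str, backup_root: str) -> None:
    current = os.path.join(backup_root, "daily.0")
    previous = os.path.join(backup_root, "daily.1")
    if os.path.isdir(current):
        shutil.rmtree(previous, ignore_errors=True)
        # Hard-linked copy of the last backup: new names, same inodes.
        shutil.copytree(current, previous, copy_function=os.link)
    mirror(source, current)


with tempfile.TemporaryDirectory() as tmp:
    src, root = os.path.join(tmp, "src"), os.path.join(tmp, "backups")
    os.makedirs(src)
    for name, text in (("a.txt", "one"), ("b.txt", "keep")):
        with open(os.path.join(src, name), "w") as fh:
            fh.write(text)
    rotate_and_backup(src, root)  # first run: plain mirror
    with open(os.path.join(src, "a.txt"), "w") as fh:
        fh.write("one, edited")
    rotate_and_backup(src, root)  # second run: rotate, then sync

    def inode(snapshot: str, name: str) -> int:
        return os.stat(os.path.join(root, snapshot, name)).st_ino

    shared = inode("daily.0", "b.txt") == inode("daily.1", "b.txt")
    split = inode("daily.0", "a.txt") != inode("daily.1", "a.txt")
    with open(os.path.join(root, "daily.1", "a.txt")) as fh:
        old_text = fh.read()
```

After the second run the unchanged b.txt is a single object shared by both folders, while a.txt exists as two separate objects, with the older content still readable from daily.1.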

Browsing and restoring files from a backup made using tools built around rsync is trivial because the backup is stored in a completely standard Linux file system, so files can be copied without having to use proprietary tools or commands.

Backups made by rsync or tools built around it are encrypted during transmission if SSH is used, but the files are stored unencrypted on the destination unless the target disk volume itself is encrypted.

rsync can be made to work in either a pull or push configuration – rsnapshot works in a pull configuration, periodically pulling backups from the clients.

rsync and ZFS snapshots

Instead of using rsnapshot or similar to handle rsync backup versioning, it is possible to use rsync to mirror a file system to a backup server and then do periodic snapshots of the backup server file system itself. This is the solution that we have chosen.

In our case we use rsync to mirror files to a ZFS disk volume that we rent from rsync.net. The provider then does daily and weekly snapshots that we can access via SSH or using an SFTP client.

Using this approach means that our backup scripts are basically one-liners that can be easily verified. We don’t have to deal with the complexities of tools like rsnapshot, and it frees us from having to set up our own backup server. The main disadvantage is cost – rsync.net backup space with ZFS snapshot support currently costs $8 per 100 GB per month.

It is worth mentioning that the ZFS snapshots are immutable, which means that we can safely work with a push backup configuration without risking that an attacker wipes both a virtual server and its backups. Backups are stored unencrypted, however, so this approach is not suitable for all use cases.

As with our R1Soft backups, databases need to be dumped to files before performing the backup. However, because backups are initiated by the client, the same script that performs the backup can dump the databases, so timing is not an issue.
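
A client-initiated backup of this kind might look something like the following Python sketch: dump first, then push the mirror, and stop at the first step that fails. The host name, paths and database name in the usage note are placeholders rather than our real configuration, and the rsync options shown are just one reasonable choice.

```python
import subprocess


def rsync_command(source_dir: str, destination: str) -> list[str]:
    """Build an rsync command that mirrors source_dir to destination over SSH."""
    return [
        "rsync",
        "--archive",  # recurse and preserve permissions, times and symlinks
        "--delete",   # remove files from the mirror that are gone locally
        "--rsh=ssh",
        source_dir.rstrip("/") + "/",  # trailing slash: copy the contents
        destination,
    ]


def run_backup(steps, run=subprocess.run) -> None:
    """Run each command in order; check=True stops at the first failure."""
    for command in steps:
        run(command, check=True)
```

A nightly job could then call run_backup with a mysqldump command such as ["mysqldump", "--single-transaction", "--result-file=/var/backups/db/shop.sql", "shop"] followed by rsync_command("/var/backups/db", "user@backup.example.net:db"). Because the dump is the first step, the mirror never receives a dump that is older than the files next to it.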

One disadvantage is that we do not have a simple way of verifying the consistency of a backup. rsync uses checksums to verify that individual files are transferred correctly, and ZFS has built-in protection against data degradation over time, so we believe the risk to be negligible, but it is worth keeping in mind.

JetBackup

JetBackup is a proprietary rsync-based backup utility designed for easy integration on cPanel servers. It pushes incremental backups over SSH and also supports other backup destinations such as S3 and Dropbox. A JetBackup license for a cPanel server currently costs USD 4 / month.

The main benefit of JetBackup is that it creates backups in a format that cPanel understands, so restoring an account is much easier compared to recreating the account manually and then restoring the individual files and databases. It also provides cPanel users a self-service restore option, although in our case we usually handle this for our clients.

JetBackup has mostly worked for us without issues. Backups are relatively slow, but within acceptable limits for our use. File and database restores work well, but they are triggered by a cron job that runs every five minutes, so a restore does not start immediately. This isn’t obvious from the user interface, which could confuse the casual self-service user. Backups are stored unencrypted.

One big reason to use JetBackup over the backup utilities provided by cPanel is that cPanel’s own utilities do not support remote incremental backups. However, this is about to change with cPanel version 66, which is expected to be released in the July/August timeframe. Depending on how cPanel’s solution performs, it may be a better option: for better or worse, we are familiar with cPanel quality control and we understand how login credentials are handled if we need to give their support staff access to the server. JetBackup is more of an unknown quantity, and we would not want to give them root access to a production server.

Borg

Borg is an open-source backup utility that supports AES 256 encryption and compression using lz4, zlib, or lzma. It does client initiated (push) backups to remote backup destinations over SSH.

Like rsync, Borg will only transfer new and changed files after the initial backup run has been completed, but it uses a different algorithm to determine what needs to be backed up:

  • rsync uses file size and file modification time to identify changed files
  • Borg divides files into chunks and calculates a unique fingerprint for each chunk. New chunks are identified using this fingerprint, and each unique chunk only needs to be transferred once, regardless of how many files the chunk is found in.

This chunk-based de-duplication strategy gives Borg some advantages over rsync:

  • Multiple copies of the same file only need to be backed up once (rsync would transfer all copies individually). This can mean substantial time, bandwidth, and backup space savings on a shared server hosting tens or hundreds of WordPress sites, for example
  • Changes to the directory structure have minimal impact on the backup because no new chunks are created (rsync would consider all the files under the changed directory node as changed)
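
The effect of chunk-based de-duplication can be sketched in a few lines of Python. This is a toy model and not Borg's implementation: it uses tiny fixed-size chunks and a plain SHA-256 fingerprint to keep the example readable, and the ChunkStore class and file names are invented for the illustration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ChunkStore:
    """Toy backup repository that stores each unique chunk only once."""

    def __init__(self) -> None:
        self.chunks = {}      # fingerprint -> chunk bytes
        self.index = {}       # file path -> list of fingerprints
        self.transferred = 0  # bytes that actually had to be sent

    def backup(self, path: str, data: bytes) -> None:
        fingerprints = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint not in self.chunks:  # only new chunks are sent
                self.chunks[fingerprint] = chunk
                self.transferred += len(chunk)
            fingerprints.append(fingerprint)
        self.index[path] = fingerprints

    def restore(self, path: str) -> bytes:
        return b"".join(self.chunks[fp] for fp in self.index[path])


store = ChunkStore()
store.backup("site1/wp-load.php", b"AAAABBBBCCCC")
after_first = store.transferred          # all three chunks are new
store.backup("site2/wp-load.php", b"AAAABBBBCCCC")
after_copy = store.transferred           # identical copy: nothing new
store.backup("moved/site1/wp-load.php", b"AAAABBBBCCCC")
after_move = store.transferred           # new path, same chunks
store.backup("site1/wp-load.php", b"AAAABBBBDDDD")
after_edit = store.transferred           # only the changed chunk is new
```

In this model a second copy of a file, or the same file under a new directory, adds an index entry but transfers no data, and an edit only costs the chunks that actually changed.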

Borg can back up to NFS or sshfs, but performance is much better if Borg is running on the backup destination, so you will have to set up your own backup server to run Borg efficiently. Alternatively, rsync.net can provide backup space at USD 3 per 100 GB per month with Borg enabled.

We have limited experience with Borg so far, but it looks very promising to us. These are our takeaways from what we’ve seen:

  • Installation and configuration on Linux is really easy. It took perhaps an hour to read through the documentation and another hour or so to get it up and running using a pre-built binary and following the steps in the “Quick Start” guide.
  • Initial backups ran to completion without issue in a reasonable time. Memory and CPU use were acceptable even on a small 1 GB RAM / 1 vCPU VPS (backing up ~15 GB with lz4 compression and encryption enabled).
  • Automating backups, setting up backup retention and pruning old backups is very simple, using the example script in the “Quick Start” guide as a starting point.
  • Restoring from the command line (using the extract command) is complicated compared to copying files from an rsync mirror (there is a useful --dry-run option for experimenting). However, extract is also considerably more powerful because it supports inclusion and exclusion patterns.
  • Mounting a backup archive as a FUSE file system using mount and umount works well and simplifies single-file restores by allowing use of standard Linux file commands.

Verifying backup integrity is a matter of running borg check on the backup repository. At the moment we do this manually, but it can easily be automated to run at regular intervals and to send an alert email if something isn’t right.

One issue that we don’t currently have a resolution for is how to prevent an attacker that gains access to a server from also wiping the backups. One way to deal with this would be to use immutable ZFS snapshots, but it isn’t clear to us how much disk space those snapshots would consume (cost implications).

Borg has an active development community and a good track record of dealing with issues, and in the long term we think that Borg might be a possible replacement for R1Soft in our setup. However, Borg does not have a centralised management console, so we would have to log in to each backup client individually to manage backups. Our main concern here is that backups might somehow silently fail on a server without us noticing.