Securely backup your VPS using Duplicity & GnuPG

It's always a good idea to backup your data; it gives you protection from data loss and hardware failure. If you host sensitive data, or applications for customers, it's a good idea to encrypt the backups, ensuring their secure and can be safely kept just about anywhere.

There are lots of backup scripts, solutions and services around: Rsync, s3sync, Rdiff-backup, Jungle Disk and Duplicity being just a few. After trying a few of them I decided to go with Duplicity for my Linode VPS; it provided a simple, yet powerful, way of doing encrypted backups.

Duplicity uses librsync and GnuPG to incrementally encrypt archives of files that have changed since the last backup. You can transfer the backups using a whole range of protocols: ftp, imap, rsync, s3 and scp for example - I store backups on my local file server, however, due to the encrypted nature they could easily be stored on something like Amazon's S3.

The MAN pages have more information on the actions, options and URL formats for duplicity, it also provides several examples.

Installing duplicity and its dependencies

Duplicity is written in python, which needs to be installed if it isn't already (not something covered here). You can install Duplicity via Debian's package manager, but the version is outdated and lacks newer features; to get the latest version it's best to install it from source.

The following dependencies need to be met:

Debian users can simply run the following to get all of the required dependencies:

sudo aptitude build-dep duplicity

Fetch the latest stable release, which as of writing this was 0.6.08b:

mkdir sources
cd sources
wget http://code.launchpad.net/duplicity/0.6-series/0.6.08b/+download/duplicity-0.6.08b.tar.gz

Extract the tarball:

tar xvzf duplicity-0.6.08b.tar.gz<br>$ cd duplicity-0.6.08b

Finally, install Duplicity:

sudo python setup.py install

If successful then you should be able to verify by running:

duplicity --version

Encryption & Keys

Duplicity takes care of the gpg encryption for us, all we have to do is supply a public encryption and signature key. The encryption key is used to protect the data from nosey people, while the signature key is used to ensure the integrity of the backups.

By default, if you omit the signature key, the encryption key is used for signing as well. It's highly recommended to create separate signature and encryption keys; the passphrase for the signature needs to be available in the script, therefore, using the same key for encryption and signing leaves your encrypted files exposed.

On your local machine, not your production server, you can generate the keys with this command:

gpg --gen-key

You will be given a choice of key types, I normally go with the default - same thing with the key length and expiry. When you're asked to enter a name, you can put what you want, I tend to put the name of the server and which key it is, signature or encryption.

Example of the encryption key generation:

gpg --gen-key
gpg (GnuPG) 1.4.10; Copyright (C) 2008 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 
*press enter*
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048) 
Requested keysize is 2048 bits
*press enter*
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 
*press enter*
Key does not expire at all
Is this correct? (y/N) y

You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) <[email protected]>"

Real name: Edge Backup Encryption Key
Email address: [email protected]
Comment: Encryption key for Edge backups
You selected this USER-ID:
    "Edge Backup Encryption Key (Encryption key for Edge backups) <[email protected]>"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? O
You need a Passphrase to protect your secret key.
*enter passphrase, make sure secure*
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
....................+++++
...+++++
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
.....+++++
+++++
gpg: key B5FC8737 marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   2  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 2u
pub   2048R/B5FC8737 2010-04-03
      Key fingerprint = 2365 2F64 0808 8A90 644C  9AEF FBBD B843 B5FC 8737
uid                  Edge Backup Encryption Key (Encryption key for Edge backups) <[email protected]>
sub   2048R/9ABB5804 2010-04-03

Do exactly the same for the signature key, but make sure you use a different passphrase.

Once both keys have been created you need to export and copy the public encryption and private signature keys to the production box; the safest way to do this is SCP/SSH.

To export the keys run the following commands:

gpg --export -a 'Edge Backup Encryption' > edge.enc.pub.gpg
gpg --export-secret-keys -a 'Edge Backup Signature' > backup.sig.sec.gpg

Transfer them to the production box:

scp edge.enc.pub.gpg backup.sig.sec.gpg [email protected]:/tmp

Import the transferred keys by running the following command (on the production box):

gpg --import /tmp/backup.enc.pub.gpg
gpg --import-secret-keys /tmp/backup.enc.sec.gpg

Verify the keys were imported correctly, we also need to note down the key IDs:

gpg --list-keys
gpg --list-secret-keys
/root/.gnupg/pubring.gpg
------------------------
pub   2048R/5FD0100F 2010-04-04
uid                  Edge Backup Encryption Key (Encryption key for edge backups) <[email protected]>
sub   2048R/48F61F08 2010-04-04

pub   2048R/7F73FA36 2010-04-04
uid                  Edge Backup Signature Key (Signature key for edge backups) <[email protected]>
sub   2048R/A67F8410 2010-04-04



/root/.gnupg/secring.gpg
------------------------
sec   2048R/7F73FA36 2010-04-04
uid                  Edge Backup Signature Key (Signature key for edge backups) <[email protected]>
ssb   2048R/A67F8410 2010-04-04

The two IDs we're interested in are 5FD0100F (encryption key) and 7F73FA36 (signature key).

Backup process

I run my backups as root with the scripts running from the root home directory and only readable by root - chmod'ed with 700 (rwx------). I do this for two reasons: one, you need to be able to read all the directories on the server and two, the passphrase needs to be stored in the script.

Before we begin the backups we need to create an exclusion list to ignore certain directories that we don't want to backup. You may include your own directories, or alter the list to better suit your Linux distro.

vim /root/backups/excludelist

Add the following and save

- /sys
- /dev
- /proc
- /tmp
- /mnt

I suggest running the initial backup manually so you can catch any potential errors before automating the process; the initial backup needs to copy everything so expect it to take a while!

Run the following command, changing the sign and encrypt keys for yours! Time to grab a beer maybe?

duplicity --sign-key '7F73FA36' --encrypt-key '5FD0100F' --exclude-filelist=/root/backups/excludelist / scp://rich@backup_server//mnt/backups/edge/main

After that has finished confirm the backup ran successfully:

duplicity collection-status scp://rich@backup_server//mnt/backups/edge/main

It should list a load of statistics and say something like: "No orphaned or incomplete backup sets found."

When you're happy it's all running you can automate the process via a cronjob. Below is the script I use to backup my VPS, note that I dump all the mySQL databases before doing the backup, relying on backups of /var/lib/mysql isn't advised.

#!/bin/sh

export PASSPHRASE=SomeMagiclySecurePassphrase
export SSH_AUTH_SOCK=/tmp/ssh-agent

#
# Dump mySQL databases, relying on backup of /var/lib/mysql isn't advised.
#
mysqldump --all-databases -uroot -pTEHPAZZ | bzip2 -c > /root/backups/db/all_databases_$(date +%Y_%m_%d).sql.bz2

#
# Main backup.
#
duplicity --sign-key '7F73FA36' --encrypt-key '5FD0100F' --exclude-filelist=/root/scripts/backups/ignorelist / scp://rich@backup_server//mnt/backups/edge/main

#
# Clean up.
#

# Remove the temp database dump.
rm /root/backups/db/all_databases_$(date +%Y_%m_%d).sql.bz2

# Delete duplicity backups older than 30 days.
duplicity remove-older-than 30D scp://rich@backup_server//mnt/backups/edge/main

That's it! You should now have secure, fully automated backups of your server. Just make sure you keep your GPG keys safe, I keep mine on a USB pen drive I carry with me.