Linux commands for backups: Tar

This is a beginner's guide to archiving, compressing and extracting files with Linux's tar command, part of a useful utility belt of commands for backups on Linux.

Tar allows you to archive files or directories into a single file. Think of it like a bag you store items in. It's useful if you want to maintain the ownership and permissions of the files you archive.

Creating an archive

Say we have a directory called myDir that we want to archive to myDir.tar. We could run:

tar --create --verbose --file myDir.tar myDir

This can be shortened using the short options: -c, -v, -f. In fact, we don't even need the hyphen, as long as the bundled letters are the first argument. Just run:

tar cvf myDir.tar myDir
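
To check what ended up in an archive without extracting it, tar's --list (-t) mode can help. The following is a small sketch rather than part of the walkthrough itself: it builds a throwaway myDir in a temporary directory (the file name notes.txt is made up for illustration), archives it, and prints the archive's contents.

```shell
# Work in a scratch directory so none of your real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data, for illustration only.
mkdir myDir
echo "hello" > myDir/notes.txt

# Create the archive, then list its contents without extracting it.
tar -cf myDir.tar myDir
listing=$(tar -tf myDir.tar)
echo "$listing"
```

Listing an archive right after creating it is a cheap sanity check before you rely on it as a backup.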

Awesome! We now have our archived myDir, but could we do better? What happens if we want to save some space? myDir might contain lots of text files that could be compressed to save space.

Tar has flags for using different compression algorithms: --gzip (-z), --bzip2 (-j) and --xz (-J), for example. I personally like to use zstd (--zstd, available in GNU tar 1.31 and later), a newer algorithm that I find faster with better compression. You may prefer gzip, which will already be installed on nearly all distros.

To compress a folder with zstd or gzip, you could run:

tar --zstd -cvf myDir.tar.zst myDir
tar czvf myDir.tar.gz myDir

It's good practice to give the file an extension that indicates it's a compressed archive. Common extensions are .tar.gz for gzip, .tar.bz2 for bzip2, .tar.xz for xz and .tar.zst for zstd.

In fact, this has an advantage: tar has a little-known option, --auto-compress or -a, which chooses the compression algorithm based on the file suffix. Our previous commands can be run as:

tar cavf myDir.tar.zst myDir
tar cavf myDir.tar.gz myDir
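
If you want to convince yourself that -a really picked the compressor from the suffix, one rough way is to test the result with the matching tool. This sketch assumes GNU tar and gzip are installed, and uses a made-up sample file in a temporary directory.

```shell
# Scratch directory with hypothetical sample data.
workdir=$(mktemp -d)
cd "$workdir"
mkdir myDir
echo "some text" > myDir/notes.txt

# -a selects gzip here because the archive name ends in .tar.gz.
tar -caf myDir.tar.gz myDir

# gzip -t exits with status 0 only if the file is a valid gzip stream.
gzip -t myDir.tar.gz && echo "gzip-compressed"
```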

What about archiving multiple directories? Just add them to the end of the command:

tar cavf myArchive.tar.zst myDir myOtherDir notes.txt

You can list as many directories or files as you like.

You can also exclude files (or directories) from an archive. For example, if we wanted everything in myDir archived apart from myDir/list.txt, then we can use the --exclude flag:

tar cavf myDir.tar.zst --exclude=myDir/list.txt myDir

The exclude flag is actually quite powerful; you can use patterns with it. For example, if we have a directory containing RAWs and JPEGs from a camera, and only want to archive the RAW files, we can exclude the JPEGs (quote the pattern so the shell passes it to tar unchanged):

tar cavf 2020-10-08-photoshoot-raws.tar.zst --exclude='*.jpg' ./todays-photoshoot
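
A quick way to confirm an exclude pattern did what you expected is to list the archive afterwards. The sketch below uses empty placeholder files (img001.raw and img001.jpg are hypothetical names) and gzip rather than zstd, purely so it runs on systems without zstd installed.

```shell
# Scratch directory with placeholder "photos".
workdir=$(mktemp -d)
cd "$workdir"
mkdir todays-photoshoot
touch todays-photoshoot/img001.raw todays-photoshoot/img001.jpg

# Archive everything except files matching *.jpg.
tar -czf raws.tar.gz --exclude='*.jpg' ./todays-photoshoot

# List the archive: the .jpg file should be absent.
listing=$(tar -tzf raws.tar.gz)
echo "$listing"
```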

Extracting an archive

To extract an archive, simply replace the -c flag with --extract, or -x. We don't even need -a; tar detects the compression when it reads the archive:

tar xvf myDir.tar.zst

If you want to extract to a different location, for example /tmp, use the -C flag:

tar xvf myDir.tar.zst -C /tmp
cd /tmp/myDir
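
To see the permission-preserving behaviour mentioned at the start, here is a minimal round-trip sketch. It assumes a made-up script run.sh with mode 700, extracts into a second temporary directory with -C, and prints the mode of the extracted file. Note that the target directory for -C must already exist.

```shell
# Scratch directory with a hypothetical script that only its owner may run.
workdir=$(mktemp -d)
cd "$workdir"
mkdir myDir
printf '#!/bin/sh\necho hi\n' > myDir/run.sh
chmod 700 myDir/run.sh

tar -czf myDir.tar.gz myDir

# Extract into a separate, already existing directory.
dest=$(mktemp -d)
tar -xzf myDir.tar.gz -C "$dest"

# Show the permission string of the extracted file.
mode=$(ls -l "$dest/myDir/run.sh" | cut -c1-10)
echo "$mode"
```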

Using pipes and SSH

A handy feature of tar is that you can combine it with a shell pipe and send the data over ssh to a remote server.

To archive a local folder to a remote server you can run the following:

tar cvf - --zstd myDir | ssh user@remote-server "cat > myArchiveDir.tar.zst"

Note the f followed by -: giving - as the file name tells tar to write the archive to stdout instead of a file.

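You can also extract the archive directly on the remote server, so that myDir is recreated there. The remote tar is given -f - as well, which tells it to read the archive from stdin:

tar cvf - --zstd myDir | ssh user@remote-server "tar x --zstd -f -"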
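
The remote commands above need a reachable SSH server, so they are hard to try in isolation. As a rough local stand-in, the sketch below pipes one tar process into another on the same machine. The stdout and stdin mechanics are the same, with the second tar playing the role of the command run over ssh. The directory and file names are placeholders, and gzip is used so the example does not depend on zstd being installed.

```shell
# Scratch directory with hypothetical sample data and an empty destination.
workdir=$(mktemp -d)
cd "$workdir"
mkdir myDir dest
echo "payload" > myDir/data.txt

# The left-hand tar writes the archive to stdout (-f -);
# the right-hand tar reads it from stdin (-f -) and unpacks into dest.
tar -czf - myDir | tar -xzf - -C dest

cat dest/myDir/data.txt
```

No intermediate archive file is written to disk at any point, which is what makes this pattern attractive for copying directory trees between machines.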

That concludes this beginner's guide to the tar command. On its own it's useful, but it becomes really powerful when you start combining it with other commands. There will be future posts expanding on this.