This is a beginner's guide to compressing and decompressing files with Linux's tar command, part of a useful utility belt of commands for backups on Linux.
- Part 1: Linux tar
- Part 2: Linux pv
Tar allows you to archive files or directories into a single file; think of it like a bag you store items in. It's useful if you want to maintain the ownership and permissions of the files you archive.
Creating an archive
Say we have a directory called myDir that we want to archive to myDir.tar. We could run:
tar --create --verbose --file myDir.tar myDir
This can be shortened using the shorthand options: -c, -v, -f. In fact, we don't even need the hyphens, as long as the bundled letters are the first argument after tar. Just run:
tar cvf myDir.tar myDir
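To check what ended up in the archive, you can list its contents without extracting them using --list (-t). Here is a minimal sketch, assuming a scratch directory and a made-up notes.txt file standing in for real data:

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample content standing in for myDir.
mkdir myDir
echo "hello" > myDir/notes.txt

# Create the archive, then list its contents without extracting.
tar -cf myDir.tar myDir
listing=$(tar -tf myDir.tar)
echo "$listing"
```

The listing should include both the directory entry and the file inside it.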
Awesome! We now have our archived myDir, but could we do better? What happens if we want to save some space? myDir might contain lots of text files that could be compressed to save space.
Tar has flags for using different compression algorithms: --gzip (-z), --bzip2 (-j) and --xz (-J), for example. I personally like to use zstd, a newer algorithm that I find faster with better compression. You may prefer gzip, which will already be installed on nearly all distros.
To compress a folder with zstd or gzip, you could run:
tar --zstd -cvf myDir.tar.zst myDir
tar czvf myDir.tar.gz myDir
It's good practice to choose a file extension that indicates it's a compressed archive. Common extensions include .tar.gz for gzip, .tar.bz2 for bzip2 and .tar.zst for zstd.
In fact, this has an advantage. Tar has an option not many know of: --auto-compress, or -a. This option automatically chooses the compression algorithm based on the file suffix. Our previous commands can be run with:
tar cavf myDir.tar.zst myDir
tar cavf myDir.tar.gz myDir
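To see -a at work, you can build the same directory with and without a compression suffix and compare the results. This is only a sketch: the scratch directory and the repetitive log.txt file are invented for illustration, and it uses gzip because it is the most likely to be installed.

```shell
# Work in a throwaway directory with some highly compressible text.
workdir=$(mktemp -d)
cd "$workdir"
mkdir myDir
for i in $(seq 1 5000); do
  echo "some repetitive log line"
done > myDir/log.txt

tar -cf myDir.tar myDir        # plain archive, no compression
tar -caf myDir.tar.gz myDir    # -a picks gzip from the .gz suffix

# gzip -t exits successfully only for a valid gzip stream.
gzip -t myDir.tar.gz && echo "myDir.tar.gz is gzip-compressed"

# Compare the sizes of the two archives in bytes.
plain_size=$(wc -c < myDir.tar)
gz_size=$(wc -c < myDir.tar.gz)
echo "plain: $plain_size bytes, gzip: $gz_size bytes"
```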
What about archiving multiple directories? Just add them to the end of the command:
tar cavf myArchive.tar.zst myDir myOtherDir notes.txt
You can list as many directories or files as you like.
You can also exclude files (or directories) from an archive. For example, if we wanted everything in myDir archived apart from myDir/list.txt, then we can use the --exclude flag:
tar cavf myDir.tar.zst --exclude=myDir/list.txt myDir
The exclude flag is actually quite powerful; you can use patterns with it. For example, if we have a directory containing RAWs and JPEGs from a camera, and only want to archive the RAW files (the pattern is quoted so the shell hands it to tar unexpanded):
tar cavf 2020-10-08-photoshoot-raws.tar.zst --exclude='*.jpg' ./todays-photoshoot
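If you want to confirm that a pattern excludes what you expect before trusting it with real photos, you can try it on dummy files and list the result. A small sketch, assuming empty placeholder files with made-up names and gzip in place of zstd:

```shell
# Throwaway directory with empty stand-ins for camera files.
workdir=$(mktemp -d)
cd "$workdir"
mkdir todays-photoshoot
touch todays-photoshoot/img1.raw todays-photoshoot/img1.jpg
touch todays-photoshoot/img2.raw todays-photoshoot/img2.jpg

# Archive everything except files matching *.jpg.
tar -caf raws.tar.gz --exclude='*.jpg' ./todays-photoshoot

# List the archive to check that only the .raw files made it in.
listing=$(tar -tf raws.tar.gz)
echo "$listing"
```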
Extracting an archive
To extract an archive, simply replace the -c flag with --extract, or -x. We don't even need -a; tar works out the compression from the archive itself:
tar xvf myDir.tar.zst
If you want to extract to a different location, for example /tmp, use the -C flag:
tar xvf myDir.tar.zst -C /tmp
cd /tmp/myDir
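Note that the directory passed to -C has to exist before you extract into it. The round trip below is a sketch under a few assumptions: a scratch directory, a hypothetical run.sh script and a restore directory, all invented for illustration. It also checks that the executable bit survives the trip.

```shell
# Build a small archive in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir myDir
echo "hello" > myDir/notes.txt
printf '#!/bin/sh\necho hi\n' > myDir/run.sh
chmod +x myDir/run.sh
tar -caf myDir.tar.gz myDir

# -C needs an existing target directory, so create it first.
mkdir restore
tar -xf myDir.tar.gz -C restore

# The file contents and the executable bit should both be restored.
restored=$(cat restore/myDir/notes.txt)
echo "$restored"
[ -x restore/myDir/run.sh ] && echo "run.sh is still executable"
```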
Using Pipe and SSH
A handy feature of tar is that you can combine it with a shell pipe and send the data over ssh to a remote server.
To archive a local folder to a remote server you can run the following:
tar cvf - --zstd myDir | ssh user@remote-server "cat > myArchiveDir.tar.zst"
Note the -f -: the - tells tar to send its output to stdout.
You can also extract the archive directly on the remote server. Here the -f - on the remote side tells tar to read the archive from stdin:
tar cvf - --zstd myDir | ssh user@remote-server "tar x --zstd -f -"
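You can rehearse the same pipeline locally before pointing it at a real server. The sketch below swaps the ssh hop for a subshell that changes into a made-up remote directory, and uses gzip (-z) rather than zstd only so that it runs where zstd is not installed; the data flow through the pipe is otherwise the same.

```shell
# Throwaway directory; "remote" stands in for the far end of the ssh hop.
workdir=$(mktemp -d)
cd "$workdir"
mkdir myDir remote
echo "hello" > myDir/notes.txt

# Left side: write the compressed archive to stdout (-f -).
# Right side: read it from stdin (-f -) and extract inside remote/.
# Over ssh, the right side would be the quoted remote command instead.
tar -czf - myDir | (cd remote && tar -xzf -)

copied=$(cat remote/myDir/notes.txt)
echo "$copied"
```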
That concludes this beginner's guide to the tar command. On its own it's useful, but it becomes really powerful when you start combining it with other commands. There will be future posts expanding on this.