Compress and Decompress: tar, gzip, zip

Kai · 4 min read

You'll constantly need to bundle many files into one and compress them: for backups, for packaging logs to send off, or when unpacking downloaded source code. This article clears up two often-confused concepts, archiving and compressing, and the famously hard-to-remember tar flags.

Archiving differs from compressing

This is the core point:

  • Archive: combine many files/directories into a single file. Tool: tar. Archiving itself doesn't make things smaller.
  • Compress: make a file smaller. Tool: gzip (and bzip2, xz, zstd...).

   many files        tar (archive)         gzip (compress)
   ┌───┐┌───┐┌───┐   ──────►  ┌────────┐   ──────►  ┌──────────┐
   │ a ││ b ││ c │            │  .tar  │            │ .tar.gz  │
   └───┘└───┘└───┘            │(1 file)│            │(smaller) │
                              └────────┘            └──────────┘

On Linux, the most common pattern is to combine both: use tar to archive then gzip to compress, producing a .tar.gz file (often abbreviated .tgz). Conveniently, tar does both in one command thanks to the -z flag.
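A quick way to convince yourself of the difference is to run the two steps separately and compare sizes. The sketch below is only an illustration: the demo/ directory and its files are made up for the test, and it assumes tar and gzip are installed.

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d) && cd "$work"

# Hypothetical sample data: three highly repetitive text files.
mkdir demo
for name in a b c; do
  yes "line from file $name" | head -n 2000 > "demo/$name.txt"
done

tar -cf demo.tar demo/               # step 1: archive only, no compression
gzip -c demo.tar > two-step.tar.gz   # step 2: compress (-c writes to stdout, so demo.tar stays)
tar -czf one-step.tar.gz demo/       # both steps in one command

# Byte counts: the plain .tar is no smaller than the inputs; the .tar.gz is.
input_size=$(cat demo/*.txt | wc -c)
tar_size=$(wc -c < demo.tar)
gz_size=$(wc -c < one-step.tar.gz)
echo "input=$input_size tar=$tar_size tar.gz=$gz_size"
```

The plain .tar comes out slightly larger than the raw input because tar adds a header per file and pads to fixed-size blocks; only the gzip step shrinks anything.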

tar: remembering the flags

tar has a reputation for being hard because of its flags. A way to remember: stack a few letters, each one a job:

   c  = create   (create archive)        ┐ pick 1 of 3
   x  = extract  (unpack archive)        ┤ (the action)
   t  = list     (list contents)         ┘
   z  = via gzip  (.gz)               → add when working with .tar.gz
   v  = verbose   (print file names)  → optional, to see what it's doing
   f  = file      (specify the file name) → almost always needed, put it LAST

The three operations you use 99% of the time:

Create a compressed archive (c + z + f):

tar -czf proj.tar.gz proj/

Archives the proj/ directory into proj.tar.gz. (Remember: f goes last in the flag group, because the word right after it is taken as the archive file name.)
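If the stacked letters still feel cryptic, it can help to see the same command spelled out flag by flag. This is a sketch assuming GNU tar (the long option names are GNU's); the proj/ tree is recreated here in a temporary directory purely for the demo.

```shell
work=$(mktemp -d) && cd "$work"
mkdir -p proj/src
echo "hello" > proj/src/app.txt
echo "# proj" > proj/README.md

# Stacked form: f is last, so the next word is the archive name.
tar -czf proj.tar.gz proj/

# The same operation with each flag on its own, then with long options.
tar -c -z -f proj-split.tar.gz proj/
tar --create --gzip --file=proj-long.tar.gz proj/

# All three archives contain the same members.
tar -tzf proj-long.tar.gz
```

Seen this way, it is easier to understand why something like -cfz goes wrong: f would take "z" as the archive name.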

List the contents without extracting (t + z + f):

tar -tzf proj.tar.gz
proj/
proj/src/
proj/src/app.txt
proj/README.md

Always run tar -tzf to preview an archive before extracting, so you know what it will unpack, and where.

Extract (x + z + f):

tar -xzf proj.tar.gz            # extract into the current directory
tar -xzf proj.tar.gz -C /opt    # -C: extract into another directory (/opt)
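Two details about extracting are easy to miss: -C expects the target directory to exist already, and you can name a member after the archive to pull out just that path. A minimal sketch, with a made-up proj/ tree and the hypothetical directory names out and only-readme:

```shell
work=$(mktemp -d) && cd "$work"
mkdir -p proj/src
echo "hello" > proj/src/app.txt
echo "# proj" > proj/README.md
tar -czf proj.tar.gz proj/

# -C does not create the directory for you, so make it first.
mkdir -p out
tar -xzf proj.tar.gz -C out

# Naming a member after the archive extracts only that path.
mkdir -p only-readme
tar -xzf proj.tar.gz -C only-readme proj/README.md

ls out/proj only-readme/proj
```

The member name has to match the path exactly as tar -tzf prints it, which is one more reason to list before extracting.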

A fun mnemonic: "eXtract Ze File" for xzf, "Create Ze File" for czf.

Watch out for the "tar bomb": some archives unpack files straight into the current directory (not bundled inside a subdirectory), littering wherever you're standing. That's why you should preview with tar -tzf first, or extract into a separate directory with -C.
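The preview can be turned into a rough automatic check: count how many distinct top-level names the listing contains. This is a sketch, not a hardened tool; top_level_count is a helper name invented here, and both archives are built on the spot as examples.

```shell
work=$(mktemp -d) && cd "$work"

# Two hypothetical archives: one well-behaved, one "bomb".
mkdir -p proj/src && echo "hello" > proj/src/app.txt
tar -czf good.tar.gz proj/
echo 1 > a.txt; echo 2 > b.txt
tar -czf bomb.tar.gz a.txt b.txt

# Count distinct top-level names in the listing. A leading "./" is stripped
# and empty lines are dropped so that "./proj/..." and "proj/..." count the same.
top_level_count() {
  tar -tzf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u | wc -l
}

for archive in good.tar.gz bomb.tar.gz; do
  if [ "$(top_level_count "$archive")" -eq 1 ]; then
    echo "$archive: single top-level entry"
  else
    echo "$archive: multiple top-level entries, extract with -C into an empty directory"
  fi
done
```

A count of 1 only says that everything sits under one name; it is still worth reading the listing for anything unexpected.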

gzip: compress a single file

gzip compresses files one at a time (it doesn't bundle multiple files into one; that's tar's job):

gzip big.txt        # creates big.txt.gz and DELETES the original big.txt
gunzip big.txt.gz   # decompress again, restoring big.txt

Note that gzip replaces the original file with the .gz version (and gunzip does the reverse). To keep the original, use gzip -k (keep). You often meet a lone .gz with log files (access.log.1.gz); peek at the contents without extracting using zcat file.gz or zless file.gz.
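Here is a small round trip that keeps the original, peeks at the compressed copy, and checks that nothing was lost. It is a sketch with a generated stand-in for a log file (access.log below is made up), and it uses gzip -c and gzip -dc because they behave the same on older gzip versions and on systems where zcat differs.

```shell
work=$(mktemp -d) && cd "$work"
yes "GET /index.html 200" | head -n 1000 > access.log   # hypothetical log file

# -c writes the compressed data to stdout, so the original stays in place
# (gzip -k has the same effect where it is supported).
gzip -c access.log > access.log.gz

# Peek at the compressed file without writing an extracted copy;
# on Linux this is what zcat access.log.gz does.
gzip -dc access.log.gz | head -n 2

# Compare sizes, then verify the round trip is lossless.
wc -c access.log access.log.gz
gzip -dc access.log.gz | cmp - access.log && echo "round trip OK"
```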

zip: when you need Windows compatibility

.tar.gz is the standard of the Unix/Linux world. When you need to exchange files with Windows/macOS users, .zip is more convenient because it opens everywhere out of the box:

apt-get install -y zip unzip    # often not preinstalled on Ubuntu; run as root or with sudo
zip -r proj.zip proj/           # -r to archive the whole directory
unzip proj.zip                  # decompress
unzip -l proj.zip               # -l to list contents

Unlike tar.gz, zip both archives and compresses in one format, and it compresses each file separately. That is why .tar.gz is usually smaller when you have many similar small files: gzip sees the whole tar stream and can reuse patterns that repeat across files.
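The size effect is easy to reproduce without installing zip at all. The sketch below imitates per-file compression by running gzip on each file individually and compares the total with one .tar.gz of the same files. It is an approximation of what zip does, not zip itself, and the small/ directory with its 200 config files is invented for the test.

```shell
work=$(mktemp -d) && cd "$work"

# Hypothetical data: 200 small files with nearly identical content.
mkdir small
i=1
while [ "$i" -le 200 ]; do
  printf 'id=%s\nstatus=ok\nregion=eu-west\nretries=0\ntimeout=30\nlog_level=info\n' "$i" \
    > "small/item$i.conf"
  i=$((i + 1))
done

# Per-file compression (imitating zip's approach with gzip): every file pays
# its own header cost and cannot share patterns with its neighbours.
per_file=$(for f in small/*.conf; do gzip -c "$f"; done | wc -c)

# Whole-stream compression (the .tar.gz approach): one gzip pass over everything.
whole_stream=$(tar -czf - small/ | wc -c)

echo "per-file total: $per_file bytes, tar.gz: $whole_stream bytes"
```

The flip side of per-file compression is that a single file can be read out of a zip without decompressing everything before it.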

Which one to pick

   Situation                                          Use
   Backup/move on Linux, many files                   tar -czf (.tar.gz)
   Compress a single file (e.g. a log)                gzip
   Send to Windows/macOS users                        zip
   Need stronger compression, accept it being slower  tar -cJf (.tar.xz) or zstd

In practice: on a Linux server you use tar -czf / tar -xzf the most. Knowing those two commands is enough for most of the work.

🧹 Cleanup

cd /tmp && rm -rf proj proj.tar.gz proj.zip out big.txt*

Wrap-up

Archiving (tar) combines many files into one; compressing (gzip) makes a file smaller; combine them for .tar.gz. With tar, remember three operations: create -czf, list -tzf, extract -xzf (with -C to choose where to extract, -v to watch progress). Use gzip/gunzip for single files (they replace the original) and zip/unzip when you need Windows compatibility. Always preview with tar -tzf before extracting something unfamiliar.

You've backed up files, but "disk full" is the classic server incident. Article 10 covers checking capacity and managing disks: df, du, lsblk, mount.