Pipes, Redirects, and Data Streams

Kai · 5 min read

Article 5 gave us a lot of small tools. This article covers the mechanism that joins them together. It's a deep dive worth the effort, because once you get it, the whole Linux command line clicks into place. It all rests on one idea: a program reads from an input stream and writes to output streams, and you control where those streams go.

The three standard streams

Every program on Linux gets three streams when it starts. Each is identified by a file descriptor (fd), a small integer that refers to an open file (recall Article 2: "everything is a file", including data streams):

                  ┌──────────────┐
   stdin (fd 0) ─►│              │─► stdout (fd 1)  ← normal output
   (input)        │   program    │
                  │              │─► stderr (fd 2)  ← error messages
                  └──────────────┘
  • stdin (fd 0) — input, the keyboard by default.
  • stdout (fd 1) — normal output, the screen by default.
  • stderr (fd 2) — output for errors/messages, also the screen by default but a separate stream.

Why split stdout and stderr? So you can handle output and errors separately — for example, save the output to a file while still seeing errors on screen. The next example makes this clear.

Redirect: send a stream into a file

The command ls real.txt nope.txt (one file exists, one doesn't) prints its result to stdout and its error to stderr — both mixed together on screen. Split them:

ls real.txt nope.txt > out.txt 2> err.txt
# out.txt (stdout, fd 1):
real.txt
# err.txt (stderr, fd 2):
ls: cannot access 'nope.txt': No such file or directory
  • > file sends stdout to a file (implicitly 1>).
  • 2> file sends stderr to a file.

The output "real.txt" goes into out.txt, the error into err.txt — completely separated. This is why the two streams exist independently.
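
To try this from scratch, here is a minimal sketch that builds the scenario in a throwaway directory. The scratch directory from mktemp -d and the || echo guard are additions for illustration, and the exact wording of the error message depends on your ls implementation.

```shell
# Work in a scratch directory so nothing important is touched
# (assumption: mktemp is available, as it is on typical Linux systems).
workdir=$(mktemp -d)
cd "$workdir"

touch real.txt    # real.txt exists, nope.txt does not

# stdout goes to out.txt, stderr goes to err.txt.
# ls exits non-zero because nope.txt is missing, so report that instead of ignoring it.
ls real.txt nope.txt > out.txt 2> err.txt || echo "ls reported a failure"

echo "--- out.txt (stdout) ---"; cat out.txt
echo "--- err.txt (stderr) ---"; cat err.txt
```

Nothing from ls itself reaches the screen here: both of its streams were sent to files, and the cat commands read them back afterwards.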

Overwrite and append

echo "line 1" > f.txt     # > creates / OVERWRITES the whole file
echo "line 2" >> f.txt    # >> APPENDS to the end of the file
cat f.txt                 # prints line 1, then line 2

Remember: > overwrites (wipes the old contents), >> appends. Typing > where you meant >> on an important file destroys its contents immediately, with no confirmation prompt.
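
If that risk worries you, the shell has an optional safety net that this series has not covered so far: the noclobber option, which makes > refuse to overwrite an existing file. The sketch below is one possible way to use it, not a required habit. The scratch directory and file contents are made up for the demonstration.

```shell
# Scratch directory for the demo (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

echo "important" > f.txt

set -o noclobber    # from now on, > will not overwrite an existing file

# The overwrite attempt fails; the braces let us hide the shell's error message.
if ! { echo "oops" > f.txt; } 2>/dev/null; then
  echo "overwrite refused, f.txt is intact"
fi

echo "more" >> f.txt    # appending is still allowed

set +o noclobber    # back to the default behaviour
cat f.txt
```

With noclobber on, you can still overwrite deliberately by writing >| instead of >.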

Input from a file: <

wc -l < f.txt    # feed the contents of f.txt into wc's stdin

< file makes stdin read from a file instead of the keyboard. You'll use it less often than > because most commands accept a filename argument directly, but it completes the model: > and >> redirect output, < redirects input.
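
One practical difference is worth seeing. When wc is given a filename it prints that name next to the count, but when it reads from stdin it has no name to print. The sketch below recreates a two-line f.txt in a scratch directory (an assumption for the demo) and captures the bare number in a variable.

```shell
# Scratch directory for the demo (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

printf 'line 1\nline 2\n' > f.txt

wc -l f.txt      # filename argument: prints the count followed by "f.txt"
wc -l < f.txt    # stdin redirect: prints the count only

# The stdin form is convenient when a script needs just the number.
count=$(wc -l < f.txt)
echo "f.txt has $count lines"
```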

Merge stderr into stdout: 2>&1

Sometimes you want both output and errors in the same place (for example, one complete log file):

ls real.txt nope.txt > all.txt 2>&1

2>&1 means "send fd 2 (stderr) to wherever fd 1 (stdout) currently points". Because it comes after > all.txt, fd 1 already points to all.txt, so both streams go into the file. This is why you so often see > file 2>&1 at the end of commands in scripts and crontabs: it gathers all output and errors into one log.

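Order matters: > all.txt 2>&1 merges the streams, but 2>&1 > all.txt does not. The shell applies redirections left to right, and 2>&1 copies fd 1's current target. In the reversed form, fd 1 still points at the screen when it is copied, so stderr stays on the screen and only stdout goes into all.txt.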
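
The ordering rule is easy to verify yourself. The sketch below uses a hypothetical helper function, emit, that writes one line to each stream. Wrapping the reversed form in $( ) is a trick for the demo only: it turns "wherever fd 1 originally pointed" into something we can capture and inspect.

```shell
# Scratch directory for the demo (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
  echo "out"
  echo "err" >&2
}

# Correct order: fd 1 is pointed at the file first, then fd 2 copies that target.
emit > merged.txt 2>&1

# Reversed order: fd 2 copies fd 1's ORIGINAL target, then only fd 1 moves to the file.
# Inside $( ), that original target is the capture, so the escaped line ends up in $leaked.
leaked=$(emit 2>&1 > only_out.txt)

echo "merged.txt:";   cat merged.txt      # both lines
echo "only_out.txt:"; cat only_out.txt    # stdout line only
echo "escaped the file: $leaked"
```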

Discard output: /dev/null

/dev/null (Article 2 — the "black hole") swallows everything written to it. Use it to throw away output you don't need:

ls nope.txt 2>/dev/null      # discard the error, nothing shows on screen
some-command > /dev/null 2>&1  # discard BOTH stdout and stderr (run silently)

The pattern > /dev/null 2>&1 means "run completely silently". You see it constantly in scripts when you only care whether a command succeeded (its exit status), not what it printed.
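
A typical use is checking whether a tool is installed. The sketch below relies on command -v, a standard shell builtin that prints a command's location and fails if the command is not found. Here only the pass/fail result is wanted, so everything it prints is discarded. The name no-such-tool-xyz is made up and is assumed not to exist on your system.

```shell
# Only the exit status matters, so both streams go to /dev/null.
if command -v ls > /dev/null 2>&1; then
  have_ls=yes
else
  have_ls=no
fi

# no-such-tool-xyz is a made-up name used to show the failing branch.
if command -v no-such-tool-xyz > /dev/null 2>&1; then
  have_fake=yes
else
  have_fake=no
fi

echo "ls: $have_ls, no-such-tool-xyz: $have_fake"
```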

Pipe: chain commands together

This is the piece that makes Unix powerful. The | (pipe) connects the stdout of the left command to the stdin of the right command:

   cat /etc/passwd ─► grep root ─► wc -l
        stdout │ stdin    stdout │ stdin
cat /etc/passwd | grep root | wc -l
1

Read the chain: cat streams out the file's contents → grep root filters lines containing "root" → wc -l counts the lines. Each command does one job, and chained together they become "count the lines containing root in /etc/passwd". This is the Unix philosophy from Article 0 made real: small tools + pipes = complex operations.
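
One way to see what the pipe buys you is to do the same job without it, using a redirect into an intermediate file. This is only a sketch: users.txt is a hypothetical three-line sample in /etc/passwd format, so the result does not depend on your real system, and step1.txt is a made-up name for the intermediate file.

```shell
# Scratch directory for the demo (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data in /etc/passwd format.
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
  'kai:x:1000:1000:Kai:/home/kai:/bin/bash' > users.txt

# Without a pipe: save grep's stdout to a file, then feed that file to wc's stdin.
grep root users.txt > step1.txt
count_files=$(wc -l < step1.txt)

# With pipes: the same data flow, but no intermediate file to create or clean up.
count_pipe=$(cat users.txt | grep root | wc -l)

echo "via file: $count_files, via pipe: $count_pipe"
```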

A few real-world pipelines you'll use often:

# the 5 most recent error lines in the log
grep -i error /var/log/syslog | tail -5

# top 3 most frequent IPs in an access log
cut -d ' ' -f1 access.log | sort | uniq -c | sort -rn | head -3

# is there any nginx process running
ps aux | grep nginx | grep -v grep
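
A good way to read a long pipeline like the access-log one is to run it one stage at a time. The sketch below does that on a made-up access.log. The IP addresses come from the ranges reserved for documentation, the log format is simplified, and the final awk step is an addition that extracts just the IP from the top line.

```shell
# Scratch directory for the demo (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical log: the client IP is the first space-separated field.
printf '%s\n' \
  '192.0.2.10 - - "GET / HTTP/1.1" 200' \
  '198.51.100.7 - - "GET /a HTTP/1.1" 200' \
  '192.0.2.10 - - "GET /b HTTP/1.1" 404' \
  '203.0.113.5 - - "GET / HTTP/1.1" 200' \
  '192.0.2.10 - - "GET /c HTTP/1.1" 200' \
  '198.51.100.7 - - "GET / HTTP/1.1" 200' > access.log

echo "stage 1: IP column only"
cut -d ' ' -f1 access.log

echo "stages 1-3: sorted so identical IPs are adjacent, then counted"
cut -d ' ' -f1 access.log | sort | uniq -c

echo "full pipeline: highest counts first"
cut -d ' ' -f1 access.log | sort | uniq -c | sort -rn | head -3

# Keep only the IP from the top line (awk prints the second column).
top_ip=$(cut -d ' ' -f1 access.log | sort | uniq -c | sort -rn | head -1 | awk '{print $2}')
echo "most frequent IP: $top_ip"
```

The sort before uniq -c is not optional: uniq only collapses identical lines that sit next to each other.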

Note: a pipe only connects stdout (not stderr). To push errors through the pipe too, use 2>&1 before | (or |& in bash).
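
Counting lines on the far side of the pipe makes this visible. The helper emit below is hypothetical. It writes one line to stdout and one to stderr, so wc -l shows how many lines actually travelled through the pipe.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() {
  echo "out"
  echo "err" >&2
}

# Plain pipe: only stdout reaches wc; "err" goes around the pipe to the screen.
stdout_only=$(emit | wc -l)

# With 2>&1 before the pipe: stderr is merged into stdout, so wc sees both lines.
both=$(emit 2>&1 | wc -l)

echo "lines through the pipe: $stdout_only without 2>&1, $both with 2>&1"
```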

See it and save it: tee

A pipe passes data onward, so you don't see it. tee "splits" the stream in two: it writes to a file and passes the data on to the screen/pipe:

echo "write to 2 places" | tee out.txt

This prints "write to 2 places" to the screen and writes it into out.txt. Handy when you want to save a command's output while still watching it run. Use tee -a to append instead of overwrite.
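
Because tee writes its copy to stdout, it can also sit in the middle of a pipeline and save an intermediate result. The sketch below shows that along with the append flag. The file names raw.txt, sorted.txt, and log.txt are made up for the demo.

```shell
# Scratch directory for the demo (assumption: mktemp is available).
workdir=$(mktemp -d)
cd "$workdir"

# tee mid-pipeline: raw.txt keeps the unsorted data, sort receives the same data.
printf 'pear\napple\n' | tee raw.txt | sort > sorted.txt

# Plain tee overwrites its file; tee -a adds to the end.
echo "first"  | tee log.txt
echo "second" | tee -a log.txt

echo "--- raw.txt ---";    cat raw.txt
echo "--- sorted.txt ---"; cat sorted.txt
echo "--- log.txt ---";    cat log.txt
```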

🧹 Cleanup

cd /tmp && rm -f real.txt out.txt err.txt f.txt all.txt t.txt

Wrap-up

Every program has three streams: stdin (fd 0), stdout (fd 1), stderr (fd 2). You control them: >/>> overwrite/append stdout into a file, 2> for stderr, 2>&1 merges errors into stdout, < for stdin, /dev/null to discard. And | (pipe) connects one command's stdout into the next command's stdin — turning the small tools from Article 5 into powerful pipelines. tee lets you save and watch at the same time.

Once you understand this model, you can read those intimidating long command lines just by following the data stream. Article 7 moves to another deep dive you hit every day on a server: file permissions.