Optimizing Images: Multi-Stage Builds and Security
Your image runs, but it may be many times larger than it needs to be. A large image means slow deploys (more data to pull), wasted disk and bandwidth, and a wider attack surface (the more software in the image, the more potential vulnerabilities). This article shows how to make images smaller and safer.
Why images tend to get large
The most common cause: the image carries the build tools that aren't needed at runtime. To compile an application you need a compiler, dev libraries, packaging tools... But to run it you only need the built result. If you leave the whole toolchain in the final image, it bloats for no reason.
Take a Go example. The "naive" way — build and run in the same image:
FROM golang:1.22-alpine
WORKDIR /src
COPY . .
RUN go build -o /app .
CMD ["/app"]
This image weighs 301MB — because it carries the whole Go toolchain (golang:1.22-alpine) while the app is just a binary of a few MB.
Multi-stage builds
The solution is the multi-stage build: use multiple FROMs in one Dockerfile. The first stage has all the tools to build; the final stage is a lean image that only copies the result from the build stage. Everything in the build stage (compiler, intermediate files) is left behind and doesn't enter the final image.
# Stage 1: build (has the Go toolchain)
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY . .
RUN go build -o /app .
# Stage 2: runtime (lean, takes only the binary)
FROM alpine:latest
COPY --from=build /app /app
CMD ["/app"]
COPY --from=build /app /app takes exactly the binary from the build stage. The final image has only Alpine + the binary.
The measured result of building the image both ways:
docker images | grep demo
demo:multi 15.4MB
demo:single 301MB
From 301MB down to 15.4MB — nearly 20× smaller, same application. The app still runs identically; we just dropped the part that isn't needed at runtime.
┌─ Stage 1: build ───────────────────┐
│ FROM golang (toolchain ~300MB)     │
│ RUN go build -o /app               │
└─────────────────┬──────────────────┘
                  │ COPY --from=build /app
                  ▼
┌─ Stage 2: runtime ─────────────────┐
│ FROM alpine (~7MB)                 │ ← final image
│ contains only: alpine + binary     │   just 15.4MB
└────────────────────────────────────┘
(everything in Stage 1 is left behind)
Multi-stage applies to every language: Node (the build stage runs npm install + bundle, the runtime stage takes only the built code + production dependencies), Java (build with Maven, runtime takes only the .jar), and so on.
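As a rough illustration of the Node case, a multi-stage Dockerfile might look like the sketch below. It assumes a hypothetical project with a package-lock.json, a "build" script in package.json that writes to dist/, and dist/index.js as the entry point. None of these names come from this series, so adjust them to your project.

```dockerfile
# Stage 1: build (has dev dependencies and the bundler)
FROM node:20.11-alpine AS build
WORKDIR /src
# Copy the manifests first so the dependency layer is cached between builds
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Assumption: package.json defines a "build" script that outputs to dist/
RUN npm run build

# Stage 2: runtime (production dependencies + built code only)
FROM node:20.11-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
# Install only production dependencies; dev tools stay in the build stage
RUN npm ci --omit=dev
COPY --from=build /src/dist ./dist
# Assumption: dist/index.js is the entry point
CMD ["node", "dist/index.js"]
```

The bundler, test tools, and source files exist only in the first stage. The final image holds the built code and the production dependencies.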
Pick a small base image
Most of the size comes from the base image. For the same software, different bases give very different sizes:
- Full (node:20, python:3.12): bundles many tools, the largest.
- -slim: a trimmed version, dropping less-used things.
- -alpine: based on Alpine Linux, very small (a few MB). Note that Alpine uses musl libc instead of glibc, which can sometimes break software that needs glibc, so test it.
- distroless (from Google): runtime only, not even a shell; safer but harder to debug.
- scratch: an absolutely empty image, used for static binaries (like Go) that need nothing else.
The rule: pick the smallest base on which the app still runs fine.
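For the Go demo used earlier, the smallest option in the list, scratch, could be tried with a sketch like this. It assumes the application needs no C library, no shell, and no system files. An app that makes HTTPS calls, for example, would also need CA certificates copied in.

```dockerfile
# Stage 1: build a statically linked binary
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 avoids linking against a C library, which scratch does not provide
RUN CGO_ENABLED=0 go build -o /app .

# Stage 2: an empty base that contains only the binary
FROM scratch
COPY --from=build /app /app
# Exec form is required here: scratch has no shell to interpret a shell-form CMD
CMD ["/app"]
```

If the binary fails to start on scratch, stepping back up to distroless or alpine is usually the pragmatic fix.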
Reduce the number of layers and keep them lean
Recall Articles 2/5: every RUN creates a layer, and a file added in one layer then deleted in a later layer does not shrink the image (it still sits in the old layer). So clean up within the same RUN:
# Good: install and clean the cache in the same layer
RUN apk add --no-cache curl \
&& <do something> \
&& rm -rf /var/cache/...
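The same idea on a Debian-based image, as a minimal sketch. The base image debian:bookworm-slim and the curl package are only examples chosen for illustration.

```dockerfile
FROM debian:bookworm-slim

# Bad (left commented out): the package lists are written in one layer and
# "deleted" in the next, so they still take up space in the image
# RUN apt-get update && apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*

# Better: install and clean up in a single RUN, so the lists never reach a layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```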
And don't forget .dockerignore (Article 5) so you don't stuff node_modules, .git, and build files into the context.
Securing images
A small image is already safer (less software = fewer vulnerabilities). Add a few habits:
Run as a non-root user. By default the process in a container runs as root (root inside the container — recall user namespaces in Article 2). Create and switch to a regular user:
RUN addgroup -S app && adduser -S app -G app
USER app
CMD ["/app"]
If an attacker compromises the process, running as non-root limits what they can do inside the container and makes it harder to escalate to the host.
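Combined with the multi-stage Go build from earlier, the non-root setup might look like the sketch below. The tag alpine:3.19 is only an example of a pinned version, and the user name app follows the snippet above.

```dockerfile
# Stage 1: build (has the Go toolchain)
FROM golang:1.22-alpine AS build
WORKDIR /src
COPY . .
RUN go build -o /app .

# Stage 2: runtime, running as an unprivileged user
# Assumption: alpine:3.19 is just an example of a pinned tag
FROM alpine:3.19
# Create the user while still root, then drop privileges
RUN addgroup -S app && adduser -S app -G app
COPY --from=build /app /app
USER app
CMD ["/app"]
```

The user is created before USER app because adding users requires root. Everything after the USER line, including the running container, uses the unprivileged account.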
Don't bake secrets into the image. Don't COPY a .env file, private keys, or passwords into the image: once the image is pushed to a registry, anyone who can pull it can read every layer (recall that docker history in Article 4 exposes the build commands too), and deleting the file in a later layer does not remove it from the earlier one. Pass secrets at runtime via environment variables, or use a secrets mechanism (Article 13 with Swarm).
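Sometimes a secret is needed during the build itself, for example to download private dependencies. This article does not cover that case, but one option is a BuildKit secret mount, sketched below. The script fetch-deps.sh, the variable API_TOKEN, and the secret id api_token are hypothetical names used only for illustration.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.19
WORKDIR /work
# Assumption: fetch-deps.sh is a hypothetical script that reads API_TOKEN
COPY fetch-deps.sh .
# The secret is mounted at /run/secrets/api_token for this RUN only
# and is not written to any image layer
RUN --mount=type=secret,id=api_token \
    API_TOKEN="$(cat /run/secrets/api_token)" sh ./fetch-deps.sh
```

The secret would be supplied at build time with something like docker build --secret id=api_token,src=./api_token.txt . and the build step itself must take care not to write the value into a file that ends up in the image.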
Pin versions. Use a specific tag (node:20.11-alpine) instead of latest (Article 4) so builds are reproducible and you know exactly what you're running.
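Scan for vulnerabilities. Docker Desktop ships with a scanning tool, Docker Scout (on a plain Docker Engine install, the scout CLI plugin may need to be installed separately):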
docker scout quickview <image>
docker scout cves <image>
It lists the known CVEs in the image, helping you know which base or package to update.
🧹 Cleanup
docker rmi demo:single demo:multi 2>/dev/null
docker image prune -a # remove all images not used by any container (not just dangling ones)
docker builder prune # remove build cache (multi-stage creates a lot of cache)
Wrap-up
Images get large mostly from carrying build tools not needed at runtime. The multi-stage build separates build from runtime, copying only the result into the final image — a clear size reduction (e.g. 301MB → 15.4MB). On top of that: pick a small base, merge/clean layers, run non-root, don't bake in secrets, pin versions, and scan for CVEs. Small, clean images deploy faster and are safer.
By now we're proficient with Docker on a single machine: build, run, store, network, group with Compose, optimize images. The rest of the series steps into Docker Swarm — running containers across multiple machines. Article 10 opens with the cluster architecture and the Raft consensus mechanism.