Volumes and Bind Mounts: Storing Data Persistently

Kai · 5 min read

Recall Article 2: everything a container writes lives in the writable layer, and the writable layer vanishes when the container is removed. So what about a database, user-uploaded files, or logs that need to persist? The answer is to move the data outside the container — via a volume or a bind mount. That's this article's topic.

The problem: data dies with the container

A quick test to see the problem. Run a container, write a file, then remove the container:

docker run --name tmp alpine sh -c 'echo "important" > /data.txt'
docker rm tmp

The file /data.txt lives in the writable layer of the tmp container, so docker rm deletes it along with the container. For a database this is a disaster. We need a storage location that is decoupled from the container's lifecycle.
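To see the writable layer for yourself, docker diff lists what a container changed relative to its image. The helper below is a small sketch (the function name is made up for this article): it prints those changes and then removes the container. It assumes the container has already exited, for example one created by rerunning the docker run line above under a fresh name such as tmp2.

```shell
# Hypothetical helper, not a Docker command: show what a container wrote
# into its writable layer, then remove the container (and that layer).
diff_then_remove() {
  name="$1"
  # docker diff prefixes each path with A (added), C (changed) or D (deleted).
  docker diff "$name"
  docker rm "$name"
}

# Example (requires Docker and an exited container named tmp2):
#   diff_then_remove tmp2
```

For a container created as above, the listing should include /data.txt marked as added, which is exactly the data that docker rm then discards.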

Docker has two main approaches (plus tmpfs for temporary in-RAM data):

   Container
   ┌─────────────────────────────┐
   │ Writable layer (dies with    │   ← do NOT put data you need to keep here
   │ the container)               │
   │                              │
   │  /data  ───┐   /app ───┐     │
   └────────────┼───────────┼─────┘
                │           │
        ┌───────▼──┐   ┌────▼──────────┐
        │  Volume  │   │  Bind mount   │
        │ (managed │   │ (a path on    │
        │by Docker)│   │  the host)    │
        └──────────┘   └───────────────┘
   /var/lib/docker/...   /home/you/app

Named volume: let Docker manage it

Per the docs, a volume is a persistent storage mechanism managed by the Docker daemon, and is "ideal for performance-critical data processing and long-term storage." Docker stores volume data in its own area on the host, and you don't need to care about the specific path.

Create a volume and write data into it through a container:

docker volume create demo-data
docker run --rm -v demo-data:/data alpine sh -c 'echo "important data" > /data/note.txt'

The -v demo-data:/data syntax means: mount the demo-data volume at the path /data inside the container. Because this container runs with --rm, it is deleted as soon as the command finishes. Now start a completely new container that mounts the same volume:

docker run --rm -v demo-data:/data alpine cat /data/note.txt
important data

The data is still there, even though the container that wrote it was deleted long ago. The volume lives independently of any container — this is exactly the persistence we need.

See where the volume lives:

docker volume inspect demo-data --format '{{.Mountpoint}}'
/var/lib/docker/volumes/demo-data/_data

(On macOS/Windows this path is inside the Linux VM — recall Article 1.)

Managing volumes:

docker volume ls            # list
docker volume inspect <name> # details
docker volume rm <name>      # remove
docker volume prune          # remove unused anonymous volumes (Docker 23.0+; add -a to include unused named volumes)

Be careful with volume prune and rm: a volume holds real data (a database...). Deleting it loses the data. This is why volumes are decoupled from containers: you can delete containers freely, but delete volumes deliberately.
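Because deleting a volume is irreversible, it can be worth archiving its contents first. One common pattern, sketched here with a hypothetical helper name, mounts the volume read-only into a throwaway container alongside a bind mount of the current directory and runs tar. The archive name and the /backup path are assumptions chosen for illustration.

```shell
# Hypothetical helper: archive a named volume into the current host directory.
# Usage: backup_volume demo-data   -> writes ./demo-data.tar.gz
backup_volume() {
  vol="$1"
  # /data   = the volume, mounted read-only so the backup cannot modify it
  # /backup = the current host directory, where the archive is written
  docker run --rm \
    -v "$vol":/data:ro \
    -v "$(pwd)":/backup \
    alpine tar czf "/backup/$vol.tar.gz" -C /data .
}
```

Run it before docker volume rm if there is any chance the data will be needed again.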

Bind mount: attach a host directory directly

Per the docs, a bind mount creates a direct link between a path on the host and the container, and should be used "when you need to access files from both the container and the host." Unlike a volume (where Docker manages the storage area), a bind mount points to exactly the file or directory you specify on the machine.

mkdir mysite
echo "<h1>My page</h1>" > mysite/index.html
docker run --rm -v "$(pwd)/mysite":/usr/share/nginx/html -p 8080:80 nginx:alpine

Open http://localhost:8080 and you'll see the content from mysite/index.html on your machine. Edit that file on the host and reload the page — the content changes instantly, with no image rebuild. The link is two-way: the container can also write back out to the host.

Because of this instant, two-way nature, bind mounts are great for development: mount source code from the host into the container so you can code on the host and see the changes in the container. They're also good for bringing configuration files in from the host (nginx.conf, .env files...).

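With -v, what Docker does depends on the left side. A path that starts with / (such as $(pwd)/mysite) is a bind mount. A bare name with no slash (mysite) is treated as a named volume, so Docker silently creates a volume called mysite instead of mounting your directory. Docker 23.0 and later also resolve ./mysite relative to the current directory, but older releases do not, so an absolute path is the safest choice.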
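As a rough rule of thumb, the decision can be written down as a small shell function. This is only a sketch of the behavior, not Docker's actual parsing code: the function name is hypothetical, and it ignores Windows drive letters and other edge cases.

```shell
# Rough sketch of how the left side of -v SOURCE:TARGET is interpreted.
# Simplified on purpose; not Docker's real implementation.
classify_v_source() {
  case "$1" in
    /*)           echo "bind mount" ;;
    ./*|../*)     echo "relative path (version-dependent; prefer an absolute path)" ;;
    [A-Za-z0-9]*) echo "named volume" ;;
    *)            echo "unclear (check the Docker docs)" ;;
  esac
}

classify_v_source "$(pwd)/mysite"   # bind mount
classify_v_source mysite            # named volume
```

The second call is the trap: nothing fails, but the container sees an empty volume rather than your directory.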

-v and --mount: two syntaxes

Docker has two ways to write this. -v is concise (used a lot in this article). --mount is longer but explicit, and is recommended for complex configurations:

# equivalent to -v demo-data:/data
docker run --mount type=volume,source=demo-data,target=/data alpine ...

# equivalent to -v "$(pwd)/mysite":/html
docker run --mount type=bind,source="$(pwd)/mysite",target=/html alpine ...

--mount forces you to spell out type=volume or type=bind, so it avoids the volume/bind-mount mix-up mentioned above.
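Options are spelled out by name as well. As a hedged sketch, the hypothetical helper below serves a host directory through nginx:alpine (the image used earlier) with the readonly option, so the container cannot write back to the host. The helper name and the port mapping are assumptions for illustration.

```shell
# Hypothetical helper: serve a host directory read-only on http://localhost:8080.
# Usage: serve_readonly mysite
serve_readonly() {
  dir="$1"
  # readonly makes the mount read-only inside the container.
  docker run --rm -p 8080:80 \
    --mount type=bind,source="$(pwd)/$dir",target=/usr/share/nginx/html,readonly \
    nginx:alpine
}
```

A read-only mount is a reasonable default for configuration files and static content that the container only needs to read.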

Choosing volume or bind mount

Summarized per the docs' recommendations:

Situation                                                 Use
Application data that must persist (database, uploads)    Volume (managed by Docker, preferred)
Code/config files from the host, dev environment          Bind mount
Temporary data, no need to keep, want it fast             tmpfs (--tmpfs)

A pragmatic rule: on a production server, use volumes for long-lived data. Save bind mounts for development and for bringing configuration in.
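When you are unsure which kind of storage a running container actually uses, its mount list can be inspected. The helper below is a minimal sketch (the function name is made up) that prints one line per mount with its type, its source on the host, and its destination inside the container.

```shell
# Hypothetical helper: print "<type> <source> -> <destination>" for each mount
# of a container, e.g. to check whether /data is a volume or a bind mount.
# Usage: show_mounts <container-name>
show_mounts() {
  docker container inspect \
    --format '{{range .Mounts}}{{.Type}} {{.Source}} -> {{.Destination}}{{println}}{{end}}' \
    "$1"
}
```

The type column should read volume for named volumes and bind for bind mounts.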

Real example: a database keeps its data

This is why volumes exist. Run PostgreSQL with a volume for its data directory:

docker run -d --name db \
  -e POSTGRES_PASSWORD=secret \
  -v pgdata:/var/lib/postgresql/data \
  postgres:16-alpine

You can docker rm -f db and then rerun the command above — the database still has all its data, because it lives in the pgdata volume, not in the container. Without a volume, deleting the container each time wipes the whole database.
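To check this yourself, the sketch below wraps the command above and adds a way to run SQL inside the container. The helper names, the notes table, and its contents are hypothetical. It assumes the image's default postgres superuser and its default passwordless access over the local socket inside the container.

```shell
# Hypothetical helpers for checking that data survives container removal.
# They reuse the container name db and the volume pgdata from above.
pg_up() {
  docker run -d --name db \
    -e POSTGRES_PASSWORD=secret \
    -v pgdata:/var/lib/postgresql/data \
    postgres:16-alpine
}

# Run one SQL command inside the running container as the postgres user.
pg_sql() {
  docker exec db psql -U postgres -c "$1"
}

# Example session (give the server a few seconds to start after each pg_up):
#   pg_up
#   pg_sql "CREATE TABLE notes (body text)"
#   pg_sql "INSERT INTO notes VALUES ('still here')"
#   docker rm -f db
#   pg_up
#   pg_sql "SELECT body FROM notes"
```

If the volume works as described, the final SELECT should return the row inserted before the container was removed.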

🧹 Cleanup

docker rm -f db 2>/dev/null
docker volume rm demo-data pgdata 2>/dev/null
docker volume prune        # remove leftover unused anonymous volumes (asks for confirmation)
rm -rf mysite

Remember: deleting a volume deletes real data, so only prune when you're sure.

Wrap-up

A container's writable layer is temporary; data that must persist has to live outside the container. A volume is managed by Docker and is the preferred choice for persistent data like a database. A bind mount attaches a host directory directly, good for dev and config files. Both live independently of the container's lifecycle.

Our containers now run and keep their data. But containers are still isolated from each other (recall the network namespace in Article 2). Article 7 lets them talk to each other and to the outside world: networking in Docker.