Volumes: ephemeral, hostPath, and projected

Kai · 6 min read

Part IX is about storage, and the foundational brick is the volume. The docs state two problems volumes exist to solve. One: "On-disk files in a container are ephemeral ... when a container crashes or is stopped ... all of the files that were created or modified during the lifetime of the container are lost." Two: "when multiple containers are running in a Pod and need to share files ... challenging to set up a shared filesystem." Volumes patch both: a volume outlives the container and is shareable between containers in a pod.

The point to grasp before the details is lifetime. The docs draw the distinction this way: "Ephemeral volume types have a lifetime linked to a specific Pod, but persistent volumes exist beyond the lifetime of any individual Pod." This article covers the ephemeral volumes (live and die with the pod) and one special type that borrows the host; persistent volumes (outliving the pod) are PV/PVC in later articles. Usage is always two parts: declare the source under spec.volumes, then mount it into a container under spec.containers[*].volumeMounts.

emptyDir: a scratch area, shared within the pod

emptyDir is the simplest volume. The docs: it's created "when the Pod is assigned to a node", starts empty, and — here's the key lifetime point — "When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently", but "A container crashing does not remove a Pod from a node. The data in an emptyDir volume is safe across container crashes." So it lives with the pod, not the container: the container dies and comes back and still sees its data, but the pod disappearing wipes it.

Its main use is sharing between containers in a pod. Here is a pod with two containers sharing one emptyDir, where writer writes and reader reads:

apiVersion: v1
kind: Pod
metadata: {name: shared-vol}
spec:
  containers:
  - name: writer
    image: busybox:1.36
    command: ["sh","-c","while true; do date +%T > /data/now.txt; sleep 2; done"]
    volumeMounts: [{name: d, mountPath: /data}]      # writer sees it at /data
  - name: reader
    image: busybox:1.36
    command: ["sh","-c","sleep 3600"]
    volumeMounts: [{name: d, mountPath: /shared}]    # reader sees it at /shared
  volumes:
  - {name: d, emptyDir: {}}                          # the same volume

Both containers mount the same volume d (at different paths). Read from reader:

kubectl exec shared-vol -c reader -- cat /shared/now.txt
17:35:40

reader reads exactly the timestamp writer just wrote — they look at the same storage even though the mount paths differ. This is the underlying mechanism of the sidecar pattern (Article 19): the init/sidecar prepares data into the emptyDir, the app reads it out. (Add medium: Memory and the emptyDir becomes a tmpfs in RAM — exactly what we used to trigger OOM/eviction in Articles 22 and 38; sizeLimit caps the size.)
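For the memory-backed variant mentioned in the parenthesis, a minimal sketch could look like the following. The pod name tmpfs-vol, the volume name scratch, and the 64Mi cap are illustrative assumptions, not values from the earlier articles.

```yaml
apiVersion: v1
kind: Pod
metadata: {name: tmpfs-vol}          # hypothetical name, for illustration only
spec:
  containers:
  - name: c
    image: busybox:1.36
    command: ["sh","-c","sleep 3600"]
    volumeMounts: [{name: scratch, mountPath: /scratch}]
  volumes:
  - name: scratch
    emptyDir:
      medium: Memory                 # back the volume with tmpfs (RAM) instead of node disk
      sizeLimit: 64Mi                # cap on the volume size; the value is an assumption
```

Keep in mind that files written to a memory-backed emptyDir count against the memory limit of the container that wrote them, which is why this variant is handy for provoking OOM behaviour.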

hostPath: borrow a node directory

hostPath mounts a file/directory from the node's filesystem into the pod. Create a marker on host worker-0, then mount it:

ssh worker-0 'sudo mkdir -p /var/k8s-demo && echo "toi-la-file-tren-host-worker-0" | sudo tee /var/k8s-demo/marker.txt'

apiVersion: v1
kind: Pod
metadata: {name: host-vol}
spec:
  nodeName: worker-0           # pin to the node that has the file
  containers:
  - name: c
    image: busybox:1.36
    command: ["sh","-c","sleep 3600"]
    volumeMounts: [{name: h, mountPath: /host-data, readOnly: true}]
  volumes:
  - name: h
    hostPath: {path: /var/k8s-demo, type: Directory}

kubectl exec host-vol -- cat /host-data/marker.txt
toi-la-file-tren-host-worker-0

The pod reads the file directly on host worker-0. Note two things. First, you must pin the node (nodeName: worker-0) because the file only exists on that node: a pod placed on worker-1 would fail the type: Directory check and never start, or, if a directory with that path happened to exist there, would see different contents. Second, hostPath is a security hazard: the pod touches the node's filesystem, so mounting / or /var/run/docker.sock is a backdoor to taking over the node, which is why Pod Security Standards (Part XI) block it at the baseline and restricted levels. hostPath should only be used for system agents (a DaemonSet reading /var/log, as in Article 26), not for ordinary apps.
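For the system-agent case, a minimal sketch of a DaemonSet that mounts the node's /var/log read-only might look like this. The name log-reader, its label, the /host-log mount path, and the sleep placeholder command are assumptions for illustration; this is not the manifest from Article 26.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata: {name: log-reader}              # hypothetical name
spec:
  selector:
    matchLabels: {app: log-reader}        # hypothetical label
  template:
    metadata:
      labels: {app: log-reader}
    spec:
      containers:
      - name: agent
        image: busybox:1.36
        command: ["sh","-c","sleep 3600"] # placeholder; a real agent would read files under /host-log
        volumeMounts: [{name: varlog, mountPath: /host-log, readOnly: true}]
      volumes:
      - name: varlog
        hostPath: {path: /var/log, type: Directory}
```

Because a DaemonSet runs one pod per node, no nodeName pin is needed here: each pod sees the /var/log of the node it landed on. Mounting read-only keeps the agent from modifying host files.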

projected: combine several sources into one place

projected isn't new storage — it combines several existing sources into one directory. The docs: "A projected volume maps several existing volume sources into the same directory." The sources it can combine: secret, configMap, downwardAPI (Articles 22, 31), and serviceAccountToken. A pod combining all four:

  volumes:
  - name: all
    projected:
      sources:
      - configMap: {name: proj-cm, items: [{key: app.conf, path: cm/app.conf}]}
      - secret:    {name: proj-sec, items: [{key: db.pass, path: sec/db.pass}]}
      - downwardAPI: {items: [{path: meta/labels, fieldRef: {fieldPath: metadata.labels}}]}
      - serviceAccountToken: {audience: api, expirationSeconds: 3600, path: token/sa.jwt}

kubectl exec projected-vol -- sh -c '
  echo "configMap:   $(cat /projected/cm/app.conf)"
  echo "secret:      $(cat /projected/sec/db.pass)"
  echo "downwardAPI: $(cat /projected/meta/labels)"
  echo "saToken:     $(cut -c1-25 /projected/token/sa.jwt)..."'
configMap:   mode=prod
secret:      hunter2
downwardAPI: team="storage"
saToken:     eyJhbGciOiJSUzI1NiIsImtpZ...

Four sources of different kinds (config, secret, pod metadata, and a token) sit neatly within one directory tree, /projected, each source at its own sub-path. That is convenient for an app that only needs to point at one mount point. The most notable is serviceAccountToken: it injects a short-lived, bound JWT (eyJhbGci... is the JWT header). Unlike the old-style token (a permanent Secret), this token has an audience (it is only accepted by the intended recipient) and expirationSeconds (the kubelet rotates the file on its own before the token expires), which is far safer. This is how a modern pod authenticates itself to the API server, and we'll see it again in Article 53 (ServiceAccount & tokens).
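The volumes fragment above leaves out the objects it depends on. A sketch of what they could look like follows. It is reconstructed from the output shown, so treat the details as assumptions: the ConfigMap and Secret values are inferred from the printed lines, the team label from the downwardAPI output, the /projected mount path from the exec command, and the container spec is simply copied from the earlier pods.

```yaml
apiVersion: v1
kind: ConfigMap
metadata: {name: proj-cm}
data:
  app.conf: mode=prod                  # inferred from the "configMap:" output line
---
apiVersion: v1
kind: Secret
metadata: {name: proj-sec}
stringData:
  db.pass: hunter2                     # inferred from the "secret:" output line
---
apiVersion: v1
kind: Pod
metadata:
  name: projected-vol
  labels: {team: storage}              # inferred from the "downwardAPI:" output line
spec:
  containers:
  - name: c                            # container spec assumed, mirroring the earlier pods
    image: busybox:1.36
    command: ["sh","-c","sleep 3600"]
    volumeMounts: [{name: all, mountPath: /projected, readOnly: true}]
  volumes:
  - name: all
    projected:
      sources:
      - configMap: {name: proj-cm, items: [{key: app.conf, path: cm/app.conf}]}
      - secret:    {name: proj-sec, items: [{key: db.pass, path: sec/db.pass}]}
      - downwardAPI: {items: [{path: meta/labels, fieldRef: {fieldPath: metadata.labels}}]}
      - serviceAccountToken: {audience: api, expirationSeconds: 3600, path: token/sa.jwt}
```

The ConfigMap and Secret have to exist before the pod starts, otherwise the volume cannot be assembled and the pod stays in ContainerCreating.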

🧹 Cleanup

kubectl delete pod shared-vol host-vol projected-vol --now
kubectl delete configmap proj-cm ; kubectl delete secret proj-sec
ssh worker-0 'sudo rm -rf /var/k8s-demo'      # remove the marker on the host

The pods and in-cluster objects are wiped by deletion; the hostPath marker, however, sits on the host, so it has to be removed by hand on worker-0 (one more consequence of hostPath living outside the pod's lifetime). The cluster returns to CoreDNS + metrics-server. Manifests at github.com/nghiadaulau/kubernetes-from-scratch, directory 41-volumes.

Wrap-up

Volumes patch two problems with files in a container: they are lost on container restart, and they are not shareable between containers. Lifetime is the classification criterion. emptyDir lives with the pod (safe across container crashes, gone when the pod disappears) and is shareable within the pod: we saw reader read exactly the timestamp writer wrote into the same volume. hostPath borrows a node directory: the data lives outside the pod, the pod must be pinned to the node, and it is a security hazard, so it is only for system agents. projected combines configMap/secret/downwardAPI/serviceAccountToken into one directory; notably, serviceAccountToken injects a short-lived JWT with audience and expiry for the pod to authenticate to the API server. emptyDir and projected live and die with the pod; hostPath is the exception, since its data stays on the node (hence the manual cleanup). Article 42 moves to persistent storage, PersistentVolume / PersistentVolumeClaim: separating the "storage request" (PVC, declared by the app) from the "actual storage" (PV, created by the admin/CSI), and clarifying what binds what, what creates what.