VolumeSnapshot and CSI snapshot

Kai · 5 min read

Article 43 gave us dynamically provisioned persistent volumes. The next real-ops question: how do you back them up? A VolumeSnapshot takes a point-in-time snapshot of a PVC's contents, and with EBS CSI that snapshot is a real EBS snapshot on AWS that can be used to restore or clone a volume. This article closes Part IX, and its model parallels PV/PVC, so we already have the intuition.

Three objects, parallel to PV/PVC

The docs give a tidy analogy:

VolumeSnapshot : VolumeSnapshotContent  ::  PersistentVolumeClaim : PersistentVolume

VolumeSnapshot          user requests a snapshot, namespaced               (PersistentVolumeClaim: user requests storage)
VolumeSnapshotContent   the REAL snapshot at the backend, cluster-scoped   (PersistentVolume: real storage)
VolumeSnapshotClass     driver + parameters                                (~ StorageClass)
  • VolumeSnapshot: "a request for snapshot of a volume by a user", namespaced, like a PVC.
  • VolumeSnapshotContent: "a snapshot taken from a volume ... a cluster resource", like a PV.
  • VolumeSnapshotClass: driver + parameters, like a StorageClass.

Important: these three are CRDs, not part of the core API — they must be installed separately (like VPA in Article 40), along with a snapshot-controller in the control plane. The thing that executes the snapshot is the csi-snapshotter sidecar — which already lives inside ebs-csi-controller (the 6/6 containers in Article 43).

Install the snapshot CRD + controller

# CRDs (volumesnapshots, volumesnapshotcontents, volumesnapshotclasses + group...) from external-snapshotter v8.2.0
kubectl apply -f .../client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
# ... (the 3 main CRDs)
# snapshot-controller (rbac + deployment) into the control plane
kubectl apply -f .../deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
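Before going further it is worth confirming that the install actually landed. The function below is a minimal sketch of such a check: it looks for the three CRDs by their full names and for the controller Deployment. The function name check_snapshot_install is made up for this example, and the Deployment name snapshot-controller and the kube-system namespace are assumptions about what the setup manifest creates; pass a different namespace if yours differs.

```shell
# Sketch: confirm the snapshot CRDs and the snapshot-controller are installed.
# Assumption: the controller Deployment is called "snapshot-controller" and
# lives in kube-system unless another namespace is passed as $1.
check_snapshot_install() {
  ns="${1:-kube-system}"
  missing=0

  # The three CRDs in the snapshot.storage.k8s.io group.
  for crd in volumesnapshots volumesnapshotcontents volumesnapshotclasses; do
    if kubectl get crd "${crd}.snapshot.storage.k8s.io" >/dev/null 2>&1; then
      echo "ok       ${crd}.snapshot.storage.k8s.io"
    else
      echo "MISSING  ${crd}.snapshot.storage.k8s.io"
      missing=1
    fi
  done

  # The controller that pairs VolumeSnapshot objects with their Content.
  if kubectl -n "$ns" get deployment snapshot-controller >/dev/null 2>&1; then
    echo "ok       deployment/snapshot-controller in ${ns}"
  else
    echo "MISSING  deployment/snapshot-controller in ${ns}"
    missing=1
  fi

  # Non-zero exit status if anything was missing.
  return "$missing"
}
```

Calling check_snapshot_install with no arguments prints one line per component and returns non-zero if any piece is absent, so it can gate the next steps of a script.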

The role split to remember: snapshot-controller (control plane) coordinates the Snapshot↔Content objects; csi-snapshotter (sidecar in ebs-csi-controller) calls the driver to actually create the snapshot at AWS. Create a VolumeSnapshotClass pointing at the EBS driver:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata: {name: ebs-vsc}
driver: ebs.csi.aws.com
deletionPolicy: Delete

Snapshot a PVC

Stand up a PVC + pod that writes data (using ebs-sc from Article 43), then create a VolumeSnapshot pointing at the PVC:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata: {name: snap1}
spec:
  volumeSnapshotClassName: ebs-vsc
  source: {persistentVolumeClaimName: src-pvc}      # snapshot this PVC

The chain of cause and effect (parallel to provisioning in Article 43):

user ──creates──▶ VolumeSnapshot snap1 (source: src-pvc)
snapshot-controller ── sees snap1 → creates VolumeSnapshotContent
csi-snapshotter (sidecar) ── calls driver CreateSnapshot
EBS CSI driver ──calls AWS──▶ AWS creates an EBS snapshot (snap-...)
                            ▼ when AWS reports completed
VolumeSnapshot.readyToUse = true  ◀──bound──▶ VolumeSnapshotContent

kubectl get volumesnapshot snap1
NAME    READYTOUSE   SOURCEPVC   RESTORESIZE   SNAPSHOTCONTENT
snap1   true         src-pvc     2Gi           snapcontent-6fee...

kubectl get volumesnapshotcontent <name> -o jsonpath='{.status.snapshotHandle}'
snapshotHandle = snap-06de0c51fde8e085f

aws ec2 describe-snapshots --snapshot-ids <snap-id>
snap-06de0c51fde8e085f   2   completed   vol-01a0fa04f3b2b37a6

readyToUse turns true after about 70s, once AWS finishes the snapshot. The VolumeSnapshot is bound to its VolumeSnapshotContent, and snapshotHandle is a real EBS snapshot (snap-06de..., state completed, sourced from the volume behind src-pvc). The Kubernetes object and the AWS snapshot map one-to-one, exactly like the PVC↔PV model.
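Since the snapshot is not usable the instant the object is created, a script has to wait for it. Here is a rough sketch of two helpers: one polls status.readyToUse, the other follows the binding from the VolumeSnapshot to its VolumeSnapshotContent and prints the backend snapshot ID. The function names, the default timeout of 180s, and the 5s poll interval are choices made for this example, not anything the driver prescribes.

```shell
# Poll until the VolumeSnapshot reports readyToUse=true, or give up.
# Arguments: snapshot name, timeout in seconds (default 180),
# poll interval in seconds (default 5). All defaults are illustrative.
wait_snapshot_ready() {
  snap="$1"; timeout="${2:-180}"; interval="${3:-5}"
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    ready="$(kubectl get volumesnapshot "$snap" \
      -o jsonpath='{.status.readyToUse}' 2>/dev/null || true)"
    if [ "$ready" = "true" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for volumesnapshot/$snap" >&2
  return 1
}

# Follow VolumeSnapshot -> VolumeSnapshotContent -> backend snapshot ID.
snapshot_handle() {
  content="$(kubectl get volumesnapshot "$1" \
    -o jsonpath='{.status.boundVolumeSnapshotContentName}')"
  [ -n "$content" ] || return 1
  kubectl get volumesnapshotcontent "$content" \
    -o jsonpath='{.status.snapshotHandle}'
}
```

A typical use would be wait_snapshot_ready snap1 && snapshot_handle snap1, which prints the snap-... ID that can then be handed to aws ec2 describe-snapshots.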

Restore: a new PVC from the snapshot, and the page-cache trap

Restore by creating a new PVC with a dataSource pointing at the VolumeSnapshot:

apiVersion: v1
kind: PersistentVolumeClaim
metadata: {name: restore-pvc}
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-sc
  dataSource: {name: snap1, kind: VolumeSnapshot, apiGroup: snapshot.storage.k8s.io}
  resources: {requests: {storage: 2Gi}}
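Once a pod has mounted restore-pvc and the claim is Bound, one way to confirm where the new volume came from is to walk PVC → PV → EBS volume and read the volume's SnapshotId. The helper below is a sketch of that walk; the function name is invented for this example, and it assumes the PV was provisioned by a CSI driver (so spec.csi.volumeHandle holds the EBS volume ID) and that the aws CLI has credentials for the same account and region.

```shell
# Print the EBS snapshot ID that the volume behind a PVC was created from.
# Assumes a CSI-provisioned PV whose volumeHandle is the EBS volume ID.
restored_volume_snapshot_id() {
  pvc="$1"

  # PVC -> name of the bound PV (empty while the claim is still Pending).
  pv="$(kubectl get pvc "$pvc" -o jsonpath='{.spec.volumeName}')"
  if [ -z "$pv" ]; then
    echo "pvc/$pvc is not bound yet" >&2
    return 1
  fi

  # PV -> EBS volume ID stored by the CSI driver.
  vol="$(kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}')"

  # EBS volume -> the snapshot it was created from.
  aws ec2 describe-volumes --volume-ids "$vol" \
    --query 'Volumes[0].SnapshotId' --output text
}
```

Comparing the output of restored_volume_snapshot_id restore-pvc with the snapshotHandle of snap1 shows whether the restore really used that snapshot.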

A pod using restore-pvc goes Running, a new EBS volume is created from the snapshot (aws ec2 describe-volumes shows its SnapshotId pointing at the right snap-06de...). But reading the file gives... empty:

kubectl exec restore-pod -- ls -la /data
-rw-r--r--  1 root root  0 May 23 17:47 important.txt     # 0 bytes!

The file exists (its inode was snapshotted) but the content is 0 bytes, while the source pod had 24 bytes. The reason is that EBS snapshots are block-level and point-in-time: they capture the exact state of the block device at the snapshot moment. At that moment the data from echo > file was still in the OS page cache, not yet flushed to disk, so the snapshot didn't contain it. The fix: sync (or quiesce the app) before snapshotting:

kubectl exec src-pod -- sync          # push page cache down to the block device
# create snap2, restore from snap2:
kubectl exec restore2-pod -- cat /data/important.txt
data-goc-truoc-snapshot

After sync, the snapshot captured the data, and the restored PVC reads back the correct data-goc-truoc-snapshot from a brand-new volume. The lesson applies to every block-level snapshot (including database backups): you must quiesce (flush/freeze) before snapshotting, or the snapshot is only "crash-consistent" — like pulling the plug. Real databases use hooks (e.g. fsfreeze, or the DB's own flush command) to make the snapshot application-consistent.
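To make the flush hard to forget, the two steps can be wrapped together. The function below is a minimal sketch: it runs sync inside the writing pod and only then applies a VolumeSnapshot manifest built from its arguments. The function name and argument order are made up for this example, the class defaults to the ebs-vsc created earlier, and it assumes the container image ships a sync binary. Note that this only narrows the window: anything written between the sync and the moment the backend takes the snapshot can still be missed, which is why real databases freeze writes instead.

```shell
# Flush the pod's dirty pages, then request a snapshot of its PVC.
# Arguments: pod name, PVC name, snapshot name, snapshot class (default ebs-vsc).
snapshot_after_sync() {
  pod="$1"; pvc="$2"; snap="$3"; class="${4:-ebs-vsc}"

  # Push the page cache down to the block device; stop if this fails,
  # since a snapshot taken without the flush may miss recent writes.
  kubectl exec "$pod" -- sync || return 1

  # Same VolumeSnapshot shape as snap1, with the names filled in.
  cat <<EOF | kubectl apply -f -
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${snap}
spec:
  volumeSnapshotClassName: ${class}
  source:
    persistentVolumeClaimName: ${pvc}
EOF
}
```

For the walkthrough above, snapshot_after_sync src-pod src-pvc snap2 would cover the sync and the creation of snap2 in one call.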

🧹 Cleanup

kubectl delete pod restore2-pod src-pod --now
kubectl delete pvc restore2-pvc src-pvc            # -> EBS volumes auto-deleted (Delete)
kubectl delete volumesnapshot snap1 snap2          # -> EBS snapshots auto-deleted (Delete)

deletionPolicy: Delete on the VolumeSnapshotClass means that deleting a VolumeSnapshot also deletes the EBS snapshot on AWS (verify: describe-snapshots is empty). Keep the snapshot-controller + CRDs + VolumeSnapshotClass + EBS CSI for later articles. The cluster now has CoreDNS + metrics-server + ebs-csi + snapshot-controller. Manifests at github.com/nghiadaulau/kubernetes-from-scratch, directory 44-volumesnapshot.
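The "is it really gone" check can be scripted on both sides. The sketch below lists any EBS snapshots still recorded against a given source volume and any VolumeSnapshotContent objects left in the cluster. The function names are invented for this example; it assumes the snapshots are owned by the calling account, and that no unrelated VolumeSnapshotContent objects exist in the cluster (otherwise the Kubernetes side needs a narrower query).

```shell
# List IDs of EBS snapshots that were taken from the given volume ID.
# Prints nothing when none remain.
remaining_ebs_snapshots() {
  aws ec2 describe-snapshots --owner-ids self \
    --filters "Name=volume-id,Values=$1" \
    --query 'Snapshots[].SnapshotId' --output text
}

# Succeeds only if neither Kubernetes nor AWS still holds a snapshot.
# Argument: EBS volume ID of the snapshotted source volume.
cleanup_is_complete() {
  left_k8s="$(kubectl get volumesnapshotcontent -o name)"
  left_aws="$(remaining_ebs_snapshots "$1")"
  [ -z "$left_k8s" ] && [ -z "$left_aws" ]
}
```

With the volume from this article, cleanup_is_complete vol-01a0fa04f3b2b37a6 && echo clean would print clean once both snapshots have been removed.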

Wrap-up

VolumeSnapshot backs up a PVC at the storage layer, in a model parallel to PV/PVC: VolumeSnapshot (user requests, namespaced) ↔ VolumeSnapshotContent (the real snapshot at the backend, cluster-scoped), with VolumeSnapshotClass (driver + parameters) playing the role of a StorageClass. All three are CRDs that must be installed separately, together with the snapshot-controller; the csi-snapshotter sidecar already ships inside EBS CSI. Snapshotting: user creates a VolumeSnapshot → snapshot-controller creates the Content → csi-snapshotter calls the driver → AWS creates an EBS snapshot → readyToUse=true. Restoring: a new PVC with a dataSource pointing at the snapshot → a new EBS volume from the snapshot. The big lesson: a snapshot is block-level and point-in-time, so you must sync/quiesce before snapshotting, or data still in the page cache won't make it into the snapshot (our first restore came out as a 0-byte file; after sync it was complete). deletionPolicy: Delete means deleting the VolumeSnapshot deletes the snapshot on AWS too.

End of Part IX — we went from pod-attached volumes to dynamic persistent storage and backups. Part X upgrades the network: Article 45 opens with Cilium and eBPF (theory) — why replace kube-proxy + the bridge (Articles 12–14) with an eBPF-based CNI, what eBPF is, and what Cilium does differently at the datapath layer, before Article 46 migrates the cluster to a real kube-proxy-less Cilium.