Disruptions and the PodDisruptionBudget

Article 22 touched on a pod being evicted when the node runs out of resources. That is just one of many ways a pod can vanish. This article closes Part III with a systematic look at disruption (a pod ceasing to run) and the one tool that lets us control part of it: the PodDisruptionBudget (PDB). The crux of the whole article is telling the two kinds of disruption apart, because a PDB guards only one of them.

Two kinds of disruption

The docs split disruption into two groups by whether someone deliberately causes it or not.

Involuntary: things that strike beyond your control. The docs' list, quoted verbatim:

"a hardware failure of the physical machine backing the node; cluster administrator deletes VM (instance) by mistake; cloud provider or hypervisor failure makes VM disappear; a kernel panic; the node disappears from the cluster due to cluster network partition; eviction of a pod due to the node being out-of-resources."

The OOM/eviction caused by a RAM shortage in Article 22 falls squarely in this group. Nobody chooses for these events to happen; the defense is to spread replicas across multiple nodes and zones, so that one node dying doesn't bring down the whole service.

Voluntary: disruptions initiated by the application owner or the cluster administrator:

"deleting the deployment or other controller that manages the pod; updating a deployment's pod template causing a restart; directly deleting a pod" and on the admin side: "Draining a node for repair or upgrade; Draining a node from a cluster to scale the cluster down; Removing a pod from a node to permit something else to fit on that node."

Draining a node for maintenance or upgrade (that is, kubectl drain, which we'll use for real in the cluster-upgrade article) is the classic voluntary disruption. Because it's deliberate, we can set rules for it, and that's the job of the PDB.

A PodDisruptionBudget only guards voluntary disruptions

The docs' concise definition:

"A PDB limits the number of Pods of a replicated application that are down simultaneously from voluntary disruptions."

Read the word "voluntary" closely: a PDB cannot prevent a node dying or an OOM. The docs say it plainly:

"Involuntary disruptions cannot be prevented by PDBs; however they do count against the budget."

That is, a dying node still takes its pods down as usual. The PDB cannot stop it; it merely counts the loss against the budget. What a PDB actually does is block voluntary pod-removal tools (like kubectl drain) when the removal would drop the number of healthy replicas below the allowed level. The mechanism: those tools don't delete pods directly but call the Eviction API, and the PDB is enforced right at that API.

A PDB spec has three main fields: selector (which pods it covers), plus one of minAvailable or maxUnavailable. The docs stress: "You can specify only one of maxUnavailable and minAvailable". minAvailable: 2 means "there must always be at least 2 healthy replicas"; with a 3-replica Deployment, that allows at most 1 pod to be down at a time.
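To make the "only one of" rule concrete, here is a minimal Python sketch. It is a toy model, not Kubernetes code: the function name desired_healthy and its parameters are hypothetical, and it handles only integer values.

```python
from typing import Optional


def desired_healthy(expected_pods: int,
                    min_available: Optional[int] = None,
                    max_unavailable: Optional[int] = None) -> int:
    """Toy model: how many pods must stay healthy under a PDB (integers only)."""
    # Exactly one of the two fields may be set: not both, not neither.
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("specify exactly one of min_available / max_unavailable")
    if min_available is not None:
        return min_available
    # max_unavailable: whatever remains once the allowed number is down.
    return max(0, expected_pods - max_unavailable)


# With 3 replicas, minAvailable: 2 and maxUnavailable: 1 express the same floor.
print(desired_healthy(3, min_available=2))    # 2
print(desired_healthy(3, max_unavailable=1))  # 2
```

In this model the two fields are two ways of writing the same floor; the difference shows up only when the replica count changes, because minAvailable stays fixed while maxUnavailable moves with the number of pods.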

Build and observe a PDB

Deploy 3 replicas alongside a PDB with minAvailable: 2:

apiVersion: apps/v1
kind: Deployment
metadata: {name: pdb-demo}
spec:
  replicas: 3
  selector: {matchLabels: {app: pdb-demo}}
  template:
    metadata: {labels: {app: pdb-demo}}
    spec:
      containers:
      - name: app
        image: busybox:1.36
        command: ["sleep","3600"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: {name: pdb-demo}
spec:
  minAvailable: 2
  selector: {matchLabels: {app: pdb-demo}}

kubectl get pods -l app=pdb-demo
kubectl get pdb pdb-demo
kubectl get pdb pdb-demo -o jsonpath='currentHealthy={.status.currentHealthy} desiredHealthy={.status.desiredHealthy} disruptionsAllowed={.status.disruptionsAllowed} expectedPods={.status.expectedPods}{"\n"}'

NAME                        READY   STATUS    RESTARTS   AGE
pdb-demo-84645454b8-km7bc   1/1     Running   0          9s
pdb-demo-84645454b8-t9ks9   1/1     Running   0          9s
pdb-demo-84645454b8-zpr9z   1/1     Running   0          9s

NAME       MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
pdb-demo   2               N/A               1                     9s

currentHealthy=3 desiredHealthy=2 disruptionsAllowed=1 expectedPods=3

Read the status like a word problem: expectedPods=3 (the Deployment wants 3), desiredHealthy=2 (the PDB demands at least 2), currentHealthy=3 (3 are currently healthy), so disruptionsAllowed = currentHealthy - desiredHealthy = 3 - 2 = 1, meaning we may take down 1 and still have ≥2. The ALLOWED DISRUPTIONS column in kubectl get pdb is exactly this number. It is not fixed: take one pod down and it drops to 0 until a healthy replica fills in.
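The arithmetic behind that status can be sketched in a few lines of Python. This is an illustrative model of the numbers shown above, not the controller's actual code; the function name disruptions_allowed is hypothetical.

```python
def disruptions_allowed(current_healthy: int, desired_healthy: int) -> int:
    """Toy model of the budget: healthy pods above the required floor, never negative."""
    return max(0, current_healthy - desired_healthy)


# Status from the article: currentHealthy=3, desiredHealthy=2.
print(disruptions_allowed(3, 2))  # 1
# One pod taken down, replacement not yet healthy.
print(disruptions_allowed(2, 2))  # 0
# An involuntary loss can push healthy pods below the floor; the budget stays at 0.
print(disruptions_allowed(1, 2))  # 0
```

The last case is what "they do count against the budget" means in practice: a node failure does not ask permission, but it uses up the room that a later drain would have needed.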

The Eviction API and the HTTP 429

Now verify that the PDB blocks for real. Instead of running kubectl drain against the whole node (which would touch the system pods too), we call the Eviction API directly for each pod; this is exactly the API drain uses underneath. An eviction request is an Eviction object POSTed to the subresource .../pods/<name>/eviction. Try to evict two pods in a row:

P1=pdb-demo-84645454b8-km7bc ; P2=pdb-demo-84645454b8-t9ks9
echo '{"apiVersion":"policy/v1","kind":"Eviction","metadata":{"name":"'$P1'","namespace":"default"}}' > /tmp/ev1.json
echo '{"apiVersion":"policy/v1","kind":"Eviction","metadata":{"name":"'$P2'","namespace":"default"}}' > /tmp/ev2.json

kubectl create --raw /api/v1/namespaces/default/pods/$P1/eviction -f /tmp/ev1.json
kubectl create --raw /api/v1/namespaces/default/pods/$P2/eviction -f /tmp/ev2.json

# evict #1:
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Success","code":201}

# evict #2 (right after):
Error from server (TooManyRequests): Cannot evict pod as it would violate the pod's disruption budget.

This is the whole lesson packed into two lines. Evict #1 succeeds (code 201) because disruptionsAllowed=1 at the time: take down 1 and 2 remain, still enough. Immediately after, the budget drops to 0, so evict #2 is refused with TooManyRequests, i.e. the HTTP 429 the docs describe: when disruptionsAllowed=0, the Eviction API returns 429 instead of allowing the removal. The message states the reason directly: "Cannot evict pod as it would violate the pod's disruption budget." If this were a node being drained, the drain command would stop and wait here instead of taking down the second pod blindly, exactly what we want for a service that must always keep ≥2 replicas.

Compare with deleting directly: kubectl delete pod does not go through the Eviction API so it is not blocked by the PDB (docs: "deleting deployments or pods bypasses Pod Disruption Budgets"). A PDB is not a delete lock; it's a rule for the well-behaved tools that call through the Eviction API.
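The difference between the two paths can be modeled in a short Python sketch. Everything here is a simplification under stated assumptions: ToyCluster is a hypothetical class, the pod names are the suffixes from the output above, and the return values are only HTTP-like status codes chosen to mirror the 201 and 429 seen earlier.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model: a set of healthy pods guarded by a minAvailable budget."""
    min_available: int
    healthy: set = field(default_factory=set)

    def evict(self, pod: str) -> int:
        """Eviction path: consults the budget before removing the pod."""
        if len(self.healthy) - 1 < self.min_available:
            return 429  # removal would violate the budget
        self.healthy.remove(pod)  # assumes the pod exists and is healthy
        return 201

    def delete(self, pod: str) -> int:
        """Direct delete path: never consults the budget."""
        self.healthy.discard(pod)
        return 200


cluster = ToyCluster(min_available=2, healthy={"km7bc", "t9ks9", "zpr9z"})
print(cluster.evict("km7bc"))   # 201
print(cluster.evict("t9ks9"))   # 429
print(cluster.delete("t9ks9"))  # 200
print(len(cluster.healthy))     # 1
```

The final count of 1 is below the budget of 2, which is the point: in this model, as in the article, only the eviction path checks the budget.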

Replacing replicas and budget recovery

After evict #1, the Deployment's controller, acting through its ReplicaSet (Article 24 will dig into it), immediately notices the missing pod and creates a replacement:

kubectl get pods -l app=pdb-demo
kubectl get pdb pdb-demo -o jsonpath='disruptionsAllowed={.status.disruptionsAllowed} currentHealthy={.status.currentHealthy}{"\n"}'

NAME                        READY   STATUS        RESTARTS   AGE
pdb-demo-84645454b8-km7bc   1/1     Terminating   0          36s
pdb-demo-84645454b8-qjnvk   1/1     Running       0          13s   <- the replacement
pdb-demo-84645454b8-t9ks9   1/1     Running       0          36s
pdb-demo-84645454b8-zpr9z   1/1     Running       0          36s

disruptionsAllowed=1 currentHealthy=3

Pod km7bc is Terminating, but qjnvk has come up to replace it and currentHealthy is back to 3 → disruptionsAllowed is 1 again. This is the cycle: take one down → budget goes to 0 → controller fills in → the replacement is healthy → budget goes back to 1 → only then can you take another down. A kubectl drain that respects the PDB follows exactly this rhythm, taking down one pod at a time and waiting for the replacement to be healthy between each, so the service never drops below 2 replicas throughout node maintenance. A note from the docs: the Deployment/StatefulSet itself during a rolling update is not limited by the PDB — the PDB only guards the Eviction API path.
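That rhythm can be simulated with a small Python sketch. It is a toy model, not how kubectl drain is implemented: the function drain, its parameters, and the replacement names are hypothetical, and it assumes that two of the three pods sit on the node being drained and that replacements become healthy on other nodes.

```python
def drain(pods_on_node, healthy, min_available):
    """Toy drain loop: evict one pod at a time, waiting for a replacement in between.

    Returns a timeline of (event, healthy_count) pairs.
    """
    healthy = set(healthy)
    queue = list(pods_on_node)
    pending = 0      # replacements created but not yet healthy
    created = 0      # counter used to name replacements
    timeline = []
    while queue:
        if len(healthy) - min_available >= 1:
            # Budget allows one disruption: evict the next pod on the node.
            pod = queue.pop(0)
            healthy.remove(pod)
            pending += 1
            timeline.append((f"evict {pod}", len(healthy)))
        elif pending > 0:
            # Budget is 0: wait until a replacement is healthy elsewhere.
            created += 1
            healthy.add(f"replacement-{created}")
            pending -= 1
            timeline.append(("wait, replacement healthy", len(healthy)))
        else:
            raise RuntimeError("budget is 0 and no replacement is pending")
    return timeline


for event, count in drain(["km7bc", "t9ks9"], {"km7bc", "t9ks9", "zpr9z"}, 2):
    print(f"{event}: {count} healthy")
# evict km7bc: 2 healthy
# wait, replacement healthy: 3 healthy
# evict t9ks9: 2 healthy
```

The healthy count in the timeline never falls below 2, which is the guarantee minAvailable: 2 is meant to give during node maintenance.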

🧹 Cleanup

kubectl delete deployment pdb-demo
kubectl delete pdb pdb-demo

Deleting the Deployment takes all its pods with it; the PDB is the last object to remove. The cluster returns to just the two CoreDNS pods. The manifests are at github.com/nghiadaulau/kubernetes-from-scratch, directory 23-pdb.

Wrap-up

Pods vanish in two ways: involuntary (the node dies, kernel panic, OOM; unpreventable, only mitigated by spreading replicas) and voluntary (deleting the controller, or an admin draining a node for maintenance or upgrade). A PodDisruptionBudget only guards the voluntary kind, and guards it at the Eviction API. With minAvailable: 2 on a 3-replica Deployment, disruptionsAllowed is 1: evicting the first pod succeeds (201), and evicting the second right after is blocked with HTTP 429 "would violate the pod's disruption budget". The controller then replaces the replica, the budget recovers to 1, and only then can the next pod be taken down. A PDB does not block a direct kubectl delete and does not limit a rolling update; it's a rule that tells node-draining tools to stop at the right moment, so the service keeps enough healthy replicas during maintenance.

That's the end of Part III. The next three articles form Part IV on controllers: starting in Article 24 with the Deployment, digging into the rollout/rollback mechanism and the role of the ReplicaSet in the middle (we glimpsed it creating the replacement pod in this article), then on to StatefulSet, DaemonSet, Job/CronJob.