In-place Pod Resize

Kai · 4 min read

Article 22 set resources.requests/limits at pod creation, and so far every way to change them (editing a Deployment, VPA in Auto mode from Article 40) recreates the pod. That has been the implicit assumption throughout the series: a container's resources are immutable for the pod's lifetime. In-place pod resize, a feature that has matured by v1.36, breaks that assumption: you can adjust a running pod's CPU and memory without recreating it. Article 40 mentioned that the feature exists; this article exercises it for real and follows the change down to the cgroup.

resizePolicy and the resize subresource

The pod declares up front, per resource, whether resizing it needs a container restart — via resizePolicy:

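apiVersion: v1
kind: Pod
metadata:
  name: rz                  # pod name used by the kubectl commands in this article
  namespace: resize-demo
spec:
  containers:
  - name: c
    image: busybox:1.36
    command: ["sleep", "100000"]
    resizePolicy:           # per resource: does a resize need a container restart?
    - {resourceName: cpu,    restartPolicy: NotRequired}
    - {resourceName: memory, restartPolicy: NotRequired}
    resources: {requests: {cpu: 100m, memory: 64Mi}, limits: {cpu: 200m, memory: 128Mi}}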

NotRequired means changing that resource needs no restart: the kubelet applies it straight to the cgroup. The change goes through a dedicated subresource named resize rather than a regular kubectl edit, which keeps the resize operation separate from other spec edits. Before resizing, look at the pod's cgroup on the node (the same mechanism as in Article 64). Here <uid> stands for the pod's UID as it appears in the cgroup directory name; with the systemd cgroup driver, the dashes in the UID become underscores:

ssh worker-0 'D=$(find /sys/fs/cgroup/kubepods.slice -type d -name "*pod<uid>*")
  echo "cpu.max=$(cat $D/cpu.max)"; echo "memory.max=$(cat $D/memory.max)"'
cpu.max=20000 100000        # 200m
memory.max=134217728        # 128Mi
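
The two numbers follow from simple arithmetic on the limits. As a minimal sketch, assuming the 100000 µs CFS period shown in the output above and limits given as whole millicores and mebibytes, the conversion looks like this. The function names cpu_max_for and memory_max_for are hypothetical helpers for illustration, not part of kubectl or the kubelet.

```shell
# Hypothetical helpers: translate pod limits into the values that
# show up in cgroup v2. Assumes the 100000 µs CFS period seen above.
CFS_PERIOD_US=100000

# cpu_max_for MILLICORES -> "<quota> <period>", the format of cpu.max
cpu_max_for() {
  echo "$(( $1 * CFS_PERIOD_US / 1000 )) $CFS_PERIOD_US"
}

# memory_max_for MEBIBYTES -> bytes, the format of memory.max
memory_max_for() {
  echo "$(( $1 * 1024 * 1024 ))"
}

cpu_max_for 200       # prints: 20000 100000
memory_max_for 128    # prints: 134217728
```

The same functions give 50000 100000 for 500m and 268435456 for 256Mi, which are the values to expect after the resize.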

Resize: the kernel changes in place

Raise the limits (CPU 200m→500m, memory 128Mi→256Mi) via the resize subresource, and raise the requests along with them (CPU 100m→250m, memory 64Mi→128Mi):

kubectl -n resize-demo patch pod rz --subresource resize --patch \
  '{"spec":{"containers":[{"name":"c","resources":{"requests":{"cpu":"250m","memory":"128Mi"},"limits":{"cpu":"500m","memory":"256Mi"}}}]}}'
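
Hand-writing that JSON gets error-prone once you try several sizes. A small sketch of a patch builder follows; resize_patch is a hypothetical helper name, and it assumes a single container whose CPU and memory requests and limits are all being set.

```shell
# Hypothetical helper: print the JSON patch body for the resize subresource.
# Usage: resize_patch CONTAINER REQ_CPU REQ_MEM LIM_CPU LIM_MEM
resize_patch() {
  printf '{"spec":{"containers":[{"name":"%s","resources":{"requests":{"cpu":"%s","memory":"%s"},"limits":{"cpu":"%s","memory":"%s"}}}]}}\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Print the same patch body used in the command above.
resize_patch c 250m 128Mi 500m 256Mi

# To send it (needs a cluster and a kubectl that supports --subresource resize):
# kubectl -n resize-demo patch pod rz --subresource resize \
#   --patch "$(resize_patch c 250m 128Mi 500m 256Mi)"
```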

Read the cgroup on the node again — the values have changed, and the container hasn't restarted at all:

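ssh worker-0 'D=$(find /sys/fs/cgroup/kubepods.slice -type d -name "*pod<uid>*")
  echo "cpu.max=$(cat $D/cpu.max)"; echo "memory.max=$(cat $D/memory.max)"'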
kubectl -n resize-demo get pod rz -o jsonpath='restarts={.status.containerStatuses[0].restartCount} allocated={.status.containerStatuses[0].resources.limits}'
cpu.max=50000 100000        # 500m — changed in place
memory.max=268435456        # 256Mi — changed in place
restarts=0 allocated={"cpu":"500m","memory":"256Mi"}

The cgroup v2 cpu.max and memory.max jump straight to the new values, restartCount stays 0, and status.containerStatuses[0].resources reports the resources actually applied to the running container. The process inside is uninterrupted: the kernel just widens the ceiling for the live cgroup. This is the piece Article 40 was missing: VPA works out how much a pod should get, while in-place resize applies that amount without evicting the pod. (Everything in this section is manual, API-driven resize. VPA is released separately from Kubernetes, so whether it can drive in-place resize itself depends on the VPA version installed; recent releases add an InPlaceOrRecreate update mode for this.)

Two constraints to know

Resize isn't unconstrained. First, you can't change the QoS class (Article 22): a Burstable pod can't be resized into Guaranteed. Try raising requests up to the limit (turning Burstable into Guaranteed):

kubectl -n resize-demo patch pod rz --subresource resize --patch \
  '{"spec":{"containers":[{"name":"c","resources":{"requests":{"cpu":"500m","memory":"256Mi"},"limits":{"cpu":"500m","memory":"256Mi"}}}]}}'
The Pod "rz" is invalid: spec: Invalid value: "Burstable":
  Pod QOS Class may not change as a result of resizing

The API server refuses — QoS is derived from the request/limit relationship (Article 22) and is a stable property of the pod, so resize is not allowed to flip it.
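
To predict whether a resize will trip this check, you can derive the QoS class before and after the change yourself. The sketch below is a simplified, single-container version of the rule from Article 22. qos_class is a hypothetical helper, and it compares quantities as plain strings, so it assumes requests and limits are written in the same notation (it would not treat "1" and "1000m" as equal).

```shell
# Hypothetical helper: QoS class of a single-container pod.
# Usage: qos_class REQ_CPU REQ_MEM LIM_CPU LIM_MEM   (pass "" for an unset field)
qos_class() {
  req_cpu=$1; req_mem=$2; lim_cpu=$3; lim_mem=$4
  # Nothing set at all: BestEffort.
  if [ -z "$req_cpu$req_mem$lim_cpu$lim_mem" ]; then
    echo BestEffort
    return
  fi
  # An unset request defaults to the limit.
  if [ -z "$req_cpu" ]; then req_cpu=$lim_cpu; fi
  if [ -z "$req_mem" ]; then req_mem=$lim_mem; fi
  # Guaranteed needs both limits set and each request equal to its limit.
  if [ -n "$lim_cpu" ] && [ -n "$lim_mem" ] &&
     [ "$req_cpu" = "$lim_cpu" ] && [ "$req_mem" = "$lim_mem" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}

before=$(qos_class 250m 128Mi 500m 256Mi)   # the pod after the first resize
after=$(qos_class 500m 256Mi 500m 256Mi)    # the patch attempted above
if [ "$before" != "$after" ]; then
  echo "resize would be rejected: QoS class would change $before -> $after"
fi
```

Running it prints that the class would change from Burstable to Guaranteed, matching the rejection shown above.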

Second, memory usually needs restartPolicy: RestartContainer. Widening memory.max in the kernel is easy, but many applications read the memory limit once at startup (JVM heap sizing, various runtime caches) and never notice that the ceiling moved. For those, set restartPolicy: RestartContainer for memory so the kubelet restarts the container and the application starts under the new value. CPU is almost always safe as NotRequired because throttling is applied dynamically.

One more behavior to expect: if the node doesn't have room for the new amount, the resize isn't rejected immediately but stays pending. The pod carries the PodResizePending condition until resources are available (its reason is Deferred while the kubelet waits for room, or Infeasible when the request can never fit on that node), unlike a new pod, which would sit in Pending at scheduling.
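
As a sketch of the memory recommendation, the snippet below writes a variant of the demo pod whose memory resizePolicy is RestartContainer. The pod name rz-restart and the output path are assumptions made for illustration; everything else mirrors the manifest from earlier. The commented commands at the end show one way to apply it and to read the pending reason, and they need a real cluster.

```shell
# Write a variant of the demo pod: CPU resizes in place, memory restarts the container.
# "rz-restart" and the file location are hypothetical choices for this sketch.
manifest="${TMPDIR:-/tmp}/rz-restart-memory.yaml"
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: rz-restart
  namespace: resize-demo
spec:
  containers:
  - name: c
    image: busybox:1.36
    command: ["sleep", "100000"]
    resizePolicy:
    - {resourceName: cpu,    restartPolicy: NotRequired}
    - {resourceName: memory, restartPolicy: RestartContainer}
    resources: {requests: {cpu: 100m, memory: 64Mi}, limits: {cpu: 200m, memory: 128Mi}}
EOF
echo "wrote $manifest"

# Apply it when ready:
# kubectl apply -f "$manifest"
# After a resize, see whether it is pending and why (empty output means not pending):
# kubectl -n resize-demo get pod rz-restart \
#   -o jsonpath='{.status.conditions[?(@.type=="PodResizePending")].reason}'
```

With this policy, a memory resize should show up as a restartCount increase, while a CPU-only resize should leave it unchanged.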

🧹 Cleanup

kubectl delete namespace resize-demo

The resize touched only a test pod; deleting the namespace leaves it clean, with no change to node configuration. Manifests at github.com/nghiadaulau/kubernetes-from-scratch, directory 69-in-place-resize.

Wrap-up

In-place pod resize drops the "a container's resources are immutable for the pod's lifetime" assumption: declare resizePolicy per resource (NotRequired = apply without restart), then change it via the resize subresource. We raised CPU 200m→500m and memory 128Mi→256Mi on a running pod, and saw cgroup v2 on the node (cpu.max, memory.max — Article 64) change to exactly the new values with restartCount still 0. Two constraints: the QoS class can't change on resize (the API refuses because QoS is a stable property derived from request/limit — Article 22), and memory usually needs restartPolicy: RestartContainer because many apps read the limit at startup; if the node lacks room the resize hangs at PodResizePending. This is the "no disruption" piece for the vertical scaling of Article 40 — VPA computes the amount, in-place resize applies it without evicting the pod.

Article 70 moves to the group of storage features that just graduated in v1.36: consistent snapshots of multiple PVCs at once, mounting OCI image content as a volume, and changing volume parameters dynamically — extending what Part IX built.