LimitRange and ResourceQuota

Articles 22 and 32 handled resources at the pod and node layers. There's one more layer, the organizational one: when a cluster serves many teams, how do you stop team A from asking for 64Gi RAM for one container, or creating 10,000 pods that swallow the whole cluster? The answer is two policy objects attached to a namespace (Article 28): LimitRange sets rules for each pod/container, ResourceQuota sets rules for the namespace total. This is the final brick of Part VI, and the foundation for splitting a cluster across multiple teams.

Both are enforced by admission controllers, LimitRanger and ResourceQuota. These are part of the API server's default set, so although Article 7 only added NodeRestriction explicitly, the two are still enabled (the tests below confirm it).

LimitRange: rules for each pod in a namespace

The docs: "A LimitRange is a policy to constrain the resource allocations (limits and requests) that you can specify for each applicable object kind (such as Pod or PersistentVolumeClaim) in a namespace." It does four things: enforce min/max compute per pod/container, enforce min/max storage per PVC, enforce a request:limit ratio, and set defaults for request/limit then auto-inject them into containers that don't declare any. Set up in a dedicated team-a namespace:

apiVersion: v1
kind: LimitRange
metadata: {name: cpu-mem-limits, namespace: team-a}
spec:
  limits:
  - type: Container
    default:        {cpu: "200m", memory: "128Mi"}   # default limit
    defaultRequest: {cpu: "100m", memory: "64Mi"}     # default request
    max:            {cpu: "500m", memory: "256Mi"}    # ceiling per container
    min:            {cpu: "50m",  memory: "32Mi"}     # floor per container
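The manifest above covers two of the four jobs (defaults and min/max for compute). As a rough sketch of the other two, a second LimitRange could enforce a limit-to-request ratio and bound PVC storage. This object is not part of the walk-through: its name and every value are illustrative assumptions.

```yaml
# Hypothetical companion LimitRange; name and values are assumptions.
apiVersion: v1
kind: LimitRange
metadata:
  name: ratio-and-storage-limits
  namespace: team-a
spec:
  limits:
  - type: Container
    maxLimitRequestRatio:
      cpu: "4"          # a container's CPU limit may be at most 4x its CPU request
  - type: PersistentVolumeClaim
    min:
      storage: 1Gi      # smallest storage request a PVC may make
    max:
      storage: 10Gi     # largest storage request a PVC may make
```

It declares no defaults on purpose, so it would not compete with the default/defaultRequest values of cpu-mem-limits.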

Inject defaults for pods that don't declare any

The docs describe the admission: "the LimitRange admission controller applies default request and limit values for all Pods (and their containers) that do not set compute resource requirements." Create a bare pod (no resources) in that namespace:

kubectl apply -f pod-nolimits.yaml      # pod 'nolimits', declares no resources
kubectl get pod nolimits -n team-a -o jsonpath='requests={.spec.containers[0].resources.requests} limits={.spec.containers[0].resources.limits}{"\n"}'

requests={"cpu":"100m","memory":"64Mi"} limits={"cpu":"200m","memory":"128Mi"}

The bare pod was injected with exactly the LimitRange's defaultRequest/default values. This is how you force every pod in the namespace to have a request/limit even if the developer forgot to declare one. It matters because a pod with no requests or limits at all is BestEffort (Article 22), the first class to be evicted under node pressure. LimitRange turns "forgot to declare" into "has a sensible default".
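For reference, a bare pod of this kind might look like the sketch below. The actual pod-nolimits.yaml lives in the repository and may differ; the container name and image here are assumptions.

```yaml
# Sketch of a pod with no resources stanza (container name and image are assumptions).
apiVersion: v1
kind: Pod
metadata:
  name: nolimits
  namespace: team-a
spec:
  containers:
  - name: app
    image: nginx
    # No resources block here: at admission, LimitRanger fills in
    # defaultRequest (100m / 64Mi) and default (200m / 128Mi).
```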

Block pods over max

The second job is enforcing min/max. Create a pod that asks for cpu: 800m when the LimitRange max is 500m:

kubectl apply -f pod-toobig.yaml        # requests/limits cpu = 800m

Error from server (Forbidden): error when creating "STDIN": pods "toobig" is forbidden:
maximum cpu usage per Container is 500m, but limit is 800m

403 Forbidden, exactly as the docs say: "your request to the API server will fail with an HTTP status code 403 Forbidden and a message explaining the constraint." The pod is rejected at creation time (admission) and never runs. Note: "LimitRange validations occur only at Pod admission stage, not on running Pods". Editing a LimitRange therefore doesn't affect pods that are already running, only pods created afterwards.
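A manifest that would trigger this rejection could look roughly like the following. It is a sketch rather than the repository's pod-toobig.yaml: the container name, image, and memory values are assumptions, and only the 800m CPU figure comes from the text.

```yaml
# Sketch of an over-max pod (container name, image, memory values are assumptions).
apiVersion: v1
kind: Pod
metadata:
  name: toobig
  namespace: team-a
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests: {cpu: "800m", memory: "64Mi"}
      limits:   {cpu: "800m", memory: "128Mi"}   # cpu limit 800m > LimitRange max 500m
```

Memory stays inside the 32Mi to 256Mi window, so CPU is the only constraint violated.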

ResourceQuota: rules for the namespace total

LimitRange polices each pod, but 1000 individually valid pods can still swallow the cluster. ResourceQuota polices the total. The docs: "A ResourceQuota object provides constraints that limit aggregate resource consumption per namespace. A ResourceQuota can also limit the quantity of objects that can be created in a namespace by API kind."

apiVersion: v1
kind: ResourceQuota
metadata: {name: team-a-quota, namespace: team-a}
spec:
  hard:
    requests.cpu: "400m"       # total CPU request across all pods
    requests.memory: 512Mi
    limits.cpu: "1"            # total CPU limit
    limits.memory: 1Gi
    pods: "3"                  # at most 3 pods
    count/configmaps: "2"      # at most 2 ConfigMaps

kubectl describe quota prints a Used vs Hard table: for each tracked resource or object count, how much is currently consumed versus the ceiling:

kubectl describe quota team-a-quota -n team-a

Resource          Used   Hard
--------          ----   ----
count/configmaps  1      2
limits.cpu        200m   1
limits.memory     128Mi  1Gi
pods              1      3
requests.cpu      100m   400m
requests.memory   64Mi   512Mi

The nolimits pod already counts against the quota: requests.cpu 100m, pods 1. Notice how this dovetails with LimitRange. Because the quota tracks requests.cpu/memory and limits.cpu/memory, every new pod has to declare those values; the docs warn that "users must specify requests or limits ... otherwise the quota system may reject pod creation". The bare pod is still admitted because LimitRange had already injected the defaults. The two policies complement each other: LimitRange ensures every pod carries a number for the quota to count. (count/configmaps shows 1 because every namespace ships with a built-in kube-root-ca.crt ConfigMap.)

Over quota → 403

Create pods until hitting the pods: 3 ceiling:

# create p2, p3 (reaching pods = 3/3), then try p4:
kubectl apply -f pod-p4.yaml

pod/p2 created
pod/p3 created
Error from server (Forbidden): error when creating "STDIN": pods "p4" is forbidden:
exceeded quota: team-a-quota, requested: pods=1, used: pods=3, limited: pods=3

The fourth pod is blocked: "exceeded quota ... requested: pods=1, used: pods=3, limited: pods=3". The message states clearly how much was requested, how much is already used, and what the ceiling is. The docs: "If creating or updating a resource violates a quota constraint, the control plane rejects that request with HTTP status code 403 Forbidden." The quota blocks both by object count (pods, count/configmaps, count/deployments.apps...) and by total resources (requests.cpu...): the fourth pod would also be blocked if it pushed requests.cpu over 400m, even if the pod count weren't full.
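To make the resource-total case concrete, here is a hypothetical pod (not in the repository; its name, container, and image are assumptions) sized to pass the LimitRange but overshoot the CPU request ceiling. The arithmetic assumes only the nolimits pod is running, as in the describe output earlier.

```yaml
# Hypothetical pod that fits the LimitRange but overshoots the quota's requests.cpu.
apiVersion: v1
kind: Pod
metadata:
  name: cpuhog
  namespace: team-a
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests: {cpu: "350m", memory: "64Mi"}    # used 100m + 350m = 450m > hard 400m
      limits:   {cpu: "400m", memory: "128Mi"}   # 400m <= LimitRange max 500m, so LimitRange passes
```

With pods at 1/3, the pod count is not the obstacle here; the quota should refuse the pod purely on requests.cpu.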

There's one more axis worth mentioning: PIDs. Process IDs are also a finite node resource, and a fork-bomb pod could exhaust a whole machine's PIDs. The kubelet can cap the number of PIDs each pod may use (--pod-max-pids), a setting applied node by node. ResourceQuota does not track PIDs: its count/* entries cap API objects, not processes. Full PID defense therefore belongs to node configuration, but the spirit is identical: set a ceiling so one workload doesn't swallow shared resources.
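If the kubelet is driven by a configuration file rather than flags, the equivalent of --pod-max-pids is the podPidsLimit field. A minimal sketch, assuming a file-based kubelet setup; the value 1024 is purely illustrative, not a recommendation:

```yaml
# Fragment of a kubelet configuration file; 1024 is an illustrative value.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
podPidsLimit: 1024   # maximum number of PIDs any single pod on this node may use
```

Because this is per-node configuration, it has to be set on every node.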

🧹 Cleanup

kubectl delete namespace team-a

Deleting the namespace cleans up everything inside it (pods, LimitRange, ResourceQuota, ConfigMaps) in a single command. This is true to the spirit of a namespace as a grouping scope from Article 28, with the kubernetes finalizer from Article 29 handling the sequential cleanup. The cluster returns to two CoreDNS pods. Manifests are at github.com/nghiadaulau/kubernetes-from-scratch, directory 33-limitrange-quota.

Wrap-up

Two namespace-attached policies for splitting a cluster across multiple teams. LimitRange polices each pod/container: it injects defaultRequest/default into a bare pod (we saw a pod with no declaration get 100m/64Mi request, 200m/128Mi limit), and it rejects anything outside min/max with 403 Forbidden at admission time (a pod asking 800m > max 500m is rejected). ResourceQuota polices the namespace total: a ceiling on total requests.*/limits.* and on object count (pods, count/*), shown as Used/Hard via describe quota, and enforced with a 403 "exceeded quota" when a creation would exceed it (the fourth pod blocked at pods: 3). The two complement each other: LimitRange ensures every pod has a request for ResourceQuota to count, and together with a node-level PID limit they form a fence so no workload swallows shared resources.

That ends Part VI. Part VII moves into scheduling, the question of "which pod runs on which node". Article 34 opens with the scheduler and the scheduling framework: how kube-scheduler (which we stood up in Article 8) actually picks a node through a chain of filter then score steps, using plugins like NodeResourcesFit (the very one that consumes the Allocatable of Article 32). Later articles dig into affinity, taints, and topology spread.