The Lifecycle of a Request: From kubectl apply to a Running Pod

Kai · 7 min read

Article 16 proved the cluster runs: the app comes up, balances load, self-heals. But "running" and "understanding why it runs" are two different things. This article takes a single kubectl apply command and traces it, in chronological order, through each component we built (api-server, etcd, controller-manager, scheduler, kubelet, containerd, CNI). This is the hinge article: it stitches the "build" part of the series into a moving picture, and the model that emerges here will follow us through all the concept parts ahead.

One thing needs saying first, because it shapes everything: in Kubernetes there is no conductor coordinating things. When you apply, no one calls a chain of functions "create pod → pick node → run container". Instead, your command only records the desired state in one place, and then many independent components, each with its own loop, observe that place and one by one pull reality toward that desired state. Get this point and the rest of the article follows as a consequence.

Two halves of the journey

The journey splits cleanly into two halves. The first half (write path) is synchronous and fast: kubectl sends the object, the api-server vets it, writes it to etcd, and returns the result. After this half, the object exists but nothing is running yet. The second half (reconcile path) is asynchronous: the controllers, scheduler, and kubelet react to the new object one by one and turn it into a running pod.

   ── WRITE PATH (synchronous, milliseconds) ───────────────────
   kubectl ─TLS─► api-server ─[authn▸authz▸admission▸validate]─► etcd
                     │                                            
                     └── returns 201 Created to kubectl           
   ── RECONCILE PATH (asynchronous, each side its own loop) ─────
   etcd ◄─watch─ Deployment ctrl ─► create ReplicaSet ─► create Pod (Pending)
        ◄─watch─ scheduler ─► assign nodeName (Binding)
        ◄─watch─ kubelet@node ─► CRI ▸ containerd ▸ CNI ─► Pod Running

Half one: kubectl to the api-server

Turn on -v=8 to see what kubectl actually does; it's an HTTP client calling REST:

kubectl get --raw /api/v1/namespaces/default/pods/trace-845c78d578-vf8zg -v=8
round_trippers.go:527] "Request" verb="GET" url="https://203.0.113.10:6443/api/v1/namespaces/default/pods/trace-845c78d578-vf8zg"
round_trippers.go:632] "Response" status="200 OK"

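These two lines capture the entire first half. kubectl opens a TLS connection to https://203.0.113.10:6443, which is the Elastic IP of the HAProxy load balancer (Article 9), not a specific api-server. It presents the admin client certificate to authenticate, and the load balancer passes the byte stream through untouched to one of the three api-servers (Article 9: TCP passthrough mode, end-to-end mTLS). The example above is a read, hence GET and 200 OK. For a first-time apply that creates the object (with the default client-side apply), the write is a POST and the reply is 201 Created; re-applying an object that already exists sends a PATCH and gets 200 OK.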
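To make the "HTTP client calling REST" point concrete, here is a minimal sketch of how a resource maps to a URL path on the api-server. The helper names rest_path and create_request are made up for illustration and are not part of kubectl or any client library; the path layout (core group under /api, named groups under /apis) is the standard Kubernetes convention.

```python
from typing import Optional, Tuple


def rest_path(group: str, version: str, resource: str,
              namespace: Optional[str] = None,
              name: Optional[str] = None) -> str:
    """Build the api-server URL path for a resource (illustrative helper)."""
    # The core group (empty string) lives under /api; named groups under /apis/<group>.
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    parts = [prefix]
    if namespace is not None:
        parts += ["namespaces", namespace]
    parts.append(resource)
    if name is not None:
        parts.append(name)
    return "/".join(parts)


def create_request(group: str, version: str, resource: str,
                   namespace: str) -> Tuple[str, str]:
    """A create is a POST to the collection URL (no object name in the path)."""
    return ("POST", rest_path(group, version, resource, namespace))


# The GET from the -v=8 trace above, and the POST a first-time apply would send.
print(rest_path("", "v1", "pods", "default", "trace-845c78d578-vf8zg"))
print(create_request("apps", "v1", "deployments", "default"))
```

The first print reproduces the path seen in the trace; the second shows that a Deployment, which lives in the apps group, is created under /apis/apps/v1 rather than /api/v1.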

Inside the api-server: the vetting gauntlet

Before anything touches etcd, the request must pass a chain of gates inside the api-server. This order matters:

  1. Authentication: who are you? The api-server reads the client certificate and extracts the identity (admin, O=system:masters). Configured in Article 7.
  2. Authorization: are you allowed to do this? The Node,RBAC mode (Article 7) checks whether that identity has permission to create deployments in this namespace.
  3. Admission control: should the request be modified or rejected? Admission plugins (Article 7 enabled NodeRestriction) can change the object (mutating) or block it (validating). This is the hook for most cluster policy, which we'll dig into in the Security and Extending parts.
  4. Validation: is the object valid against the schema?
  5. Write to etcd: over the api-server's TLS client connection to etcd (Article 6), with Secrets encrypted at rest before they are written (Article 7, verified with a hexdump).

Only after passing all five gates does the api-server return 201 to kubectl. At this point the Deployment object is in etcd, but no pod exists yet, no node has been chosen, no container is running. The first half ends here, and kubectl has already returned you to the prompt.
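The gate chain above can be sketched as a toy pipeline in which any gate may abort the request and only a fully vetted object reaches the store. Everything here is a simplification under stated assumptions: the Request type, the hard-coded ALLOWED policy, and the dict standing in for etcd are all hypothetical, and the real api-server is far more involved. The status codes (401, 403, 422, 201) are the ones the api-server uses for these outcomes.

```python
from dataclasses import dataclass
from typing import Dict, List


class Rejected(Exception):
    """Raised by a gate to abort the request with an HTTP status."""
    def __init__(self, status: int, reason: str):
        super().__init__(reason)
        self.status = status


@dataclass
class Request:
    user: str
    groups: List[str]
    verb: str
    resource: str
    obj: dict


# Hypothetical policy standing in for client-cert parsing plus RBAC rules.
ALLOWED = {("system:masters", "create", "deployments")}


def authenticate(req: Request) -> None:
    if not req.user:
        raise Rejected(401, "no client identity")


def authorize(req: Request) -> None:
    if not any((g, req.verb, req.resource) in ALLOWED for g in req.groups):
        raise Rejected(403, "forbidden")


def mutate(req: Request) -> None:
    # Mutating admission may change the object, e.g. fill in a default.
    req.obj.setdefault("spec", {}).setdefault("replicas", 1)


def validate(req: Request) -> None:
    if not req.obj.get("metadata", {}).get("name"):
        raise Rejected(422, "metadata.name is required")


def handle_create(req: Request, store: Dict[str, dict]) -> int:
    for gate in (authenticate, authorize, mutate, validate):
        gate(req)                                   # any gate can abort here
    store[req.obj["metadata"]["name"]] = req.obj    # stand-in for the etcd write
    return 201
```

Calling handle_create with an admin request in group system:masters stores the object and returns 201; the same call from an identity outside ALLOWED raises Rejected with status 403 and leaves the store untouched, which is the property the ordering is there to guarantee.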

Half two: the loops wake up

This is where the "no conductor" model becomes clear. The api-server doesn't call anyone. Instead, each component is watching the api-server, and the appearance of a new Deployment makes them react one by one.
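A rough way to picture "watching" is a store that appends change events to per-watcher queues, with each component draining only its own queue. The Store class below is a toy stand-in for the api-server plus etcd, not the real watch API; the event names ADDED and MODIFIED do match real watch event types, and the counter mimics the idea of a resourceVersion.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class Store:
    """Toy api-server + etcd: versioned objects plus per-watcher event queues."""

    def __init__(self) -> None:
        self.objects: Dict[Tuple[str, str], dict] = {}
        self.resource_version = 0
        self.watchers: List[Tuple[str, Deque]] = []

    def watch(self, kind: str) -> Deque:
        """Register interest in one kind; the caller polls the returned queue."""
        queue: Deque = deque()
        self.watchers.append((kind, queue))
        return queue

    def put(self, kind: str, name: str, obj: dict) -> None:
        """Record the object, then notify only watchers of that kind."""
        self.resource_version += 1
        event = "MODIFIED" if (kind, name) in self.objects else "ADDED"
        self.objects[(kind, name)] = obj
        for watched_kind, queue in self.watchers:
            if watched_kind == kind:
                queue.append((event, name, self.resource_version))


store = Store()
deployment_events = store.watch("Deployment")   # what the Deployment controller sees
pod_events = store.watch("Pod")                 # what scheduler and kubelet see
store.put("Deployment", "trace", {"spec": {"replicas": 1}})
print(list(deployment_events), list(pod_events))
```

Note that put never calls into a component: it only records the change. The Pod queue stays empty until some controller decides, on its own loop, to create a Pod.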

The Deployment controller (in the controller-manager, Article 8) sees a Deployment with no corresponding ReplicaSet, and creates one. The ReplicaSet controller sees a ReplicaSet that wants 1 replica but has no pods, and creates a Pod. This creation chain leaves a trail in the ownerReferences field:

# which ReplicaSet the Pod belongs to
kubectl get pod trace-845c78d578-vf8zg -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}'
ReplicaSet/trace-845c78d578
# which Deployment the ReplicaSet belongs to
kubectl get rs trace-845c78d578 -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}'
Deployment/trace

   Deployment/trace
        ▲ owns
   ReplicaSet/trace-845c78d578
        ▲ owns
   Pod/trace-845c78d578-vf8zg

This ownership chain isn't just for show: it's how Kubernetes knows what to delete when you delete the Deployment (garbage collection by ownerReferences, a concept we'll dissect separately in the Objects part). The freshly created Pod has status.phase: Pending and an empty spec.nodeName: it exists, but belongs to no node yet.
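As a rough illustration of garbage collection by ownerReferences, the sketch below deletes a root object and then every object whose owners have all disappeared. The data layout (a dict keyed by UID with name and owners fields) and the UIDs are invented for the example; the real garbage collector is asynchronous and also handles foreground deletion and orphaning, which this ignores.

```python
from typing import Dict, List


def cascade_delete(objects: Dict[str, dict], root_uid: str) -> List[str]:
    """Delete the root, then every dependent left with no surviving owner."""
    doomed = [root_uid]
    deleted: List[str] = []
    while doomed:
        uid = doomed.pop()
        if uid not in objects:
            continue                      # already removed on an earlier pass
        deleted.append(objects.pop(uid)["name"])
        # A dependent goes only when it had owners and none of them still exist.
        doomed += [
            dep_uid for dep_uid, dep in objects.items()
            if dep["owners"] and not any(o in objects for o in dep["owners"])
        ]
    return deleted


# Hypothetical UIDs; the names mirror the ownership chain shown above.
objects = {
    "uid-d":  {"name": "Deployment/trace", "owners": []},
    "uid-rs": {"name": "ReplicaSet/trace-845c78d578", "owners": ["uid-d"]},
    "uid-p":  {"name": "Pod/trace-845c78d578-vf8zg", "owners": ["uid-rs"]},
    "uid-x":  {"name": "Pod/unrelated", "owners": []},
}
print(cascade_delete(objects, "uid-d"))
print(sorted(objects))
```

Deleting the Deployment takes the ReplicaSet and the Pod with it, while the unrelated Pod, which has no owner, is left alone.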

The scheduler picks a home for the pod

The scheduler (Article 8) watches for pods with an empty nodeName. Seeing the unassigned trace pod, it first filters out the nodes that cannot run it (by resources and constraints), then scores the remaining candidates (a topic for its own whole part later), picks one, and writes that choice back to the api-server via a pod subresource called Binding. The result is that spec.nodeName gets filled in:

kubectl get pod trace-845c78d578-vf8zg -o jsonpath='nodeName={.spec.nodeName}{"\n"}podIP={.status.podIP}'
nodeName=worker-1
podIP=10.200.1.8

nodeName=worker-1 is the scheduler's signature. Note that the scheduler doesn't talk to worker-1; it only updates the object through the api-server, which persists it to etcd. worker-1 finds out it just received a pod for the same reason as before: it is watching.
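The filter, score, bind sequence can be sketched in a few lines. This is a toy under loud assumptions: the only resource considered is CPU in millicores, the scoring rule ("most free CPU wins") is a stand-in chosen for the example rather than the real scheduler's plugin set, and the cpu_request field and node capacities are invented.

```python
from typing import Dict, Optional


def schedule(pod: dict, free_cpu: Dict[str, int]) -> Optional[str]:
    """Filter infeasible nodes, score the rest, bind by setting spec.nodeName."""
    need = pod["spec"]["cpu_request"]
    # Filter: drop nodes that cannot fit the pod at all.
    feasible = {node: free for node, free in free_cpu.items() if free >= need}
    if not feasible:
        return None                      # nodeName stays empty, pod stays Pending
    # Score: hypothetical rule, prefer the node with the most free CPU.
    # Sorting first makes ties resolve alphabetically, so the result is stable.
    best = max(sorted(feasible), key=lambda node: feasible[node])
    # Bind: the scheduler's only side effect is an update to the pod object.
    pod["spec"]["nodeName"] = best
    return best


pod = {"spec": {"cpu_request": 500}}
nodes = {"worker-0": 200, "worker-1": 1500, "worker-2": 800}   # made-up capacities
print(schedule(pod, nodes), pod["spec"].get("nodeName"))
```

Nothing in schedule contacts a node. The function ends with a field written on the pod object, which is exactly what the kubelet on the chosen node is watching for.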

kubelet turns the object into a process

The kubelet on worker-1 (Article 11) watches pods whose nodeName points at itself. Seeing the just-assigned trace pod, it gets to the real work: through CRI (Article 10) it asks containerd to set up the pod sandbox (containerd in turn invokes the CNI plugin from Article 14 to assign an IP and wire up networking), pull the image, create the container, and start it. At each step it records an Event to the api-server, and this is the real event sequence of the pod we just created:

kubectl get events --field-selector involvedObject.name=trace-845c78d578-vf8zg
REASON      FROM                MESSAGE
Scheduled   default-scheduler   Successfully assigned default/trace-845c78d578-vf8zg to worker-1
Pulled      kubelet             Container image "...agnhost:2.52" already present on machine
Created     kubelet             Container created
Started     kubelet             Container started

Reading top to bottom is reading the lifecycle in order: Scheduled (scheduler assigns the node) → Pulled (containerd has the image) → Created (the container is built) → Started (the process runs). After the last step, kubelet updates the pod's status.phase to Running and fills in podIP (10.200.1.8, from worker-1's pod range). If the pod sits behind a Service, the EndpointSlice controller adds this IP to the Service's EndpointSlice, kube-proxy (Article 12) updates its rules, and the pod begins receiving traffic.
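The kubelet's side of the story can be sketched the same way: pick out the pods bound to this node, do the work, record events, update status. The function below is a toy; kubelet_sync, the allocate_ip callback, and the flat event tuples are invented for illustration, and the runtime and CNI calls are replaced by plain assignments.

```python
from typing import Callable, List, Tuple


def kubelet_sync(node_name: str, pods: List[dict],
                 events: List[Tuple[str, str]],
                 allocate_ip: Callable[[], str]) -> None:
    """One pass over the pod list from the point of view of a single node."""
    for pod in pods:
        if pod["spec"].get("nodeName") != node_name:
            continue                      # unscheduled, or another node's pod
        if pod["status"]["phase"] != "Pending":
            continue                      # already handled on an earlier pass
        name = pod["metadata"]["name"]
        # Stand-in for sandbox creation, where the runtime invokes CNI.
        pod["status"]["podIP"] = allocate_ip()
        # Stand-ins for the image pull, container create, and container start.
        for reason in ("Pulled", "Created", "Started"):
            events.append((reason, name))
        pod["status"]["phase"] = "Running"


pods = [
    {"metadata": {"name": "trace-845c78d578-vf8zg"},
     "spec": {"nodeName": "worker-1"}, "status": {"phase": "Pending"}},
    {"metadata": {"name": "not-yet-scheduled"},
     "spec": {}, "status": {"phase": "Pending"}},
]
events: List[Tuple[str, str]] = []
kubelet_sync("worker-1", pods, events, lambda: "10.200.1.8")
print([reason for reason, _ in events], pods[0]["status"])
```

Running the same pass a second time changes nothing, because the pod is no longer Pending. That idempotence is what lets the kubelet repeat its loop safely.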

The real model: watch and reconcile

Looking back over the whole journey, what's worth taking away isn't the order of the steps but how they link together. No component gives orders to another. Each one (the Deployment controller, the ReplicaSet controller, the scheduler, the kubelet) runs the same loop: observe the desired state, compare it to the actual state, do one small thing to narrow the gap, repeat. The api-server (backed by etcd) is the single source of truth that all of them look at, and it is the only component that talks to etcd directly.

This model is called level-triggered reconciliation, and it's quite different from a sequential chain of RPCs. An RPC chain that breaks halfway leaves things half-done; a reconcile loop just needs to look at the current state again and continue, not caring where it got to before. The cluster self-healing in Article 16 works precisely because of this mechanism: we delete a pod, the ReplicaSet controller on its next loop sees "want 3, have 2", and creates a replacement. There's no "handle delete event" logic at all: a watch event only tells the loop when to look again, and what it does is decided by comparing desire with reality, forever.
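A minimal sketch of a level-triggered loop, assuming a ReplicaSet-like controller whose entire world is a desired count and a set of pod names. The reconcile function and the trace-N naming are invented for the example (real replacement pods get fresh random suffixes); the point is that the function takes no event as input, only the current state.

```python
from typing import Set


def reconcile(desired: int, pods: Set[str]) -> None:
    """One pass: compare desired with actual and close the gap, either way."""
    n = 0
    while len(pods) < desired:            # too few: create what is missing
        name = f"trace-{n}"
        if name not in pods:
            pods.add(name)
        n += 1
    for name in sorted(pods)[desired:]:   # too many: remove the surplus
        pods.remove(name)


pods: Set[str] = set()
reconcile(3, pods)            # initial creation: want 3, have 0
pods.discard("trace-1")       # someone deletes a pod: want 3, have 2
reconcile(3, pods)            # same code path repairs it
print(sorted(pods))
```

The same call handles first creation, one deleted pod, several deleted pods, or a scale-down, and it does not matter whether an earlier pass crashed halfway: the next pass just looks at what exists now.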

Wrap-up

A kubectl apply goes through two halves: writing the object to etcd behind the authn/authz/admission gauntlet, then letting a chain of independent loops turn that object into a running pod: a controller creates a ReplicaSet then a Pod, the scheduler assigns a node, kubelet calls the runtime and the network. They all link through watching one source of truth, not through direct commands. That's why Kubernetes scales and self-heals: adding a controller is just adding one more loop observing that same source of truth.

The "build" part of the series closes here, and it's also time to change tempo. From the next article on, we stop adding new components to the cluster and dig into the concepts running on it, starting with the one closest to the user and also the foundational unit of every workload: the Pod. Article 18 opens the in-depth Pods part with a pod's lifecycle (the phases, the conditions, restartPolicy), digging much deeper than the glimpse in the Event chain above.