Why Build Kubernetes by Hand, and What We're Going to Build

If you've been through the "Kubernetes With Minikube" series, you know how to use a cluster: you write YAML, run kubectl apply, and Pods, Deployments, and Services appear and run. That series taught you to drive the car but never opened the hood. This series does the opposite: we open the hood, pull the engine apart piece by piece, and put it back together by hand. By the end, things that used to look like they just happened on their own become clear: you know what each component is, why it's there, and how it talks to the others.

The approach is to stand up a complete Kubernetes cluster from the ground up: create each certificate yourself, start each binary with its flags yourself, wire up pod networking yourself, and use no kubeadm or automation scripts at all. This follows the spirit of Kelsey Hightower's Kubernetes The Hard Way, but goes one step further: as we build each part, we stop and explain the mechanics behind it.

And the series doesn't stop once the cluster is running. Once you have a hand-built cluster — one where you know exactly what each component does — it becomes the ideal lab to deep-dive every Kubernetes concept: the Pod and its lifecycle, scheduling, storage, advanced networking, security, how to extend the API. The second half of the series does exactly that, topic group by topic group, with the same approach — run it for real, cross-check the official docs, explain the mechanics.

What "from scratch" means, and why skip kubeadm

In practice almost nobody builds a production cluster by hand. People use kubeadm, a managed service like EKS, GKE, or AKS, or tools and distributions like kops or k3s. One command, a few minutes, and you have a cluster. So why take the long road here?

Because that convenience hides the part worth learning. kubeadm init does dozens of things in one command: generates the CA, signs a whole batch of certificates, writes kubeconfig files, stands up etcd, launches the control plane as static pods, configures initial RBAC. When it finishes and everything works, you don't know why it works. And when something breaks — a certificate expires, etcd loses quorum, a kubelet can't join — what's in front of you is a black box.

   kubeadm init  ─────►  [ BLACK BOX ]  ─────►  cluster runs
                          (the part worth learning is in here)

   from scratch  ─────►  you do each step yourself  ─────►  cluster runs
                          ↑ each step is something learned

Our goal isn't to find a way to stand up a production cluster; it's to understand one. Once you've signed a certificate for kube-apiserver yourself, written the --etcd-servers flag yourself, and drawn the routes for the pod network yourself, every line in the logs of a kubeadm-built cluster means something to you. You can debug it, operate it, and you're no longer afraid of it.

A cluster is really just a few processes talking to each other

Before we take the machine apart, hold on to a simple mental picture. A Kubernetes cluster, with all the outer layers peeled off, is a group of Linux processes running on a few machines, talking to each other over HTTP/gRPC, plus a database that holds state.

   etcd            ── database: stores all cluster state
   kube-apiserver  ── the gate: anyone reading/writing etcd goes through here
   controller-mgr  ── the loops that keep "actual" matching "desired"
   scheduler       ── picks a node for unassigned pods
   kubelet         ── on each worker: takes work, tells the runtime to run containers
   kube-proxy      ── on each worker: builds network rules for Services
   containerd      ── the runtime that actually runs containers

The whole series, at its core, is installing those seven processes onto the right machines, giving them certificates so they trust each other, telling them each other's addresses, then switching them on. The declarative mindset we met in the minikube series (you declare a desired state and the loops pull the system toward it) is still how Kubernetes works; the only difference now is that we see each of those loops running in a specific process.
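To make "the loops pull the system toward it" concrete, here is a minimal sketch of a single control loop. It is a toy model, not controller-manager code: the Cluster class, the pod names, and the function names are all made up for illustration, and a real controller talks to the api-server instead of mutating a list.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for cluster state (hypothetical, not a Kubernetes API)."""
    desired_replicas: int
    running: list = field(default_factory=list)
    next_id: int = 0


def reconcile(cluster: Cluster) -> str:
    """One pass of the loop: compare actual with desired, take one step."""
    actual = len(cluster.running)
    if actual < cluster.desired_replicas:
        name = f"pod-{cluster.next_id}"
        cluster.next_id += 1
        cluster.running.append(name)
        return f"created {name}"
    if actual > cluster.desired_replicas:
        return f"deleted {cluster.running.pop()}"
    return "in sync"


def run_until_synced(cluster: Cluster, max_passes: int = 100) -> list:
    """Repeat reconcile() until nothing is left to do; return the actions taken."""
    actions = []
    for _ in range(max_passes):
        action = reconcile(cluster)
        if action == "in sync":
            break
        actions.append(action)
    return actions
```

Calling run_until_synced(Cluster(desired_replicas=3)) creates three pods one pass at a time; lowering desired_replicas afterwards and running it again deletes the surplus. The loop never receives an instruction like "add two pods"; it only ever sees the gap between desired and actual.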

What we're going to build

We don't stop at a single control plane node like the original Kubernetes The Hard Way (KTHW); we go straight to an HA layout: three control plane nodes and two workers, with etcd running on the control plane nodes themselves (stacked etcd), plus an HAProxy load balancer out front that presents the three api-servers as a single address.

                       ┌──────── HAProxy LB :6443 ────────┐
        kubectl ─────► │  collapse 3 api-servers → 1 VIP  │
                       └───────────────┬──────────────────┘
            ┌──────────────────────────┼──────────────────────────┐
            ▼                          ▼                          ▼
     ┌─ controller-0 ─┐        ┌─ controller-1 ─┐        ┌─ controller-2 ─┐
     │ etcd  ◄────────┼────────┤ etcd  ◄────────┼────────┤ etcd           │  3 members, quorum 2
     │ apiserver      │        │ apiserver      │        │ apiserver      │
     │ controller-mgr │        │ controller-mgr │        │ controller-mgr │  (1 leader)
     │ scheduler      │        │ scheduler      │        │ scheduler      │  (1 leader)
     └────────────────┘        └────────────────┘        └────────────────┘
                       ┌────────────────┴────────────────┐
                       ▼                                  ▼
                ┌──── worker-0 ────┐               ┌──── worker-1 ────┐
                │ containerd       │               │ containerd       │
                │ kubelet          │               │ kubelet          │
                │ kube-proxy ──►   │               │ kube-proxy ──►...│  (later: Cilium eBPF)
                └──────────────────┘               └──────────────────┘

The reason to go HA from the start is that this layout exposes a few important details a single control plane doesn't: why etcd needs an odd number of nodes and the concept of quorum, why controller-manager and scheduler must elect a leader instead of all three running at once, why the client points at a load balancer rather than directly at one api-server. By building HA, we're forced to confront and understand those questions.
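The "odd number of nodes" question comes down to a little arithmetic, sketched below. Quorum is a strict majority of the members, so adding a fourth member to a three-member cluster raises the quorum without raising the number of failures the cluster can survive. The function names are ours, chosen for illustration.

```python
def quorum(members: int) -> int:
    """Smallest strict majority of a cluster with this many members."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    print("members  quorum  failures tolerated")
    for n in range(1, 8):
        print(f"{n:>7}  {quorum(n):>6}  {failures_tolerated(n):>18}")
```

Three members tolerate one failure and so do four; five members tolerate two and so do six. The even sizes add a machine that has to be kept healthy without buying any extra fault tolerance, which is why our layout uses three.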

On networking, we go in two stages. The first stage builds it the old-school way: we configure kube-proxy and a basic CNI by hand to route the pod network, which is the clearest way to see how a Service turns into iptables rules. The second stage, in the advanced networking part, removes kube-proxy and replaces it with Cilium 1.19 (eBPF, kube-proxy-less) along with Hubble, a setup many production clusters have moved to by 2026. We go through both so you see both the roots and the modern approach, and understand what Cilium replaces.
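As a preview of what "a Service turns into iptables rules" looks like, here is a sketch of the arithmetic, assuming kube-proxy's iptables mode spreads traffic with a chain of random-probability rules: the first rule matches with probability 1/n, the next with 1/(n-1), and the last one always matches. The function names are hypothetical, and the real rules are inspected on the cluster in the kube-proxy article.

```python
from fractions import Fraction


def rule_probabilities(endpoints: int) -> list:
    """Per-rule match probability, in chain order: 1/n, 1/(n-1), ..., 1."""
    return [Fraction(1, endpoints - i) for i in range(endpoints)]


def effective_shares(endpoints: int) -> list:
    """Share of all traffic each endpoint receives once the chain is walked."""
    shares = []
    remaining = Fraction(1)  # traffic that has not matched an earlier rule
    for p in rule_probabilities(endpoints):
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares
```

For three endpoints the rules carry probabilities 1/3, 1/2, and 1, yet every endpoint ends up with exactly one third of the traffic, because each rule only sees what the earlier rules let through.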

Pin versions so it doesn't go stale

A legitimate worry with any Kubernetes guide is that it's outdated a few months later. We handle that two ways. The first is to pin specific versions and say so up front — the whole series runs on the following versions (as of May 2026):

   Kubernetes     v1.36.1      (latest stable at the time of writing)
   etcd           v3.6.x
   containerd     v2.x  + runc
   CNI plugins    v1.6+
   CoreDNS        1.12+
   Cilium         1.19.x  (+ Hubble)

The second is to dedicate a whole article to the upgrade process — upgrading the control plane then the nodes, and the version skew policy (the allowed version differences between components). Once you have that process down, you can update your own cluster to any later version, regardless of whether this article is still fresh. Before each article we also cross-check the official kubernetes.io docs (and the etcd, containerd, Cilium docs) at the exact version we're using.
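To give a feel for what the version skew policy constrains, here is a rough sketch of a skew check. The limits in the table are our reading of the policy for recent releases (no component newer than kube-apiserver; kubelet and kube-proxy up to three minor versions older; controller-manager and scheduler up to one), so treat them as assumptions and confirm them against the official policy page for your exact version. The function names are made up for illustration.

```python
# Assumed maximum number of minor versions a component may lag behind
# kube-apiserver. Verify against the official version skew policy.
MAX_MINOR_LAG = {
    "kube-controller-manager": 1,
    "kube-scheduler": 1,
    "kubelet": 3,
    "kube-proxy": 3,
}


def minor(version: str) -> int:
    """Extract the minor number from a version such as 'v1.36.1'."""
    return int(version.lstrip("v").split(".")[1])


def skew_ok(component: str, component_version: str, apiserver_version: str) -> bool:
    """True if the component is not newer than the api-server and not too old.

    Simplification: assumes both versions share the same major number.
    """
    lag = minor(apiserver_version) - minor(component_version)
    return 0 <= lag <= MAX_MINOR_LAG[component]
```

Under these assumptions, skew_ok("kubelet", "v1.33.4", "v1.36.1") holds while skew_ok("kubelet", "v1.37.0", "v1.36.1") does not, which is the reason upgrades go control plane first, nodes second.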

Lab environment and cost

We build on AWS EC2: six Ubuntu VMs (1 LB, 3 controllers, 2 workers), created for the lab and torn down when we're done. Each hands-on article has a 💰 Cost section estimating the per-hour cost and a 🧹 Cleanup section so you don't get an unexpected bill. Every command in the series is run for real on this EC2 cluster, and the output shown is real, not staged and pasted in.

All certificates, config files, manifests and helper scripts live in a separate repo: github.com/nghiadaulau/kubernetes-from-scratch. You can cross-check each step, but I recommend typing it out by hand, because that's the whole point of the series.

Series roadmap

The series is split into two large parts. Part one is the build described above — from the first certificate to a real HA cluster, then tracing a request as it travels through it. Part two uses that same cluster as a lab to deep-dive each Kubernetes concept, grouped by the same topics as the official docs.

Part one — Build a cluster from scratch:

Fundamentals and theory: Kubernetes architecture at a deep level, and PKI/TLS — why a cluster needs so many certificates.
Infrastructure and PKI by hand: stand up 6 EC2 instances and prepare the OS, create the CA and all the certificates yourself, generate kubeconfigs and the Secret encryption config.
Control plane: etcd, kube-apiserver, controller-manager and scheduler, then the HAProxy load balancer out front.
Workers: containerd/CRI, kubelet, kube-proxy.
Networking and DNS: the Kubernetes network model, configuring the pod network by hand, CoreDNS.
Verification: a comprehensive smoke test, then tracing a request from kubectl apply to a running pod.

Part two — Deep-dive concepts on the cluster we built:

Pods in depth: lifecycle, conditions, init/sidecar/ephemeral containers, probes, QoS, disruptions.
Workload controllers: Deployment, ReplicaSet, StatefulSet, DaemonSet, Job/CronJob.
Objects and the API: labels/selectors, namespaces, annotations, finalizers, owners/dependents, garbage collection.
Configuration and policy: ConfigMap/Secret, resource requests/limits, LimitRange, ResourceQuota.
Scheduling, preemption, eviction: scheduler and scheduling framework, affinity, taints/tolerations, topology spread, priority.
Autoscaling: metrics-server, HPA, VPA.
Storage: Volume, PV/PVC, StorageClass, dynamic provisioning, CSI, snapshot.
Advanced networking: Cilium/eBPF (replacing kube-proxy), NetworkPolicy, Ingress, Gateway API.
Security: authentication, RBAC, ServiceAccount, Pod Security Standards, seccomp/AppArmor, hardening.
Extending Kubernetes: Custom Resource, admission webhook, operator pattern, API aggregation.
Operations and observability: etcd backup/restore, cluster upgrades, logging, metrics, traces — then cleanup and wrap-up.

Each article in part two keeps the same approach as part one: run it for real on the cluster, stick to the kubernetes.io docs at the exact version, and explain the mechanics rather than listing flags. By the end of the series you won't just be able to use Kubernetes — you'll understand it from the inside, confident enough to operate it, debug it, and read any cluster someone else built.

Article 1 begins the theory part: a fresh look at a cluster's architecture, this time going deep into how the components coordinate — where the control loop runs, why everything goes through the api-server, and which stages a kubectl apply command passes through before the container runs.