API Aggregation: Bolting On a Second API Server

Article 57 added a new kind via a CRD: the main API server serves it and stores it in etcd. API aggregation extends the API differently: instead of adding a kind to the main API server, it bolts a second API server in behind it. The main API server receives requests for an API group and proxies them to that second server, which decides its own storage. We don't need to stand up anything new to see this: the metrics-server installed in Article 39 is an aggregated API that's already running in our cluster.

How aggregation differs from a CRD

   CRD:                                  Aggregation:
   ┌──────────────────────┐              ┌────────────────────────┐
   │  kube-apiserver      │              │  kube-apiserver        │
   │   /apis/kkloud.io    │              │   /apis/metrics.k8s.io ├─┐ proxy
   │   ↓ store in etcd    │              │                        │ │
   │  [etcd]              │              └────────────────────────┘ ▼
   └──────────────────────┘                         ┌────────────────────────┐
   MAIN API server serves                           │ metrics-server         │
   + stores in etcd                                 │ (second apiserver)     │
                                                    │ computes live, NO etcd │
                                                    └────────────────────────┘

CRD: a new kind served by the main API server, with its data in etcd. Aggregation: a separate API server serves its own API group and stores things its own way, whether that is a different database or nothing durable at all. metrics-server picks the latter: CPU/memory figures are gathered live from the kubelets and held only in memory, never written to etcd.

An APIService registers the second server

An APIService object "claims" an API group and points it at the Service that serves it. Look at what metrics-server registers:

kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get apiservice v1beta1.metrics.k8s.io \
  -o jsonpath='group={.spec.group} version={.spec.version} service={.spec.service.namespace}/{.spec.service.name} available={.status.conditions[?(@.type=="Available")].status}'

NAME                     SERVICE                      AVAILABLE   AGE
v1beta1.metrics.k8s.io   kube-system/metrics-server   True        6h30m

group=metrics.k8s.io version=v1beta1 service=kube-system/metrics-server available=True
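
For orientation, the fields printed above map onto an APIService manifest roughly like the sketch below. It is not copied from this cluster: the group, version and Service come from the output above, while insecureSkipTLSVerify, groupPriorityMinimum and versionPriority are assumed values of the kind the upstream metrics-server manifest uses, and the scratch file is made up for the example. The snippet only writes the file and prints the routing fields; nothing is applied.

```shell
#!/bin/sh
# Sketch only: write an illustrative APIService manifest to a scratch file.
manifest="$(mktemp)"   # hypothetical scratch file, not part of the article's repo

cat > "$manifest" <<'EOF'
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io      # must be <version>.<group>
spec:
  group: metrics.k8s.io
  version: v1beta1
  service:                          # where the main API server proxies to
    namespace: kube-system
    name: metrics-server
  insecureSkipTLSVerify: true       # assumption; a caBundle is the stricter option
  groupPriorityMinimum: 100         # assumption
  versionPriority: 100              # assumption
EOF

# Print just the fields that decide routing.
grep -E '^  (group|version):|^    (namespace|name):' "$manifest"
```

To compare the sketch with a real cluster without changing anything, kubectl apply --dry-run=client -f "$manifest" is one option.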

The v1beta1.metrics.k8s.io APIService tells the main API server to proxy every request for /apis/metrics.k8s.io/v1beta1 to the Service kube-system/metrics-server. In the full APIService list, the core groups read Local (served by the main API server itself), while metrics names a Service:

kubectl get apiservice | grep -E "Local|metrics" | head

v1.                       Local                        True   9h
v1.apps                   Local                        True   9h
v1.authentication.k8s.io  Local                        True   9h
v1beta1.metrics.k8s.io    kube-system/metrics-server   True   6h30m

Local means the main API server serves it itself; metrics is handed off to another server.
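
The same split can be pulled out mechanically. A small sketch, run here against the sample rows above rather than a live cluster: any row whose SERVICE column is not Local is an aggregated API. The function name classify_apiservices is hypothetical.

```shell
#!/bin/sh
# classify_apiservices: read `kubectl get apiservice --no-headers` style rows
# on stdin and label each one by who serves it. (Hypothetical helper name.)
classify_apiservices() {
  awk '{
    if ($2 == "Local") print $1, "-> served by kube-apiserver";
    else               print $1, "-> proxied to Service", $2;
  }'
}

# Sample rows copied from the output above; on a live cluster, pipe in
#   kubectl get apiservice --no-headers
classify_apiservices <<'EOF'
v1.                       Local                        True   9h
v1.apps                   Local                        True   9h
v1.authentication.k8s.io  Local                        True   9h
v1beta1.metrics.k8s.io    kube-system/metrics-server   True   6h30m
EOF
```

Only the last row comes back as proxied, which matches what the table shows by eye.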

The aggregation layer authenticates with the front-proxy cert

So that the second server can trust calls from the main API server, the main server uses a dedicated certificate set, front-proxy, issued back in Article 4. Article 39 already set these flags on the API server (along with --enable-aggregator-routing, added there because the control plane nodes don't run the CNI):

ssh controller-0 'grep -oE "\-\-(requestheader-[a-z-]*|proxy-client-[a-z-]*|enable-aggregator-routing)[^ ]*" \
  /etc/systemd/system/kube-apiserver.service | sort -u'

--enable-aggregator-routing=true
--proxy-client-cert-file=/var/lib/kubernetes/front-proxy-client.pem
--proxy-client-key-file=/var/lib/kubernetes/front-proxy-client-key.pem
--requestheader-allowed-names=front-proxy-client
--requestheader-client-ca-file=/var/lib/kubernetes/front-proxy-ca.pem
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User

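When proxying a request, the main API server first authenticates the original user, then puts that identity into the X-Remote-User/X-Remote-Group headers and presents front-proxy-client.pem as its TLS client certificate. The headers themselves are not signed: the second server trusts them only because the connection's client cert chains to front-proxy-ca.pem and carries an allowed name (front-proxy-client). This is why Article 4 had to create a separate front-proxy CA: it keeps ordinary client certs, issued by the cluster's main CA, from being accepted as the front proxy.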
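
The trust check can be reproduced offline with throwaway certificates. This sketch does not touch the cluster's real files: it generates a demo CA and a client cert in a temp directory (all file names here are hypothetical, and openssl is assumed to be installed), then runs the two checks that matter, whether the client cert chains to the front-proxy CA and whether its CN is an allowed name.

```shell
#!/bin/sh
set -e
dir="$(mktemp -d)"; cd "$dir"

# 1. Throwaway CA standing in for front-proxy-ca.pem.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=front-proxy-ca" -keyout ca-key.pem -out ca.pem 2>/dev/null

# 2. Client cert standing in for front-proxy-client.pem, signed by that CA.
openssl req -newkey rsa:2048 -nodes \
  -subj "/CN=front-proxy-client" -keyout client-key.pem -out client.csr 2>/dev/null
openssl x509 -req -in client.csr -CA ca.pem -CAkey ca-key.pem \
  -CAcreateserial -days 1 -out client.pem 2>/dev/null

# 3. An unrelated CA, to show that only the right CA vouches for the cert.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=some-other-ca" -keyout other-key.pem -out other.pem 2>/dev/null

# Check 1: does the client cert chain to the CA?
# (the role of --requestheader-client-ca-file)
if openssl verify -CAfile ca.pem client.pem >/dev/null 2>&1; then chain_ok=yes; else chain_ok=no; fi
if openssl verify -CAfile other.pem client.pem >/dev/null 2>&1; then other_ok=yes; else other_ok=no; fi

# Check 2: is the CN one of the allowed names?
# (the role of --requestheader-allowed-names)
subject="$(openssl x509 -in client.pem -noout -subject)"
case "$subject" in
  *front-proxy-client*) cn_ok=yes ;;
  *)                    cn_ok=no ;;
esac

echo "signed by front-proxy CA: $chain_ok"
echo "signed by unrelated CA:   $other_ok"
echo "CN allowed:               $cn_ok"
```

On a recent OpenSSL the first and third lines should read yes and the second no. On the real cluster, the same openssl verify and openssl x509 -noout -subject commands could be pointed at the front-proxy files listed in the flags above.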

Live data, never in etcd

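Call the aggregated API group directly to see the real figures metrics-server returns over the proxy path. The raw response is a JSON NodeMetricsList; the output below is condensed to each node's name and usage: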

kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes

worker-0  {'cpu': '97791959n',  'memory': '980620Ki'}
worker-1  {'cpu': '139388326n', 'memory': '791264Ki'}
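
The raw quantities use Kubernetes suffixes: n is nanocores (billionths of a CPU core) and Ki is kibibytes. A small sketch that converts the two sample rows above into the millicore and MiB units that kubectl top prints. The helper name to_top_units is hypothetical, and the rounding is plain round-to-nearest, which may differ by one from what kubectl shows.

```shell
#!/bin/sh
# to_top_units NAME CPU MEMORY
# Converts a nanocore CPU quantity (e.g. 97791959n) and a KiB memory quantity
# (e.g. 980620Ki) into millicores and MiB. Other suffixes are not handled.
to_top_units() {
  awk -v name="$1" -v cpu="$2" -v mem="$3" 'BEGIN {
    sub(/n$/, "", cpu);  sub(/Ki$/, "", mem);
    printf "%s %.0fm %.0fMi\n", name, cpu / 1000000, mem / 1024;
  }'
}

to_top_units worker-0 97791959n  980620Ki
to_top_units worker-1 139388326n 791264Ki
```

So worker-0 is using roughly a tenth of one core and a little under 1 GiB of memory.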

These figures aren't stored in etcd: metrics-server collects them from the kubelets, keeps only the latest samples in memory, and answers from there. Verify by searching etcd: the metrics data is not there, only a single key, which is the registration object itself:

sudo etcdctl ... get /registry/ --prefix --keys-only | grep metrics.k8s.io

/registry/apiregistration.k8s.io/apiservices/v1beta1.metrics.k8s.io

Just one key: the APIService object (itself an ordinary resource, stored in etcd like every object). No key holds any node/pod measurement. This is the key difference from the CRD in Article 57: the Widget was stored at /registry/kkloud.io/widgets/..., while metrics live only in the second server and nothing is written to etcd. Aggregation fits exactly this kind of data, served live or stored elsewhere, which a CRD (always backed by etcd) cannot do.

🧹 Cleanup

This article creates nothing — it just examines the metrics-server that's been around since Article 39, kept as infrastructure for HPA/kubectl top. There's nothing to clean up. The commands used in this article are at github.com/nghiadaulau/kubernetes-from-scratch, directory 60-api-aggregation.

Wrap-up

API aggregation bolts a second API server in behind the main one: an APIService object registers a group/version and points it at the Service that serves it, after which the main API server proxies every request for that group across. We examined metrics-server (Article 39) as a real aggregated API: v1beta1.metrics.k8s.io points at kube-system/metrics-server with Available=True, while core groups read Local. The aggregation layer authenticates with the front-proxy cert set (Article 4): the main API server puts the user identity into X-Remote-* headers and presents the front-proxy-client cert, and the second server trusts the headers because that cert chains to front-proxy-ca. It differs from a CRD in storage: kubectl get --raw /apis/metrics.k8s.io/... returns live CPU/memory figures, and etcd holds only the APIService registration object, no metrics data, because the second server handles its own storage. Use aggregation when you need data computed live or stored outside etcd; use a CRD when you just need to add a kind stored in etcd.

The last four articles extended the API server with data and controllers. Article 61 closes Part XII on a different extension axis — hardware: device plugins let a node advertise resources beyond CPU/memory (GPU, special devices) for pods to request and the scheduler to divvy up.