Object management, recommended labels, and storage version

Kai · 7 min read

Throughout the series we've created objects every which way without naming the technique: sometimes kubectl create deployment, sometimes kubectl apply -f, sometimes kubectl create -f. The docs treat these as three distinct object-management techniques, and warn outright: "A Kubernetes object should be managed using only one technique. Mixing and matching techniques for the same object results in undefined behavior." This article closes Part V by laying the three techniques side by side, then ties into two operational concepts: recommended labels so tools speak a common language, and storage version, the API version an object actually sits at inside etcd.

Three object-management techniques

Imperative command — operate directly on the live object

The fastest style, per the docs: "a user operates directly on live objects in a cluster. The user provides operations to the kubectl command as arguments or flags ... This is the recommended way to get started or to run a one-off task ... it provides no history of previous configurations."

kubectl create deployment imp-cmd --image=busybox:1.36 -- sleep 3600
deployment.apps/imp-cmd created

No file, no history, just one verb acting straight on the cluster. Good for experiments or one-off work (kubectl run, kubectl expose, kubectl scale). Downsides the docs list: no review integration, no audit trail, no "source of truth" beyond the live object itself.

Imperative object configuration — create -f a file

A step more serious: "the kubectl command specifies the operation (create, replace, etc.), optional flags and at least one file name. The file specified must contain a full definition of the object."

kubectl create -f cm.yaml      # cm.yaml defines ConfigMap imp-obj
kubectl create -f cm.yaml      # run AGAIN
configmap/imp-obj created
Error from server (AlreadyExists): ... configmaps "imp-obj" already exists

The config file can be stored in Git and reviewed, but the command is still imperative: a second create reports AlreadyExists rather than updating the object. To change it you must run kubectl replace -f, and replace swaps the entire live spec for the contents of the file. The docs warn that it "drops all changes to the object missing from the configuration file", which is dangerous for fields the cluster fills in on its own (the docs' example is a LoadBalancer Service's externalIPs). This is why people move to the third style.
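
To make the create/replace behavior concrete, here is a toy model in Python. It is only a sketch: the dictionary stands in for the cluster, the functions create and replace are hypothetical helpers rather than anything from kubectl, and the object name web and the IP address are made up for illustration.

```python
class AlreadyExists(Exception):
    """Raised when create targets a name that is already stored."""


def create(store, name, obj):
    # Imperative create: refuse to touch an object that already exists.
    if name in store:
        raise AlreadyExists(f'"{name}" already exists')
    store[name] = dict(obj)


def replace(store, name, obj):
    # Imperative replace: the stored object becomes exactly what the file says.
    if name not in store:
        raise KeyError(name)
    store[name] = dict(obj)


store = {}
file_spec = {"type": "LoadBalancer", "port": 80}  # hypothetical file contents

create(store, "web", file_spec)
try:
    create(store, "web", file_spec)  # run AGAIN
    second_create = "created"
except AlreadyExists as err:
    second_create = str(err)

# Stand-in for the cluster filling in a field the file never mentions.
store["web"]["externalIPs"] = ["203.0.113.10"]

replace(store, "web", file_spec)
after_replace = store["web"]

print(second_create)  # "web" already exists
print(after_replace)  # {'type': 'LoadBalancer', 'port': 80}
```

The field added outside the file is gone after replace, which is the failure mode the docs' warning describes.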

Declarative — apply a whole directory

The declarative style, per the docs: "the user does not define the operations to be taken on the files. Create, update, and delete operations are automatically detected per-object by kubectl." You state the desired state (the file), and kubectl works out whether to create or update. And before changing anything, kubectl diff shows what will change:

kubectl apply -f dep.yaml          # creates Deployment decl-app (replicas: 2)
# ... change replicas 2 -> 3 in the file ...
kubectl diff -f dep.yaml
-  generation: 1
+  generation: 2
-  replicas: 2
+  replicas: 3
kubectl apply -f dep.yaml           # apply AGAIN
deployment.apps/decl-app configured

Quite different from create -f: a second apply does not error but prints configured, because kubectl detects the difference and patches it. The trick, per the docs: "using the patch API operation to write only observed differences, instead of using the replace API operation". Because apply patches only the changed part, "changes made directly to live objects are retained, even if they are not merged back into the configuration files." To know what you deliberately changed, it compares your file with the last-applied-configuration annotation (the one we met in Articles 24 and 29) and with the live object, and leaves alone the parts that other people or controllers adjusted.

apply also works on whole directories (apply -f configs/) and is the foundation of GitOps. The docs' summary table recommends imperative commands for development and the two file-based styles for production, with the learning curve rising from command to object config to declarative.
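
The comparison apply performs can be sketched in a few lines of Python. This is a simplified model that assumes flat dictionaries instead of nested objects, and the helper names three_way_patch and apply_config are hypothetical. It follows the two steps the docs describe: fields to delete come from comparing last-applied with the file, and fields to set come from comparing the file with the live object.

```python
def three_way_patch(last_applied, desired, live):
    """Return (fields to set, fields to delete) for a flat object."""
    # Set: anything in the file whose live value differs.
    to_set = {k: v for k, v in desired.items() if live.get(k) != v}
    # Delete: anything we applied before but have since removed from the file.
    to_delete = {k for k in last_applied if k not in desired and k in live}
    return to_set, to_delete


def apply_config(last_applied, desired, live):
    to_set, to_delete = three_way_patch(last_applied, desired, live)
    merged = {k: v for k, v in live.items() if k not in to_delete}
    merged.update(to_set)
    return merged, to_set, to_delete


# What the previous apply recorded (the role of last-applied-configuration).
last_applied = {"replicas": 2, "image": "busybox:1.36", "minReadySeconds": 5}
# The edited file: replicas raised, minReadySeconds removed.
desired = {"replicas": 3, "image": "busybox:1.36"}
# The live object, where someone set "paused" directly on the cluster.
live = {"replicas": 2, "image": "busybox:1.36", "minReadySeconds": 5, "paused": True}

merged, to_set, to_delete = apply_config(last_applied, desired, live)
print(to_set)     # {'replicas': 3}
print(to_delete)  # {'minReadySeconds'}
print(merged)     # {'replicas': 3, 'image': 'busybox:1.36', 'paused': True}
```

The paused field survives because it was never in last_applied, so the model has no reason to treat its absence from the file as a deletion. Real kubectl additionally handles nested maps and lists with merge strategies, which this sketch leaves out.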

The golden rule, restated: use only one technique per object. Mixing apply with edit/replace skews last-applied-configuration and produces unpredictable behavior.

Recommended labels: so tools speak a common language

Article 28 said labels are yours to choose, but if everyone names them differently, tools (Helm, dashboards, monitoring) cannot interpret them consistently. Kubernetes therefore recommends a label set using the app.kubernetes.io/ prefix. The docs: "Shared labels and annotations share a common prefix: app.kubernetes.io. Labels without a prefix are private to users. The shared prefix ensures that shared labels do not interfere with custom user labels."

metadata:
  labels:
    app.kubernetes.io/name: mysql            # application name
    app.kubernetes.io/instance: mysql-abcxyz # distinguishes multiple installs
    app.kubernetes.io/version: "8.0.36"      # app version
    app.kubernetes.io/component: database    # role within the architecture
    app.kubernetes.io/part-of: wordpress     # belongs to a larger system
    app.kubernetes.io/managed-by: kubectl    # the managing tool

Because they're labels (Article 28), they're queryable like any label, but now they follow a shared convention that tools can rely on:

kubectl get deploy -l app.kubernetes.io/part-of=wordpress
kubectl get deploy -l 'app.kubernetes.io/component in (database)'
recommended
recommended

part-of=wordpress groups every component of the same system (database, frontend, cache...); component in (database) filters by role. This is why you should attach this label set from the start: any dashboard can build an "application tree" without knowing your private convention.
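
As a rough illustration of what those two selectors do, the Python sketch below matches labels in memory. The object names (wordpress-db, wordpress-web, billing-db) and the helper functions are hypothetical, and only the equality and "in" forms are modeled.

```python
PART_OF = "app.kubernetes.io/part-of"
COMPONENT = "app.kubernetes.io/component"

# Hypothetical objects and their labels.
objects = {
    "wordpress-db": {PART_OF: "wordpress", COMPONENT: "database"},
    "wordpress-web": {PART_OF: "wordpress", COMPONENT: "frontend"},
    "billing-db": {PART_OF: "billing", COMPONENT: "database"},
}


def matches_equals(labels, key, value):
    # Equality-based selector: key=value
    return labels.get(key) == value


def matches_in(labels, key, values):
    # Set-based selector: key in (v1, v2, ...)
    return key in labels and labels[key] in values


def select(objs, predicate):
    return sorted(name for name, labels in objs.items() if predicate(labels))


same_system = select(objects, lambda l: matches_equals(l, PART_OF, "wordpress"))
databases = select(objects, lambda l: matches_in(l, COMPONENT, {"database"}))

print(same_system)  # ['wordpress-db', 'wordpress-web']
print(databases)    # ['billing-db', 'wordpress-db']
```

A dashboard that groups by part-of and then by component needs nothing beyond these two lookups, which is the point of agreeing on the keys up front.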

Storage version: which API version the object sits at in etcd

The final concept, and the most subtle. A resource can be served by the API server at multiple versions at once:

kubectl api-versions | grep -E "autoscaling|networking"
autoscaling/v1
autoscaling/v2
networking.k8s.io/v1
networking.k8s.io/v1beta1

autoscaling has both v1 and v2; kubectl get hpa.v1.autoscaling or hpa.v2.autoscaling both work, because the API server converts between them. But in etcd, each object is stored only once, at exactly one version called the storage version. Let's dig in for real by reading etcd on controller-0 (the very cluster we stood up in Article 6), pulling the actual decl-app Deployment:

sudo etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/etcd-ca.pem --cert=/etc/etcd/etcd.pem --key=/etc/etcd/etcd-key.pem \
  get /registry/deployments/default/decl-app | head -c 80 | tr -c '[:print:]' '.'
/registry/deployments/default/decl-app.k8s.....apps/v1..Deployment........decl-a

A lot is readable in the first few bytes: the object sits under the key /registry/deployments/default/decl-app; the value opens with k8s, the magic prefix signaling this is protobuf-encoded data (not JSON, more compact); then apps/v1 Deployment, which is the storage version. No matter which API version you created the Deployment through, etcd stores it at apps/v1. (Compare with Article 7, where reading /registry/secrets/... showed the k8s:enc:aescbc:... prefix of at-rest encryption: same place, different object type.)
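
The same reading can be done programmatically. The Python sketch below works on a hand-built sample, not on bytes read from a cluster, and it assumes the envelope layout as I understand it: the four-byte magic k8s\x00, then a length-prefixed type header whose field 1 is the API version and field 2 is the kind. Treat the field numbers and the helper names as assumptions.

```python
MAGIC = b"k8s\x00"  # prefix that marks a protobuf-encoded value


def read_varint(buf, pos):
    """Decode a protobuf varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_string(buf, pos, expected_tag):
    """Read one length-delimited string field with the given tag byte."""
    if buf[pos] != expected_tag:
        raise ValueError(f"unexpected tag {buf[pos]:#x} at offset {pos}")
    length, pos = read_varint(buf, pos + 1)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def type_meta(value):
    """Return (apiVersion, kind) from a protobuf-framed value."""
    if not value.startswith(MAGIC):
        raise ValueError("value is not protobuf-framed")
    pos = len(MAGIC)
    if value[pos] != 0x0A:  # field 1, length-delimited: the type header
        raise ValueError("missing type header")
    _, pos = read_varint(value, pos + 1)
    api_version, pos = read_string(value, pos, 0x0A)  # field 1: apiVersion
    kind, pos = read_string(value, pos, 0x12)         # field 2: kind
    return api_version, kind


def classify(value):
    if value.startswith(b"k8s:enc:"):
        return "encrypted at rest"
    if value.startswith(MAGIC):
        return "protobuf"
    return "other"


def printable(value):
    # Same idea as: tr -c '[:print:]' '.'
    return "".join(chr(b) if 32 <= b < 127 else "." for b in value)


# Hand-built sample: magic + header (21 bytes) holding apps/v1 and Deployment.
sample = MAGIC + b"\x0a\x15" + b"\x0a\x07apps/v1" + b"\x12\x0aDeployment"

print(classify(sample))   # protobuf
print(type_meta(sample))  # ('apps/v1', 'Deployment')
print(printable(sample))  # k8s.....apps/v1..Deployment
```

The dots in the printable form are the NUL byte of the magic plus the tag and length bytes, which is why the etcdctl output above shows runs of dots around apps/v1 and Deployment.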

This matters when you upgrade the cluster (Article 63): a new Kubernetes version can change a resource's storage version (e.g. dropping v1beta1, moving to v1). Old objects in etcd remain at the old version until they're rewritten. This is the root of "you must migrate objects before removing an API version", a step in the upgrade process we'll meet again. Understanding storage version now means you won't be caught off guard then.
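
A toy model in Python shows why a rewrite is needed. The ToyStore class is hypothetical and ignores conversion entirely; it only tracks which version each object was last written at, under the assumption that every write is encoded at the current storage version.

```python
class ToyStore:
    """Tracks the version each object was last written at."""

    def __init__(self, storage_version):
        self.storage_version = storage_version
        self.objects = {}  # name -> (stored_version, data)

    def write(self, name, data):
        # Every write is encoded at the current storage version.
        self.objects[name] = (self.storage_version, data)

    def read(self, name):
        return self.objects[name][1]

    def stored_versions(self):
        return {version for version, _ in self.objects.values()}


store = ToyStore("v1beta1")
store.write("a", {"spec": 1})
store.write("b", {"spec": 2})

# An upgrade changes the storage version; nothing on disk is touched.
store.storage_version = "v1"
store.write("c", {"spec": 3})
before = store.stored_versions()

# Migration: read every object and write it back unchanged.
for name in list(store.objects):
    store.write(name, store.read(name))
after = store.stored_versions()

print(sorted(before))  # ['v1', 'v1beta1']
print(sorted(after))   # ['v1']
```

Until that read-and-write-back pass runs, the old version still has to be decodable, which is the constraint behind migrating objects before an API version is removed.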

🧹 Cleanup

kubectl delete deployment imp-cmd decl-app recommended
kubectl delete cm imp-obj

These are all objects in the cluster, so deleting cleans them up (the Deployment drags along its ReplicaSet + Pods via the GC of Article 29). The cluster returns to two CoreDNS pods. Manifests are at github.com/nghiadaulau/kubernetes-from-scratch, directory 30-object-management.

Wrap-up

Three object-management techniques, and don't mix them on the same object. Imperative command (kubectl create deployment): straight at the live object, no history, good for experiments. Imperative object configuration (create -f/replace -f): there is a file for Git, but a second create reports AlreadyExists and replace swaps the spec wholesale. Declarative (apply -f/diff -f): auto-detects create/update, patches only the changed part via last-applied-configuration, prints configured rather than erroring, and is the foundation of GitOps.

Recommended labels app.kubernetes.io/* (name/instance/version/component/part-of/managed-by) give tools a common vocabulary and are queryable like any label.

Storage version: the API server serves a resource at multiple versions (we saw autoscaling/v1 + v2) but etcd stores only one. Reading etcd directly, we saw decl-app stored as protobuf at apps/v1, which is the basis for migration during an upgrade.

That ends Part V. Part VI moves into configuration and policy: Article 31 opens with ConfigMap and Secret, how to separate configuration from the image, inject it into pods via env or volume, and how Secret differs from ConfigMap (including the at-rest encryption whose trace we already saw in etcd in this article).