Labels, selectors, namespaces and annotations

Across the past four parts we've typed -l app=web, -l job-name=..., --field-selector status.phase=... reflexively without once stopping to dissect them. Part V shifts focus from "what to run" (controllers) to "how to organize and query things" (objects and the API), and the natural starting point is that classification toolkit itself: labels, selectors, namespaces, annotations, and field selectors. This is the foundation for everything downstream: Services finding pods, controllers managing pods, RBAC scoping by namespace — all stand on these few concepts.

Labels: tags to select by

The docs define it: "Labels are key/value pairs that are attached to objects such as Pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system." The last clause matters: a label carries no meaning for the core system. tier=frontend doesn't make Kubernetes do anything special; it only means something when you (or a controller or Service) use a selector to group by it. That is why the docs also call the label selector "the core grouping primitive in Kubernetes".

Set up a basket of labeled pods to query:

# web-prod: app=web, env=prod, tier=frontend
# web-dev:  app=web, env=dev,  tier=frontend
# api-prod: app=api, env=prod, tier=backend
# cache:    app=redis            (NO tier/env label)

kubectl get pods --show-labels

NAME       ...   LABELS
api-prod         app=api,env=prod,tier=backend
cache            app=redis
web-dev          app=web,env=dev,tier=frontend
web-prod         app=web,env=prod,tier=frontend

Equality-based selectors

The simplest kind, per the docs: "Three kinds of operators are admitted =,==,!=." The first two mean equals (synonyms), the last means not-equals:

kubectl get pods -l env=prod          # equals
kubectl get pods -l 'tier!=frontend'  # not equals

# env=prod:
api-prod
web-prod

# tier!=frontend:
api-prod
cache

env=prod filters to exactly the two prod pods. tier!=frontend returns api-prod (tier=backend) and cache — note that cache has no tier label yet still passes, since it doesn't carry the value frontend. Joining multiple conditions with a comma is an AND:

kubectl get pods -l 'env=prod,tier=frontend'

web-prod

Only web-prod satisfies both. The Service in Article 16 and the ReplicaSet in Article 24 select pods the same way, just with slightly different syntax: a Service's selector: {app: web} and a ReplicaSet's selector: {matchLabels: {app: web}} are both equivalent to app=web.
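To make the equality rules concrete, here is a minimal Python sketch that models them over our four pods. It is a toy model of the semantics described above, not how Kubernetes implements selectors; the names PODS, matches and select are made up for this illustration.

```python
# Toy model of equality-based label selectors (illustration only,
# not the Kubernetes implementation). PODS mirrors the four demo pods.
PODS = {
    "web-prod": {"app": "web", "env": "prod", "tier": "frontend"},
    "web-dev": {"app": "web", "env": "dev", "tier": "frontend"},
    "api-prod": {"app": "api", "env": "prod", "tier": "backend"},
    "cache": {"app": "redis"},
}


def matches(labels, selector):
    """Return True if labels satisfy every comma-separated requirement (AND)."""
    for requirement in selector.split(","):
        requirement = requirement.strip()
        if "!=" in requirement:
            key, value = requirement.split("!=", 1)
            # A pod WITHOUT the key passes: it does not carry the excluded value.
            if labels.get(key.strip()) == value.strip():
                return False
        else:
            # "=" and "==" are synonyms.
            key, value = requirement.replace("==", "=").split("=", 1)
            if labels.get(key.strip()) != value.strip():
                return False
    return True


def select(selector):
    """Names of all pods matching the selector, sorted like kubectl output."""
    return sorted(name for name, labels in PODS.items() if matches(labels, selector))


print(select("env=prod"))                # ['api-prod', 'web-prod']
print(select("tier!=frontend"))          # ['api-prod', 'cache']
print(select("env=prod,tier=frontend"))  # ['web-prod']
```

The detail worth noticing is the != branch: labels.get returns None for cache, which is not "frontend", so cache passes, just as it did in the kubectl output.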

Set-based selectors

More powerful is filtering by a set of values. Docs: "Three kinds of operators are supported: in,notin and exists."

kubectl get pods -l 'env in (prod,qa)'      # value within the set
kubectl get pods -l 'tier notin (frontend)' # value outside the set (+ pods without the label)
kubectl get pods -l 'tier'                  # the tier label exists (value ignored)
kubectl get pods -l '!tier'                 # the tier label does NOT exist

# env in (prod,qa):       api-prod, web-prod
# tier notin (frontend):  api-prod, cache
# tier (exists):          api-prod, web-dev, web-prod
# !tier (not exists):     cache

Four lines cover all the operators. exists (-l tier) gathers every pod that has a tier label regardless of value: three pods. !tier takes the inverse: only cache. The docs note exactly what we see: notin also gathers pods that don't have the label at all (cache slips into tier notin (frontend)). Set-based selectors are handy for managing by group: "all pods with env in (prod, staging)", "all pods not yet labeled with a team".
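The four set-based forms map onto the operator names that matchExpressions uses in manifests: In, NotIn, Exists and DoesNotExist. The sketch below models them in Python as (key, operator, values) tuples. It is an illustrative model under that assumption, not Kubernetes code; satisfies and select are hypothetical helper names.

```python
# Toy model of set-based label selectors (illustration only).
PODS = {
    "web-prod": {"app": "web", "env": "prod", "tier": "frontend"},
    "web-dev": {"app": "web", "env": "dev", "tier": "frontend"},
    "api-prod": {"app": "api", "env": "prod", "tier": "backend"},
    "cache": {"app": "redis"},
}


def satisfies(labels, key, operator, values=()):
    """Evaluate one requirement against one pod's labels."""
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # Pods that lack the key entirely also match.
        return key not in labels or labels[key] not in values
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unknown operator: {operator}")


def select(*requirements):
    """Names of pods satisfying ALL requirements, sorted."""
    return sorted(
        name
        for name, labels in PODS.items()
        if all(satisfies(labels, *req) for req in requirements)
    )


print(select(("env", "In", ("prod", "qa"))))     # ['api-prod', 'web-prod']
print(select(("tier", "NotIn", ("frontend",))))  # ['api-prod', 'cache']
print(select(("tier", "Exists")))                # ['api-prod', 'web-dev', 'web-prod']
print(select(("tier", "DoesNotExist")))          # ['cache']
```

Writing In and NotIn side by side shows why they are not exact opposites: In requires the key to be present, while NotIn is satisfied by its absence.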

Labels can be changed at any time (kubectl label pod web-dev env=staging --overwrite), and because controllers/Services select pods by label dynamically, changing one pod's label can move it into or out of a controller's management. That's a double-edged sword: handy for "pulling" a broken pod out of a Service while keeping it for inspection (change its label so it falls out of the selector), but also easy to cause an accident if you change the wrong one.
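A small Python sketch of that "pull it out of the Service" trick, again as a toy model rather than real Kubernetes code. The Service selector app=web comes from earlier in this article; the replacement value web-quarantine and the helper name endpoints are hypothetical.

```python
# Toy model: a Service's backends are re-derived from labels each time,
# so relabeling a pod changes membership. Illustration only.
pods = {
    "web-prod": {"app": "web", "env": "prod", "tier": "frontend"},
    "web-dev": {"app": "web", "env": "dev", "tier": "frontend"},
    "api-prod": {"app": "api", "env": "prod", "tier": "backend"},
}

service_selector = {"app": "web"}


def endpoints(pods, selector):
    """Pods whose labels contain every key/value pair of the selector."""
    return sorted(
        name
        for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


before = endpoints(pods, service_selector)

# Equivalent in spirit to: kubectl label pod web-dev app=web-quarantine --overwrite
# ("web-quarantine" is a made-up value for this example.)
pods["web-dev"]["app"] = "web-quarantine"

after = endpoints(pods, service_selector)

print(before)  # ['web-dev', 'web-prod']
print(after)   # ['web-prod']
```

The pod itself is untouched; only its membership changed. The same mechanism is what makes a wrong relabel dangerous: a pod that falls out of a ReplicaSet's selector is no longer counted by it.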

Annotations: metadata not meant to select by

If a label is to select, then an annotation is to attach information. The docs distinguish bluntly: "Labels can be used to select objects ... In contrast, annotations are not used to identify and select objects. The metadata in an annotation can be small or large, structured or unstructured, and can include characters not permitted by labels." Our web-prod pod carries a few realistic annotations:

kubectl get pod web-prod -o jsonpath='{.metadata.annotations}'

{"kkloud.io/git-commit":"a1b2c3d4e5f6 (release branch, PR #142)",
 "kkloud.io/owner":"team-platform, pager +84-xxx",
 "kubectl.kubernetes.io/last-applied-configuration":"{...}"}

The values contain parentheses, #, +, and spaces — characters labels forbid. Annotations hold them freely: a commit hash, a PR number, the on-call person, even a whole JSON config (the very last-applied-configuration that kubectl apply writes — and the thing tied to the rollback warning in Article 24). The key point: annotations are not selectable:

kubectl get pods -l 'kkloud.io/owner'

No resources found in default namespace.

-l only inspects labels, not annotations, so even though web-prod carries that annotation, the selector returns nothing. The rule of thumb: anything you need to filter or group by goes in a label (short, character-constrained); anything that is only there to be read or referenced (by tools, people, build systems) goes in an annotation.
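To see why those annotation values could never be labels, here is a hedged Python sketch of the label value syntax as the Kubernetes docs describe it: at most 63 characters, empty allowed, otherwise starting and ending with an alphanumeric character, with dashes, underscores and dots allowed in between. The function name is_valid_label_value is made up for this example, and the check covers values only, not keys.

```python
import re

# Label VALUE syntax per the Kubernetes docs (keys have their own rules,
# not modeled here). Illustration only.
LABEL_VALUE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value):
    """True if value could be used as a label value."""
    return len(value) <= 63 and LABEL_VALUE.fullmatch(value) is not None


candidates = [
    "frontend",
    "team-platform",
    "a1b2c3d4e5f6",
    "a1b2c3d4e5f6 (release branch, PR #142)",  # annotation value from above
    "team-platform, pager +84-xxx",            # annotation value from above
    "",
    "x" * 64,
]

for value in candidates:
    print(repr(value[:40]), is_valid_label_value(value))
```

Short identifiers pass, while the two annotation values from web-prod fail on spaces, parentheses, commas, # and +. That constraint is the practical reason free-form information belongs in annotations.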

Namespaces: isolating names and scope

So far everything we've created lives in the default namespace. A namespace divides the cluster into independent name spaces: an object's name need only be unique within its namespace (among objects of the same kind), not cluster-wide. Create a namespace and put in it a pod with the same name as a pod in default:

kubectl create namespace shop
# create a pod named web-prod IN the shop namespace (web-prod already exists in default)
kubectl get pods -A --field-selector metadata.name=web-prod

NAMESPACE   NAME
default     web-prod
shop        web-prod

Two pods with the same name web-prod live side by side because they're in different namespaces. This is the basis of multi-tenancy: a namespace per team or environment, no name collisions, and (with RBAC in Part XI) permissions scoped by namespace. A command without -n uses the namespace set in the current kubectl context, which is default unless you've changed it; -n shop targets a different namespace; -A (or --all-namespaces) gathers all of them. Note: not every object belongs to a namespace. Nodes, PersistentVolumes, and Namespaces themselves are cluster-scoped, whereas Pods, Services, and Deployments are namespaced.
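One way to picture name scoping is as the key of a lookup table: namespaced objects are keyed by kind, namespace and name, while cluster-scoped ones have no namespace part. The Python sketch below is a mental model under that assumption, not a description of how the API server actually stores objects; store, create and AlreadyExists are hypothetical names.

```python
# Toy object store illustrating name scoping (illustration only).
CLUSTER_SCOPED = {"Node", "PersistentVolume", "Namespace"}


class AlreadyExists(Exception):
    pass


store = {}


def create(kind, name, namespace="default"):
    """Register an object; names must be unique per kind within their scope."""
    # Cluster-scoped kinds ignore the namespace entirely.
    scope = None if kind in CLUSTER_SCOPED else namespace
    key = (kind, scope, name)
    if key in store:
        raise AlreadyExists(f"{kind} {name!r} already exists in scope {scope!r}")
    store[key] = {"kind": kind, "name": name, "namespace": scope}
    return key


create("Pod", "web-prod")                    # lands in default
create("Pod", "web-prod", namespace="shop")  # same name, different namespace: fine
create("Node", "worker-0")                   # cluster-scoped, no namespace

try:
    create("Pod", "web-prod", namespace="shop")  # same kind, namespace and name
    duplicate_rejected = False
except AlreadyExists:
    duplicate_rejected = True

print(sorted(ns for kind, ns, name in store if name == "web-prod"))  # ['default', 'shop']
print(duplicate_rejected)  # True
```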

Field selectors: filtering by built-in fields

A label selector filters by labels you attach. A field selector filters by an object's built-in fields, the ones Kubernetes fills in itself, like status.phase, spec.nodeName, metadata.namespace. We already used it in Article 23 (--field-selector=status.phase=Succeeded); now look closely:

kubectl get pods --field-selector 'status.phase=Running,spec.nodeName=worker-0'
kubectl get pods -o wide   # to cross-check nodeName

# field-selector result:
cache
web-dev

# cross-check:
api-prod   ... worker-1
cache      ... worker-0
web-dev    ... worker-0
web-prod   ... worker-1

The field selector returns exactly the two pods that are Running and on worker-0 (cache, web-dev), matching the nodeName column. The core difference from a label selector: a field selector inspects the object's actual state and structure (where a pod is running, what phase it is in), not metadata you attached. The set of fields that support field selectors is much narrower than what labels allow and depends on the resource type (common ones are metadata.name, metadata.namespace, status.phase, spec.nodeName), and only the =, == and != operators are available, with no set-based forms. Still, it's the only way to have the API server filter by attributes you don't control via labels.
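The contrast with labels shows up clearly in a sketch: a field selector walks a dotted path into the object itself instead of looking in metadata.labels. The Python model below handles only the = form and assumes all four pods are in phase Running (the article's output only confirms that for the two on worker-0). get_field and field_select are hypothetical names, and this is not the API server's implementation.

```python
# Toy model of a field selector: resolve dotted paths inside the object.
# Phases are ASSUMED to be Running for every pod; node names follow
# the "kubectl get pods -o wide" cross-check above. Illustration only.
PODS = [
    {"metadata": {"name": "api-prod"}, "spec": {"nodeName": "worker-1"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "cache"}, "spec": {"nodeName": "worker-0"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "web-dev"}, "spec": {"nodeName": "worker-0"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "web-prod"}, "spec": {"nodeName": "worker-1"}, "status": {"phase": "Running"}},
]


def get_field(obj, path):
    """Follow a dotted path such as 'spec.nodeName'; None if any part is missing."""
    for part in path.split("."):
        if not isinstance(obj, dict) or part not in obj:
            return None
        obj = obj[part]
    return obj


def field_select(pods, selector):
    """Names of pods where every 'path=value' requirement holds (AND)."""
    requirements = [req.strip().split("=", 1) for req in selector.split(",")]
    return sorted(
        pod["metadata"]["name"]
        for pod in pods
        if all(get_field(pod, path) == value for path, value in requirements)
    )


print(field_select(PODS, "status.phase=Running,spec.nodeName=worker-0"))
# ['cache', 'web-dev']
```

Unlike this sketch, the real API only accepts paths from a fixed per-resource list, so an arbitrary path that works here may be rejected by kubectl.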

🧹 Cleanup

kubectl delete pod web-prod web-dev api-prod cache --now
kubectl delete namespace shop      # deleting the namespace cleans up everything inside it

Deleting the shop namespace takes the web-prod inside it with it. The cluster returns to two CoreDNS pods. The manifests are at github.com/nghiadaulau/kubernetes-from-scratch, in the directory 28-labels-selectors.

Wrap-up

Four object-organizing tools, each with one role. A label is a key/value pair to group and select by, via equality selectors (=, ==, !=) and set-based ones (in, notin, exists/!), joined by commas as AND; this is the primitive Services and controllers use to find pods. An annotation attaches non-identifying metadata (build info, on-call, JSON), can hold characters and lengths that labels forbid, but can't be selected with -l. A namespace isolates names (two objects with the same name can coexist if they are in different namespaces) and is the scope for RBAC and quota later. A field selector filters by built-in fields (status.phase, spec.nodeName...), i.e. the object's actual state, as distinct from user-applied labels. With this set in hand, the intent behind every -l, -n, and --field-selector from here on should be readable at a glance.

Article 29 digs further into the lifecycle and linkage between objects: finalizers (block deletion until external resources are cleaned up), ownerReferences (the ownership relation — we've seen Pod→ReplicaSet→Deployment, Job→CronJob), and garbage collection that auto-deletes child objects when the parent is gone.