
Control Plane and API

The control plane determines whether Kubernetes behaves like a platform or only like a cluster running containers. The API server is the central gate, so the most important policy decisions should be enforced there or directly around it.

Components

Component	Production concern
API server	Authentication and authorization, admission, rate limits, audit logging, API availability.
etcd	Encryption at rest, backup, compaction, quorum, restore drills.
Scheduler	Placement, fairness, priority, topology awareness.
Controller manager	Reconciliation correctness, stuck finalizers, leader election.
Cloud controller manager	Load balancer, node lifecycle, route and volume integration.

API design principles

Treat Kubernetes manifests as public interfaces. A breaking change in labels, selectors, probes, or PVC names can be as serious as a code API change.
Custom resources should have clear status conditions. If a custom resource's status cannot report progress, failure, and the observed generation, the operator that manages it becomes hard to debug.
Admission policy should reject unsafe intent early: privileged pods, missing resource requests, mutable image tags, and broad host access.
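The admission rule above can be sketched as a plain function over a Pod manifest. This is a minimal illustration, not a real admission webhook or policy engine: the function name find_violations is hypothetical, the manifest is assumed to be an already-parsed dictionary, and "mutable image tag" is interpreted here as "any image reference not pinned by digest", which is one possible reading.

```python
def find_violations(pod: dict) -> list:
    """Return reasons to reject a Pod manifest; an empty list means admit."""
    spec = pod.get("spec", {})
    violations = []

    # Broad host access: shared host namespaces and hostPath volumes.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(field):
            violations.append(f"spec.{field} is enabled")
    for volume in spec.get("volumes", []):
        if "hostPath" in volume:
            violations.append(f"volume {volume.get('name')!r} uses hostPath")

    containers = spec.get("initContainers", []) + spec.get("containers", [])
    for container in containers:
        name = container.get("name", "<unnamed>")

        # Privileged containers bypass most isolation.
        if container.get("securityContext", {}).get("privileged"):
            violations.append(f"container {name!r} is privileged")

        # Without requests the scheduler cannot place the pod fairly.
        requests = container.get("resources", {}).get("requests", {})
        for resource in ("cpu", "memory"):
            if resource not in requests:
                violations.append(f"container {name!r} has no {resource} request")

        # Assumption: only digest-pinned images count as immutable.
        if "@sha256:" not in container.get("image", ""):
            violations.append(f"container {name!r} image is not pinned by digest")

    return violations
```

A caller would reject the request when the returned list is non-empty and include the reasons in the response, so the author of the manifest sees every problem in a single round trip.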

Failure modes

API server overload: too many controllers, chatty clients, or large unpaginated list calls. Serving reads from the watch cache and rate limiting on the client side both reduce the load.
etcd pressure: too many events, large objects, or overdue compaction can slow cluster-wide operations.
Admission outage: a webhook whose failure policy is Fail rejects every request it matches while it is unavailable, which can block all workload changes if its scope and timeout are not designed carefully.
Controller storm: a reconcile loop that never converges rewrites the same objects repeatedly and amplifies API pressure.
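One common guard against a controller storm is per-item exponential backoff: each consecutive failure for the same object doubles the wait before the next retry, up to a cap. The sketch below shows the idea in isolation. The class name ItemBackoff and its default delays are assumptions made for illustration; this is not the implementation used by any particular controller library.

```python
class ItemBackoff:
    """Per-item exponential backoff: base_delay * 2**failures, capped at max_delay."""

    def __init__(self, base_delay: float = 0.005, max_delay: float = 1000.0) -> None:
        self.base_delay = base_delay
        self.max_delay = max_delay
        self._failures = {}  # item key -> consecutive failure count

    def when(self, key: str) -> float:
        """Record one more failure for key and return the seconds to wait."""
        failures = self._failures.get(key, 0)
        self._failures[key] = failures + 1
        # Clamp the exponent so the multiplication cannot overflow a float.
        exponent = min(failures, 64)
        return min(self.base_delay * (2 ** exponent), self.max_delay)

    def forget(self, key: str) -> None:
        """Reset the backoff for key after a successful reconcile."""
        self._failures.pop(key, None)

    def retries(self, key: str) -> int:
        """Return how many consecutive failures have been recorded for key."""
        return self._failures.get(key, 0)
```

A reconcile loop would call when(key) after a failed attempt and requeue the object after that delay, then call forget(key) once the object converges. Because the state is kept per key, one permanently broken object slows only its own retries rather than the whole queue.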

Operating signals

API server request latency by verb and resource.
Admission webhook latency and rejection rate.
etcd fsync duration, leader changes, and database size.
Controller work queue depth and retry rate.
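If the cluster is scraped by Prometheus, the signals above might map to queries like the ones below. This is a sketch under assumptions: the metric names are the ones commonly exposed by upstream kube-apiserver, etcd, and controller work queues, but names and labels vary between versions and distributions, so each should be checked against the target's /metrics output. The 5m and 1h windows, the 0.99 quantile, and the dictionary keys are illustrative choices.

```python
WINDOW = "5m"  # assumed rate window; tune to the scrape interval


def rate(metric: str) -> str:
    """Build a PromQL per-second rate expression over WINDOW."""
    return f"rate({metric}[{WINDOW}])"


def p99(bucket_metric: str, by: str) -> str:
    """Build a PromQL p99 over a histogram, grouped by the given labels."""
    return f"histogram_quantile(0.99, sum by (le, {by}) ({rate(bucket_metric)}))"


SIGNAL_QUERIES = {
    # API server request latency by verb and resource.
    "apiserver_latency_p99": p99(
        "apiserver_request_duration_seconds_bucket", "verb, resource"
    ),
    # Admission webhook latency and rejection rate, per webhook name.
    "webhook_latency_p99": p99(
        "apiserver_admission_webhook_admission_duration_seconds_bucket", "name"
    ),
    "webhook_rejection_rate": (
        f"sum by (name) ({rate('apiserver_admission_webhook_rejection_count')})"
    ),
    # etcd fsync duration, leader changes, database size.
    "etcd_wal_fsync_p99": p99("etcd_disk_wal_fsync_duration_seconds_bucket", "instance"),
    "etcd_leader_changes_1h": "increase(etcd_server_leader_changes_seen_total[1h])",
    "etcd_db_size_bytes": "etcd_mvcc_db_total_size_in_bytes",
    # Controller work queue depth and retry rate, per queue name.
    "workqueue_depth": "sum by (name) (workqueue_depth)",
    "workqueue_retry_rate": f"sum by (name) ({rate('workqueue_retries_total')})",
}
```

Keeping the queries in one table makes it easier to review them alongside the signal list and to generate dashboards or alert rules from a single source.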
