Skip to main content

Production Cluster

Production Kubernetes cluster architecture

Intent

Separate the control plane, worker pools, platform control plane, and shared services. The goal is to reduce blast radius, make capacity planning explicit, and keep platform services tied to clear ownership.

Key decisions

  • Dedicated system pool for platform agents and cluster-critical services.
  • Separate app pool for normal workloads.
  • Separate GPU pool for model serving and batch inference.
  • GitOps and policy-as-code manage desired state.
  • Observability and backup run as shared platform services with tested recovery.

Review signals

  • System workloads do not compete with tenant workloads.
  • Critical add-ons have resource requests and PDBs.
  • GPU workloads cannot accidentally land on general nodes.
  • Shared services have dashboards and runbooks.