Kubernetes LLM Labs

The labs turn the guide into operator exercises. They are not a hosted sandbox. They are field runbooks you can run in your own cluster, adapt to a platform review, or use as interview-grade design drills.

Each lab has the same contract: objective, prerequisites, hands-on tasks, validation signals, failure drills, and links back to the deeper guide.
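
The contract above can be checked mechanically before a lab is published. The following is a minimal sketch, not part of the labs themselves: the `Lab` class, its field names, and `missing_sections` are hypothetical names chosen to mirror the six contract items.

```python
from dataclasses import dataclass, field


@dataclass
class Lab:
    """Hypothetical record of one lab; fields mirror the six contract items."""
    objective: str = ""
    prerequisites: list[str] = field(default_factory=list)
    tasks: list[str] = field(default_factory=list)
    validation_signals: list[str] = field(default_factory=list)
    failure_drills: list[str] = field(default_factory=list)
    guide_links: list[str] = field(default_factory=list)


def missing_sections(lab: Lab) -> list[str]:
    """Return the names of contract sections that are still empty."""
    return [name for name, value in vars(lab).items() if not value]


# Example draft with made-up content: only two sections are filled in.
draft = Lab(objective="Serve one model and observe latency", tasks=["Deploy baseline"])
print(missing_sections(draft))
```

A draft that returns an empty list has at least one entry in every section; whether those entries are good still needs a human review.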

Lab prerequisites

| Requirement | Why it matters |
| --- | --- |
| Kubernetes cluster access | You need namespace, workload, service, and log visibility. |
| GPU node pool for inference labs | LLM serving exercises depend on accelerator scheduling and runtime health. |
| Metrics stack | Prometheus, Grafana, OpenTelemetry, or equivalent telemetry makes validation concrete. |
| GitOps or manifest workflow | Labs are easier to review when every change is versioned. |
| Test prompts and evaluation cases | LLM workloads need behavior checks, not only pod readiness. |
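
One way to confirm the GPU node pool prerequisite is to inspect the node list for allocatable accelerators. The sketch below assumes the cluster runs the NVIDIA device plugin, which advertises GPUs under the `nvidia.com/gpu` resource name; other vendors use different names. The `gpu_nodes` helper and the sample data are illustrative, and in practice the input would come from `kubectl get nodes -o json`.

```python
import json


def gpu_nodes(node_list: dict, resource: str = "nvidia.com/gpu") -> dict[str, int]:
    """Map node name to allocatable GPU count for nodes that advertise `resource`."""
    result = {}
    for node in node_list.get("items", []):
        allocatable = node.get("status", {}).get("allocatable", {})
        # Extended resources are reported as integer strings, e.g. "4".
        count = int(allocatable.get(resource, "0"))
        if count > 0:
            result[node["metadata"]["name"]] = count
    return result


# Made-up sample shaped like the output of `kubectl get nodes -o json`.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "cpu-a"},
   "status": {"allocatable": {"cpu": "8"}}},
  {"metadata": {"name": "gpu-a"},
   "status": {"allocatable": {"cpu": "32", "nvidia.com/gpu": "4"}}}
]}
""")
print(gpu_nodes(sample))
```

An empty result suggests either that no GPU nodes exist or that the device plugin is not running, which is worth resolving before starting an inference lab.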

How to use the labs

  1. Read the related architecture guide before applying manifests.
  2. Run the baseline task without optimization.
  3. Capture metrics before changing runtime flags or autoscaling policy.
  4. Trigger one failure drill.
  5. Write down which signal detected the failure first.
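
Steps 4 and 5 can be recorded in a structured way rather than from memory. The following is a minimal sketch under the assumption that you note the drill start time and the time each signal first fired; `first_detector` and the signal names are hypothetical and not tied to any particular metrics stack.

```python
from datetime import datetime
from typing import Optional


def first_detector(
    drill_start: datetime, firings: dict[str, datetime]
) -> Optional[tuple[str, float]]:
    """Return (signal name, seconds after drill start) for the earliest signal
    that fired at or after the drill began, or None if nothing fired."""
    # Ignore signals that were already firing before the drill started.
    candidates = {name: ts for name, ts in firings.items() if ts >= drill_start}
    if not candidates:
        return None
    name = min(candidates, key=candidates.get)
    return name, (candidates[name] - drill_start).total_seconds()


# Made-up timestamps for one drill.
start = datetime(2024, 1, 1, 12, 0, 0)
observed = {
    "pod_restart_alert": datetime(2024, 1, 1, 12, 1, 30),
    "p95_latency_alert": datetime(2024, 1, 1, 12, 0, 45),
    "stale_disk_alert": datetime(2024, 1, 1, 11, 50, 0),  # fired before the drill
}
print(first_detector(start, observed))
```

Keeping the detection delay alongside the signal name makes it easier to compare drills across runtime or autoscaling changes.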

New labs should follow the Content Review Checklist: objective, prerequisites, manifests or commands, validation signals, failure drills, and expected signals.

Suggested order: start with "K8s LLM: Kubernetes LLM Platform Guide", continue to "GPU Node Pool Kubernetes", run the vLLM inference lab, then move to "RAG On Kubernetes" and the RAG retrieval lab.