Kubernetes LLM Labs

The labs turn the guide into operator challenges. The docs remain free, indexable runbooks. The interactive version lives at labs.k8sllm.online, where you can type commands in a lab terminal, inspect Kubernetes output, unlock hints, reveal solutions, and keep private progress on your device.

Challenge of the Week

Start with the vLLM Inference Challenge. It is the best first demo because it checks the core platform loop: GPU placement, model runtime readiness, token latency, queue wait, and production evidence.

Entry point	Link
Interactive challenge	Start vLLM Inference Challenge
Production deployment guide	vLLM Kubernetes Production Deployment
Readiness lead magnet	LLM Production Readiness Checklist
Product roadmap	Kubernetes LLM Platform Roadmap

Interactive MVP: 12 free browser-guided challenges

K8sLLM Labs is moving from static runbooks to an interactive lab product at labs.k8sllm.online. V1 uses browser-side checks, local progress, hints, and solution-reveal tracking. Hosted sandboxes and paid lab packs will come later, once auth, abuse controls, and infrastructure cost limits are designed.


Model serving | Hard | 75 min | Free

vLLM Inference Challenge

Deploy a GPU-backed OpenAI-compatible endpoint and prove scheduling, health, TTFT, queueing, and rollback readiness.

Persona: AI infrastructure engineer
Tools: kubectl, vLLM, Prometheus
Checks: 6

Open interactive challenge Read guide
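
The TTFT and queueing evidence this challenge asks for can be derived from per-request timestamps. The following is a minimal sketch, not the lab's actual check: the `RequestTiming` fields and the nearest-rank `percentile` helper are hypothetical names chosen for illustration, and it assumes you already record when a request was sent, when the runtime started processing it, and when the first token arrived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestTiming:
    """Timestamps in seconds for one request (hypothetical field names)."""
    sent: float         # client sent the request
    scheduled: float    # runtime picked the request up from its queue
    first_token: float  # first streamed token reached the client


def queue_wait(t: RequestTiming) -> float:
    """Time the request spent waiting before the runtime started on it."""
    return t.scheduled - t.sent


def ttft(t: RequestTiming) -> float:
    """Time to first token as the user experiences it (includes queue wait)."""
    return t.first_token - t.sent


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    samples = [
        RequestTiming(sent=0.0, scheduled=0.25, first_token=0.75),
        RequestTiming(sent=1.0, scheduled=1.5, first_token=2.5),
    ]
    print("p95 TTFT:", percentile([ttft(s) for s in samples], 95))
    print("p95 queue wait:", percentile([queue_wait(s) for s in samples], 95))
```

Reporting queue wait separately from TTFT helps show whether slow first tokens come from a saturated queue or from the model runtime itself.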

RAG | Medium | 60 min | Free

RAG Retrieval Challenge

Operate ingestion, metadata filters, vector retrieval, answer evaluation, and failure drills for production RAG.

Persona: MLOps engineer
Tools: kubectl, curl, vector database
Checks: 6

Open interactive challenge Read guide

Production | Hard | 50 min | Free

Production Readiness Challenge

Run a launch review across security, quota, rollout, observability, cost, and ownership before live traffic.

Persona: Platform lead
Tools: kubectl, policy engine, dashboard
Checks: 6

Open interactive challenge Read guide

Observability | Medium | 45 min | Free

LLM Observability Challenge

Build the signal model needed to debug user latency, runtime saturation, GPU pressure, traces, logs, and alerts.

Persona: SRE
Tools: Prometheus, Grafana, OpenTelemetry
Checks: 6

Open interactive challenge Read guide

Model serving | Medium | 55 min | Free

vLLM Kubernetes Deployment Lab

Design the deployment contract for vLLM with model cache, readiness, runtime flags, and service exposure.

Persona: AI infrastructure engineer
Tools: kubectl, vLLM, container registry
Checks: 6

Open interactive challenge Read guide

Architecture | Medium | 35 min | Free

KServe vs Ray Serve Decision Lab

Choose the serving abstraction by ownership model, CRDs, graph complexity, autoscaling, and rollout needs.

Persona: Platform architect
Tools: decision matrix, runtime inventory
Checks: 4

Open interactive challenge Read guide

GPU capacity | Hard | 65 min | Free

GPU Node Pool Scheduling Lab

Prove accelerator placement with labels, taints, tolerations, quotas, and unschedulable-pod debugging.

Persona: Platform engineer
Tools: kubectl, NVIDIA device plugin, cluster autoscaler
Checks: 6

Open interactive challenge Read guide
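
When debugging an unschedulable pod, a common first question is which node taints the pod fails to tolerate. The sketch below mirrors the standard Kubernetes toleration matching rules in plain Python so the logic can be reasoned about offline. The taint key `nvidia.com/gpu` in the example is an assumed convention for a GPU node pool, not something this lab prescribes, and the function names are hypothetical.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if one toleration matches one taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False  # an empty toleration effect matches every effect
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")
    if not key:
        # An empty key is only valid with Exists and tolerates every taint.
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def untolerated_taints(pod_tolerations: list[dict], node_taints: list[dict]) -> list[dict]:
    """List the taints that would keep the pod off this node."""
    # PreferNoSchedule is a soft preference, so it never blocks placement.
    blocking = [t for t in node_taints if t.get("effect") in ("NoSchedule", "NoExecute")]
    return [
        taint for taint in blocking
        if not any(tolerates(tol, taint) for tol in pod_tolerations)
    ]


if __name__ == "__main__":
    # Assumed example taint on a GPU node pool.
    node_taints = [{"key": "nvidia.com/gpu", "value": "present", "effect": "NoSchedule"}]
    pod_tolerations = [{"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}]
    print(untolerated_taints(pod_tolerations, node_taints))
```

The taints and tolerations themselves can be read from `kubectl get node <name> -o json` under `spec.taints` and from the pod under `spec.tolerations`.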

RAG | Hard | 70 min | Free

RAG Retrieval Quality Lab

Measure retrieval recall, citation accuracy, tenant filtering, and reranking latency before generation.

Persona: MLOps engineer
Tools: evaluation set, vector database, reranker
Checks: 6

Open interactive challenge Read guide
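
Retrieval recall and tenant filtering can both be checked before any generation step runs. The following is a minimal sketch under stated assumptions: document IDs are strings, each evaluation case lists its relevant IDs by hand, and each retrieved chunk carries a `tenant` field. The function names and the chunk shape are hypothetical.

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        raise ValueError("an evaluation case needs at least one relevant document")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def tenant_leaks(chunks: list[dict], tenant: str) -> list[dict]:
    """Return retrieved chunks that belong to a different tenant."""
    return [c for c in chunks if c.get("tenant") != tenant]


if __name__ == "__main__":
    retrieved = ["doc-1", "doc-7", "doc-3"]
    relevant = {"doc-1", "doc-3", "doc-9"}
    print("recall@3:", recall_at_k(retrieved, relevant, k=3))

    chunks = [{"id": "doc-1", "tenant": "acme"}, {"id": "doc-7", "tenant": "globex"}]
    print("leaks:", tenant_leaks(chunks, tenant="acme"))
```

Any non-empty result from `tenant_leaks` is worth treating as a failed check regardless of how good the recall number looks.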

Cost | Medium | 45 min | Free

Inference Cost Model Lab

Calculate cost per request from input tokens, output tokens, GPU profile, utilization, and cache behavior.

Persona: AI platform lead
Tools: spreadsheet, metrics export, benchmark report
Checks: 4

Open interactive challenge Read guide
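
One way to combine the inputs named above into a single number is sketched below. This is a deliberately simplified model, not the lab's worksheet: it assumes you have measured aggregate prefill and decode throughput for your GPU profile, that cached input tokens cost nothing to prefill, and that idle capacity is charged to requests by dividing by utilization. All parameter names are hypothetical.

```python
def cost_per_request(
    input_tokens: int,
    output_tokens: int,
    gpu_hourly_usd: float,
    prefill_tokens_per_s: float,
    decode_tokens_per_s: float,
    utilization: float,
    cache_hit_rate: float = 0.0,
) -> float:
    """Estimate the GPU cost in USD attributable to one request."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if not 0 <= cache_hit_rate <= 1:
        raise ValueError("cache_hit_rate must be in [0, 1]")
    # Assumption: input tokens served from the prefix cache skip prefill entirely.
    uncached_input = input_tokens * (1 - cache_hit_rate)
    gpu_seconds = (
        uncached_input / prefill_tokens_per_s
        + output_tokens / decode_tokens_per_s
    )
    usd_per_gpu_second = gpu_hourly_usd / 3600
    # Dividing by utilization spreads the cost of idle GPU time across requests.
    return gpu_seconds * usd_per_gpu_second / utilization


if __name__ == "__main__":
    estimate = cost_per_request(
        input_tokens=1000,
        output_tokens=200,
        gpu_hourly_usd=3.60,        # placeholder price, not a quote
        prefill_tokens_per_s=5000,  # placeholder benchmark figure
        decode_tokens_per_s=100,    # placeholder benchmark figure
        utilization=0.5,
        cache_hit_rate=0.5,
    )
    print(f"estimated cost per request: ${estimate:.6f}")
```

Replace the placeholder figures with values from your own benchmark report and billing data; the structure of the formula matters more here than the example numbers.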

Production | Hard | 60 min | Free

LLM Rollout and Rollback Lab

Design traffic shifting, readiness gates, rollback triggers, and model-version ownership for inference services.

Persona: SRE
Tools: Argo CD, gateway policy, metrics dashboard
Checks: 6

Open interactive challenge Read guide
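
Rollback triggers are easier to review when they are written down as explicit thresholds rather than left to judgment during an incident. The sketch below shows one possible shape for such a rule set. The `RollbackPolicy` class, its fields, and the default thresholds are all hypothetical placeholders; real values should come from your own service objectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackPolicy:
    """Hypothetical thresholds for a new model version under canary traffic."""
    max_error_rate: float = 0.02        # fraction of failed requests
    max_p95_ttft_seconds: float = 2.0   # user-facing time to first token
    min_ready_replicas: int = 1


def rollback_reasons(
    error_rate: float,
    p95_ttft_seconds: float,
    ready_replicas: int,
    policy: RollbackPolicy = RollbackPolicy(),
) -> list[str]:
    """Return every violated trigger; an empty list means keep shifting traffic."""
    reasons = []
    if error_rate > policy.max_error_rate:
        reasons.append("error rate above threshold")
    if p95_ttft_seconds > policy.max_p95_ttft_seconds:
        reasons.append("p95 TTFT above threshold")
    if ready_replicas < policy.min_ready_replicas:
        reasons.append("too few ready replicas")
    return reasons


if __name__ == "__main__":
    reasons = rollback_reasons(error_rate=0.05, p95_ttft_seconds=1.2, ready_replicas=2)
    print("rollback" if reasons else "continue", reasons)
```

Returning the list of reasons rather than a bare boolean keeps a record of which signal fired, which is useful evidence for the rollout review.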

Security | Hard | 70 min | Free

Multi-Tenant LLM Security Lab

Review tenant routing, namespace boundaries, secrets, NetworkPolicy, prompt logging, and retrieval authorization.

Persona: Security-minded platform engineer
Tools: kubectl, NetworkPolicy, admission policy
Checks: 6

Open interactive challenge Read guide
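
One concrete NetworkPolicy review step is confirming that every tenant namespace has a default-deny ingress policy. The sketch below checks this against policy objects already loaded as dictionaries, for example from `kubectl get networkpolicy -A -o json`. It is an illustration rather than the lab's actual check, the function names are hypothetical, and it only recognizes the common form where `podSelector` is exactly `{}`.

```python
def is_default_deny_ingress(policy: dict) -> bool:
    """True if the policy selects all pods and allows no ingress traffic."""
    spec = policy.get("spec", {})
    selects_all_pods = spec.get("podSelector") == {}
    # When policyTypes is omitted, Kubernetes treats the policy as applying to Ingress.
    policy_types = spec.get("policyTypes") or ["Ingress"]
    has_ingress_rules = bool(spec.get("ingress"))
    return selects_all_pods and "Ingress" in policy_types and not has_ingress_rules


def namespaces_missing_default_deny(policies: list[dict], namespaces: list[str]) -> list[str]:
    """Return the namespaces that have no default-deny ingress policy."""
    covered = {
        p.get("metadata", {}).get("namespace")
        for p in policies
        if is_default_deny_ingress(p)
    }
    return sorted(ns for ns in namespaces if ns not in covered)


if __name__ == "__main__":
    policies = [
        {
            "metadata": {"name": "default-deny", "namespace": "tenant-a"},
            "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
        }
    ]
    # tenant-a and tenant-b are example namespace names.
    print(namespaces_missing_default_deny(policies, ["tenant-a", "tenant-b"]))
```

A default-deny policy is only a baseline; the allow rules layered on top still need their own review against the tenant routing design.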

Observability | Medium | 50 min | Free

LLM Observability and Cost Dashboard Lab

Create a dashboard model that joins user latency, queue wait, GPU pressure, token throughput, and cost signals.

Persona: SRE
Tools: Prometheus, Grafana, OpenTelemetry
Checks: 6

Open interactive challenge Read guide

Lab prerequisites

Requirement	Why it matters
Kubernetes cluster access	You need namespace, workload, service, and log visibility.
GPU node pool for inference labs	LLM serving exercises depend on accelerator scheduling and runtime health.
Metrics stack	Prometheus, Grafana, OpenTelemetry, or equivalent telemetry makes validation concrete.
GitOps or manifest workflow	Labs are easier to review when every change is versioned.
Test prompts and evaluation cases	LLM workloads need behavior checks, not only pod readiness.

How to use the labs

1. Read the related architecture guide before applying manifests.
2. Open the matching interactive challenge at labs.k8sllm.online/challenges.
3. Run the challenge scenario in your own cluster or mock environment.
4. Paste command output into the guided checks and use hints only when blocked.
5. Trigger one failure drill and record which signal detected the failure first.
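
To make the idea of checking pasted command output concrete, here is a minimal sketch of what a check over `kubectl get pod <name> -o json` output could look like. It is not the implementation used at labs.k8sllm.online. The function name and result keys are hypothetical, and it assumes GPUs are exposed under the `nvidia.com/gpu` resource name, as the NVIDIA device plugin does.

```python
import json

GPU_RESOURCE = "nvidia.com/gpu"  # resource name exposed by the NVIDIA device plugin


def check_gpu_pod(pod_json: str) -> dict[str, bool]:
    """Evaluate basic placement and readiness evidence from one Pod object."""
    pod = json.loads(pod_json)
    spec = pod.get("spec", {})
    status = pod.get("status", {})

    # Resource quantities arrive as strings in the JSON output, e.g. "1".
    gpu_requested = any(
        int(c.get("resources", {}).get("limits", {}).get(GPU_RESOURCE, 0)) > 0
        for c in spec.get("containers", [])
    )
    ready = any(
        cond.get("type") == "Ready" and cond.get("status") == "True"
        for cond in status.get("conditions", [])
    )
    return {
        "scheduled": bool(spec.get("nodeName")),
        "running": status.get("phase") == "Running",
        "ready": ready,
        "gpu_requested": gpu_requested,
    }


if __name__ == "__main__":
    # A trimmed, hand-written Pod object standing in for pasted kubectl output.
    pasted = json.dumps({
        "spec": {
            "nodeName": "gpu-node-1",
            "containers": [{"name": "vllm", "resources": {"limits": {GPU_RESOURCE: "1"}}}],
        },
        "status": {"phase": "Running", "conditions": [{"type": "Ready", "status": "True"}]},
    })
    for name, passed in check_gpu_pod(pasted).items():
        print(f"{name}: {'pass' if passed else 'fail'}")
```

Pod readiness alone does not prove the model answers correctly, so a check like this is a starting point before behavior checks with test prompts.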

New labs should follow the Content Review Checklist: objective, prerequisites, manifests or commands, validation signals, failure drills, and expected signals.

Lab experience

Capability	Why it matters
Lab terminal	Practice Kubernetes and LLM platform commands in the flow of the challenge.
Guided checks	Validate evidence such as GPU placement, model readiness, metrics, and policy boundaries.
Progressive hints	Keep moving without turning every lab into a copy-paste answer sheet.
Private device progress	Resume challenges on the same device without creating an account.
Downloadable kits	Future lab packs can include manifests, failure drills, worksheets, and deeper solution reviews.

Recommended path

Start with K8s LLM: Kubernetes LLM Platform Guide, continue to GPU Node Pool Kubernetes, run the vLLM Inference Challenge, then move to RAG on Kubernetes and the RAG Retrieval Challenge.
