KServe vs Ray Serve for LLM Platforms

KServe vs Ray Serve is not only a feature comparison. It is an ownership decision: should the primary serving contract be a Kubernetes-native platform API or a Python-native application serving graph?

Last reviewed: June 8, 2026. Use this page when a team is choosing a serving layer for more than one model endpoint.

Figure: LLM inference stack on Kubernetes architecture.

Scenario

A model endpoint starts simple. Later it needs retrieval, reranking, model routing, generation, post-processing, custom metrics, and rollback. The platform team wants CRDs, revisions, policy, and standard lifecycle controls. The ML team wants programmable Python graph behavior.

Decision table

Dimension	KServe tends to fit	Ray Serve tends to fit
Primary contract	Kubernetes resources and platform policy.	Python deployment graph and application code.
Owner	Platform team standardizes endpoint lifecycle.	ML or application team owns graph behavior.
Serving graph	Repeatable endpoint patterns.	Custom multi-step pipelines.
Rollout	Resource revision, route, runtime, artifact.	Ray Serve deployment, graph code, runtime env, artifact.
SRE operability	Operate from Kubernetes resources and controller signals.	Operate Ray cluster plus graph-level telemetry.
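
The table can be read as a rough tie-breaker on two of its rows, owner and serving graph. The sketch below is only an illustration of that reading, not an official selection rule from either project. The names `ServingNeeds` and `suggest_serving_layer` and the string values are hypothetical, chosen to mirror the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingNeeds:
    # Hypothetical labels mirroring the "Owner" and "Serving graph" rows.
    owner: str             # "platform" or "app"
    graph_complexity: str  # "single-endpoint" or "multi-step"


def suggest_serving_layer(needs: ServingNeeds) -> str:
    """Return a starting-point suggestion, not a verdict."""
    kserve_votes = 0
    ray_votes = 0

    # Owner row: a platform team standardizing lifecycle leans KServe,
    # an ML or application team owning graph behavior leans Ray Serve.
    if needs.owner == "platform":
        kserve_votes += 1
    elif needs.owner == "app":
        ray_votes += 1

    # Serving graph row: repeatable endpoints lean KServe,
    # custom multi-step pipelines lean Ray Serve.
    if needs.graph_complexity == "single-endpoint":
        kserve_votes += 1
    elif needs.graph_complexity == "multi-step":
        ray_votes += 1

    if kserve_votes > ray_votes:
        return "kserve"
    if ray_votes > kserve_votes:
        return "ray-serve"
    # Mixed signals: the ownership question still has to be settled by people.
    return "undecided"
```

A platform-owned, multi-step graph comes back as "undecided", which is the case where the ownership question still has to be answered explicitly.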

Commands and checks

# Write this inventory before choosing the serving layer.
route=<route-name>
owner=<platform-or-app-team>
graph_complexity=<single-endpoint-or-multi-step>
rollback_unit=<resource-runtime-model-graph>
autoscaling_owner=<gateway-serving-layer-runtime-cluster>
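
One way to keep the inventory honest is to check it mechanically before the decision meeting. The following is a minimal sketch, assuming the inventory is stored as plain `key=value` text exactly as shown above. The helper names `parse_inventory` and `unfilled_keys` are hypothetical, and a value still wrapped in angle brackets is treated as an unfilled placeholder.

```python
# Keys taken from the inventory template above.
REQUIRED_KEYS = (
    "route",
    "owner",
    "graph_complexity",
    "rollback_unit",
    "autoscaling_owner",
)


def parse_inventory(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blank lines and # comments."""
    inventory: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no "=" at all
            inventory[key.strip()] = value.strip()
    return inventory


def unfilled_keys(inventory: dict[str, str]) -> list[str]:
    """Return required keys that are missing, empty, or still a <placeholder>."""
    missing = []
    for key in REQUIRED_KEYS:
        value = inventory.get(key, "")
        is_placeholder = value.startswith("<") and value.endswith(">")
        if not value or is_placeholder:
            missing.append(key)
    return missing
```

An empty result from `unfilled_keys` means every field was filled in. It does not mean the answers are good, only that nobody skipped a question.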

Check	Pass signal
Owner is explicit	The team knows who owns the endpoint contract and production behavior.
Rollback unit is explicit	Runtime image, model artifact, prompt, and graph code are not confused.
SRE can operate it	On-call can debug routing, queueing, replica state, and runtime health.
Alternative rejected	The decision records why the other serving layer was not selected.
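
The four checks can also be expressed as a small review gate on a decision record. This is a sketch under assumptions: the `DecisionRecord` shape and the `failed_checks` helper are hypothetical, and the pass conditions are one literal reading of the pass signals in the table rather than a standard.

```python
from dataclasses import dataclass, field

# Units and debug areas named in the pass signals above.
ROLLBACK_UNITS = ("runtime image", "model artifact", "prompt", "graph code")
DEBUG_AREAS = ("routing", "queueing", "replica state", "runtime health")


@dataclass
class DecisionRecord:
    chosen: str                                         # e.g. "kserve" or "ray-serve"
    endpoint_owner: str = ""                            # who owns contract and production behavior
    rollback_units: dict = field(default_factory=dict)  # unit name -> how it is rolled back
    oncall_runbooks: set = field(default_factory=set)   # debug areas on-call can handle
    rejected_alternative_reason: str = ""               # why the other layer was not selected


def failed_checks(record: DecisionRecord) -> list[str]:
    """Return the names of checks that do not pass, in table order."""
    failures = []
    if not record.endpoint_owner.strip():
        failures.append("Owner is explicit")
    # Each unit needs its own entry so the four are not confused with each other.
    if any(unit not in record.rollback_units for unit in ROLLBACK_UNITS):
        failures.append("Rollback unit is explicit")
    if not set(DEBUG_AREAS) <= record.oncall_runbooks:
        failures.append("SRE can operate it")
    if not record.rejected_alternative_reason.strip():
        failures.append("Alternative rejected")
    return failures
```

A record that returns an empty list has an answer for every check. Whether the answers hold up in production is still a judgment for the owning team and on-call.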

Related lab

Run the KServe vs Ray Serve decision lab to practice choosing by ownership and graph complexity.
