RAG Tenant Isolation on Kubernetes

RAG tenant isolation on Kubernetes must be proven before context enters the prompt. If unauthorized text reaches prompt assembly, filtering the final answer is already too late.

Last reviewed: June 8, 2026. Use this page when retrieval quality is good but access boundaries are not measurable.

RAG platform on Kubernetes architecture

Scenario

A shared RAG service stores embeddings for multiple customer workspaces. Ingestion writes tenant metadata, but a migration leaves stale chunks without complete access fields. As a result, retrieval returns a high-quality answer that includes one citation from another tenant.
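
The stale chunks in this scenario could be caught by auditing chunk metadata before an index is published. The following is a minimal sketch, not a definitive implementation: it assumes each chunk is a dict with an `id` and a `metadata` dict, and the field names and the function `find_incomplete_chunks` are hypothetical rather than taken from any particular vector store.

```python
from typing import Iterable, List, Tuple

# Hypothetical field names; adapt them to the metadata schema your index uses.
REQUIRED_FIELDS = (
    "tenant_id",
    "source_owner",
    "access_scope",
    "source_version",
    "index_version",
)


def find_incomplete_chunks(
    chunks: Iterable[dict], expected_index_version: str
) -> List[Tuple[str, str]]:
    """Return (chunk_id, reason) pairs for chunks that should block a publish."""
    problems = []
    for chunk in chunks:
        meta = chunk.get("metadata", {})
        # A field counts as missing when it is absent or empty.
        missing = [field for field in REQUIRED_FIELDS if not meta.get(field)]
        if missing:
            problems.append((chunk["id"], "missing: " + ", ".join(missing)))
        elif meta["index_version"] != expected_index_version:
            # All fields are present, but the chunk predates the current index.
            problems.append(
                (chunk["id"], "stale index_version: " + meta["index_version"])
            )
    return problems
```

A publish step could refuse to promote the index whenever this function returns a non-empty list.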

Decision table

RAG layer | Tenant isolation requirement
Ingestion | Attach tenant, source owner, access scope, source version, and index version to every chunk.
Index publish | Reject indexes with incomplete or stale authorization metadata.
Retrieval | Apply tenant and access filters before reranking and prompt assembly.
Prompt assembly | Log context IDs and policy decisions without storing sensitive prompt text.
Evaluation | Test retrieval recall and unauthorized-document exclusion together.
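
The retrieval and prompt assembly rows can be illustrated with a short sketch of the ordering: filter first, rerank second, and log only IDs and counts. This is an illustration under stated assumptions, not the service's actual code. The `Chunk` type, the `select_context` function, and the logger name are hypothetical, and sorting by score stands in for a real reranker. In many deployments the filter would be pushed into the vector store query itself; it is applied in application code here only to make the ordering visible.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("rag.retrieval")  # hypothetical logger name


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant_id: str
    access_scope: str
    score: float
    text: str


def select_context(candidates, tenant_id, allowed_scopes, top_k=3):
    """Filter candidates by tenant and scope, then rank the survivors."""
    # 1. Authorization filter runs before any reranking.
    allowed = [
        c for c in candidates
        if c.tenant_id == tenant_id and c.access_scope in allowed_scopes
    ]
    # 2. Stand-in for a reranker: highest score first.
    ranked = sorted(allowed, key=lambda c: c.score, reverse=True)[:top_k]
    # 3. Log context IDs and the policy outcome, never the chunk text.
    logger.info(
        "tenant=%s context_ids=%s denied=%d",
        tenant_id,
        [c.chunk_id for c in ranked],
        len(candidates) - len(allowed),
    )
    return ranked
```

Because the filter runs first, a high-scoring chunk from another tenant never reaches the ranking step, so it cannot displace an authorized chunk or appear in the assembled prompt.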

Commands and checks

curl -sS "$RAG_API/search?tenant=team-a&q=rollback"
curl -sS "$RAG_API/search?tenant=team-b&q=rollback"
kubectl -n rag-system logs deploy/rag-api --tail=100
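
The output of the two `curl` calls can be checked mechanically rather than by eye. The sketch below assumes the search endpoint returns JSON with a top-level `hits` list whose entries carry `source_id` and `tenant_id`; that response shape and the function `foreign_hits` are assumptions for illustration, so adjust them to whatever your API actually returns.

```python
import json


def foreign_hits(response_body: str, tenant: str) -> list:
    """Return source IDs in a search response that do not belong to `tenant`.

    Hits with no tenant_id at all are also reported, because a chunk with
    incomplete access fields cannot be shown to belong to the caller.
    """
    hits = json.loads(response_body).get("hits", [])  # assumed response shape
    return [
        hit.get("source_id", "<unknown>")
        for hit in hits
        if hit.get("tenant_id") != tenant
    ]
```

An empty list for both `team-a` and `team-b` is the expected result; any returned ID is a candidate isolation failure worth tracing in the `rag-api` logs.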

Check | Pass signal
Tenant filter visible | Search output or logs show tenant-aware metadata filtering.
Unauthorized test exists | Evaluation includes documents the user must not retrieve.
Context is traceable | Citations map to source ID, tenant ID, index version, and chunking config.
Rollback is possible | The index can roll back independently from the model runtime.
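
An evaluation case that covers both recall and unauthorized-document exclusion might look like the sketch below. The function name `evaluate_retrieval` and the 0.8 recall threshold are hypothetical choices for illustration, not values from this page.

```python
def evaluate_retrieval(retrieved_ids, relevant_ids, forbidden_ids, min_recall=0.8):
    """Score one query on recall and on leakage of forbidden documents."""
    retrieved = set(retrieved_ids)
    relevant = set(relevant_ids)
    # Any forbidden document in the results is a leak, regardless of rank.
    leaked = sorted(retrieved & set(forbidden_ids))
    # Recall: fraction of relevant documents that were retrieved.
    recall = len(retrieved & relevant) / len(relevant) if relevant else 1.0
    return {
        "recall": recall,
        "leaked": leaked,
        # A case passes only if recall is sufficient AND nothing leaked.
        "passed": recall >= min_recall and not leaked,
    }
```

Scoring both conditions in one result keeps a tighter filter from silently hurting recall, and keeps a recall improvement from masking a leak.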

Run the RAG retrieval challenge to practice tenant filters, citations, recall, and failure drills.