The great migration: Why every AI platform is converging on Kubernetes

2026-03-05 ~1 min read www.cncf.io #cncf

⚡ TL;DR

Posted on March 5, 2026 by Sabari Sawant, Amazon

Contents: Three eras, one platform; Foundation: Data processing at scale; Orchestration: Connecting the pipeline; Training: Gang scheduling and resource coordination; Serving: Inference at scale; Agentic workloads: Building the agent operating system; Optimizing for the GPU economy; Multi-cluster orchestration and AI conformance; What’s next: Innovations driven by AI scale; The path forward

When Kubernetes launched a decade ago, its promise was clear: make deploying microservices as simple as running a container. Fast forward to 2026, and Kubernetes is no longer “just” for stateless web services.

📝 Summary

When Kubernetes launched a decade ago, its promise was clear: make deploying microservices as simple as running a container. Fast forward to 2026, and Kubernetes is no longer “just” for stateless web services. In the CNCF annual survey released in January 2026, 82% of container users report running Kubernetes in production, and 66% of organizations hosting generative AI models use Kubernetes for some or all inference workloads. The conversation has shifted from stateless web applications to distributed data processing, distributed training jobs, LLM inference, and autonomous AI agents. This isn’t just evolution; it’s platform convergence driven by a practical reality: running data processing, model training, inference, and agents on separate infrastructure multiplies operational complexity, while Kubernetes provides a unified foundation for all of them.

Three eras, one platform

The Kubernetes journey mirrors how software has evolved.

Microservices era (2015–2020): hardened stateless services, rollout patterns, and multi-tenant platforms.
Data + GenAI era (2020–2024): brought distributed data processing and GPU-heavy training/inference into the mainstream.
Agentic era (2025+): shifts workloads from request/response APIs to long-running reasoning loops.

Each wave builds on the last, creating a single platform where data processing, training, inference, and agents coexist.

Foundation: Data processing at scale

Before models train, data must be prepared. Kubernetes is now the unified platform where data engineering and machine learning converge, handling both steady-state ETL and burst workloads that scale from hundreds to thousands of cores within minutes.