KServe joins CNCF as an incubating project
📝 Summary
We are excited to share that KServe, the leading standardized AI inference platform on Kubernetes, has been accepted as an incubating project by the Cloud Native Computing Foundation (CNCF). This milestone validates KServe's maturity, stability, and role as the foundation for scalable, multi-framework model serving in production environments. By moving under the CNCF's neutral governance, KServe's development will be driven purely by community needs, accelerating its standardization for serving AI models on Kubernetes. For Red Hat, this is a validation of our commitment to delivering open, reliable, and standardized AI solutions for the hybrid cloud.

The critical engine behind Red Hat OpenShift AI

At Red Hat, we believe the best AI infrastructure is built on open standards and Kubernetes. KServe is the critical model serving component that powers Red Hat OpenShift AI, helping ensure our customers can transition from model experimentation to production inference seamlessly and at scale.

Unlocking enterprise AI value

OpenShift AI leverages KServe's features to solve the biggest enterprise AI challenges, helping enterprises realize:

- High-performance LLM optimization: KServe is optimized for large language models (LLMs), providing high-performance features like KV cache offloading and distributed inference with vLLM, as well as disaggregated serving, prefix caching, intelligent scheduling, and variant autoscaling via its integration with llm-d.
- Advanced autoscaling: In addition to Kubernetes' horizontal pod autoscaling, KServe also supports autoscaling with KEDA (Kubernetes Event-driven Autoscaler), which enables event-driven scaling based on external metrics such as vLLM metrics.
- Both predictive and generative AI model inference: KServe supports pluggable, reusable, extensible runtimes, ranging from scikit-learn and XGBoost for predictive AI to Hugging Face and vLLM for generative AI. This helps ensure that enterprises can switch to the best runtime for a specific use case, as sketched below.

The journey of AI from the lab to the bottom line requires production infrastructure that can handle exponential growth, especially as enterprise usage shifts to widespread generative applications. Now bolstered by the full resources and neutral governance of the CNCF, KServe directly addresses these core operational challenges, from tackling complexity with a unified API to controlling cloud costs through its scale-to-zero capabilities.
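To make the unified API concrete, here is a minimal sketch of how both a predictive and a generative model can be served through the same InferenceService resource. The resource names, the Hugging Face model ID, and the storage URI below are illustrative placeholders rather than anything prescribed by the post; the field layout follows the upstream KServe v1beta1 API, and exact runtime options may vary by KServe version.

```yaml
# Predictive model: a scikit-learn model served through the sklearn runtime.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # placeholder name
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      # Placeholder storage URI; point this at your own model artifacts.
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
---
# Generative model: an LLM served through the Hugging Face runtime,
# which can use vLLM as its serving backend.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama3-demo           # placeholder name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
        # Illustrative model ID; substitute a model you have access to.
        - --model_id=meta-llama/Meta-Llama-3-8B-Instruct
      resources:
        limits:
          nvidia.com/gpu: "1"
```

The point of the sketch is that switching runtimes is a one-field change (modelFormat), not a re-architecture: both manifests are applied the same way (for example, with kubectl apply -f) and expose the same kind of inference endpoint.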
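The scale-to-zero cost control mentioned above also lives in the same resource. Below is a hedged sketch using KServe's serverless deployment mode: minReplicas set to 0 lets an idle service scale down completely, while scaleMetric and scaleTarget express the scaling signal. The field names follow the upstream v1beta1 API, but the specific numbers are assumptions for illustration, not recommendations.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris-scaled   # placeholder name
spec:
  predictor:
    # Scale to zero when idle (serverless deployment mode) and cap
    # replicas to bound cost under load.
    minReplicas: 0
    maxReplicas: 5
    # Scale on concurrent requests, targeting ~10 in-flight requests
    # per replica (illustrative values).
    scaleMetric: concurrency
    scaleTarget: 10
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```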
Open the original post by Yuan Tang ↗ https://www.redhat.com/en/blog/kserve-joins-cncf-incubating-project