Red Hat AI 3 delivers speed, accelerated delivery, and scale

2025-10-14 ~1 min read www.redhat.com #kubernetes

⚡ TL;DR

Red Hat AI 3 delivers speed, accelerated delivery, and scale. Achieve new levels of efficiency with SLA-aware inference.

📝 Summary

Red Hat AI 3 delivers speed, accelerated delivery, and scale. The post covers:

1. Achieve new levels of efficiency with SLA-aware inference
2. Accelerate agentic AI innovation
3. Connecting models to your private data
4. Scaling AI across the hybrid cloud
- A new approach to enterprise AI
- The adaptable enterprise: Why AI readiness is disruption readiness

Authors: Jennifer Vargas, Carlos Condado, Will McGrath, Robbie Jerrom, Younes Ben Brahim, Ornkanya Sinonpat (Aom)

This past May at Red Hat Summit, we made several announcements across the Red Hat AI portfolio, including the introduction of Red Hat AI Inference Server and Red Hat AI third-party validated models, the integration of Llama Stack and Model Context Protocol (MCP) APIs as a developer preview, and the establishment of the llm-d community project. The portfolio's latest iteration, Red Hat AI 3, delivers many of these production-ready capabilities for enterprises. Additionally, we're providing more tools and services to empower teams to increase efficiency, collaborate more effectively, and deploy anywhere. Let's explore what Red Hat AI 3 means for your business.

Red Hat's strategy is to serve any model across any accelerator and any environment. The latest inferencing improvements offer features to meet Service Level Agreements (SLAs) of generative AI (gen AI) applications, support for additional hardware accelerators, and an expanded catalog of validated and optimized third-party models. Some highlights include:

- llm-d is now generally available in Red Hat OpenShift AI 3.0. llm-d provides Kubernetes-native distributed inference, which is essential for scaling and managing the unpredictable nature of large language models (LLMs).
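For a sense of what this looks like from an application's side: llm-d builds on vLLM, which serves an OpenAI-compatible API, so a client can typically point the standard OpenAI SDK at the cluster's inference route. The sketch below assumes a hypothetical endpoint URL, token, and model name; they are placeholders for illustration, not values from the article or from any specific OpenShift AI deployment.

```python
# Minimal sketch: calling an OpenAI-compatible inference endpoint exposed by the cluster.
# base_url, api_key, and model are illustrative placeholders -- substitute the route,
# token, and model name that your own deployment actually exposes.
from openai import OpenAI

client = OpenAI(
    base_url="https://llm.apps.example.com/v1",  # hypothetical cluster route
    api_key="not-a-real-key",                    # or a token issued by your platform
)

response = client.chat.completions.create(
    model="granite-3-8b-instruct",               # placeholder model name
    messages=[
        {"role": "user", "content": "Summarize what distributed LLM inference means in one sentence."}
    ],
    max_tokens=100,
)
print(response.choices[0].message.content)
```

Because the serving layer is OpenAI-compatible, existing client code generally keeps working whether the model is served by a single replica or scheduled across several accelerators behind the same route.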