Red Hat, the world’s leading provider of open source solutions, announced Red Hat AI 3, a significant evolution of its enterprise AI platform. Bringing together the latest innovations of Red Hat AI Inference Server, Red Hat Enterprise Linux AI (RHEL AI) and Red Hat OpenShift AI, the platform helps simplify the complexities of high-performance AI inference at scale, enabling organizations to more readily move workloads from proofs-of-concept to production and improve collaboration around AI-enabled applications.
As enterprises move beyond AI experimentation, they face significant hurdles, including data privacy, cost control and managing diverse models. "The GenAI Divide: State of AI in Business," from the Massachusetts Institute of Technology's NANDA project, highlights the reality of production AI: approximately 95% of organizations see no measurable financial return on an estimated $40 billion in enterprise spending.
Red Hat AI 3 focuses on directly addressing these challenges by providing a more consistent, unified experience for CIOs and IT leaders to maximize their investments in accelerated computing technologies. It makes it possible to rapidly scale and distribute AI workloads across hybrid, multi-vendor environments while simultaneously improving cross-team collaboration on next-generation AI workloads like agents, all on the same common platform. With a foundation built on open standards, Red Hat AI 3 meets organizations where they are on their AI journey, supporting any model on any hardware accelerator, from datacenters to public cloud and sovereign AI environments to the farthest edge.
From training to “doing”: The shift to enterprise AI inference
As organizations move AI initiatives into production, the emphasis shifts from training and tuning models to inference, the "doing" phase of enterprise AI. Red Hat AI 3 emphasizes scalable, cost-effective inference by building on the wildly successful vLLM and llm-d community projects and Red Hat's model optimization capabilities to deliver production-grade serving of large language models (LLMs).
To help CIOs get the most out of their high-value hardware acceleration, Red Hat OpenShift AI 3.0 introduces the general availability of llm-d, which reimagines how LLMs run natively on Kubernetes. llm-d enables intelligent distributed inference, combining the proven orchestration of Kubernetes and the performance of vLLM with key open source technologies, including the Kubernetes Gateway API Inference Extension, the NVIDIA Dynamo low-latency data transfer library (NIXL) and the DeepEP Mixture of Experts (MoE) communication library, to help organizations serve models efficiently at scale.
llm-d builds on vLLM, evolving it from a single-node, high-performance inference engine into a distributed, consistent and scalable serving system, tightly integrated with Kubernetes and designed to enable predictable performance, measurable ROI and effective infrastructure planning. These enhancements directly address the challenges of handling highly variable LLM workloads and serving massive models like MoE models.
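llm-d's actual scheduling is considerably more sophisticated, but one core idea behind distributed LLM serving, routing requests that share a prompt prefix to the same replica so its KV cache can be reused, can be sketched in a few lines. The worker names and the prefix length below are illustrative assumptions, not real llm-d objects:

```python
import hashlib

# Hypothetical pool of vLLM replicas; names are illustrative only.
WORKERS = ["vllm-pod-0", "vllm-pod-1", "vllm-pod-2"]

def route(prompt: str, prefix_chars: int = 24) -> str:
    """Pick a worker by hashing the prompt prefix, so requests that share
    a prefix (e.g. a common system prompt) land on the same replica and
    can reuse its cached attention state."""
    prefix = prompt[:prefix_chars]
    digest = hashlib.sha256(prefix.encode()).hexdigest()
    return WORKERS[int(digest, 16) % len(WORKERS)]

# Requests sharing a system prompt route to the same worker.
shared = "You are a helpful assistant. "
w1 = route(shared + "Summarize this contract.")
w2 = route(shared + "Translate this paragraph.")
```

Real schedulers also weigh load, latency and cache contents when deciding where to send a request; this sketch only shows why prefix-aware placement matters for cache reuse.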
A unified platform for collaborative AI
Red Hat AI 3 delivers a unified, flexible experience tailored to the collaborative demands of building production-ready generative AI solutions. It is designed to deliver tangible value by fostering collaboration and unifying workflows across teams through a single platform for both platform engineers and AI engineers to execute on their AI strategy. New capabilities focus on providing the productivity and efficiency needed to scale from proof-of-concept to production.
Building the foundation for next-generation AI agents
AI agents are poised to transform how applications are built, and their complex, autonomous workflows will place heavy demands on inference capabilities. The Red Hat OpenShift AI 3.0 release continues to lay the groundwork for scalable agentic AI systems not only through its inference capabilities but also with new features and enhancements focused on agent management.
To accelerate agent creation and deployment, Red Hat has introduced a Unified API layer based on Llama Stack, which helps align development with industry standards like OpenAI-compatible LLM interface protocols. Additionally, to champion a more open and interoperable ecosystem, Red Hat is an early adopter of the Model Context Protocol (MCP), a powerful, emerging standard that streamlines how AI models interact with external tools—a fundamental feature for modern AI agents.
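Because the Unified API layer follows OpenAI-compatible interface conventions, a client can talk to it with the same request shape used for any OpenAI-style chat endpoint. The endpoint URL and model name below are placeholders for whatever a given deployment exposes; this is a sketch of the request format, not a specific Red Hat API:

```python
import json
import urllib.request

# Placeholder endpoint: any OpenAI-compatible server (e.g. a vLLM or
# Llama Stack deployment) accepts this request shape at /v1/chat/completions.
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, user_msg: str, temperature: float = 0.2) -> dict:
    """Assemble an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": user_msg},
        ],
        "temperature": temperature,
    }

def send(payload: dict) -> dict:
    """POST the payload to the serving endpoint (requires a running server)."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("llama-3.1-8b-instruct", "What is MCP?")
```

Standardizing on this wire format is what lets agent frameworks and tools swap serving backends without client-side changes.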
Red Hat AI 3 introduces a new modular and extensible toolkit for model customization, built on existing InstructLab functionality. It provides specialized Python libraries that give developers greater flexibility and control. The toolkit is powered by open source projects like Docling for data processing, which streamlines the ingestion of unstructured documents into an AI-readable format. It also includes a flexible framework for synthetic data generation and a training hub for LLM fine tuning. The integrated evaluation hub helps AI engineers monitor and validate results, empowering them to confidently leverage their proprietary data for more accurate and relevant AI outcomes.
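The synthetic data generation step in such a pipeline can be pictured as turning ingested passages into question/answer training pairs. The function and template names below are invented for illustration and are not InstructLab or Docling APIs; a real pipeline would call an LLM to produce the answers rather than echoing the passage:

```python
# Illustrative toy sketch of synthetic training-data generation.
# All names here are hypothetical, not real toolkit APIs.
from typing import Iterator

TEMPLATES = [
    "What does the following passage say about {topic}?",
    "Summarize the key point of this passage on {topic}.",
]

def generate_qa_pairs(passage: str, topic: str) -> Iterator[dict]:
    """Expand one source passage into several question/answer training pairs."""
    for template in TEMPLATES:
        yield {
            "question": template.format(topic=topic),
            "context": passage,
            # A real pipeline would call an LLM to draft the answer here.
            "answer": passage.strip(),
        }

pairs = list(generate_qa_pairs("Inference dominates production AI cost.", "inference"))
```

An evaluation step would then score these pairs before they are used for fine tuning, which is the role the integrated evaluation hub plays in the toolkit described above.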
The post Red Hat Brings Distributed AI Inference to Production AI Workloads with Red Hat AI 3 first appeared on PressReleaseCC.