AI

November 04, 2025

6 min read

Creating AI agents for gigawatt-scale AI factories with NVIDIA Omniverse DSX Blueprint

Phaidra is part of the NVIDIA Omniverse DSX partnership developing AI agents to maximize energy efficiency and reliability for gigawatt-scale AI factories

This was a big week for Phaidra and the broader AI factory ecosystem.

During his keynote address at GTC DC, NVIDIA founder and CEO Jensen Huang unveiled the NVIDIA Omniverse DSX Blueprint — a comprehensive and open blueprint for designing and operating gigawatt-scale AI factories.

Phaidra is proud to build the future of AI factories (i.e. data centers designed specifically for massive AI workloads) alongside NVIDIA and the other DSX ecosystem members. We envisage a future where AI factories operate continuously at peak performance — aided by AI agents that vigilantly manage the complex infrastructure on a 24/7 basis.

Figure: Jensen Huang announcing the NVIDIA Omniverse DSX Blueprint for gigawatt-scale AI factories at GTC DC, with partner logos including Phaidra displayed on stage.

Why now? Two major forcing functions are driving the need for the Omniverse DSX Blueprint:

  1. Inefficiencies are magnified at gigawatt scale: a single gigawatt AI factory represents both a $50Bn investment and a $200Bn revenue opportunity. Every 1% of inefficiency therefore represents $2Bn in lost revenue.

  2. Extreme performance requires extreme co-design: AI factories today are so large, complex, and interconnected that they must operate as a single integrated machine rather than a collection of loosely orchestrated components. This is the pathway towards step-function improvements in tokens per watt.
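The arithmetic behind the first point can be written out as a short sketch. It assumes, as the text implies, that lost revenue scales linearly with inefficiency; the function name `lost_revenue_bn` is hypothetical, and only the $200Bn figure and the 1% case come from the text.

```python
def lost_revenue_bn(revenue_opportunity_bn: float, inefficiency_pct: float) -> float:
    """Revenue forgone, in $Bn, assuming revenue scales linearly with efficiency."""
    return revenue_opportunity_bn * inefficiency_pct / 100.0


REVENUE_OPPORTUNITY_BN = 200.0  # figure quoted in the text for a gigawatt AI factory

# 1% is the case stated in the text; 2% and 5% are illustrative extrapolations.
for pct in (1, 2, 5):
    loss = lost_revenue_bn(REVENUE_OPPORTUNITY_BN, pct)
    print(f"{pct}% inefficiency -> ${loss:.0f}Bn in lost revenue")
```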

By openly sharing a reference architecture for the entire industry to build upon, we make it easier for everyone to collaboratively and iteratively improve energy efficiency, time-to-value, and reliability at scale.

Phaidra’s unique contributions to the Omniverse DSX Blueprint are:

  • Helping define the open data exchange standards and communications protocols that enable the various data center control planes (e.g. BMS, PMS, workload manager) to freely and openly communicate with each other.

  • Ensuring that the new generation of agentic AI companies, such as Phaidra and Emerald AI, can readily integrate with both the physical and digital twin parts of the AI factory in pursuit of extreme performance.

  • Developing and sharing OpenUSD SimReady assets, and utilizing existing SimReady assets within the NVIDIA Omniverse digital twin, to rapidly train and evaluate our AI agents. This makes it easier for everyone to leverage robust simulation capabilities to improve AI factory performance.
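To make the first contribution more concrete, here is a minimal sketch of what a shared message format between control planes might look like. It is not the DSX Blueprint's actual standard: the `TelemetryMessage` class, its field names, and the JSON encoding are all assumptions chosen for illustration, and the sample values are made up.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TelemetryMessage:
    """Hypothetical reading that any control plane could publish or consume."""
    source: str         # publishing control plane, e.g. "BMS" or "workload_manager"
    signal: str         # name of the measured quantity
    value: float
    unit: str
    timestamp_s: float  # Unix epoch seconds


def encode(msg: TelemetryMessage) -> str:
    """Serialize a message to JSON with stable key ordering."""
    return json.dumps(asdict(msg), sort_keys=True)


def decode(payload: str) -> TelemetryMessage:
    """Rebuild a message from its JSON form."""
    return TelemetryMessage(**json.loads(payload))


# Illustrative values only: a BMS publishing a coolant supply temperature.
msg = TelemetryMessage(
    source="BMS",
    signal="supply_temperature",
    value=30.0,
    unit="degC",
    timestamp_s=1762214400.0,
)
payload = encode(msg)
print(payload)
print(decode(payload) == msg)
```

The point of the sketch is the design choice rather than the schema itself: once every control plane agrees on one self-describing format, an AI agent can read from the BMS, the PMS, and the workload manager without a custom adapter for each.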

The benefits of a collaborative, simulation-first approach are clear. The GTC DC keynote demo included an animation of Phaidra’s liquid cooling AI agent in action (reposted below).

Figure: Time-series graph comparing IT load and supply temperature under traditional control and under Phaidra’s AI agent. The self-learning AI agent is substantially better at reducing thermal spikes than the traditional control system.

This AI agent eliminates the thermal spikes resulting from massively synchronized AI workloads, which in turn enables the AI factory to run at higher TCS temperatures. The end result is substantially less power and capital required for cooling, and more power available for revenue-generating GPUs.
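The link between smaller spikes and higher operating temperatures can be sketched with simple headroom arithmetic. This is an illustration under stated assumptions, not Phaidra's method: the function `max_setpoint_c`, the 45 °C limit, and the spike sizes are all hypothetical values chosen to show the reasoning.

```python
def max_setpoint_c(temp_limit_c: float, worst_spike_c: float) -> float:
    """Highest supply-temperature setpoint that keeps the worst spike under the limit."""
    if worst_spike_c < 0:
        raise ValueError("worst_spike_c must be non-negative")
    return temp_limit_c - worst_spike_c


TEMP_LIMIT_C = 45.0  # hypothetical maximum allowed supply temperature

# Hypothetical worst-case overshoot above the setpoint for each controller.
for label, spike_c in (("traditional control", 5.0), ("AI agent", 1.0)):
    setpoint = max_setpoint_c(TEMP_LIMIT_C, spike_c)
    print(f"{label}: worst spike {spike_c} C -> setpoint can be up to {setpoint} C")
```

In this made-up case, shrinking the worst spike by 4 °C lets the setpoint rise by the same 4 °C, and a warmer loop needs less cooling effort.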

Our AI agents were developed and trained in simulation before they were deployed into real-world production systems. Phaidra collaborated closely with NVIDIA’s data center engineering team and used an operational digital twin of an NVIDIA DGX SuperPOD to rapidly prototype various AI agent architectures under varying environmental conditions. The optimal AI agent was trained further in simulation (i.e. bootstrapped on synthetic data) before testing on production NVIDIA DGX GB200 systems with live AI workloads.

Interestingly, we discovered that the AI agent trained in simulation dramatically improved the performance of the existing liquid cooling control system that had already been fine-tuned by human experts.

This is shown in the graph below, which illustrates the before/after performance of our AI agents in providing precision thermal control in liquid cooling systems. The red dots correspond to the traditional control system’s performance during a 30% to 70% load ramp. The blue dots correspond to an AI agent that had been trained in simulation. Finally, the green dots correspond to an AI agent after several hours of live learning on the production system (i.e. the reinforcement learning-based AI agent taught itself to get better without human intervention).

Figure: Temperature stability as Phaidra’s AI agent progresses from inactive to untrained to fully trained, with visibly reduced variation and tighter control over time. The agent teaches itself to get better at managing AI factory infrastructure without human intervention.
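One way to put numbers on "tighter control" in a before/after comparison like the one described above is to compute the peak deviation from the setpoint and the spread of the temperature readings. The sketch below is an assumption about how such a comparison could be scored, not the metric Phaidra used; the function name `stability_metrics` is hypothetical and the two temperature traces are synthetic values invented for the example.

```python
from statistics import pstdev


def stability_metrics(temps_c: list[float], setpoint_c: float) -> dict[str, float]:
    """Summarize how closely a temperature trace tracks its setpoint."""
    deviations = [t - setpoint_c for t in temps_c]
    return {
        "peak_deviation_c": max(abs(d) for d in deviations),  # worst excursion
        "std_dev_c": pstdev(temps_c),                          # overall spread
    }


SETPOINT_C = 30.0

# Synthetic traces for illustration only; these are not measured data.
traditional = [30.0, 30.5, 32.0, 33.0, 31.0, 30.0]
ai_agent = [30.0, 30.1, 30.4, 30.3, 30.1, 30.0]

for label, trace in (("traditional", traditional), ("AI agent", ai_agent)):
    metrics = stability_metrics(trace, SETPOINT_C)
    print(label, {name: round(v, 2) for name, v in metrics.items()})
```

A lower value on both metrics corresponds to the visibly tighter cluster of dots in the chart.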

This mini-case study illustrates two important points:

  • Design and iteration in simulation can greatly accelerate your product development. What would have taken multiple quarters of iteration time (because this is mission-critical infrastructure) instead took weeks in simulation.

  • AI agents trained in simulation can already outperform existing systems — provided the simulation environment is of sufficiently high fidelity.

The Phaidra team is proud to work alongside NVIDIA and the team of industry leaders to define the future of gigawatt-scale AI factories. Our goal is to openly share our learnings and know-how through the Omniverse DSX Blueprint initiative to ensure that AI factories can readily leverage agentic AI technologies to achieve extreme energy efficiency, time-to-value, and reliability.

Featured Expert

Learn more about one of our subject matter experts interviewed for this post

Jim Gao

Co-Founder, Chief Executive Officer

Jim is a co-founder and the CEO of Phaidra. He sets the strategic direction and leads the company in operational excellence. Prior to Phaidra, Jim led the DeepMind Energy Team and pioneered Google's use of AI controls on their hyperscale data center cooling systems. Prior to DeepMind, he spent a decade working as a Technical Lead for Google’s Data Centers.
