Taking the stage at the Taipei Music Center for COMPUTEX 2026, NVIDIA CEO Jensen Huang made a two-pronged push into physical AI. Declaring that the “big bang of physical AI is just around the corner,” Huang announced NVIDIA Cosmos 3, which the company describes as the world’s first fully open physical AI omnimodel, alongside the NVIDIA Isaac GR00T Reference Humanoid Robot, a standardized hardware and software blueprint aimed at democratizing the humanoid robotics industry.
The twin announcements mark a strategic shift for NVIDIA: rather than acting purely as a silicon and cloud software provider, the company is now anchoring the open-source physical intelligence ecosystem to its own hardware pipeline.
1. NVIDIA Cosmos 3: The First Open Omnimodel for Physical AI
Billed as a generational leap for developers building robots, autonomous vehicles, and vision agents, Cosmos 3 unifies visual reasoning, simulated world generation, and physical action generation into a single architecture. Until now, these capabilities required completely separate software systems.
The Mixture-of-Transformers (MoT) Backbone
The core of Cosmos 3 is its Mixture-of-Transformers (MoT) architecture. Rather than generating frames reactively with no explicit reasoning step, the model pairs a reasoning transformer with an expert generation transformer. The reasoning block interprets a physical scene, works out its spatial-temporal layout, and predicts how objects will interact. Only then does the generation block produce video or robotic motor trajectories.
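The reason-then-generate split can be illustrated with a deliberately tiny sketch. This is not Cosmos 3 code and uses no NVIDIA API: the two functions are plain-Python stand-ins for the two transformer blocks, and every name here (`SceneObject`, `ScenePlan`, `reason`, `generate`, `contact_radius`) is hypothetical. It only shows the data flow, in which a structured scene interpretation is produced first and the trajectory is generated from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    name: str
    x: float
    y: float


@dataclass(frozen=True)
class ScenePlan:
    target: str
    contacts: list   # pairs of objects close enough to interact
    goal: tuple      # (x, y) position the motion should reach


def reason(objects, target, contact_radius=0.5):
    """Stand-in for the reasoning block: lay out the scene and flag interactions."""
    by_name = {o.name: o for o in objects}
    contacts = []
    for i, a in enumerate(objects):
        for b in objects[i + 1:]:
            dist = ((a.x - b.x) ** 2 + (a.y - b.y) ** 2) ** 0.5
            if dist <= contact_radius:
                contacts.append((a.name, b.name))
    goal = (by_name[target].x, by_name[target].y)
    return ScenePlan(target=target, contacts=contacts, goal=goal)


def generate(plan, start, steps=4):
    """Stand-in for the generation block: emit evenly spaced waypoints to the goal."""
    (sx, sy), (gx, gy) = start, plan.goal
    return [
        (sx + (gx - sx) * t / steps, sy + (gy - sy) * t / steps)
        for t in range(steps + 1)
    ]


objects = [
    SceneObject("cup", 2.0, 0.0),
    SceneObject("plate", 2.25, 0.0),
    SceneObject("lamp", 5.0, 5.0),
]
plan = reason(objects, target="cup")          # step 1: interpret the scene
trajectory = generate(plan, start=(0.0, 0.0))  # step 2: generate motion from the plan
print(plan.contacts)
print(trajectory)
```

In the real model both stages are learned transformers. The toy version only makes the ordering concrete: the generator never sees raw objects, only the plan the reasoning stage hands it.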
Massive Token Scale and Benchmark Leadership
NVIDIA trained Cosmos 3 on a dataset of 20 trillion multimodal tokens, which includes nearly 1 billion images, 400 million real and synthetic videos, ambient sound, and human and robot action trajectories. The model launched with open weights in several sizes:
- Cosmos 3 Super (32B per block / ~65B total): Built for large-scale data centers running Hopper and Blackwell GPUs to generate high-fidelity synthetic data.
- Cosmos 3 Nano (8B per block / ~16B total): Tailored for local AI workstations running hardware like the RTX PRO 6000 series.
- Cosmos 3 Edge: Coming soon to handle ultra-low latency, real-time on-device robot inference.
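For a rough sense of what these parameter counts mean for hardware, weight memory can be estimated as parameters times bytes per parameter. The sketch below is back-of-envelope arithmetic, not an NVIDIA sizing guide. It uses the approximate totals quoted above, assumes for illustration that each model could be stored at FP16, FP8, or FP4 (the announcement does not specify precisions), and ignores activations, caches, and runtime overhead.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}

# Approximate total parameter counts quoted in the article, in billions.
MODELS = {"Cosmos 3 Super": 65, "Cosmos 3 Nano": 16}


def weight_gb(params_billion, precision):
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1e9


for name, params in MODELS.items():
    sizes = ", ".join(
        f"{p}: {weight_gb(params, p):.1f} GB" for p in BYTES_PER_PARAM
    )
    print(f"{name} (~{params}B params) -> {sizes}")
```

By this estimate the Nano weights alone occupy about 32 GB at FP16 and about 8 GB at FP4, while Super needs about 130 GB at FP16, which is consistent with the workstation versus data-center positioning.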
Cosmos 3 immediately claimed the top spot among open-weights models across leading physical AI benchmarks, dominating in world-generation accuracy (Physics-IQ, PAI-Bench) and motor action policy (RoboLab, RoboArena).
2. Isaac GR00T Reference Humanoid: The “Android” of Robotics
While Cosmos 3 acts as the brain, NVIDIA’s second blockbuster reveal delivers the standardized “body”. The NVIDIA Isaac GR00T Reference Humanoid Robot is the industry’s first fully open reference design engineered to clean up a heavily fragmented hardware and simulation landscape.
[ ISAAC GR00T REFERENCE HUMANOID ARCHITECTURE ]

[ THE BODY: Human-Scale Platform ]
• Chassis: Unitree H2 Plus (6 ft, 150 lbs)
• Dexterity: Sharpa Wave 5-Finger Tactile Hands
• Freedom: 75 Degrees of Freedom (Total)

[ THE BRAIN: Jetson Thor & Software ]
• Compute: Jetson AGX Thor T5000 (Blackwell GPU)
• Performance: 2,070 FP4 AI Teraflops
• Ecosystem: End-to-end Isaac GR00T Workflows
Unifying Hardware and AI Compute
The reference blueprint brings together several cutting-edge commercial components into a unified research stack:
- The Frame: A Unitree H2 Plus humanoid chassis standing nearly 6 feet tall and weighing 150 pounds, sporting 31 degrees of freedom across the main frame.
- The Hands: Dual Sharpa Wave tactile five-finger hands with 22 degrees of freedom each, bringing the robot’s whole-body total to 75 degrees of freedom for delicate, human-like object manipulation.
- Onboard Powerhouse: The system is driven locally by the NVIDIA Jetson AGX Thor™ T5000 system-on-chip, powered by a Blackwell architecture GPU pushing out 2,070 FP4 teraflops of local AI performance paired with 128GB of unified memory.
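The 75-degree-of-freedom total only adds up if the 22-DoF figure applies to each hand rather than to the pair. A quick check of that reading, using only the numbers quoted above (the variable names are illustrative):

```python
# Figures quoted in the article.
FRAME_DOF = 31      # Unitree H2 Plus main frame
HAND_DOF = 22       # one Sharpa Wave hand
NUM_HANDS = 2

total_dof = FRAME_DOF + NUM_HANDS * HAND_DOF
print(total_dof)  # 75, matching the quoted whole-body total
```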
3. The Cosmos Coalition: Standardizing Synthetic Simulation
NVIDIA argues that the biggest bottleneck in robotics development is data scarcity: recording millions of hours of real-world video to teach a robot how to pick up an object or react to a rare failure is too slow and too costly.
To solve this, the tech giant announced the formation of the NVIDIA Cosmos Coalition. Founding members include elite AI research labs and world-model builders such as Runway, Black Forest Labs, Skild AI, Agile Robots, LTX, and Generalist.
By using Cosmos 3 as a physics-grounded, highly controllable world simulator, these firms can generate a practically unlimited range of plausible synthetic video data, varying weather, lighting, or sudden environmental hazards. This allows robotics developers to train World Action Models (WAMs) and test complex control policies safely in closed-loop simulation before deploying them to a physical humanoid chassis.
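The idea of varying conditions around a fixed task is usually called domain randomization, and its bookkeeping is simple to sketch. The example below does not call Cosmos 3 or any NVIDIA API. It only enumerates and samples scenario descriptions that a world simulator could be asked to render, and the condition lists, function names, and prompt format are all hypothetical.

```python
import itertools
import random

# Hypothetical condition axes; a real pipeline would define far more.
WEATHER = ["clear", "rain", "fog"]
LIGHTING = ["noon", "dusk", "night"]
HAZARD = ["none", "spilled_liquid", "fallen_box"]


def all_scenarios():
    """Every combination of the condition axes, as dictionaries."""
    return [
        {"weather": w, "lighting": l, "hazard": h}
        for w, l, h in itertools.product(WEATHER, LIGHTING, HAZARD)
    ]


def sample_scenarios(n, seed=0):
    """Draw n distinct scenarios; a fixed seed keeps the batch reproducible."""
    rng = random.Random(seed)
    return rng.sample(all_scenarios(), n)


def to_prompt(scenario, task="pick up the box from the shelf"):
    """Turn a scenario into a text description for a (hypothetical) generator."""
    return (
        f"{task}; weather={scenario['weather']}, "
        f"lighting={scenario['lighting']}, hazard={scenario['hazard']}"
    )


batch = sample_scenarios(5, seed=42)
for scenario in batch:
    print(to_prompt(scenario))
```

Even three values per axis yield 27 distinct scenarios for one task, and each added axis multiplies that count, which is why generated data can cover rare conditions that would be impractical to film.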
Academic powerhouses including Stanford Robotics Center, ETH Zurich, Ai2, and UC San Diego have already signed on to receive the Isaac GR00T reference units, a move that positions NVIDIA’s hardware, software, and simulation stacks as a foundational template for the next generation of general-purpose robotics.
