As of March 2026, Nvidia has confirmed that its flagship Rubin R100 GPU is equipped with 288GB of HBM4 (High Bandwidth Memory). This represents a 50% increase in capacity over the 192GB found in the previous-generation Blackwell (B200).
The boost in memory is a strategic response to the “Memory Wall”: the primary bottleneck in running massive-context AI models and hyper-scale Mixture-of-Experts (MoE) architectures.

Key Technical Specifications
The Rubin architecture, which entered full production in early 2026, focuses on memory bandwidth and capacity as much as raw compute power.
| Specification | Rubin R100 | Blackwell B200 | Improvement |
|---|---|---|---|
| HBM Capacity | 288GB HBM4 | 192GB HBM3e | 1.5x |
| Memory Bandwidth | 22 TB/s | 8 TB/s | 2.75x |
| Transistor Count | 336 Billion | 208 Billion | 1.6x |
| FP4 Inference | 50 PFLOPS | 10 PFLOPS | 5x |
| Process Node | TSMC N3 (3nm) | TSMC 4NP | 1 Generation |
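The relationship between the compute and bandwidth figures in this table can be made concrete with a quick roofline-style calculation. The following is a back-of-envelope sketch using only the headline numbers above; it ignores utilization, sparsity, and cache effects.

```python
# Back-of-envelope roofline balance points from the headline specs above.
# The balance point is the arithmetic intensity (FLOPs per byte of HBM
# traffic) at which a kernel stops being bandwidth-bound.

specs = {
    # name: (peak FP4 FLOP/s, HBM bandwidth in bytes/s)
    "Rubin R100":     (50e15, 22e12),
    "Blackwell B200": (10e15, 8e12),
}

for name, (peak_flops, bandwidth) in specs.items():
    balance = peak_flops / bandwidth
    print(f"{name}: compute-bound only above ~{balance:,.0f} FLOP/byte")

# Rubin R100:     compute-bound only above ~2,273 FLOP/byte
# Blackwell B200: compute-bound only above ~1,250 FLOP/byte
#
# Decode-phase LLM inference sits far below either threshold, which is why
# HBM capacity and bandwidth, rather than peak FLOPS, dominate token
# generation throughput (the "Memory Wall" described above).
```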
- Why 288GB? This capacity (8 stacks of 36GB HBM4) allows a single GPU to hold trillion-parameter models that previously required complex distribution across multiple chips.
- The 22 TB/s Target: While the initial target was 22 TB/s, recent reports from March 3, 2026, indicate that some early production batches may operate closer to 20 TB/s due to yield challenges at HBM4 suppliers like SK hynix and Samsung.
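To put the bandwidth figures in perspective, single-stream decode throughput for a large model is roughly bounded by how fast the resident weights can be streamed out of HBM for each generated token. The sketch below is a rough upper-bound estimate under assumed conditions: FP4 weights, hypothetical dense model sizes (70B and 400B parameters), and no accounting for KV-cache reads, batching, or overlap.

```python
def max_decode_tokens_per_s(weight_params: float,
                            bytes_per_param: float,
                            hbm_bandwidth_bytes_per_s: float) -> float:
    """Rough single-stream upper bound: each generated token requires
    streaming all resident weights from HBM at least once."""
    weight_bytes = weight_params * bytes_per_param
    return hbm_bandwidth_bytes_per_s / weight_bytes

FP4_BYTES = 0.5  # 4-bit weights = half a byte per parameter

for bw_label, bw in [("22 TB/s (target)", 22e12), ("20 TB/s (early batches)", 20e12)]:
    for params in (70e9, 400e9):  # hypothetical dense model sizes
        tps = max_decode_tokens_per_s(params, FP4_BYTES, bw)
        print(f"{bw_label}: {params / 1e9:.0f}B params @ FP4 -> "
              f"<= {tps:,.0f} tokens/s per stream")

# Because the bound scales linearly with bandwidth, the ~10% shortfall
# reported for early batches translates directly into a ~10% lower ceiling
# on decode throughput.
```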
System-Level Impact: The NVL72 Rack
The Rubin GPU is rarely used in isolation; it is the core of the Vera Rubin NVL72 rack-scale supercomputer.
- Total System Memory: A full rack contains 72 Rubin GPUs, totaling 20.7TB of HBM4 memory.
- The Vera CPU: Each rack also includes 36 Vera CPUs (88-core ARM-based), which support up to 1.5TB of LPDDR5X per CPU, acting as a high-speed buffer for KV-cache offloading (a sizing sketch follows this list).
- Agentic AI Economics: Nvidia claims this massive memory pool enables a 10x reduction in cost per token for long-reasoning agents compared to the Blackwell platform.
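The rack-level totals above are straightforward multiplication, and the role of the Vera CPUs as a KV-cache offload tier becomes clearer once you estimate how quickly long contexts consume HBM. The sketch below uses the standard per-token KV-cache formula with hypothetical model dimensions (96 layers, 8 KV heads, 128-dim heads, FP8 cache values) chosen purely for illustration.

```python
# Rack-level memory pool, from the figures above.
GPUS_PER_RACK = 72
HBM_PER_GPU_GB = 288
CPUS_PER_RACK = 36
LPDDR_PER_CPU_TB = 1.5  # "up to" figure per Vera CPU

print(f"Total rack HBM4:    {GPUS_PER_RACK * HBM_PER_GPU_GB / 1000:.1f} TB")
print(f"Total rack LPDDR5X: {CPUS_PER_RACK * LPDDR_PER_CPU_TB:.1f} TB")

# Standard per-token KV-cache size:
#   2 (K and V) * layers * kv_heads * head_dim * bytes_per_value
def kv_cache_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value):
    return 2 * layers * kv_heads * head_dim * bytes_per_value

# Hypothetical model dimensions, purely illustrative.
per_token = kv_cache_bytes_per_token(layers=96, kv_heads=8,
                                     head_dim=128, bytes_per_value=1)  # FP8 cache
context_len = 1_000_000   # one million-token agent session
sessions = 64             # concurrent long-context sessions

total_gb = per_token * context_len * sessions / 1e9
print(f"KV cache: ~{per_token / 1e3:.0f} KB/token -> "
      f"~{total_gb:,.0f} GB for {sessions} x {context_len:,}-token sessions")
```

At these assumed dimensions, a few dozen million-token agent sessions already dwarf a single GPU's 288GB of HBM, which is exactly the pressure the roughly 54TB LPDDR5X pool per rack is positioned to absorb.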
Current Status & Samples
- Customer Sampling: On February 25, 2026, Nvidia confirmed it has started shipping the first Vera Rubin samples to select customers (including Microsoft, AWS, and Google).
- Volume Availability: Production shipments are on track for the second half (H2) of 2026.
- The “VRAM Crunch”: Due to the extreme demand for HBM4, rumors from early March suggest Nvidia is no longer supplying VRAM to certain board partners, requiring them to source memory independently for specialized workstations.


