In early February 2026, reports surfaced that OpenAI is dissatisfied with the performance of Nvidia’s current GPU architecture for certain AI tasks. While Nvidia remains the undisputed leader in training large language models, the friction points are emerging in the area of inference—the process of actually running the models for users.
1. The Core Issue: Inference Speed
According to multiple industry reports and internal sources, OpenAI has expressed frustration with the latency (response time) of Nvidia’s hardware when handling real-time requests.
- Memory Bottleneck: Nvidia GPUs rely on external high-bandwidth memory (HBM), which requires the chip to constantly fetch model weights from off-chip. This adds milliseconds of delay that become particularly noticeable in high-speed applications like AI coding (Codex) and real-time software-to-software communication.
- The SRAM Advantage: OpenAI is reportedly seeking “SRAM-heavy” chips—where memory is embedded directly on the silicon. This design allows for near-instant data retrieval, making chatbots feel much more responsive.
- Codex Weakness: Internal teams at OpenAI reportedly attributed some of the lag in their coding tools to Nvidia’s hardware, noting that professional developers place a “massive premium” on speed.
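To see why memory placement matters so much for responsiveness, consider a rough roofline-style estimate: autoregressive decoding must stream roughly all of a model's weights from memory once per generated token, so per-token latency is approximately bytes moved divided by memory bandwidth. The sketch below uses illustrative, assumed numbers (a hypothetical 70B-parameter model in 8-bit weights, ~3 TB/s for off-chip HBM, ~25 TB/s for aggregate on-chip SRAM), not vendor specifications:

```python
# Back-of-envelope estimate of per-token decode latency for a
# memory-bandwidth-bound LLM. All figures are illustrative assumptions.

def per_token_latency_ms(param_bytes: float, mem_bandwidth_gbps: float) -> float:
    """Approximate decode latency as (bytes streamed per token) / bandwidth.

    Assumes decoding is memory-bound: every generated token requires
    reading roughly all model weights from memory once.
    """
    seconds = param_bytes / (mem_bandwidth_gbps * 1e9)
    return seconds * 1e3  # convert to milliseconds

# Hypothetical 70B-parameter model stored in 8-bit weights (~70 GB).
weights_bytes = 70e9

hbm_ms = per_token_latency_ms(weights_bytes, 3_000)    # ~3 TB/s off-chip HBM
sram_ms = per_token_latency_ms(weights_bytes, 25_000)  # ~25 TB/s on-chip SRAM

print(f"HBM-bound:  {hbm_ms:.1f} ms/token")   # roughly 23 ms/token
print(f"SRAM-bound: {sram_ms:.1f} ms/token")  # roughly 3 ms/token
```

Under these assumed figures, keeping weights in on-chip SRAM cuts the bandwidth-bound latency floor by nearly an order of magnitude, which is the intuition behind the "SRAM-heavy" designs from vendors like Cerebras and Groq.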
2. Diversification: The Search for Alternatives
OpenAI is actively looking to reduce its near-total reliance on Nvidia, aiming for alternative hardware to eventually cover at least 10% of its inference needs.
| Partner | Recent Activity (Jan/Feb 2026) |
| --- | --- |
| Cerebras | OpenAI announced a commercial deal to use Cerebras' "wafer-scale" chips, which are optimized for ultra-fast inference. |
| AMD | OpenAI has evaluated AMD's Instinct MI300 series to broaden its hardware base and lower supply chain risks. |
| Groq | OpenAI held talks for faster inference capacity, but Nvidia recently struck a $20 billion licensing deal with Groq, effectively blocking OpenAI's path to that specific hardware. |
3. The “Stalled” $100 Billion Deal
The technical dissatisfaction has bled into the financial partnership between the two giants.
- The Non-Binding MoU: In September 2025, Nvidia announced plans to invest up to $100 billion in OpenAI. The agreement was only a non-binding memorandum of understanding, however, and negotiations to finalize it have reportedly stalled.
- CEO Friction: While Sam Altman and Jensen Huang have publicly called reports of a rift "nonsense," Huang has privately expressed concerns about OpenAI's "lack of business discipline" and the rising competition from Google's TPUs and Amazon's Trainium chips, which power Anthropic's workloads.
- Vera Rubin Delay: The first gigawatt of the planned 10-gigawatt "AI Factory" is not expected to come online until the second half of 2026, built on Nvidia's next-gen Vera Rubin platform.
4. Competitive Pressure
OpenAI is feeling the heat because its rivals are using more specialized hardware:
- Google (Gemini): Uses in-house TPUs (Tensor Processing Units) designed specifically for inference, giving them a “performance-per-dollar” edge.
- Anthropic (Claude): Relies heavily on Amazon’s Trainium/Inferentia chips, bypassing some of the general-purpose GPU bottlenecks.
Conclusion: A “Co-Design” Future
Despite these frustrations, OpenAI’s infrastructure leader emphasized that Nvidia’s technology remains “foundational.” The relationship is shifting from a simple vendor-customer model to a deep co-design partnership. OpenAI is pushing Nvidia to make chips that act less like general-purpose graphics cards and more like specialized “thinking engines.”