DeepSeek, once hailed as the “efficiency king” for training its R1 model on a modest budget, has reportedly hit a hardware wall. According to industry insiders and reports from the Wall Street Journal, the startup attempted to train its upcoming DeepSeek V4 model entirely on Huawei Ascend hardware but was forced to pivot back to Nvidia chips after months of instability and failure.
The Failure of the “Domestic-First” Strategy
In mid-2025, Chinese authorities reportedly urged DeepSeek to transition away from Western silicon. However, the attempt to use the Huawei Ascend ecosystem for flagship-level training revealed three critical bottlenecks:
- Hardware Instability: Huawei’s chips suffered frequent crashes during large-scale training runs, wiping out weeks of work and wasting enormous amounts of electricity.
- Glacial Interconnects: Chip-to-chip communication speeds, which are vital for training models with trillions of parameters, could not match Nvidia’s NVLink, leading to “computation bottlenecks” (see the rough estimate after this list).
- Immature Software: DeepSeek’s researchers reportedly found Huawei’s CANN software toolkit insufficient for the complex “Engram” and “mHC” architectures required for the V4 model.
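To see why interconnect speed becomes the ceiling, here is a back-of-envelope sketch in Python. Every figure in it is an illustrative assumption rather than a measured spec for any Nvidia or Huawei part: if each training step must synchronize gradients across chips, the synchronization time scales inversely with per-link bandwidth, and a slower fabric leaves the compute units idle.

```python
# Back-of-envelope sketch of why chip-to-chip bandwidth dominates large-scale
# training. All figures below are illustrative assumptions, not vendor specs.

def ring_allreduce_seconds(param_count: float,
                           bytes_per_param: int,
                           link_bandwidth_gb_s: float) -> float:
    """Approximate wall-clock time for one ring all-reduce of the gradients.

    A ring all-reduce pushes roughly 2x the payload over the slowest link,
    so sync time scales inversely with per-link bandwidth.
    """
    payload_bytes = param_count * bytes_per_param
    return 2 * payload_bytes / (link_bandwidth_gb_s * 1e9)

# Hypothetical 1-trillion-parameter model with fp16 gradients (2 bytes each).
PARAMS = 1e12
for label, gb_s in [("~900 GB/s per link (NVLink-class, assumed)", 900),
                    ("~200 GB/s per link (slower fabric, assumed)", 200)]:
    t = ring_allreduce_seconds(PARAMS, 2, gb_s)
    print(f"{label}: ~{t:.0f} s of pure communication per gradient sync")
```

Under these assumed numbers, a link that is four to five times slower turns a few seconds of communication per synchronization into tens of seconds, during which the accelerators sit idle. That is exactly the kind of “computation bottleneck” the reports describe.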
The Compromise: Smuggled Chips and H20s
To get the V4 project back on track, DeepSeek reportedly adopted a “mixed” hardware strategy:
- Nvidia for Training: The company allegedly drew on a cache of smuggled H100s, along with recently approved Nvidia H200 exports (under strict U.S. caps), to handle the heavy lifting of pre-training.
- Huawei for Inference: In a strategic compromise, DeepSeek has relegated Huawei’s Ascend accelerators to inference duty (running the model after it’s trained), where the hardware is significantly more stable.
DeepSeek V4: What to Expect in February 2026
Despite the hardware setbacks, the V4 model is expected to debut around the Lunar New Year (mid-February 2026).
| Feature | DeepSeek V3 (2025) | DeepSeek V4 (2026) |
| --- | --- | --- |
| Primary Hardware | Nvidia H800 / H100 | Mixed Nvidia H200 / H100 |
| Core Architecture | MoE (Mixture-of-Experts) | Engram (O(1) Memory Architecture) |
| Context Window | 128K Tokens | 1 Million+ Tokens |
| Key Capability | Multi-tasking | Repository-Level Coding & Reasoning |
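For readers unfamiliar with the “MoE” entry in the table above: a Mixture-of-Experts layer routes each token to a small subset of expert networks, so only a fraction of the model’s parameters is active per token. The sketch below is a generic, minimal top-k router in Python/NumPy, assumed purely for illustration; it is not DeepSeek’s code, and it says nothing about how the Engram architecture works.

```python
# Minimal, generic top-k Mixture-of-Experts routing sketch (NumPy only).
# Illustrates the idea behind the "MoE" table entry above; this is an
# assumed toy example, not DeepSeek's actual architecture.
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, gate_w, expert_ws, top_k=2):
    """Route each token to its top_k experts and mix their outputs.

    x:         (tokens, d_model) input activations
    gate_w:    (d_model, n_experts) router weights
    expert_ws: list of (d_model, d_model) per-expert weight matrices
    """
    logits = x @ gate_w                                   # (tokens, n_experts)
    top_idx = np.argsort(logits, axis=1)[:, -top_k:]      # chosen experts
    # Softmax over only the selected experts' logits.
    top_logits = np.take_along_axis(logits, top_idx, axis=1)
    weights = np.exp(top_logits - top_logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)

    out = np.zeros_like(x)
    for t in range(x.shape[0]):                           # per-token dispatch
        for k in range(top_k):
            e = top_idx[t, k]
            out[t] += weights[t, k] * (x[t] @ expert_ws[e])
    return out

d_model, n_experts, tokens = 16, 8, 4
x = rng.standard_normal((tokens, d_model))
gate_w = rng.standard_normal((d_model, n_experts))
expert_ws = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, expert_ws)
print(y.shape)  # (4, 16): each token touched only 2 of the 8 experts
```

The point of the design is that per-token compute stays roughly constant even as the total parameter count grows, since only the routed experts run for any given token.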
Conclusion: The “Muscle vs. Elegance” Debate
DeepSeek’s retreat to Nvidia highlights a sobering reality for the Chinese AI sector: while algorithmic “elegance” can reduce costs, there is still no substitute for the raw “muscle” of American hardware when pushing the frontier of model quality. As Justin Lin of Alibaba’s Qwen team recently noted, the odds of Chinese models overtaking OpenAI without better hardware remain low—at best, a 20% chance within the next five years.
