In a bold move that resets the price-to-performance ratio in the generative AI market, Chinese AI unicorn MiniMax has officially launched its next-generation text model, MiniMax M2.5. Positioned as a "production-grade native agent model," M2.5 is designed to handle complex, long-horizon tasks, particularly in software engineering and autonomous research, at a fraction of the cost of Western counterparts like Claude 4.6 and GPT-5.2.

The launch, which took place on February 12, 2026, marks a significant milestone for MiniMax. The company reported that the model is already powering 30% of its internal business tasks and generating 80% of its production code.
High Performance, Low Footprint
While flagship models often rely on massive parameter counts, M2.5 utilizes a sophisticated Mixture-of-Experts (MoE) architecture. Out of its 230 billion total parameters, only 10 billion are activated during any given task. This efficiency allows M2.5 to deliver flagship-level intelligence with significantly lower latency and compute requirements.
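As a rough illustration of how sparse activation works, here is a minimal top-k routing sketch in plain Python. The toy router and expert functions are purely illustrative and are not MiniMax's actual architecture; the point is that only `top_k` of the experts run per token, which is why total parameter count can far exceed active parameter count.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, router_weights, top_k=2):
    """Route a token vector to the top-k experts and mix their outputs.

    Only top_k of len(experts) expert networks execute per token; the
    rest of the model's parameters stay inactive for this forward pass.
    """
    # Router scores: one logit per expert (toy dot product with the token).
    logits = [sum(t * w for t, w in zip(token, ws)) for ws in router_weights]
    probs = softmax(logits)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    outputs = [experts[i](token) for i in chosen]  # only the chosen experts run
    mixed = [0.0] * len(token)
    for i, out in zip(chosen, outputs):
        for d in range(len(token)):
            mixed[d] += (probs[i] / norm) * out[d]
    return mixed
```

Scaled to M2.5's reported numbers, this is the mechanism that lets a 230B-parameter model pay the compute cost of only ~10B parameters per token.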
Key Benchmarks at a Glance:
- SWE-Bench Verified: 80.2% (Surpassing GPT-5.2’s 80.0%)
- Multi-SWE-Bench: Ranked #1 in the industry for multilingual coding.
- Task Speed: 37% faster task completion compared to the previous M2.1.
- Inference Speed: Up to 100 tokens per second (TPS).
“Intelligence Too Cheap to Meter”
The most disruptive aspect of the M2.5 launch is its aggressive pricing strategy. MiniMax is marketing the model as the first frontier-level AI where “cost is no longer a concern.”
| Feature | M2.5 Lightning | M2.5 Standard |
| --- | --- | --- |
| Output Speed | 100 tokens/sec | 50 tokens/sec |
| Input Price (per 1M tokens) | $0.30 | $0.15 |
| Output Price (per 1M tokens) | $2.40 | $1.20 |
| Hourly Operating Cost | ~$1.00 | ~$0.30 |
For developers, this means running a complex coding agent 24/7 for a year could fit within a $10,000 budget—a task that would cost upwards of $100,000 using other frontier models.
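The arithmetic behind that claim is easy to check against the hourly rates in the table above:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours of continuous operation

def annual_cost(hourly_rate):
    """Back-of-envelope yearly cost of an always-on agent at a flat hourly rate."""
    return hourly_rate * HOURS_PER_YEAR

lightning_year = annual_cost(1.00)  # M2.5 Lightning at ~$1.00/hour -> $8,760
standard_year = annual_cost(0.30)   # M2.5 Standard at ~$0.30/hour -> $2,628
```

Even the faster Lightning tier stays under the $10,000 figure for a full year of 24/7 operation.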
Native “Spec Behavior” and Agentic Reasoning
Beyond raw speed, M2.5 introduces what MiniMax calls “Native Spec Behavior.” Unlike standard LLMs that jump straight into code generation, M2.5 is trained via large-scale Reinforcement Learning (RL) to think like a human architect. It proactively creates specifications, defines project structures, and plans UI designs before writing a single line of code.
This "think-before-you-act" approach has led to a 20% reduction in tool-call rounds in multi-step agentic workflows. Whether it's managing a full-stack development lifecycle or performing complex financial modeling in Excel, M2.5 demonstrates a maturity in decision-making that rivals the "thinking" models from OpenAI and Anthropic.
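As a hypothetical sketch of what a spec-first workflow looks like from the application side, the loop below stages planning before implementation. The phase prompts and the `call_model` helper are placeholders, not MiniMax's actual agent interface:

```python
def run_spec_first_agent(task, call_model):
    """Plan first, then implement: a toy 'think-before-you-act' pipeline.

    call_model is any callable that takes a prompt string and returns the
    model's text response (injected so the sketch stays backend-agnostic).
    """
    # Phase 1: write a specification before touching any code.
    spec = call_model(f"Write a concise technical spec for: {task}")
    # Phase 2: turn the spec into a project structure / plan.
    plan = call_model(f"Given this spec, list the files and structure:\n{spec}")
    # Phase 3: only now generate the implementation.
    code = call_model(f"Implement the plan below.\nSpec:\n{spec}\nPlan:\n{plan}")
    return {"spec": spec, "plan": plan, "code": code}
```

Front-loading the spec and plan is what allows later tool calls to be fewer and more targeted, which is the effect the 20% reduction figure describes.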
Availability
MiniMax M2.5 is now fully integrated into the MiniMax Agent platform and is available for developers via the MiniMax Open Platform API. The company has also released the model weights for local deployment, supporting popular frameworks like vLLM and SGLang.
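Assuming the Open Platform exposes an OpenAI-compatible chat endpoint (an assumption, as is the `"MiniMax-M2.5"` model identifier below), a minimal request payload might look like:

```python
import json

# Hypothetical request body for an OpenAI-compatible chat completions
# endpoint; the model name and field layout are assumptions, not
# confirmed details of the MiniMax Open Platform API.
payload = {
    "model": "MiniMax-M2.5",
    "messages": [
        {"role": "user", "content": "Refactor this function for clarity."}
    ],
    "stream": True,  # receive tokens as they are generated
}
body = json.dumps(payload)
```

The same payload shape would also work against a locally hosted copy of the released weights served through vLLM's or SGLang's OpenAI-compatible servers.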


