OpenAI has officially shifted the paradigm of AI development from “batch processing” to “real-time collaboration” with the release of GPT-5.3-Codex-Spark. Launched on February 12, 2026, Spark is a streamlined version of the powerhouse GPT-5.3-Codex, specifically optimized for ultra-low latency and interactive engineering workflows.
The model marks the first major milestone of OpenAI's $10 billion partnership with Cerebras Systems, utilizing specialized hardware to achieve speeds that feel near-instantaneous.
Built for Speed: The Cerebras Advantage
While standard frontier models are hosted on massive GPU clusters optimized for raw throughput, GPT-5.3-Codex-Spark runs on the Cerebras Wafer Scale Engine 3 (WSE-3). This dinner-plate-sized chip enables a “latency-first” serving tier, allowing developers to see code generated as fast as they can think.
| Metric | GPT-5.3-Codex | GPT-5.3-Codex-Spark |
| --- | --- | --- |
| Tokens per second | ~100-150 TPS | 1,000+ TPS |
| Time-to-first-token | Standard | 50% faster |
| Generation speed | 1x | 15x faster |
| Ideal use case | Long-horizon agents | Real-time pairing/UI tweaks |
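A quick back-of-envelope calculation shows what the raw throughput figures in the table mean in wall-clock terms. The 2,000-token completion size is an arbitrary illustration, not a published benchmark, and the calculation ignores time-to-first-token:

```python
# Rough wall-clock time to stream a completion at a steady decode rate.
# TPS figures come from the comparison table; completion size is illustrative.

def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a constant decode rate, ignoring TTFT."""
    return num_tokens / tokens_per_second

completion_tokens = 2_000  # a mid-sized code edit

codex_time = generation_seconds(completion_tokens, 125)    # midpoint of ~100-150 TPS
spark_time = generation_seconds(completion_tokens, 1_000)  # conservative 1,000+ TPS

print(f"GPT-5.3-Codex: {codex_time:.0f}s, Spark: {spark_time:.0f}s")
# 2,000 tokens take ~16s at 125 TPS versus 2s at 1,000 TPS
```

On top of the raw decode speedup, the 50% faster time-to-first-token shortens the perceived wait before any output appears at all.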
The “Spark” Philosophy: Interactive Over Autonomous
Unlike its larger sibling designed for autonomous tasks that can run for hours, Spark is a conversational partner. It is tuned for "targeted editing": making small, surgical changes to logic or interfaces without the heavy overhead of rewriting entire files.
Key Technical Enhancements:
- Persistent WebSockets: Spark uses a continuous connection by default, reducing client-server roundtrip overhead by 80%.
- Lightweight Defaults: By default, Spark performs minimal edits and skips automatic test runs to prioritize speed, only executing complex validations when explicitly requested.
- 128k Context Window: Despite its smaller size, it retains a massive context window to keep entire project structures in active memory.
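The enhancements above can be pictured as a single request profile. The field names and values in this sketch are hypothetical, invented purely for illustration; OpenAI has not published Spark's actual configuration surface or wire format:

```python
import json

# Hypothetical session profile for Spark. Every field name below is an
# assumption for illustration, not a documented OpenAI API.
def spark_session_config(run_tests: bool = False) -> dict:
    return {
        "model": "gpt-5.3-codex-spark",
        "transport": "websocket",       # persistent connection reused across turns
        "edit_mode": "targeted",        # small, surgical edits rather than rewrites
        "run_tests": run_tests,         # skipped by default to prioritize speed
        "max_context_tokens": 128_000,  # the 128k window noted above
    }

# Heavier validation is opted into only when explicitly requested:
print(json.dumps(spark_session_config(run_tests=True), indent=2))
```

The design choice reflected here is that every default favors latency, and anything expensive (test runs, full-file rewrites) becomes an explicit opt-in.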
Benchmarks: Speed vs. Precision
OpenAI acknowledges that Spark is a “research preview” that trades some reasoning depth for extreme velocity. On Terminal-Bench 2.0, which tests agentic command-line proficiency, Spark scored 58.4%, trailing the full GPT-5.3-Codex (77.3%) but handily beating older models like GPT-5.1-Codex-mini.
However, the real-world value lies in the “Human-in-the-loop” speed. Tasks that take the full-fat Codex 15 minutes to plan and execute can be iterated upon by a developer using Spark in under 3 minutes.
Availability & Access
GPT-5.3-Codex-Spark is rolling out as a research preview across the following platforms:
- ChatGPT Pro: Available today for $200/mo subscribers.
- Codex App & Extensions: Integrated natively into the VS Code extension and the Codex CLI.
- Windsurf: Available through the “Fast Arena” and “Hybrid Arena” battle groups.
- API: Limited access for design partners, with a broader rollout expected in late February.
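For design partners who do get API access, output arriving at 1,000+ TPS is consumed incrementally rather than in one response. The helper below sketches one way to accumulate streamed text deltas; the chunk shape mirrors common streaming-API conventions and is an assumption, since Spark's API schema is not yet public:

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class StreamChunk:
    """One streamed delta. The field layout is an assumption for illustration;
    Spark's real chunk schema has not been published."""
    delta: str

def collect_stream(chunks: Iterable[StreamChunk]) -> str:
    """Accumulate streamed deltas into the full completion text."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.delta)  # a real client would render each delta immediately
    return "".join(parts)

# Simulated stream standing in for a live connection:
demo = [StreamChunk("def add(a, b):\n"), StreamChunk("    return a + b\n")]
print(collect_stream(demo))
```

In an interactive editor, each delta would be rendered as it arrives, which is what makes the generation feel instantaneous at these token rates.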
"Spark is the first step toward a Codex that works in two modes: real-time collaboration when you want to iterate, and deep reasoning when you want to delegate." (OpenAI Engineering Blog)


