In a major move to solidify its lead in autonomous software engineering, OpenAI officially launched GPT-5.3-Codex on February 5, 2026.
The release, which coincided with the launch of Anthropic’s Claude Opus 4.6, marks a fundamental shift for OpenAI from “coding assistants” to full-scale “computer-using agents.” The company claims this is the first model that was instrumental in building itself, having been used to debug its own training runs and manage its deployment.
1. Key Features: Beyond Just Code
GPT-5.3-Codex expands the definition of a “coding model” by integrating broad professional knowledge and advanced computer-use capabilities.
- 25% Speed Boost: The model runs roughly 25% faster than its predecessor (GPT-5.2-Codex), which makes “long-horizon” tasks lasting hours or even days more practical.
- Mid-Turn Steering: Users can course-correct the agent in real time, reading the model’s “thinking traces” and giving new instructions without interrupting the process or losing context.
- Computer Use (OSWorld): The model is optimized for desktop automation, meaning it can navigate an operating system, use a mouse and keyboard, and interact with non-coding software like spreadsheets and slide decks.
- Professional Lifecycle: It is designed to handle the entire software lifecycle, from writing Product Requirement Documents (PRDs) and designing architecture to monitoring deployments and fixing production bugs.
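To make the mid-turn steering idea concrete, here is a minimal conceptual sketch. It does not reflect how Codex implements steering. The function `run_agent`, the `steering` mapping, and the step names are all hypothetical, chosen only to show an instruction being folded into a running context without restarting the task.

```python
from collections import deque


def run_agent(steps, steering):
    """Run steps in order, folding in user instructions between steps.

    `steps` is a list of step descriptions. `steering` maps a step index to
    an instruction the user sends just before that step. Both are
    hypothetical inputs used for illustration only.
    """
    context = []  # shared history; never reset when an instruction arrives
    pending = deque(steps)
    index = 0
    while pending:
        note = steering.get(index)
        if note is not None:
            # The instruction is appended to the existing context, so
            # earlier work stays visible to later steps.
            context.append(f"user: {note}")
        context.append(f"agent: {pending.popleft()}")
        index += 1
    return context


history = run_agent(
    ["read failing test", "patch parser", "run test suite"],
    {1: "keep the public API unchanged"},
)
for line in history:
    print(line)
```

In this toy version the user's note lands between the first and second steps, and the remaining steps continue with the full history intact.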

2. Performance Benchmarks: A New Industry High
OpenAI reported that GPT-5.3-Codex has set new records across specialized agentic and engineering benchmarks.
| Benchmark | GPT-5.3-Codex Score | Significance |
| --- | --- | --- |
| SWE-Bench Pro | 78.2% | Measures multi-language software engineering. |
| Terminal-Bench 2.0 | 77.3% | Evaluates terminal/shell automation skills. |
| OSWorld-Verified | 64.7% | Measures ability to use a desktop environment. |
| GDPval-AA | 1606 (Elo) | Measures high-level economic and professional reasoning. |
Note: While GPT-5.3-Codex dominates in terminal and computer use, early tests suggest Claude Opus 4.6 still holds a slight lead in “graduate-level” scientific reasoning (GPQA Diamond).
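For readers unfamiliar with Elo-style scores such as the GDPval-AA figure in the table, the sketch below shows how a rating gap translates into an expected head-to-head win rate. It assumes the standard logistic Elo formula with a 400-point scale. The article does not state how GDPval-AA ratings are computed, and the opponent rating of 1506 is a made-up value for illustration.

```python
def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# 1606 is the score reported in the table; 1506 is a hypothetical opponent.
p = elo_expected_score(1606, 1506)
print(f"Expected win rate with a 100-point lead: {p:.2f}")
```

Under this assumption, a 100-point lead corresponds to winning roughly 64% of pairwise comparisons, and equal ratings correspond to 50%.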
3. The “Cybersecurity” Safety Stack
For the first time under its Preparedness Framework, OpenAI has classified a model as “High capability” for cybersecurity. As a result, GPT-5.3-Codex is launching with the most stringent safeguards in the company’s history.
- Aardvark Security Agent: A private beta tool released alongside the model that specifically focuses on finding and fixing software vulnerabilities.
- Trusted Access for Cyber: A pilot program that grants $10 million in API credits to vetted researchers working on cyber-defense and open-source security.
- Staged Access: The model is available now to paying ChatGPT Plus, Team, and Enterprise users, but full API access is restricted until additional safety gates are cleared.
4. Infrastructure: The NVIDIA NVL72 Powerhouse
The launch also highlighted a deepening partnership between OpenAI and NVIDIA.
- Hardware: GPT-5.3-Codex was co-designed for and trained on the NVIDIA GB200 NVL72 systems.
- Efficiency: This infrastructure is responsible for the 25% increase in inference speed, making the agent feel more like a real-time “colleague” than a batch processor.
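A quick worked calculation helps put the 25% figure in perspective. The sketch below assumes "25% faster" means 25% higher throughput on a fixed workload, which the article does not specify. The one-hour baseline is an illustrative value, not a measured one.

```python
def time_after_speedup(baseline_seconds, speedup_fraction):
    """Wall-clock time for the same work after a throughput increase."""
    return baseline_seconds / (1.0 + speedup_fraction)


baseline = 3600.0  # hypothetical one-hour task on the older model
new_time = time_after_speedup(baseline, 0.25)
saved_fraction = 1.0 - new_time / baseline
print(f"New duration: {new_time:.0f} s, time saved: {saved_fraction:.0%}")
```

Under that reading, a 25% throughput gain shortens a fixed task by about 20%, not 25%, so an hour-long job would take about 48 minutes.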
Conclusion: The End of “Copy-Paste” Coding
With GPT-5.3-Codex, OpenAI is moving away from the “snippet” era. By providing a model that can think, act, and be steered while it works, the company is positioning the Codex app (released earlier this week) as the primary “cockpit” for the modern professional.