OpenAI Says GPT-5.5-Cyber Beats Anthropic’s Mythos 5 on a Security Benchmark

OpenAI has made a new AI model. It is called GPT-5.5-Cyber. An AI model is a computer program that learns to do a task. This one is built to find and fix security holes in software. OpenAI says it did better than a rival model from Anthropic, called Mythos 5. The claim rests on a benchmark. A benchmark is a standard test used to compare AI models on the same job.

The news came out on June 23, 2026. It was reported by a tech site called The Decoder. OpenAI is the company that made ChatGPT. Anthropic is the company that made the Claude AI models. Both want to build AI that can guard computers from hackers. This new model pushes that race forward.

What is GPT-5.5-Cyber?

GPT-5.5-Cyber is a special version of OpenAI’s GPT-5.5 model. It is trained for one main job: cybersecurity. Cybersecurity means keeping computers and data safe from attacks. The model can find weak spots in code. It can suggest fixes, called patches. It can also check that those fixes really work.

A weak spot in code is called a vulnerability. Hackers look for these gaps to break in. OpenAI says GPT-5.5-Cyber can do the whole job. It finds the gap. It writes a patch, which is a small code fix. Then it tests the patch to make sure it holds.
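To make the three steps concrete, here is a small Python sketch. It is only a toy example and not OpenAI’s method. The folder name, the function names, and the choice of flaw (a "path traversal" bug, where a user-supplied file name climbs out of the folder it should stay in) are all assumptions made for illustration.

```python
import posixpath

BASE_DIR = "/srv/files"  # hypothetical folder the app is allowed to read


def inside_base(path: str) -> bool:
    """True if path is BASE_DIR itself or sits underneath it."""
    return path == BASE_DIR or path.startswith(BASE_DIR + "/")


def resolve_unsafe(name: str) -> str:
    """Step 1, the gap: user input is joined to BASE_DIR with no check."""
    return posixpath.normpath(posixpath.join(BASE_DIR, name))


def resolve_patched(name: str) -> str:
    """Step 2, the patch: reject any result that leaves BASE_DIR."""
    path = posixpath.normpath(posixpath.join(BASE_DIR, name))
    if not inside_base(path):
        raise ValueError(f"path escapes {BASE_DIR}: {name!r}")
    return path


def escapes_base(resolver, name: str) -> bool:
    """Step 3, the test: does this resolver hand back an outside path?"""
    try:
        return not inside_base(resolver(name))
    except ValueError:
        return False  # the resolver refused the input, so nothing escaped


attack = "../../etc/passwd"
print("unsafe version escapes: ", escapes_base(resolve_unsafe, attack))
print("patched version escapes:", escapes_base(resolve_patched, attack))
print("normal file still works:", resolve_patched("report.txt"))
```

The same test is run against both versions. The unsafe one lets the attack through, the patched one blocks it, and a normal file name still works. A real patch check would need many more test inputs than this.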

How the benchmark scores compare

OpenAI tested the model on three security tests. The main one is called CyberGym. It checks if an AI can reproduce known software flaws inside a safe test space. On CyberGym, GPT-5.5-Cyber scored 85.6 percent. Anthropic’s Mythos 5 scored 83.8 percent. So OpenAI’s model won by 1.8 percentage points.

The other two tests are ExploitGym and SEC-bench Pro. ExploitGym checks if the AI can turn a weak spot into a real working attack. This shows defenders how risky the gap is. SEC-bench Pro checks if the AI can find brand new flaws over a longer time. OpenAI only shared Mythos 5’s score for CyberGym. So we cannot compare the two on the other two tests.

Key facts

Item	Detail
Model	GPT-5.5-Cyber (by OpenAI)
Main rival	Mythos 5 (by Anthropic)
Announced	June 23, 2026
CyberGym score	85.6% (vs Mythos 5 at 83.8%)
ExploitGym score	39.5%
SEC-bench Pro score	69.8%
Commits scanned	Over 30 million across 30,000+ codebases
Findings marked as fixed	Over 500,000 (70,000 manually confirmed)
Who can access	Verified defenders only

Benchmarks and specs: GPT-5.5-Cyber vs rivals

Here are the scores across all three tests. Only the numbers OpenAI shared are listed. A dash means no score was given for that model. Higher percentages are better.

Model	CyberGym	ExploitGym	SEC-bench Pro
GPT-5.5-Cyber	85.6%	39.5%	69.8%
Mythos 5 (Anthropic)	83.8%	–	–
GPT-5.5	81.8%	25.95%	63.1%
GPT-5.4	79.0%	–	–
Claude Opus 4 (Anthropic)	73.1%	–	–

What it means: GPT-5.5-Cyber wins on the one test where every model has a score. It also clearly beats OpenAI’s own older models. But its lead over Mythos 5 is small. So the two top models are very close.
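The size of each gap can be checked with a few lines of Python. This is a small sketch that uses only the CyberGym scores listed in the table above. The helper name lead_over is made up for this example.

```python
# CyberGym scores in percent, copied from the table above.
cybergym = {
    "GPT-5.5-Cyber": 85.6,
    "Mythos 5": 83.8,
    "GPT-5.5": 81.8,
    "GPT-5.4": 79.0,
    "Claude Opus 4": 73.1,
}


def lead_over(scores: dict, leader: str) -> dict:
    """Return the leader's margin over every other model, in percentage points."""
    top = scores[leader]
    return {
        model: round(top - score, 1)
        for model, score in scores.items()
        if model != leader
    }


leads = lead_over(cybergym, "GPT-5.5-Cyber")
for model, gap in leads.items():
    print(f"{model}: {gap:.1f} points behind GPT-5.5-Cyber")
```

The gap to Mythos 5 comes out at 1.8 points, while the gap to Claude Opus 4 is 12.5 points. These are differences in percentage points, not relative improvements.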

Built for defenders, not attackers

A tool that finds security holes could be used by bad people too. So OpenAI lets only “verified defenders” use it. That means the company checks who you are first. Access comes with identity checks, monitoring, and guardrails. Guardrails are safety limits that block harmful use.

OpenAI also updated a tool called Codex Security. It first came out as a preview in March. Since then, OpenAI says it has scanned over 30 million commits. A commit is a single saved change to a software project. These came from more than 30,000 codebases. The tool marked over 500,000 findings as fixed. People checked 70,000 of them by hand.
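A quick bit of arithmetic puts these numbers in scale. The sketch below treats the "over" figures as round floor values, which is an assumption. The results are rough estimates only, and the variable names are made up for this example.

```python
# Figures as reported; the "over ..." values are treated as floors (assumption).
commits_scanned = 30_000_000
findings_marked_fixed = 500_000
confirmed_by_hand = 70_000

# Share of fixed findings that a person checked by hand.
confirmed_share = confirmed_by_hand / findings_marked_fixed

# Roughly how many scanned commits there were per finding marked as fixed.
commits_per_fixed_finding = commits_scanned / findings_marked_fixed

print(f"Checked by hand: about {confirmed_share:.0%} of fixed findings")
print(f"About one fixed finding per {commits_per_fixed_finding:.0f} commits scanned")
```

On these figures, people checked about 14 percent of the fixed findings by hand. The rest were marked as fixed without a manual check being reported.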

Big partners and a global push

OpenAI says it works with more than 25 security firms. These include big names like Cisco, CrowdStrike, Cloudflare, Palo Alto Networks, and IBM. It also works with governments and public agencies. These include Australia, Canada, France, Germany, Japan, South Korea, the UK, and the EU’s cyber agency, ENISA.

There is also a plan called “Patch the Planet.” It works with more than 30 open-source projects. Open-source projects are software whose code is free for anyone to see and use. Many apps are built on top of this shared code. So fixing flaws here helps protect lots of other apps too.

FAQ

What is GPT-5.5-Cyber?

It is a version of OpenAI’s GPT-5.5 model made for cybersecurity. It finds weak spots in software, writes fixes, and checks that the fixes work.

How did it score against Anthropic’s Mythos 5?

On the CyberGym test, GPT-5.5-Cyber scored 85.6 percent. Mythos 5 scored 83.8 percent. OpenAI did not share Mythos 5 scores for the other two tests.

Can anyone use it?

No. OpenAI says only “verified defenders” can use it. The company checks who you are. Access also comes with monitoring and safety limits.

What are CyberGym, ExploitGym, and SEC-bench Pro?

They are three security tests. CyberGym checks if the AI can reproduce known flaws. ExploitGym checks if it can build a working attack. SEC-bench Pro checks if it can find new flaws over time.

Why it matters (especially for India / founders)

India does a huge share of the world’s software work. Many startups and IT firms here write code every day. AI tools that scan code for security holes could save these teams a lot of time and money. A small team could check its software without hiring a big security staff.

For founders, there is also a lesson about trust. OpenAI keeps this power behind strict checks. AI is getting better at both attack and defense. The firms that build safety in from day one will earn customer trust. That is true whether you build apps, run a fintech, or sell to big companies.

The main point is simple. The AI race is no longer just about chatbots. It is now about who can best defend the digital world. OpenAI says it leads on one big test. But Anthropic is close behind. For users and businesses, stronger and safer security tools are the real prize.

Source: The Decoder

Related coverage

OpenAI launch new initiative to help find and patch open-sourced bugs
