Meta says its next AI matches GPT-5.5 performance

During a recent internal town hall, Meta’s Head of Superintelligence, Alexandr Wang, reportedly told employees that the company’s next flagship model, codenamed “Watermelon,” has caught up to OpenAI’s GPT-5.5 on closely watched industry benchmarks.

The leak offers a fascinating glimpse into Meta’s strategy to bridge the gap with frontier labs, though it also underscores the sheer scale of brute-force computing required to get there.

1. The Compute Scale: Moving Beyond “Avocado”

According to reporting on the internal meeting, Wang framed Watermelon as the direct successor to “Avocado,” which was the internal codename for the Muse Spark model family Meta rolled out in April.

The breakthrough isn’t attributed to a secret architectural shortcut, but rather to an immense injection of raw computing power. Wang stated that Watermelon is currently in training and utilizes an order of magnitude more compute than its predecessor.

 [ META'S INTERNAL MODEL EVOLUTION ]
 
  Avocado (Muse Spark) ──► Released April 2026 ──► Efficient production-tier performer
                                 │
                                 ▼ (10x Compute Injection)
  Watermelon           ──► In Training (July 2026) ──► Reportedly hitting GPT-5.5 benchmark parity

2. Realities of the Benchmark Claim

While the milestone is being celebrated internally as validation for Meta’s aggressive talent blitz, the claim comes with a few major real-world caveats:

Unnamed Evaluations: The specific benchmarks where Watermelon achieved parity with GPT-5.5 were not disclosed during the town hall.
A Moving Target: Reaching parity with GPT-5.5 means Meta is catching up to the flagship OpenAI shipped in April. However, OpenAI has already begun moving the goalposts, having recently introduced a limited preview of its next-tier GPT-5.6 model.
The Coding Timeline: Despite the benchmark parity claim, practitioners are still waiting for stronger coding capabilities. When asked on social media when a Meta model would truly compete with Anthropic’s Claude Opus on heavy software engineering tasks, Wang said it would happen “pretty soon.”

3. The Double-Edged Town Hall

The hype surrounding Watermelon provided a stark contrast to comments made minutes earlier in the exact same room by CEO Mark Zuckerberg.

As previously leaked, Zuckerberg used the first half of the town hall to candidly admit that Meta’s development of actual autonomous AI agents has been significantly slower than expected over the past four months. He noted that the massive organizational reshuffle that shifted 7,000 workers into AI units has “not been as clean” as hoped and has yet to bear major fruit.

The Hardware Hype (Watermelon)	The Software Reality Check (AI Agents)
Pumping Massive Capex: Meta is on track to spend $125B to $145B on infrastructure in 2026 to power these giant training runs.	Friction at the User Layer: Building giant base models on paper is moving faster than translating them into reliable consumer agents for Instagram, WhatsApp, or business automation.
Benchmark Victory: Proving that Meta can match closed-source giants like OpenAI when it scales its cluster sizes.	Timeline Delays: Zuckerberg estimates it will take another 3 to 6 months before the company realizes tangible consumer benefits from its massive restructuring.

Ultimately, the Watermelon disclosure suggests that Meta’s massive data center investments can match the raw intelligence of premium APIs, at least on paper and on benchmarks that remain unnamed. However, as the company prepares to eventually open-source the fruits of this heavy training run, the true test will be whether Meta can turn that raw benchmark score into fluid, everyday tools that ordinary users actually want to use.

