A major report published yesterday by The New York Times, in collaboration with the AI startup Oumi, found that Google’s AI Overviews currently carry an error rate of approximately 9% to 10%.
The study used the SimpleQA benchmark—a rigorous test of over 4,000 verifiable factual questions—to measure the accuracy of the summaries that now appear at the top of nearly 50% of all Google searches.
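For context on the methodology, the sketch below shows how accuracy on a SimpleQA-style benchmark is typically computed: each model answer is graded against a known reference answer, and accuracy is simply the share graded correct. The questions, reference answers, and the naive string-match grader here are illustrative assumptions, not the actual SimpleQA data or grading harness.

```python
# Illustrative sketch of SimpleQA-style accuracy scoring.
# The sample questions and the naive string-match grader below are
# assumptions for demonstration; the real benchmark uses 4,000+ curated
# questions and a more robust grading step.

def normalize(text: str) -> str:
    """Lowercase and strip punctuation for a crude comparison."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).strip()

def grade(model_answer: str, reference: str) -> bool:
    """Naive grader: correct if the reference answer appears in the model answer."""
    return normalize(reference) in normalize(model_answer)

def accuracy(predictions: list[tuple[str, str]]) -> float:
    """Fraction of (model_answer, reference) pairs graded correct."""
    correct = sum(grade(ans, ref) for ans, ref in predictions)
    return correct / len(predictions)

# Hypothetical examples, including the wrong-year pattern discussed below.
sample = [
    ("The museum opened in 1987.", "1986"),                          # wrong year -> incorrect
    ("Water boils at 100 degrees Celsius.", "100 degrees Celsius"),  # correct
]
print(f"Accuracy: {accuracy(sample):.0%}")  # -> Accuracy: 50%
```

Reported accuracy rates such as the 85% and 91% figures in the table below are the same ratio, computed over the full question set.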
1. The Accuracy Gap: Gemini 2.5 vs. Gemini 3
The report highlights that while Google’s accuracy is improving with newer models, the “confidently wrong” nature of AI answers still poses a problem of massive scale.
| Model Version | Accuracy Rate | Error Rate | Status |
|---|---|---|---|
| Gemini 2.5 | 85% | 15% | 2025 Standard |
| Gemini 3 | 91% | 9% | Current (April 2026) |
While a 91% accuracy rate sounds impressive, analysts warn that with Google handling over 5 trillion searches per year, even a 9% error rate results in tens of millions of incorrect answers every single day.
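To put that scale in perspective, here is a rough back-of-the-envelope calculation using the figures quoted in this article (5 trillion searches per year, AI Overviews on roughly half of them, a 9% error rate). The share of those queries that pose a checkable factual question is an assumption, so the result is an order-of-magnitude illustration rather than a measured figure.

```python
# Back-of-the-envelope estimate of daily incorrect AI Overview answers.
# Figures marked "reported" come from the article above; the factual-query
# share is an assumption for illustration only.

SEARCHES_PER_YEAR = 5e12   # reported: over 5 trillion searches per year
OVERVIEW_SHARE = 0.50      # reported: AI Overviews on ~50% of searches
ERROR_RATE = 0.09          # reported: ~9% error rate (Gemini 3 era)

searches_per_day = SEARCHES_PER_YEAR / 365           # ~13.7 billion
overviews_per_day = searches_per_day * OVERVIEW_SHARE

# Assumed: only some overview queries pose a verifiable factual question
# of the kind SimpleQA tests. Try a few plausible shares.
for factual_share in (0.05, 0.10, 0.25):
    wrong_per_day = overviews_per_day * factual_share * ERROR_RATE
    print(f"factual share {factual_share:>4.0%}: "
          f"~{wrong_per_day / 1e6:,.0f} million wrong answers per day")
```

Even under conservative assumptions about how many queries are factual, the daily error count lands in the tens of millions.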
2. High-Profile “Hallucinations”
The NYT and Oumi analysis identified several specific instances where AI Overviews provided false information as absolute fact:
- Bob Marley Museum: When asked when Marley’s home became a museum, Google answered 1987. The correct year is 1986. The AI cited multiple sources, but none supported the 1987 claim.
- Yo-Yo Ma: For a query about his induction into the “Classical Music Hall of Fame,” Google linked to the organization’s site but claimed the hall of fame did not exist.
- Dick Drago: The AI correctly stated the baseball player’s age at death but provided a completely fabricated date of death.
3. The “Ungrounded” Citation Problem
Perhaps more troubling than the outright falsehoods is the rise of “ungrounded” responses.
- Definition: These are answers that are factually correct but link to sources that either do not contain the information or actively contradict it (a minimal sketch of such a grounding check follows this list).
- The Trend: In October 2025, 37% of correct AI answers were ungrounded. By February 2026, that figure rose to 56%. This makes it nearly impossible for users to verify the information without doing a manual search.
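To make “ungrounded” concrete, the sketch below shows the simplest possible grounding check: does the text of any cited source actually contain the key fact the answer asserts? The function and sample data are hypothetical, and a plain substring test will miss paraphrased support; the NYT/Oumi study’s actual methodology is more involved.

```python
# Minimal, illustrative grounding check: an answer is "grounded" only if
# at least one cited source contains the key fact it asserts.
# This naive substring test is an assumption for demonstration; it misses
# paraphrased support and is not the study's actual methodology.

def is_grounded(key_fact: str, cited_sources: list[str]) -> bool:
    """Return True if any cited source text mentions the key fact."""
    fact = key_fact.lower()
    return any(fact in source.lower() for source in cited_sources)

# Hypothetical example echoing the Bob Marley Museum case: an answer can
# be correct yet still ungrounded if no cited page supports it.
answer_fact = "1986"
sources = [
    "The Bob Marley Museum is located at 56 Hope Road in Kingston.",
    "Marley's former home was later converted into a museum.",
]
print("grounded" if is_grounded(answer_fact, sources) else "ungrounded")  # -> ungrounded
```

Note that an answer can pass a factual-accuracy check and still fail a grounding check like this one, which is exactly the gap the 37% to 56% trend describes.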
4. Google’s Rebuttal: “Flawed Benchmarks”
Google spokesperson Ned Adriance has pushed back against the findings, stating that the study has “serious holes” and does not reflect how people actually use Search.
- Flawed Data: Google claims the SimpleQA benchmark itself contains inaccurate data and that it uses a more strictly validated internal tool called “SimpleQA Verified.”
- Dynamic Results: Google notes that AI Overviews use different models (some smaller/faster, some larger/smarter) depending on the query, making a single accuracy score misleading.
- Safety First: The company maintains that its safety systems filter out the most dangerous hallucinations, particularly in “Your Money or Your Life” (YMYL) categories like health and finance.
5. Why It Matters: The “Zero-Click” Risk
Critics argue that the 9% to 10% error rate is more dangerous now than in previous years because AI Overviews have led to a “Zero-Click” search culture.
- Trust Factor: Unlike the old “ten blue links” where users compared sources, AI Overviews present a single, authoritative summary.
- Organic Drop: Organic click-through rates (CTR) have plummeted by 61% for queries featuring an AI Overview; when a summary is wrong, users are therefore roughly 60% less likely to click a link and discover the error.
“Google is trading the reliability of the open web for the convenience of a summary,” noted an analyst from Ars Technica. “When 1 in 10 answers is wrong, ‘convenience’ becomes a liability.”
