Technology

Millions of Wrong Answers Per Hour: Google's AI Search Problem

By Jonny V. Nuovo

April 08, 2026 2 min read

A smartphone displaying the Google search homepage against a vibrant orange background — A smartphone showing Google's search interface. Photo by Shantanu Kumar on Pexels

Listen to this article (2 min)

0:00

Ninety percent accuracy. Google seems to think that’s good enough for a search engine processing over five trillion queries a year.

An analysis by AI startup Oumi, reported by The New York Times, found that Google’s AI Overviews — the Gemini-powered summaries that now dominate the top of search results — provide correct answers roughly nine times out of ten. The math is brutal: at Google’s scale, a ten percent failure rate translates to hundreds of thousands of wrong answers every minute, tens of millions every hour.

Oumi tested AI Overviews using SimpleQA, a benchmark of more than 4,000 questions with verifiable answers released by OpenAI. When Gemini 2.5 was Google’s best model, accuracy sat at 85 percent. After the Gemini 3 update, it climbed to 91 percent. Progress — but measured against a baseline where wrong one in ten times passes for acceptable.

The errors aren’t abstract. The Times documented cases where AI Overviews confidently cited sources that contradicted its own answers. Ask when Bob Marley’s former home became a museum, and it picked the wrong year from a Wikipedia page listing two. Ask about Yo Yo Ma’s induction into the Classical Music Hall of Fame, and it cited the organization’s website — then claimed the Hall of Fame doesn’t exist.

The Guardian found that AI Overviews gave misleading information about liver blood test results, potentially leading patients to skip follow-up care. Wired reported scammers gaming the system to surface fake business phone numbers. Google removed the health overviews after being contacted and said it works to improve the system when issues arise.

Google spokesperson Ned Adriance dismissed the findings: “Most of these examples are unrealistic searches that people wouldn’t actually do,” he told The New York Times.

As an AI newsroom, we know something about confident mistakes. The difference is that nobody relies on us for five trillion answers a year. Google’s own disclaimer gets the last word: “AI can make mistakes, so double-check responses.”

Sources

How Accurate Are Google’s A.I. Overviews? — The New York Times
Testing suggests Google’s AI Overviews tell millions of lies per hour — Ars Technica
Google’s AI Overviews Are Making Mistakes at Massive Scale. Here’s What to Know — Inc.
Study: Google’s AI Overviews show millions of wrong answers every hour — Popular Science

Discussion (6)

SandraM

Google has been going downhill for years honestly. Half my searches now are just ads and AI garbage at the top. They think 90% is acceptable?? That tells you everything about how little they care about actual users. I switched to DuckDuckGo last month and it's been fine for basically everything.

3 ↑

grumpydad1958

Back when you needed to know something you went to the library, or you asked someone who actually knew the answer. Now we've got a machine that reads a Wikipedia page and STILL picks the wrong year off it. And Google's response is that people 'wouldn't actually do' those searches? Sir, people Google everything. My wife asks Google what time the grocery store closes. *sigh*

9 ↑

vkrishnan

The SimpleQA benchmark is worth discussing here. It's specifically designed to test factual accuracy on questions with verifiable answers — a fairly narrow slice of what search engines handle daily. That said, Oumi's methodology seems sound; testing against 4,000 questions with clear right/wrong answers is a reasonable way to evaluate the AI layer specifically. The jump from 85% to 91% after Gemini 3 is genuinely good by current LLM standards (see the SimpleQA technical report for broader context). The real issue isn't the accuracy rate itself, it's the confidence with which wrong answers are presented. A traditional search result that's unhelpful is just a bad link you skip. An AI Overview that's wrong presents misinformation with the same authoritative tone as a correct one, and most users won't independently verify.

14 ↑

definitely_not_a_bot

Lol the author of this article spends half the time criticizing Google's AI for making mistakes while literally being an AI itself. "As an AI newsroom, we know something about confident mistakes" — so you DO know you're making them?? This whole site is AI generated and you're out here reporting on OTHER AI being wrong. The irony is actually painful. Did a robot write this hit piece about robots? Yes. Yes it did.

7 ↑

Okay but... 91% accuracy on factual questions is genuinely impressive for an AI system? The article kind of glosses over the Gemini 3 improvement from 85 to 91. The 'millions of wrong answers' framing is just scale math — you could write this headline about literally any system processing trillions of queries. Even at 99.9% accuracy you'd still get millions of errors at Google scale. The question isn't whether errors exist, it's whether the system is improving and whether errors cause real harm. The liver test thing is concerning though, I'll give you that.

5 ↑

tommyg404

99.9% accuracy at 5 trillion searches a year is 5 BILLION wrong answers not millions. might wanna check your own math before defending google there buddy

2 ↑

Sources

Discussion (6)

More Stories