One of these tools confidently cited a 2024 study, complete with a URL that returned a 404 and an author with zero online footprint. The study did not exist. The answer was wrong. The reader (me) almost copied it into a doc.
I ran 30 queries across Perplexity, ChatGPT Search, and Google AI Overview to find out which AI search engine actually deserves your trust. The surprise wasn’t which one was best. It was which one was confidently, fluently, helpfully wrong.
Why every other comparison gets this wrong
Every Perplexity vs ChatGPT Search vs Google AI Overview article you’ll find lines up the same things: pricing tiers, monthly active users, feature checklists. None of that answers the only question that actually matters when you paste an answer into a client email.
The NYT and Ars Technica both ran the numbers in April 2026: Google AI Overview is materially wrong on roughly 9% of factual queries. At billions of impressions per day, that’s millions of bad answers per hour, delivered with the confidence of a top SERP result.
A feature comparison can’t see that. The practitioner question isn’t “which tool has more features.” It’s “which one will quietly lie to me today.” So I stopped reading comparisons and ran the test the comparisons skipped.
How I ran the test
Thirty queries, split across five buckets: factual lookups, comparison questions, current events, technical how-tos, and research synthesis. Real questions I’d actually ask in a workday, not gotchas.
The same prompt, typed verbatim into Perplexity (default model), ChatGPT Search (logged in, browsing on), and Google AI Overview (default Google SERP). Each answer fact-checked against primary sources. Every cited link opened to confirm two things: the page existed, and the page supported the claim. Response times measured from submit to fully rendered answer.
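To keep the dead-link pass repeatable, I scripted it rather than clicking through by hand. Here is a minimal sketch of that step, assuming the citations are dumped into a citations.csv with a url column; the file name, column, and helper function are mine, not any tool’s export format:

```python
import csv
import requests

def check_citation(url: str, timeout: float = 10.0) -> str:
    """Rough verdict for one cited URL: 'ok', 'dead', or 'error'."""
    try:
        # HEAD is cheap; some servers reject it, so fall back to GET on 405.
        resp = requests.head(url, timeout=timeout, allow_redirects=True)
        if resp.status_code == 405:
            resp = requests.get(url, timeout=timeout, allow_redirects=True)
        return "ok" if resp.status_code < 400 else "dead"
    except requests.RequestException:
        return "error"

# citations.csv is a hypothetical export: one row per cited link,
# with at least a 'url' column.
with open("citations.csv", newline="") as f:
    for row in csv.DictReader(f):
        print(row["url"], check_citation(row["url"]))
```

This only automates the first check, that the page exists. Whether a live page actually supports the claim still has to be read by a human.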
Honest limits: one tester, English queries, a single day of testing. Models update weekly and the gap will shift. But one day is enough to expose the pattern, and the pattern was loud.
The accuracy scorecard: 30 queries, three answers each
Headline numbers: Perplexity 92% accurate, ChatGPT Search 87%, Google AI Overview last, and the only tool that fabricated entire sources rather than just misstating details.
Broken down by bucket, the pattern sharpens:

- Factual lookups: all three were close on capitals, conversions, and definitions.
- Current events: Perplexity and ChatGPT Search pulled ahead because they actually pulled live sources; Google AI Overview leaned on cached summaries and got the most recent developments backwards twice.
- Research synthesis (“summarize the 2026 EU AI Act timeline” type questions): Perplexity won decisively on citation quality, with every link landing on a real page that supported the claim.
- Comparison questions: ChatGPT Search produced the cleanest structured answers, worth knowing if you already use ChatGPT’s full feature set for other tasks.
- Technical how-tos: ChatGPT Search edged ahead with multi-step reasoning that didn’t skip a step.
The uncomfortable pattern: Google AI Overview was the fastest and most confident of the three, and most often wrong in ways the reader could not catch, a failure mode that shows up in Google AI’s accuracy track record beyond search, too. That combination of high speed, high confidence, low accuracy, and weak citation surfacing is the most dangerous one for a non-technical user who just wants an answer.
Aggregate percentages are easy to dismiss until you see the actual failures.
The hallucination receipts
One from each tool, exact claims:
Google AI Overview answered a question about a regulatory deadline by citing a 2024 study with a URL that 404’d and an author with no online footprint: no LinkedIn, no Google Scholar profile, no prior work. The source did not exist. The answer above the fold did. This was the fabrication that opened this article.
ChatGPT Search got a headline economic figure right, but attributed it to the wrong year and the wrong report, off by one revision. The number was real. The receipt was wrong. A reader copying that citation into a deck would look defensible and be factually exposed.
Perplexity had the cleanest run, but missed a nuance on a regulation question. The answer was technically true and materially misleading without context. The pages it cited were real and supported the surface claim; they just didn’t carry the qualifier that mattered.
The takeaway isn’t “tool X is bad.” It’s that confidence does not equal correctness, and that testing what’s real against what’s fabricated is a problem that extends beyond search into every AI output you encounter. Only one of these tools makes the source easy to verify in a single click. The other two ask you to take their word for it.
The speed vs. trust tradeoff
Speed numbers from the test: Google AI Overview lands in roughly 1 second, inline in the search results. ChatGPT Search runs 6–9 seconds. Perplexity runs 8–12 seconds, with sources rendering progressively as the answer builds.
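If you want to reproduce the timing numbers rather than trust a stopwatch, the submit-to-rendered interval can be captured in the browser. A rough sketch with Playwright follows; the answer-complete selector and the textarea assumption are placeholders, since each tool’s real DOM differs and changes often:

```python
import time
from playwright.sync_api import sync_playwright

# Hypothetical selector: inspect each tool's page and replace it with
# whatever element only appears once the answer has fully rendered.
ANSWER_DONE_SELECTOR = "[data-testid='answer-complete']"

def time_answer(url: str, query: str) -> float:
    """Seconds from submitting a query to the answer finishing rendering."""
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        page.fill("textarea", query)  # assumes the query box is a textarea
        start = time.perf_counter()
        page.keyboard.press("Enter")  # submit
        page.wait_for_selector(ANSWER_DONE_SELECTOR, timeout=30_000)
        elapsed = time.perf_counter() - start
        browser.close()
        return elapsed

print(time_answer("https://www.perplexity.ai", "capital of Australia"))
```

Waiting on a single selector is an approximation of “fully rendered,” which matters most for Perplexity, where sources stream in progressively.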
Decision rule of thumb: Google AI Overview for throwaway facts you’d instinctively double-check anyway. ChatGPT Search for thinking-with-you tasks, drafts, and multi-step reasoning. Perplexity for anything you’d put in a doc, an email, or a decision; the citation friction is a feature.
On pricing, free tiers cover most daily use. Pay for Perplexity Pro only if you do research weekly. Pay for ChatGPT Plus if you already use it for non-search reasoning work. Don’t pay anything for AI search alone: the gap between free and paid is smaller than the gap between tools.
Practical workflow: start in the fastest tool. The moment the answer would change a decision, escalate to the tool with the strongest citations. One extra click beats one wrong claim.
The verdict
The fabricated source from the hook came from Google AI Overview, the one that runs by default in the search box almost everyone already uses. That’s the part that should bother you.
Clear ranking for the practitioner reader: Perplexity for trust, ChatGPT Search for reasoning, Google AI Overview for speed when the stakes are low and the answer is verifiable at a glance. Re-test quarterly: these models move fast, and so does the gap.
The question isn’t which AI search is best. It’s which one you trust enough not to double-check. Right now, that’s a list of one.