Image Credit: https://unsplash.com/@breebuddy

AI hallucinations gain attention as the problem threatens to get worse

New models often worse at hallucinating, OpenAI data shows.

AI hallucinations are capturing the attention of more CEOs as the use of generative AI continues to rise, with analysis showing a 68% increase in mentions of the issue over the last quarter.

Market intelligence firm AlphaSense found a “surge” in earnings-call mentions of the phenomenon, in which AI models appear to fabricate information outright, particularly among law, finance and tech companies.

The firm’s UK Country Director, Daniel Sanchez-Grant, said the data was a “clear signal” of increasing anxiety about the “accuracy, trust and liability” of AI-generated data and insights.

He said: “Companies racing to deploy AI risk overlooking the consequences of hallucinations and embedded bias. Without grounding models in trusted, domain-specific data and enforcing human oversight, the risks go beyond reputational. They’re regulatory.”

Not everyone is concerned

Mentions of the topic appeared in earnings calls and company documents as early as 2022, and rose 64% year on year between July 2024 and July 2025. The picture is not entirely negative, though: NVIDIA CEO Jensen Huang claimed during a May earnings call that “a lot of people” have been able to get past the “barrier” of hallucination concerns.

AlphaSense’s own AI Research Director Sarah Hoffman also said she sometimes welcomed the “imaginative” ideas presented by AI models as a way to inspire creative thinking, though she advised caution for precision tasks such as drafting legal briefs.

Despite the industry’s best attempts to quell concerns, data shows that newer AI models are often actually worse at hallucinating. In OpenAI’s own testing on a set of 4,000 questions, its o3 and o4-mini models recorded hallucination rates of 0.51 and 0.79 respectively, compared with the 0.44 rate recorded by its earlier o1 model.
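(For context, a benchmark “hallucination rate” like those above is typically just the fraction of graded answers judged incorrect. A minimal Python sketch follows; the exact-match grader is a stand-in for illustration, not OpenAI’s actual evaluation methodology.)

```python
# Illustrative sketch: a hallucination rate as the fraction of answers a
# grader marks wrong. The exact-match grader below is a placeholder, not
# OpenAI's evaluation pipeline.

def hallucination_rate(answers: list[str], references: list[str]) -> float:
    """Fraction of answers that fail a (crude) exact-match check."""
    if len(answers) != len(references):
        raise ValueError("answers and references must align")
    wrong = sum(
        a.strip().lower() != r.strip().lower()
        for a, r in zip(answers, references)
    )
    return wrong / len(answers)

# e.g. 2,040 wrong answers out of 4,000 questions gives a rate of 0.51
answers = ["Paris", "Berlin", "Madrid", "Rome"]
references = ["Paris", "Vienna", "Madrid", "Oslo"]
print(hallucination_rate(answers, references))  # 0.5
```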

A hallucination leaderboard maintained by GenAI company Vectara also lists versions of Google’s lightweight Gemma 1.1 model, Apple’s OpenELM and Meta’s Llama 3.2 among the worst offenders.

In response, new startups have begun promising to tackle the problem, including AI bias and auditing company CTGT and “machine unlearning” platform Hirundo, which raised $8m in seed funding in June.

The use of automated reasoning has also been pushed as a way to limit hallucinations, by encouraging models to evidence the truth of their outputs. Limiting AI model inputs to verified datasets has likewise been shown to improve accuracy, something Google acknowledged last year.
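(As a rough, hypothetical sketch of the “verified dataset” approach: the snippet below answers only from a small vetted corpus and declines when retrieval finds no supporting passage. The corpus, keyword retrieval and overlap threshold are all illustrative, not any vendor’s actual implementation.)

```python
import re

# Hypothetical grounding sketch: the system may only answer from a vetted
# corpus, and refuses when no supporting passage is found, rather than
# letting a model free-associate. All names and thresholds are illustrative.

VERIFIED_DOCS = [
    "Q2 revenue was 1.2bn dollars, up 8 percent year on year.",
    "The company is headquartered in London.",
]

STOPWORDS = {"the", "is", "a", "in", "was", "what", "who", "on", "of"}

def tokens(text: str) -> set[str]:
    """Lowercase word tokens with common stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS

def retrieve(question: str, min_overlap: int = 2) -> str | None:
    """Return the best-matching verified passage, or None if support is weak."""
    q = tokens(question)
    best_doc, best_score = None, 0
    for doc in VERIFIED_DOCS:
        score = len(q & tokens(doc))
        if score > best_score:
            best_doc, best_score = doc, score
    return best_doc if best_score >= min_overlap else None

def grounded_answer(question: str) -> str:
    passage = retrieve(question)
    if passage is None:
        # Refusal is the grounding guarantee: no vetted passage, no claim.
        return "No verified source found; declining to answer."
    # A real system would pass the passage to the model as its only context.
    return f"According to verified data: {passage}"

print(grounded_answer("What was revenue in Q2?"))  # grounded answer
print(grounded_answer("Who is the CFO?"))          # declines
```

The design point is the refusal path: a grounded system’s answer either cites a vetted passage or is withheld, which is what distinguishes this approach from an unconstrained model.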


By Noah Bovenizer / The Stack Reporter

Noah Bovenizer is a reporter with The Stack. He previously covered the transportation sector for Global Data. He has a first-class degree in journalism and has also worked as a newsreader for Gaydio.

(Source: thestack.technology; July 30, 2025; https://tinyurl.com/24psrt4z)
