So-called AI hallucinates, no matter how good its training data –OpenAI 2025-09-18

This is according to research by the creator of ChatGPT, the bot that started the “AI” boom.

Is this what we want in datacenters sucking up our water?

If not, see a previous post for some bills in the Georgia legislature.

https://wwals.net/?p=69394

Gyana Swain, Computerworld, September 18, 2025, OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws,

In a landmark study, OpenAI researchers reveal that large language models will always produce plausible but false outputs, even with perfect data, due to fundamental statistical and computational limits.

OpenAI, the creator of ChatGPT, acknowledged in its own research that large language models will always produce hallucinations due to fundamental mathematical constraints that cannot be solved through better engineering, marking a significant admission from one of the AI industry’s leading companies.

The study, published on September 4 and led by OpenAI researchers Adam Tauman Kalai, Edwin Zhang, and Ofir Nachum alongside Georgia Tech’s Santosh S. Vempala, provided a comprehensive mathematical framework explaining why AI systems must generate plausible but false information even when trained on perfect data.

“Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty,” the researchers wrote in the paper. “Such ‘hallucinations’ persist even in state-of-the-art systems and undermine trust.”
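That analogy amounts to a simple expected-value comparison. Here is a back-of-the-envelope sketch of the incentive, not a formula taken from the OpenAI paper: suppose a benchmark awards one point for a correct answer and nothing for either a wrong answer or “I don’t know,” and the model’s chance of guessing correctly is \(p\). Then

\[
\mathbb{E}[\text{score} \mid \text{guess}] \;=\; p \cdot 1 + (1 - p) \cdot 0 \;=\; p \;>\; 0 \;=\; \mathbb{E}[\text{score} \mid \text{abstain}] \qquad \text{for any } p > 0,
\]

so a confident guess never scores worse than admitting uncertainty, which is the incentive the researchers say pushes models toward plausible but wrong answers.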

The admission carried particular weight given OpenAI’s position as the creator of ChatGPT, which sparked the current AI boom and convinced millions of users and enterprises to adopt generative AI technology.
