LLMs Must Hallucinate to Function
Nov 16, 2025
Imagine that you made a technology that could not really be tested for correctness, whose inner workings you couldn't really explain, whose behavior you definitely couldn't reliably predict, and whose use could cause very serious harm, like people committing suicide.
How seriously do you think people would take you? Do you think they'd line up to pour hundreds of billions of dollars into your product?
Here's the bare, simple truth about LLMs, or what most people know of as "Generative AI" or perhaps just by one of the product's names, ChatGPT or Grok or Claude: LLMs must hallucinate to function.
Two years ago, when ChatGPT was making a big splash and heads of companies were tripping over themselves to deploy "AI initiatives", loudly proclaiming they were reducing headcount or freezing hiring because "AI" could do the job, this is not what anyone was leading with.
But more recently, news organizations are starting to catch up. Here are a couple of recent, notable articles:
"Hallucination is fundamental to how transformer-based language models work. In fact, it’s their greatest asset: this is the method by which language models find links between sometimes disparate concepts. But hallucination can become a curse when language models are applied in domains where the truth matters." Your AI strategy needs mathematical logic
... "domains where the truth matters." That's a big hmm for me. 🤔
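The mechanism the first article describes can be sketched in a few lines. At each step, a language model scores candidate next tokens and samples from the resulting probability distribution; nothing in that loop checks truth, only relative plausibility. The prompt, candidate tokens, and logit values below are invented purely for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token scores after a prompt like
# "The capital of Australia is" (numbers made up for illustration).
candidates = ["Canberra", "Sydney", "Melbourne", "Paris"]
logits = [3.1, 2.6, 1.4, -2.0]

probs = softmax(logits)

# Generation samples from this distribution. There is no truth check:
# the wrong-but-plausible "Sydney" is drawn a sizeable share of the time,
# by exactly the same mechanism that produces the right answer.
sample = random.choices(candidates, weights=probs, k=1)[0]
print(dict(zip(candidates, [round(p, 3) for p in probs])))
```

The same sampling step that lets the model "find links between sometimes disparate concepts" is the one that produces confident falsehoods; you cannot turn one off without the other.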
According to a recent article in the Wall Street Journal, Yann LeCun, formerly head of Meta's FAIR (Fundamental AI Research team), "has been telling anyone who asks that he thinks large language models, or LLMs, are a dead end in the pursuit of computers that can truly outthink humans." He’s Been Right About AI for 40 Years. Now He Thinks Everyone Is Wrong.
But, you know, technology isn't perfect and sometimes things are rough around the edges, so maybe we shouldn't be too hasty in judging this new technology.
That might be a fair point. I'm not sure. Maybe if we knew how big of a problem this might be, we could assess our stance more carefully.
You may have heard of IBM. They're a pretty big company and have done a thing or two with computational technologies over the years. For example, their computer Deep Blue was the first to beat a reigning world chess champion under tournament conditions.
On the IBM website, you can find something called the AI Risk Atlas. To make it easy, I've collected all their content links into a single list to give a sense of the span of dangers they've catalogued in deploying LLMs. You might enjoy browsing:
- Toxic output
- Unexplainable and untraceable actions
- Data poisoning
- Unreliable source attribution
- Mitigation and maintenance
- AI agent compliance
- Function calling hallucination
- Harmful output
- Indirect instructions attack
- Confidential information in data
- Lack of model transparency
- Exploit trust mismatch
- Unrepresentative data
- AI agents' impact on human agency
- Personal information in prompt
- Sharing IP/PI/confidential information with user
- Temporal gap
- Nonconsensual use
- Decision bias
- Lack of testing diversity
- Exposing personal information
- Confidential data in prompt
- Data privacy rights alignment
- Discriminatory actions
- IP information in prompt
- Prompt leaking
- Hallucination
- Legal accountability
- Social hacking attack
- Non-disclosure
- Lack of training data transparency
- Model usage rights restrictions
- Reproducibility
- Specialized tokens attack
- Prompt injection attack
- Incomplete advice
- Lack of system transparency
- Data usage restrictions
- Impact on cultural diversity
- Impact on education: plagiarism
- Personal information in data
- Direct instructions attack
- Improper usage
- Exclusion
- Extraction attack
- Impact on Jobs
- Jailbreaking
- Data acquisition restrictions
- Sharing IP/PI/confidential information with tools
- Prompt priming
- Reidentification
- Overfitting
- Attribute inference attack
- Poor model accuracy
- Data transfer restrictions
- Generated content ownership and IP
- Encoded interactions attack
- Impact on human dignity
- Output bias
- Dangerous use
- Lack of AI agent transparency
- Unexplainable output
- Human exploitation
- AI agents' impact on jobs
- Improper data curation
- Over- or under-reliance on AI agents
- Attack on AI agents’ external resources
- Revealing confidential information
- Lack of domain expertise
- Spreading disinformation
- Data bias
- Uncertain data provenance
- Unrepresentative risk testing
- Redundant actions
- Data usage rights restrictions
- Unauthorized use
- Misaligned actions
- AI agents' impact on environment
- Harmful code generation
- Data contamination
- Incomplete usage definition
- Copyright infringement
- Context overload attack
- Lack of data transparency
- Impact on affected communities
- Improper retraining
- Spreading toxicity
- Inaccessible training data
- Impact on education: bypassing learning
- Introduce data bias
- Accountability of AI agent actions
- Incomplete AI agent evaluation
- Untraceable attribution
- Evasion attack
- Impact on the environment
- Over- or under-reliance
- Incorrect risk testing
- Membership inference attack
I don't know about you, but if someone showed me a list like this about a piece of technology, I'd be concerned about how much oversight it's getting and whether we ought to investigate it a bit further.
At the same time, you might also wonder if we're just stuck with LLMs or whether there might be something else we could try. Turns out, there are quite a few things.