Why LLMs Lie: The Probability Problem

Hallucinations: why they happen and how they can be mitigated.

Jul 07, 2025

Large Language Models (LLMs) are remarkable text generation machines, but they have a fundamental flaw that every user should understand: they hallucinate. They make things up and say them with complete confidence. To truly understand why this happens, we need to examine how these models work.

The Foundation: Models Are Just Text Generation Machines

Foundationally, LLMs are probability-based text generators. At each step, they assign a probability to every word that could come next (strictly speaking, to every token in their vocabulary, which may be a whole word or a piece of one). Then they sample from this probability distribution to choose the next word.

Here's a simple example: if you have the phrase "cat in the," the model might assign a 90% probability to the word "hat" as the next word. But there's still a 10% probability distributed among thousands of other possible words. To make things simple, let’s say "house" gets 5% probability and "bathtub" gets 5%.

This means that (on average) 90 times out of 100, the model would choose "hat" and give you "cat in the hat." But 5 times out of 100, it might choose "house" and say "cat in the house." And 5 times out of 100, it could say "cat in the bathtub."
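Here's a minimal Python sketch of that sampling step, using the toy numbers from the example above. The names `NEXT_WORD_PROBS`, `sample_next_word`, and `sample_counts` are made up for illustration, and a real model spreads probability over tens of thousands of tokens rather than three words.

```python
import random
from collections import Counter

# Toy next-word distribution for the phrase "cat in the" (numbers from the example).
NEXT_WORD_PROBS = {"hat": 0.90, "house": 0.05, "bathtub": 0.05}

def sample_next_word(probs, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = list(probs.values())
    return rng.choices(words, weights=weights, k=1)[0]

def sample_counts(probs, n, seed=0):
    """Sample n next words and count how often each one was chosen."""
    rng = random.Random(seed)
    return Counter(sample_next_word(probs, rng) for _ in range(n))

counts = sample_counts(NEXT_WORD_PROBS, 10_000)
for word, count in counts.most_common():
    print(f"{word}: {count / 10_000:.1%}")
```

Run it and "hat" should come out close to 90% of the time, with "house" and "bathtub" splitting the rest. The low-probability words are rare, but they are never impossible.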

The Probability Problem: How Random Choices Lead to Hallucinations

Here's where hallucinations begin. The model doesn't always choose the most likely word; it samples from the probability distribution. Sometimes, purely by chance, it selects a word with very low probability. Once it makes that choice, it's committed to that path.

If the model randomly chooses "house" instead of "hat," it doesn't know it made an error. It just continues from there: "cat in the house..." and keeps building off of that. The model has no concept of right or wrong; it's just following the probability chain wherever it leads.

It's remarkably similar to those collaborative poetry games we played in school, where each person would write a word or sentence and pass it to the next person. Once someone writes something unexpected, everyone else just rolls with it, building on whatever came before.
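To make the "committed to that path" idea concrete, here's a toy sketch. `TOY_MODEL` is a hypothetical lookup table, not a real language model, and it only looks at the last word, whereas a real LLM conditions on the whole preceding text. The point it illustrates is the same: each new word is chosen given whatever was already written, with no step that goes back and reconsiders.

```python
import random

# Hypothetical toy "model": maps the last word to a next-word distribution.
TOY_MODEL = {
    "the": {"hat": 0.90, "house": 0.05, "bathtub": 0.05},
    "hat": {"came": 1.0},
    "came": {"back": 1.0},
    "house": {"was": 1.0},
    "was": {"asleep": 1.0},
    "bathtub": {"splashed": 1.0},
    "splashed": {"around": 1.0},
}

def generate(prompt, steps, rng):
    """Append up to `steps` words, each sampled given the text so far."""
    words = prompt.split()
    for _ in range(steps):
        probs = TOY_MODEL.get(words[-1])
        if probs is None:  # the toy table has nothing to say after this word
            break
        choices, weights = zip(*probs.items())
        # Once a word is appended it is never revisited; later steps build on it.
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)

rng = random.Random(0)
for _ in range(5):
    print(generate("cat in the", 3, rng))
```

Most runs will print the "hat" sentence, but whenever "house" or "bathtub" gets sampled, the rest of the sentence follows from that word without any complaint.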

The Core Issue: No Fact-Checking Mechanism

The fundamental problem is that LLMs don't actually know what's fact and what's fiction. They're not connected to databases or the web unless specifically augmented to do so. They're just outputting text based on patterns they learned during training.

When the model encounters areas with more ambiguity — where the probabilities are more evenly distributed across many possible words — it's essentially making educated guesses. But the model doesn't think "I don't know where I'm going to go." It just samples a word and sticks with it, then samples the next word based on what it just chose.
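One common way to put a number on "how evenly distributed" a set of probabilities is Shannon entropy. The sketch below compares the peaked "cat in the" distribution with a made-up flat one. Note that this is a measurement taken from the outside: plain sampling never consults it, which is exactly why the model doesn't "notice" that it is guessing.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: higher means probability is spread more evenly."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

confident = [0.90, 0.05, 0.05]        # one clear favorite
ambiguous = [0.25, 0.25, 0.25, 0.25]  # four equally likely options (illustrative)

print(f"confident: {entropy_bits(confident):.2f} bits")
print(f"ambiguous: {entropy_bits(ambiguous):.2f} bits")
```

The flat distribution scores higher. In that situation whichever word gets sampled is close to a coin flip, yet it is written down just as firmly as a 90% favorite would be.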

Extended Discussion: Other Reasons for Model Hallucinations

Probability sampling only scratches the surface. Models hallucinate for several other important reasons as well.

Training Data Issues

Models are only as good as their training data. If the training data contains misinformation, conflicting information, or gaps, the model will reflect those problems. Think of it this way: if you learned everything you know from Wikipedia articles that had errors in them, you'd propagate those errors too. The model has no way of knowing which parts of its training data were accurate and which weren't.

Attention Mechanism Quirks

Models use attention mechanisms to figure out which parts of the input to focus on, but sometimes they focus on the wrong things or miss important context. It's like trying to answer a question while getting distracted by irrelevant details and missing the key information you actually need.
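For a rough feel of how attention can get "distracted," here's a tiny sketch of scaled dot-product attention weights, the scoring scheme used in transformer models. The two-number vectors and their labels are invented for illustration; real models learn vectors with hundreds or thousands of dimensions.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention: score each key against the query, then softmax."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

query = [1.0, 0.0]   # what the model is "looking for" (made-up vector)
keys = [
    [1.0, 0.0],      # the genuinely relevant piece of input
    [0.9, 0.1],      # a distractor that looks similar
    [0.0, 1.0],      # something unrelated
]
weights = attention_weights(query, keys)
print([round(w, 3) for w in weights])
```

The relevant item gets the largest weight, but the similar-looking distractor is close behind it, so a good share of the model's "focus" lands on the wrong thing.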

Context Window Limitations

Models have limited context windows; they can only "remember" a certain amount of text at once. When conversations get really long or when there's a lot of information to process, important details can get pushed out of the context window. The model might start making things up because it literally cannot see the relevant information anymore.
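Here's a deliberately simplified sketch of that effect. It assumes the oldest text is simply dropped when the window overflows, and it splits on spaces instead of using a real tokenizer. Actual systems handle overflow in different ways, so treat `visible_context` as a hypothetical helper rather than how any particular product works.

```python
def visible_context(tokens, window_size):
    """Return only the most recent tokens that fit in the window."""
    if window_size <= 0:
        return []
    return tokens[-window_size:]

conversation = (
    "my cat is named Biscuit . we talked about many other things . "
    "what is my cat named ?"
).split()

window = visible_context(conversation, 6)
print(" ".join(window))
print("Biscuit" in window)
```

With a six-token window the question is still visible, but the sentence containing the answer is not. Anything the model says about the cat's name at that point has to be a guess.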

Overconfidence from Training

Training optimizes models to produce fluent, coherent text, not necessarily correct text. The result is a tendency to sound confident even when they're basically guessing, because nothing in the training process teaches them to express uncertainty appropriately.

Prompt Engineering Issues

Sometimes hallucinations happen because of how we ask questions. Ambiguous prompts, leading questions, or requests for information that doesn't exist can cause models to fill in gaps with made-up information. It's like asking someone to explain something they've never heard of; they might try to give you an answer anyway rather than admitting they don't know.

Fine-tuning and Alignment Side Effects

The process of fine-tuning models to be helpful and follow instructions can sometimes make them more likely to hallucinate. They learn to always try to give you an answer, even when the honest answer would be "I don't know." This helpfulness can work against accuracy.

Mitigation Strategies

Understanding why hallucinations occur helps us develop strategies to minimize them:

Web Search Integration: Connect models to real-time web search so they can reference current, factual information rather than relying solely on training data.

Database Integration: Link models to specific databases where they can search for relevant information based on your prompt and incorporate that data into their response.

Retrieval-Augmented Generation (RAG): This is the general pattern behind both approaches above. Relevant documents or knowledge-base entries are retrieved and injected into the model's context before it generates a response.

Prompt Engineering: Craft prompts that explicitly ask the model to express uncertainty when it doesn't know something, and provide clear instructions about what constitutes acceptable sources.

The key insight is that these approaches work by adding reliable information to the model's "working memory" or context, giving it factual grounding rather than forcing it to rely purely on probability-based text generation.
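As a rough sketch of how that grounding works, the code below retrieves the best-matching snippet from a tiny knowledge base and places it in the prompt, along with an instruction to admit uncertainty. Everything here is an assumption for illustration: the documents are invented, the word-overlap scoring stands in for a real search engine or embedding index, and no actual model is called.

```python
import re

# Invented mini knowledge base, purely for illustration.
DOCUMENTS = [
    "Refund policy: items can be returned for a refund within 30 days of purchase.",
    "Support hours: the help desk is open 9am to 5pm on weekdays.",
    "Shipping: orders ship to Canada and Mexico.",
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Put the retrieved text into the prompt and ask for honesty about gaps."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt("How many days do I have to get a refund?", DOCUMENTS)
print(prompt)
```

The resulting prompt is what would be sent to the model. The relevant fact is now sitting in its context, so the most probable continuation is also the grounded one.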

Conclusion

LLMs are powerful tools, but they're fundamentally text generators, not knowledge systems. They don't distinguish between fact and fiction at their core level. Understanding this limitation is crucial for using them effectively and safely. The goal isn't to eliminate hallucinations entirely — that may be impossible — but to understand when and why they occur so we can build better systems and use existing ones more wisely.

Remember: when an LLM gives you information, especially about specific facts, recent events, or technical details, always verify it against reliable sources. How confident the model sounds is not a reliable indicator of how accurate its content is.

Harper Carroll AI
