Hallucination is when the model generates content that is fluent and confident but factually wrong. This is narrower than "the model got it wrong": an answer that is wrong but hedged signals its own uncertainty, so the reader knows to check it. An answer that is wrong, stated with high confidence, and phrased in plausible language is a hallucination. The danger is that it looks indistinguishable from a correct answer.
The model is a fluency engine, not a truth engine.
The two types of hallucination
Intrinsic hallucination contradicts the source; extrinsic hallucination invents content the source does not contain.
Different mitigations apply to each:
| Type | Primary mitigation |
|---|---|
| Intrinsic (contradiction) | Better prompting, shorter context, lower temperature, structured output, fact-extraction checks |
| Extrinsic (fabrication) | RAG (give it sources), grounding, refuse-when-unsure prompting, fine-tuning to say "I don't know," external fact verification |
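
As a rough illustration of the "give it sources" and "refuse-when-unsure" mitigations, a grounding prompt might be assembled like this. It is a minimal sketch, not a recommended template: `build_grounded_prompt`, the `REFUSAL` string, and the sample sources are all hypothetical names and data chosen for the example.

```python
# Hypothetical refusal string; the exact wording is an assumption.
REFUSAL = "I don't know based on the provided sources."

def build_grounded_prompt(question, sources):
    """Assemble a prompt that restricts the model to numbered sources."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources, start=1))
    return (
        "Answer the question using only the sources below.\n"
        "Cite the supporting source number in brackets after each claim.\n"
        f'If the sources do not contain the answer, reply exactly: "{REFUSAL}"\n\n'
        f"Sources:\n{numbered}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Made-up sources, for illustration only.
prompt = build_grounded_prompt(
    "When was the library first released?",
    ["The library was first released in 2019.", "Version 2.0 added async support."],
)
print(prompt)
```

Numbering the sources gives the model something concrete to cite, and a fixed refusal string makes "I don't know" responses easy to detect downstream.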
If the model doesn't know the answer, its highest-probability token is wrong regardless of temperature. At T=0 it will confidently and consistently produce the wrong answer.
Lower temperature reduces random errors but doesn't fix systematic errors. If the model's underlying belief is wrong, determinism makes the wrong answer consistent, not correct.
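
The point about temperature can be made concrete with a toy next-token distribution. The sketch below assumes a made-up set of logits in which the wrong token already has the highest score; the tokens, logit values, and helper names are illustrative, not taken from any real model.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample(tokens, logits, temperature, rng):
    """Greedy decoding at T=0, otherwise sample from the tempered distribution."""
    if temperature == 0:
        return tokens[logits.index(max(logits))]
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical logits for "The capital of Australia is ___".
# The model's strongest belief (Sydney) is wrong; the correct answer is Canberra.
tokens = ["Sydney", "Canberra", "Melbourne"]
logits = [3.0, 1.5, 0.5]

rng = random.Random(0)
greedy = [sample(tokens, logits, 0, rng) for _ in range(5)]
p_hot = softmax_with_temperature(logits, 1.0)
p_cold = softmax_with_temperature(logits, 0.2)

print(greedy)                # the same wrong token every time
print(p_hot[0], p_cold[0])   # probability of the wrong token at T=1.0 vs T=0.2
```

Lowering the temperature moves probability mass toward whichever token already leads. When that token is the wrong one, the occasional correct sample at higher temperature disappears and the error becomes deterministic.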
RAG dramatically reduces hallucination but doesn't eliminate it. The model can still ignore the retrieved chunks, synthesize them incorrectly, add unsupported detail, or fall back to parametric knowledge when retrieval misses. Production RAG needs strong grounding prompts, mandatory citations, and a separate faithfulness eval pipeline that checks whether each generated claim is actually supported by the retrieved sources.
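
A faithfulness check of the kind described above might look roughly like the following. This is a deliberately crude sketch: it treats each sentence as one claim and uses word overlap as a stand-in for a real support check, which a production pipeline would likely replace with something stronger. The function names, the stopword list, the `0.7` threshold, and the sample chunks are all assumptions made for the example.

```python
import re

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "of", "in", "to", "and", "it", "that", "on", "for"}

def content_words(text):
    """Lowercase word set with common stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def split_claims(answer):
    """Treat each sentence as one claim (a crude approximation)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]

def support_score(claim, chunk):
    """Fraction of the claim's content words that appear in the chunk."""
    words = content_words(claim)
    if not words:
        return 0.0
    return len(words & content_words(chunk)) / len(words)

def check_faithfulness(answer, chunks, threshold=0.7):
    """Return (claim, best_score, supported) for every claim in the answer."""
    report = []
    for claim in split_claims(answer):
        best = max((support_score(claim, c) for c in chunks), default=0.0)
        report.append((claim, best, best >= threshold))
    return report

# Sample retrieved chunks and a generated answer with one fabricated claim.
chunks = [
    "The Eiffel Tower was completed in 1889 for the World's Fair in Paris.",
    "The tower is 330 metres tall.",
]
answer = "The Eiffel Tower was completed in 1889. It was designed by Leonardo da Vinci."

report = check_faithfulness(answer, chunks)
for claim, score, supported in report:
    print(f"{'OK  ' if supported else 'FLAG'} {score:.2f} {claim}")
```

The first claim is fully covered by a retrieved chunk, while the second shares no content words with any chunk and gets flagged. Word overlap cannot catch a claim that reuses the source's vocabulary but reverses its meaning, so it is better at spotting fabrication than contradiction.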