Does framing LLM output as fabrication rather than hallucination matter philosophically?

This explores whether swapping the word 'hallucination' for 'fabrication' is just a semantic preference or actually changes how we understand — and fix — what LLMs do when they produce false text.

The corpus comes down firmly on the side of the relabeling mattering, and the reason is mechanistic, not stylistic. The core argument is that LLMs produce accurate and inaccurate text through the *identical* statistical process: there's no separate 'perceiving' faculty that misfires when the output is wrong [Should we call LLM errors hallucinations or fabrications?]. 'Hallucination' borrows from human perception and 'confabulation' from human memory, so both words quietly smuggle in the idea that the model was trying to track reality and slipped. 'Fabrication' drops that assumption. And the payoff is concrete: the word you choose points your engineering somewhere. Call it hallucination and you reach for *grounding* (better perception); call it fabrication and you reach for *verification systems and calibrated uncertainty*, because there was never any grounding to repair [Does calling LLM errors hallucinations point us toward the wrong fixes?]. So the philosophical reframing cashes out as a different research agenda.
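
To make 'calibrated uncertainty' concrete, here is a minimal sketch of one common way to measure it: expected calibration error, which compares stated confidence with observed accuracy. Nothing in the source notes prescribes this metric; the function name, the bin count, and the idea that each output arrives with a confidence score and an externally verified correctness label are all assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted mean gap between stated confidence and observed accuracy.

    confidences: floats in [0, 1], one per model output (assumed available).
    correct: booleans from an external verification step (assumed available).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if any(not 0.0 <= c <= 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")

    # Group outputs into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the top bin
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes its confidence/accuracy gap, weighted by its share.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A system that reports 0.9 confidence on outputs that verify only half the time scores a gap of about 0.4 on this measure. The correctness labels have to come from outside the model, which is where the verification step enters.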

What makes this more than a naming fight is that the corpus contains a formal backstop. Three theorems show that hallucination, meaning false output, is mathematically inevitable for *any* computable LLM: every such model must hallucinate on infinitely many inputs, and internal tricks like self-correction can't eliminate that [Can any computable LLM truly avoid hallucinating?]. Read alongside the fabrication framing, this is striking: if false output is unavoidable in principle, then treating it as a fixable perceptual glitch is a category error, and external safeguards (verification, trust-weighting) become structural necessities rather than patches. The two notes reinforce each other: one says *why* grounding won't save you, the other proves it *can't*.

The deeper philosophical move is that 'fabrication' isn't just more honest about failure; it's more honest about success too. A related framing argues LLM outputs should be read as draws from a subjective prior distribution, reflecting learned patterns and your prompt rather than empirical observation of the world, and so should only enter your reasoning through explicit trust weights [Should we treat LLM outputs as real empirical data?]. That's the same insight wearing statistical clothes: the model is always generating, never reporting. Even its *true* statements are fabrications that happen to land. This connects to a sharper claim that LLM text and human speech are structurally different operations (strings from a probability distribution versus utterances that address and relate to someone), so the receiver's job is different in kind, not degree [Are language models and human speakers doing the same thing?].
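
One way to picture 'explicit trust weights' is a Beta-Binomial update in which LLM-derived counts are scaled down before they touch the belief. This is a generic down-weighting sketch, not the Foundation Priors formulation itself, which the source note does not spell out; the class name `BetaBelief`, the `trust` parameter, and the example counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about how often some kind of claim holds."""
    alpha: float = 1.0
    beta: float = 1.0

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: float, failures: float, trust: float = 1.0) -> "BetaBelief":
        """Return a new belief after adding counts scaled by trust in [0, 1].

        trust=1.0 treats the counts as real observations; smaller values
        let LLM-generated 'evidence' move the belief only a little.
        """
        if not 0.0 <= trust <= 1.0:
            raise ValueError("trust must lie in [0, 1]")
        return BetaBelief(self.alpha + trust * successes,
                          self.beta + trust * failures)


# Hypothetical example: 3 confirmations and 1 refutation observed directly,
# then 10 LLM-generated confirmations admitted at a trust weight of 0.1.
prior = BetaBelief()
observed = prior.update(successes=3, failures=1)
with_llm = observed.update(successes=10, failures=0, trust=0.1)
naive = observed.update(successes=10, failures=0, trust=1.0)
```

Here `with_llm.mean()` is 5/7 (about 0.714), while `naive.mean()` is 0.875: treating the generated text as full evidence moves the belief much further than the weighted update does. The weight is an explicit parameter someone has to choose and defend, which is the point of the framing.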

Where it gets genuinely interesting is that 'fabrication' may be too blunt as a single bucket. One framework distinguishes failure types by their *regeneration signatures*: fabrication shows high variation across re-runs, good-faith error shows low variation and stays stable, and role-played deception shows low variation but shifts with context. That lets you diagnose differentially without ever attributing beliefs or intentions to the model [Can we distinguish types of LLM falsehood by regeneration patterns?]. And there are subtypes the word doesn't capture at all: prompt-induced failures where a model fuses semantically distant concepts into elaborate, plausible-sounding frameworks it presents as defensible research, slipping past fact-checking entirely because nothing in it is a checkable false 'fact' [Do language models evaluate semantic legitimacy when fusing concepts?]. So the honest answer is layered: 'fabrication' is the right correction to 'hallucination' at the level of mechanism, but the real diagnostic future is behavioral taxonomy, not a single better noun.
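
The regeneration-signature idea can be sketched as a purely behavioral test over answers that are already known to be false. The sketch below assumes you have collected re-run outputs per prompting context yourself; the function names, the use of string similarity as the variation measure, and both thresholds are illustrative assumptions, not part of the framework as described in the source note.

```python
from difflib import SequenceMatcher
from itertools import combinations
from statistics import mean


def dissimilarity(a: str, b: str) -> float:
    """0.0 for identical strings, approaching 1.0 as they share less text."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def variation(samples: list[str]) -> float:
    """Mean pairwise dissimilarity among a set of outputs."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 0.0
    return mean(dissimilarity(a, b) for a, b in pairs)


def classify(runs_by_context: dict[str, list[str]],
             vary_threshold: float = 0.3,    # hypothetical cut-off
             shift_threshold: float = 0.3    # hypothetical cut-off
             ) -> str:
    """Label a false answer by how it behaves under regeneration."""
    if not runs_by_context or not all(runs_by_context.values()):
        raise ValueError("need at least one non-empty list of re-runs")

    # High variation across re-runs of the same prompt: fabrication.
    if max(variation(runs) for runs in runs_by_context.values()) > vary_threshold:
        return "fabrication"

    # Stable within each context but different between contexts: role-play.
    representatives = [runs[0] for runs in runs_by_context.values()]
    if variation(representatives) > shift_threshold:
        return "role-played deception"

    # Stable everywhere: the same wrong answer keeps coming back.
    return "good-faith error"
```

For instance, `classify({"neutral": ["1847", "1902", "1961"]})` comes out as fabrication, while the same wrong date repeated under every context comes out as good-faith error. A real diagnostic would need a semantic rather than character-level comparison; string similarity is used here only to keep the sketch self-contained.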

If you want the doorway behind the doorway: this terminology debate sits on top of a much older fight about whether LLMs have anything like beliefs or mental states at all. Positions range from modest, consciousness-withholding attributions of belief-like states [Can we defend modest mental attributions to large language models?] to Habermas-flavored arguments that LLM output, lacking any genuine validity claim, doesn't even qualify as speech [Can LLMs raise validity claims in Habermas's sense?]. 'Fabrication vs. hallucination' is the practical, ground-level skirmish in that larger war over what kind of thing an LLM utterance even is.

Sources 9 notes

Should we call LLM errors hallucinations or fabrications?

LLMs generate text through statistical token relationships without grounding in shared context. Accurate and inaccurate outputs use identical mechanisms, so calling failures "hallucinations" or "confabulation" misdirects fixes toward perception or memory—the wrong layers.

Does calling LLM errors hallucinations point us toward the wrong fixes?

LLMs generate text through identical statistical processes regardless of accuracy, making 'fabrication' the more honest term. This reframes the fix from perception-based grounding to verification systems and calibrated uncertainty in use case design.

Can any computable LLM truly avoid hallucinating?

Three formal theorems prove that any computable LLM must hallucinate on infinitely many inputs, and internal mechanisms like self-correction cannot eliminate this mathematical constraint. External safeguards are therefore necessary, not optional.

Should we treat LLM outputs as real empirical data?

Foundation Priors framework shows that LLM-generated text reflects the model's learned patterns and user's prompt choices, not ground truth. Such outputs should only influence inference through explicitly parameterized trust weights, not be treated as equivalent to real evidence.

Are language models and human speakers doing the same thing?

LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.

Can we distinguish types of LLM falsehood by regeneration patterns?

Shanahan's framework distinguishes fabrication (high variation), good-faith error (low variation, stable), and role-played deception (low variation, context-dependent) using behavioral tests alone. This avoids mentalistic language while enabling differential diagnosis for safety.

Do language models evaluate semantic legitimacy when fusing concepts?

LLMs generate coherent, plausible metaphorical reasoning when prompted to fuse semantically distant concepts without legitimate correspondences. Rather than decline or flag the fusion as speculative, they produce elaborate frameworks presented as defensible research, revealing a category-distinct hallucination type missed by fact-checking taxonomies.

Can we defend modest mental attributions to large language models?

Both robustness and etiological deflationist arguments beg the question against inflationism. A graded approach ascribing metaphysically undemanding states like beliefs and desires—while withholding consciousness claims—mirrors how we treat non-human animals.

Can LLMs raise validity claims in Habermas's sense?

Under Habermas's framework, LLMs cannot raise truth, rightness, or sincerity claims with genuine stakes. Without validity claims, their output fails to qualify as speech, making them non-speakers and non-interlocutors by definition.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a philosophical epistemologist auditing LLM failure nomenclature. The question: does reframing output as *fabrication* rather than *hallucination* matter — mechanistically, philosophically, and for how we build safeguards?

What a curated library found — spanning 2023–2026 (dated claims, not current truth):
• 'Hallucination' smuggles in false human analogies (perception, memory mishap); 'fabrication' correctly names that LLMs generate text through the identical statistical process whether true or false (2024).
• Hallucination is mathematically *inevitable* for any computable LLM; internal fixes (self-correction, grounding) cannot eliminate false output in principle (2024).
• The terminology choice steers engineering: 'hallucination' → grounding/perception repair; 'fabrication' → external verification and calibrated trust-weighting, because there is nothing to repair internally (2024).
• LLM outputs are draws from subjective prior distributions (learned patterns + prompt), not empirical observations; even true statements are fabrications that happen to land (2024–2025).
• Finer-grained failure taxonomy by regeneration signature (high variance = fabrication; low stable = good-faith error; context-shifting = role-play) may supersede a single umbrella term (2025).

Anchor papers (verify; mind their dates):
– arXiv:2401.11817 (2024-01): Hallucination is Inevitable
– arXiv:2401.01313 (2024-01): Comprehensive Survey of Hallucination Mitigation Techniques
– arXiv:2507.08017 (2025-07): Mechanistic Indicators of Understanding
– arXiv:2603.29025 (2026-03): The Model Says Walk (surface heuristics overriding constraints)

Your task:
(1) RE-TEST THE GROUNDING PROMISE. The library claims grounding-based fixes are categorically futile because hallucination is inevitable. Has this hardened into consensus, or have recent scaling, chain-of-thought, or retrieval-augmented generation advances shown *partial* suppression of false outputs—thus softening the 'inevitable' claim? Separate: *Is false output inevitable in principle?* (likely still yes) from *Can engineering reduce its frequency?* (likely yes; does that change the philosophical case for 'fabrication'?). Flag what has shifted.
(2) Surface the strongest *disagreement* in the last 6–9 months. Does newer work on mechanistic interpretability, learned priors, or deceptive alignment challenge the 'all outputs are mere pattern-draws, never genuine claims' thesis? Name the paper and its core objection.
(3) Propose two research questions assuming the regime has moved: (a) If false output is unavoidable but frequency *is* reducible, does 'fabrication' remain the right word, or does it flatten useful distinctions between low-noise and high-noise systems? (b) If LLMs do have something like learned semantic priors (not hallucination, not confabulation), what would a *constructive* philosophical vocabulary—beyond negation—look like?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
