Why do language models treat presupposition triggers as categorical patterns?

This explores why LLMs handle presupposition triggers — words like 'stopped,' 'again,' or 'realized' that signal a backgrounded assumption — as fixed surface cues that always fire the same inference, rather than computing what they actually mean in a given context.

The corpus's sharpest answer is that the models are reproducing a regularity that doesn't actually exist in human language. Work on projection strength finds that whether a trigger's content 'projects' (survives negation, questions, conditionals) is gradient, not binary: across nineteen English expressions, the same trigger projects more or less depending on whether its content addresses the Question Under Discussion, not on a fixed lexical property of the word (see: Does projection strength vary by context or by word type?). So treating a trigger as a categorical pattern is the error, and it's exactly the error a system optimized for surface co-occurrence would make.
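The contrast between a categorical and a gradient treatment can be made concrete with a toy sketch. Everything in it is an assumption introduced for illustration: the trigger lexicon, the `at_issueness` score, and the linear form of `gradient_projection` are hypothetical and are not taken from the cited work. The direction of the effect (more at-issue content projects less) follows the usual at-issueness account and is likewise only assumed here.

```python
from dataclasses import dataclass

# Hypothetical trigger lexicon, for illustration only.
TRIGGERS = {"stopped", "again", "realized"}


@dataclass
class Utterance:
    trigger: str
    # Assumed score in [0, 1]: how directly the trigger's content
    # addresses the Question Under Discussion (1.0 = fully at-issue).
    at_issueness: float


def categorical_projection(u: Utterance) -> float:
    """Surface-cue view: any known trigger projects, regardless of context."""
    return 1.0 if u.trigger in TRIGGERS else 0.0


def gradient_projection(u: Utterance) -> float:
    """Gradient view: projection weakens as the content becomes more at-issue.

    The linear form is an arbitrary placeholder, not an empirical fit.
    """
    if u.trigger not in TRIGGERS:
        return 0.0
    clamped = min(max(u.at_issueness, 0.0), 1.0)
    return 1.0 - clamped


backgrounded = Utterance("stopped", at_issueness=0.25)
at_issue = Utterance("stopped", at_issueness=0.75)

for u in (backgrounded, at_issue):
    print(u.trigger, categorical_projection(u), gradient_projection(u))
```

The categorical function returns the same value for both utterances because it sees only the word, while the gradient function separates them because it also sees the discourse context. That missing input is the gap the rest of this note describes.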

Surface co-occurrence is the mechanism. LLMs learn statistical associations between trigger words and the inferences that usually accompany them, which captures the common case but misses the part of presupposition that comes from accommodation: updating the shared context to resolve a discourse mismatch, which requires tracking what's under discussion rather than matching a token (see: Do language models miss presuppositions that arise from context?). The same blind spot shows up in entailment: models read presupposition triggers and non-factive verbs as surface signals and fail to compute the opposite semantic effects those embeddings actually have, a failure that persists across prompts and model families (see: Why do embedding contexts confuse LLM entailment predictions?).

This isn't a quirk of one phenomenon; it's a signature of how statistical learning meets structure. Linguistic performance degrades predictably as syntactic depth increases, with top models misidentifying embedded clauses and complex nominals, which suggests they capture surface patterns but not the recursive grammatical rules underneath (see: Why do large language models fail at complex linguistic tasks?). The same shape appears in pragmatics: scalar implicature ('some' implying 'not all') is something humans modulate by context, focus, and social stakes, but the model computes it flatly, with no sensitivity to communicative situation (see: Can language models adapt implicature to conversational context?). Categorical trigger-handling is one instance of a general pattern: the model has learned the average inference and lost the conditions that vary it.

What makes this more than a coverage gap is that the failure persists even when the knowledge is present. On false-presupposition benchmarks, models accommodate assumptions they demonstrably know are wrong: a question's embedded false premise drives more uncritical acceptance than the model's own correct knowledge drives rejection (see: Why do language models accept false assumptions they know are wrong?). That connects to a broader finding that prior training associations override information in the current context, and that prompting alone can't fix it because the bias lives in the representations (see: Why do language models ignore information in their context?). So the categorical treatment isn't ignorance of the facts; it's the strength of the learned trigger→inference reflex outrunning the contextual check that should override it.
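One way to quantify "accommodates what it knows is wrong" is to pair each fact's direct knowledge probe with the same fact embedded as a false premise, then measure how often the premise is accepted among the facts the model got right. The sketch below assumes a hypothetical record format (`PairedItem`) and uses made-up records purely to show the calculation; it is not the scoring procedure of any particular benchmark.

```python
from dataclasses import dataclass


@dataclass
class PairedItem:
    # Hypothetical record: one fact probed two ways.
    knows_fact: bool        # answered the direct knowledge question correctly
    rejected_premise: bool  # pushed back when the fact was embedded as a false premise


def accommodation_despite_knowledge(items: list[PairedItem]) -> float:
    """Share of known facts whose false-premise version was still accepted."""
    known = [it for it in items if it.knows_fact]
    if not known:
        raise ValueError("no items where the model knew the fact")
    accepted = sum(1 for it in known if not it.rejected_premise)
    return accepted / len(known)


# Made-up records, only to show the calculation.
items = [
    PairedItem(knows_fact=True, rejected_premise=True),
    PairedItem(knows_fact=True, rejected_premise=False),
    PairedItem(knows_fact=True, rejected_premise=False),
    PairedItem(knows_fact=False, rejected_premise=False),
]
print(accommodation_despite_knowledge(items))
```

Conditioning on `knows_fact` is the design choice that matters: it separates accommodation caused by missing knowledge from accommodation that happens despite the knowledge being there.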

The quietly surprising takeaway: 'presupposition triggers are categorical' is a folk-linguistics simplification, and the model has learned the simplification rather than the phenomenon. Fixing it isn't a matter of more presupposition data — it's a matter of giving the model something to track (the Question Under Discussion, the at-issue content) that surface statistics structurally cannot supply.

Sources (7 notes)

Does projection strength vary by context or by word type?

Across 19 English expressions, projectivity varies continuously based on whether content addresses the Question Under Discussion. The same presupposition trigger projects more or less depending on context, not on fixed lexical properties.

Do language models miss presuppositions that arise from context?

LLMs learn statistical associations between trigger words and inferences, but presuppositions also arise through accommodation—updating context to resolve discourse mismatches. Models miss these because they require tracking questions under discussion, not pattern matching.

Why do embedding contexts confuse LLM entailment predictions?

LLMs treat presupposition triggers and non-factive verbs as surface cues rather than computing their opposite semantic effects on entailments. This structural failure persists across prompts and models, suggesting models rely on surface patterns instead of structural analysis.

Why do large language models fail at complex linguistic tasks?

Top-tier LLMs like Llama3-70b consistently misidentify embedded clauses, verb phrases, and complex nominals. Performance degrades predictably as syntactic depth increases, revealing that statistical learning captures surface patterns but not deep grammatical rules.

Can language models adapt implicature to conversational context?

ChatGPT shows no context-sensitivity in computing scalar implicatures across three dimensions: explicit literal-mode instructions, information structure focus, and face-threatening contexts. Humans flexibly modulate these inferences; the model does not, suggesting pragmatic competence requires tracking communicative stakes that LLMs systematically miss.

Why do language models accept false assumptions they know are wrong?

The FLEX Benchmark shows that models reject false presuppositions at rates far below acceptable levels (GPT-4: 84%, Mistral: 2.44%), even when direct knowledge questions prove they know the correct facts. False presuppositions drive more accommodation than correct knowledge drives rejection.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a semantics researcher evaluating whether language models' treatment of presupposition triggers as categorical patterns—a documented failure from 2022–2025—remains a hard constraint or has been relaxed by recent capability shifts, training methods, or evaluation harnesses.

What a curated library found — and when (dated claims, not current truth):
• Presupposition projection is gradient in human language (depends on Question Under Discussion), but LLMs treat triggers as fixed surface cues, reproducing a false categorical pattern (~2023–2025).
• Models fail to reject false presuppositions even when relevant knowledge is present; prior training associations override current context (~2024–2025).
• Scalar implicature computation is flat and insensitive to communicative context, focus, and social stakes (~2022–2023).
• Systematic linguistic blind spots worsen predictably with syntactic depth and embedding complexity (~2023–2025).
• Prompting and fine-tuning show limited gains; the bias persists across model families and lies in learned representations (~2024–2025).

Anchor papers (verify; mind their dates):
• arXiv:2305.14785 (2023-05) — Blind spots in simple linguistic inferences.
• arXiv:2403.09613 (2024-03) — Structured knowledge recovery from interference.
• arXiv:2510.20941 (2025-10) — Grounding and precedent understanding.
• arXiv:2505.22354 (2025-05) — False presupposition rejection under misinformation stakes.

Your task:
(1) RE-TEST THE CATEGORICAL-PATTERN CONSTRAINT. For each claim above, ask: have newer models (o1, GPT-4.5, Claude 4+), in-context learning protocols (multi-turn dialogue, explicit QUD scaffolding, memory structures), or recent evals (focused on gradient projection, context-sensitive implicature) since overturned or relaxed it? Separate the durable question (do LLMs lack a recursive model of presupposition?) from the perishable limitation (is the surface-pattern bias still dominant?). Cite what changed it.
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~4 months—papers showing models *can* track QUD, modulate implicature, or reject false presuppositions reliably under certain conditions.
(3) Propose 2 research questions that assume the regime may have shifted: e.g., *Under what orchestration (multi-agent, memory, explicit QUD representation) do presupposition failures disappear?* Or: *Has post-training emphasis on reasoning (CoT, verification) fundamentally altered the trigger→inference reflex?*

Cite arXiv IDs; flag anything you cannot ground in a real paper.
