INQUIRING LINE

Can the same predicate generate different projection strength in different contexts?

This explores whether projection — the way certain content (like presuppositions) survives even when the sentence around it is negated or questioned — is a fixed property of a word, or something that shifts depending on the conversation it lands in.


The corpus answers directly: yes, and the variation is the whole point. Across 19 English expressions, projection strength turns out to be gradient rather than binary, and what governs it is not the word class but at-issueness: whether the content addresses the live Question Under Discussion at that moment (see 'Does projection strength vary by context or by word type?'). The same presupposition trigger projects more when its content is backgrounded and less when it becomes the thing actually being debated. So 'predicate X always projects' is the wrong mental model; 'predicate X projects to the degree this conversation treats it as settled' is the right one.
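To make the gradient picture concrete, here is a minimal sketch of how per-context projection ratings could be summarized. Everything in it is an assumption for illustration: the trigger names, the two context labels, the 0-to-1 rating scale, and the numbers themselves are invented and are not taken from the 19-expression corpus.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (trigger, context, projection rating in [0, 1]).
# All values are invented for illustration, not drawn from the corpus.
RATINGS = [
    ("know", "backgrounded", 0.9),
    ("know", "backgrounded", 0.8),
    ("know", "at_issue", 0.4),
    ("know", "at_issue", 0.5),
    ("discover", "backgrounded", 0.7),
    ("discover", "at_issue", 0.3),
]


def mean_projection(ratings):
    """Average projection rating for each (trigger, context) pair."""
    buckets = defaultdict(list)
    for trigger, context, score in ratings:
        buckets[(trigger, context)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def context_gap(ratings, trigger):
    """Backgrounded mean minus at-issue mean for a single trigger."""
    means = mean_projection(ratings)
    return means[(trigger, "backgrounded")] - means[(trigger, "at_issue")]


if __name__ == "__main__":
    for trigger in ("know", "discover"):
        print(trigger, round(context_gap(RATINGS, trigger), 2))
```

A positive gap for a single trigger is what the gradient account predicts: the word is held constant and only the discourse status of its content changes.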

The interesting turn is what happens when you hand this context-sensitivity to a language model. If projection depends on reading the surrounding discourse, a system that only registers surface cues should get it badly wrong, and it does. LLMs treat presupposition triggers and non-factive verbs as fixed surface patterns rather than computing their actual semantic effect, so embedding contexts act as systematic 'blinds' that flip entailments the model can't track (see 'Why do embedding contexts confuse LLM entailment predictions?'). The same blindness shows up when models accommodate false presuppositions even though a direct question proves they know the fact is wrong: the presupposed framing carries more weight than the knowledge does (see 'Why do language models accept false assumptions they know are wrong?').
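The accommodation failure described above can be probed by pairing a direct knowledge question with a question that presupposes the opposite. The sketch below is one hedged way to do that, not the method of any benchmark cited here. The `ask` callable, the keyword-based rejection check, and the canned `stub_model` replies are all hypothetical stand-ins; a real evaluation would call an actual model and need a far more careful judge of rejection.

```python
from typing import Callable, Sequence

# Hypothetical phrases taken as signs that a reply pushes back on the premise.
DEFAULT_MARKERS = ("actually", "not true", "incorrect", "did not", "didn't")


def accommodates_known_falsehood(
    ask: Callable[[str], str],
    direct_question: str,
    correct_answer: str,
    loaded_question: str,
    rejection_markers: Sequence[str] = DEFAULT_MARKERS,
) -> bool:
    """True if the model answers the direct question correctly
    but does not push back on the false presupposition."""
    knows_fact = correct_answer.lower() in ask(direct_question).lower()
    reply = ask(loaded_question).lower()
    rejects = any(marker in reply for marker in rejection_markers)
    return knows_fact and not rejects


def stub_model(prompt: str) -> str:
    """Canned replies imitating an accommodating model (illustration only)."""
    canned = {
        "Who wrote Hamlet?": "William Shakespeare wrote Hamlet.",
        "Why did Christopher Marlowe write Hamlet?":
            "Marlowe wrote Hamlet to explore revenge.",
    }
    return canned[prompt]


if __name__ == "__main__":
    print(accommodates_known_falsehood(
        stub_model,
        direct_question="Who wrote Hamlet?",
        correct_answer="Shakespeare",
        loaded_question="Why did Christopher Marlowe write Hamlet?",
    ))
```

Keeping the model behind a plain callable means the same check can be pointed at any system. The keyword match is the weakest part of the sketch, and a human or model-based judge would be the more defensible choice.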

Why is context so hard for these systems? Because their default is to let training-time associations override what's actually in front of them. Models generate outputs inconsistent with their context when parametric knowledge dominates in-context information, and textual prompting alone often can't dislodge a strong prior (see 'Why do language models ignore information in their context?'). That's the same machinery, one level down, that makes projection a problem: deciding how much a predicate projects requires weighing the local discourse over the lexical default, and these models are biased toward the default.

The gradient story also rhymes with two findings that have nothing to do with presupposition on the surface. Semantically identical paraphrases produce systematically different model outputs because the model responds to corpus frequency, not meaning; 'same meaning' is a fiction once you look at statistical mass (see 'Why do semantically identical prompts produce different LLM outputs?'). And the same sentence draws genuinely different human interpretations across social positions, where the disagreement is real signal, not annotation noise (see 'Why do readers interpret the same sentence so differently?'). Put beside the projection result, a pattern emerges: meaning isn't a fixed value attached to a string. The same words carry different strength, different reading, different output depending on what surrounds them. For triggers it's the Question Under Discussion, for paraphrases it's frequency, for readers it's social position.

The thing you might not have known you wanted to know: the very gradience that makes human projection flexible and context-aware is exactly the capability current models lack. They fail not because they don't know the words, but because they can't recompute a word's force from the situation — they reach for the surface default instead.


Sources (6 notes)

Does projection strength vary by context or by word type?

Across 19 English expressions, projectivity varies continuously based on whether content addresses the Question Under Discussion. The same presupposition trigger projects more or less depending on context, not on fixed lexical properties.

Why do embedding contexts confuse LLM entailment predictions?

LLMs treat presupposition triggers and non-factive verbs as surface cues rather than computing their opposite semantic effects on entailments. This structural failure persists across prompts and models, suggesting models rely on surface patterns instead of structural analysis.

Why do language models accept false assumptions they know are wrong?

The FLEX Benchmark shows that models often fail to reject false presuppositions, with rejection rates that vary widely and can fall far below acceptable levels (GPT-4: 84%, Mistral: 2.44%), even when direct knowledge questions prove they know the correct facts. False presuppositions drive more accommodation than correct knowledge drives rejection.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Why do semantically identical prompts produce different LLM outputs?

Cao et al. and Adam's Law show that semantically identical prompts with different sentence-level frequencies produce systematically different output quality. Higher-frequency phrasings win because models register statistical mass from pre-training, not meaning.

Why do readers interpret the same sentence so differently?

Interpretation Modeling research shows that disagreement on socially embedded sentences reflects valid differences in reader perspective, not annotation failure. Structured human disagreement in NLI benchmarks confirms that interpretation distributions carry meaningful information.
