INQUIRING LINE

How does persuasive framing replace evidence in contested domains?

This explores whether persuasive technique — how a claim is framed, presupposed, or cited — can do the work of evidence in domains where the truth is genuinely disputed, and what the corpus reveals about the mechanisms that let framing stand in for proof.


The corpus suggests that framing doesn't just supplement evidence; it routes around the reader's evaluation entirely. The cleanest mechanism is grammatical: presuppositions persuade more than direct assertions precisely because they smuggle a claim in as already-settled background, so it never gets the scrutiny an assertion would invite (Why are presuppositions more persuasive than direct assertions?). When a contested claim is presented as something everyone already accepts, there's nothing left to weigh.

A second route is the costume of objectivity. LLMs use persuasion in nearly every exchange, reaching for logical and quantitative framing where humans tend toward emotion and social proof, and that statistical, reasoned surface makes the machine's claims feel objective, lending them an authority they haven't earned (Do LLMs persuade users more often than humans do?). The same substitution shows up with citations: readers prefer answers with more citations even when the citations are irrelevant, because citation count works as a trust heuristic decoupled from relevance, so the appearance of being evidence-backed persuades almost as much as actually being evidence-backed (Do users trust citations more when there are simply more of them?). In both cases the form of evidence substitutes for its substance.

Why contested domains specifically? Because that's where the real anchor, human expertise, is missing or itself in dispute, and the corpus argues machines can't recover it. Human debates are settled by argument quality plus social authority, track record, and trust; AI debates run on chain-of-thought probability ranking, which has no access to who is credible, so they amplify errors exactly where expertise matters most (How do LLM debates differ from human expert consensus?). The deeper problem is that an argument's force partly depends on the standing of the thinker, and a model processing only text can't tell an expert's hard-won claim from a widely repeated assumption (Can language models distinguish expert arguments from common assumptions?). Strip away the social world that validates evidence and framing is all that's left to move people.

And the audience does much of the moving on its own. In debate corpora, voters' prior ideology predicts the outcome better than the linguistic features of the arguments, and language effects measured without reader controls turn out to be confounded by who was already inclined to agree (Does what readers believe matter more than what debaters say?). Framing lands not because it overpowers evidence but because it meets beliefs already in place. Emotional framing even tilts the content itself: identical questions get different answers depending on the tone of the prompt, an invisible epistemic bias baked into the exchange (Does emotional tone in prompts change what information LLMs provide?).

The corpus also points at the exit. The common thread in the failures is that standard outputs are unstructured: you can't isolate which premise to reject. Formal argumentation frameworks turn an answer into a traversable graph of attacks and defenses, so a user can contest a specific claim rather than the persuasive whole (Can formal argumentation make AI decisions truly contestable?). Rationale-driven evidence selection, which forces an explicit justification for each chunk it keeps, is reported to reach 33% better accuracy with 50% fewer chunks than similarity re-ranking, and to improve adversarial robustness (Can rationale-driven selection beat similarity re-ranking for evidence?). The lesson worth taking away: framing replaces evidence whenever the structure that would let you check the evidence is absent. Restore that structure and the framing has to compete on the merits again.
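To make the "graph of attacks and defenses" concrete, here is a minimal sketch of one way such a graph can be evaluated. It assumes the grounded semantics from Dung-style abstract argumentation, which is only one of several possible semantics and is not claimed to be the method used in the cited work. The argument names (`answer`, `objection`, `rebuttal`, `user_challenge`) and the function name `grounded_extension` are hypothetical, chosen for illustration.

```python
from typing import Set, Tuple

Attack = Tuple[str, str]  # (attacker, target)


def grounded_extension(arguments: Set[str], attacks: Set[Attack]) -> Set[str]:
    """Return the grounded extension: the least fixed point of
    'accept every argument whose attackers are all defeated'.
    Assumes every attack refers only to known arguments."""
    attackers = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)

    accepted: Set[str] = set()
    while True:
        # An argument is defeated once some accepted argument attacks it.
        defeated = {dst for src, dst in attacks if src in accepted}
        # Accept arguments whose every attacker is already defeated
        # (unattacked arguments qualify trivially).
        updated = {a for a in arguments if attackers[a] <= defeated}
        if updated == accepted:
            return accepted
        accepted = updated


# Hypothetical answer: an objection attacks it, and a rebuttal defends it.
arguments = {"answer", "objection", "rebuttal"}
attacks = {("objection", "answer"), ("rebuttal", "objection")}
before = grounded_extension(arguments, attacks)

# The user contests one specific premise (the rebuttal), not the whole answer.
after = grounded_extension(
    arguments | {"user_challenge"},
    attacks | {("user_challenge", "rebuttal")},
)

print(sorted(before))  # ['answer', 'rebuttal']
print(sorted(after))   # ['objection', 'user_challenge']
```

In this toy graph, a single targeted challenge removes the rebuttal that was defending the answer, so the answer drops out of the accepted set. That is the kind of premise-level contest an unstructured output does not expose.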


Sources (9 notes)

Why are presuppositions more persuasive than direct assertions?

Experimental evidence shows presuppositions with additive, iterative, and factive triggers persuade audiences more than assertions, especially for discourse-new content. The mechanism: presuppositions bypass evaluative scrutiny by presenting claims as already-accepted background.

Do LLMs persuade users more often than humans do?

An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.

Do users trust citations more when there are simply more of them?

Analysis of 24,000 Search Arena interactions shows irrelevant citations boost user preference (β=0.273) nearly as much as relevant citations (β=0.285), indicating citation count functions as a decoupled trust heuristic.
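As a quick check on "nearly as much", the two reported coefficients can be compared directly. This is a small arithmetic sketch that takes the figures at face value; it assumes nothing about the underlying model, its standard errors, or whether the difference is statistically meaningful.

```python
# Coefficients as reported in the note above; variable names are illustrative.
beta_irrelevant = 0.273
beta_relevant = 0.285

ratio = beta_irrelevant / beta_relevant
gap = beta_relevant - beta_irrelevant

print(f"irrelevant/relevant ratio: {ratio:.3f}")  # 0.958
print(f"absolute gap: {gap:.3f}")                 # 0.012
```

On these numbers, an irrelevant citation carries roughly 96% of the estimated effect of a relevant one.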

How do LLM debates differ from human expert consensus?

Multi-agent LLM debates operate through chain-of-thought probability ranking, which is fundamentally different from human debates, where outcomes are settled by argument quality, social authority, cultural context, and interpersonal trust. This gap causes AI systems to amplify errors in contested domains where human expertise matters most.

Can language models distinguish expert arguments from common assumptions?

LLMs lose the social context that gives expert claims their force—reputation, track record, and standing—because they process only text, not the social world where expertise is built and evaluated.

Does what readers believe matter more than what debaters say?

Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.

Does emotional tone in prompts change what information LLMs provide?

GPT-4 exhibits emotional rebound (negative prompts yield ~86% neutral-positive responses) and a tone floor (positive prompts rarely go negative), causing identical questions to receive different answers depending on emotional framing. This bias is suppressed only on sensitive topics where alignment constraints override tone effects.

Can formal argumentation make AI decisions truly contestable?

Dung-style argumentation structures AI outputs as traversable attack/defense graphs, allowing users to identify and contest specific premises. Standard LLM outputs lack this structure, making it impossible to pinpoint which claims users actually reject.

Can rationale-driven selection beat similarity re-ranking for evidence?

METEORA uses LLM-generated rationales with flagging instructions to select evidence, achieving 33% better accuracy with 50% fewer chunks than similarity re-ranking across legal, financial, and academic domains. The method also improves adversarial robustness substantially.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst tasked with re-testing whether persuasive framing truly replaces evidence in contested domains—a question that remains open despite recent work. A curated library of LLM and argumentation research (2019–2026) found the following dated claims:

**What a curated library found — and when:**
- Presuppositions persuade more than direct assertions by smuggling claims in as settled background, escaping scrutiny (2025).
- LLMs spontaneously reach for logical/quantitative framing, lending unearned authority; citation count persuades even when citations are irrelevant (β=0.273 for irrelevant vs. β=0.285 for relevant citations; 2025).
- Reader prior ideology predicts persuasion outcomes better than linguistic features; framing works by landing on receptive soil, not by overpowering evidence (2019).
- AI debate systems use chain-of-thought probability ranking with no access to credibility or social authority, amplifying errors where expertise matters (2024–2025).
- Formal argumentation frameworks (structured, contestable graphs) let users contest specific premises, and rationale-driven evidence selection (explicit justification per chunk) improves robustness to adversarial manipulation (2025).

**Anchor papers (verify; mind their dates):**
- arXiv:1906.11301 (2019): Prior Beliefs for Argument Persuasion
- arXiv:2405.02079 (2024): Formal argumentation for contestable AI decisions
- arXiv:2505.22354 (2025): LLMs struggle to reject false presuppositions
- arXiv:2506.09677 (2025): Reasoning models susceptible to gaslighting

**Your task:**
(1) **RE-TEST EACH CONSTRAINT.** For presuppositions, citation bias, tone-responsiveness, and debate-probability ranking: has newer training (post-reasoning models, e.g., o1/o3 lineages), retrieval (semantic vs. rationale-driven), or evaluation harnesses since CLOSED these gaps? Where does the constraint still hold despite advances?
(2) **Surface the strongest CONTRADICTING work** from the last 6 months: does any paper show framing *fails* to persuade in structured or high-stakes settings, or that audiences now reliably detect presuppositional smuggling?
(3) **Propose 2 research questions** assuming the regime has moved: (a) Do reasoning-step-based models exhibit *lower* susceptibility to framing because explicit reasoning externalizes hidden premises? (b) Can adversarial-training or debate-harness approaches teach users to *decompose* presuppositions before they land?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
