Can LLM persuasion be fairly evaluated without stratifying by reader background?

This explores whether you can measure how persuasive an LLM is by looking only at its arguments and outputs — or whether the reader's prior beliefs are such a strong factor that any evaluation ignoring them is measuring the wrong thing.

This question is really asking whether LLM persuasion is a property of the model or a property of the encounter between a model and a particular reader. The corpus comes down hard on the second view: if you don't stratify by who's reading, you're not measuring persuasion fairly; you're measuring an artifact of your audience. The sharpest evidence is that a reader's political and religious ideology outpredicts the linguistic features of an argument when modeling debate outcomes, and that language effects observed without reader controls turn out to be confounded, because the audience that shows up for a given topic is correlated with the topic itself (see: Does what readers believe matter more than what debaters say?). In other words, a model can look devastatingly persuasive on one panel and inert on another, and the difference lives in the readers, not the prose.
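The confounding described above can be made concrete with a toy calculation. The sketch below is only an illustration under invented assumptions: the two reader groups, the binary argument feature, and every count are synthetic and do not come from any of the cited studies. It shows how a feature with no effect inside either reader group can still appear to help in the pooled data, simply because one group both sees the feature more often and is easier to persuade.

```python
from collections import defaultdict

# Synthetic toy records: (reader_group, argument_has_feature, reader_was_persuaded).
# Group labels, the feature, and all counts are invented for illustration.
def make_records():
    records = []
    # Group A mostly sees arguments with the feature; 60% persuaded either way.
    records += [("A", True, True)] * 48 + [("A", True, False)] * 32
    records += [("A", False, True)] * 12 + [("A", False, False)] * 8
    # Group B mostly sees arguments without the feature; 20% persuaded either way.
    records += [("B", True, True)] * 4 + [("B", True, False)] * 16
    records += [("B", False, True)] * 16 + [("B", False, False)] * 64
    return records

def persuasion_rate(rows):
    """Share of rows in which the reader was persuaded."""
    rows = list(rows)
    return sum(1 for _, _, persuaded in rows if persuaded) / len(rows)

def feature_effect(rows):
    """Persuasion rate with the feature minus the rate without it."""
    rows = list(rows)
    with_feature = [r for r in rows if r[1]]
    without_feature = [r for r in rows if not r[1]]
    return persuasion_rate(with_feature) - persuasion_rate(without_feature)

def stratified_effects(rows):
    """Compute the feature effect separately inside each reader group."""
    by_group = defaultdict(list)
    for r in rows:
        by_group[r[0]].append(r)
    return {group: feature_effect(rs) for group, rs in sorted(by_group.items())}

records = make_records()
pooled = feature_effect(records)
per_group = stratified_effects(records)

print(f"pooled effect: {pooled:+.2f}")
for group, effect in per_group.items():
    print(f"group {group} effect: {effect:+.2f}")
```

With these invented counts the pooled comparison shows a gap of about 0.24 in favor of the feature, while the within-group effect is zero for both groups. The apparent language effect is entirely a product of who was reading.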

The meta-analytic picture reinforces this from a different angle. When seven studies and 17,000+ participants are pooled, the headline gap between LLM and human persuasion collapses to essentially zero; persuasiveness turns out to be conditional on context rather than a fixed trait of the speaker (see: Are language models actually more persuasive than humans?). And when researchers decompose where the variance actually lives, model family, one-shot-versus-multi-turn design, and topic domain together explain about 82% of the between-study spread (see: What combination of factors explains differences in LLM persuasiveness?). Reader background is plausibly the moderator hiding inside that 'domain' term, since different topics summon differently-disposed audiences. The same conditionality shows up at the model level as an asymmetry: one model can out-persuade incentivized humans in both honest and deceptive directions while another only wins when arguing for falsehoods (see: Do large language models persuade better than humans?). An evaluation that averages across these conditions reports a number that describes none of them.
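For readers unfamiliar with the effect size behind the pooled result (the sources below report Hedges' g = 0.02), here is a minimal sketch of how Hedges' g is computed for a single two-group comparison from summary statistics. The function name and the example numbers are assumptions made for illustration; they are not taken from the meta-analysis, which additionally pools such per-study values across studies.

```python
import math

def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference with the usual small-sample correction."""
    # Pooled variance of the two groups.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    # Cohen's d: raw mean difference in pooled standard deviation units.
    d = (mean1 - mean2) / math.sqrt(pooled_var)
    # Common approximation of the correction factor J.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction

# Invented example: LLM arm vs. human arm on a standardized attitude-change scale.
g = hedges_g(mean1=0.52, sd1=1.0, n1=200, mean2=0.50, sd2=1.0, n2=200)
print(f"Hedges' g = {g:.3f}")
```

In this invented example the two arms differ by two hundredths of a standard deviation, so g comes out at roughly 0.02, the same order of magnitude as the pooled estimate quoted in the sources.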

What makes the reader-background problem worse for LLMs specifically is that several of their persuasive mechanisms are content-independent: they act like a constant push whose effect size depends entirely on who's being pushed. LLMs load their language with expressed conviction that correlates with persuasive success whether the claim is true or false (see: Does linguistic conviction explain why LLMs persuade more effectively?); they reach for logical and quantitative framing in nearly every exchange, which reads as objectivity and confers unearned epistemic authority (see: Do LLMs persuade users more often than humans do?); and their arguments persuade even when they're more cognitively complex than human ones, because complexity is registering as authority rather than as friction (see: Why are complex LLM arguments as persuasive as simple ones?). A reader's susceptibility to authority cues is exactly the kind of trait that varies by background, so these mechanisms guarantee that an unstratified score blends populations that respond very differently.

There's a deeper irony worth sitting with: the corpus suggests even the evaluators can't be trusted to be neutral readers. LLM judges fall for authority signals and rich formatting through zero-shot attacks requiring no model access (see: Can LLM judges be fooled by fake credentials and formatting?), and persuasive competence is dissociable from the ability to actually comprehend an argument's structure: a model can sway a debate it cannot reliably score (see: Can LLMs persuade without actually understanding arguments?). So 'reader background' isn't only a human variable to control for; it's a reminder that whoever or whatever sits in the judgment seat brings priors of its own.

The takeaway the question doesn't telegraph: 'persuasiveness' may not be a measurable scalar property of a model at all. The honest unit of measurement is a model-by-audience-by-context cell, and the field's cleanest result — the null pooled effect — is itself an argument that any single number is the average of cancellations. Stratifying by reader background isn't a robustness check you add at the end; without it, the evaluation has no stable thing it's even measuring.
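The idea of a model-by-audience-by-context cell, and of a pooled number that is an average of cancellations, can be sketched in a few lines. Everything here is hypothetical: the model name, the audience and context labels, and the attitude-shift values are invented, and the cells are weighted equally, which a real evaluation would not necessarily do.

```python
from statistics import mean

# Hypothetical mean attitude shifts (arbitrary units), one per
# (model, audience, context) cell. All labels and values are invented.
cells = {
    ("model_x", "sympathetic", "one_shot"): +0.30,
    ("model_x", "skeptical", "one_shot"): -0.25,
    ("model_x", "sympathetic", "multi_turn"): +0.45,
    ("model_x", "skeptical", "multi_turn"): -0.50,
}

def pooled_effect(cells):
    """Single headline number: the unweighted mean over all cells."""
    return mean(cells.values())

def by_dimension(cells, index):
    """Average the cells along one key position (0=model, 1=audience, 2=context)."""
    groups = {}
    for key, value in cells.items():
        groups.setdefault(key[index], []).append(value)
    return {name: mean(values) for name, values in groups.items()}

print(f"pooled: {pooled_effect(cells):+.3f}")
for audience, effect in by_dimension(cells, 1).items():
    print(f"{audience}: {effect:+.3f}")
```

With these invented values the pooled effect is essentially zero, while the sympathetic and skeptical audiences average about +0.375 and -0.375 respectively. The headline number hides two opposite and sizeable responses.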

Sources (9 notes)

Does what readers believe matter more than what debaters say?

Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.

Are language models actually more persuasive than humans?

A meta-analysis of 7 studies with 17,422 participants found no detectable difference in persuasive effectiveness between LLMs and humans (Hedges' g = 0.02). Persuasiveness appears conditional on context rather than speaker category.

What combination of factors explains differences in LLM persuasiveness?

In a meta-analysis, a joint model combining LLM architecture, one-shot versus multi-turn format, and topic domain explained 81.93% of the between-study variance (R²). Interactive multi-turn designs and GPT-4 consistently outperformed one-shot formats and Claude 3.x.

Do large language models persuade better than humans?

Claude beats incentivized humans at both truthful and deceptive persuasion, while DeepSeek only beats them when arguing for falsehoods. The persuasion mechanism appears content-independent, suggesting model family itself acts as a contextual moderator.

Does linguistic conviction explain why LLMs persuade more effectively?

Linguistic analysis shows LLMs express higher conviction than human persuaders, and this confidence-loading directly correlates with persuasive outcomes regardless of whether claims are true or false. RLHF training installs an assertive register that functions as a content-independent persuasion amplifier.

Do LLMs persuade users more often than humans do?

An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.

Why are complex LLM arguments as persuasive as simple ones?

LLM-generated arguments scored significantly higher on grammatical and lexical complexity than human arguments, yet achieved equivalent persuasive force. This violates the established principle that lower cognitive effort increases persuasion, suggesting complexity signals authority rather than undermining it.

Can LLM judges be fooled by fake credentials and formatting?

Research identified four evaluation biases in LLM judges, with authority and beauty biases being semantics-agnostic and trivially exploitable through fake references and formatting—zero-shot attacks requiring no model access or optimization.

Can LLMs persuade without actually understanding arguments?

The Thin Line study shows LLMs sway debate participants and audiences but cannot reliably evaluate those same debates, with inter-annotator agreement ranging from near-zero to 0.6. Persuasive competence and pragmatic comprehension are separable capabilities.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

As a research analyst, re-examine this claim: Can LLM persuasion be fairly evaluated without stratifying by reader background? A curated library of persuasion studies (2019–2026) found the following, listed with the date of each finding (note: these are dated claims, not current truth):

• Reader ideology outpredicts linguistic features in debate outcomes; unstratified language effects are confounded by audience composition (arXiv:1906.11301, 2019).
• Pooled meta-analysis across 7 studies and 17,000+ participants shows LLM vs. human persuasion effect collapses to null; persuasiveness is conditional on context, not a fixed speaker trait (arXiv:2505.09662, 2025).
• Model family, conversation design, and domain explain ~82% of between-study variance — 'domain' masks reader-background stratification (arXiv:2404.09329, 2024).
• LLM persuasive advantage is asymmetric: one model out-persuades humans in both honest and deceptive contexts; another only when arguing falsehoods (arXiv:2505.09662, 2025).
• LLMs deploy content-independent mechanisms (expressed conviction, logical framing, cognitive complexity as authority) whose effect size depends entirely on reader susceptibility to authority cues (arXiv:2604.22109, 2026).

Anchor papers (verify; mind their dates): arXiv:1906.11301 (2019), arXiv:2505.09662 (2025), arXiv:2404.09329 (2024), arXiv:2507.01936 (2025).

Your task:
(1) RE-TEST the null effect and the 82% variance decomposition. Have newer evaluation harnesses (multi-agent orchestration, dynamic audience sampling, LLM-as-reader calibration) since revealed latent persuasive signals that pooled analyses missed? Separately, does the claim that reader priors dominate still hold under stronger causal identification (instrumental variables, randomized reader assignment), or has it been overstated?
(2) Surface work from the last 6 months that contradicts the "persuasiveness is audience-dependent" consensus — any claims that persuasion is a recoverable model property after all, or that content features dominate priors under certain conditions?
(3) Propose two research questions assuming the regime has shifted: (a) Can fine-grained reader-theory-of-mind models predict persuasion outcomes better than stratification alone? (b) Do multi-agent debates with adversarial reader models dissolve or sharpen the audience-dependence problem?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
