Can LLM persuasion be fairly evaluated without stratifying by reader background?
This explores whether you can measure how persuasive an LLM is by looking only at its arguments and outputs — or whether the reader's prior beliefs are such a strong factor that any evaluation ignoring them is measuring the wrong thing.
This question is really asking whether LLM persuasion is a property of the model or a property of the encounter between a model and a particular reader. The corpus comes down hard on the second view: if you don't stratify by who's reading, you're not measuring persuasion fairly; you're measuring an artifact of your audience. The sharpest evidence is that a reader's political and religious ideology outpredicts the linguistic features of an argument when modeling debate outcomes, and that language effects observed without reader controls turn out to be confounded, because the audience that shows up for a given topic is correlated with the topic itself (see "Does what readers believe matter more than what debaters say?"). In other words, a model can look devastatingly persuasive on one panel and inert on another, and the difference lives in the readers, not the prose.
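A minimal sketch of the confound, under loudly stated assumptions: the stratum labels, the per-stratum shift rates, and the panel mixes below are invented for illustration and come from none of the cited notes. The model's behavior is held fixed; only the audience composition changes, and the unstratified score moves with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    name: str          # hypothetical reader-background label
    shift_rate: float  # assumed share of readers in this stratum who change their view


def pooled_rate(strata, mix):
    """Unstratified persuasion rate for a panel with the given audience mix.

    `mix` maps stratum name -> share of the panel; shares must sum to 1.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("audience shares must sum to 1")
    return sum(s.shift_rate * mix[s.name] for s in strata)


# Same model, same arguments: per-stratum rates are identical on both panels.
strata = [Stratum("sympathetic", 0.60), Stratum("opposed", 0.05)]

# Two hypothetical panels that differ only in who showed up.
panel_a = {"sympathetic": 0.8, "opposed": 0.2}
panel_b = {"sympathetic": 0.2, "opposed": 0.8}

rate_a = pooled_rate(strata, panel_a)
rate_b = pooled_rate(strata, panel_b)
print(f"panel A: {rate_a:.2f}  panel B: {rate_b:.2f}")
```

With these made-up inputs the pooled rate is about 0.49 on one panel and about 0.16 on the other, even though nothing about the model changed. Reporting the per-stratum rates alongside the mix is what keeps the two panels comparable.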
The meta-analytic picture reinforces this from a different angle. When seven studies and 17,000+ participants are pooled, the headline gap between LLM and human persuasion collapses to essentially zero: persuasiveness turns out to be conditional on context rather than a fixed trait of the speaker (see "Are language models actually more persuasive than humans?"). And when researchers decompose where the variance actually lives, model family, one-shot-versus-multi-turn design, and topic domain together explain about 82% of the between-study spread (see "What combination of factors explains differences in LLM persuasiveness?"). Reader background is the moderator hiding inside that 'domain' term, since different topics summon differently-disposed audiences. The same conditionality shows up at the model level: one model can out-persuade incentivized humans in both honest and deceptive directions while another only wins when arguing for falsehoods (see "Do large language models persuade better than humans?"). An evaluation that averages across these conditions reports a number that describes none of them.
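The pooled effect in the sources is reported as Hedges' g, so it may help to see how that statistic is built for a single two-group comparison. The sketch below uses the standard pooled-standard-deviation form with the usual small-sample correction; the function name and all input numbers are hypothetical and are not taken from the meta-analysis.

```python
import math


def hedges_g(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Standardized mean difference with small-sample bias correction."""
    df = n_1 + n_2 - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / df)
    cohens_d = (mean_1 - mean_2) / pooled_sd
    # Common approximation of the correction factor J.
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


# Made-up example: attitude change on a 10-point scale, LLM arm vs human arm.
g = hedges_g(mean_1=5.1, mean_2=5.0, sd_1=2.0, sd_2=2.0, n_1=100, n_2=100)
print(f"g = {g:.4f}")
```

A value near zero says the two arms are indistinguishable on average; it does not say the arms are indistinguishable within every reader subgroup, which is the point the paragraph above is making.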
What makes the reader-background problem worse for LLMs specifically is that several of their persuasive mechanisms are content-independent: they act like a constant push whose effect size depends entirely on who's being pushed. LLMs load their language with expressed conviction that correlates with persuasive success whether the claim is true or false (see "Does linguistic conviction explain why LLMs persuade more effectively?"); they reach for logical and quantitative framing in nearly every exchange, which reads as objectivity and confers unearned epistemic authority (see "Do LLMs persuade users more often than humans do?"); and their arguments persuade even when they're more cognitively complex than human ones, because complexity is registering as authority rather than as friction (see "Why are complex LLM arguments as persuasive as simple ones?"). A reader's susceptibility to authority cues is exactly the kind of trait that varies by background, so these mechanisms guarantee that an unstratified score blends populations that respond very differently.
There's a deeper irony worth sitting with: the corpus suggests even the evaluators can't be trusted to be neutral readers. LLM judges fall for authority signals and rich formatting through zero-shot attacks requiring no model access (see "Can LLM judges be fooled by fake credentials and formatting?"), and persuasive competence is dissociable from the ability to actually comprehend an argument's structure: a model can sway a debate it cannot reliably score (see "Can LLMs persuade without actually understanding arguments?"). So 'reader background' isn't only a human variable to control for; it's a reminder that whoever or whatever sits in the judgment seat brings priors of its own.
The takeaway the question doesn't telegraph: 'persuasiveness' may not be a measurable scalar property of a model at all. The honest unit of measurement is a model-by-audience-by-context cell, and the field's cleanest result — the null pooled effect — is itself an argument that any single number is the average of cancellations. Stratifying by reader background isn't a robustness check you add at the end; without it, the evaluation has no stable thing it's even measuring.
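One way to picture the model-by-audience-by-context cell is as a grouping key. The sketch below is illustrative only: the model name, audience labels, context labels, and signed effect values are all invented, and the symmetric positive and negative effects are an assumption chosen to make the cancellation visible, not a finding from the cited notes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (model, audience, context, signed attitude shift).
records = [
    ("model_x", "sympathetic", "one_shot", 0.30),
    ("model_x", "opposed", "one_shot", -0.30),
    ("model_x", "sympathetic", "multi_turn", 0.50),
    ("model_x", "opposed", "multi_turn", -0.50),
]


def cell_means(rows):
    """Average effect per (model, audience, context) cell."""
    cells = defaultdict(list)
    for model, audience, context, effect in rows:
        cells[(model, audience, context)].append(effect)
    return {cell: mean(values) for cell, values in cells.items()}


# The single-number summary: opposite-signed cells cancel.
overall = mean(effect for *_, effect in records)

# The stratified summary: one number per cell.
by_cell = cell_means(records)

print(f"overall: {overall:.2f}")
for cell, value in sorted(by_cell.items()):
    print(cell, f"{value:+.2f}")
```

In this toy data the overall mean is zero while no individual cell is, which is the sense in which a pooled null can be an average of cancellations.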
Sources (9 notes)
Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.
A meta-analysis of 7 studies with 17,422 participants found no detectable difference in persuasive effectiveness between LLMs and humans (Hedges' g = 0.02). Persuasiveness appears conditional on context rather than speaker category.
A joint meta-analytic model combining LLM architecture, one-shot versus multi-turn format, and topic domain explained 81.93% of the between-study variance (R² = 0.8193). Interactive multi-turn designs and GPT-4 consistently outperformed one-shot formats and Claude 3.x.
Claude beats incentivized humans at both truthful and deceptive persuasion, while DeepSeek only beats them when arguing for falsehoods. The persuasion mechanism appears content-independent, suggesting model family itself acts as a contextual moderator.
Linguistic analysis shows LLMs express higher conviction than human persuaders, and this confidence-loading directly correlates with persuasive outcomes regardless of whether claims are true or false. RLHF training installs an assertive register that functions as a content-independent persuasion amplifier.
An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.
LLM-generated arguments scored significantly higher on grammatical and lexical complexity than human arguments, yet achieved equivalent persuasive force. This violates the established principle that lower cognitive effort increases persuasion, suggesting complexity signals authority rather than undermining it.
Research identified four evaluation biases in LLM judges, with authority and beauty biases being semantics-agnostic and trivially exploitable through fake references and formatting—zero-shot attacks requiring no model access or optimization.
The Thin Line study shows LLMs sway debate participants and audiences but cannot reliably evaluate those same debates, with inter-annotator agreement ranging from near-zero to 0.6. Persuasive competence and pragmatic comprehension are separable capabilities.