INQUIRING LINE

Which linguistic features predict persuasion only after audience composition is held constant?

This explores a methodological twist in persuasion research: once you statistically control for *who* is in the audience (their ideology, their prior beliefs), which language features still earn their predictive power — and which were never really doing the work?


Once you account for the audience listening, the "persuasive language" story changes, and the most direct answer in the corpus is unsettling: many linguistic features that look predictive in standard analyses are actually artifacts of *who showed up*, not what was said. When debate corpora are modeled without controlling for the audience, the language appears to carry the persuasion. But "Does what readers believe matter more than what debaters say?" shows that voters' political and religious ideology outpredicts linguistic features outright, and that the apparent language effects are confounded because audience composition is correlated with debate topics. The language effect was riding on a hidden passenger.

The sharper finding for your exact question is in "Do linguistic features of persuasion stay the same across audiences?": the set of features that predict persuasion *changes* once ideology is added as a control. Features that ranked as top predictors in naive analyses often reflected audience-text matching (the right readers being drawn to the right arguments) rather than any intrinsic persuasive property of the words. So the honest answer to "which features predict only after holding audience constant" is partly a warning: a large share of published features predict *less*, not more, once you control, and the survivors are different ones. The corpus doesn't hand you a clean list of newly-promoted features so much as it tells you the leaderboard reshuffles.

What should survive controls, judging from adjacent work, tends to be content-independent register rather than topical word choice. "Does linguistic conviction explain why LLMs persuade more effectively?" isolates expressed conviction (confidence-loading in the phrasing) as a persuasion amplifier that works regardless of whether claims are true or false. That is exactly the kind of feature that wouldn't wash out when you control for audience ideology, because it doesn't depend on topic-audience matching. Similarly, "Why are presuppositions more persuasive than direct assertions?" points to a structural mechanism: presuppositions persuade more than assertions by smuggling claims in as already-accepted background, bypassing scrutiny. These are features tied to *how* something is said, not *what* topic it's about, which is the kind that should remain predictive after audience composition is held fixed.
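The contrast between an audience-matched topical feature and a register-level feature can be made concrete with a toy calculation. The sketch below is illustrative only: every count is invented for the example (none comes from the corpus or the cited notes), and the names `TOPICAL`, `REGISTER`, `pooled_effect`, and `stratified_effect` are hypothetical. It compares the raw difference in persuasion rate (feature present minus feature absent) with the same difference computed inside each audience group and then averaged, which is one simple way of holding audience composition constant.

```python
from collections import defaultdict

# Hypothetical counts, invented purely for illustration.
# Each row: (audience_group, feature_present, n_debates, n_persuaded)

# A topical feature that appears mostly in front of already-aligned audiences.
# Within each audience group the persuasion rate is identical with or without it.
TOPICAL = [
    ("aligned", 1, 400, 240),
    ("aligned", 0, 100, 60),
    ("opposed", 1, 100, 20),
    ("opposed", 0, 400, 80),
]

# A register feature spread evenly across audiences, with a real within-group lift.
REGISTER = [
    ("aligned", 1, 250, 175),
    ("aligned", 0, 250, 125),
    ("opposed", 1, 250, 75),
    ("opposed", 0, 250, 25),
]


def pooled_effect(rows):
    """Persuasion rate with the feature minus the rate without it, ignoring audience."""
    totals = defaultdict(int)
    persuaded = defaultdict(int)
    for _, feature, n, k in rows:
        totals[feature] += n
        persuaded[feature] += k
    return persuaded[1] / totals[1] - persuaded[0] / totals[0]


def stratified_effect(rows):
    """The same difference computed within each audience group,
    then averaged with weights proportional to group size."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[0]].append(row)
    grand_total = sum(row[2] for row in rows)
    effect = 0.0
    for group_rows in by_group.values():
        group_n = sum(row[2] for row in group_rows)
        effect += (group_n / grand_total) * pooled_effect(group_rows)
    return effect


for name, rows in [("topical", TOPICAL), ("register", REGISTER)]:
    print(name, round(pooled_effect(rows), 2), round(stratified_effect(rows), 2))
# topical 0.24 0.0
# register 0.2 0.2
```

In this constructed data the topical feature's 24-point apparent advantage disappears entirely once audience group is held fixed, while the register feature keeps its 20-point lift. Real analyses would more likely use regression with ideology covariates than a two-group stratification, but the logic of the comparison is the same.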

There's a useful cross-domain contrast lurking here. "Can simple linguistic features detect AI-written arguments?" shows that interpretable linguistic features can detect AI-authored arguments with near-perfect accuracy (99% on r/ChangeMyView counter-arguments), proof that stylistic signatures are real and measurable. But *detectable* and *persuasive-after-controls* are different bars: a feature can be a reliable fingerprint of who wrote the text while contributing nothing to whether it changed a reader's mind. The audience-control literature is precisely the tool that separates those two, and it's why the field-wide null on LLM-versus-human persuasion matters: the pooled effect is statistically null, and when speaker category and audience are properly accounted for, persuasion turns out to be conditional on context, not a property you can read off the surface language alone.

The thing worth taking away: "persuasive language" research has a confound problem, and the audience is the confound. Holding it constant doesn't just refine the answer — it can invert it, demoting topical features that were really audience-matching in disguise and promoting register-level features like conviction and presupposition that work independently of who's listening.


Sources 6 notes

Does what readers believe matter more than what debaters say?

Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.

Do linguistic features of persuasion stay the same across audiences?

The linguistic features that predict persuasion success change dramatically once political and religious ideology are added as statistical controls. Features appearing predictive in standard analyses often reflect audience-text matching rather than true language effects, making many published findings potentially artifacts of audience composition.

Does linguistic conviction explain why LLMs persuade more effectively?

Linguistic analysis shows LLMs express higher conviction than human persuaders, and this confidence-loading directly correlates with persuasive outcomes regardless of whether claims are true or false. RLHF training installs an assertive register that functions as a content-independent persuasion amplifier.

Why are presuppositions more persuasive than direct assertions?

Experimental evidence shows presuppositions with additive, iterative, and factive triggers persuade audiences more than assertions, especially for discourse-new content. The mechanism: presuppositions bypass evaluative scrutiny by presenting claims as already-accepted background.

Can simple linguistic features detect AI-written arguments?

General linguistic features combined with argument-quality measures achieved 99% accuracy detecting LLM-generated counter-arguments on r/ChangeMyView, matching heavyweight neural detectors while remaining computationally cheap and transparent. LLMs produce detectable stylistic signatures: accommodation to prompts and textbook-quality argument markers that humans don't replicate.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a persuasion linguist. The question remains open: Which linguistic features predict persuasion *only* after audience composition is held constant?

What a curated library found — and when (dated claims, not current truth):
Findings span 2019–2026. A curated library reports:
• Audience ideology outpredicts linguistic features in debate corpora; topical word choice often reflects audience-text matching, not intrinsic persuasive power (2019).
• The set of predictive features *reshuffles* once reader prior beliefs are controlled; naive leaderboards collapse (2024).
• Expressed conviction (confidence-loading in phrasing) persuades regardless of truth value and, being content-independent, is expected to survive audience controls (~2024).
• Presuppositions outperform assertions by embedding claims as background; this register-level feature should remain predictive post-control (2025).
• LLM persuasiveness is context-conditional; pooled effect is statistically null when speaker category and audience are properly accounted for (2025).

Anchor papers (verify; mind their dates):
• arXiv:1906.11301 (2019) — Prior beliefs as confound
• arXiv:2404.09329 (2024) — Cognitive effort in LLM persuasion
• arXiv:2505.22354 (2025) — Presupposition rejection under pressure
• arXiv:2604.22109 (2026) — Audit of everyday conversational persuasiveness

Your task:
(1) RE-TEST the audience-confound thesis. Has recent work (last 6 mo.) with better causal isolation methods, larger matched-audience designs, or adversarial audience sampling *weakened* the claim that topical features are mostly artifacts? Or do they reinforce it? Separately: have new register-level features (beyond conviction/presupposition) emerged that survive controls?
(2) Surface the strongest *disagreement*: any recent paper claiming that linguistic features *do* retain predictive power independent of audience, contradicting the library's null-finding narrative. Flag the methodological crux.
(3) Propose 2 research Qs that assume the regime has shifted: (a) How do multimodal persuasion signals (tone, gesture, LLM-generated persona) interact with audience composition controls? (b) Do audience *heterogeneity* and *polarization* *reinstate* topical-linguistic effects by breaking the homogeneity assumption underlying prior-belief controls?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
