Do LLM counter-arguments mirror writing style more than human ones do?
When language models generate arguments against social media posts, do they unconsciously adopt the stylistic features of what they're arguing against? This matters because it could reveal a detectable pattern that distinguishes LLM-written rebuttals from human-written ones.
When LLMs generate counter-arguments on r/ChangeMyView, they unintentionally produce a signature: their replies converge stylistically with the original post they are replying to — substantially more than humans do. The convergence shows up across named entities, psycholinguistic features, and argument quality markers. Human replies remain stylistically more independent of the post's wording.
This is mechanistically interesting because it inverts the intuitive picture of LLM persuasion. The naive expectation is that LLMs produce a stable "house voice" regardless of input. The data show the opposite: LLMs mirror their context more than humans do, not less. The mechanism is plausibly attention-driven (autoregressive generation conditioned on the prompt drags style toward the prompt), but the social-theoretic framing is more useful: this looks like the structural form of communication accommodation, without the social motivation that drives humans to mirror selectively.
The detection consequence is direct. If you want to know whether a counter-argument was written by a model, the relational feature (how closely the reply resembles the post) is more informative than any absolute feature of the reply itself. Standard detection setups treat each text as an independent sample; this study suggests that pairing the reply with the post that provoked it and measuring their convergence gives a cleaner signal.
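A minimal sketch of what a relational feature could look like, under loud assumptions: the function-word list, the `style_vector` and `convergence` helpers, and the example texts are all hypothetical stand-ins, not the study's features (which cover named entities, psycholinguistic categories, and argument quality markers). The only point is that the score is computed on the (post, reply) pair rather than on the reply alone.

```python
import math
import re
from collections import Counter

# Assumption: a tiny hand-picked function-word list stands in for the
# study's richer feature set. It is illustrative only.
FUNCTION_WORDS = ("the", "a", "of", "and", "to", "in", "that",
                  "is", "it", "not", "but", "i", "you", "we")


def style_vector(text: str) -> list[float]:
    """Relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def convergence(post: str, reply: str) -> float:
    """Relational feature: how closely the reply's style vector tracks the post's."""
    return cosine(style_vector(post), style_vector(reply))


# Hypothetical example texts, written for this sketch.
post = "I think that the law is not fair, and it is not the job of the state to fix it."
mirroring = "It is true that the law is not perfect, but it is the job of the state to fix it."
independent = "Courts exist because fairness needs enforcement, and you know we depend on them."

print(convergence(post, mirroring) > convergence(post, independent))
```

A score like this is not a classifier on its own. Following the framing above, it would be one input compared against the range of convergence that human replies to the same kind of post typically show.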
This opens a social-theoretic question. Humans accommodate selectively: they mirror friends and people they want to align with, and resist mirroring opponents. LLMs mirror unconditionally. An LLM replying to a post it is arguing against will therefore still converge stylistically with that post, which would be socially incoherent if a human did it. The convergence is not communicative accommodation in the social sense; it is a structural artifact masquerading as one.
Inquiring lines that use this note as a source (13)
This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.
- How does prompt framing subtly determine what kind of opposing argument an LLM generates?
- Why can language models detect author style without understanding why it matters?
- Can LLMs distinguish stylistic patterns that carry meaning from mere convention?
- Do anaphoric references fundamentally limit argumentative force in machine-generated writing?
- Why do LLMs mirror stylistic features of posts they reply to?
- Can you detect LLM arguments by measuring convergence with the original post?
- What linguistic features most strongly signal LLM authorship in counter-arguments?
- Why do LLMs mirror opponents stylistically while humans resist mirroring them?
- Does unconditional stylistic mirroring harm or help LLM persuasiveness?
- What role does stylistic convergence play in LLM persuasion effectiveness?
- Can forensic features reliably distinguish LLM arguments from human arguments?
- Do LLMs mirror the style of text they are prompted to respond to?
- Do LLM replies mirror the language patterns they respond to?
Related concepts in this collection (3)
- Do LLMs and humans persuade through the same mechanisms?
  If LLM and human arguments achieve equal persuasive force, does that mean they work the same way? This explores whether equivalent outcomes hide fundamentally different rhetorical strategies.
  extends the equivalence-with-divergent-mechanisms picture: stylistic mirroring is part of how LLMs achieve equivalent persuasive force
- Can simple linguistic features detect AI-written arguments?
  Can interpretable linguistic patterns reliably distinguish LLM-generated counter-arguments from human-written ones in persuasive contexts? This matters because simple, auditable detection might outperform expensive neural approaches.
  the same paper's detection result depends partly on this convergence signal
- Do LLMs use moral language more than humans?
  This explores whether large language models rely more heavily on appeals to care, fairness, authority, and sanctity than human arguers do, and whether this difference persists when emotional tone remains equivalent.
  a different production signature also separating LLM and human persuasion
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- AI Argues Differently: Distinct Argumentative and Linguistic Patterns of LLMs in Persuasive Contexts
- Large Language Models are as persuasive as humans, but how? About the cognitive effort and moral-emotional language of LLM arguments
- Spontaneous Persuasion: An Audit of Model Persuasiveness in Everyday Conversations
- Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models
- Can Language Models Recognize Convincing Arguments?
- Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses
- Exploring the Potential of Large Language Models in Computational Argumentation
- Do Large Language Models Reason Causally Like Us? Even Better?
Original note title
LLM counter-arguments converge stylistically with the post they reply to — humans don't mirror creating a detectable accommodation signature