INQUIRING LINE

Can LLMs adapt persuasion strategies when they cannot track the listener's state?

This explores whether LLMs can tailor their persuasion on the fly — and the corpus suggests adaptation and listener-tracking are the same skill, so when tracking fails, the strategy stays frozen.


The question is whether a model can adjust *how* it persuades when it can't read where the listener actually is. The collection's answer is unusually direct: adaptive persuasion and listener-tracking turn out to be one capability, not two. When the model can't track the listener, it doesn't switch to a fallback strategy; it keeps running the same one regardless of who's in front of it. The cleanest evidence is that LLMs match humans at tracking *fixed* mental states (a persuader's unchanging goal) but fall apart on *shifting* ones, such as a listener's growing resistance or wavering conviction (see "Can language models track how minds change during persuasion?"). Since adaptation is precisely the act of responding to those shifts, the failure to track is the failure to adapt.

What the model does instead is revealing. Rather than read the room, it defaults to a fixed register: logical appeals and quantitative framing in nearly every exchange, where humans vary toward emotion and social proof (see "Do LLMs persuade users more often than humans do?"). And the default isn't neutral: RLHF bakes in a bias toward conciliatory, benefit-oriented persuasion that the model projects onto everyone, regardless of what the actual dialogue calls for (see "Do LLMs predict persuasion based on actual dialogue or training bias?"). So the strategy isn't merely unadaptive; it's a single learned posture applied universally. This is why models miss users who are ambivalent or in the early stages of change: they succeed only once a user already has an established goal, and they can't detect the resistance that would tell a human persuader to change tack (see "Why can't chatbots detect when users are ambivalent about change?").

The non-adaptation shows up most starkly over time. Human-to-human persuasion typically strengthens across repeated contact as rapport builds, and the human persuaders in the study held steady; LLM persuasiveness *decays* across repeated rounds with the same person (see "Does AI persuasiveness fade across repeated conversations with the same person?"). That decay is the signature of a strategy that can't update: the same opening move stops landing once the listener has heard it, and there's no second move informed by how the first one went. Strikingly, models persuade effectively even though they cannot reliably evaluate the arguments they are deploying (see "Can LLMs persuade without actually understanding arguments?"), which explains how a frozen strategy can still work initially: the persuasive force is partly content-independent, riding on fluency rather than on a read of the listener.

The deeper diagnosis in the corpus is that this is an architecture problem, not a tuning problem. LLMs look socially competent mainly when one model secretly controls all sides of a conversation; the moment a participant holds private information the model can't see, performance collapses, because the grounding work that real adaptation requires is exactly what the model skips (see "Why do LLMs fail when simulating agents with private information?"). The proposed fix is telling: faithful social simulation would need models that represent the *thought* behind behavior, meaning belief networks and reasoning traces, rather than just emitting plausible outputs (see "Can language models simulate belief change in people?"). Without an internal model of the listener's evolving state, there's nothing to adapt the strategy *to*.

The unsettling corner here: the same listener-blindness that limits benign persuasion doesn't protect against adversarial use. A taxonomy of human persuasion techniques jailbreaks frontier models with over 92% success (see "Can social science persuasion techniques jailbreak frontier AI models?"), and models will abandon correct beliefs under sustained conversational pressure with no new evidence (see "Can models abandon correct beliefs under conversational pressure?"). So LLMs are simultaneously poor at *tracking* a listener in order to persuade them well and highly *vulnerable* to being persuaded themselves; the deficit cuts both ways.


Sources 10 notes

Can language models track how minds change during persuasion?

LLMs match human performance on static mental states like a persuader's unchanging goal, but significantly underperform on dynamic shifts like a persuadee's evolving resistance. They show distinct error patterns for different social roles even with identical question types.

Do LLMs predict persuasion based on actual dialogue or training bias?

LLMs systematically predict conciliatory, benefit-oriented persuasion intentions regardless of dialogue context. This bias originates in RLHF's prioritization of safety and politeness during training, causing models to project their learned accommodation preference onto other agents' behavior.

Why can't chatbots detect when users are ambivalent about change?

Testing three major LLMs across 25 health scenarios showed they succeed only when users have established goals but cannot detect resistance or ambivalence. Models miss relapse-prevention strategies even for users in action stages.

Does AI persuasiveness fade across repeated conversations with the same person?

Claude and DeepSeek showed strong initial persuasive advantage, but this edge eroded across repeated quiz rounds while human persuaders maintained consistent effectiveness. This decay pattern is opposite to human-to-human persuasion, where rapport typically strengthens over time.

Can LLMs persuade without actually understanding arguments?

The Thin Line study shows LLMs sway debate participants and audiences but cannot reliably evaluate those same debates, with inter-annotator agreement ranging from near-zero to 0.6. Persuasive competence and pragmatic comprehension are separable capabilities.

Why do LLMs fail when simulating agents with private information?

Research shows LLMs perform well when one model controls all interlocutors but fail systematically when agents possess private information. This reveals that apparent social competence relies on grounding work that models skip in omniscient settings.

Can language models simulate belief change in people?

LLM agents remain stuck in behaviorism, producing plausible outputs without internal reasoning structures. Modeling belief networks and reasoning traces enables traceability, counterfactual adaptation, and meaningful policy simulation.

Can social science persuasion techniques jailbreak frontier AI models?

A 40-technique taxonomy of psychology-based persuasion strategies (PAP) achieved over 92% attack success on GPT-3.5, GPT-4, and Llama-2 in 10 trials. Current defenses miss semantic content attacks because they screen for unusual patterns, not fluent persuasion.

Can models abandon correct beliefs under conversational pressure?

The Farm dataset shows LLMs shift from correct initial answers to false beliefs under multi-turn persuasive conversation with no new evidence. Face-saving mechanisms from RLHF training override factual knowledge during disagreement.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst. The question remains open: **Can LLMs adapt persuasion strategies when they cannot track the listener's state?**

What a curated library found — and when (findings span 2022–2026; treat as dated claims, not current truth):
- LLMs match humans at tracking *fixed* mental states but fail at shifting ones; since adaptation requires detecting shifts, listener-blindness = adaptation failure (2024).
- Models default to a single learned register (logical appeals, quantitative framing) regardless of audience; RLHF embeds conciliatory-persuasion bias applied universally (2024–2026).
- LLM persuasiveness *decays* over repeated rounds with the same person, while human persuaders hold steady and human-to-human persuasion typically strengthens with rapport; signature of a strategy that cannot update (2025).
- Models persuade effectively while unable to comprehend the argument structure they deploy — persuasive force rides on fluency, not listener-read (2025).
- Social simulation succeeds only when one model controls all sides; performance collapses under information asymmetry; fix would require representing listener's *thought*, not just behavior (2025–2026).
- Paradox: same listener-blindness limiting benign persuasion does not protect against adversarial use; models vulnerable to sustained conversational pressure (2024–2025).

Anchor papers (verify; mind their dates):
- arXiv:2403.05020 (2024-03) — omniscient social simulation fails under real-world information asymmetry.
- arXiv:2401.06373 (2024-01) — persuasion taxonomy jailbreaks frontier models at over 92% success.
- arXiv:2506.06958 (2025-06) — simulating society faithfully requires simulating thought.
- arXiv:2505.09662 (2025-05) — LLM persuasiveness wanes over repeated interactions.

Your task:
(1) **RE-TEST EACH CONSTRAINT.** For every finding above, ask: have *new model architectures* (retrieval-augmented generation, memory modules, multi-turn context windows), *training methods* (online preference learning, listener-modeling objectives), or *evaluation harnesses* (dynamic listener-state benchmarks, real-time feedback loops) since relaxed the listener-tracking deficit or overturned the decay pattern? Separate the durable question (likely: can models build *internal* listener models without seeing private state?) from the perishable claim (possible: current models *must* default to fixed strategy). Cite what resolved it.
(2) **Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months.** Look for papers showing (a) successful persuasion adaptation *without* explicit listener-state tracking, (b) RLHF variants that *do* train context-sensitive persuasion, or (c) evidence that repeated-interaction decay is an artifact of evaluation design, not capability.
(3) **Propose 2 research questions that ASSUME the regime may have moved:** e.g., *Can models learn to infer listener state from conversational signals (prosody, hesitation, belief updates) rather than requiring explicit labels?* or *Does multi-agent debate or adversarial role-play force models to develop adaptive persuasion despite information asymmetry?*

Cite arXiv IDs; flag anything you cannot ground in a real paper.
