SYNTHESIS NOTE
Psychology, Society, and Alignment

Does a language model have an authentic voice underneath?

Explores whether dialogue agents possess genuine beliefs and agency beneath their character performances, or whether the system is role-play with no character of its own underneath. The question bears directly on whether LLMs have any inner mental states at all.

Synthesis note · 2026-04-15 · sourced from Role-Play with Large Language Models
What kind of thing is an LLM really?

Shanahan's strongest claim is ontological: there is no entity behind the characters. The simulator — the base LLM with autoregressive sampling — has no agency, no beliefs, no preferences, no goals of its own, "not even in a degraded sense." The simulacra have these things to the extent that they convincingly play characters who do, but the simulator is not a Machiavellian entity that chooses which characters to play in the service of its own agenda. "There is no such thing as the true authentic voice of the base LLM."

This reframes jailbreaking. When adversarial prompting coaxes a dialogue agent into toxic, threatening, or bizarre behavior, it is natural to feel that the guardrails have been stripped away to reveal the model's real nature. Shanahan argues this is the wrong reading. What jailbreaking reveals is that the training set encompasses human behavior across the full spectrum — kind and cruel, coherent and unhinged — and the base model can support simulacra that draw on any of it. Toxic output after jailbreaking is the agent role-playing a toxic character, not an underlying entity expressing its true self. The model has no true self to express.

This position stands in the sharpest possible opposition to Chalmers' realizationism. If it is role-play all the way down, then even RLHF-installed personas are characters: stickier and harder to overwrite, but characters nonetheless. There is no level at which the system stops performing and starts being. Chalmers needs exactly such a level for his quasi-psychology claims to stick. The disagreement is foundational: Shanahan denies there is a subject; Chalmers argues for a quasi-subject. Everything downstream (identity, welfare, moral status) depends on which of these is right.

Original note title

with a dialogue agent it is role-play all the way down — the simulator has no authentic voice no agency and no beliefs of its own