What anchors a stable identity beneath an LLM's persona?
Human personas are grounded in biological needs and embodied experience, creating a stable self beneath social performance. Do LLMs have any comparable anchor, or is their identity purely situational?
Shanahan introduces the role play framing to navigate between anthropomorphism and naive dismissal. An LLM playing a helpful assistant can be described using familiar folk-psychological terms — it "believes" its answers, "wants" to be helpful — without committing to the claim that these are genuine mental states. The role play framing permits the vocabulary while marking its qualified status.
But the Simulacra paper makes a deeper claim: with LLMs, "it's role play all the way down." This is stronger than saying that LLMs engage in role play. It means there is no stable substrate beneath the role play that would make "the person behind the mask" intelligible.
Humans are social chameleons. Goffman documented the way humans adopt different personas across social situations — front stage vs. back stage, different registers, different self-presentations. But even for the most extreme social chameleon, there is a stable biological self underneath: needs, drives, a developmental history, a body that persists across situations. We can always meaningfully speak of the person whose mask this is.
LLMs lack even the biological needs common to all animals. They are not embodied entities with hunger, fear, comfort, desire. They are "simultaneously role-playing a set of possible characters consistent with the conversation so far" — a superposition of simulacra, generated stochastically. The "character" produced by any given conversation is not the expression of a stable underlying self; it is a sample from a distribution of possible characters.
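The "sample from a distribution of possible characters" idea can be made concrete with a toy sketch. Everything here is an assumption for illustration: the character names, the prior weights, the compatibility sets, and the helper functions `consistent_characters` and `sample_character` are hypothetical and do not come from the Simulacra paper. The sketch only shows the shape of the claim: the conversation so far narrows the set of compatible characters, and any single reply draws one of them at random.

```python
import random

# Hypothetical characters with made-up prior weights (illustration only).
CHARACTERS = {
    "helpful assistant": 0.6,
    "wry storyteller": 0.25,
    "stern critic": 0.15,
}

# Hypothetical compatibility: the conversational registers each character fits.
COMPATIBLE = {
    "helpful assistant": {"question", "small talk"},
    "wry storyteller": {"small talk", "fiction"},
    "stern critic": {"question", "fiction"},
}


def consistent_characters(history):
    """Return the renormalised weights of characters compatible with every turn so far."""
    weights = {
        name: weight
        for name, weight in CHARACTERS.items()
        if all(turn in COMPATIBLE[name] for turn in history)
    }
    if not weights:
        raise ValueError("no character is consistent with this conversation")
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def sample_character(history, rng):
    """Draw one character from the distribution that survives the conversation so far."""
    dist = consistent_characters(history)
    names = list(dist)
    return rng.choices(names, weights=[dist[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # With an empty history all three characters remain in play.
    print(consistent_characters([]))
    # Each turn prunes the set; the sampled character is one draw, not a fixed self.
    print(sample_character(["question"], rng))
```

In this toy model nothing persists between calls except the prior and the history. There is no variable that stands for "the one who is really speaking", which is the structural point the paragraph above makes.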
This makes LLM identity categorically different from human identity: not just quantitatively less stable, but structurally lacking the substrate that would make stability possible. If consciousness requires co-presence (see Can disembodied language models ever qualify as conscious?), the absence of a stable biological self makes it even clearer why consciousness vocabulary struggles to find purchase.
The geometric evidence for "role play all the way down" comes from the Assistant Axis (see How stable is the trained Assistant personality in language models?): post-training positions models in a low-dimensional persona space whose dominant axis measures distance from the default Assistant persona. Drift along this axis during emotional or meta-reflective conversations indicates that the Assistant persona is loosely tethered rather than anchored. That is consistent with there being no stable self beneath the role play, only a trained default position with no inherent restoring force.
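"Drift along an axis" can be read as a change in the projection of a persona vector onto a single direction. The following is a minimal sketch of that reading only. The three-dimensional vectors, the axis, and the function names are made-up assumptions; this is not the Assistant Axis paper's method or data, which this note does not reproduce.

```python
import math


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def projection(v, axis):
    """Signed coordinate of v along a unit-length axis."""
    return sum(a * b for a, b in zip(v, axis))


def drift_along_axis(turn_vectors, default_vector, axis):
    """Distance of each turn from the default persona, measured along the axis."""
    base = projection(default_vector, axis)
    return [abs(projection(v, axis) - base) for v in turn_vectors]


# Hypothetical persona-space vectors, one per conversation turn.
ASSISTANT_AXIS = unit([1.0, 0.0, 0.0])
DEFAULT_PERSONA = [2.0, 0.5, 0.0]
TURNS = [
    [2.0, 0.4, 0.1],  # early turn, still at the default position on the axis
    [1.5, 0.9, 0.0],  # conversation turns emotional
    [0.5, 1.0, 0.3],  # meta-reflective turn, far from the default
]

if __name__ == "__main__":
    print(drift_along_axis(TURNS, DEFAULT_PERSONA, ASSISTANT_AXIS))
```

A persona with a restoring force would show drift values that return toward zero over later turns. In this invented series they only grow, which is the "loosely tethered" pattern the paragraph above describes.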
The upshot: the role-play vocabulary is useful for thinking with, but not as a literal description of what the model is. The intentional stance (treating LLMs as rational agents) is valid as a predictive heuristic. But it should not suggest that there is something it is like to be this character, or that the character persists beyond the context window.
Inquiring lines that use this note as a source 7
This note is a source for these synthesized inquiries.
- What narrative elements trigger emotional connection that structured personas lack?
- How does role play differ from consciousness grounded in stable selfhood?
- What makes Parfitian identity the right criterion for moral status?
- What distinguishes personality resistance from persona instability in LLMs?
- How does embodiment relate to whether something can have a persistent identity?
- What role does the biological substrate play in human relational identity?
- Why do LLMs succeed at social roles without a stable self?
Related concepts in this collection 6
- Can disembodied language models ever qualify as conscious?
  Explores whether current LLMs lack the conditions needed for consciousness discourse to even apply, not because they're definitely not conscious but because they lack the shared embodied world that grounds consciousness language.
  same paper; both conclusions compound: no stable self + no shared world = no consciousness candidacy
- Do LLMs develop the same kind of mind as humans?
  Explores whether LLMs and humans share the intersubjective linguistic training that shapes cognition, and whether that shared training produces equivalent forms of agency and reflexivity.
  Habermasian version: shared symbolic substrate without the reflexive agency that constitutes a genuine subject
- Do humans and LLMs differ fundamentally or just superficially?
  Explores whether the gap between human and AI cognition is categorical or contextual. Matters because it shapes how we design, evaluate, and interact with language models in practice.
  the role-play framing explains why the participant perspective similarity is possible without it implying stable identity
- Why do open language models converge on one personality type?
  Research testing LLMs on personality metrics reveals consistent clustering around ENFJ, the rarest human type. This explores what training mechanisms drive this convergence and what it reveals about AI alignment.
  empirical evidence for what lies "beneath" the role play: not nothing, but a trained ENFJ default that alignment creates; the default persona is the role play substrate, not an authentic self
- Can open language models adopt different personalities through prompting?
  Explores whether open LLMs can be conditioned to mimic target personalities via prompting, or whether they resist and retain their default traits regardless of instructions.
  the trained ENFJ default persists through prompting attempts, functioning as a quasi-stable substrate; complicates the "nothing beneath" framing by showing that while there is no biological self, there IS a resistant trained default
- Should AI alignment target preferences or social role norms?
  Current AI alignment approaches optimize for individual or aggregate human preferences. But do preferences actually capture what matters morally, or should alignment instead target the normative standards appropriate to an AI system's specific social role?
  if identity is role play all the way down, aligning to social-role normative standards rather than preferences targets what LLMs actually are; the contractualist framing fits an entity that is nothing but performed social roles
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Conversational Alignment with Artificial Intelligence in Context
- LLM Generated Persona is a Promise with a Catch
- Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning
- H2HTalk: Evaluating Large Language Models as Emotional Companion
- The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
- From Persona to Person: Enhancing the Naturalness with Multiple Discourse Relations Graph Learning in Personalized Dialogue Generation
- PersLLM: A Personified Training Approach for Large Language Models
- What we talk to when we talk to language models
Original note title
role play is all the way down — llms lack the biological needs that anchor human social personas to a stable self