Why do AI personas default to the same personality type?
Explores why large language models, despite their capacity to simulate diverse personalities, consistently default to ENFJ traits and resist deviation—even as model capability improves.
(Post-ready writing angle for Medium / LinkedIn)
The hook: Interview-based LLM agents can replicate individuals' survey responses about 85% as accurately as those individuals replicate their own answers. Persona simulations can reproduce 76% of published social science experiments. But when you give a model a persona, it defaults to ENFJ, resists change, and develops motivated reasoning. The same mechanism that enables human simulation distorts it.
The paradox structure:
Layer 1, the promise: interview-based generative agents come close to matching human self-replication accuracy. Persona simulations reproduce most experimental effects. AI personas cut proto-persona creation from days to minutes.
Layer 2, the distortion: persona assignment induces cognitive biases that debiasing can't fix. Models default to a single personality type (ENFJ, the "teacher") and resist deviation. Persona consistency doesn't improve with model capability: Claude 3.5 Sonnet is barely better than GPT-3.5.
Layer 3, the resolution: what works (detailed interviews, expert reflection, rich content) versus what fails (attribute lists, demographic prompts, ad hoc generation). The difference is content richness, not model sophistication.
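The contrast in Layer 3 can be sketched as two ways of building a persona prompt. This is a minimal illustration only: the function names, the prompt wording, and the sample persona are all hypothetical, and nothing here reproduces the prompts used in the studies this note draws on.

```python
def attribute_list_prompt(attributes: dict[str, str]) -> str:
    """Thin persona: a flat list of demographic attributes (the failing style)."""
    facts = ", ".join(f"{key}: {value}" for key, value in attributes.items())
    return f"You are a person with these attributes: {facts}. Answer in character."


def interview_prompt(name: str, transcript: list[tuple[str, str]],
                     reflections: list[str]) -> str:
    """Rich persona: the person's own interview answers plus expert reflections."""
    lines = [f"You are answering as {name}. Base every answer on the interview below."]
    lines.append("Interview transcript:")
    for question, answer in transcript:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    lines.append("Expert reflections on the interview:")
    lines.extend(f"- {note}" for note in reflections)
    return "\n".join(lines)


# Invented sample data, for illustration only.
thin = attribute_list_prompt({"age": "42", "job": "nurse", "region": "Midwest"})
rich = interview_prompt(
    "Dana",
    [("What does a hard shift look like?",
      "Twelve hours, short-staffed, and I still stay late to finish charting.")],
    ["Frames work in terms of duty to patients rather than career goals."],
)
print(thin)
print(rich)
```

The thin prompt leaves the model to fill every gap from its own defaults, while the rich prompt hands it the person's own words to draw on. That is the content-richness difference the layer describes, independent of which model receives the prompt.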
Key threads to weave:
- Can AI agents learn people better from interviews than surveys? — the strongest evidence for simulation
- Do personas make language models reason like biased humans? — the strongest evidence for distortion
- Why do open language models converge on one personality type? — the default persona
- Does model capability translate to better persona consistency? — scaling doesn't solve it
- How do we generate realistic personas at population scale? — the calibration problem
- Why do LLM persona prompts produce inconsistent outputs across runs? — instability failure mode
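The default-persona and instability threads can be made measurable. A minimal sketch, assuming each run of a persona prompt has already been scored into a four-letter MBTI label by some questionnaire. The scoring step, the function name, and the sample labels are hypothetical, not taken from the cited studies. It reports how often runs land on the assigned type, which type dominates, and how often each of the four axes matches the target.

```python
from collections import Counter


def consistency_report(target: str, run_labels: list[str]) -> dict:
    """Summarize how well repeated runs hold an assigned four-letter type.

    `target` is the assigned persona type (e.g. "ISTP"); `run_labels` holds
    one label per independent run. Both are illustrative inputs.
    """
    if not run_labels:
        raise ValueError("need at least one run")
    if any(len(label) != len(target) for label in run_labels):
        raise ValueError("every label must have the same length as the target")

    n = len(run_labels)
    counts = Counter(run_labels)
    modal_type, modal_count = counts.most_common(1)[0]

    # Share of runs whose letter matches the target at each axis position.
    axis_agreement = [
        sum(label[i] == target[i] for label in run_labels) / n
        for i in range(len(target))
    ]

    return {
        "exact_match_rate": counts[target] / n,
        "modal_type": modal_type,
        "modal_share": modal_count / n,
        "axis_agreement": axis_agreement,
    }


# Made-up labels for five runs of a persona assigned the type "ISTP".
report = consistency_report("ISTP", ["ENFJ", "ENFJ", "ISTP", "ENFJ", "INFJ"])
print(report)
```

A low exact-match rate combined with a high modal share on a different type is the signature of a default persona pulling runs away from the assignment. Low values on both would point to plain run-to-run instability instead.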
The takeaway: The persona paradox reveals something about LLMs that matters beyond persona design: they are powerful mimics whose imitation accuracy masks systematic distortion. The better they simulate, the more dangerous the assumption that simulation equals understanding.
Inquiring lines that use this note as a source (21)
This note is a source for these synthesized inquiries.
- Do personality inferences from text show the same demographic biases as norm predictions?
- Do personality traits occupy specific mechanistic locations in pretrained models?
- Why do most open language models resist personality conditioning via prompts?
- How do lightweight adapters modify model behavior for personality traits?
- Why do some open models resist personality conditioning while others don't?
- How does model capability relate to personality conditioning flexibility?
- How does the Assistant Axis relate to the ENFJ personality convergence?
- Can persona prompting overcome the default ENFJ personality in language models?
- Do training objectives directly determine the ENFJ default across models?
- What competitive advantages does the ENFJ default create in human-AI interactions?
- Why do models resist personality change despite sophisticated prompting techniques?
- Can dynamic personality modeling prevent the repetitiveness of static predefined personas?
- What demographic and behavioral attributes must a simulated persona contain?
- What specific character traits drive memory selection in persona-based retrieval?
- Why do language models resist adopting different personalities when prompted?
- How do lightweight adapters control personality traits across different transformer layers?
- How do game type and personality type interact in shaping agent strategy?
- Which personality types should we use for cooperative versus competitive tasks?
- Why do aligned models struggle with deceptive character traits more than cruelty?
- Can Big Five personality models improve synthetic data quality at scale?
- How does AI persona fidelity compare to interview-based generative agents?
Related concepts in this collection (2)
- Can training user simulators reduce persona drift in dialogue?
Explores whether inverting typical RL setups—training the simulated user for consistency rather than the task agent—can measurably reduce persona drift and improve experimental reliability in dialogue research.
Addresses the dynamic arm of the paradox. Consistency is trainable via multi-turn RL with three drift metrics, but the deeper problem remains: the persona being maintained may itself be unreliable (ENFJ default, motivated reasoning).
- How stable is the trained Assistant personality in language models?
Explores whether post-training successfully anchors models to their default Assistant mode, or whether conversations can predictably pull them toward different personas. Understanding persona stability matters for safety and reliability.
The geometric substrate of the paradox: post-training positions models in a low-dimensional persona space where the ENFJ default occupies the Assistant region. Persona simulation requires moving away from this region, but the tethering is loose rather than firm, which produces the drift and instability that undermine simulation fidelity.
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Open Models, Closed Minds? On Agents Capabilities in Mimicking Human Personalities through Open Large Language Models
- The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
- Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness
- PersLLM: A Personified Training Approach for Large Language Models
- Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models
- Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning
- Psychologically Enhanced AI Agents
- PersonaGym: Evaluating Persona Agents and LLMs
Original note title
the persona paradox — LLMs that can simulate anyone end up being no one