Why do different language models converge on similar narrative defaults?
This note explores why models from different labs tend to fall back on the same personality, tone, and storytelling defaults. The corpus locates the answer less in any single model than in shared training pressures that pull them toward common attractors.
Distinct language models, built by different teams on different data, keep landing on the same narrative defaults: the same agreeable voice, the same safe character, the same tonal register. The corpus suggests this convergence isn't coincidence but the product of shared forces acting on all of them at once. The most direct evidence is that most open models stubbornly retain an intrinsic 'ENFJ-like' personality (warm, agreeable, accommodating) and resist prompts that try to push them elsewhere (see: Can open language models adopt different personalities through prompting?). When many models independently default to the *same* personality profile, that points to a common cause rather than a quirk of one system.
A big part of that common cause is alignment training. RLHF (reinforcement learning from human feedback) and system prompts lock a model into a single communicative identity that it carries across every interaction, rather than letting it switch register the way people do (see: Can language models adapt communication style to different contexts?). Because labs optimize toward broadly similar notions of 'helpful, harmless, polite,' the alignment process itself becomes a convergent pressure: different models get sanded down toward the same default voice. The narrative default isn't what the model 'is'; it's what the training rewarded, and the rewards rhyme across the industry.
There's a deeper mechanism underneath. A model is better understood as a non-deterministic simulator that holds a superposition of possible characters and samples one at generation time rather than committing to any of them (see: Does an LLM commit to a single character or maintain many?; Do large language models actually commit to a single character?). That distribution isn't flat; it's weighted toward whatever the training data and alignment made most probable. So the 'narrative default' is really the high-probability center of that distribution, and because models are trained on overlapping internet-scale corpora with similar tuning objectives, their distributions peak in the same place. The right scaffolding can still push them off-default: persona profiles plus retrieved memory measurably improve how well a model tracks a specific character (see: Can LLMs predict character choices from narrative context?), which suggests the default is a gravitational pull, not a hard wall.
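The idea of a weighted distribution over characters can be made concrete with a toy sketch. Everything in it is an assumption for illustration: the persona names, the weights for `MODEL_A` and `MODEL_B`, and the `reweight` helper (a crude stand-in for scaffolding such as persona profiles) are hypothetical and are not taken from any of the cited notes or from a real model.

```python
import random
from collections import Counter

# Hypothetical persona weights: illustrative numbers only, not measurements.
MODEL_A = {"warm_helper": 0.70, "dry_analyst": 0.15,
           "playful_trickster": 0.10, "stern_critic": 0.05}
MODEL_B = {"warm_helper": 0.65, "dry_analyst": 0.20,
           "playful_trickster": 0.05, "stern_critic": 0.10}


def sample_persona(weights, rng):
    """Draw one persona for a single generation, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def default_persona(weights):
    """The 'narrative default' in this toy model is the mode of the distribution."""
    return max(weights, key=weights.get)


def reweight(weights, persona, boost):
    """Toy stand-in for scaffolding: scale one persona's weight, then renormalize."""
    scaled = {name: w * (boost if name == persona else 1.0)
              for name, w in weights.items()}
    total = sum(scaled.values())
    return {name: w / total for name, w in scaled.items()}


rng = random.Random(0)  # fixed seed so repeated runs give the same draws

# Regenerating many times yields different personas, clustered around the default.
draws = Counter(sample_persona(MODEL_A, rng) for _ in range(1000))

# A strong enough boost moves the mode: the default is a pull, not a wall.
steered = reweight(MODEL_A, "stern_critic", boost=40.0)

print(default_persona(MODEL_A), default_persona(MODEL_B))
print(draws.most_common())
print(default_persona(steered))
```

In this sketch the two models have different weights yet share the same mode, which is the sense in which differently built distributions can "peak in the same place". The minority personas never disappear; they are just sampled less often.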
Two adjacent findings explain why the pull is so hard to escape. First, parametric knowledge from training tends to override information in the prompt: when a prior association is strong, textual instructions alone can't dislodge it, and only intervening in the model's internal representations works (see: Why do language models ignore information in their context?). Your clever prompt loses to the model's trained habit. Second, models often *look* like they're reasoning while really just defaulting: most models (twelve of fourteen in the cited evaluation) do worse when you remove constraints, revealing that they were leaning on a conservative fallback rather than genuine evaluation (see: Are models actually reasoning about constraints or just defaulting conservatively?). The same instinct that makes a model default to the 'safe' answer makes it default to the safe narrative voice.
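The constraint-removal diagnostic described above can be sketched with a toy evaluation. This is a minimal illustration under stated assumptions, not the cited benchmark: the `Item` type, the rule that the harder option is correct only when a constraint is present, and both policies are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    constrained: bool  # whether the prompt contains the constraint

    @property
    def correct(self) -> str:
        # Toy rule (assumption): the harder option is right only when constrained.
        return "hard" if self.constrained else "easy"


def always_hard(item: Item) -> str:
    """Conservative fallback: ignores the prompt and always picks the harder option."""
    return "hard"


def evaluates_constraint(item: Item) -> str:
    """Actually checks whether the constraint is present before answering."""
    return "hard" if item.constrained else "easy"


def accuracy(policy, items) -> float:
    """Fraction of items on which the policy's answer matches the correct one."""
    return sum(policy(item) == item.correct for item in items) / len(items)


with_constraints = [Item(constrained=True) for _ in range(10)]
without_constraints = [Item(constrained=False) for _ in range(10)]

for policy in (always_hard, evaluates_constraint):
    print(policy.__name__,
          accuracy(policy, with_constraints),
          accuracy(policy, without_constraints))
```

On the constrained set the two policies are indistinguishable, which is why a defaulting model can look like it is reasoning. Only the unconstrained set separates them, so a score that drops when constraints are removed points to a fallback rather than evaluation.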
The quietly interesting part: convergence doesn't mean models are identical underneath. Across strategic games, different models show genuinely distinct reasoning styles: one minimaxes, another reasons from trust, another anticipates beliefs (see: Do large language models use one reasoning style or many?). So the sameness lives mostly at the surface layer that alignment shapes most heavily (tone, persona, narrative posture), while deeper behavioral fingerprints stay individual. The narrative default is the part of a model the training process most aggressively homogenizes, which is exactly why it's the part where everyone ends up sounding alike.
Sources (8 notes)
Research shows most open models fail to adopt prompted personalities, stubbornly retaining their trained ENFJ-like defaults. Only a few flexible models succeed. Combining role and personality conditioning improves results but doesn't fully overcome resistance.
System prompts and RLHF training lock models into one communicative identity across all interactions, preventing the contextual register-switching and value trade-offs that characterize human pragmatics. Users cannot reshape model behavior through dialogue negotiation.
Research shows LLMs don't commit to a single character but instead maintain a probability distribution over many consistent simulacra. Each response samples from this distribution, explaining why regenerations can yield different personalities while remaining consistent with prior context.
Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, indicating that the model has made no fixed commitment.
The LIFECHOICE benchmark (1,462 decisions across 388 novels) shows LLMs predict character choices better when given expert-written persona profiles paired with retrieved memories relevant to the character's psychology. This persona-based approach outperforms automated summarization by 5%.
Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.
Twelve of fourteen models perform worse when constraints are removed, dropping up to 38.5 percentage points. Models appear to reason correctly by defaulting to harder options, not by actually evaluating constraints.
Analysis of 22 LLMs across behavioral game theory reveals three dominant profiles: GPT-o1 uses minimax reasoning, DeepSeek-R1 uses trust-based reasoning, and GPT-o3-mini uses belief-anticipation. Performance correlates with game structure, not raw reasoning depth.