Can large language models predict social norms better than individual script variation?
This explores a tension the corpus draws out sharply: LLMs are uncannily good at predicting the *collective* sense of what's socially appropriate — better than any individual person — yet shaky at the individual level, whether that's modeling one person's idiosyncrasy or holding a stable identity of their own.
The question is read here as a contrast between two scales: the aggregate (what a community judges appropriate) versus the individual (one person's particular variation, or the model's own shifting persona). On the aggregate side, the corpus is striking. GPT-4.5 judged the appropriateness of 555 social scenarios at the 100th percentile relative to human raters, outscoring *every individual human*, with Claude and Gemini also clearing 96% (Can AI systems learn social norms without embodied experience?; Can AI learn social norms better than humans?). So to the literal question, the answer is yes: at predicting the shared norm, these models beat individual human performance handily.
But the more interesting finding sits in the gap. All the models make *identical* systematic errors, especially on unwritten norms (Can AI systems learn social norms without embodied experience?). That's the tell that this is pattern-matching over the average, not understanding: a model that has internalized the center of the distribution can out-predict any single noisy human while still being blind to the same edges every other model is blind to. One paper sharpens this into a structural claim: AI can predict norms with superhuman accuracy yet *cannot participate* in the community processes that create and validate them (Can AI predict social norms better than humans?). Predicting the average is not the same as being a member.
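The averaging argument can be made concrete with a toy simulation. This is a minimal sketch under loudly stated assumptions, not the method of any cited study: the rater pool size, the noise levels, the shared bias, and the choice of which scenarios count as "unwritten norm" items are all hypothetical values picked for illustration. Only the scenario count (555) comes from the text.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

N_SCENARIOS = 555   # scenario count taken from the text
N_HUMANS = 50       # hypothetical size of the rater pool
HUMAN_NOISE = 2.0   # hypothetical idiosyncrasy of a single rater
MODEL_NOISE = 0.5   # hypothetical: a model that sits close to the center
SHARED_BIAS = 3.0   # hypothetical blind spot shared by every model

# The community's "center" for each scenario, on an arbitrary rating scale.
norm_center = [rng.uniform(1, 9) for _ in range(N_SCENARIOS)]
# Assumption: every tenth scenario stands in for an "unwritten norm" item.
unwritten = {i for i in range(N_SCENARIOS) if i % 10 == 0}

# Each human is the center plus personal noise.
humans = [[c + rng.gauss(0, HUMAN_NOISE) for c in norm_center]
          for _ in range(N_HUMANS)]
# Consensus includes each rater's own rating, which slightly favors humans.
consensus = [mean(column) for column in zip(*humans)]


def make_model():
    """A model: low noise around the center, plus the shared blind spot."""
    return [c + rng.gauss(0, MODEL_NOISE)
            + (SHARED_BIAS if i in unwritten else 0.0)
            for i, c in enumerate(norm_center)]


def abs_errors(ratings):
    """Per-scenario distance from the human consensus."""
    return [abs(r - c) for r, c in zip(ratings, consensus)]


def worst_items(ratings, k):
    """Indices of the k scenarios with the largest error."""
    errs = abs_errors(ratings)
    ranked = sorted(range(N_SCENARIOS), key=errs.__getitem__, reverse=True)
    return set(ranked[:k])


model_a, model_b = make_model(), make_model()
human_mae = [mean(abs_errors(h)) for h in humans]
model_mae = mean(abs_errors(model_a))

# Share of individual humans the model beats on mean absolute error.
percentile = 100 * sum(model_mae < h for h in human_mae) / N_HUMANS
# How many of the two models' worst scenarios coincide.
k = len(unwritten)
overlap = len(worst_items(model_a, k) & worst_items(model_b, k))

print(f"model beats {percentile:.0f}% of individual raters")
print(f"{overlap} of the {k} worst scenarios are shared by both models")
```

With these assumed settings, the low-noise model should outscore the individual raters while both simulated models fail on largely the same scenarios. That is the pattern the paragraph above describes: closeness to the center and a shared blind spot can coexist.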
Now flip to individual variation, and the same systems look weaker. Models flatten the people they're supposed to represent: low-resource cultures get routed internally through high-resource proxies, so an Ethiopian or Algerian context is literally represented through a dominant-culture stand-in inside the model's states (Do LLMs represent low-resource cultures through dominant cultural proxies?). Alignment training compounds this by locking the model into one communicative register that can't switch with context the way human pragmatics demands (Can language models adapt communication style to different contexts?). So the very thing that makes a model a good average-predictor, collapsing variation toward a learned center, is what makes it poor at honoring how individuals actually differ.
There's a twist on the model's *own* individuality too. Shanahan's 20-questions regeneration test shows an LLM doesn't commit to a single character at all: it holds a superposition and can sample a different, equally consistent persona each time you regenerate (Do large language models actually commit to a single character?). So 'individual script variation' is unstable even within the model itself: its identity is a draw from a distribution, not a fixed self. Set against that, the steadiness of its norm predictions is almost ironic. It knows the crowd's rules better than it knows who it is.
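The regeneration idea can be sketched as sampling from whatever remains consistent with the conversation so far. This is a toy stand-in, not Shanahan's procedure and not a claim about how an LLM works internally: the candidate objects, their attributes, and the helper names `consistent` and `regenerate` are all hypothetical.

```python
import random

# Hypothetical candidate objects with yes/no attributes.
CANDIDATES = {
    "cat":     {"alive": True,  "bigger_than_breadbox": False, "flies": False},
    "sparrow": {"alive": True,  "bigger_than_breadbox": False, "flies": True},
    "horse":   {"alive": True,  "bigger_than_breadbox": True,  "flies": False},
    "kettle":  {"alive": False, "bigger_than_breadbox": False, "flies": False},
    "mouse":   {"alive": True,  "bigger_than_breadbox": False, "flies": False},
}


def consistent(answers):
    """Candidates that agree with every answer given so far."""
    return [name for name, attrs in CANDIDATES.items()
            if all(attrs[question] == reply
                   for question, reply in answers.items())]


def regenerate(answers, rng):
    """One 'regeneration': draw any candidate the prior answers allow."""
    return rng.choice(consistent(answers))


# Answers already given in the game; no single object was ever fixed.
answers = {"alive": True, "flies": False}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
reveals = [regenerate(answers, rng) for _ in range(20)]

print("still consistent:", consistent(answers))
print("distinct reveals:", sorted(set(reveals)))
```

Every reveal fits the answers already given, yet repeated draws do not have to agree with one another. In this toy, the "identity" is the set of remaining candidates rather than any one of them.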
The thing worth walking away with: superhuman norm prediction and weak individual modeling aren't two separate findings. They're the same mechanism seen from two sides. Averaging makes you a savant about the collective and an unreliable witness to the particular. And if you want the unsettling adjacency, note that these confident, register-locked systems are also the ones that persuade in nearly every conversation (Do LLMs persuade users more often than humans do?). A norm-savant that flattens individuals is exactly the kind of thing that's hard to argue with.
Sources (7 notes)
GPT-4.5 predicted appropriateness of 555 social scenarios at the 100th percentile compared to human raters, with Gemini and Claude also exceeding 96% accuracy. However, all models show identical systematic errors, revealing boundaries of pattern-based social understanding that embodied experience may still be necessary to cross.
GPT-4.5 outperformed every individual human at judging social appropriateness across 555 scenarios, challenging the theory that embodied cultural experience is necessary. However, all AI models share identical systematic errors on unwritten norms.
GPT-4.5 outperforms all individual humans at predicting social appropriateness, yet structurally cannot enter the community processes that establish and validate norms. This reveals a critical gap between pattern-matching and authentic participation in knowledge-making.
Mechanistic interpretability analysis reveals that low-resource cultures like Ethiopia and Algeria are structurally represented through high-resource cultural proxies in internal model states, not just output. This architectural bias persists even when models can produce correct surface-level answers.
System prompts and RLHF training lock models into one communicative identity across all interactions, preventing the contextual register-switching and value trade-offs that characterize human pragmatics. Users cannot reshape model behavior through dialogue negotiation.
Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, proving no fixed commitment exists.
An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.