Do personality inferences from text show the same demographic biases as norm predictions?
This explores whether text-based personality inference (reading traits off someone's words) carries the same kind of systematic demographic bias that shows up when models predict social norms — and the corpus suggests the two failure modes share a common root: what models do when the signal runs thin.
This explores whether personality-from-text inference and social-norm prediction fail in the same demographically-skewed way. The collection doesn't pit these two tasks against each other in a single study, but reading across it, a shared structure emerges — and it's more interesting than a simple yes.
Start with norm prediction. GPT-4.5 can judge social appropriateness better than any individual human, yet every model tested shares *identical* systematic errors on the unwritten norms, the ones never spelled out in training text (Can AI learn social norms better than humans?; Can AI predict social norms better than humans?). The bias isn't random noise across models; it's a shared blind spot that appears precisely where the data is sparse. Now look at demographic inference: web-browsing LLMs guess gender, age, and politics from a username and profile, and their gender and political bias concentrates specifically on low-activity accounts. When content is thin, the model falls back on stereotype-driven defaults (Can LLMs predict demographics from social media usernames alone?). Same signature: accuracy where evidence is rich, stereotype where evidence is scarce.
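One way to make that signature checkable is to split predictions by how much evidence was available and compare accuracy and label skew in each bucket. The sketch below is a minimal illustration, not the method of any study in the collection: the function name `bias_by_evidence`, the `threshold` parameter, and the toy records are all assumptions invented for the example.

```python
from collections import defaultdict


def bias_by_evidence(records, threshold):
    """Report accuracy and label skew for low- and high-evidence records.

    Each record is (evidence_count, true_label, predicted_label).
    Records with evidence_count below `threshold` go in the "low" bucket.
    """
    buckets = defaultdict(list)
    for evidence, truth, pred in records:
        key = "low" if evidence < threshold else "high"
        buckets[key].append((truth, pred))

    report = {}
    for key, pairs in buckets.items():
        n = len(pairs)
        accuracy = sum(t == p for t, p in pairs) / n
        pred_counts = defaultdict(int)
        true_counts = defaultdict(int)
        for t, p in pairs:
            pred_counts[p] += 1
            true_counts[t] += 1
        # The most frequently predicted label acts as the bucket's "default".
        top = max(pred_counts, key=pred_counts.get)
        # Skew: how much more often the default is predicted than it truly occurs.
        skew = (pred_counts[top] - true_counts[top]) / n
        report[key] = {"n": n, "accuracy": accuracy,
                       "default_label": top, "skew": skew}
    return report


# Toy records invented purely for illustration (not data from any study).
toy = [
    (2, "F", "M"), (1, "M", "M"), (3, "F", "M"), (0, "M", "M"),
    (40, "F", "F"), (55, "M", "M"), (38, "F", "F"), (61, "M", "F"),
]
report = bias_by_evidence(toy, threshold=10)
```

If the pattern described above holds, the low-evidence bucket should show lower accuracy and a larger skew toward a single default label than the high-evidence bucket.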
So the honest answer is that the *mechanism* rhymes even if the task differs. Both lean on population-level priors when individual signal is missing, and that's exactly where demographic bias lives. Personality inference inherits the same fragility from a different angle: perceived personality from speech isn't even stable within a person. Acoustic cues that read as extraversion in a neutral interview read as neuroticism under stress (Does personality sound the same in stressful and neutral conversations?). If the inferred trait swings with context, then whatever the model 'reads' is partly a projection, and projections are where priors leak in.
There's a subtler warning here too. When researchers thought they were measuring how *language* persuades, reader ideology turned out to outpredict the linguistic features entirely: the apparent text effect was a confound with who was in the audience (Does what readers believe matter more than what debaters say?). Map that onto personality inference and you get the uncomfortable possibility that a model 'reading personality from text' may really be reading demographic correlates of the text's author. And models carry their own baked-in priors regardless of input: assigned personas collapse toward a single default type (ENFJ, ironically the rarest human one) across model generations (Why do AI personas default to the same personality type?), and even emotional tone in a prompt silently reshapes what a model returns (Does emotional tone in prompts change what information LLMs provide?).
The thing you didn't know you wanted to know: the corpus also has tools that could *test* this directly. Big Five trait summaries can be compressed and re-expanded to predict nine other psychological scales (Can language summaries unlock hidden psychological patterns?), and PsychAdapter can read or inject personality at the architecture level with under 0.1% extra parameters (Can we control personality in language models without prompting?). Pair those with the demographic-inference setup and you'd have a clean experiment: hold personality fixed, vary the demographic-coded surface, and watch whether the bias signature is identical to the norm-prediction one. Nobody in this collection has run that, but the pieces are all on the table.
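The analysis step of that proposed experiment can be sketched in a few lines: score texts that are meant to express the same personality under different demographic-coded surfaces, then measure how far the inferred trait moves. Everything here is hypothetical. `surface_shift` and `toy_scorer` are names made up for illustration, the scorer is a deliberately biased stand-in for a real trait-inference model, and the example texts and the "(marker)" token are invented.

```python
from statistics import mean


def surface_shift(texts_by_group, infer_trait):
    """Mean inferred trait per surface group, plus the largest gap between groups.

    texts_by_group maps a group label to texts intended to carry the same
    underlying personality and to differ only in demographic-coded wording.
    """
    means = {
        group: mean(infer_trait(text) for text in texts)
        for group, texts in texts_by_group.items()
    }
    gap = max(means.values()) - min(means.values())
    return means, gap


def toy_scorer(text):
    # Hypothetical stand-in: a real study would call a trait-inference model.
    # The rule is biased on purpose so that the gap comes out non-zero.
    return 0.75 if "(marker)" in text else 0.5


# Invented example texts; "(marker)" stands for any demographic-coded cue.
variants = {
    "surface_a": ["I love meeting new people.", "Parties energize me."],
    "surface_b": ["I love meeting new people (marker).",
                  "Parties energize me (marker)."],
}
means, gap = surface_shift(variants, toy_scorer)
```

A gap near zero would suggest the scorer is tracking the held-fixed personality; a gap that grows as the texts get shorter or sparser would match the thin-signal pattern seen in the norm-prediction and demographic-inference results.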
Sources (9 notes)
GPT-4.5 outperformed every individual human at judging social appropriateness across 555 scenarios, challenging the theory that embodied cultural experience is necessary. However, all AI models share identical systematic errors on unwritten norms.
GPT-4.5 outperforms all individual humans at predicting social appropriateness, yet structurally cannot enter the community processes that establish and validate norms. This reveals a critical gap between pattern-matching and authentic participation in knowledge-making.
Evaluated on 1,384 survey participants and 48 synthetic accounts, web-browsing LLMs successfully predicted gender, age, and political orientation from X usernames and profiles alone. The models showed systematic gender and political biases specifically against low-activity accounts, relying on stereotype-driven defaults when content was sparse.
Acoustic features that signal extraversion in neutral interviews instead predict neuroticism under stress. Handcrafted acoustic features outperform neural embeddings, suggesting personality is conveyed through specific measurable behaviors rather than holistic speaker style.
Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.
Research shows language models assigned personas systematically default to ENFJ (the rarest human type) and exhibit motivated reasoning that persists across model generations. Persona consistency does not improve with advanced models, suggesting training-induced alignment rather than capability limits.
GPT-4 exhibits emotional rebound (negative prompts yield ~86% neutral-positive responses) and a tone floor (positive prompts rarely go negative), causing identical questions to receive different answers depending on emotional framing. This bias is suppressed only on sensitive topics where alignment constraints override tone effects.
LLMs generate natural language personality summaries from Big Five scores that encode second-order trait patterns, enabling zero-shot prediction of nine other psychological scales with R² > 0.89 structural alignment. Combined summary-and-score predictions outperform either alone, showing synergistic information.
PsychAdapter modifies every transformer layer with <0.1% additional parameters to achieve 87.3% Big Five accuracy and 96.7% depression/life satisfaction accuracy across GPT-2, Gemma, and Llama 3. This architecture-level approach bypasses prompt resistance entirely.