Large Language Models Can Infer Psychological Dispositions of Social Media Users

Paper · arXiv 2309.08631 · Published September 13, 2023
Personas and Personality · Social Media and AI · Emotions and AI · Sentiment, Semantics, and Toxicity Detection · Natural Language Inference · User Psychology

Large Language Models (LLMs) demonstrate increasingly human-like abilities across a wide variety of tasks. In this paper, we investigate whether LLMs like ChatGPT can accurately infer the psychological dispositions of social media users and whether their ability to do so varies across socio-demographic groups. Specifically, we test whether GPT-3.5 and GPT-4 can derive the Big Five personality traits from users’ Facebook status updates in a zero-shot learning scenario. Our results show an average correlation of r = .29 (range = [.22, .33]) between LLM-inferred and self-reported trait scores, a level of accuracy that is similar to that of supervised machine learning models specifically trained to infer personality. Our findings also highlight heterogeneity in the accuracy of personality inferences across different age groups and gender categories: predictions were found to be more accurate for women and younger individuals on several traits, suggesting a potential bias stemming from the underlying training data or differences in online self-expression. The ability of LLMs to infer psychological dispositions from user-generated text has the potential to democratize access to cheap and scalable psychometric assessments for both researchers and practitioners.

Introduction. Large language models (LLMs) and other transformer-based neural networks have revolutionized text analysis in research and practice. Models such as OpenAI’s GPT-4 [1] or Anthropic’s Claude [2], for example, have shown a remarkable ability to represent, comprehend, and generate human-like text. Compared to prior NLP approaches, one of the most striking advances of LLMs is their ability to generalize their “knowledge” to novel scenarios, contexts, and tasks [3, 4]. While LLMs were not explicitly designed to capture or mimic elements of human cognition and psychology, recent research suggests that – given their training on extensive corpora of human-generated language – they might have spontaneously developed the capacity to do so. For example, LLMs display properties that are similar to the cognitive abilities and processes observed in humans, including theory of mind (i.e., the ability to understand the mental states of other agents [5]), cognitive biases in decision-making [6] and semantic priming [7].

Discussion / Conclusion. Our findings suggest that LLMs, such as ChatGPT, can infer psychological dispositions from people’s social media posts without having been explicitly trained to do so. They also offer preliminary evidence that LLMs might generate more accurate inferences for women and younger individuals (compared to men and older adults). Notably, the overall accuracy of the observed inferences (Pearson correlations between self-reported and inferred personality traits ranging between r = .22 and r = .33, average r = .29) is slightly lower than that achieved by supervised models that have been trained or fine-tuned specifically for this purpose on the same textual data source as used in testing (e.g., Park et al. [17], who reported correlations between r = .26 and r = .41, average r = .37). Yet, the ability of LLMs to produce inferences of reasonably high accuracy in zero-shot learning scenarios has important theoretical and practical implications.
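The accuracy metric discussed above, a Pearson correlation between self-reported and inferred scores for each trait, can be illustrated with a minimal Python sketch. The function names (`pearson_r`, `trait_accuracy`), the data layout, and the toy scores are assumptions made for illustration and are not taken from the paper. Averaging the per-trait correlations with a plain arithmetic mean is also an assumption; the paper may aggregate differently.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def trait_accuracy(self_reported, inferred):
    """Return ({trait: r}, mean r) for hypothetical per-trait score lists.

    Both arguments map a trait name to a list of scores, with position i
    in every list referring to the same user (assumed layout).
    """
    per_trait = {
        trait: pearson_r(self_reported[trait], inferred[trait])
        for trait in self_reported
    }
    # Assumption: a simple arithmetic mean across traits.
    return per_trait, mean(list(per_trait.values()))


if __name__ == "__main__":
    # Toy scores for four made-up users; not data from the study.
    self_reported = {
        "openness": [3.0, 4.5, 2.0, 5.0],
        "extraversion": [2.5, 3.0, 4.0, 1.5],
    }
    inferred = {
        "openness": [3.5, 4.0, 3.0, 4.5],
        "extraversion": [3.0, 3.5, 3.0, 2.5],
    }
    per_trait, average = trait_accuracy(self_reported, inferred)
    for trait, r in per_trait.items():
        print(f"{trait}: r = {r:.2f}")
    print(f"average r = {average:.2f}")
```

Running the same function separately on subsets of users (for example, split by gender or age group) would give the kind of per-group accuracy comparison described in the findings, again only as a sketch of the general approach.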