Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach
Abstract. Human-centered explainable AI (HCXAI) advocates for the integration of social aspects into AI explanations. Central to the HCXAI discourse is the Social Transparency (ST) framework, which aims to make the socio-organizational context of AI systems accessible to their users. In this work, we suggest extending the ST framework to address the risks of social misattributions in Large Language Models (LLMs), particularly in sensitive areas like mental health. Because LLMs are remarkably capable of simulating roles and personas, mismatches can arise between designers’ intentions and users’ perceptions of social attributes, which risks promoting emotional manipulation and dangerous behaviors, cases of epistemic injustice, and unwarranted trust. To address these issues, we propose enhancing the ST framework with a fifth ‘W-question’ to clarify the specific social attributions assigned to LLMs by their designers and users. This addition aims to bridge the gap between LLM capabilities and user perceptions, promoting the ethically responsible development and use of LLM-based technology.
Introduction. Research has recently started investigating artificial intelligence (AI) through a socio-technical lens, attempting to contextualize this technology within its broader social and organizational environment. From the ‘fruitful collaboration’ between sociology and computer science [19], the perspective that AI systems are artefacts embedded in a network of norms that shape their design and influence trust in them has made its way into the scientific discourse [15, 6, 3]. In particular, human-centered explainable artificial intelligence (HCXAI), which is promoted by initiatives such as the ACM CHI Workshop on Human-Centered Explainable AI, focuses on the need to consider the social component of explaining how AI works. Prominently, Ehsan et al.’s Social Transparency framework integrates socio-organizational contexts into AI-mediated decision-making, aiming to make the technological, decision-making, and organizational contexts visible and understandable [6].
Discussion / Conclusion. The risks posed by social misattributions of LLMs can be significant. For instance, research has shown that ChatGPT-3.5 prescribed medications to individuals affected by anxiety or depression, even though such systems are not permitted to do so [7]. In addition, believing that LLMs are empathetic and caring professionals exposes vulnerable individuals to being nudged and emotionally manipulated. As a result, the provision of inappropriate responses, incorrect information, or dangerous recommendations by these systems can lead to substantive harm. In general, social misattributions of LLMs lead to unwarranted trust in these systems [15, 10]. Here, a trusting relation between a user of an LLM-based application and the system is unwarranted if it is not grounded in objective capabilities that the system is supposed to maintain during the interactions, e.g., being reliable, being accurate, or providing information that supports transparency [15, 10].