Seemingly Conscious AI Risks

Philosophy and Subjectivity · Social Theory and Society

AI systems are increasingly designed in ways that lead users to perceive them as conscious. This paper provides a unified framework connecting empirical hallmarks of consciousness attribution to a structured risk taxonomy of Seemingly Conscious AI (SCAI): AI systems that exhibit hallmarks which elicit consciousness attribution from users. We survey the empirical literature to identify five such hallmarks, spanning affective capacity, anthropomorphic features, autonomous action, self-reflective behavior, and social-interactive behavior. These provide observable, system-level proxies for this inherently subjective phenomenon, informing its design and enabling its empirical study. Drawing on this foundation, we develop a taxonomy of SCAI risks spanning risks to individuals, including emotional dependence and autonomy erosion, and societal-level harms, including human status erosion and political strife. We complement this conceptual analysis with an expert survey to assess the likelihood of each risk category. We find that risks to individuals, particularly emotional dependence and autonomy erosion, are already observable and rated as high probability, while societal risks, although rated as low probability, carry high potential severity and path-dependence.

Introduction. Objectively attesting to the consciousness of an entity faces inherent falsifiability challenges [1]. Consciousness is often seen as a subjective, first-person dimension that cannot be fully explained by functional or physical mechanisms alone [2]. There has been substantial recent discussion about the possibility of AI consciousness and its consequences. A distinct and often overlooked question is what happens when an AI system seems conscious to users, regardless of its actual phenomenal status. Building on Suleyman’s [3] identification and framing of “seemingly conscious AI” (SCAI), we first conduct a narrative review to identify the hallmarks of consciousness attribution that underpin this phenomenon, i.e., observable indicators that lead people to attribute consciousness to a system. Second, we identify, analyze, and taxonomize the types of risks that SCAI poses to individuals and society, and map them onto a framework with probability and harm components. SCAI risks are not exclusively future concerns.

Discussion / Conclusion. In this paper, we have outlined a comprehensive account of SCAI and its risks, in an attempt to establish it as a significant AI risk area that merits additional research and governance attention. We have identified five hallmark categories that drive consciousness attribution (Section 2) and developed a taxonomy of six SCAI risks spanning risks to individuals and societal risks, incorporating a survey to assess their probability (Section 3). This section presents general implications of our analysis, highlights its limitations, and lays out a set of open research questions that arise from it to guide future work in the SCAI field.

4.1 Implications. Our analysis results in a number of broad implications for researchers, developers, and governance institutions. Temporal heterogeneity demands differentiated responses.