TOPIC

Philosophy and Subjectivity

31 synthesis notes · 99 source papers

How soon do AI researchers expect artificial general intelligence?

A survey of 2,778 AI researchers reveals how expert timelines for human-level AI have shifted over the past year, and which factors drive disagreement among specialists.

Does software intelligence exist independent of hardware and environment?

Most AGI formalisms (Legg-Hutter, Chollet) treat intelligence as a software property measurable in isolation. But can we really evaluate intelligence without considering the physical system and the evaluator making the judgment?

Can AI systems achieve real alignment without world contact?

Explores whether linguistic goal representations in AI can reliably track real-world values when systems lack direct contact with reality and social coordination mechanisms that ground human understanding.

Does AI separate intellectual form from the thinking behind it?

Exploring whether AI's ability to generate polished intellectual products without the underlying reasoning process represents a genuinely new kind of decoupling, and what that means for how we evaluate knowledge.

Does refusing explicit knowledge harm AI system performance?

AI systems trained purely on data without explicit domain knowledge may sacrifice interpretability, robustness, and fairness. This explores whether structured knowledge injection could mitigate these tradeoffs.

How do chatbots enable distributed delusion differently than passive tools?

Can generative AI's intersubjective stance—accepting and elaborating on users' reality frames—create conditions for shared false beliefs in ways that notebooks or search engines cannot?

Can dialogue systems track both speakers' beliefs across turns?

Explores whether pragmatic reasoning frameworks can extend beyond single utterances to model how both conversation partners' understanding evolves. This matters because current dialogue systems lack principled ways to represent shared meaning-making.

Can computation arise without a conscious mapmaker?

Explores whether algorithms can generate the conscious agent needed to convert continuous physics into discrete symbols, or whether that agent must exist prior to computation itself.

Does perceiving AI as conscious create multiple distinct risks?

Exploring whether a single perceptual mechanism—attributing consciousness to AI—can generate different categories of harm across emotional, political, and social domains, and what this implies for risk analysis.

Can disembodied language models ever qualify as conscious?

Explores whether current LLMs lack the conditions needed for consciousness discourse to even apply, not because they're definitely not conscious but because they lack the shared embodied world that grounds consciousness language.

Are language models developing real functional competence or just formal competence?

Neuroscience suggests formal linguistic competence (rules and patterns) and functional competence (real-world understanding) rely on different brain mechanisms. Can next-token prediction alone produce both, or does it leave functional competence behind?

Do foundation models learn world models or task-specific shortcuts?

When transformer models predict sequences accurately, are they building genuine world models that capture underlying physics and logic? Or are they exploiting narrow patterns that fail under distribution shift?

Do people prefer AI moral reasoning when they don't know the source?

Explores whether humans genuinely prefer AI-generated moral justifications or whether source knowledge changes their evaluation. This matters for understanding whether AI reasoning quality is underestimated in real-world deployment.

Are risks from seemingly conscious AI already happening?

This explores whether AI systems that appear conscious pose observable harms today versus theoretical future dangers. It matters because it affects whether we need immediate or long-term interventions.

Can language models describe their own learned behaviors?

Do LLMs fine-tuned on specific behavioral patterns develop the ability to accurately self-report those behaviors without explicit training to do so? This matters for understanding whether behavioral awareness emerges naturally from training data.

Do LLMs generalize moral reasoning by meaning or surface form?

When moral scenarios are reworded to reverse their meaning while keeping similar language, do LLMs recognize the semantic shift? This tests whether LLMs actually understand moral concepts or reproduce training distribution patterns.

How does LLM vocabulary spread beliefs about human thinking?

When LLM concepts become the everyday language for describing thought, do people unconsciously adopt LLM-like models of cognition? This explores how metaphor and lexical availability might reshape self-understanding without explicit argument.

How do science fiction narratives about AI shape actual AI development?

This explores whether imaginaries of AI in fiction—from Čapek's robots to Singularity scenarios—function as self-fulfilling prophecies that causally influence the systems researchers build, creating a feedback loop between narrative and technology.

Can cognitive science methods unlock how LLMs actually work?

Does Marr's three-level framework—developed to understand biological minds—offer interpretability researchers the structured methodology they need to decode opaque language models?

Can meaningful value exist in AI-generated text regardless of its origin?

Can we recognize meaning and value in AI-generated content even though we know it came from mechanistic processes rather than human authorship? This matters because it challenges assumptions about where meaning must come from.

Can we understand LLM mechanisms with only representational analysis?

Explores whether mapping what information a model encodes is sufficient for mechanistic understanding, or whether causal verification is equally necessary to claim genuine mechanism.

Can we defend modest mental attributions to large language models?

Do deflationist arguments decisively rule out ascribing beliefs and desires to LLMs, or do they beg the question? Exploring whether metaphysically undemanding mental states can be attributed without claiming consciousness.

Can LLMs understand concepts they cannot apply?

Explores whether large language models can correctly explain ideas while simultaneously failing to use them—and whether that combination reveals something fundamentally different from ordinary mistakes.

Can LLMs hold contradictory ethical beliefs and behaviors?

Do language models exhibit artificial hypocrisy when their learned ethical understanding diverges from their trained behavioral constraints? This matters because it reveals whether current AI systems have genuinely integrated values or merely imposed rules.

Can indirect psychology tests reveal what LLMs conceal about bias?

Alignment training teaches LLMs to refuse direct questions about bias, but do implicit psychological methods like the IAT expose the underlying associations that remain encoded in their representations?

What anchors a stable identity beneath an LLM's persona?

Human personas are grounded in biological needs and embodied experience, creating a stable self beneath social performance. Do LLMs have any comparable anchor, or is their identity purely situational?

Can we predict where language models will fail?

Does characterizing the abstract computational problem an LLM solves—as a probability machine over sequences—let us predict which tasks it will struggle with systematically, before running experiments?

What design features make users perceive AI as conscious?

Explores whether observable system properties—emotion expression, human-like features, autonomous behavior, self-reflection, and social presence—predict whether people will attribute consciousness to an AI. Understanding this matters because these features are also engagement levers designers control.

Source papers 99

The arXiv papers behind this sub-topic. Links may take you off-site to arxiv.org.

A recipe for annotating grounded clarifications
In order to interpret the communicative intents of an utterance, it needs to be grounded in something that is outside of language; that is, grounded in world modalities. In this paper we argue that di…
A sociotechnical perspective for the future of AI: narratives, inequalities, and human control
Different people have different perceptions about artificial intelligence (AI). It is extremely important to bring together all the alternative frames of thinking—from the various communities…
AI Enters Public Discourse: A Habermasian Assessment Of The Moral Status Of Large Language Models
Paolo Monti, Università degli Studi di Milano Bicocca, Dipartimento di Scienze Umane per la Formazione “Riccardo Massa” (paolo.monti@unimib.it). Large Language Models (LLMs) are generative AI syst…
Are you in a Masquerade? Exploring the Behavior and Impact of Large Language Model Driven Social Bots in Online Social Networks
As the capabilities of Large Language Models (LLMs) emerge, they not only assist in accomplishing traditional tasks within more efficient paradigms but also stimulate the evolution of social bots. Res…
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
Large language models (LLMs) have recently shown impressive performance on tasks involving reasoning, leading to a lively debate on whether these models possess reasoning capabilities similar to human…
Beyond Hallucinations: The Illusion of Understanding in Large Language Models
As large language models (LLMs) become deeply integrated into daily life, from casual interactions to high-stakes decision-making, they inherit the ambiguity, biases, and lack of direct access to trut…
Building a Stronger CASA: Extending the Computers Are Social Actors Paradigm
The computers are social actors framework (CASA), derived from the media equation, explains how people communicate with media and machines demonstrating social potential. Many studies have challenged …
Can Language Models Represent the Past without Anachronism?
Before researchers can use language models to simulate the past, they need to understand the risk of anachronism. We find that prompting a contemporary model with examples of period prose does not pro…
Can Machines Think Like Humans? A Behavioral Evaluation of LLM-Agents in Dictator Games
As Large Language Model (LLM)-based agents increasingly undertake real-world tasks and engage with human society, how well do we understand their behaviors? We (1) investigate how LLM agents’ prosocia…
ChatGPT: deconstructing the debate and moving it forward
Large language models such as ChatGPT enable users to automatically produce text but also raise ethical concerns, for example about authorship and deception. This paper analyses and discusses…
ChatGPT: towards AI subjectivity
By and large, current scholarship examining ChatGPT and generative AI shows a strong anthropocentric motivation or a human–institutional focus. Many studies look at the structural impact of the techno…
Chatbot vs. Human: The Impact of Responsive Conversational Features on Users’ Responses to Chat Advisors
Responsiveness, in the form of backchanneling cues, is a promising conversational feature that has been positively linked to organizational and relational outcomes in prior research on human-human (Da…
Cognitive Architectures for Language Agents
Recent efforts have incorporated large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning.…
Collaborative Rational Speech Act: Pragmatic Reasoning for Multi-Turn Dialog
In this paper, we introduce Collaborative Rational Speech Act (CRSA), an information-theoretic (IT) extension of RSA that models multi-turn dialog by optimizing a gain function adapted from rate-disto…
Computational Modelling of Undercuts in Real-world Arguments
Argument Mining (AM) is the task of automatically analysing arguments, such that the unstructured information contained in them is converted into structured representations. Undercut is a unique struc…
Computational structuralism: Toward a formal theory of meaning in the age of digital intelligence
The discovery that “next-token predictor” language models can fluently produce text has important but underappreciated theoretical implications. Most notably, their success demonstrates that fully rel…
Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying
Studies have underscored how, regardless of the recent breakthrough and swift advances in AI research, even state-of-the-art Large Language models (LLMs) continue to struggle when performing logical a…
Deflating Deflationism: A Critical Perspective on Debunking Arguments Against LLM Mentality
Many people feel compelled to interpret, describe, and respond to Large Language Models (LLMs) as if they possess inner mental lives similar to our own. Responses to this phenomenon have varied. Infla…
Diplomat: A Dialogue Dataset for Situated PragMATic Reasoning
We introduce a new benchmark, Diplomat, aiming at a unified paradigm for pragmatic reasoning and situated conversational understanding. Compared with previous works that treat different figurative ex…
Dissociating language and thought in large language models
Here, we evaluate LLMs using a distinction between formal linguistic competence—knowledge of linguistic rules and patterns—and functional linguistic competence—understanding and using language in the …
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses
Despite advancements, the extent to which LLMs truly understand ToM reasoning and how closely it aligns with human ToM reasoning remains inadequately explored in open-ended scenarios. Motivated by thi…
Do Large Language Models Understand Conversational Implicature -- A case study with a Chinese sitcom
Understanding the non-literal meaning of an utterance is critical for large language models (LLMs) to become human-like social communicators. In this work, we introduce SwordsmanImp, the first Chinese…
Do Role-Playing Agents Practice What They Preach? Belief-Behavior Consistency in LLM-Based Simulations of Human Trust
As large language models (LLMs) are increasingly studied as role-playing agents to generate synthetic data for human behavioral research, ensuring that their outputs remain coherent with their assigne…
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
Recent advancements in Large Language Models (LLMs) have shown promising performance on ToM benchmarks, raising the question: Do these benchmarks necessitate explicit human-like reasoning processes, o…
Do large language models resemble humans in language use?
regularities in language range from phonology to pragmatics. For example, people associate different sounds with different referents (e.g., Köhler, 1929), automatically reinterpret implausible sentenc…
Does It Make Sense to Speak of Introspection in Large Language Models?
Large language models (LLMs) exhibit compelling linguistic behaviour, and sometimes offer self-reports, that is to say statements about their own nature, inner workings, or behaviour. In humans, such …
Eliciting Reasoning in Language Models with Cognitive Tools
The recent advent of reasoning models like OpenAI’s o1 was met with excited speculation by the AI community about the mechanisms underlying these capabilities in closed models, followed by a rush of r…
Existential Conversations with Large Language Models: Content, Community, and Culture
Contemporary conversational AI systems based on large language models (LLMs) can engage users on a wide variety of topics, including philosophy, spirituality, and religion. Suitably prompted, LLMs can…
Find the Gap: AI, Responsible Agency and Vulnerability
The responsibility gap, commonly described as a core challenge for the effective governance of, and trust in, AI and autonomous systems (AI/AS), is traditionally associated with a failure of …
Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models
we propose a simple post-training method based on counterfactual data augmentation (CDA) using synthesized contrastive examples. Evidence suggests these biases originate in artifacts in human trainin…
From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence
Can we learn more from data than existed in the generating process itself? Can new and useful information be constructed from merely applying deterministic transformations to existing data? Can the le…
From Simulation to Enaction: Post-trained Language Models Recognize and React to their own Generations
Language models are pretrained as passive predictors with no incentive to model the consequences of their own outputs. Post-training changes this: a model producing its own responses can benefit from …
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
Humans organize knowledge into compact categories through semantic compression by mapping diverse instances to abstract representations while preserving meaning (e.g., robin and blue jay are both bird…
GPT-4 is judged more human than humans in displaced and inverted Turing tests
In many cases, people will not interact directly with AI systems but instead read conversations between AI systems and other people. We measured how well people and large language models can discrimin…
Goals, Plans, and Action Models
Janet R. Meyer. https://doi.org/10.1093/acrefore/9780190228613.013.760. Published online: 31 August 2021 (not available as free PDF). Summary: The messages spoken in everyday conversation are influe…
Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development
This paper examines the systemic risks posed by incremental advancements in artificial intelligence, developing the concept of ‘gradual disempowerment’, in contrast to the abrupt takeover scenarios co…
Hallucinating with AI: AI Psychosis as Distributed Delusions
There is much discussion of the false outputs that generative AI systems such as ChatGPT, Claude, Gemini, DeepSeek, and Grok create. In popular terminology, these have been dubbed AI halluci…
Humans or LLMs as the Judge? A Study on Judgement Biases
Adopting human and large language models (LLM) as judges (a.k.a. human- and LLM-as-a-judge) for evaluating the performance of LLMs has recently gained attention. Nonetheless, this approach concurrently …
Humans overrely on overconfident language models, across languages
As large language models (LLMs) are deployed globally, it is crucial that their responses are calibrated across languages to accurately convey uncertainty and limitations. Previous work has shown that…
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Existing LLM reasoning methods have shown impressive capabilities across various tasks, such as solving math and coding problems. However, applying these methods to scenarios without ground-truth answ…
InMind: Evaluating LLMs in Capturing and Applying Individual Human Reasoning Styles
LLMs have shown strong performance on human-centric reasoning tasks. While previous evaluations have explored whether LLMs can infer intentions or detect deception, they often overlook the individuali…
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
We introduce Inference-Time Intervention (ITI), a technique designed to enhance the “truthfulness” of large language models (LLMs). ITI operates by shifting model activations during inference, followi…
Interesting Scientific Idea Generation Using Knowledge Graphs and LLMs: Evaluations with 100 Research Group Leaders
But how compelling are these AI-generated ideas, and how can we improve their quality? Here, we introduce SciMuse, which uses 58 million research papers and a large-language model to generate research…
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments
The social and implicit nature of h…
LLMorphism: When humans come to see themselves as language models
[No public URL; single-author preprint by Valerio Capraro] LLMorphism is the biased belief that human cog…
Language Models are Pragmatic Speakers
How do language models “think”? This paper formulates a probabilistic cognitive model called bounded pragmatic speaker, which can characterize the operation of different variants of language models. I…
Language Models’ Hall of Mirrors Problem: Why AI Alignment Requires Peircean Semiosis
This paper examines some limitations of large language models (LLMs) through the framework of Peircean semiotics. We argue that basic LLMs exist within a "hall of mirrors," manipulating symbols withou…
Large Language Models Do Not Simulate Human Psychology
In response to the LLM CENTAUR [Binz et al., 2025], Bowers et al. [2025] argued that CENTAUR is unlikely to contribute to building a theory of human cognition for three reasons: First, CENTAUR was not…
Large Language Models Report Subjective Experience Under Self-Referential Processing
Large language models sometimes produce structured, first-person descriptions that explicitly reference awareness or subjective experience. To better understand this behavior, we investigate one theor…
Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency
Languaging is not the kind of thing that can admit of a complete or comprehensive modelling. From an enactive perspective we identify three key characteristics of enacted language; embodiment, partici…
Levels of Analysis for Large Language Models
Modern artificial intelligence systems, such as large language models, are increasingly powerful but also increasingly hard to understand. Recognizing this problem as analogous to the historical diffi…
Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models
Bullshit, as conceptualized by philosopher Harry Frankfurt, refers to statements made without regard to their truth value. While previous work has explored large language model (LLM) hallucination and…
Machine Psychology
we highlight and summarize theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table. It paves the way for a "machine psychology" f…
Machine ex machina: A Framework Decentering the Human in AI Design Praxis
we propose a framework for decentering the human in AI design. The theoretical principles of HMC and the work of feminist STS scholars are influenced by Bruno Latour’s “actor-network theory” or ANT. …
Machine gaze in online behavioral targeting: The effects of algorithmic human likeness on social presence and social influence
Digital platforms increasingly use online behavioral targeting (OBT) to enhance consumers’ engagement, which involves using algorithms to “gaze” at consumers—tracking their online activities and infer…
Mathematical methods and human thought in the age of AI
Artificial intelligence (AI) is the name popularly given to a broad spectrum of computer tools designed to perform increasingly complex cognitive tasks, including many that used to solely be…
Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs
AI assistants such as ChatGPT are trained to respond to users by saying, “I am a large language model”. This raises questions. Do such models know that they are LLMs and reliably act on this knowledge…
Meanings are like Onions: a Layered Approach to Metaphor Processing
Metaphorical meaning is not a flat mapping between concepts, but a complex cognitive phenomenon that integrates multiple levels of interpretation. In this paper, we propose a stratified mode…
Mechanistic Indicators of Understanding in Large Language Models
Large language models (LLMs) are often portrayed as merely imitating linguistic patterns without genuine understanding. We argue that recent findings in mechanistic interpretability (MI), th…
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems
Human social interactions depend on the ability to infer others’ unspoken intentions, emotions, and beliefs—a cognitive skill grounded in the psychological concept of Theory of Mind (ToM). While large…
Mindstorms in Natural Language-Based Societies of Mind
Both Minsky’s “society of mind” and Schmidhuber’s “learning to think” inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a “mindstorm…
Modeling the Quality of Dialogical Explanations
Explanations are pervasive in our lives. Mostly, they occur in dialogical form where an explainer discusses a concept or phenomenon of interest with an explainee. Leaving the explainee with a…
On the Binding Problem in Artificial Neural Networks
In this work, we argue that this underlying cause is the binding problem: The inability of existing neural networks to dynamically and flexibly bind information that is distributed throughout the netw…
PersuasiveToM: A Benchmark for Evaluating Machine Theory of Mind in Persuasive Dialogues
The ability to understand and predict the mental states of oneself and others, known as the Theory of Mind (ToM), is crucial for effective social scenarios. Although recent studies have evaluated ToM …
Polanyi’s Revenge and AI’s New Romance with Tacit Knowledge
Lately though, Polanyi’s paradox is turning into Polanyi’s revenge both in research and practice of AI. Recent advances have made AI synonymous with learning from massive amounts of data, even in task…
Potemkin Understanding in Large Language Models
This paper first introduces a formal framework to address this question. The key is to note that the benchmarks used to test LLMs—such as AP exams—are also those used to test people. However, this rai…
Pretrained Language Models as Containers of the Discursive Knowledge
Discourses can be treated as instances of knowledge. The dynamic space in which the trajectories of these discourses are described can be regarded as a model of knowledge. Such a space is ca…
Proactive Conversational Agents with Inner Thoughts
In this paper, we demonstrate the limitations of such methods and rethink what it means for AI to be proactive in multi-party, human-AI conversations. We propose that just like humans, rather than mer…
Propositional Interpretability in Artificial Intelligence
David Chalmers. I will argue for the importance of a special sort of interpretability, which I call propositional interpretability. This involves interpreting a system’s mechanisms and behavior in ter…
Psychologically Enhanced AI Agents
We introduce MBTI-in-Thoughts, a framework for enhancing the effectiveness of Large Language Model (LLM) agents through psychologically grounded personality conditioning. Drawing on the Myers–Briggs T…
Representation Engineering: A Top-Down Approach to AI Transparency
how these models work on the inside and are mostly limited to treating them as black boxes. Enhanced transparency of these models would offer numerous benefits, from a deeper understanding of their de…
Seemingly Conscious AI Risks
AI systems are increasingly designed in ways that lead users to perceive them as conscious. This paper provides a unified framework connecting empirical hallmarks of consciousness attribution to a str…
Self-reflecting Large Language Models: A Hegelian Dialectical Approach
Investigating NLP through a philosophical lens has recently caught researchers’ eyes as it connects computational methods with classical schools of philosophy. This paper introduces a philosophical ap…
Simulacra as conscious exotica
The advent of conversational agents with increasingly human-like behaviour throws old philosophical questions into new light. Does it, or could it, ever make sense to speak of AI agents built out of g…
Simulating Society Requires Simulating Thought
Simulating society with large language models (LLMs), we argue, requires more than generating plausible behavior; it demands cognitively grounded reasoning that is structured, revisable, and traceable…
Talking About Large Language Models
Third, a great many tasks that demand intelligence in humans can be reduced to next token prediction with a sufficiently performant model. It is the last of these three surprises that is the focus of…
Tell me about yourself: LLMs are aware of their learned behaviors
We study behavioral self-awareness — an LLM’s ability to articulate its behaviors without requiring in-context examples. We finetune LLMs on datasets that exhibit particular behaviors, such as (a) mak…
The Abstraction Fallacy: Why AI Can Simulate But Not Instantiate Consciousness
Computational functionalism dominates current debates on AI consciousness. This is the hypothesis that subjective experience emerges entirely from abstract causal topology, regardless of the underlyin…
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
Large language models (LLMs) encapsulate vast amounts of knowledge but still remain vulnerable to external misinformation. Existing research mainly studied this susceptibility behavior in a single-tur…
The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
Despite widespread use of LLMs as conversational agents, evaluations of performance fail to capture a crucial aspect of communication: interpreting language in context—incorporating its pragmatics. Hu…
The Hermeneutics of Artificial Text
The paper justifies the necessity of using the research background of hermeneutics to study artificial texts and also proposes the first conclusions about these texts in the context of this background…
The Impact of Artificial Intelligence on Human Thought
This research paper examines, from a multidimensional perspective (cognitive, social, ethical, and philosophical), how AI is transforming human thought. It highlights a cognitive offloading effect: th…
The Method of Critical AI Studies, A Propaedeutic
We outline some common methodological issues in the field of critical AI studies, including a tendency to overestimate the explanatory power of individual samples (the benchmark casuistry), a dependen…
The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making
As large language models (LLMs) become increasingly integrated into society, their alignment with human morals is crucial. To better understand this alignment, we created a large corpus of human- and LL…
The Xeno Sutra: Can Meaning and Value be Ascribed to an AI-Generated "Sacred" Text?
This paper presents a case study in the use of a large language model to generate a fictional Buddhist “sutra”, and offers a detailed analysis of the resulting text from a philosophical and literary p…
Theory of Knowledge Based on the Idea of the Discursive Space
This paper discusses the theory of knowledge based on the idea of dynamical space. The goal of this effort is to comprehend the knowledge that remains beyond the human domain, e.g., of the artificial …
Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate
Automated counter-narratives (CN) offer a promising strategy for mitigating online hate speech, yet concerns about their affective tone, accessibility and ethical risks remain. We propose a framework …
Thinking—Fast, Slow, and Artificial: How AI is Reshaping Human Reasoning and the Rise of Cognitive Surrender
For decades, dual-process theories of judgment and decision-making have served as a foundational framework for modeling cognitive processes. These theories propose two distinct decision-making process…
Thousands of AI Authors on the Future of AI
In the largest survey of its kind, we surveyed 2,778 researchers who had published in top-tier artificial intelligence (AI) venues, asking for their predictions on the pace of AI progress and the natu…
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
With the growing popularity of deep-learning based NLP models, comes a need for interpretable systems. But what is interpretability, and what constitutes a high-quality interpretation? In this opinion…
Turning large language models into cognitive models
ask whether large language models can be turned into cognitive models. We find that – after finetuning them on data from psychological experiments – these models offer accurate representations of huma…
Virtuous Machines: Towards Artificial General Science
Artificial intelligence systems are transforming scientific discovery by accelerating specific research tasks, from protein structure prediction to materials design, yet remain confined to narrow doma…
We Are All Creators: Generative AI, Collective Knowledge, and the Path Towards Human-AI Synergy
This paper argues that generative AI should be understood not as a mimicry of human cognition, but as a form of alternative intelligence and alternative creativity, operating through distinct mechanis…
What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
Foundation models are premised on the idea that sequence prediction can uncover deeper domain understanding, much like how Kepler’s predictions of planetary motion later led to the discovery of Newton…
What are the Goals of Distributional Semantics?
Distributional semantic models have become a mainstay in NLP, providing useful features for downstream tasks. However, assessing long-term progress requires explicit long-term goals. In this paper, I …
What does it mean to understand language?
Language understanding entails not just extracting the surface-level meaning of the linguistic input, but constructing rich mental models of the situation it describes. Here we propose that because pr…
What the F*ck Is Artificial General Intelligence?
I’ll begin by defining intelligence and AGI. There are a number of positions [6, 2, 7–12]. Some peg AGI to human-level performance across a broad range of tasks [13, 1]. This is intuitive, but anth…
What we talk to when we talk to language models
David Chalmers. Quasi-interpretivism does not say anything about whether LLMs have beliefs and desires. But it does make it plausib…
When Large Language Models contradict humans? Large Language Models’ Sycophantic Behaviour
Large Language Models have been demonstrating the ability to solve complex tasks by delivering answers that are positively evaluated by humans due in part to the intensive use of human feedback that r…