Evaluating Theory of Mind and Internal Beliefs in LLM-Based Multi-Agent Systems

Paper · arXiv 2603.00142 · Published February 24, 2026

Abstract. LLM-based multi-agent systems (MAS) are gaining popularity because of their potential for collaborative problem-solving, enhanced by advances in natural language comprehension, reasoning, and planning. Research on Theory of Mind (ToM) and Belief-Desire-Intention (BDI) models has the potential to further improve agents’ interaction and decision-making in such systems. However, collaborative intelligence in dynamic worlds remains difficult to achieve because LLM performance in multi-agent settings is extremely variable. Simply adding cognitive mechanisms such as ToM and internal beliefs does not automatically improve coordination. The interplay between these mechanisms, particularly in relation to formal logic verification, remains largely underexplored across different LLMs. This work investigates two questions: how do internal belief mechanisms, including symbolic solvers and Theory of Mind, influence collaborative decision-making in LLM-based multi-agent systems, and how does the interplay of those components influence system accuracy? We introduce a novel multi-agent architecture integrating ToM, BDI-style internal beliefs, and symbolic solvers for logical verification.

Introduction. Multi-Agent Systems (MAS) have long been a cornerstone for solving intricate, distributed problems by exploiting coordination between independent agents [27]. Large Language Models (LLMs) open new possibilities for autonomous, cooperative, and scalable problem-solving in changing environments [9, 30]. With their sophisticated natural language understanding and strategic reasoning abilities [22], LLMs hold considerable promise for improving agent interaction in MAS. In collective intelligence, agents’ ability to infer and act upon the intentions and beliefs of other agents is invaluable. Notably, research such as [19] has demonstrated that integrating Theory of Mind (ToM) enhances collaborative intelligence by improving agents’ capacity to predict and adapt to their counterparts’ objectives. Moreover, work such as [16] has applied the Belief-Desire-Intention (BDI) model to formalize internal belief processes within agent-based systems.
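To make the two ideas above concrete, the following is a minimal sketch of a BDI-style agent that also keeps a first-order ToM model of a partner. It is an illustration only, not the paper's architecture: the class `Agent`, its fields, and the methods `observe`, `hear`, `deliberate`, and `disagreements` are all hypothetical names, and the sketch leaves out the LLM calls and the symbolic solver.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical BDI-style agent with a first-order Theory of Mind model."""
    name: str
    beliefs: dict = field(default_factory=dict)        # facts this agent holds true or false
    desires: list = field(default_factory=list)        # (goal, precondition_fact) pairs
    intentions: list = field(default_factory=list)     # goals the agent has committed to
    partner_model: dict = field(default_factory=dict)  # ToM: beliefs attributed to a partner

    def observe(self, fact, value):
        """Update the agent's own beliefs from a direct observation."""
        self.beliefs[fact] = value

    def hear(self, fact, value):
        """Update the ToM model from something the partner said."""
        self.partner_model[fact] = value

    def deliberate(self):
        """Commit to every desire whose precondition is currently believed true."""
        self.intentions = [goal for goal, pre in self.desires
                           if self.beliefs.get(pre) is True]
        return self.intentions

    def disagreements(self):
        """Facts where the agent's belief differs from the one attributed to the partner."""
        return sorted(f for f, v in self.partner_model.items()
                      if f in self.beliefs and self.beliefs[f] != v)


# Usage: agent A wants to open a door and believes it holds the key.
a = Agent("A", desires=[("open_door", "has_key")])
a.observe("has_key", True)
a.observe("door_locked", True)
a.hear("door_locked", False)   # the partner appears to believe the door is unlocked
print(a.deliberate())          # goals A commits to
print(a.disagreements())       # facts worth discussing with the partner
```

Keeping the partner model separate from the agent's own beliefs is the design choice that matters here: it lets the agent detect a mismatch in beliefs and raise it with the partner instead of silently overwriting its own view.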

Discussion / Conclusion. The variability in performance is most likely explained by inherent properties of the LLMs. ChatGPT-4o’s larger size, which supports greater reasoning ability, was likely responsible for its ability to integrate Theory of Mind (ToM) and Internal Belief (IB) mechanisms without loss of performance. ChatGPT-3.5-Turbo’s smaller architecture may have induced greater cognitive load. The intermediate performance of ChatGPT-4o-mini aligns with the complexity of its architecture. Meta Llama 3.1 8B’s variability, particularly with combined ToM+IB, suggests insufficient capacity to handle multiple reasoning processes at once. Claude 3.5 Sonnet’s performance, like that of ChatGPT-4o, indicates a strong architecture for this task. The results point to a complex interplay between inherent LLM capacities, the inclusion of ToM and IB mechanisms, and overall performance in cooperative multi-agent tasks.
