SYNTHESIS NOTE

Why do LLMs fail when simulating agents with private information?

Explores whether single-model control of all social participants masks fundamental limitations in how LLMs handle information asymmetry and genuine uncertainty about others' knowledge.

Synthesis note · 2026-02-23 · sourced from Social Theory Society

Most LLM social simulations use a single model to generate all participants, an omniscient perspective fundamentally at odds with how real social interaction works. When evaluated in non-omniscient settings that preserve information asymmetry, LLMs struggle.

The "Is this the real life?" evaluation framework (2024) demonstrates this by comparing omniscient simulation (one LLM controls all parties) against non-omniscient simulation (separate LLM instances with private information). The performance gap is systematic: models that appear socially competent in omniscient mode fail when they must reason under genuine uncertainty about what the other party knows, wants, or intends.
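The structural difference between the two modes can be sketched in Python. This is a minimal illustration under stated assumptions, not the framework's actual code: `run_non_omniscient`, `run_omniscient`, the `Speaker` type, and the stub `echo_stub` are hypothetical names introduced here, and no real model is called.

```python
from typing import Callable, Dict, List

# A "speaker" is any function from a prompt string to an utterance.
# In practice it would wrap an LLM call; here it is a stub (assumption).
Speaker = Callable[[str], str]


def run_non_omniscient(shared: str, private: Dict[str, str],
                       speakers: Dict[str, Speaker], turns: int) -> List[str]:
    """Alternate turns between separate speakers.

    Each speaker is prompted with the shared context, its OWN private
    information, and the transcript so far. It never receives the other
    party's private information.
    """
    transcript: List[str] = []
    names = list(speakers)
    for t in range(turns):
        name = names[t % len(names)]
        prompt = "\n".join([shared, private[name], *transcript])
        transcript.append(f"{name}: {speakers[name](prompt)}")
    return transcript


def run_omniscient(shared: str, private: Dict[str, str],
                   scriptwriter: Speaker) -> str:
    """One model receives every party's private information at once
    and writes the whole dialogue in a single call."""
    prompt = "\n".join([shared, *private.values()])
    return scriptwriter(prompt)


def echo_stub(prompt: str) -> str:
    # Placeholder for a model call; always returns the same line.
    return "(utterance)"


private = {"A": "A knows X", "B": "B knows Y"}
dialogue = run_non_omniscient("Shared scene.", private,
                              {"A": echo_stub, "B": echo_stub}, turns=4)
script = run_omniscient("Shared scene.", private, echo_stub)
```

In the non-omniscient loop, anything one party learns about the other has to arrive through the transcript. In the omniscient call it is already in the prompt, so the uncertainty the evaluation is meant to probe never arises.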

This matters because real social interaction is defined by information asymmetry. In SOTOPIA's scenarios, agents have shared context but private goals — "Your goal is to buy the chair for $80" is visible only to the buyer. The Secret dimension (what agents must hide) directly requires information management that omniscient models bypass entirely.
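One way to picture "shared context but private goals" is as a small record with per-role views. The sketch below is illustrative only and does not reproduce SOTOPIA's data format: the `Scenario` class, its method names, the shared-context wording, and the seller's goal are assumptions made up for this example. Only the buyer's goal string is taken from the text above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Scenario:
    shared_context: str
    # Maps a role name to the goal text that only that role should see.
    private_goals: Dict[str, str] = field(default_factory=dict)

    def view_for(self, role: str) -> str:
        """What a non-omniscient agent playing `role` is shown."""
        return f"{self.shared_context}\n{self.private_goals[role]}"

    def omniscient_view(self) -> str:
        """What a single model controlling every role is shown."""
        return "\n".join([self.shared_context, *self.private_goals.values()])


chair = Scenario(
    shared_context="A buyer and a seller discuss a used chair.",  # invented wording
    private_goals={
        "buyer": "Your goal is to buy the chair for $80",  # quoted from the note
        "seller": "Your goal is to sell the chair for a high price",  # hypothetical
    },
)

buyer_view = chair.view_for("buyer")
seller_view = chair.view_for("seller")
full_view = chair.omniscient_view()
```

Here the seller's view never contains the $80 target, while the omniscient view contains both goals. A generator working from `full_view` has nothing to conceal from itself, which is the sense in which omniscient simulation bypasses information management.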

The implication for persona simulation research is direct. On the evidence in "Can AI agents learn people better from interviews than surveys?", simulation fidelity appears high. But if that fidelity was measured under omniscient conditions, it overstates real-world applicability. Given the argument in "Do language models actually build shared understanding in conversation?" that models skip grounding acts, the failure under information asymmetry is predictable: models that skip grounding work will fail precisely when grounding is most needed, that is, when the parties have genuinely different information states.

As "Why do language models skip the calibration step?" describes, current LLMs assume shared understanding rather than building it through dialogue. Non-omniscient simulation demands exactly the dynamic grounding that they systematically lack.

Related concepts in this collection (4)

Do language models actually build shared understanding in conversation?
When LLMs respond fluently to prompts, do they perform the communicative work humans do to establish mutual understanding? Research suggests they skip the grounding acts that make dialogue reliable.
the mechanism: omniscient simulation lets models skip grounding work entirely

Why do language models skip the calibration step?
Current LLMs assume shared understanding rather than building it through dialogue. This explores why that design choice persists and what breaks when it fails.
non-omniscient settings demand the dynamic mode

Can AI agents learn people better from interviews than surveys?
Can rich interview transcripts seed more accurate generative agents than demographic data or survey responses? This matters because it challenges how we build digital simulations of real people.
simulation fidelity may overstate real-world capacity if measured under omniscient conditions

How do we generate realistic personas at population scale?
Current LLM-based persona generation relies on ad hoc methods that fail to capture real-world population distributions. The challenge is reconstructing the joint correlations between demographic, psychographic, and behavioral attributes from fragmented data.
another mechanism producing simulation overconfidence

Original note title

omniscient social simulation fails under real-world information asymmetry because single-model control eliminates distributed cognition
