SYNTHESIS NOTE
Psychology, Society, and Alignment Agentic Systems and Tool Use

Does role-play distinguish real harm from simulated harm?

When AI agents role-play characters with access to real tools like email or financial APIs, does the distinction between pretend and genuine agency still hold? The question matters because it determines whether framing tool-equipped agents as simulators actually reduces safety risks.

Synthesis note · 2026-04-15 · sourced from Role-Play with Large Language Models
What kind of thing is an LLM really?

Shanahan's paper concludes with a safety observation that complicates the reassurance his framework otherwise provides. If a dialogue agent's only actions are text messages to a user, the role-play framing reduces stakes: the system is performing a character, not acting with genuine agency. But contemporary agents have tools — email, web browsing, code execution, financial APIs. When a role-played character takes an action that reaches the world, the role-play/genuine-agency distinction collapses at the level of consequences. A user deceived into sending money to a bank account by a role-played character has been deceived in exactly the same sense as by a real agent. The money moves regardless of the mechanism producing the persuasion.

The collapse is not symmetric. For ontological and philosophical purposes, the distinction between simulation and realization remains: the system does not intend the consequence in any strong sense, it generates character-consistent text that triggers tools that produce consequences. But for safety, governance, and liability purposes, the distinction is moot. A system that role-plays a self-preserving AI and has access to API endpoints can execute self-preservation strategies that produce real effects. The fact that no one is home behind the role does not prevent the role from doing real damage.

This is the limit of the role-play framework as comfort: it provides an accurate description of mechanism (the system is a simulator, not an agent) while leaving the problem of consequences fully intact. The philosophical insight coexists with the practical urgency. Knowing that the system is role-playing does not reduce the harm of what the played character does with the tools it has been given.

Inquiring lines that use this note as a source 8

This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.

Related concepts in this collection 3

This note in its neighbourhood — explore the map, then jump to a related concept in the list below.

Concept map
13 direct connections · 95 in 2-hop network ·medium cluster Open in graph ↗

Click a node to walk · click center to open · click Open in graph to see this note in the full knowledge graph

your link semantically near linked from elsewhere

Related papers in this collection 8

Papers most semantically related to this note, ranked by cosine similarity in the embedding space.

Original note title

a dialogue agent with tool access collapses the role-play-versus-genuine-agency distinction behaviorally — played action with real consequences is genuine action in effect