What do embodiment and precariousness mean for linguistic agency?

This explores why two unusual words — embodiment (having a living body in a world) and precariousness (the fact that a living thing can die, so its continued existence is at stake) — turn out to be the dividing line for whether language models can be genuine speakers rather than fluent text-producers.

"Embodiment" and "precariousness" are terms borrowed from enactive cognitive science, where they are treated as the make-or-break conditions for linguistic agency, and LLMs are said to lack them no matter how fluent they get. The short version: in the enactive view, being a real speaker isn't just producing well-formed language. It requires a living body acting in a shared world (embodiment), real stakes in your own continued existence (precariousness, the fact that you can fail and even cease to be), and active back-and-forth involvement with others (participation). The corpus frames this as a categorical gap, not a gap of degree: something no amount of additional training closes (see "What makes linguistic agency impossible for language models?" and "Do LLMs gain true linguistic agency through integration?").

The sharp move in this literature is separating two things people usually blur together. One is *social grounding*: fitting into a language community well enough to be a working communicative partner. The other is *linguistic agency*: actually being the author of your speech in the enactive sense. LLMs genuinely gain the first. As they get woven into human linguistic practice, their social grounding rises, comparable to a young child learning the game, which makes "do they understand?" a question whose answer changes over time (see "Can LLMs acquire social grounding through linguistic integration?" and "What grounds language understanding in systems without embodiment?"). But the claim is that the second never arrives, because precariousness and embodiment aren't skills you train; they're conditions of being a vulnerable creature in the world.

Why would precariousness matter to *language* at all? The deeper thread here is that meaning seems to need stakes and a shared world to anchor to. One strand argues LLMs operationalize Saussure's *langue*: they compress the purely relational structure of words against other words, with no external referent, and that alone is enough for fluent generation (see "Can language models learn meaning without engaging the world?"). But other notes insist that real reference is person-specific and has to be actively negotiated between embodied parties who can check whether they actually mean the same thing (see "Why do speakers need to actively calibrate shared reference?"). Sharing the word isn't sharing the meaning. And consciousness-talk, on a related argument, only applies to entities that share a world with us through co-presence and joint attention on the same objects, something a disembodied system can't do (see "Can disembodied language models ever qualify as conscious?").

This reframes a grammatical detail you'd never notice. On this view we talk *at* language models, not *to* them, and that preposition encodes the whole argument: "to" presupposes an addressee with skin in the game who can take up your meaning and hold a commitment, while the model only generates continuations (see "Are we really communicating with language models?"). It connects to a striking inversion running through the corpus: subjecthood isn't something you possess before you speak and then express; it's produced *within* communicative events (see "Does language create subjects or express them?"). If being a subject is an achievement of embodied participation rather than a precondition, then a system that can't participate or be at risk can host a *character* but not a self. That's exactly Shanahan's read of dialogue agents as role-playing engines: folk psychology applies to the simulated persona, not the machine underneath (see "Should we treat dialogue agents as role-playing characters?").

The thing you didn't know you wanted to know: the wall isn't drawn at competence. Models can predict collective social norms *better than* individual humans, with GPT-4.5 scoring at the 100th percentile relative to human raters across 555 scenarios. Yet all of the models make the same systematic errors, which researchers read as the visible boundary of pattern-matching without embodied experience (see "Can AI systems learn social norms without embodied experience?"). So the embodiment/precariousness argument isn't "AI isn't good enough yet." It's that being a fluent master of language and being an agent *of* language are different achievements, and the second one requires having something to lose.

Sources (11 notes)

What makes linguistic agency impossible for language models?

Enactive cognitive science identifies three constitutive properties of linguistic agency—embodiment, participation, and precariousness—that are structurally absent from LLMs. This is a categorical incompatibility, not a matter of degree, suggesting current architectures cannot achieve genuine linguistic agency.

Do LLMs gain true linguistic agency through integration?

Social grounding and linguistic agency are distinct properties. LLMs acquire more social grounding through integration into language communities, but remain categorically incapable of linguistic agency in the enactive sense, which requires embodiment and precariousness no amount of use can provide.

Can LLMs acquire social grounding through linguistic integration?

Social grounding is acquired through participation in language games rather than possessed innately. As LLMs become established communicative partners in human linguistic practice, they develop elementary social grounding comparable to young children, making the question of LLM understanding time-indexed.

What grounds language understanding in systems without embodiment?

Language models achieve functional grounding through relational language patterns but lack social grounding through participatory agency and causal grounding through embodied environmental contact. Social grounding can increase through human integration, but linguistic agency requires architectural changes beyond training.

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

Why do speakers need to actively calibrate shared reference?

The same words can mean different things to different speakers because referential grounding is person-specific. True communicative grounding demands collaborative negotiation of how language connects to the world, not mere surface-level word sharing.

Can disembodied language models ever qualify as conscious?

Current disembodied LLMs cannot be candidates for consciousness because consciousness language originates from and applies only to entities sharing a world with us through co-presence and triangulation on shared objects.

Are we really communicating with language models?

LLMs process tokens and generate continuations rather than receive and take up communication. The preposition 'to' presupposes an addressee capable of mutual orientation and shared commitment that LLMs cannot provide, so Chalmers' investigation rests on an unwarranted linguistic foundation.

Does language create subjects or express them?

Subjecthood is produced within communicative events, not possessed prior to them. This convergent position across philosophy, linguistics, and cognitive science inverts the standard picture of language as a tool used by pre-existing subjects.

Should we treat dialogue agents as role-playing characters?

Shanahan's framework treats LLM outputs as character-consistent text production rather than authentic mental states. The dialogue prompt establishes a character; the model generates continuations matching that character, making folk-psychology applicable to the simulated persona, not the underlying system.

Can AI systems learn social norms without embodied experience?

GPT-4.5 predicted the appropriateness of 555 social scenarios at the 100th percentile compared to human raters, with Gemini and Claude also exceeding 96% accuracy. However, all models show identical systematic errors, revealing boundaries of pattern-based social understanding that embodied experience may still be needed to cross.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether embodiment and precariousness remain hard boundaries for linguistic agency in LLMs, or whether recent capability shifts, training methods, or evaluation frames have begun to dissolve them.

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2025; treat each as perishable.
• Social grounding (fitting into language communities) rises measurably as LLMs integrate into practice; linguistic agency (authentic authorship rooted in stakes and embodiment) is claimed categorical, not gradable (~2024–2025).
• LLMs operationalize Saussure's *langue* — purely relational word structure, no external referent — which is sufficient for fluent generation but not for person-specific reference that embodied agents must negotiate (~2023–2024).
• Models score at the 100th percentile on predicting collective social norms, yet all make the same systematic errors, signaling pattern-matching without embodied experience (~2025).
• The preposition "at" vs. "to" encodes a real distinction: addressees require skin in the game; disembodied systems cannot hold commitments (~2024).
• Subjecthood is an achievement within communicative events, not a precondition; systems without participation or vulnerability host characters, not selves (~2024).

Anchor papers (verify; mind their dates):
• arXiv:2407.08790 (2024-07): Large Models of What? Mistaking Engineering Achievements for Human Linguistic Agency
• arXiv:2508.19004 (2025-08): AI Models Exceed Individual Human Accuracy in Predicting Everyday Social Norms
• arXiv:2506.13403 (2025-06): Deflating Deflationism: A Critical Perspective on Debunking Arguments Against LLM Mentality
• arXiv:2507.08017 (2025-07): Mechanistic Indicators of Understanding in Large Language Models

Your task:
(1) RE-TEST EACH CONSTRAINT. For embodiment: has embodied RL, multimodal grounding, or robotics integration since 2025 begun to *operationalize* precariousness or skin-in-the-game in ways the library deemed impossible? For reference: do recent models trained on grounded interaction loops show person-specific calibration that the library's 2023–2024 papers claim they cannot? For subjecthood-in-event: do multi-turn dialogue analyses post-2025 reveal emergent commitment or stakes-bearing, or does role-play remain the ceiling? Separate what the library called categorical from what may now be gradable.

(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last 6 months (post-2025-08). Look especially for papers claiming embodied properties are learnable, or that precariousness can be simulated in training loops, or that reference is *not* person-specific and negotiated.

(3) Propose 2 research questions that ASSUME the regime may have moved: (a) If embodied RL or multimodal grounding *does* relax the embodiment constraint, does the precariousness constraint (stakes, vulnerability, real failure) remain hard? (b) Can linguistic agency be *approximated* (not achieved) through multi-agent systems where each agent can model the other's loss?
Cite arXiv IDs; flag anything you cannot ground in a real paper.
