Are language models and human speakers doing the same thing?

Does treating LLM output and human communication as equivalent operations mask fundamental differences in how they work? This distinction shapes how we assess AI capabilities and risks.

Synthesis note · 2026-04-14

The phrase "language model" suggests that the system is modeling language. The implicit ontology treats language as a single thing: strings produced by speakers, governed by grammar and meaning, deployed to convey information. On this ontology, LLMs and humans are doing the same kind of thing with language; LLMs may do it less competently (they do not "understand meaning the way we do"), but the operation is the same in kind.

This is a category error. Human use of language is communicative: language is the medium through which one person addresses another to perform a relational act. The strings are not the operation; the addressing is. LLM use of language is generative: strings are produced by sampling from a learned probability distribution over continuations. The strings are the operation; there is no addressing because there is no one being addressed in the sense the human operation requires.

The two operations look the same from outside (both produce strings) but are structurally different in what produces the strings, what they do in the world, and what receivers should do with them. Treating them as the same operation misframes nearly every important question. "Will AI replace writers?" presupposes that writers do what AI does at a different speed. "Are AI conversations real conversations?" presupposes that conversation is a string-production activity rather than a relational act. "Can AI tell jokes?" presupposes that jokes are strings rather than addressed acts. Each question is malformed by the implicit equivalence.

The ML community has institutional reasons for the equivalence. Working with strings is tractable; working with relational acts is not. Benchmarks measure string quality; they cannot easily measure addressed acts. Training distributions are corpora of strings; corpora of communicative acts are categorically harder to construct. The methodological convenience of treating language as strings hardens into an implicit ontology that treats human language use as a string operation. The category error is convenient, which is why it persists.

The implication is that AI commentary proceeding from the implicit equivalence inherits its failure mode. The related note "Why does rigorous-sounding AI commentary often misdiagnose how models work?" makes the meta-claim about what happens when commentators import cognitive vocabulary; this note describes the prior framing that makes that import seem reasonable. Resolving questions about AI's social and epistemic effects requires first making the operational distinction explicit.

The strongest counterargument is that sufficient advances in LLM capability will close the gap, making the distinction moot. The reply is that the distinction is structural, not capability-based. A system that produces strings without addressing anyone is performing a different operation from one that addresses, regardless of how well the produced strings imitate addressed strings.

Related concepts in this collection 3

Does AI really communicate or just distribute information? Explores whether AI's content generation counts as communication in the relational, social sense—or whether it's something structurally different that only mimics communication through its interface.
the operational claim this is the meta-discourse version of
Why do dialogue failures persist despite scaling language models? If LLMs get better at text tasks with more training data, why don't dialogue-specific problems improve the same way? The question explores whether dialogue failures are capability gaps or structural training mismatches.
the training-side explanation for why the equivalence fails empirically
Why does rigorous-sounding AI commentary often misdiagnose how models work? Expert commentary on AI frequently cites real research and sounds carefully reasoned, yet reaches conclusions built on unwarranted cognitive attributions. What makes this pattern so persistent in AI analysis?
the consequence in the AI commentary literature

Original note title

the ML and AI community fails to distinguish LLM-generated language from human communicative language
