Can large language models develop genuine world models without direct environmental contact?
Do LLMs extract meaningful world structures from human-generated text despite lacking direct sensory access to reality? This matters for understanding what kind of grounding and knowledge these systems actually possess.
Current LLMs lack direct causal grounding: they have no unmediated contact with the physical world, apart from early multimodal and robotics approaches. But an indirect path is available.
Training data is produced by causally grounded beings: humans who perceive, act in, and interact with the world. Taken together, human text and language data act like a huge mirror of the world, one that we created. Modern LLMs can extract lawlike structures and regularities from this data, forming representations that are structurally similar to parts of the world.
The argument from "Understanding AI" (Schneider 2024) is that the empirical successes of LLMs would be "downright mysterious" unless these systems form grounded world models. Their performance on world knowledge, physical reasoning, and factual recall points toward structured world representations, not just statistical fluency.
This is indirect causal grounding: functionally established through world model formation from causally grounded data, not through direct environmental interaction. It's grounding by proxy — the chain runs: world → human perception and action → human text → LLM training → LLM internal representation.
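The chain above can be illustrated with a deliberately tiny toy, offered only as an intuition pump and not as a claim about how LLMs work internally. Everything in it is hypothetical: the five place names, the `WORLD` layout, and the helpers `write_corpus` and `recover_order` are invented for this sketch. A hidden one-dimensional "world" is described in text by a stand-in for grounded humans, and a learner that sees only the text rebuilds the spatial order.

```python
from collections import defaultdict

# Hypothetical toy world: five places on a line. The learner never sees this.
WORLD = {"harbor": 0, "market": 1, "temple": 2, "mill": 3, "farm": 4}


def write_corpus(world):
    """Stand-in for grounded humans: state which places are adjacent."""
    names = sorted(world)
    return [
        f"{a} is next to {b}"
        for a in names
        for b in names
        if a < b and abs(world[a] - world[b]) == 1
    ]


def recover_order(corpus):
    """Learner with text-only access: rebuild the line from co-mentions."""
    neighbors = defaultdict(set)
    for sentence in corpus:
        a, _, _, _, b = sentence.split()  # "<a> is next to <b>"
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Endpoints of a line have exactly one neighbor; pick one deterministically.
    start = min(p for p, ns in neighbors.items() if len(ns) == 1)
    order, prev = [start], None
    while True:
        nxt = [n for n in neighbors[order[-1]] if n != prev]
        if not nxt:
            return order
        prev = order[-1]
        order.append(nxt[0])


corpus = write_corpus(WORLD)
print(recover_order(corpus))
# ['farm', 'mill', 'temple', 'market', 'harbor']
```

The recovered order is the mirror image of the hidden layout. That fits the note's wording: the learner's representation is structurally similar to the world, but nothing in the text fixes which end is which, and the learner has no way to check the result against the world itself.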
The limitation: the chain has gaps. LLMs cannot update world models through their own action and perception. They cannot verify claims against the world in real time. The models are frozen at training cutoff. But they are not worldless — the world is present in the representations, mediated.
This connects directly to the note "Do language models actually use their encoded knowledge?", which shows that even encoded world knowledge may fail to influence outputs. Indirect causal grounding does not guarantee that world knowledge is actually used.
Inquiring lines that use this note as a source 21
This note is a source for these synthesized inquiries.
- What makes LLM outputs fabrication rather than hallucination or confabulation?
- Can secondary orality exist without any embodied human participant at all?
- Can world models form from aggregated partial information across training distributions?
- Do LLMs genuinely internalize human psychological structure or match surface patterns?
- Can LLMs use implicit background knowledge the way humans do in ordinary conversation?
- Can a world model have rich representations without adequate data coverage?
- Can LLMs participate meaningfully in discourse without consciousness or understanding?
- How do LLMs access and draw on the same shared symbolic universe as humans?
- How do world models create indirect causal grounding without physical environment contact?
- How can structurally different text produce equivalent real-world effects?
- Can language models develop world models that ground meaning in causal reality?
- Can understanding language happen entirely within a language system alone?
- Can LLMs develop genuine understanding without embodied experience?
- How much semantic meaning survives when LLMs paraphrase poetry and literary text?
- What would consciousness require that pure roleplay LLMs cannot provide?
- Why must world models be nested rather than flat and uniform?
- Can language models learn internal world models without explicit environment specifications?
- Do LLMs need world models to make accurate predictions?
- Does sequence prediction accuracy prove an underlying world model exists?
- What's the difference between representing world facts and generating world mechanisms?
- Does language convey meaning purely through relational structure without external grounding?
Related concepts in this collection 4
- Does semantic grounding in language models come in degrees?
  Rather than asking whether LLMs truly understand meaning, this explores whether grounding is actually a multi-dimensional spectrum. The question matters because it reframes the sterile understand/don't-understand debate into measurable, distinct capacities.
  this is the causal dimension
- Do language models actually use their encoded knowledge?
  Probes can detect that LMs encode facts internally, but do those encoded facts causally influence what the model generates? This explores the gap between knowing and doing.
  the gap between encoded world model and generative use
- Do classical knowledge definitions apply to AI systems?
  Classical definitions of knowledge assume truth-correspondence and a human knower. Do these assumptions hold for LLMs and distributed neural knowledge systems, or do they need fundamental revision?
  different framing of what LLM knowledge is
- Can AI systems learn social norms without embodied experience?
  Large language models exceed individual human accuracy at predicting collective social appropriateness judgments. Does this reveal that embodied experience is unnecessary for cultural competence, or do systematic AI failures point to limits of statistical learning?
  social norms as evidence for indirect causal grounding: text encodes cultural norms produced by causally grounded humans, and LLMs extract these regularities well enough to outperform individual humans at predicting collective consensus
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- “Understanding AI”: Semantic Grounding in Large Language Models
- Word Meanings in Transformer Language Models
- Cognitive Architectures for Language Agents
- Large Language Models Can Infer Psychological Dispositions of Social Media Users
- Language Models’ Hall of Mirrors Problem: Why AI Alignment Requires Peircean Semiosis
- Probing Structured Semantics Understanding and Generation of Language Models via Question Answering
- Query Rewriting for Retrieval-Augmented Large Language Models
- Computational structuralism: Toward a formal theory of meaning in the age of digital intelligence
Original note title
llms develop world models that constitute indirect causal grounding despite lacking direct environmental contact