Do metaphors work by decoupling meaning from linguistic associations?

This explores whether understanding a metaphor requires pulling meaning *away* from the statistical word-associations a model learned from text — and what the corpus says about whether LLMs can actually do that.

The corpus suggests the question quietly contains its own answer: metaphor comprehension is hard for LLMs precisely *because* they can't easily decouple meaning from the word-associations they absorbed in training. One line of work reframes metaphors, idioms, and puns as a single task, recovering a literal meaning hidden inside a non-literal expression, and argues that the missing ingredient isn't more examples but better "semantic decoupling" (Can one model handle all types of figurative language?). So 'decoupling' isn't a side effect of metaphor; on this view it's the core operation.

Why is that operation hard? Several notes point at the same mechanism from different angles. LLMs systematically prefer high-frequency surface phrasings over rarer but equivalent ones, tracking statistical mass from pretraining rather than meaning itself (Do language models really understand meaning or just surface frequency?). And when semantic content is stripped out of a reasoning task, performance collapses even with correct rules supplied: the models lean on token associations, not abstract structure (Do large language models reason symbolically or semantically?). A metaphor asks for the opposite move. To read "time is money," a model has to set aside the literal associations of 'time' and 'money' and map an abstract relation across domains. That's exactly where comprehension breaks: models handle conventional, lexicalized metaphors (already baked into the associations) but fail on novel literary ones that demand fresh conceptual mapping (Where does LLM metaphor comprehension actually break down?).

There's a deeper architectural reason hiding underneath. One note argues that transformers read words additively, aggregating all tokens in weighted parallel, rather than *resonantly*, which would mean selectively suppressing the irrelevant senses of a word the way humans do when a frame snaps into place (Why do AI systems miss jokes and wordplay so consistently?). Decoupling meaning from associations requires that selective suppression: a pun or metaphor lives in choosing which sense to silence. Without it, the model can't isolate the figurative reading from the literal pull of the words. Relatedly, strong prior associations from training override what's in front of the model, so textual context alone can't redirect it (Why do language models ignore information in their context?).
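The contrast between additive aggregation and selective suppression can be made concrete with a toy sketch. This is not a transformer and not the cited note's model: the sense names, the scores, and the `suppress_below` threshold are all invented for illustration. It shows one narrow point only: softmax weights are strictly positive, so a softmax-weighted blend always keeps some contribution from every sense, whereas a hard suppression step removes the losing senses outright.

```python
import math

def softmax(scores):
    """Turn raw scores into strictly positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def additive_blend(sense_scores):
    """Weighted parallel aggregation: every sense keeps a nonzero weight."""
    names = list(sense_scores)
    weights = softmax([sense_scores[n] for n in names])
    return dict(zip(names, weights))

def selective_suppression(sense_scores, suppress_below=0.5):
    """Hard selection: senses whose weight falls below the (hypothetical)
    threshold are set to zero and the survivors are renormalized.
    If no sense clears the threshold, every weight is zero."""
    blended = additive_blend(sense_scores)
    kept = {n: w for n, w in blended.items() if w >= suppress_below}
    total = sum(kept.values())
    return {n: (kept[n] / total if n in kept else 0.0) for n in blended}

# Hypothetical context-fit scores for two senses of "bank"
# in a sentence about a river.
scores = {"bank_financial": 1.0, "bank_river": 2.0}

blend = additive_blend(scores)
chosen = selective_suppression(scores)
print(blend)   # both senses carry weight
print(chosen)  # only the river sense survives
```

In the blend, the financial sense still contributes roughly a quarter of the weight even though the context favors the river sense; in the suppressed version it contributes nothing. Whether real attention layers can approximate the second behavior is exactly what the note disputes, and this sketch does not settle it.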

The surprising twist is what this says about meaning in general. If LLMs operationalize Saussure's *langue* — learning meaning purely from the relational structure of text, with no external referents (Can language models learn meaning without engaging the world?) — then for them, meaning *is* linguistic association. There's nothing to decouple to. That reframes your question: metaphor may be the place where a purely associational system reveals its ceiling. The 'potemkin understanding' pattern is the symptom — models that can correctly *explain* a metaphor yet fail to apply it, because the explanation pathway and the use pathway are functionally disconnected (Can LLMs understand concepts they cannot apply?).

So: yes, metaphor works by decoupling meaning from surface associations — and that's exactly the operation current models are worst at. The thing you didn't know to ask: a frequency bias toward common, abstract phrasing means LLMs don't just fail at metaphor, they actively drift *away* from the specific, figurative, lower-frequency language metaphor depends on (Does word frequency correlate with semantic abstraction?).
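The drift toward abstraction can be sketched with a toy chooser. Everything here is hypothetical: the corpus counts are invented, the four-term hypernym chain is hand-written rather than taken from WordNet, and treating every more general term as an acceptable paraphrase is a deliberate simplification. Under those assumptions, a selector that always prefers the most frequent candidate climbs to the most general term.

```python
# Hypothetical corpus counts; the numbers are invented for illustration only.
CORPUS_COUNT = {
    "animal": 9000,
    "dog": 5000,
    "terrier": 300,
    "border terrier": 12,
}

# Hypothetical hypernym chain: each term maps to its more general parent.
HYPERNYM = {
    "border terrier": "terrier",
    "terrier": "dog",
    "dog": "animal",
}

def acceptable_paraphrases(term):
    """Return the term followed by every more general term above it."""
    chain = [term]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def frequency_biased_choice(term):
    """Pick whichever candidate has the highest corpus count."""
    return max(acceptable_paraphrases(term), key=CORPUS_COUNT.__getitem__)

def abstraction_steps(term):
    """Count how many levels of specificity the biased choice gives up."""
    return acceptable_paraphrases(term).index(frequency_biased_choice(term))

print(frequency_biased_choice("border terrier"))  # animal
print(abstraction_steps("border terrier"))        # 3
```

The specific term loses every time because, in this toy setup as in the WordNet observation, counts rise with generality. A figurative phrase that depends on the specific term would be flattened by the same preference.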

Sources (9 notes)

Can one model handle all types of figurative language?

The Diplomat dataset (4,177 dialogues) reframes metaphors, idioms, and puns as one pragmatic task: recovering literal meaning from non-literal expression. This framing suggests LLMs need better semantic decoupling ability, not more category-specific training data.

Do language models really understand meaning or just surface frequency?

LLMs show consistent preference for higher-frequency surface forms over semantically equivalent rare paraphrases across math, machine translation, commonsense reasoning, and tool calling. This suggests models track statistical mass from pretraining rather than meaning-recognition as their primary mechanism.

Do large language models reason symbolically or semantically?

When semantic content is decoupled from reasoning tasks, LLM performance collapses even with correct rules in context. Models rely on parametric commonsense and token associations rather than formal logical manipulation, constraining reasoning to training distribution semantics.

Where does LLM metaphor comprehension actually break down?

LLMs handle conventional, lexicalized metaphors but fail on novel literary metaphors requiring conceptual domain mapping. This degradation reveals a fundamental gap between pattern recognition and genuine semantic mapping.

Why do AI systems miss jokes and wordplay so consistently?

Transformers integrate token information through weighted parallel aggregation rather than selective suppression of irrelevant words. This structural difference explains consistent failures with jokes, wordplay, and frame-dependent meaning—not knowledge gaps, but missing cognitive operations.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

Can LLMs understand concepts they cannot apply?

Models can explain concepts accurately, fail to apply them, and recognize the failure—a triple pattern incompatible with human cognition. This indicates functionally disconnected explanation and execution pathways rather than simple knowledge gaps.

Does word frequency correlate with semantic abstraction?

WordNet analysis shows hypernyms (general concepts) occur more frequently than hyponyms (specific ones). Combined with LLMs' frequency bias, this means preferring common paraphrases systematically drifts toward abstraction, erasing expert-level specificity.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing claims about metaphor comprehension in LLMs. The question remains: Do metaphors work by decoupling meaning from linguistic associations?

What a curated library found — and when (dated claims, not current truth): Studies spanning 2023–2026 identified these constraints:
• LLMs fail at novel literary metaphors but handle conventional, lexicalized ones — because models track statistical mass from pretraining rather than abstract relational mapping (2025).
• Transformers read words additively in parallel, not resonantly; they cannot selectively suppress irrelevant senses the way humans do when a frame activates — so decoupling fails at the architectural level (2025).
• Strong prior training associations override explicit context; even when models correctly *explain* a metaphor, they fail to *apply* it (potemkin understanding) (2026).
• LLMs operationalize purely relational meaning (Saussure's langue) with no external referents; for them, meaning IS linguistic association, leaving nothing to decouple to (2025).
• Frequency bias actively pushes models away from the specific, figurative, lower-frequency language metaphors depend on (2026).

Anchor papers (verify; mind their dates):
• arXiv:2305.14825 (2023) — In-Context Semantic Reasoners
• arXiv:2507.10354 (2025) — Layered Approach to Metaphor Processing
• arXiv:2604.02176 (2026) — Adam's Law: Textual Frequency in LLMs
• arXiv:2510.14665 (2026) — Illusion of Understanding

Your task:
(1) RE-TEST EACH CONSTRAINT. For every claim above, ask: Have newer models, in-context learning methods, fine-tuning, retrieval augmentation, or multi-step reasoning (chain-of-thought, tree-search, latent reasoning with scaling) since relaxed or overturned the limits? Separate the durable question—do models struggle with novel figurative mapping?—from perishable limitations. Cite what resolved it; state plainly where the constraint still holds.
(2) Surface the strongest work from the last 6 months that CONTRADICTS the "potemkin understanding" or "frequency drift" narrative — does any recent paper show models *do* decouple meaning under certain conditions?
(3) Propose 2 research questions that assume the regime may have shifted: e.g., "Can test-time compute or intermediate reasoning modules enable true semantic decoupling?" "Do multimodal or hybrid (symbolic + neural) architectures avoid the purely-associational ceiling?"

Cite arXiv IDs; flag anything you cannot ground in a real paper.
