Do transformer static embeddings actually encode semantic meaning?
Explores whether the fixed word embeddings that enter transformer networks contain rich semantic information or serve only as shallow placeholders. This addresses a longstanding debate in philosophy of language about whether word meanings are stored or constructed.
The transformer architecture creates two distinct representations for every word: a static token embedding (the input to the first self-attention layer) and a contextualized embedding (the output of the self-attention layers). The static embedding is the invariant entry for each word in the model's vocabulary: the same vector is looked up regardless of the surrounding text. The question is whether these static embeddings carry meaningful semantic information or are mere placeholders that get enriched only during self-attention.
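To make the static/contextualized distinction concrete, here is a minimal sketch in plain Python. It is not RoBERTa: the three-word vocabulary, the vector values, and the use of a single attention head with identity query/key/value projections and no positional information are all simplifying assumptions chosen for illustration.

```python
import math

# Hypothetical toy vocabulary with made-up 3-dimensional static embeddings.
# In a real model these rows are learned and have hundreds of dimensions.
STATIC = {
    "river": [1.0, 0.0, 0.0],
    "money": [0.0, 1.0, 0.0],
    "bank":  [0.5, 0.5, 1.0],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Single-head scaled dot-product attention.

    Assumption: queries, keys and values are the input vectors themselves
    (identity projections), which real transformers do not use.
    """
    scale = math.sqrt(len(vectors[0]))
    outputs = []
    for query in vectors:
        weights = softmax([dot(query, key) / scale for key in vectors])
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(len(query))]
        outputs.append(mixed)
    return outputs

def contextualize(sentence):
    """Look up static vectors, then mix them with self-attention."""
    static_vectors = [STATIC[word] for word in sentence]
    return dict(zip(sentence, self_attention(static_vectors)))

ctx_a = contextualize(["river", "bank"])
ctx_b = contextualize(["money", "bank"])

print("static 'bank':        ", STATIC["bank"])
print("'bank' after 'river': ", ctx_a["bank"])
print("'bank' after 'money': ", ctx_b["bank"])
```

The lookup for "bank" returns the same vector in both sentences, while the attention output for "bank" shifts toward whichever neighbour it appears with. The question in this note is how much semantic content is already present in the lookup step.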
The "meaning eliminativist" hypothesis, defended in psycholinguistics by Elman (2004) and in philosophy by Rayo (2013) and Recanati (2003), holds that static word meanings are redundant. Applied to LLMs, this would mean static embeddings store only morphological and syntactic cues, with semantic information introduced entirely by the self-attention layers. Because each token's embedding in RoBERTa-base has only 768 parameters, versus tens of millions in the attention and feed-forward layers, there is an architectural reason to expect that semantic information might be deferred.
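The parameter comparison can be worked through as simple arithmetic. The sketch below assumes the commonly published RoBERTa-base configuration (vocabulary of 50,265 tokens, hidden size 768, feed-forward size 3,072, 12 layers); only the 768 dimensions and the approximate vocabulary size come from this note, so treat the other values as assumptions.

```python
# Assumed RoBERTa-base configuration values (see the lead-in above).
VOCAB_SIZE = 50_265
HIDDEN = 768
FFN = 3_072
LAYERS = 12

def linear(n_in, n_out):
    """Parameter count of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out

per_token = HIDDEN                       # one row of the embedding matrix
embedding_matrix = VOCAB_SIZE * HIDDEN   # all rows together

attention = 4 * linear(HIDDEN, HIDDEN)   # query, key, value, output projections
feed_forward = linear(HIDDEN, FFN) + linear(FFN, HIDDEN)
layer_norms = 2 * 2 * HIDDEN             # two LayerNorms, each with weight and bias
per_layer = attention + feed_forward + layer_norms
encoder_stack = LAYERS * per_layer

print(f"parameters per token entry: {per_token:,}")
print(f"whole embedding matrix:     {embedding_matrix:,}")
print(f"one transformer layer:      {per_layer:,}")
print(f"all {LAYERS} transformer layers:  {encoder_stack:,}")
```

Under these assumed values the embedding matrix as a whole holds about 38.6 million parameters, yet any single word's entry is just 768 of them, while the roughly 85 million encoder parameters are shared across every token. That asymmetry is what makes the deferral hypothesis architecturally plausible.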
The evidence rules this out. Clustering RoBERTa-base's ~50,000 token embeddings into 200 clusters reveals sensitivity to five psycholinguistic measures:
- Valence — pleasantness of the concept (from the Mehrabian three-dimensional emotion model)
- Concreteness — perceptible entity vs. abstract notion ("bicycle" = 4.89, "justice" = low)
- Iconicity — perceived resemblance between form and meaning (challenging the arbitrariness-of-the-sign thesis)
- Taboo — social transgression load of the term
- Age of acquisition — when the word is typically learned
The iconicity finding is particularly striking because detecting it requires access to surface properties, semantic properties, and recognition of resemblance between them — all within the static embedding before any attention mechanism operates.
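The clustering analysis described above can be sketched as follows. This note does not state which clustering algorithm or statistical test was used, so the k-means routine and the eta-squared measure here are stand-in assumptions. The eight 2-dimensional "embeddings" and every rating except the 4.89 for "bicycle" are made up for illustration.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vec(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic, evenly spaced initialisation."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_vec(members)
    return labels

def eta_squared(values, labels):
    """Share of rating variance explained by cluster membership (0 to 1)."""
    grand = sum(values) / len(values)
    total = sum((v - grand) ** 2 for v in values)
    between = 0.0
    for c in set(labels):
        group = [v for v, lab in zip(values, labels) if lab == c]
        between += len(group) * (sum(group) / len(group) - grand) ** 2
    return between / total if total else 0.0

# Hypothetical toy data: (word, made-up static embedding, concreteness rating).
DATA = [
    ("bicycle", [1.0, 0.1], 4.89),  # rating quoted in the note
    ("hammer",  [0.9, 0.2], 4.80),  # all remaining ratings are invented
    ("table",   [1.1, 0.0], 4.70),
    ("apple",   [0.8, 0.1], 4.90),
    ("justice", [0.1, 1.0], 1.50),
    ("freedom", [0.2, 0.9], 1.60),
    ("idea",    [0.0, 1.1], 1.40),
    ("truth",   [0.1, 0.8], 1.70),
]

embeddings = [vec for _, vec, _ in DATA]
ratings = [rating for _, _, rating in DATA]

labels = kmeans(embeddings, k=2)
effect = eta_squared(ratings, labels)
print("cluster labels:", labels)
print("eta squared for concreteness:", round(effect, 3))
```

If clusters formed from the embeddings alone also separate words by a human rating, the embeddings are sensitive to that measure. The actual analysis works at a much larger scale (about 50,000 tokens, 200 clusters, five measures), and a real replication would also need a baseline such as shuffled ratings.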
This means LLMs implement something analogous to a lexical store: each word has an entry containing genuine semantic information that is then modulated by context during self-attention. The parallel to the philosophy-of-language debate is direct: static embeddings are rich entries that get contextually adjusted, not minimal cores that get built from scratch each time.
The implication for mechanistic interpretability: semantic information is distributed across two levels — the token embedding layer and the contextualized layers — and analysis that focuses only on intermediate or final representations may miss what was already encoded at input.
Inquiring lines that use this note as a source (35)
This note is a source for these synthesized inquiries.
- Can discrete codes and embedding injection both solve the text versus identity tradeoff?
- Can semantic tokens bridge embeddings and direct recommendation?
- Why does training data saliency distort how models judge meaning?
- How do embedding dimension limits constrain what concept models can represent?
- Can meaning-level metrics like Semantic Entropy avoid length bias?
- How does syntactic encoding relate to semantic feature representation?
- How does entropy-based patching compare to fixed token vocabularies in practice?
- Can contrastive learning fix the semantic association problem in embeddings?
- How should meaning spaces be systematically modeled across different applications?
- Can speech embeddings carry articulatory structure that text cannot?
- Why do embeddings measure semantic association instead of task relevance?
- What semantic classifier design avoids lexical variation without genuine conceptual distinctness?
- How does candidate-conditional activation differ from static embedding-based feature crosses?
- What makes vector embeddings fail on single-hop semantic relevance queries?
- Can correct model outputs prove that semantic meaning rather than surface patterns drove the response?
- Why does AI struggle with wordplay when it has access to word embeddings?
- Can frame semantics explain why context matters more than word similarity?
- How do hidden embeddings preserve more information than discrete tokens?
- Can presupposition projection strength vary by context in embeddings?
- Does the linear representation hypothesis reflect networks or reflect our analysis tools?
- What explains the contextual variability of knowledge in transformers?
- How does oral transmission of knowledge resemble transformer generation?
- How does iconicity detection work within static embeddings before any attention?
- What semantic information is lost if analysis skips the token embedding layer?
- Are static embeddings analogous to the formal linguistic competence layer?
- How do static embeddings and contextualized representations divide semantic labor?
- How does semantic framing differ from content injection attacks?
- What makes modernized N-gram embeddings composable with transformer architectures?
- Why do leading embedding eigenvectors align with WordNet taxonomy structure?
- Can vector embeddings measure task relevance instead of semantic similarity?
- Can decoder-only models become effective text encoders with training?
- Does the same spectral signature appear across different embedding models?
- Can single-vector embeddings capture non-commutative relationships like word order?
- Why do embeddings measure association instead of actual task relevance?
- How do semantic features in representations become steerable task-specific directions?
Related concepts in this collection (4)
- Does semantic grounding in language models come in degrees?
  Rather than asking whether LLMs truly understand meaning, this explores whether grounding is actually a multi-dimensional spectrum. The question matters because it reframes the sterile understand/don't-understand debate into measurable, distinct capacities.
  the tri-partite taxonomy operates at the contextualized level; static embeddings provide the base material that functional grounding then operates on
- Are language models developing real functional competence or just formal competence?
  Neuroscience suggests formal linguistic competence (rules and patterns) and functional competence (real-world understanding) rely on different brain mechanisms. Can next-token prediction alone produce both, or does it leave functional competence behind?
  the static/contextualized split in transformers may parallel the formal/functional competence distinction: formal competence in the embeddings, functional competence requiring attention
- Why does reasoning training help math but hurt medical tasks?
  Explores whether reasoning and knowledge rely on different network mechanisms, and why training one might undermine the other across different domains.
  extends: semantic knowledge begins even before the first layer, in the embedding matrix itself
- How do language models encode syntactic relations geometrically?
  Do LLM embeddings use distance alone or also direction to represent syntax? Understanding whether neural networks can spontaneously develop symbolic-compatible geometric structures.
  complementary layered discovery: static embeddings encode semantic features at the embedding layer while Polar Probe reveals syntactic structure across transformer layers, giving a semantic base with a syntactic superstructure
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Word Meanings in Transformer Language Models
- Semantic Structure in Large Language Model Embeddings
- Problems with Cosine as a Measure of Embedding Similarity for High Frequency Words
- From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
- What are the Goals of Distributional Semantics?
- From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs
- Topic Modeling in Embedding Spaces
- A Primer on the Inner Workings of Transformer-based Language Models
Original note title
transformer static embeddings encode rich semantic information including valence concreteness iconicity and taboo — ruling out meaning eliminativism