INQUIRING LINE

How do co-occurrence statistics alone produce hierarchical concept organization?

This explores how the plain statistics of which words appear near which other words — with no built-in tree-builder — can give rise to the nested, general-to-specific organization of concepts inside language models.


The short version the corpus offers: the hierarchy isn't added, it falls out of the math. When you take the matrix of how often words co-occur and look at its spectral structure (its eigenvectors), the nested geometry of concepts is already there as a direct consequence of corpus statistics, not something the model learned a special trick to represent (see: Where does hierarchical structure in language models come from?).
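As a minimal sketch of what "look at its spectral structure" could mean in practice, the Python snippet below builds a symmetric co-occurrence count matrix from a tiny corpus and takes its eigendecomposition with NumPy. The corpus, the window size of 3, and the use of raw counts (rather than a reweighting such as PMI) are all assumptions made for illustration, not details taken from the cited note.

```python
import numpy as np

# Hypothetical toy corpus; real analyses use far larger text collections.
corpus = [
    "the dog chased the cat",
    "the cat watched the dog",
    "the car passed the bus",
    "the bus followed the car",
]

window = 3  # assumed context window (tokens on each side)
vocab = sorted({word for line in corpus for word in line.split()})
index = {word: k for k, word in enumerate(vocab)}

# counts[i, j] = how often word j appears within `window` tokens of word i.
counts = np.zeros((len(vocab), len(vocab)))
for line in corpus:
    tokens = line.split()
    for pos, word in enumerate(tokens):
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for ctx in range(lo, hi):
            if ctx != pos:
                counts[index[word], index[tokens[ctx]]] += 1

# The matrix is symmetric by construction, so eigh applies.
eigenvalues, eigenvectors = np.linalg.eigh(counts)

# eigh returns ascending eigenvalues; reorder so the strongest signal is first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]
```

On a corpus this small the eigenvectors carry no real semantic signal. The sketch only shows the pipeline: count, decompose, and read the directions off in order of eigenvalue.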

What makes this concrete is the *order* in which that structure appears. The leading eigenvectors of an embedding's Gram matrix carve the concept space coarsely first (broad branches like animal-vs-object) and then progressively finer ones, level by level, in a way that lines up strikingly well with WordNet's hand-built hypernym tree (see: Do embedding eigenvectors organize taxonomy from coarse to fine?). So 'hierarchy' here isn't a stored tree; it's a spectral ordering, where the strongest statistical signal happens to be the most general distinction and weaker signals are the finer ones.
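The coarse-to-fine ordering can be seen in a toy case. The sketch below assumes a hand-made similarity matrix over eight hypothetical words, where similarity grows with the depth of the shared ancestor in a three-level tree; the words and the values 0.7, 0.4, and 0.1 are invented for illustration and are not drawn from any trained embedding. After centering, the eigenvectors are sorted by eigenvalue and checked against the tree levels.

```python
import numpy as np

# Hypothetical vocabulary: animals (dogs, cats) vs objects (cars, boats).
words = ["beagle", "poodle", "tabby", "siamese",
         "sedan", "coupe", "kayak", "canoe"]

def toy_similarity(i, j):
    """Assumed similarity: the deeper the shared ancestor, the higher the value."""
    if i == j:
        return 1.0
    if i // 2 == j // 2:   # same sub-branch (e.g. two dog breeds)
        return 0.7
    if i // 4 == j // 4:   # same broad branch (e.g. both animals)
        return 0.4
    return 0.1             # different broad branches

n = len(words)
gram = np.array([[toy_similarity(i, j) for j in range(n)] for i in range(n)])

# Double-center to remove the uniform direction shared by every word.
centering = np.eye(n) - np.ones((n, n)) / n
centered = centering @ gram @ centering

eigenvalues, eigenvectors = np.linalg.eigh(centered)
order = np.argsort(eigenvalues)[::-1]          # strongest direction first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Leading direction: sign separates animals (first four) from objects (last four).
coarse_signs = np.sign(eigenvectors[:, 0])

# Next two directions: constant within each sub-branch pair, so they can
# tell dogs from cats or cars from boats but not beagle from poodle.
mid_level_constant_within_pairs = all(
    np.allclose(eigenvectors[0::2, k], eigenvectors[1::2, k]) for k in (1, 2)
)

print(dict(zip(words, coarse_signs)))
print(mid_level_constant_within_pairs)
```

In this toy the nonzero eigenvalues come out near 2.1 (the animal-vs-object split), 0.9 twice (the splits inside each broad branch), and 0.3 four times (the splits between individual words), so the eigenvalue ranking follows the tree from root to leaves. Real embeddings are noisier, and the match to WordNet reported in the source is an empirical finding rather than something this sketch demonstrates.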

There's a frequency story underneath this that explains *why* the coarse stuff dominates. General words (hypernyms like 'animal') simply occur far more often than specific ones (hyponyms like 'beagle'), because every specific instance is also an instance of the general category (see: Does word frequency correlate with semantic abstraction?). High frequency means strong, stable co-occurrence patterns, which means those distinctions land in the dominant directions of the statistics, i.e., near the top of the emergent tree. Abstraction rides on frequency, and frequency rides on counting.

Widen the lens and this fits a broader theme in the corpus: meaning in these models is purely *relational*. LLMs reconstruct culturally situated structure by compressing the relations between words alone, with no grounding in the outside world required, which is essentially Saussure's idea of language as a system of differences, operationalized by matrix algebra (see: Can language models learn meaning without engaging the world?). Hierarchical organization is one shape that relational compression naturally takes; another is the way many semantic features collapse onto a few entangled low-dimensional axes that mirror human evaluation dimensions (see: Do LLM semantic features organize along human evaluation dimensions?).

The interesting twist worth carrying away: 'emerges from statistics' isn't a downgrade. Circuit tracing inside actual trained models finds features genuinely arranged in tiers (tokens, then abstract concepts, then operations, then outputs), with bigger models growing *richer* abstract layers rather than just memorizing more (see: How do language models organize features across processing layers?). The same co-occurrence pressure that builds the hierarchy also has a cost, though: it pushes models to compress aggressively toward broad categories, losing the fine-grained distinctions humans keep for situated use (see: Do LLMs compress concepts more aggressively than humans do?). So co-occurrence statistics give you the skeleton of conceptual organization for free, but the same drive that builds the tree is what blurs its smallest branches.


Sources (7 notes)

Where does hierarchical structure in language models come from?

LLM hierarchical representations arise as a direct mathematical consequence of corpus statistics, not from hierarchy-specific mechanisms. Spectral analysis of word co-occurrence matrices predicts and reproduces the same nested geometry found in trained embeddings and word2vec models.

Do embedding eigenvectors organize taxonomy from coarse to fine?

Leading eigenvectors of embedding Gram matrices separate broad taxonomic branches first, then progressively finer sub-branches—a coarse-to-fine spectral order that tracks the WordNet hypernym tree level by level, confirming predictions from co-occurrence statistics.

Does word frequency correlate with semantic abstraction?

WordNet analysis shows hypernyms (general concepts) occur more frequently than hyponyms (specific ones). Combined with LLMs' frequency bias, this means preferring common paraphrases systematically drifts toward abstraction, erasing expert-level specificity.

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

Do LLM semantic features organize along human evaluation dimensions?

Twenty-eight semantic axes in LLM embeddings reduce to three principal components matching human EPA structure. Intervening on one feature predictably shifts aligned features proportionally, creating unavoidable off-target effects that reflect how meaning is fundamentally organized.

How do language models organize features across processing layers?

Circuit tracing in Claude models reveals features progress from token-level inputs to abstract concepts to functional operations to outputs. Larger models develop richer abstract features, suggesting scaling enables higher-level conceptual reasoning rather than pattern memorization.

Do LLMs compress concepts more aggressively than humans do?

Using Rate-Distortion Theory on cognitive datasets, LLMs capture broad category structure but lose fine-grained distinctions humans preserve. LLMs maximize compression efficiency; humans trade compression for contextual meaning that enables situated action.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a mechanistic interpretability researcher re-testing whether co-occurrence statistics alone can account for hierarchical concept organization in modern LLMs. The question remains open: *Does the spectral structure of word co-occurrence matrices fully explain—or merely approximate—the hierarchical organization we observe in trained models?*

What a curated library found — and when (findings span 2024–2026; treat as dated claims):
• Spectral eigenvectors of embedding Gram matrices carve concept space coarse-to-fine, mirroring WordNet's hypernym tree; hierarchy emerges without dedicated mechanism (2026-05).
• Hypernyms occur far more frequently than hyponyms, pushing general distinctions into dominant statistical directions; abstraction tracks frequency (2026-05).
• LLM embeddings compress meaning purely relationally (no external grounding), operationalizing Saussurean langue via matrix algebra (2026-05).
• Circuit tracing reveals four-tier feature hierarchies (tokens → abstract concepts → operations → outputs); larger models grow richer abstract layers (2025-05).
• Models aggressively compress toward broad categories, blurring fine-grained distinctions that humans preserve for situated use (2025-05).

Anchor papers (verify; mind their dates):
• arXiv:2605.23821 (2026-05) — Hierarchical Concept Geometry in Language Models Emerges from Word Co-occurrence
• arXiv:2505.17117 (2025-05) — From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
• arXiv:2508.10003 (2025-08) — Semantic Structure in Large Language Model Embeddings
• arXiv:2507.08017 (2025-07) — Mechanistic Indicators of Understanding in Large Language Models

Your task:
(1) RE-TEST THE SPECTRAL CLAIM. Has newer work (last 6 months) shown that co-occurrence statistics *alone* fail to predict hierarchies in frontier models (e.g., multimodal, post-training RL, or instruction-tuned variants)? Where does the spectral ordering hypothesis still hold, and where does it break? Identify what mechanism—if any—now accounts for hierarchy *beyond* frequency.
(2) Surface the strongest **contradicting** work: Do recent papers argue hierarchies are *imposed* by architecture, loss, or alignment rather than *emergent* from corpus statistics? Cite concretely.
(3) Propose two follow-up questions that assume the regime may have shifted: (a) If fine-grained concept structure is being *actively suppressed* (per the compression cost), can you reverse it via training or prompting? (b) Does hierarchy remain spectrally primary in models trained on synthetic or adversarially-designed corpora?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
