SYNTHESIS NOTE

How do language models encode syntactic relations geometrically?

Do LLM embeddings use distance alone, or also direction, to represent syntax? The broader question is whether neural networks can spontaneously develop geometric structures that are compatible with symbolic representations.

Synthesis note · 2026-02-23 · sourced from Cognitive Models Latent

The symbol-vector divide has been a core challenge in cognitive science since Smolensky (1987): syntactic trees are symbolic structures that seem incompatible with the vectorial representations of neural networks. The Structural Probe (Hewitt & Manning 2019) made partial progress — it showed that the existence of syntactic links between words is encoded in the distance between their corresponding embeddings. But whether the type and direction of syntactic relations were represented remained unknown.
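To make the distance-only idea concrete, here is a minimal sketch of a Structural-Probe-style readout: embeddings are passed through a linear map, squared distances are computed in the projected space, and an undirected tree is read off as a minimum spanning tree. The toy embeddings `H`, the hand-picked map `B`, and all helper names are assumptions for illustration. In the actual probe the map is learned from parsed sentences, and nothing here reproduces the published implementation.

```python
# Toy sketch of a distance-only syntactic probe (illustrative, not the published code).

def project(B, h):
    """Apply a linear map B (given as a list of rows) to a vector h."""
    return [sum(b * x for b, x in zip(row, h)) for row in B]

def probe_distance(B, h_i, h_j):
    """Squared L2 distance between two embeddings after projecting with B."""
    diff = [a - b for a, b in zip(h_i, h_j)]
    return sum(v * v for v in project(B, diff))

def distance_matrix(B, H):
    """Pairwise probe distances for every pair of word embeddings in H."""
    return [[probe_distance(B, a, b) for b in H] for a in H]

def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric matrix; returns sorted undirected edges."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        i, j = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: dist[e[0]][e[1]],
        )
        edges.append((min(i, j), max(i, j)))
        in_tree.add(j)
    return sorted(edges)

# Hypothetical 3-D embeddings for a three-word sentence (word indices 0, 1, 2).
H = [[0, 0, 5], [1, 0, -3], [1, 1, 2]]
# Hypothetical probe: keep the first two dimensions, discard the third.
B = [[1, 0, 0], [0, 1, 0]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

probed_edges = minimum_spanning_tree(distance_matrix(B, H))
raw_edges = minimum_spanning_tree(distance_matrix(IDENTITY, H))
print("tree from probed distances:", probed_edges)
print("tree from raw distances:   ", raw_edges)
```

In this toy setup the two trees differ, which illustrates why the probe measures distance in a projected subspace instead of the raw activation space. The recovered edges are undirected and unlabeled: distance alone says which words are linked, but not how.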

The Polar Probe answers this: syntactic relations are coded by the relative direction between nearby embeddings, not just their distance. Using both distance and direction (a polar coordinate system), the Polar Probe recovers syntactic relation types and directions with nearly 2x the accuracy of the distance-only Structural Probe.
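One way to picture a readout that uses both distance and direction is sketched below. It is an illustration under stated assumptions, not the Polar Probe's actual procedure: the 2-D probe space, the prototype directions in `PROTOTYPES`, the thresholds `LINK_RADIUS` and `MIN_COSINE`, and the function `polar_readout` are all hypothetical. The length of the head-to-dependent difference vector decides whether a link exists, and its angle decides which relation type it is.

```python
import math

# Hypothetical prototype directions in a 2-D probe space, one per relation type.
PROTOTYPES = {"nsubj": (1.0, 0.0), "obj": (0.0, 1.0)}
LINK_RADIUS = 1.2  # hypothetical: longer difference vectors are read as "no link"
MIN_COSINE = 0.5   # hypothetical: minimum alignment with a prototype direction

def polar_readout(z_head, z_dep):
    """Return a relation label for head -> dependent, or None if no relation is read."""
    diff = (z_dep[0] - z_head[0], z_dep[1] - z_head[1])
    radius = math.hypot(*diff)  # distance: does a link exist?
    if radius == 0 or radius > LINK_RADIUS:
        return None

    def cosine(label):
        proto = PROTOTYPES[label]
        dot = diff[0] * proto[0] + diff[1] * proto[1]
        return dot / (radius * math.hypot(*proto))

    best = max(PROTOTYPES, key=cosine)  # direction: which relation type?
    return best if cosine(best) >= MIN_COSINE else None

# Hypothetical projected embeddings for the words of "cat chases mice".
Z = {"chases": (0.0, 0.0), "cat": (1.0, 0.0), "mice": (0.0, 1.0)}

print(polar_readout(Z["chases"], Z["cat"]))   # head -> dependent
print(polar_readout(Z["cat"], Z["chases"]))   # same pair, reversed
print(polar_readout(Z["chases"], Z["mice"]))
print(polar_readout(Z["cat"], Z["mice"]))     # too far apart to be linked
```

Swapping head and dependent flips the difference vector, so the reversed pair no longer matches any prototype. That asymmetry is what lets a direction-aware readout distinguish head from dependent, which a symmetric distance cannot do.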

Three key findings:

Complete syntactic encoding. The polar coordinate system captures existence, type, AND direction of syntactic relations — the full specification of a dependency tree is encoded in the geometry of LLM activations.
Low-dimensional subspace. This encoding exists in a low-dimensional subspace of intermediate layers across many LLMs, and becomes increasingly precise in frontier models. This is not a brute-force representation but a compressed, structured one.
Nested consistency. Similar syntactic relations are coded similarly across nested levels of syntactic trees. The encoding is not ad hoc for each syntactic instance but systematic — a genuine coordinate system.

The resolution of the symbol-vector divide is significant: LLMs don't need explicit symbolic mechanisms to represent symbolic structures. They spontaneously learn a geometry that explicitly represents the main symbolic structures of linguistic theory. This doesn't mean LLMs "understand" syntax in a human sense, but it demonstrates that connectionist architectures can natively develop symbolic-compatible representations — the two paradigms are not incompatible.

This connects to the note "Do transformer static embeddings actually encode semantic meaning?" at a different structural level: static embeddings encode semantic features, while intermediate activations encode syntactic relations. Together they suggest LLM representations are far richer and more structured than the "statistical patterns" dismissal implies.

Related concepts in this collection 5

Do transformer static embeddings actually encode semantic meaning? Explores whether the fixed word embeddings that enter transformer networks contain rich semantic information or serve only as shallow placeholders. This addresses a longstanding debate in philosophy of language about whether word meanings are stored or constructed.
semantic features in static embeddings complement syntactic features in intermediate activations
Why do neural networks fail at compositional generalization? Exploring whether the binding problem from neuroscience explains neural networks' inability to systematically generalize. The binding problem has three aspects—segregation, representation, and composition—each creating distinct failure modes in how networks handle structured information.
polar coordinate encoding is evidence against the strong version: systematic structure IS represented, even if binding problems remain at the compositional level
Can neural networks learn compositional skills without symbolic mechanisms? Do neural networks need explicit symbolic architecture to compose learned concepts, or can scaling alone enable compositional generalization? This asks whether compositionality is an architectural feature or an emergent property of scale.
convergent: symbolic-like structure emerges without explicit symbolic mechanisms
Do neural networks naturally learn modular compositional structure? Explores whether neural networks decompose compositional tasks into distinct subroutines without explicit symbolic design. This challenges the longstanding view that neural networks are fundamentally non-compositional.
related: modular structure emerges from training
Where does hierarchical structure in language models come from? Do LLMs build hierarchical concept geometry through dedicated mechanisms, or does it emerge naturally from word co-occurrence patterns in training data? Understanding the source matters for interpreting what representations actually reveal about model computation.
contrasts: this note reads symbolic-compatible geometry as spontaneously learned, but distributional theory shows such structure can be a shadow of co-occurrence statistics rather than a learned mechanism

Original note title

a polar coordinate system in llm activations encodes both type and direction of syntactic relations — resolving the symbol-vector divide
