Can reasoning happen at the sentence level instead of tokens?

Does moving from token-level to sentence-level reasoning in embedding space preserve the capability for complex reasoning while enabling language-agnostic processing? This challenges assumptions about how LLMs must operate.

Synthesis note · 2026-02-23 · sourced from Sentiment Semantics Toxic Detections

Current LLMs operate at the token level — every reasoning step is a next-token prediction. Meta's Large Concept Model (LCM) challenges this by operating at the sentence level, reasoning in an abstract embedding space (SONAR) where each "concept" corresponds to a sentence.

The architectural difference is fundamental. The LCM:

Does not see tokens — it receives and produces sentence-level embeddings
Is language-agnostic — the same reasoning process works for any language or modality because SONAR encodes meaning, not surface form
Separates reasoning from instantiation — reasoning happens once in the abstract space; decoding to a specific language happens afterward and can target any language without re-running the reasoning
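
The separation described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: `CONCEPTS`, `encode`, `predict_next`, and `decode` are toy names invented for this example and do not reflect SONAR's or the LCM's actual API. The only point is that the prediction step touches vectors, never text, so one prediction can be rendered in several languages.

```python
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]

# Toy shared concept space: one vector per concept, several surface forms.
# Hypothetical data; a real system would use a learned sentence encoder/decoder.
CONCEPTS: Dict[Vector, Dict[str, str]] = {
    (1.0, 0.0): {"en": "It is raining.", "fr": "Il pleut."},
    (0.0, 1.0): {"en": "Take an umbrella.", "fr": "Prends un parapluie."},
}


def encode(sentence: str) -> Vector:
    """Map a sentence in any known language to its language-independent vector."""
    for vec, forms in CONCEPTS.items():
        if sentence in forms.values():
            return vec
    raise KeyError(sentence)


def predict_next(concept: Vector) -> Vector:
    """Stand-in for the concept model: it sees vectors only, never tokens."""
    x, y = concept
    return (y, x)  # toy rule: swap the two axes


def decode(concept: Vector, lang: str) -> str:
    """Render a concept vector as a sentence in the requested language."""
    return CONCEPTS[concept][lang]


def continue_text(sentence: str, target_langs: List[str]) -> Dict[str, str]:
    """Predict the next concept once, then decode it into each target language."""
    next_concept = predict_next(encode(sentence))  # reasoning happens once
    return {lang: decode(next_concept, lang) for lang in target_langs}
```

Calling `continue_text("Il pleut.", ["en", "fr"])` runs `predict_next` a single time and decodes the result twice, which is the "reason once, instantiate anywhere" property in miniature.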

The hierarchical structure adds a planning layer. The LCM predicts a sequence of concepts auto-regressively until it produces a "break concept" — analogous to a paragraph break. At that point, a Large Planning Model (LPM) generates a plan to condition the LCM for the next sequence. This two-level architecture (sentence-level prediction + paragraph-level planning) is designed to produce more coherent long-form output than flat token-level generation.
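
The two-level control flow can be sketched as a nested loop. This is a rough illustration under stated assumptions, not the published implementation: strings stand in for concept embeddings, and `generate`, `next_concept`, `make_plan`, `BREAK`, and the toy `OUTLINE` are all hypothetical names introduced here.

```python
from typing import Callable, List, Optional

BREAK = "<break>"  # stand-in for the break concept (assumed representation)


def generate(
    next_concept: Callable[[Optional[str], List[str]], str],
    make_plan: Callable[[List[List[str]]], Optional[str]],
    initial_plan: Optional[str] = None,
    max_paragraphs: int = 10,
    max_concepts: int = 50,
) -> List[List[str]]:
    """Alternate sentence-level prediction with paragraph-level planning."""
    paragraphs: List[List[str]] = []
    plan = initial_plan
    while len(paragraphs) < max_paragraphs:
        current: List[str] = []
        # Inner loop: predict concepts autoregressively until a break concept.
        while len(current) < max_concepts:
            concept = next_concept(plan, current)
            if concept == BREAK:
                break
            current.append(concept)
        paragraphs.append(current)
        # Outer step: the planner conditions the next sequence, or stops.
        plan = make_plan(paragraphs)
        if plan is None:
            break
    return paragraphs


# Toy models (hypothetical): the plan is a section name, concepts are sentences.
OUTLINE = {"intro": ["s1", "s2"], "body": ["s3"]}


def toy_next_concept(plan: Optional[str], current: List[str]) -> str:
    sentences = OUTLINE[plan] if plan is not None else []
    return sentences[len(current)] if len(current) < len(sentences) else BREAK


def toy_make_plan(paragraphs: List[List[str]]) -> Optional[str]:
    order = list(OUTLINE)
    return order[len(paragraphs)] if len(paragraphs) < len(order) else None
```

With the toy models, `generate(toy_next_concept, toy_make_plan, initial_plan="intro")` yields one list of concepts per planned paragraph. The `max_paragraphs` and `max_concepts` caps are safety limits added for the sketch.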

The comparison to JEPA (LeCun, 2022) is instructive: both predict representations in embedding space rather than raw observations. But where JEPA emphasizes learning the representation space via self-supervision, LCM focuses on accurate prediction within an existing embedding space (SONAR). The embedding quality is assumed, not learned end-to-end.
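
To make "assumed, not learned end-to-end" concrete, here is a small sketch in which the embeddings are fixed constants and only a predictor is trained. It assumes a plain mean-squared-error objective and a linear predictor purely for illustration; `EMBEDDINGS`, `train_predictor`, and the 2-D toy data are hypothetical and far simpler than any real concept model.

```python
from typing import List, Tuple

Vec = List[float]
Matrix = List[List[float]]

# Frozen "sentence embeddings" (toy data). They are never updated below.
EMBEDDINGS: List[Vec] = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
# Training pairs: (embedding of sentence t, embedding of sentence t+1).
PAIRS: List[Tuple[Vec, Vec]] = list(zip(EMBEDDINGS, EMBEDDINGS[1:] + EMBEDDINGS[:1]))


def matvec(W: Matrix, x: Vec) -> Vec:
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mse(W: Matrix, pairs: List[Tuple[Vec, Vec]]) -> float:
    """Mean squared error between predicted and actual next embeddings."""
    total = 0.0
    for x, y in pairs:
        total += sum((p - t) ** 2 for p, t in zip(matvec(W, x), y))
    return total / len(pairs)


def train_predictor(pairs: List[Tuple[Vec, Vec]], steps: int = 50, lr: float = 0.5) -> Matrix:
    """Fit a linear next-embedding predictor by gradient descent on the MSE."""
    dim = len(pairs[0][0])
    W = [[0.0] * dim for _ in range(dim)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(dim)]
        for x, y in pairs:
            err = [p - t for p, t in zip(matvec(W, x), y)]
            for i in range(dim):
                for j in range(dim):
                    grad[i][j] += 2.0 * err[i] * x[j] / len(pairs)
        for i in range(dim):
            for j in range(dim):
                W[i][j] -= lr * grad[i][j]  # only the predictor's weights change
    return W
```

A JEPA-style setup would also push gradients into the encoder that produces the embeddings; in this sketch there is no encoder parameter to update, which mirrors the design choice the paragraph describes.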

This connects to the latent reasoning thread through a different mechanism. The approach covered in "Can models reason without generating visible thinking tokens?" achieves reasoning without tokens via recurrent depth in continuous space; LCM achieves it via sentence-level embeddings. Both challenge the assumption that verbalized token-by-token generation is necessary for reasoning, but from different angles: depth-recurrent models reason within a single token's representation, while LCM reasons between sentence-level units.

The practical implication: if reasoning can happen at the concept level rather than the token level, then the verbalized chain-of-thought paradigm is not the only path to sophisticated reasoning. The question is whether sentence-level granularity captures enough structure for complex reasoning tasks, or whether some tasks require finer-grained (sub-sentence) reasoning steps.

Related concepts in this collection 3

Can models reason without generating visible thinking tokens? Explores whether intermediate reasoning must be verbalized as text tokens, or if models can think in hidden continuous space. Challenges a foundational assumption about how language models scale their reasoning capabilities.
alternative latent reasoning via recurrent depth; LCM is a third approach (sentence-level, not token-level or depth-recurrent)
Can models reason without generating visible thinking steps? Do machine reasoning systems actually require verbalized chains of thought, or can they solve complex problems through hidden computation? This challenges how we measure and understand reasoning.
the shared question: does reasoning require verbalization? LCM says no, operating at sentence granularity
Do embedding dimensions fundamentally limit retrievable document combinations? Can single-vector embeddings represent any top-k document subset a user might need? Research using communication complexity theory suggests there are hard geometric limits independent of training data or model architecture.
LCM relies entirely on embedding quality (SONAR); the mathematical limits of embeddings constrain what LCM can represent