Can reasoning happen at the sentence level instead of the token level?
Does moving from token-level to sentence-level reasoning in embedding space preserve the capability for complex reasoning while enabling language-agnostic processing? This challenges assumptions about how LLMs must operate.
Current LLMs operate at the token level — every reasoning step is a next-token prediction. Meta's Large Concept Model (LCM) challenges this by operating at the sentence level, reasoning in an abstract embedding space (SONAR) where each "concept" corresponds to a sentence.
The architectural difference is fundamental. The LCM:
- Does not see tokens — it receives and produces sentence-level embeddings
- Is language-agnostic — the same reasoning process works for any language or modality because SONAR encodes meaning, not surface form
- Separates reasoning from instantiation — reasoning happens once in the abstract space; decoding to a specific language happens afterward and can target any language without re-running the reasoning
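The separation described in the list above can be sketched as a small pipeline. This is a minimal illustration, not the LCM implementation: `toy_encode`, `toy_reason`, and `make_toy_decoder` are hypothetical stand-ins for a trained sentence encoder, the concept model, and per-language decoders, and the toy encoder is not actually language-agnostic. The point is only the data flow: the next concept is predicted once, then decoded into as many target languages as needed.

```python
from typing import Callable, Dict, List

Vector = List[float]

def toy_encode(sentence: str) -> Vector:
    """Hypothetical encoder: map a sentence to a fixed-size vector.

    A real system would use a trained sentence encoder (SONAR in the LCM setting).
    """
    return [float(len(sentence)), float(sum(map(ord, sentence)) % 97)]

def toy_reason(context: List[Vector]) -> Vector:
    """Hypothetical concept model: predict the next concept from earlier ones.

    Toy rule: element-wise mean of the context vectors.
    """
    n = len(context)
    return [sum(v[i] for v in context) / n for i in range(len(context[0]))]

def make_toy_decoder(lang: str) -> Callable[[Vector], str]:
    """Hypothetical decoder factory: one decoder per target language."""
    def decode(concept: Vector) -> str:
        return f"[{lang}] concept({concept[0]:.1f}, {concept[1]:.1f})"
    return decode

DECODERS: Dict[str, Callable[[Vector], str]] = {
    lang: make_toy_decoder(lang) for lang in ("en", "fr", "de")
}

def generate(sentences: List[str], target_langs: List[str]) -> Dict[str, str]:
    # Encode every input sentence into the shared concept space.
    context = [toy_encode(s) for s in sentences]
    # Reason once, entirely in embedding space; no tokens are involved here.
    next_concept = toy_reason(context)
    # Decode the same predicted concept into each requested language.
    return {lang: DECODERS[lang](next_concept) for lang in target_langs}

print(generate(["a", "b"], ["en", "fr"]))
```

Adding a target language in this sketch means adding a decoder; `toy_reason` is not called again.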
The hierarchical structure adds a planning layer. The LCM predicts a sequence of concepts auto-regressively until it produces a "break concept" — analogous to a paragraph break. At that point, a Large Planning Model (LPM) generates a plan to condition the LCM for the next sequence. This two-level architecture (sentence-level prediction + paragraph-level planning) is designed to produce more coherent long-form output than flat token-level generation.
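The two-level control flow can be sketched as a nested loop. Everything here is an illustrative assumption rather than Meta's implementation: the `BREAK` sentinel, the function names, the exact-equality break test (a real system would more plausibly compare embeddings with a similarity threshold), and the choice to start the first paragraph with no plan.

```python
from typing import Callable, List, Optional

Vector = List[float]
BREAK: Vector = [0.0, 0.0]  # hypothetical sentinel standing in for the "break concept"

def generate_document(
    predict_next: Callable[[List[Vector], Optional[Vector]], Vector],
    make_plan: Callable[[List[Vector]], Vector],
    prompt: List[Vector],
    max_paragraphs: int = 3,
    max_concepts_per_paragraph: int = 16,
) -> List[List[Vector]]:
    history = list(prompt)
    paragraphs: List[List[Vector]] = []
    plan: Optional[Vector] = None  # assumption: the first paragraph is unplanned
    for _ in range(max_paragraphs):
        paragraph: List[Vector] = []
        # Inner loop: sentence-level auto-regressive prediction (the LCM role).
        for _ in range(max_concepts_per_paragraph):
            concept = predict_next(history, plan)
            history.append(concept)
            if concept == BREAK:
                break
            paragraph.append(concept)
        paragraphs.append(paragraph)
        # Outer step: paragraph-level planning (the LPM role) conditions the next run.
        plan = make_plan(history)
    return paragraphs

def toy_predict_next(history: List[Vector], plan: Optional[Vector]) -> Vector:
    # Toy rule: emit a break whenever the history length is a multiple of 3.
    if len(history) % 3 == 0:
        return BREAK
    offset = 0.0 if plan is None else plan[0]
    return [float(len(history)), offset]

def toy_make_plan(history: List[Vector]) -> Vector:
    # Toy plan: a single number summarizing how much has been generated so far.
    return [float(len(history))]

paragraphs = generate_document(toy_predict_next, toy_make_plan, prompt=[[9.0, 9.0]])
print([len(p) for p in paragraphs])
```

The `max_concepts_per_paragraph` cap is a safety choice for the sketch, so a predictor that never emits a break still terminates.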
The comparison to JEPA (LeCun, 2022) is instructive: both predict representations in embedding space rather than raw observations. But where JEPA emphasizes learning the representation space via self-supervision, LCM focuses on accurate prediction within an existing embedding space (SONAR). The embedding quality is assumed, not learned end-to-end.
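The contrast can be written as two schematic objectives. Both are simplified sketches under stated assumptions: the squared-error loss for the LCM is an illustrative choice (the note does not specify the training loss), and the JEPA form omits latent variables and the regularizers used to prevent representation collapse. The symbols $E$, $f_\theta$, $g_\theta$, $E_\phi$, and $D$ are introduced here for illustration only.

```latex
% LCM-style: sentences s_i are mapped by a frozen encoder E; only the predictor f_theta is trained.
\[
\mathcal{L}_{\text{LCM}}(\theta)
  = \mathbb{E}\left[\, \lVert f_\theta(x_1, \dots, x_{n-1}) - x_n \rVert_2^2 \,\right],
  \qquad x_i = E(s_i), \quad E \text{ frozen}
\]

% JEPA-style: the encoder E_phi is trained together with the predictor g_theta.
\[
\mathcal{L}_{\text{JEPA}}(\theta, \phi)
  = \mathbb{E}\left[\, D\big(g_\theta(E_\phi(x)),\, E_\phi(y)\big) \,\right],
  \qquad E_\phi \text{ learned}
\]
```

In the first objective the gradient never reaches the encoder, which is the sense in which embedding quality is assumed rather than learned end-to-end.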
This connects to the latent reasoning thread through a different mechanism. The approach covered in "Can models reason without generating visible thinking tokens?" achieves reasoning without tokens via recurrent depth in continuous space, whereas LCM achieves it via sentence-level embeddings. Both challenge the assumption that verbalized token-by-token generation is necessary for reasoning, but from different angles: depth-recurrent models reason within a single token's representation; LCM reasons across sentence-level units.
The practical implication: if reasoning can happen at the concept level rather than the token level, then the verbalized chain-of-thought paradigm is not the only path to sophisticated reasoning. The question is whether sentence-level granularity captures enough structure for complex reasoning tasks, or whether some tasks require finer-grained (sub-sentence) reasoning steps.
Inquiring lines that use this note as a source (48)
This note is a source for these synthesized inquiries.
- Does sentence-level granularity capture enough structure for complex reasoning tasks?
- How do embedding dimension limits constrain what concept models can represent?
- What is the relationship between reasoning depth and verbalization requirements?
- How does SONAR embedding quality affect downstream reasoning accuracy?
- Do modern architectures in NLP and vision rely on dot products intentionally?
- How do soft thought tokens differ from decoded assistant outputs?
- How does policy entropy collapse constrain token-level distribution in reasoning?
- How should meaning spaces be systematically modeled across different applications?
- Does the DeepSeek R1 single token insertion represent genuine reasoning?
- Why does semantic decoupling specifically break LLM reasoning abilities?
- Can latent reasoning in continuous space scale beyond supervised reasoning tasks?
- How do lower network layers compress facts versus higher reasoning layers?
- How do hidden embeddings preserve more information than discrete tokens?
- Do latent sequence vectors outperform per-token latent iterative computation for reasoning?
- What internal mechanisms explain LLM reasoning and representation limits?
- Why does LLM compression eliminate causal grounding in conceptual representations?
- Can models hide their reasoning in continuous space rather than natural language?
- How does in-context semantic reasoning differ from symbolic reasoning in concept fusion?
- Do reflection tokens and symbolic tokens serve different roles in reasoning?
- Can latent space represent reasoning dimensions that text cannot?
- How do recursive language models rethink where to store reasoning?
- Why does hierarchical formal language training improve token efficiency more than natural language?
- Can latent reasoning achieve the same substitution without tokens?
- Why does removing semantic content collapse reasoning in language models?
- Can capability boundary collapse be addressed by operating at representational rather than token level?
- Can language models perform genuine symbolic reasoning without semantic grounding?
- How does tokenization change what gets counted as valuable knowledge?
- What semantic information is lost if analysis skips the token embedding layer?
- Are static embeddings analogous to the formal linguistic competence layer?
- How do static embeddings and contextualized representations divide semantic labor?
- Can instance-adaptive reasoning happen without sequential token dependencies?
- Why does concise reasoning maintain accuracy with far fewer tokens?
- Does structured decomposition improve LLM reasoning in other compound tasks?
- How do reasoning-invariant tokens dilute learning signals in uniform averaging?
- Why do unit-sphere spaces fail at distinguishing word order and negation?
- Does reasoning happen in hidden space or in generated tokens?
- What semantic information is necessary to preserve for sound LLM reasoning?
- Can reasoning happen in latent space without chain of thought?
- What evidence shows that reasoning chains encode token-level functional structure?
- What computational structures can actually scale serial reasoning depth?
- How does token-level interaction like ColBERT overcome commutativity constraints?
- How do semantic and symbolic reasoning capabilities differ in language models?
- Why is latent-level prediction more sample-efficient than token-level prediction?
- Do discrete tokenized modalities preserve information better than continuous embeddings?
- Can articulating latent reasoning processes improve transfer across domains?
- Why does latent-level prediction beat token-level prediction for reasoning?
- How do latents at the same hierarchy level become more correlated than tokens?
- Why does token ordering in LLMs create sequences rather than true temporal flow?
Related concepts in this collection (3)
- Can models reason without generating visible thinking tokens?
  Explores whether intermediate reasoning must be verbalized as text tokens, or if models can think in hidden continuous space. Challenges a foundational assumption about how language models scale their reasoning capabilities.
  alternative latent reasoning via recurrent depth; LCM is a third approach (sentence-level, not token-level or depth-recurrent)
- Can models reason without generating visible thinking steps?
  Do machine reasoning systems actually require verbalized chains of thought, or can they solve complex problems through hidden computation? This challenges how we measure and understand reasoning.
  the shared question: does reasoning require verbalization? LCM says no, operating at sentence granularity
- Do embedding dimensions fundamentally limit retrievable document combinations?
  Can single-vector embeddings represent any top-k document subset a user might need? Research using communication complexity theory suggests there are hard geometric limits independent of training data or model architecture.
  LCM relies entirely on embedding quality (SONAR); the mathematical limits of embeddings constrain what LCM can represent
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Large Concept Models: Language Modeling in a Sentence Representation Space
- Training Large Language Models to Reason in a Continuous Latent Space
- Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space
- Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs
- DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning
- Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners
- Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models -- A Survey
- Reasoning Models Are More Easily Gaslighted Than You Think
Original note title
Large Concept Models enable sentence-level reasoning in a language-agnostic embedding space — hierarchical abstraction beyond token-level processing