INQUIRING LINE

What computational role do intermediate tokens actually play in transformers?


This explores whether the intermediate tokens a transformer writes out — its reasoning chains, filler, scratch work — are the actual site of computation, or just a surface that the real work hides behind. The corpus points to an unsettling answer: the tokens you can read are often not where the thinking lives, and not all of them are pulling weight.

The most direct evidence that visible tokens can be decoupled from reasoning comes from work on models trained with hidden chain-of-thought tokens: logit-lens analysis shows them computing the correct answer in their earliest layers (layers 1-3) and then actively suppressing it in the final layers to emit format-compliant filler (see: Do transformers hide reasoning before producing filler tokens?). The reasoning is real but buried, recoverable from lower-ranked token predictions, while the token actually printed is a kind of cover story. This dovetails with the finding that a model's reasoning can be scaled entirely in latent space, iterating on hidden states without ever verbalizing a step (see: Can models reason without generating visible thinking tokens?). Put together, they suggest verbalization is a training artifact, not a computational requirement: the tokens are downstream of the work, not the work itself.
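The logit-lens idea can be illustrated with a toy sketch: project each layer's hidden state through the unembedding matrix and check where the answer token ranks. Everything here is hypothetical (the three-token vocabulary, the matrix, the hidden states, and the helper names), and a real logit lens would normally apply the model's final normalization before unembedding, which this sketch omits.

```python
# Toy logit-lens sketch. All numbers are invented for illustration;
# no real model is involved and final layer normalization is omitted.

def matvec(matrix, vector):
    """Multiply a (vocab x dim) matrix by a dim-length vector."""
    return [sum(w * h for w, h in zip(row, vector)) for row in matrix]

def token_rank(logits, token_id):
    """Rank of token_id among the logits (0 means top prediction)."""
    target = logits[token_id]
    return sum(1 for value in logits if value > target)

def logit_lens(hidden_by_layer, unembed, token_id):
    """Rank of token_id after projecting each layer's hidden state."""
    return [token_rank(matvec(unembed, h), token_id) for h in hidden_by_layer]

# Hypothetical vocabulary: 0 = filler token, 1 = answer token, 2 = other.
UNEMBED = [
    [1.0, 0.0],    # filler direction
    [0.0, 1.0],    # answer direction
    [-1.0, -1.0],  # other
]

# Hypothetical hidden states: early layers point toward the answer,
# the final layer is pushed toward the filler direction.
HIDDEN = [
    [0.1, 2.0],  # layer 1
    [0.5, 2.0],  # layer 2
    [3.0, 1.5],  # layer 3 (final)
]

ANSWER_ID = 1
ranks = logit_lens(HIDDEN, UNEMBED, ANSWER_ID)
print(ranks)  # rank of the answer token at each layer
```

In this toy setup the answer token is the top prediction in the early layers and drops to second place in the final layer, which is the shape of the pattern described above: overwritten at the output, yet still recoverable from the lower-ranked predictions.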

Then there's the question of whether the tokens even need to *mean* anything. Models trained on deliberately corrupted, irrelevant reasoning traces keep their solution accuracy, and sometimes generalize better out of distribution, compared with models trained on correct ones (see: Do reasoning traces need to be semantically correct?). This implies the trace functions as computational scaffolding (a place to spend compute, hold state, extend the forward pass) rather than as a chain of meaningful inferences. But 'scaffolding' isn't uniform. When you prune reasoning chains by functional importance, distinct categories emerge: symbolic-computation tokens get preferentially preserved while grammar and meta-discourse get dropped first (see: Which tokens in reasoning chains actually matter most?). And in reinforcement learning with verifiable rewards (RLVR), only about 20% of tokens, the high-entropy 'forking points' where the model faces a real branch, carry the learning signal; training on just those matches or exceeds full updates (see: Do high-entropy tokens drive reasoning model improvements?). So a minority of tokens are genuine decision pivots, and the rest are connective tissue.
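The 'forking point' selection can be sketched as an entropy mask over token positions. This is a minimal illustration, not the method from the cited work: the distributions and per-token losses are invented, the function names are hypothetical, and picking the top 20% by rank within one sequence is an assumption about how the threshold is applied.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def forking_mask(distributions, keep_fraction=0.2):
    """Flag the top keep_fraction of positions, ranked by entropy."""
    entropies = [entropy(p) for p in distributions]
    keep = max(1, round(keep_fraction * len(entropies)))
    order = sorted(range(len(entropies)),
                   key=lambda i: entropies[i], reverse=True)
    chosen = set(order[:keep])
    return [i in chosen for i in range(len(entropies))]

def masked_mean_loss(per_token_loss, mask):
    """Average the loss over flagged positions only."""
    selected = [loss for loss, keep in zip(per_token_loss, mask) if keep]
    return sum(selected) / len(selected)

# Hypothetical next-token distributions at five positions (3-token vocab).
DISTS = [
    [0.98, 0.01, 0.01],  # near-certain continuation
    [0.34, 0.33, 0.33],  # genuine branch: high entropy
    [0.90, 0.05, 0.05],
    [0.97, 0.02, 0.01],
    [0.80, 0.10, 0.10],
]
LOSSES = [0.1, 2.0, 0.3, 0.1, 0.5]  # invented per-token losses

mask = forking_mask(DISTS)
print(mask)
print(masked_mean_loss(LOSSES, mask))
```

With five positions and a 20% budget, only the near-uniform position is flagged, so the update signal in this toy case comes from that single token while the low-entropy connective tokens are ignored.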

Laterally, this connects to a deeper view of what tokens are doing at all. One framing treats the transformer's activations as continuous *flow* rather than storage: knowledge exists only in the act of generation, more like oral performance than retrieval from an archive (see: Do transformer models store knowledge or generate it continuously?). Under that lens, intermediate tokens are checkpoints in an ongoing computation, not records of stored thought. And at the theoretical ceiling, a single finite-size transformer can in principle compute any computable function given the right prompt, with intermediate tokens acting as the program's working tape (see: Can a single transformer become universally programmable through prompts?), though standard training rarely produces models that actually use them that way.

The takeaway: the model's visible reasoning is closer to a *workspace* than a *transcript*. Some tokens are load-bearing decision points, many are scaffolding that need not be true, and the answer itself may already exist in the hidden layers before a single 'reasoning' word is written. The chain-of-thought you read is partly genuine computation, partly theater the model performs because we trained it to.


Sources (7 notes)

Do transformers hide reasoning before producing filler tokens?

Logit lens analysis shows models trained with hidden CoT tokens compute correct answers in layers 1-3, then actively suppress these representations in final layers to produce format-compliant filler output. The reasoning is fully recoverable from lower-ranked token predictions.

Can models reason without generating visible thinking tokens?

Multiple architectures—depth-recurrent models, Heima, and Coconut—demonstrate that test-time compute scales through hidden state iteration rather than token generation. This suggests verbalization is a training artifact, not a reasoning requirement.

Do reasoning traces need to be semantically correct?

Models trained on systematically irrelevant traces maintain solution accuracy and sometimes improve out-of-distribution generalization, suggesting traces function as computational scaffolding rather than meaningful reasoning steps.

Which tokens in reasoning chains actually matter most?

Greedy likelihood-preserving pruning reveals six functional token categories; symbolic computation tokens are preferentially preserved while grammar and meta-discourse are pruned first. Student models trained on these pruned chains outperform those trained on frontier-model compression.

Do high-entropy tokens drive reasoning model improvements?

Only ~20% of tokens exhibit high entropy as pivotal reasoning decision points; RLVR primarily adjusts these forking tokens. Training exclusively on them matches or exceeds full-gradient performance, revealing that the minority carries the learning signal.

Do transformer models store knowledge or generate it continuously?

Transformers organize knowledge as flowing activations rather than retrievable archives, mirroring oral cultures where knowledge exists only in performance. This explains why model knowledge is contextual, difficult to edit, and inseparable from generation.

Can a single transformer become universally programmable through prompts?

Research proves a single finite-size transformer exists that can compute any computable function given the right prompt, achieving complexity bounds nearly matching unbounded models. However, standard training rarely produces models that learn to implement arbitrary programs this way.
