SYNTHESIS NOTE
Training, RL, and Test-Time Scaling · Reasoning, Retrieval, and Evaluation · Model Architecture and Internals

Do models fail worse when their own errors fill the context?

As a model's prior mistakes accumulate in context, does subsequent accuracy degrade predictably? And can scaling or architectural changes prevent this self-contamination effect?

Synthesis note · 2026-02-22 · sourced from Reasoning Critiques

A model executing a long-horizon task makes errors. Those errors remain in the context. The model then predicts the next token conditioned on a history that contains its own mistakes. Error probability increases. More errors accumulate. Performance degrades faster than a constant per-step error rate would predict.
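The compounding loop above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not a model fitted to any data: it assumes the per-step error probability is a base rate plus a hypothetical `gain` times the fraction of errors already in the history, and it tracks expected values only (a mean-field approximation). The function names and parameter values are illustrative.

```python
import math


def step_accuracies(n_steps, base_error=0.02, gain=0.0):
    """Expected per-step accuracy under a toy self-conditioning model.

    Assumption (illustrative only): the error probability at each step is
    base_error + gain * (expected fraction of errors in the history so far).
    gain=0.0 reproduces a constant per-step error rate.
    """
    expected_errors = 0.0
    accuracies = []
    for step in range(n_steps):
        # Fraction of earlier steps expected to be wrong (0 for an empty history).
        error_fraction = expected_errors / step if step > 0 else 0.0
        p_error = min(1.0, base_error + gain * error_fraction)
        accuracies.append(1.0 - p_error)
        expected_errors += p_error
    return accuracies


def completion_probability(accuracies):
    """Probability that every step is correct, treating steps as independent."""
    return math.prod(accuracies)


constant = step_accuracies(50, gain=0.0)      # no self-conditioning
conditioned = step_accuracies(50, gain=0.5)   # errors in context raise later error rates

print("error-free run, constant rate:    ", completion_probability(constant))
print("error-free run, self-conditioning:", completion_probability(conditioned))
```

With `gain=0.0` the per-step accuracy stays flat, so the chance of an error-free run is just the base accuracy raised to the number of steps. With a positive `gain`, every step after the first is less accurate than the baseline, so the run-level success probability falls below what the constant-rate estimate predicts.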

This self-conditioning effect is empirically verified by controlling the error rate in the history shown to the model. As the error rate in prior context increases, subsequent step accuracy drops sharply. The mechanism is straightforward: models are trained to predict the most likely next token given context; when the context contains errors, those errors become part of the distribution being continued.
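The controlled-history setup can be sketched as follows. The note does not specify the task or prompt format, so everything here is an assumption made for illustration: a hypothetical running-sum task, a hypothetical `build_history` helper that corrupts a chosen fraction of the recorded totals, and a hypothetical `render_prompt` helper that formats the history for a model. Corrupted totals are isolated deviations from the true running total; the model call itself is left out.

```python
import random


def build_history(increments, error_rate, seed=0):
    """Build a running-sum history in which a fixed fraction of totals is wrong.

    Hypothetical task and format: each step adds an increment to a running
    total, and error_rate controls how many recorded totals are corrupted.
    """
    rng = random.Random(seed)
    n = len(increments)
    n_errors = round(error_rate * n)
    corrupted = set(rng.sample(range(n), n_errors))

    history = []
    true_total = 0
    for i, inc in enumerate(increments):
        true_total += inc
        shown = true_total
        if i in corrupted:
            # Nonzero offset, so a corrupted total never equals the true one.
            shown += rng.choice([-3, -2, -1, 1, 2, 3])
        history.append({
            "step": i + 1,
            "increment": inc,
            "true_total": true_total,
            "shown_total": shown,
            "is_error": i in corrupted,
        })
    return history


def render_prompt(history, next_increment):
    """Format the (possibly corrupted) history plus one open step for the model."""
    lines = [
        f"Step {h['step']}: add {h['increment']} -> total {h['shown_total']}"
        for h in history
    ]
    lines.append(f"Step {len(history) + 1}: add {next_increment} -> total")
    return "\n".join(lines)


history = build_history([1] * 8, error_rate=0.25)
print(render_prompt(history, next_increment=1))
```

Sweeping `error_rate` from 0 upward while holding the task fixed, then scoring the model's answer to the open step, would give the accuracy-versus-contamination curve described above.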

Unlike humans, who typically improve at a task with repetition, LLMs become less reliable as their context fills with their own mistakes. Practice does not help; contamination hurts.

Three practical implications:

  1. Model scaling does not fix this — larger models self-condition just as much as smaller ones. The problem is not capability but the conditional prediction objective itself.

  2. Long-horizon failure attribution matters — what looks like a reasoning or planning failure in long tasks is often an execution failure caused by error accumulation. The model had the capability; its own prior outputs degraded it. The DELEGATE-52 evidence — see Do frontier LLMs silently corrupt documents in long workflows? — is this mechanism at the workflow scale: a 50-round-trip relay is a maximally adversarial setup for self-conditioning, and the corruption curve decelerates but never plateaus, exactly the pattern this note predicts.

  3. Thinking models fix self-conditioning — thinking models (like R1) are not affected by prior mistakes in the same way; sequential test-time compute greatly improves the length of task a model can complete (DeepSeek-V3 fails at 2 steps; R1 executes 200). The thinking process appears to insulate reasoning from error-contaminated context.

This is distinct from the effect described in Does self-revision actually improve reasoning in language models? Self-revision is a model's deliberate re-examination of its own reasoning, which can itself introduce errors. Self-conditioning is a passive contamination mechanism: no deliberate revision is required, only the accumulation of prior errors in context.

Original note title

self-conditioning effect — prior errors in context history amplify future error rates in long-horizon tasks