Why does explicit theory injection work better than example-based learning for reasoning tasks?

This explores why handing a model explicit principles or abstractions might beat showing it worked examples on reasoning tasks — and the corpus suggests the real story is about what each method actually transfers, not which is simply 'better.'

This reads the question as: when teaching a model to reason, why would stating the underlying theory outperform letting it generalize from examples? The corpus points to one core reason — examples tend to teach the *form* of reasoning, while explicit theory teaches the *procedure* — but it also complicates the premise in interesting ways.

The strongest case for explicit injection comes from how reasoning generalizes. An analysis of 5 million pretraining documents found that reasoning ability rides on broad, transferable *procedural* knowledge, while factual recall depends on narrow memorization of specific documents (see "Does procedural knowledge drive reasoning more than factual retrieval?"). Examples are closer to the memorization end: models learn to imitate the shape of a solution rather than the rule that generates it. That is what chain-of-thought turns out to be: a constrained reproduction of familiar reasoning schemata that degrades predictably once the distribution shifts (see "Does chain-of-thought reasoning reveal genuine inference or pattern matching?"). A theory or abstraction, by contrast, gives the model something it can carry across problems instead of pattern-matching back to seen cases.
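The contrast between the two teaching styles can be made concrete with a minimal sketch of the prompts each one produces. Everything here is an illustrative assumption rather than something taken from the cited notes: the function names `build_example_prompt` and `build_theory_prompt`, the modular-arithmetic task, and the prompt wording are all hypothetical.

```python
from typing import List, Tuple


def build_example_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Few-shot prompt: shows worked examples and leaves the rule implicit."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"


def build_theory_prompt(principle: str, question: str) -> str:
    """Theory prompt: states the generating rule explicitly before the question."""
    return (
        f"Principle: {principle}\n\n"
        "Apply the principle step by step.\n\n"
        f"Q: {question}\nA:"
    )


# Hypothetical task data, chosen only for illustration.
examples = [("What is 17 mod 5?", "2"), ("What is 23 mod 7?", "2")]
principle = (
    "For non-negative a, a mod n is the remainder left after subtracting "
    "the largest multiple of n that does not exceed a."
)
question = "What is 100 mod 9?"

example_prompt = build_example_prompt(examples, question)
theory_prompt = build_theory_prompt(principle, question)
```

The example prompt carries only surface instances, so the model has to infer the rule; the theory prompt hands over the rule itself. Which one performs better on a given model is an empirical question this sketch does not answer.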

There's also a robustness-and-interpretability argument. Systems that learn purely tacitly from data inherit statistical biases that no normative rule corrects, produce uninterpretable representations, and fail to generalize beyond their training distribution. The same work reports that injecting structured knowledge substantially improves performance at minimal corpus cost (see "Does refusing explicit knowledge harm AI system performance?"). Explicit theory acts like guardrails the examples can't supply on their own. Related work on abstraction-guided reasoning (RLAD) shows why: explicit abstractions force *breadth-first* exploration of strategies, preventing the 'underthinking' collapse where a model commits early to one shallow path (see "Can abstractions guide exploration better than depth alone?"). Examples nudge toward the nearest familiar path; theory opens up the option space.
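The breadth-versus-depth idea can be illustrated with a toy budget allocator. This is not the RLAD method, only a sketch of the allocation intuition under stated assumptions: the function names, the even-split policy, and the strategy strings are hypothetical, and abstraction names are assumed to be unique.

```python
from typing import Dict, List


def breadth_first_budget(abstractions: List[str], total_samples: int) -> Dict[str, int]:
    """Spread a fixed sampling budget as evenly as possible across abstractions."""
    if not abstractions:
        raise ValueError("at least one abstraction is required")
    if total_samples < 0:
        raise ValueError("total_samples must be non-negative")
    base, extra = divmod(total_samples, len(abstractions))
    # The first `extra` abstractions each receive one leftover sample.
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(abstractions)}


def depth_only_budget(abstractions: List[str], total_samples: int) -> Dict[str, int]:
    """Baseline: commit the whole budget to the first strategy considered."""
    if not abstractions:
        raise ValueError("at least one abstraction is required")
    first, *rest = abstractions
    return {first: total_samples, **{name: 0 for name in rest}}


# Hypothetical strategy descriptions, for illustration only.
strategies = [
    "work backwards from the goal",
    "look for an invariant",
    "reduce to a known problem",
]
breadth = breadth_first_budget(strategies, 8)
depth = depth_only_budget(strategies, 8)
```

With the same budget of eight samples, the breadth-first split gives every strategy a chance, while the depth-only baseline never tests the second or third idea, which is the 'underthinking' risk described above.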

But here's the twist the corpus insists on: neither method is really *creating* reasoning. Five independent techniques (RL steering, critique fine-tuning, decoding changes, SAE feature steering, and RLVR) all turn out to merely *elicit* reasoning already latent in base-model activations; post-training selects capability rather than installs it (see "Do base models already contain hidden reasoning ability?"). This reframes the whole question. It also explains a genuinely strange result: models trained on *systematically irrelevant* reasoning traces maintain solution accuracy, and sometimes generalize better out of distribution, suggesting traces function as computational scaffolding, not as meaningful content the model absorbs (see "Do reasoning traces need to be semantically correct?"). If examples were teaching reasoning by their semantic content, irrelevant traces wouldn't work, yet they do. So part of why explicit theory 'wins' may be that it activates and organizes existing latent capability more reliably than examples, which can devolve into pure scaffolding.

Two caveats keep this honest. First, prompting or example-feeding can only *reorganize* knowledge already in the model. It cannot inject anything genuinely absent, which is a hard ceiling on any in-context approach and an argument for actual training-time injection when the knowledge is missing (see "Can prompt optimization teach models knowledge they lack?" and "How do knowledge injection methods trade off flexibility and cost?"). Second, the advantage is task-shaped, not universal: explicit step-wise reasoning helps tasks with logical-derivation structure (math, code) but actively *degrades* tasks requiring holistic continuous judgment like reranking (see "When does explicit reasoning actually help model performance?"). And because these models reason through semantic association rather than symbolic logic, even a correct explicit rule can fail to bite when it's decoupled from familiar meaning (see "Do large language models reason symbolically or semantically?"). So 'theory beats examples' holds best where the task has derivable structure and where the theory is phrased in semantics the model already understands.
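The second caveat suggests deploying explicit reasoning selectively. A minimal sketch of such a gate follows, assuming the caller already knows a coarse task label. The label sets, the function name `use_explicit_reasoning`, and the conservative default for unknown tasks are all assumptions made for illustration, not part of the cited work.

```python
# Task labels are hypothetical; the grouping follows the math/code versus
# reranking/holistic-assessment split described in the text.
STEPWISE_TASKS = {"math", "code"}
HOLISTIC_TASKS = {"reranking", "holistic_assessment"}


def use_explicit_reasoning(task_type: str) -> bool:
    """Return True when step-wise reasoning is expected to help this task type."""
    task = task_type.strip().lower()
    if task in STEPWISE_TASKS:
        return True
    if task in HOLISTIC_TASKS:
        return False
    # Assumption: for unknown task types, skip explicit reasoning to save tokens.
    return False


decisions = {t: use_explicit_reasoning(t) for t in ["Math", "code", "reranking", "summarization"]}
```

The default for unlisted tasks is a design choice; a deployment that cares more about accuracy than token cost might flip it to `True`.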

Sources (10 notes)

Does procedural knowledge drive reasoning more than factual retrieval?

Analysis of 5 million pretraining documents shows reasoning relies on broad, transferable procedural knowledge from diverse sources, unlike factual recall which depends on narrow, document-specific memorization of target facts.

Does chain-of-thought reasoning reveal genuine inference or pattern matching?

CoT works by constraining models to reproduce familiar reasoning patterns from training, not by enabling novel symbolic reasoning. Performance degrades predictably under distribution shifts—the signature of imitation rather than capability emergence.

Does refusing explicit knowledge harm AI system performance?

AI systems that learn exclusively from data produce uninterpretable representations, inherit statistical biases uncorrected by normative rules, and fail to generalize beyond training distributions. Structured knowledge injection at minimal corpus cost substantially improves performance.

Can abstractions guide exploration better than depth alone?

RLAD jointly trains abstraction and solution generators, showing that allocating test-time compute to diverse abstractions outperforms parallel solution sampling at large budgets. Abstractions create structured breadth-first exploration that prevents the underthinking failure mode of depth-only reasoning chains.

Do base models already contain hidden reasoning ability?

Five independent mechanisms—RL steering, critique fine-tuning, decoding changes, SAE feature steering, and RLVR—all elicit reasoning already present in base model activations. Post-training selects rather than creates reasoning; the bottleneck is elicitation, not capability acquisition.

Do reasoning traces need to be semantically correct?

Models trained on systematically irrelevant traces maintain solution accuracy and sometimes improve out-of-distribution generalization, suggesting traces function as computational scaffolding rather than meaningful reasoning steps.

Can prompt optimization teach models knowledge they lack?

Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.

How do knowledge injection methods trade off flexibility and cost?

Dynamic injection (RAG) maximizes flexibility but adds latency; static embedding is fastest but costly and inflexible; modular adapters balance efficiency with swappability; prompt optimization requires no training but only activates existing knowledge. Combining all three outperforms any single approach.

When does explicit reasoning actually help model performance?

Explicit reasoning benefits tasks with step-wise logical structure (math, code) but degrades tasks requiring nuanced continuous judgment (reranking, holistic assessment). Meta-analysis across 100+ papers confirms CoT helps primarily on symbolic logic tasks, with selective deployment saving 60-70% of inference tokens on non-math tasks.

Do large language models reason symbolically or semantically?

When semantic content is decoupled from reasoning tasks, LLM performance collapses even with correct rules in context. Models rely on parametric commonsense and token associations rather than formal logical manipulation, constraining reasoning to training distribution semantics.
