Why do students learn better from explanations than from solving problems from scratch?
This reads 'students' as learning models (and the learners they stand in for), asking why engaging with worked-out reasoning or critiques of solutions can beat grinding problems out unaided — and the corpus suggests the answer is about what kind of signal each path actually carries.
The collection has a surprisingly direct answer: engaging with reasoning that's already laid out exposes the *structure* of how an answer is reached, while solving from scratch mostly rewards arriving at a surface-correct result. The sharpest evidence is that training a model to critique noisy, sometimes-wrong responses produces deeper understanding than training it to imitate correct answers, because critique forces engagement with the failure modes, not just the happy path (see: Does critiquing errors teach deeper understanding than imitating correct answers?). Strikingly, this works even from a single problem: training a model on teacher-written critiques of varied solutions to one question activates reasoning about as well as reinforcement learning with verifiable rewards (RLVR), with no trial-and-error reinforcement needed (see: Can a single problem unlock reasoning through solution critique?).
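To make the imitation-versus-critique distinction concrete, here is a minimal sketch of how the two kinds of training example might differ. The `TrainingExample` class, the prompt template, and the arithmetic question are all hypothetical illustrations, not the data format used in the cited work.

```python
from dataclasses import dataclass


@dataclass
class TrainingExample:
    prompt: str  # what the model sees
    target: str  # what the model is trained to produce


def imitation_example(question: str, correct_answer: str) -> TrainingExample:
    """Imitation: the model only ever sees the question and a correct answer."""
    return TrainingExample(prompt=question, target=correct_answer)


def critique_example(question: str, candidate: str, critique: str) -> TrainingExample:
    """Critique: the model sees a possibly wrong solution and must judge it."""
    prompt = (
        f"Question: {question}\n"
        f"Candidate solution: {candidate}\n"
        "Critique the solution."
    )
    return TrainingExample(prompt=prompt, target=critique)


question = "What is 17 * 6?"

# Imitation target: just the happy path.
sft = imitation_example(question, "102")

# Critique target: names the failure mode in a flawed attempt.
cft = critique_example(
    question,
    candidate="17 * 6 = 17 * 5 + 6 = 91",
    critique=(
        "The decomposition is wrong: the leftover term should be 17, not 6. "
        "17 * 5 + 17 = 102."
    ),
)

print(sft)
print(cft.prompt)
print(cft.target)
```

In this toy framing, the imitation target never mentions an error, while the critique target has to locate one and repair it, which is the kind of engagement with failure modes the finding points to.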
Why would seeing the work beat doing the work? Because the value lives in the *process*, not the final answer. Models trained on full search traces, including the wrong turns, dead ends, and backtracking, reach 25% higher accuracy than models trained only on clean, optimal solutions, because they learn an internal model of *how to search* rather than memorizing one fixed route (see: Does training on messy search processes improve reasoning?). Solving from scratch, by contrast, tends to produce wandering, unsystematic exploration whose success probability drops exponentially as problems get deeper (see: Why do reasoning LLMs fail at deeper problem solving?). An explanation hands you the systematic path for free.
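The difference between a clean solution and a full search trace can be sketched with a toy depth-first search. The graph, the `visit`/`backtrack` wording, and the separators below are assumptions made for illustration; they are not the serialization used in the cited work.

```python
from typing import Dict, List, Optional

# Hypothetical toy problem: find a path from "S" to "G".
GRAPH: Dict[str, List[str]] = {
    "S": ["A", "B"],
    "A": ["C"],
    "C": [],  # dead end
    "B": ["D"],
    "D": ["G"],
    "G": [],
}


def dfs_with_trace(
    node: str, goal: str, trace: List[str], path: List[str]
) -> Optional[List[str]]:
    """Depth-first search that logs every visit and every backtrack."""
    path.append(node)
    trace.append(f"visit {node}")
    if node == goal:
        trace.append("goal reached")
        return list(path)
    for child in GRAPH[node]:
        found = dfs_with_trace(child, goal, trace, path)
        if found is not None:
            return found
    # No child led to the goal: record the dead end and undo the step.
    trace.append(f"backtrack from {node}")
    path.pop()
    return None


trace: List[str] = []
solution = dfs_with_trace("S", "G", trace, [])
assert solution is not None

# Training string that shows only the final route.
optimal_only = " -> ".join(solution)
# Training string that also shows the wrong turns and recoveries.
full_trace = "; ".join(trace)

print(optimal_only)
print(full_trace)
```

The first string records only where the search ended up; the second also records that the `A` branch was tried and abandoned, which is the part a learner needs in order to pick up the search procedure itself.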
There's a deeper mechanism underneath this. Reasoning generalizes when it draws on broad, transferable *procedural* knowledge, the 'how to do this kind of thing', rather than narrow factual recall tied to one document (see: Does procedural knowledge drive reasoning more than factual retrieval?). Explanations are procedural knowledge made explicit; solving from scratch leaves the procedure implicit and often never learned. The same theme shows up in work where simply extracting the rules latent in a worked example into reusable 'skills' lifts a frozen model's performance with no weight updates at all (GPT-4.1 goes from 11.1% to 16.5% on CL-bench); the gain comes purely from making the method visible (see: Can frozen models learn better by extracting context into skills?).
But the corpus also draws the boundary lines, which is where it gets interesting. Explanations don't help unconditionally. Teacher-refined material actually *hurts* when it sits beyond the student's current frontier: a student has to filter for what it can actually absorb, so the best explanation is the one calibrated to the learner, not the objectively best one (see: Does teacher-refined data always improve student model performance?). This rhymes with the finding that medium-difficulty problems teach best: too easy carries no signal, too hard amplifies shortcuts, and the productive zone balances success against informative failure (see: Why do medium-difficulty problems teach reasoning better than hard ones?). So 'learning from explanations' isn't a free lunch; it's a Goldilocks effect about matching the explanation to where the learner stands.
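One simple way to see why the middle of the difficulty range carries the most signal is the variance of a pass/fail reward. The sketch below assumes a binary reward with a fixed success probability per problem, which is a simplification introduced here; it covers only the "no variance at the extremes" part of the finding and says nothing about hard problems amplifying shortcuts.

```python
def reward_variance(p_success: float) -> float:
    """Variance of a 0/1 reward that equals 1 with probability p_success."""
    return p_success * (1.0 - p_success)


# Sweep from "always fails" to "always succeeds".
for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"p_success={p:.1f}  reward variance={reward_variance(p):.2f}")
```

Under this assumption the variance is zero when a problem is always solved or never solved and largest when success and failure are equally likely, which matches the inverted-U shape described above.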
Two caveats are worth carrying away. First, explanations can be unsettlingly hollow and still work: models trained on deliberately corrupted, semantically irrelevant reasoning traces perform about as well as those trained on correct ones, hinting that traces sometimes act as computational *scaffolding* rather than meaningful instruction (see: Do reasoning traces need to be semantically correct?). Second, for human learners the danger flips: explanations breed false confidence. Reasoning traces and post-hoc justifications make people accept AI answers whether or not they're right, and only *contrastive* explanations that argue both sides genuinely help a reader tell correct from incorrect (see: Do explanations actually help users spot AI mistakes?). The less obvious lesson: explanations teach best not when they show the right answer, but when they show the *contrast* between right and wrong.
Sources (10 notes)
Training models to critique noisy responses outperforms training on correct answers because critique forces engagement with failure modes and structural reasoning. Even imperfect critique supervision beats correct-answer imitation, showing how weak surface-pattern learning is for building genuine understanding.
Critique Fine-Tuning achieves reasoning activation comparable to RLVR using only one problem and teacher-generated critiques of varied solutions, with no reinforcement learning. This demonstrates that exposure to correct versus incorrect reasoning on a specific problem is the sufficient activation signal.
Stream of Search pretraining, which represents exploration and backtracking as serialized strings, achieves 25% higher accuracy than optimal-trajectory-only training. Models learn internal world models for search and adaptive strategies rather than fixed external methods.
Current reasoning models lack the three properties of systematic exploration: validity, effectiveness, and necessity. This causes success probability to drop exponentially with problem depth, making medium problems solvable but deep problems catastrophically harder.
Analysis of 5 million pretraining documents shows reasoning relies on broad, transferable procedural knowledge from diverse sources, unlike factual recall which depends on narrow, document-specific memorization of target facts.
Extracting natural-language rules from context into reusable skills improves frozen model reasoning without weight updates. On CL-bench, this lifts GPT-4.1 from 11.1% to 16.5%, with skills transferable across model backbones.
Teacher-refined data degrades performance when it exceeds the student's learning frontier, even if objectively higher quality. Students should filter refinements using their own statistical profile to retain only compatible improvements.
RLVR learning follows an inverted-U curve across difficulty: medium problems yield strongest gains because they balance success frequency with informative failures, while easy samples lack variance and hard samples amplify shortcuts.
Models trained on systematically irrelevant traces maintain solution accuracy and sometimes improve out-of-distribution generalization, suggesting traces function as computational scaffolding rather than meaningful reasoning steps.
Reasoning traces and post-hoc explanations increase user acceptance of AI answers regardless of correctness, engendering false trust. Only dual explanations presenting arguments for and against the answer genuinely help users distinguish correct from incorrect outputs.