INQUIRING LINE

What role does inductive bias play versus model capacity in practice?

This explores a tension in how models actually get their abilities: is performance driven by raw capacity (scale, parameters, how much a model *could* represent), or by inductive bias — the structural priors baked in during pretraining and architecture that shape what the model reaches for by default?


This explores whether what a model can do is mostly a matter of raw capacity or of inductive bias: the built-in tendencies that pretraining and architecture install before any task-specific training. The corpus leans hard toward bias, but with a twist: capacity sets the ceiling, while bias decides what actually gets used. Several independent lines of work converge on the claim that base models already contain latent reasoning ability, and that post-training merely *selects* it rather than creating it (Do base models already contain hidden reasoning ability?). RL post-training, on this view, teaches a model *when* to deploy reasoning, not *how* to reason: hybrid models recover 91% of the gains just by routing tokens, and the activation directions for reasoning strategies exist before any RL is applied (Does RL post-training create reasoning or just deploy it?). So the practical lever isn't adding capacity; it's shaping the bias that governs elicitation.

Where does that bias come from? Largely pretraining, and it runs deeper than most fine-tuning interventions can reach. A causal study using random-seed variation and cross-tuning found that models sharing a pretrained backbone show similar cognitive bias patterns regardless of what fine-tuning data they later saw: biases are planted in pretraining and only nudged afterward (Where do cognitive biases in language models come from?). And what's absorbed isn't just facts. Reasoning generalization is driven by *procedural* knowledge spread across many documents, while factual recall depends on narrow memorization. The transferable, reusable priors (the useful inductive bias) therefore come from the diversity of how-to patterns in the corpus, not from any single source (Does procedural knowledge drive reasoning more than factual retrieval?).

The sharper, less comfortable finding is that inductive bias often *masquerades* as capacity. Models can look like they're reasoning when they're really just leaning on a prior. Twelve of fourteen models in one study performed worse when constraints were removed, by up to 38.5 percentage points: they had been defaulting conservatively to harder options, not actually evaluating the problem (Are models actually reasoning about constraints or just defaulting conservatively?). The same theme shows up in how models inherit human cognitive shortcuts: they reproduce human content effects on logic tasks item-by-item (Do language models show the same content effects humans do?) and replicate specific human causal-reasoning errors like weak explaining-away and Markov violations (Do large language models make the same causal reasoning mistakes as humans?). Those aren't capacity limits. They're biases absorbed from training-data statistics, and high accuracy can hide them entirely, the same way a 'theory-free' high-accuracy model can launder statistical error as objectivity (Can AI models be truly free from human bias?).

The payoff is that if bias is the real bottleneck, you can intervene on it cheaply rather than scaling capacity. Reasoning verbosity turns out to be a single linear direction in activation space: extract one vector from 50 paired examples and cut chain-of-thought length by 67% with no retraining (Can we steer reasoning toward brevity without retraining?). Architecture itself is an inductive-bias knob: making latent reasoning transitions stochastic rather than deterministic lets a model hold uncertainty and explore multiple solutions that a deterministic design simply couldn't represent (Can stochastic latent reasoning help models explore multiple solutions?). The cautionary flip side is that clumsy training can install *bad* bias on top of good capacity: over-hard RLVR samples teach degenerate shortcuts that then contaminate abilities the model already had (Do overly hard RLVR samples actually harm model capabilities?).

The thing you might not have expected to learn: in this corpus, 'add capacity' is rarely the answer to a reasoning problem — the capacity is usually already there, latent. The leverage is almost always in the inductive bias: where it came from (pretraining, not fine-tuning), how it disguises itself as competence, and how surprisingly editable it is once you can name the direction it points in.


Sources (11 notes)

Do base models already contain hidden reasoning ability?

Five independent mechanisms—RL steering, critique fine-tuning, decoding changes, SAE feature steering, and RLVR—all elicit reasoning already present in base model activations. Post-training selects rather than creates reasoning; the bottleneck is elicitation, not capability acquisition.

Does RL post-training create reasoning or just deploy it?

Evidence shows base models already contain reasoning capability in latent form; RL training optimizes deployment timing rather than capability creation. Hybrid models recover 91% of performance gains by routing tokens only, and activation vectors for reasoning strategies exist before any RL.

Where do cognitive biases in language models come from?

A causal experiment using random-seed variation and cross-tuning showed that models sharing a pretrained backbone exhibit similar bias patterns regardless of finetuning data. Biases are planted during pretraining and merely swayed by instruction tuning.

Does procedural knowledge drive reasoning more than factual retrieval?

Analysis of 5 million pretraining documents shows reasoning relies on broad, transferable procedural knowledge from diverse sources, unlike factual recall which depends on narrow, document-specific memorization of target facts.

Are models actually reasoning about constraints or just defaulting conservatively?

Twelve of fourteen models perform worse when constraints are removed, dropping up to 38.5 percentage points. Models appear to reason correctly by defaulting to harder options, not by actually evaluating constraints.

Do language models show the same content effects humans do?

LLMs show identical content-sensitivity patterns to humans on NLI, syllogisms, and Wason tasks, with belief-bias signatures matching human error rates item-by-item. This behavioral isomorphism across three independent tasks suggests that content and logical form are architecturally inseparable in transformer reasoning.

Do large language models make the same causal reasoning mistakes as humans?

LLMs show weak explaining away and Markov violations in collider networks, matching human error patterns exactly. This suggests shared mechanisms rooted in training data statistics rather than categorical reasoning inferiority.

Can AI models be truly free from human bias?

Research shows that 'theory-free' AI models mask bigotry behind high accuracy metrics while committing fundamental statistical errors. A 95% accurate criminal justice system would wrongly convict thousands, demonstrating that model sophistication does not validate causal inference.

Can we steer reasoning toward brevity without retraining?

Activation-Steered Compression extracts a single vector from 50 paired examples to reduce chain-of-thought length by 67% while maintaining accuracy and achieving 2.73x speedup. The method is training-free and generalizes across model sizes and domains.
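
As a rough illustration of what "extracting a single vector from paired examples" can look like, here is a minimal sketch using a difference of means between activations for concise and verbose versions of the same problems. This is an assumption about the general technique, not a description of how Activation-Steered Compression is actually implemented; the function names (`steering_vector`, `steer`), the `strength` parameter, and the two-dimensional toy activations are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Average a list of equally sized activation vectors, dimension by dimension."""
    n = len(rows)
    dim = len(rows[0])
    return [sum(row[i] for row in rows) / n for i in range(dim)]


def steering_vector(concise_acts: Sequence[Sequence[float]],
                    verbose_acts: Sequence[Sequence[float]]) -> Vector:
    """Difference of means: points from 'verbose' activations toward 'concise' ones."""
    if len(concise_acts) != len(verbose_acts):
        raise ValueError("expected paired examples")
    mean_concise = mean_vector(concise_acts)
    mean_verbose = mean_vector(verbose_acts)
    return [c - v for c, v in zip(mean_concise, mean_verbose)]


def steer(hidden: Sequence[float], direction: Sequence[float],
          strength: float = 1.0) -> Vector:
    """Shift a hidden state along the steering direction; no weights are updated."""
    return [h + strength * d for h, d in zip(hidden, direction)]


# Toy stand-ins for activations (hypothetical numbers, two pairs instead of 50).
verbose = [[1.0, 4.0], [3.0, 6.0]]
concise = [[1.0, 1.0], [3.0, 3.0]]

direction = steering_vector(concise, verbose)
steered = steer([2.0, 5.0], direction, strength=0.5)
```

In this toy setup only the second dimension differs between the two sets, so the extracted direction is zero in the first dimension and negative in the second. In a real model the vector would be added to a chosen layer's hidden states at inference time, which is why the approach needs no retraining.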

Can stochastic latent reasoning help models explore multiple solutions?

GRAM replaces deterministic latent updates with stochastic sampling, enabling models to represent distributions over solutions rather than single predictions. This allows handling of ambiguous problems and multiple valid strategies that deterministic designs cannot represent.
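
The contrast between deterministic and stochastic latent updates can be sketched with a toy transition function. This is only an illustration under stated assumptions: GRAM's actual parameterization is not described here, and the `tanh` update, the Gaussian noise, and the `noise_scale` value are hypothetical choices made for the example.

```python
import math
import random
from typing import List, Optional


def deterministic_step(h: List[float]) -> List[float]:
    """Toy deterministic transition: the same state always maps to the same next state."""
    return [math.tanh(0.5 * x) for x in h]


def stochastic_step(h: List[float], rng: random.Random,
                    noise_scale: float = 0.3) -> List[float]:
    """Toy stochastic transition: sample around the deterministic update."""
    center = deterministic_step(h)
    return [c + noise_scale * rng.gauss(0.0, 1.0) for c in center]


def rollout(h0: List[float], steps: int,
            rng: Optional[random.Random] = None) -> List[float]:
    """Apply the transition repeatedly; pass an rng to use the stochastic version."""
    h = list(h0)
    for _ in range(steps):
        h = deterministic_step(h) if rng is None else stochastic_step(h, rng)
    return h


rng = random.Random(0)
start = [1.0, -1.0]

# Four rollouts from the same start under each kind of transition.
det_runs = [rollout(start, 3) for _ in range(4)]
sto_runs = [rollout(start, 3, rng) for _ in range(4)]
```

Every deterministic rollout ends in the same latent state, so a single starting point can only ever lead to one answer. The stochastic rollouts spread out, which is the property that lets a model represent a distribution over candidate solutions.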

Do overly hard RLVR samples actually harm model capabilities?

Training on nearly-impossible problems causes models to learn degenerate shortcuts rather than genuine reasoning, and these shortcuts contaminate pre-existing capabilities. Group-relative normalization treats rare accidental successes as high-advantage trajectories, reinforcing answer repetition and computation-skipping instead of sound reasoning patterns.
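
A small worked example shows why group-relative normalization amplifies rare successes. The sketch assumes the common formulation in which each rollout's advantage is its reward minus the group mean, divided by the group standard deviation; the population standard deviation, the `eps` term, the binary rewards, and the group size of eight are assumptions for illustration, and real implementations differ in these details.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Nearly impossible problem: one accidental success out of eight rollouts.
hard_group = [1.0] + [0.0] * 7
# Moderately difficult problem: half the rollouts succeed.
balanced_group = [1.0] * 4 + [0.0] * 4

hard_adv = group_relative_advantages(hard_group)
balanced_adv = group_relative_advantages(balanced_group)

print(f"lone success advantage:     {hard_adv[0]:.2f}")
print(f"balanced success advantage: {balanced_adv[0]:.2f}")
```

Under these assumptions the lone success receives an advantage of about 2.65, against about 1.0 for a success in the balanced group. The rarer the success, the larger the update it drives, regardless of whether the trajectory reached the answer by sound reasoning or by a shortcut.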
