Can models trained on many imperfect experts outperform each one?

Do generative models trained on diverse, imperfect human experts develop an implicit consensus that surpasses any individual contributor? This note explores whether aggregating diverse perspectives at training time, rather than at inference time, can denoise human biases.

Synthesis note · 2026-02-22 · sourced from Training Fine Tuning

The Transcendence paper formalizes a surprising property: generative models trained on many experts with diverse capacities and biases can outperform any single expert. The mechanism is implicit majority voting. When a model is trained on games from many different human chess players, cross-entropy optimization pulls its predictions toward the consensus behavior, which, by the wisdom-of-the-crowd effect, is often better than that of any individual contributor.
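A minimal sketch of the implicit vote, under loud simplifying assumptions: the three experts, three states, and two moves below are hypothetical toy data, each expert is deterministic, and each contributes the same amount of data. Under those assumptions the pooled move frequencies are what a cross-entropy-trained model would ideally match, and taking the most frequent move per state gives a policy better than any single expert. It is an illustration, not the paper's experimental setup.

```python
from collections import Counter

# Hypothetical toy data (not from the paper): the correct move is "a" in
# every state, and each expert errs in a different state.
CORRECT = {"s1": "a", "s2": "a", "s3": "a"}
EXPERTS = {
    "expert_1": {"s1": "b", "s2": "a", "s3": "a"},
    "expert_2": {"s1": "a", "s2": "b", "s3": "a"},
    "expert_3": {"s1": "a", "s2": "a", "s3": "b"},
}


def pooled_distribution(experts, state):
    """Empirical move distribution for one state in the pooled data.

    Assuming equal data per expert, this is the distribution that
    minimizes cross-entropy on the pooled demonstrations.
    """
    counts = Counter(policy[state] for policy in experts.values())
    total = sum(counts.values())
    return {move: n / total for move, n in counts.items()}


def consensus_policy(experts):
    """Pick the most frequent pooled move in each state (the implicit vote)."""
    policy = {}
    for state in CORRECT:
        dist = pooled_distribution(experts, state)
        policy[state] = max(dist, key=dist.get)
    return policy


def accuracy(policy):
    """Fraction of states in which the policy plays the correct move."""
    return sum(policy[s] == CORRECT[s] for s in CORRECT) / len(CORRECT)
```

In this toy setup every expert is right in two of three states, while `consensus_policy(EXPERTS)` is right in all three, because no two experts share the same mistake.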

Low-temperature sampling is the key enabler. At low temperature, the model's output distribution concentrates on its highest-probability predictions, that is, on the consensus. In the zero-temperature limit the model always plays its single most probable move, which amounts to a majority vote over the experts it was trained on. The advantage comes primarily from performing much better on a small subset of states, likely the critical, outcome-determining positions where individual human biases diverge most and crowd wisdom is most valuable.
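The concentration effect can be sketched with the standard temperature rescaling, where each probability is raised to the power 1/T and the result is renormalized. The function name `apply_temperature` and the example distribution are assumptions made for illustration; the sketch also assumes every probability is strictly positive.

```python
import math


def apply_temperature(probs, temperature):
    """Return probs rescaled as p_i ** (1 / T), renormalized to sum to 1.

    Assumes all probabilities are > 0 and temperature > 0.
    """
    scaled = [math.log(p) / temperature for p in probs]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical model output over three moves; the first is the consensus.
base = [0.5, 0.3, 0.2]
unchanged = apply_temperature(base, 1.0)   # T = 1 leaves the distribution as is
sharpened = apply_temperature(base, 0.1)   # low T pushes mass onto the top move
```

At T = 1 the distribution is returned unchanged; at T = 0.1 nearly all of the probability mass sits on the first move, so sampling behaves almost like always playing the consensus choice.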

Diversity in the training data is a necessary condition. Without diversity, there is no denoising — a model trained on clones of one expert can only approach that expert's level. The practical conditions for transcendence: (1) diverse training sources with different biases, (2) a task where individual biases are uncorrelated (so they cancel under aggregation), and (3) low-temperature decoding to extract the consensus.
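Why uncorrelated errors matter can be illustrated with a Condorcet-style calculation. This is a simplified stand-in, not the paper's analysis: it assumes a binary choice, an odd number of experts, and fully independent errors, each expert being correct with the same probability. The function name `majority_accuracy` is hypothetical.

```python
from math import comb


def majority_accuracy(n_experts, p_correct):
    """P(a strict majority of n independent experts is correct).

    Assumes a binary choice, odd n_experts, and independent errors.
    """
    needed = n_experts // 2 + 1
    return sum(
        comb(n_experts, k) * p_correct**k * (1 - p_correct) ** (n_experts - k)
        for k in range(needed, n_experts + 1)
    )


# Independent experts: the vote improves as experts are added.
single = majority_accuracy(1, 0.6)
diverse = majority_accuracy(11, 0.6)

# Clones make identical errors, so their "vote" is just one expert's answer
# and accuracy stays at p_correct no matter how many copies are pooled.
clones = 0.6
```

With independent errors, eleven experts who are each right 60% of the time produce a majority that is right noticeably more often than any one of them; with perfectly correlated errors the aggregate gains nothing.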

This connects to, but is distinct from, the note "Why does majority voting outperform more complex inference methods?" That note describes inference-time majority voting over multiple samples from one model. Transcendence describes training-time majority voting implicitly encoded in a single model's weights through diverse training data. The mechanism is analogous (aggregation denoises) but operates at a different stage: training rather than inference.
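The contrast between the two stages can be sketched as follows. The distribution `MODEL` is a hypothetical stand-in for one trained model's output on one question, and the function names are illustrative. Inference-time voting draws many samples and counts them; in the transcendence setting the vote has already happened during training, so a single low-temperature (here greedy) decode reads it off.

```python
import random
from collections import Counter

# Hypothetical output distribution of one trained model on one question.
MODEL = {"a": 0.7, "b": 0.3}


def sample_answer(dist, rng):
    """Draw one answer at temperature 1."""
    answers = list(dist)
    weights = [dist[a] for a in answers]
    return rng.choices(answers, weights=weights, k=1)[0]


def inference_time_vote(dist, n_samples, rng):
    """Majority vote over many independent samples from one model."""
    votes = Counter(sample_answer(dist, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def greedy_decode(dist):
    """Single-pass decode at the zero-temperature limit."""
    return max(dist, key=dist.get)
```

Both routes land on the most probable answer, but the first pays for many samples at inference, while the second relies on aggregation that was done once, over diverse experts, at training time.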

The implication for LLM training is provocative: the "average" of many imperfect human demonstrations may be better than any individual human demonstration, provided the imperfections are diverse rather than correlated. This challenges the assumption that training data quality should be maximized per-example; quantity and diversity of perspectives may matter as much as individual quality.

Related concepts in this collection 4

Why does majority voting outperform more complex inference methods? Simple majority voting across independent samples often matches or beats sophisticated alternatives like Best-of-N and sequential revision. What makes this basic approach so hard to beat for reasoning models?
inference-time voting analog; this is the training-time version

Does voting discard useful reasoning from losing chains? When multiple reasoning chains compete through majority voting, intermediate steps from non-winning chains are discarded. Could extracting and mixing those intermediate facts improve both the final answer and our ability to understand the reasoning?
shows limits of pure voting; transcendence may have similar limits

Does training on AI-generated content permanently degrade model quality? When generative models train on outputs from previous models, do the resulting models lose rare patterns permanently? The question matters because future training data will inevitably contain synthetic content.
counterpoint: while diversity enables transcendence, synthetic data collapses diversity

Can generative and discriminative models reach agreement? Generative and discriminative decoding often produce conflicting answers. Can a game-theoretic framework force these two complementary procedures to reconcile their predictions into a single, more reliable output?
related consensus mechanism: transcendence achieves consensus across diverse training experts at training time, while Consensus Game achieves consensus between generative and discriminative decoding modes at inference time; both extract a signal more reliable than any single perspective
