SYNTHESIS NOTE

Can careful curation replace massive alignment datasets?

Does fine-tuning a strong pretrained model on 1,000 carefully selected examples achieve alignment quality comparable to that of models trained on vastly larger datasets? If so, it challenges common assumptions about how much data post-training needs.

Synthesis note · 2026-02-23 · sourced from Alignment

LIMA ("Less Is More for Alignment") establishes a foundational finding: given a strong pretrained language model, remarkably strong alignment performance can be achieved by fine-tuning on just 1,000 carefully curated training examples. This is the alignment-specific instantiation of a broader principle that pretraining does the heavy lifting and post-training is primarily about activating existing capabilities.

The finding connects to a converging pattern of evidence across the vault.

The consistent pattern is that post-training interventions require far less data than commonly assumed, but the quality bar is high: random data at scale underperforms curated data at small scale. This is the "Less Is More" principle: the pretrained model already contains the capabilities, and post-training teaches it when and how to deploy them, not what they are.

For alignment specifically, the implication challenges the industry's data collection approach. Massive RLHF annotation efforts with thousands of labelers may be optimizing the wrong variable. Careful curation of a small number of high-quality examples, targeting the specific behavioral patterns desired, may achieve comparable results at a fraction of the cost.
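
As a rough illustration of what "careful curation of a small number of high-quality examples" could look like in practice, the sketch below selects a small training set from a larger candidate pool. It is not LIMA's procedure, and the note does not describe one. The `quality` and `category` fields, the `min_quality` threshold, and the `curate` function are all hypothetical and assume each candidate has already been scored and labeled by some upstream review step.

```python
from collections import defaultdict


def curate(examples, budget=1000, min_quality=0.8):
    """Select up to `budget` examples, favoring quality and category coverage.

    Assumes (hypothetically) that each example is a dict with a numeric
    "quality" score and a "category" label naming the behavior it targets.
    """
    # Drop anything below the quality bar, grouping survivors by category.
    by_category = defaultdict(list)
    for example in examples:
        if example["quality"] >= min_quality:
            by_category[example["category"]].append(example)

    # Within each category, best examples first.
    for items in by_category.values():
        items.sort(key=lambda ex: ex["quality"], reverse=True)

    # Round-robin across categories so no single behavior dominates the set.
    selected = []
    categories = sorted(by_category)
    rank = 0
    while len(selected) < budget:
        added = False
        for category in categories:
            items = by_category[category]
            if rank < len(items) and len(selected) < budget:
                selected.append(items[rank])
                added = True
        if not added:  # every category is exhausted
            break
        rank += 1
    return selected
```

The round-robin step is one possible way to target "specific behavioral patterns": it trades a little average quality for coverage, so a pool dominated by one kind of prompt does not yield a training set dominated by it too.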

Original note title

1000 carefully curated alignment examples achieve remarkably strong performance — alignment is primarily about data quality not quantity