SYNTHESIS NOTE

Does fine-tuning on new facts increase hallucination risk?

When LLMs learn unfamiliar facts through fine-tuning, do they become more prone to hallucinating about things they already knew? Understanding this matters for safe knowledge updates.

Synthesis note · 2026-06-03 · sourced from Training Fine Tuning

It is often conjectured that supervised fine-tuning on facts the model never saw in pretraining teaches it to hallucinate, because it trains the model to assert things that are not grounded in its own knowledge. This work tests that conjecture in a controlled closed-book QA setup, varying the proportion of fine-tuning examples that introduce new knowledge (Unknown examples). There are two findings. First, LLMs struggle to acquire new factual knowledge through fine-tuning: Unknown examples are fit significantly more slowly than examples consistent with existing knowledge. Second, as those Unknown examples are eventually learned, they linearly increase the model's tendency to hallucinate on pre-existing knowledge. The harm therefore shows up as a form of overfitting on the slow-to-fit Unknown examples.
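A minimal sketch of how the Known/Unknown split might be made operational. This note does not specify the procedure, so the sketch assumes that an example counts as Known when the model produces the gold answer in at least one of several sampled closed-book attempts. The names `sample_answer`, `is_known`, and `split_by_knowledge` are hypothetical, and exact match after lowercasing is a simplifying assumption.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (question, gold answer)


def is_known(
    question: str,
    gold: str,
    sample_answer: Callable[[str], str],
    n_samples: int = 8,
) -> bool:
    """Assumed criterion: Known if any sampled closed-book answer matches gold."""
    normalized_gold = gold.strip().lower()
    return any(
        sample_answer(question).strip().lower() == normalized_gold
        for _ in range(n_samples)
    )


def split_by_knowledge(
    examples: List[Example],
    sample_answer: Callable[[str], str],
    n_samples: int = 8,
) -> Tuple[List[Example], List[Example]]:
    """Partition fine-tuning examples into (known, unknown) lists."""
    known: List[Example] = []
    unknown: List[Example] = []
    for question, gold in examples:
        bucket = known if is_known(question, gold, sample_answer, n_samples) else unknown
        bucket.append((question, gold))
    return known, unknown
```

With a split like this, the proportion of Unknown examples in the fine-tuning set can be varied deliberately, or the Unknown bucket can be dropped before training.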

The keeper is a concrete fine-tuning practice: prefer early stopping over a fixed step count, and consider filtering out Unknown examples (or keeping a few to teach uncertainty expression), because forcing new facts in degrades grounded recall. Early stopping helps because the Unknown examples are the last to be fit, so halting before they are memorized avoids most of the damage.
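The early-stopping half of that practice can be sketched as a patience-based loop on held-out accuracy. This is an illustration under stated assumptions, not the procedure used in the work: `train_one_epoch` and `eval_dev_accuracy` are hypothetical callables supplied by the caller, and the patience rule is one common choice among several.

```python
from typing import Callable, Tuple


def train_with_early_stopping(
    train_one_epoch: Callable[[], None],
    eval_dev_accuracy: Callable[[], float],
    max_epochs: int = 50,
    patience: int = 3,
) -> Tuple[int, float]:
    """Train until dev accuracy fails to improve for `patience` epochs.

    Returns (best_epoch, best_accuracy). Epochs are 0-indexed.
    """
    best_acc = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        train_one_epoch()
        acc = eval_dev_accuracy()
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
            epochs_without_improvement = 0
            # A real run would save a checkpoint here and restore it at the end.
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop before the slow-to-fit examples are memorized

    return best_epoch, best_acc
```

The dev set should probe pre-existing knowledge, since that is where the degradation appears. A fixed step count, by contrast, keeps training regardless of what the dev curve shows.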

This completes a tight loop in the vault's knowledge-acquisition thread. It is the empirical mechanism behind "Can models store unlimited facts without growing larger?" (externalize facts rather than fine-tune them in), it complements "Does teaching question patterns before document training improve knowledge access?" (encoding order matters), and it shares the leakage/overfitting concern of "Does repeated sensitive data in fine-tuning cause memorization?".

Original note title

fine-tuning on new factual knowledge is learned slowly and once learned linearly increases hallucination of pre-existing knowledge