Can priming from different facts interfere with each other in the same model?
This explores whether learning or being prompted with one fact 'lights up' related concepts in a way that can collide or compete with priming from other facts inside the same model — what the corpus calls knowledge priming and association interference.
The corpus doesn't have a single paper that stages two facts head-to-head, but it maps the surrounding territory well enough to reason about it. The starting point is that priming is real and surprisingly mechanical: after just a few training exposures, whether a new fact 'primes' a related keyword is predictable from how probable that keyword already was before training, with a sharp threshold (~10^-3) separating contexts where priming takes hold from those where it stays inert (Can we predict keyword priming before learning happens?). That predictability is the key clue: if priming strength is governed by pre-existing probability, then two facts that both reach toward the same keyword aren't writing on a blank slate; they're competing for the same prior.
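As a rough illustration of "competing for the same prior", the sketch below measures how far a keyword's pre-training probability sits from the ~10^-3 threshold and checks whether two keywords fall on the same side of it. The function names and the example probabilities are hypothetical, and the code deliberately makes no claim about which side of the threshold produces priming; it only locates a keyword relative to the threshold.

```python
import math

# Approximate threshold quoted in the note; treat it as a rough figure.
THRESHOLD = 1e-3


def log10_margin(prior_prob, threshold=THRESHOLD):
    """Signed distance, in orders of magnitude, between a prior and the threshold."""
    if not 0.0 < prior_prob <= 1.0:
        raise ValueError("prior_prob must be in (0, 1]")
    return math.log10(prior_prob) - math.log10(threshold)


def same_side(prob_a, prob_b, threshold=THRESHOLD):
    """True when both priors fall on the same side of the threshold."""
    return (prob_a < threshold) == (prob_b < threshold)


# Hypothetical pre-training probabilities for keywords targeted by two facts.
keyword_prior_fact_a = 1e-5
keyword_prior_fact_b = 1e-1

margin_a = log10_margin(keyword_prior_fact_a)  # about two orders below
margin_b = log10_margin(keyword_prior_fact_b)  # about two orders above
share_regime = same_side(keyword_prior_fact_a, keyword_prior_fact_b)
```

When two facts target the same keyword, they share a single prior and therefore a single margin, which is the sense in which they are not writing on a blank slate.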
The clearest evidence of interference is the finding that models fail to integrate new context when older, stronger associations from training override it. The parametric prior simply wins, and textual prompting alone can't dislodge it; only direct intervention in the model's representations can (Why do language models ignore information in their context?). That is interference in its plainest form: priming from one source (training history) suppressing priming from another (your current input). It also tells you the interference isn't symmetric: strength matters, and the entrenched fact usually beats the fresh one.
There's a deeper structural reason to expect collisions. Models can hold multiple distinct tasks active at once 'in superposition,' but the moment generation begins, autoregressive decoding forces a collapse to a single one after the first token (Can LLMs handle multiple tasks at once during inference?). So even when several primed pathways coexist internally, the output channel is a bottleneck that lets only one through, which is exactly the condition under which competing primes interfere rather than blend. Relatedly, prompting can only reorganize and activate knowledge already present; it can't inject anything new (Can prompt optimization teach models knowledge they lack?). When you prime with a fact, then, you're really redistributing activation across an existing landscape, and redistribution toward one region pulls it away from another.
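The bottleneck can be pictured with a toy mixture model. This is a minimal sketch under strong assumptions, not the mechanism reported in the note: two hypothetical tasks are held with internal weights, each task has its own first-token distribution, and greedy decoding emits one token. Conditioning on that token (a simple Bayes update) shows how one emitted token can concentrate nearly all the weight on a single task. All task names, tokens, and numbers are invented for illustration.

```python
def first_token_mixture(task_weights, first_token_probs):
    """Mix per-task first-token distributions by the internal task weights."""
    mixed = {}
    for task, weight in task_weights.items():
        for token, prob in first_token_probs[task].items():
            mixed[token] = mixed.get(token, 0.0) + weight * prob
    return mixed


def greedy_commit(task_weights, first_token_probs):
    """Emit the most likely first token and return the task weights given it."""
    mixed = first_token_mixture(task_weights, first_token_probs)
    token = max(mixed, key=mixed.get)
    joint = {
        task: weight * first_token_probs[task].get(token, 0.0)
        for task, weight in task_weights.items()
    }
    total = sum(joint.values())
    posterior = {task: value / total for task, value in joint.items()}
    return token, posterior


# Hypothetical setup: two tasks coexist before decoding starts.
task_weights = {"translate": 0.6, "capitalize": 0.4}
first_token_probs = {
    "translate": {"le": 0.9, "THE": 0.1},
    "capitalize": {"le": 0.05, "THE": 0.95},
}

token, posterior = greedy_commit(task_weights, first_token_probs)
```

In this toy setup the weights start at 0.6 and 0.4, yet after the single emitted token the losing task keeps only a few percent of the weight. The two pathways did not blend; one displaced the other.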
Two cross-cutting notes sharpen the picture. Cognitive biases, the model's default leanings, are planted in pretraining and only nudged by later tuning (Where do cognitive biases in language models come from?), so the 'prior' that competing primes fight against is deep and durable. And model confidence predicts how stubbornly a model resists having its output swayed (Does model confidence predict robustness to prompt changes?), implying that interference between primes is strongest exactly where the model is uncertain, and negligible where it's already committed. The takeaway: priming interference isn't random noise. It's a contest decided by prior probability and confidence, in which the older, stronger, more probable fact tends to silently overwrite the newer one.
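The confidence point can be made concrete with a toy logit example. This is an illustrative sketch, not the ProSA methodology: it assumes a competing prime acts like a fixed additive nudge on one answer's logit, and shows that the same nudge flips the top answer only when the original margin was small. The answer labels, logits, and nudge size are hypothetical.

```python
import math


def softmax(logits):
    """Convert a dict of logits into probabilities (numerically stable)."""
    peak = max(logits.values())
    exps = {key: math.exp(value - peak) for key, value in logits.items()}
    total = sum(exps.values())
    return {key: value / total for key, value in exps.items()}


def top_after_nudge(logits, target, nudge):
    """Return the top answer after adding `nudge` to the target's logit."""
    nudged = dict(logits)
    nudged[target] += nudge
    return max(nudged, key=nudged.get)


# Hypothetical logits: answer "A" is the entrenched one in both cases.
confident = {"A": 5.0, "B": 0.0}   # large margin, high confidence in "A"
uncertain = {"A": 0.5, "B": 0.0}   # small margin, low confidence in "A"

confident_p = softmax(confident)["A"]
uncertain_p = softmax(uncertain)["A"]

# The same competing prime (a +1.0 nudge toward "B") applied to both.
confident_top = top_after_nudge(confident, "B", 1.0)
uncertain_top = top_after_nudge(uncertain, "B", 1.0)
```

The nudge changes the winner only in the uncertain case, because a one-unit push cannot close a five-unit margin. That matches the reading above: competing primes matter most where the model has not already committed.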
Sources (6 notes)
Can we predict keyword priming before learning happens? Pre-learning keyword probability strongly predicts post-learning priming across architectures and model sizes, with a ~10^-3 threshold separating contexts where priming occurs from those where it doesn't. Just 3 training exposures suffice to establish the effect.
Why do language models ignore information in their context? Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.
Can LLMs handle multiple tasks at once during inference? Large language models represent multiple complete, computationally distinct tasks simultaneously during inference, a macroscopic phenomenon separate from feature-level superposition. However, autoregressive decoding forces convergence to a single task after the first token, preventing practical multi-task generation.
Can prompt optimization teach models knowledge they lack? Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.
Where do cognitive biases in language models come from? A causal experiment using random-seed variation and cross-tuning showed that models sharing a pretrained backbone exhibit similar bias patterns regardless of finetuning data. Biases are planted during pretraining and merely swayed by instruction tuning.
Does model confidence predict robustness to prompt changes? ProSA found that when models are highly confident, they resist prompt rephrasing; low confidence causes major output swings. Larger models, few-shot examples, and objective tasks all correlate with higher confidence and greater robustness.