Do pretraining biases and traditional selection bias compound in production recommenders?

This note asks whether two distinct bias origins combine and reinforce each other in a live recommender: biases baked in during language-model pretraining, and the older selection bias that comes from training only on items users were actually shown. The notes collected here suggest they are not the same problem, and that is exactly why they compound: each has a different source, so debiasing one leaves the other untouched.

Start with the pretraining side. When you build a recommender on top of an LLM, it arrives already opinionated. Wu et al. catalog three biases (position, popularity, and fairness bias) inherited straight from the pretraining objective and corpus demographics rather than from any interaction data (see "Where do recommendation biases come from in language models?"). The sharpest illustration is GPT-4 concentrating its picks on items popular in *its pretraining corpus* rather than in your dataset: The Shawshank Redemption surfaces across datasets with totally different popularity distributions (see "Where does LLM recommendation bias actually come from?"). That is a domain-shift bias that standard debiasing cannot reach, because it does not live in your logs.
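One way to check for this kind of skew can be sketched as follows. This is a minimal diagnostic, not a method taken from the cited notes: the function names, the toy interaction logs, and the hypothetical LLM outputs are all invented for illustration. It finds the item an LLM recommends most often for each dataset and reports how popular that item actually is in that dataset's own interactions.

```python
from collections import Counter

def most_recommended(rec_lists):
    """Return the item that appears most often across all recommendation lists."""
    counts = Counter(item for recs in rec_lists for item in recs)
    return counts.most_common(1)[0][0]

def local_popularity_rank(interactions, item):
    """1-based rank of `item` by interaction count in this dataset, or None if absent."""
    ranked = [i for i, _ in Counter(interactions).most_common()]
    return ranked.index(item) + 1 if item in ranked else None

# Toy interaction logs (invented): each dataset has its own popular items.
datasets = {
    "A": ["a1"] * 5 + ["a2"] * 3 + ["shawshank"],
    "B": ["b1"] * 6 + ["b2"] * 2 + ["shawshank"],
}

# Hypothetical LLM recommendation lists for users of each dataset.
llm_recs = {
    "A": [["shawshank", "a1"], ["shawshank", "a2"]],
    "B": [["shawshank", "b1"], ["shawshank", "b2"]],
}

# For each dataset: (most recommended item, that item's rank in the local logs).
report = {}
for name, interactions in datasets.items():
    top_item = most_recommended(llm_recs[name])
    report[name] = (top_item, local_popularity_rank(interactions, top_item))

print(report)
```

If the same item tops the recommendations in every dataset while ranking low in each dataset's own logs, the skew is unlikely to have come from the interaction data.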

Now the classic side. Traditional selection bias is a property of the feedback loop: the model learns from what past versions of the model chose to show. YouTube's multi-objective ranker treats this as a first-class problem, using a shallow position tower specifically to strip selection bias out of training data, and warns that without it, models converge on degenerate equilibria that amplify their own past decisions (see "Why do ranking systems need to model selection bias explicitly?"). There is a quieter structural version too: when embedding dimensions are too small, the system overfits toward popular items to maximize ranking quality, and that unfairness compounds over time as niche items keep getting starved of exposure (see "Does embedding dimensionality secretly drive popularity bias in recommenders?").
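To make the selection-bias problem concrete, here is a small sketch of position-bias correction. It uses inverse propensity weighting, a simpler stand-in technique, and not the shallow position tower described above. The examination propensities in `EXAM_PROPENSITY` and the toy logs are assumptions invented for the example; in practice the propensities would have to be estimated.

```python
from collections import defaultdict

# Assumed probability that a user even looks at each slot (hypothetical values).
EXAM_PROPENSITY = {1: 1.0, 2: 0.5, 3: 0.25}

def naive_ctr(logs):
    """Clicks per impression for each item, ignoring where it was shown."""
    clicks, shows = defaultdict(float), defaultdict(int)
    for item, position, clicked in logs:
        clicks[item] += clicked
        shows[item] += 1
    return {item: clicks[item] / shows[item] for item in shows}

def ipw_ctr(logs, propensity=EXAM_PROPENSITY):
    """Inverse-propensity-weighted CTR: each click counts 1 / P(slot was examined)."""
    clicks, shows = defaultdict(float), defaultdict(int)
    for item, position, clicked in logs:
        clicks[item] += clicked / propensity[position]
        shows[item] += 1
    return {item: clicks[item] / shows[item] for item in shows}

# Toy logs of (item, position, clicked): the past ranker always put "popular"
# in slot 1 and "niche" in slot 3.
logs = (
    [("popular", 1, 1)] * 20 + [("popular", 1, 0)] * 80
    + [("niche", 3, 1)] * 10 + [("niche", 3, 0)] * 90
)

naive = naive_ctr(logs)
corrected = ipw_ctr(logs)
print(naive, corrected)
```

On these logs the naive estimate ranks the popular item higher, while the corrected estimate ranks the niche item higher. A model trained on the naive numbers would keep the niche item in slot 3 and go on reproducing its own earlier choice.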

Here's where they meet. Both biases push in the same direction, toward the popular, the already-seen, the safe, so in production they do not cancel; they align. The pretrained prior seeds a popularity skew before a single user clicks; the selection-bias feedback loop then takes whatever the system shows and feeds it back as 'evidence' that those items deserve exposure. The dimensionality result shows the loop tightening over time; the YouTube result shows it hardening into a self-confirming equilibrium. A system that fixes only collaborative-filtering selection bias still carries the pretraining skew untouched, and Wu et al. are explicit that LLM-inherited bias needs LLM-specific mitigation, not adapted collaborative-filtering techniques (see "Where do recommendation biases come from in language models?").
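The interaction described above can be illustrated with a toy simulation. This is a deliberately simplified model and not something drawn from the cited notes: `prior_share` stands in for the exposure skew a pretrained prior might seed, `quality` is each item's true click probability, and `sharpness` is an assumed parameter for how strongly the ranker rewards items that already collect more clicks.

```python
def simulate_exposure(prior_share, quality, rounds=5, sharpness=2.0):
    """Deterministic toy feedback loop over exposure shares.

    Each round, expected clicks = exposure share * true quality, and the next
    round's exposure is proportional to clicks ** sharpness.
    Returns the list of exposure shares, starting with the prior.
    """
    share = list(prior_share)
    history = [share]
    for _ in range(rounds):
        clicks = [s * q for s, q in zip(share, quality)]
        scores = [c ** sharpness for c in clicks]
        total = sum(scores)
        share = [s / total for s in scores]
        history.append(share)
    return history

# Item 0 is the "popular" item, item 1 is a slightly better niche item.
quality = [0.10, 0.12]

balanced = simulate_exposure([0.5, 0.5], quality)  # no prior skew
skewed = simulate_exposure([0.7, 0.3], quality)    # prior favors item 0

print("balanced start, final share of item 0:", round(balanced[-1][0], 4))
print("skewed start, final share of item 0:", round(skewed[-1][0], 4))
```

With a balanced start the better niche item gains exposure, but with a skewed start the loop amplifies the initial advantage of the weaker item. In this toy setting, correcting the loop alone would not undo a skew that was present before the first click.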

A less obvious consequence: this compounding does not stay inside the model. Recommendation feeds act as persuasion infrastructure where effects 'compound through rating contamination and selection biases,' shaping not just what users see but what producers make and what populations come to believe (see "How do recommendation feeds shape what people see and believe?"). So two technical biases with separate origins do not just degrade ranking quality; they braid together into a feedback loop that reshapes the catalog and the audience around their shared blind spot.

Sources (5 notes)

Where do recommendation biases come from in language models?

Wu et al. show that LLM-based recommendation systems exhibit position bias, popularity bias, and fairness bias—unique failure modes stemming from the language model's pretraining objective and corpus demographics rather than interaction data. Mitigation requires LLM-specific approaches, not adapted collaborative filtering techniques.

Where does LLM recommendation bias actually come from?

GPT-4 concentrates recommendations on items popular in its pretraining corpus rather than in target datasets. The Shawshank Redemption dominates across different datasets even when they have different popularity distributions, revealing a domain-shift effect that standard debiasing methods cannot address.

Why do ranking systems need to model selection bias explicitly?

YouTube's multi-objective ranker uses MMoE for conflicting objectives and a shallow position tower to remove selection bias from training data. Without both mechanisms, models converge on degenerate equilibria that amplify their own past decisions.

Does embedding dimensionality secretly drive popularity bias in recommenders?

Research shows that when user/item embedding dimensions are too small, recommender systems overfit toward popular items to maximize ranking quality. This compounds over time as niche items receive insufficient exposure, and cannot be fixed post-hoc without treating dimensionality as a fairness hyperparameter.

How do recommendation feeds shape what people see and believe?

Research shows recommendation systems operate as political actors: feed weights influence producer behavior, network topology drives opinion convergence, and automation enables targeted persuasion at population scale. These effects compound through rating contamination and selection biases.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a recommender systems researcher evaluating whether pretraining bias and selection bias genuinely compound in production, or whether recent advances have decoupled them. The question: do these two distinct bias sources—one inherited from LLM pretraining, one from feedback loops on shown items—reinforce each other in live systems, or has the field found ways to address them independently?

What a curated library found — and when (dated claims, not current truth):
Findings span 2018–2025, with critical LLM-recommender work clustering 2023–2025.
• Pretrained LLMs inherit three biases (position, popularity, fairness) from their corpus, independent of user interaction data; GPT-4 favors items from its pretraining distribution even across datasets with different popularity (Wu et al., ~2023).
• Selection bias, the feedback loop in which a model learns only from items its earlier versions chose to show, can drive models toward degenerate equilibria; YouTube's multi-objective ranker uses a shallow position tower to strip it from training data (YouTube ranker paper, 2019).
• Low-dimensional embeddings cause long-term popularity overfitting and unfairness compounding over time (arXiv:2305.13597, 2023).
• Both biases align toward popularity and safe items, so debiasing one leaves the other untouched; LLM-inherited bias requires LLM-specific mitigation, not adapted collaborative-filtering techniques (Wu et al., ~2023).
• Recent RL post-training (arXiv:2504.07912, April 2025) shows RL amplifies pretraining behaviors; new data permeation in LLMs (arXiv:2504.09522, April 2025) suggests model drift may reshape bias surfaces.

Anchor papers (verify; mind their dates):
• arXiv:2305.19860 (May 2023) — Survey on Large Language Models for Recommendation
• arXiv:2305.13597 (May 2023) — Curse of "Low" Dimensionality in Recommender Systems
• arXiv:2504.07912 (April 2025) — Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining
• arXiv:2504.09522 (April 2025) — How new data permeates LLM knowledge and how to dilute it

Your task:
(1) RE-TEST THE COMPOUNDING CLAIM. For each finding (pretrain bias inheritance, feedback-loop hardening, dimensional overfitting), judge whether newer training methods (RL post-training, continual adaptation, retrieval-augmented generation), evaluation harnesses (bias measurement via held-out diversity metrics, counterfactual logging), or orchestration (multi-agent debate, offline debiasing filters) have RELAXED or OVERTURNED each constraint. Separate the durable observation—that two distinct bias sources exist—from the perishable claim that they necessarily reinforce. Where has independent or coupled mitigation succeeded?
(2) SURFACE DISAGREEMENT. The April 2025 papers (2504.07912, 2504.09522) may contradict or complicate the 2023 synthesis: does RL amplification of pretraining bias make compounding *worse*, or do new-data permeation mechanisms offer escape routes? Find work that explicitly tests whether decoupling is possible or whether the feedback loop is structural.
(3) Propose 2 research questions that assume the regime may have moved: (a) Can continual retraining on fresh, debiased interaction data suppress pretraining bias inheritance faster than feedback loops amplify it? (b) Do multi-agent or retrieval-based recommenders (which bypass dense embeddings) sidestep the dimensionality compounding trap?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
