Do similar user profiles create worse personalization errors than random ones?
This explores whether personalization fails worst not when a user's profile is obviously wrong, but when it's matched to someone almost-but-not-quite like them — and what the corpus says about why near-misses are more dangerous than random ones.
The corpus answers directly: yes. The PRIME work documents a U-shaped error curve in which the steepest performance drops come from replacing a user's profile with the *most similar* available profile, not a random one (see "Why do similar user profiles produce worse personalization errors?"). The mechanism is an uncanny-valley effect of confidence: when profiles are nearly identical, the model stops hedging and applies the wrong preferences with conviction. An obvious mismatch at least leaves room for caution; a convincing near-twin removes it.
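The profile-swap comparison described above can be sketched as two selection rules: one that picks the nearest other profile and one that picks at random. This is only an illustration under stated assumptions. Representing profiles as numeric vectors, using cosine similarity, and the function names `most_similar_swap` and `random_swap` are all hypothetical choices for this example; the source does not say how PRIME measures profile similarity.

```python
import math
import random
from typing import Dict, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0


def most_similar_swap(profiles: Dict[str, List[float]], user: str) -> str:
    """Return the id of the other user whose profile is closest to `user`'s.

    Assumption: profiles are embedding vectors and "similar" means high
    cosine similarity. Requires at least two profiles.
    """
    others = [u for u in profiles if u != user]
    return max(others, key=lambda u: cosine(profiles[user], profiles[u]))


def random_swap(profiles: Dict[str, List[float]], user: str, rng: random.Random) -> str:
    """Return the id of a uniformly random other user (the baseline condition)."""
    others = [u for u in profiles if u != user]
    return rng.choice(others)


# Toy data: "b" is a near-twin of "a", "c" is an obvious mismatch.
profiles = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
near_twin = most_similar_swap(profiles, "a")
baseline = random_swap(profiles, "a", random.Random(0))
```

An evaluation along these lines would then run the personalized task once with each substituted profile and compare the drop in quality against the user's true profile.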
What makes this more than a curiosity is what it implies about *how* personalization actually works. The same research line finds that personalization rides on style and expressed preferences rather than semantic content: profiles built from a user's past outputs outperform ones built from their inputs (see "Do user outputs outperform inputs for LLM personalization?"), and abstracted preference summaries beat literal recall of past interactions (see "Does abstract preference knowledge outperform specific interaction recall?"). If personalization is a thin layer of stylistic and preference signal, then a near-identical profile is precisely the kind of error that slips past coarse checks: it matches on every coarse feature and diverges only on the fine-grained preferences that matter.
The corpus also explains *when* the system should have known to doubt itself but didn't. LLM judges fail badly when persona information is sparse, because thin profiles lack the predictive signal to distinguish one user from a similar one. The fix is to let the model express verbal uncertainty and abstain rather than forcing a confident guess (see "Why do LLM judges fail at predicting sparse user preferences?"). That is the missing brake in the uncanny-valley case: the near-match feels like high-confidence territory, so the model never abstains.
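The abstention idea can be sketched as a simple confidence gate. This is a minimal illustration, not the cited work's method: the `Judgment` type, the `decide` function, and the 0.8 threshold are hypothetical names and an arbitrary value chosen for the example, and the step that turns the model's verbal uncertainty into a number is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Judgment:
    """Hypothetical judge output: a predicted preference plus self-reported confidence."""
    choice: str        # e.g. "A" or "B"
    confidence: float  # verbal uncertainty mapped to [0, 1] (assumed done upstream)


def decide(judgment: Judgment, threshold: float = 0.8) -> Optional[str]:
    """Return the judge's choice only when its confidence clears the threshold.

    Returning None means "abstain": the caller should fall back to a
    non-personalized default instead of forcing a guess.
    The default threshold is arbitrary and would need tuning.
    """
    if not 0.0 <= judgment.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return judgment.choice if judgment.confidence >= threshold else None


confident = decide(Judgment(choice="A", confidence=0.9))  # keeps the choice
uncertain = decide(Judgment(choice="B", confidence=0.5))  # abstains (None)
```

A gate like this only helps when the confidence signal is informative. In the near-match case the reported confidence is high, so the gate would not trigger, which is exactly the failure the paragraph above describes.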
The most surprising turn comes from an adjacent corner of the corpus that inverts the question entirely. In social recommendation, methods that draw on friends' *different* tastes outperform methods that pull friends' representations toward similarity: networks add value precisely by surfacing anomalous, off-pattern choices, not by reinforcing what already looks alike (see "Can friends with different tastes improve recommendations?"). And modeling a user as several distinct personas, weighted by their relevance to the item at hand, beats treating them as one monolithic taste (see "Can modeling multiple user personas improve recommendation accuracy?"). Both point in the same direction: similarity is not the safe default it feels like. Leaning into near-matches concentrates error; deliberately admitting difference is often the more accurate move.
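The multi-persona idea can be illustrated with a small sketch in which each user holds several persona vectors and the candidate item decides how much each one counts. This is a simplification under assumptions, not the AMP-CF implementation: dot-product attention, the softmax weighting, and the names `persona_weights` and `score` are choices made for this example.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def persona_weights(personas: List[List[float]], item: List[float]) -> List[float]:
    """Attention of the candidate item over the user's personas (assumed dot-product)."""
    return softmax([dot(p, item) for p in personas])


def score(personas: List[List[float]], item: List[float]) -> float:
    """Score an item against a user vector rebuilt for this specific candidate."""
    weights = persona_weights(personas, item)
    user = [sum(w * p[i] for w, p in zip(weights, personas)) for i in range(len(item))]
    return dot(user, item)


# Toy user with two unrelated tastes, and an item that matches only the first.
personas = [[1.0, 0.0], [0.0, 1.0]]
item = [1.0, 0.0]
weights = persona_weights(personas, item)
conditional = score(personas, item)

# Monolithic baseline: one averaged taste vector, identical for every item.
averaged = [sum(p[i] for p in personas) / len(personas) for i in range(len(item))]
monolithic = dot(averaged, item)
```

In this toy case the item-conditioned score is higher than the averaged one because the matching persona receives most of the weight, and the weights themselves indicate which persona drove the recommendation.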
There's a darker echo, too. Personalized reward models lose the averaging effect of aggregate models, which lets them amplify sycophancy and harden echo chambers (see "Does personalizing reward models amplify user echo chambers?"). The same instinct that produces the uncanny-valley error, collapsing onto the nearest familiar pattern and then committing hard, is what turns personalization into a feedback loop. The thread running through all of it: confident similarity, not obvious difference, is where personalization quietly goes wrong.
Sources (7 notes)
PRIME shows a U-shaped error curve where most-similar profile replacements cause steepest performance drops. The model confidently applies wrong preferences when profiles are nearly but not truly matched, an uncanny valley effect more harmful than obvious mismatch.
Research shows that user profiles built from outputs alone match or exceed performance of complete profiles across multiple tasks, while input-only profiles degrade performance. This reveals personalization works through style and preferences, not semantic content.
PRIME framework shows semantic memory (preference summaries, parametric encodings) consistently beats episodic memory (retrieved past interactions) across models. Recency-based recall outperforms similarity-based retrieval, and task fine-tuning exceeds preference tuning methods.
Sparse persona information lacks predictive power for specific preferences, causing LLM judges to fail. Verbal uncertainty estimation recovers reliability above 80% on high-certainty samples by allowing abstention rather than forced judgment.
Social Poisson Factorization uses friends' diverse tastes to recommend items outside users' usual preferences, outperforming methods that pull friends' representations together. Networks add value through influence on anomalous choices, not taste similarity.
AMP-CF separates user representation into latent personas weighted by attention to the candidate item. This candidate-conditional approach improves accuracy by adapting the user representation at prediction time and produces inherent explanations for why items were recommended.
Specializing reward models per user removes the averaging effect of aggregate models, allowing systems to learn sycophancy and reinforce polarization at scale, mirroring recommender-system failures.