SYNTHESIS NOTE
Recommender Systems

Can neural networks explore efficiently at recommendation scale?

Exploration—discovering unknown user preferences—normally requires expensive posterior uncertainty estimates. Can a neural architecture make Thompson sampling practical for real-world recommenders without prohibitive computational cost?

Synthesis note · 2026-05-03 · sourced from Recommenders Architectures

Supervised neural networks form the backbone of most recommenders, but they only exploit user interests that are already known. Discovering unknown preferences requires exploration, and the standard exploration framework (contextual bandits with Thompson sampling) needs posterior uncertainty estimates, which are computationally prohibitive for large neural networks at recommendation scale.

Zhu et al. propose the Epistemic Neural Recommendation (ENR) architecture, an epistemic neural network designed to enable Thompson sampling at scale. Epistemic neural networks separate aleatoric uncertainty (irreducible noise in outputs) from epistemic uncertainty (uncertainty about the model's parameters, which shrinks as data accumulates). The latter is what Thompson sampling needs. Each round: sample a parameter setting from the posterior, choose the action that is best under that setting, observe the outcome, and update the posterior.
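
The sample, act, observe, update loop can be sketched on a toy problem. The example below is a minimal illustration, not the ENR method: it assumes a handful of items with fixed click probabilities and uses an exact Beta-Bernoulli posterior, which is only tractable because the model is this small. The function name `thompson_sampling` and the `true_ctr` values are hypothetical.

```python
import random


def thompson_sampling(true_ctr, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling over a fixed set of items.

    true_ctr: hypothetical per-item click probabilities, hidden from the agent
              and used only to simulate user feedback.
    Returns (alpha, beta, pulls): posterior parameters and per-item pull counts.
    """
    rng = random.Random(seed)
    n_items = len(true_ctr)
    alpha = [1.0] * n_items  # Beta(1, 1) prior: 1 + clicks observed so far
    beta = [1.0] * n_items   # 1 + non-clicks observed so far
    pulls = [0] * n_items

    for _ in range(n_rounds):
        # 1. Sample one plausible click rate per item from its posterior.
        sampled = [rng.betavariate(alpha[i], beta[i]) for i in range(n_items)]
        # 2. Act greedily with respect to the sampled parameters.
        item = max(range(n_items), key=lambda i: sampled[i])
        # 3. Observe a (simulated) click or non-click.
        click = 1 if rng.random() < true_ctr[item] else 0
        # 4. Update the posterior of the recommended item only.
        alpha[item] += click
        beta[item] += 1 - click
        pulls[item] += 1

    return alpha, beta, pulls


if __name__ == "__main__":
    alpha, beta, pulls = thompson_sampling([0.02, 0.05, 0.20], n_rounds=2000)
    print("pulls per item:", pulls)
```

Items with wide posteriors occasionally produce high samples and get tried, so exploration falls out of the sampling step rather than being bolted on. The difficulty the note describes is step 1: for a large neural network there is no closed-form posterior to sample from.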

Empirically, Zhu et al. report that ENR improves click-through rate by at least 9% and user ratings by at least 6% over state-of-the-art neural contextual bandit algorithms, and that it reaches equivalent performance with at least 29% fewer user interactions than the best-performing baseline. Computationally, it demands orders of magnitude fewer resources than the other neural contextual bandit baselines, which moves Thompson-sampling-based exploration from research-only to production-feasible.

The general principle: when a Bayesian technique seems too expensive at scale, ask whether the expensive part is genuinely necessary or whether a structural approximation captures what's needed. Epistemic networks make a focused commitment to estimating only the parameter uncertainty Thompson sampling actually uses, dropping the rest. The architectural simplification is what unlocks scale.
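
One way to picture such a structural approximation is a scorer that takes an extra random "epistemic index" as input: drawing a fresh index per decision stands in for drawing a parameter sample from the posterior. The sketch below is an assumption-laden toy, not the ENR architecture. It uses an additive form (a base score plus an index-dependent perturbation), random untrained weights, and hypothetical names (`ToyEpistemicScorer`, `recommend`).

```python
import random


class ToyEpistemicScorer:
    """Hypothetical scorer f(x, z) = base(x) + perturbation(x, z).

    x is an item feature vector, z is an epistemic index drawn from N(0, I).
    Weights are random here; a real model would train them so that the spread
    of scores across z reflects how uncertain the model is about x.
    """

    def __init__(self, n_features, index_dim, seed=0):
        rng = random.Random(seed)
        self.index_dim = index_dim
        self.base_w = [rng.gauss(0, 1) for _ in range(n_features)]
        # One weight vector per index dimension.
        self.epi_w = [[rng.gauss(0, 0.5) for _ in range(n_features)]
                      for _ in range(index_dim)]

    def sample_index(self, rng):
        """Draw an epistemic index z ~ N(0, I)."""
        return [rng.gauss(0, 1) for _ in range(self.index_dim)]

    def score(self, x, z):
        """Deterministic given (x, z); varying z varies the prediction."""
        base = sum(w * xi for w, xi in zip(self.base_w, x))
        epi = sum(zk * sum(w * xi for w, xi in zip(wk, x))
                  for zk, wk in zip(z, self.epi_w))
        return base + epi


def recommend(scorer, candidates, rng):
    """Thompson-style choice: one index per decision, then act greedily."""
    z = scorer.sample_index(rng)
    return max(range(len(candidates)),
               key=lambda i: scorer.score(candidates[i], z))


if __name__ == "__main__":
    scorer = ToyEpistemicScorer(n_features=3, index_dim=4)
    items = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print("recommended item:", recommend(scorer, items, random.Random(1)))
```

The point of the design is cost: each decision needs one index draw and one forward pass per candidate, rather than a full posterior over all network weights.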

Original note title

scalable neural contextual bandits enable sample-efficient exploration via epistemic neural networks supporting Thompson sampling at scale