Can users steer recommendations with natural language at inference?
Can recommendation systems let users specify their preferences in natural language at inference time, without retraining? This matters because it would let both new and existing users adjust what they see on the fly.
Sequential recommenders predict a user's next interaction from their history. Recent work uses LLMs to extract preferences from reviews and feed them in as auxiliary supervision during training, but this approach can't be steered at inference: the preferences are baked into the model weights, so serving a new user well requires fine-tuning.
Preference discerning is a different paradigm. Instead of training the model to embody preferences, it conditions the generative recommender on user preferences as text in the model's context window at inference time. An LLM extracts preferences from user reviews and item-specific data, producing a textual description of what the user wants. This text is fed into the sequential recommender as in-context conditioning, alongside the interaction history.
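The conditioning step described above can be sketched as plain string assembly. This is a minimal illustration, not Mender's actual input format: the `UserContext` type, the `[PREFERENCES]`/`[HISTORY]`/`[NEXT]` tags, and the item identifiers are all hypothetical, and the preference strings stand in for whatever an LLM would extract from reviews.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserContext:
    preferences: List[str]  # natural-language preferences, e.g. produced by an LLM
    history: List[str]      # item identifiers, oldest first


def build_conditioned_input(ctx: UserContext, max_history: int = 20) -> str:
    """Serialize preferences and recent history into one conditioning string.

    The tags and layout are hypothetical; a real system would use whatever
    input format its recommender was trained on.
    """
    pref_block = "\n".join(f"- {p}" for p in ctx.preferences) or "- (none stated)"
    # Guard against max_history == 0, since history[-0:] would return everything.
    recent = ctx.history[-max_history:] if max_history > 0 else []
    hist_block = " ".join(recent)
    return f"[PREFERENCES]\n{pref_block}\n[HISTORY]\n{hist_block}\n[NEXT]"


ctx = UserContext(
    preferences=["prefers fast-paced action films", "avoids romance"],
    history=["item_12", "item_7", "item_33"],
)
prompt = build_conditioned_input(ctx)
print(prompt)
```

Because the preferences enter only through this string, changing them means editing text rather than updating weights.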
The architectural shift unlocks several capabilities. Users can say in natural language what they want or want to avoid ("more action, less romance"). New users can be served without retraining by computing their preferences from minimal data and injecting them into context. The system can also be evaluated on preference-following capability, not just next-item prediction: Mender's benchmark covers preference-based recommendation, sentiment following, fine-grained steering, coarse-grained steering, and history consolidation. State-of-the-art sequential recommenders fail on several of these axes because they have no mechanism for incorporating preferences they were not trained on; Mender succeeds because preferences are a runtime input, not a training target.
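To make "runtime input, not a training target" concrete, here is a toy sketch in which the same fixed history scores produce different rankings depending on the steering text. It is deliberately not a generative recommender: a keyword-weight adjustment stands in for the model, and the catalogue, the scores, and the `parse_steering` and `rank` helpers are all invented for illustration.

```python
import re
from typing import Dict, List

# Hypothetical catalogue: item -> genre tags.
CATALOGUE: Dict[str, List[str]] = {
    "Heat Run": ["action"],
    "Paris Letters": ["romance"],
    "Love Under Fire": ["action", "romance"],
    "Quiet Harbor": ["drama"],
}


def parse_steering(text: str) -> Dict[str, float]:
    """Turn 'more X, less Y' into per-tag weights (+1 for more, -1 for less)."""
    weights: Dict[str, float] = {}
    for direction, tag in re.findall(r"\b(more|less)\s+(\w+)", text.lower()):
        weights[tag] = 1.0 if direction == "more" else -1.0
    return weights


def rank(history_scores: Dict[str, float], steering_text: str) -> List[str]:
    """Rank items by a fixed history score plus a runtime preference adjustment."""
    weights = parse_steering(steering_text)

    def score(item: str) -> float:
        adjustment = sum(weights.get(tag, 0.0) for tag in CATALOGUE[item])
        return history_scores[item] + adjustment

    return sorted(CATALOGUE, key=score, reverse=True)


# Fixed scores standing in for what a trained model infers from history alone.
history_scores = {
    "Heat Run": 0.2,
    "Paris Letters": 0.9,
    "Love Under Fire": 0.5,
    "Quiet Harbor": 0.4,
}

print(rank(history_scores, ""))                           # history only
print(rank(history_scores, "more action, less romance"))  # steered at inference
```

Nothing in `history_scores` changes between the two calls; only the text does. With no steering the romance title ranks first, and with the steering text it drops to last while the pure action title moves to the top.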
The general lesson: making something a context input rather than a parameter target trades efficiency (longer prompts) for flexibility (runtime steering). For tasks where users know better than the training set what they want, the trade is worth it.
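The efficiency side of that trade can be illustrated with a rough budget calculation: every token spent on preference text is one fewer available for interaction history. The sketch below assumes a fixed context budget, counts tokens by whitespace splitting, and treats each history item as one token. All three are simplifying assumptions, and `fit_to_budget` is a hypothetical helper, not part of any published system.

```python
from typing import List, Tuple


def fit_to_budget(preferences: str, history: List[str], budget: int) -> Tuple[str, List[str]]:
    """Keep the preference text whole and drop the oldest history items until
    the combined length fits within `budget` tokens.

    Token counts are approximated by whitespace splitting (an assumption);
    a real system would count with its model's tokenizer.
    """
    pref_tokens = len(preferences.split())
    remaining = max(budget - pref_tokens, 0)
    kept = history[-remaining:] if remaining > 0 else []
    return preferences, kept


history = [f"item_{i}" for i in range(10)]
prefs, kept = fit_to_budget("more action less romance", history, budget=8)
print(len(prefs.split()), "preference tokens;", len(kept), "history items kept:", kept)
```

Here four preference tokens leave room for only the four most recent of ten history items. Whether to truncate history or compress the preference text instead is a design choice this sketch does not settle.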
Inquiring lines that use this note as a source
This note is a source for these synthesized inquiries.
- Can users articulate what they want before AI helps them discover it?
- How can a single policy handle both asking preferences and recommending items?
- How much context length can sequential recommenders handle before steering degrades?
- Why do too-dynamic recommendations confuse users during active sessions?
- Can users modify their preference summaries to steer model behavior?
Related concepts in this collection
- Can language models bridge the gap between critique and preference?
  When users express what they dislike rather than what they want, can LLMs reliably transform those critiques into positive preferences that retrieval systems can actually use?
  complements: both let users steer recommendations via natural language at inference; preference discerning starts from positive preferences while critiques start from negative ones
- Can user preferences be learned from just ten questions?
  Explores whether adaptive question selection can efficiently infer user-specific reward coefficients without historical data or fine-tuning. This matters for scaling personalization without per-user model updates.
  complements: PReF and Mender both achieve inference-time alignment without fine-tuning, PReF via reward factorization and Mender via NL conditioning
- Can text summaries beat embeddings for personalized reward models?
  When training reward models on diverse user preferences, does conditioning on learned text-based summaries of user preferences outperform embedding vectors? This matters because better representations could make personalization more interpretable and portable.
  extends: text-based preference conditioning beats embedding conditioning at the reward-model level too, the same insight applied to alignment
- Can conversational recommenders recover lost preference signals from history?
  Conversational recommenders abandoned item and user similarity signals when they shifted to dialogue-focused design. Can integrating historical sessions and look-alike users restore these channels without losing dialogue benefits?
  complements: NL preferences from reviews are a fourth preference channel; text-distilled preferences abstract over individual interactions
Related papers in this collection
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Preference Discerning with LLM-Enhanced Generative Retrieval
- CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation
- Large Language Models for User Interest Journeys
- User-Centric Conversational Recommendation with Multi-Aspect User Modeling
- A Multi-facet Paradigm to Bridge Large Language Model and Recommendation
- Large Language Models are Zero-Shot Rankers for Recommender Systems
- Leveraging Large Language Models in Conversational Recommender Systems
- Revisiting Prompt Engineering: A Comprehensive Evaluation for LLM-based Personalized Recommendation
Original note title
preference discerning conditions sequential recommenders on natural-language preferences in context — letting users steer at inference without fine-tuning