How much user interaction data is needed for effective AI personalization?
The question asks 'how much,' but the corpus mostly answers a sharper one: effective personalization depends far less on the volume of interaction data than on which signals you keep and how you compress them.
On how much user interaction data AI personalization actually requires, the collection's most useful move is to reframe the question, because several lines of work suggest the answer is 'surprisingly little, if you use the right data.' The most direct datapoint: a reward-factorization approach can pin down a user's personalized preferences from about ten well-chosen adaptive questions, by learning shared base reward functions first and then asking only the questions that most reduce uncertainty (see: Can user preferences be learned from just ten questions?). The lever there is informativeness, not quantity: active questioning beats passive accumulation.
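The "ask only the questions that most reduce uncertainty" idea can be illustrated with a minimal sketch. It is not the cited method's actual procedure: it assumes a user's reward is a linear mix of shared base rewards, models answers with a Bradley-Terry style logistic choice, and swaps in a simple particle-based Bayesian update with an expected-information-gain criterion. All names (`pick_question`, `update`, `run_demo`), the three base rewards, and the hidden coefficients are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(ws):
    total = sum(ws)
    return [w / total for w in ws]

def prob_prefers_first(coeffs, diff):
    # Assumed choice model: P(user prefers A over B), where
    # diff = base_rewards(A) - base_rewards(B) and coeffs are the user's mix.
    return sigmoid(dot(coeffs, diff))

def binary_entropy(p):
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def expected_info_gain(particles, weights, diff):
    # Mutual information between the answer and the coefficient hypotheses:
    # entropy of the averaged prediction minus the average per-hypothesis entropy.
    probs = [prob_prefers_first(c, diff) for c in particles]
    mean_p = sum(w * p for w, p in zip(weights, probs))
    avg_h = sum(w * binary_entropy(p) for w, p in zip(weights, probs))
    return binary_entropy(mean_p) - avg_h

def pick_question(particles, weights, questions):
    # Index of the question the current hypotheses disagree about most.
    return max(range(len(questions)),
               key=lambda i: expected_info_gain(particles, weights, questions[i]))

def update(particles, weights, diff, prefers_first):
    # Bayesian reweighting of each coefficient hypothesis after one answer.
    new = []
    for c, w in zip(particles, weights):
        p = prob_prefers_first(c, diff)
        new.append(w * (p if prefers_first else 1.0 - p))
    return normalize(new)

def posterior_mean(particles, weights):
    dim = len(particles[0])
    return [sum(w * c[k] for c, w in zip(particles, weights)) for k in range(dim)]

def run_demo(num_questions=10, seed=0):
    rng = random.Random(seed)
    dim = 3                          # number of shared base rewards (assumed)
    true_coeffs = [1.5, -0.5, 0.8]   # hypothetical hidden user
    particles = [[rng.uniform(-2, 2) for _ in range(dim)] for _ in range(500)]
    weights = normalize([1.0] * len(particles))
    pool = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(200)]
    for _ in range(num_questions):
        diff = pool.pop(pick_question(particles, weights, pool))
        answer = rng.random() < prob_prefers_first(true_coeffs, diff)  # simulated user
        weights = update(particles, weights, diff, answer)
    return true_coeffs, posterior_mean(particles, weights), weights
```

Calling `run_demo()` simulates ten adaptive questions against the hypothetical user and returns the hidden coefficients, the estimated ones, and the final hypothesis weights. The design point is that a question every hypothesis answers the same way scores zero information gain, so it is never asked.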
A second thread argues that even pre-collected data may be optional. A curiosity-reward method personalizes in real time by rewarding the agent for reducing its uncertainty about who it's talking to mid-conversation, so the interaction itself becomes the data source and no profile is required up front (see: Can conversations themselves personalize without user profiles?). Persona-based systems push the same idea: a structured persona can be refined at test time by simulating recent interactions against feedback, turning a handful of recent exchanges into a working model of the user (see: Can personas evolve in real time to match what users actually want?).
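One way to picture the curiosity reward described above is as an entropy-reduction bonus on the agent's belief about the user's type. The sketch below is an assumption-laden illustration, not the cited method's formulation: the belief is a small discrete distribution, the likelihoods are supplied by hand, and the names `bayes_update`, `curiosity_reward`, `turn_reward`, and the weight `beta` are hypothetical.

```python
import math

def entropy(belief):
    # Shannon entropy (in nats) of a belief mapping user type -> probability.
    return -sum(p * math.log(p) for p in belief.values() if p > 0)

def bayes_update(belief, likelihood):
    # likelihood maps user type -> P(observed user reply | type).
    unnorm = {t: belief[t] * likelihood.get(t, 0.0) for t in belief}
    total = sum(unnorm.values())
    if total == 0:
        return dict(belief)  # reply impossible under every type: keep the prior
    return {t: v / total for t, v in unnorm.items()}

def curiosity_reward(prior, posterior):
    # Intrinsic bonus: how much this turn reduced uncertainty about the user.
    return entropy(prior) - entropy(posterior)

def turn_reward(task_reward, prior, posterior, beta=0.5):
    # Total reward balances helpfulness with information gathering.
    return task_reward + beta * curiosity_reward(prior, posterior)

# Hypothetical tutoring turn: the reply is far more likely from a "visual" learner.
prior = {"visual": 0.5, "hands_on": 0.5}
posterior = bayes_update(prior, {"visual": 0.9, "hands_on": 0.1})
bonus = curiosity_reward(prior, posterior)
```

In this toy turn the belief shifts from an even split toward the "visual" type, so `bonus` is positive. A reply that is equally likely under both types leaves the belief unchanged and earns no bonus, which is what pushes the agent toward questions that actually discriminate between user types.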
Where the corpus gets genuinely counterintuitive is on what kind of data earns its keep. Abstracted preference summaries consistently outperform retrieving piles of specific past interactions: semantic memory beats episodic recall, which means compressing history into 'what this person tends to prefer' is more valuable than hoarding the raw log (see: Does abstract preference knowledge outperform specific interaction recall?). Relatedly, profiles built only from a user's past outputs match or beat full profiles, while input-only profiles actually hurt; personalization runs on style and preference, not on the semantic content of every query (see: Do user outputs outperform inputs for LLM personalization?). So 'more data' can be worse than a smaller, better-curated slice.
There's also a quality-of-signal dimension the volume framing misses entirely. Behavioral cues like gaze, hesitation, and typing speed can be read as a continuous signal of cognitive state, meaning a thin but rich real-time stream may carry more personalization value than a thick archive of clicks (see: Can AI systems read cognitive state from interaction patterns alone?). And at the discovery end, language models can mine activity logs to surface month-long 'interest journeys' that collaborative filtering misses, extracting durable intent from existing data rather than demanding new data (see: Can language models discover what users actually want from activity logs?).
The takeaway: across these papers the binding constraint on personalization isn't data scarcity, it's data selection and abstraction. Ten targeted questions, a user's outputs alone, a compressed preference summary, or even the live texture of a single conversation can outperform exhaustive logging. That flips the usual 'collect everything' instinct on its head, and incidentally lightens the privacy footprint at the same time.
Sources (7 notes)
PReF learns base reward functions from preference data, then uses active learning to select maximally informative questions that reduce coefficient uncertainty. Responses can then be personalized to a user through inference-time reward alignment, without modifying model weights.
Adding an intrinsic motivation reward for reducing uncertainty about user type during conversation enables personalization without pre-collected profiles. Tested in education and fitness domains with 20 user attributes, the approach balances helpfulness with strategic information gathering.
PersonaAgent uses structured personas to bridge episodic/semantic memory and personalized actions, optimizing them at test time by simulating recent interactions against textual feedback. Learned personas cluster meaningfully in latent space, suggesting genuine user-specific separation beyond standard post-training drift.
The PRIME framework shows semantic memory (preference summaries, parametric encodings) consistently beats episodic memory (retrieved past interactions) across models. Recency-based recall outperforms similarity-based retrieval, and task fine-tuning outperforms preference-tuning methods.
Research shows that user profiles built from outputs alone match or exceed the performance of complete profiles across multiple tasks, while input-only profiles degrade performance. This suggests personalization works through style and preferences, not semantic content.
Research shows AI systems can instrument multimodal behavioral signals (gaze, hesitation, speed) to read cognitive state during interaction, preserving flow by avoiding disruptive explicit probes. However, the same substrate enables both helpful timing and manipulative profiling.
66% of users pursue valued interest journeys lasting over a month, described in specific phrases like 'designing hydroponic systems for small spaces.' LLM-powered journey discovery bridges the semantic gap that collaborative filtering cannot reach, operating at user-level granularity with persona-level precision.