User-LLM: Efficient LLM Contextualization with User Embeddings

Paper · arXiv 2402.13598 · Published February 21, 2024

Large language models (LLMs) have revolutionized natural language processing. However, effectively incorporating complex and potentially noisy user interaction data remains a challenge. To address this, we propose USER-LLM, a novel framework that leverages user embeddings to contextualize LLMs. These embeddings, distilled from diverse user interactions using self-supervised pretraining, capture latent user preferences and their evolution over time. We integrate these user embeddings with LLMs through cross-attention and soft-prompting, enabling LLMs to dynamically adapt to user context. Our comprehensive experiments on MovieLens, Amazon Review, and Google Local Review datasets demonstrate significant performance gains across various tasks. Notably, our approach outperforms text-prompt-based contextualization on long sequence tasks and tasks that require deep user understanding while being computationally efficient. We further incorporate Perceiver layers to streamline the integration between user encoders and LLMs, reducing computational demands.
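To make the cross-attention idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single attention head, no learned query/key/value projections, and user embeddings that already share the dimensionality of the LLM hidden states; the names `cross_attend`, `token_states`, and `user_embeddings` are hypothetical. Each token representation queries the user embeddings and adds the resulting context vector back as a residual.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(token_states, user_embeddings):
    """Single-head cross-attention (illustrative, no learned projections).

    token_states:    list of d-dimensional vectors, used as queries.
    user_embeddings: list of d-dimensional vectors, used as keys and values.
    Returns one vector per token: the token state plus its attended context.
    """
    d = len(token_states[0])
    scale = 1.0 / math.sqrt(d)
    outputs = []
    for query in token_states:
        # Scaled dot-product score between this token and every user embedding.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in user_embeddings
        ]
        weights = softmax(scores)
        # Weighted sum of the user embeddings (values).
        context = [
            sum(w * value[j] for w, value in zip(weights, user_embeddings))
            for j in range(d)
        ]
        # Residual connection: the token state is adjusted, not replaced.
        outputs.append([q + c for q, c in zip(query, context)])
    return outputs
```

In this sketch the output has one vector per token no matter how many user embeddings are attended to, so the user history does not lengthen the token sequence the LLM processes.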

Introduction. Large language models (LLMs) have revolutionized the field of natural language processing (NLP) (Brown et al., 2020; Chowdhery et al., 2023; OpenAI, 2023; Touvron et al., 2023; Anil et al., 2023; Google, 2023). With their ability to learn and adapt from massive amounts of textual data, LLMs offer significant opportunities for user modeling and personalization. By analyzing user interactions and understanding user preferences, LLMs can be leveraged to power recommendations (Liu et al., 2023b; Lyu et al., 2023; Ji et al., 2023), language generation, summarization (Liu et al., 2023d; Basyal & Sanghvi, 2023), and question answering. User interactions represent a rich source of behavioral data generated from a user’s engagement with digital systems. Spanning a wide range from textual input, search queries, media consumption (e.g., videos watched or rated), to social media activities, navigation patterns, location visits, and more, these interactions hold valuable insights for user modeling.

Discussion / Conclusion. In this paper, we introduced USER-LLM, a framework for contextualizing LLMs through user embeddings. These embeddings, derived from self-supervised pretraining on diverse user interactions, capture hidden user preferences and their evolution. By integrating these embeddings with LLMs through cross-attention and soft-prompting, USER-LLM empowers LLMs to adjust dynamically to user contexts. Our comprehensive evaluation across MovieLens, Amazon Review, and Google Local Review datasets demonstrated significant performance improvements in various tasks. USER-LLM showed competitive performance compared with non-LLM baselines and text-prompt-based LLM personalization techniques, particularly in handling long sequences and understanding users deeply. USER-LLM’s computational efficiency and ability to preserve LLM knowledge further make it a highly suitable approach for real-world user understanding applications.
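The soft-prompting route mentioned above can be illustrated with a short sketch. This is an assumption-laden illustration rather than the paper's code: it supposes a single linear projection (the hypothetical `weight` matrix) maps each user embedding into the LLM's token-embedding space, and that the projected vectors are simply prepended to the token embeddings. The function names `project` and `build_soft_prompt_input` are hypothetical.

```python
def project(vector, weight):
    """Apply a linear map; `weight` has one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]


def build_soft_prompt_input(user_embeddings, token_embeddings, weight):
    """Prepend projected user embeddings to the token embeddings.

    user_embeddings:  list of vectors in the user-encoder space.
    token_embeddings: list of vectors in the LLM embedding space.
    weight:           hypothetical projection matrix between the two spaces.
    """
    soft_prompt = [project(u, weight) for u in user_embeddings]
    # The LLM would consume this combined sequence as its input embeddings.
    return soft_prompt + token_embeddings


# Usage: one 2-d user embedding projected into a 3-d LLM embedding space.
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
combined = build_soft_prompt_input(
    [[1.0, 2.0]], [[0.0, 0.0, 0.0], [9.0, 9.0, 9.0]], weight
)
```

Unlike cross-attention, this variant lengthens the input by one position per user embedding, which is one reason a compression step between the user encoder and the LLM can reduce cost on long histories.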