Do LLM explanations faithfully describe their recommendation process?
When LLMs recommend items to groups, do their explanations match how they actually made the choice? This matters because users trust explanations to understand AI decision-making.
When LLMs are asked to make group recommendations from individual member preferences, their outputs converge on Additive Utilitarian (ADD) aggregation: picking the items with the highest sum of all members' ratings. ADD is a consensus-based strategy from social choice theory. The behavior is consistent across uniform and divergent group structures.
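To make the aggregation rule concrete, here is a minimal sketch of Additive Utilitarian selection. The member names, item labels, ratings, and the function name `additive_utilitarian` are all hypothetical and chosen for illustration; they are not taken from the underlying study.

```python
from typing import Dict, List

# Hypothetical ratings: member -> {item -> rating on a 1-5 scale}.
ratings: Dict[str, Dict[str, int]] = {
    "alice": {"A": 5, "B": 3, "C": 1},
    "bob":   {"A": 2, "B": 4, "C": 5},
    "carol": {"A": 4, "B": 5, "C": 2},
}


def additive_utilitarian(ratings: Dict[str, Dict[str, int]], k: int = 1) -> List[str]:
    """Return the k items with the highest summed rating across members."""
    totals: Dict[str, int] = {}
    for member_ratings in ratings.values():
        for item, rating in member_ratings.items():
            totals[item] = totals.get(item, 0) + rating
    # Sort by descending total; break ties alphabetically (an arbitrary choice).
    ranked = sorted(totals, key=lambda item: (-totals[item], item))
    return ranked[:k]


# Totals here are A=11, B=12, C=8, so B ranks first and A second.
top_two = additive_utilitarian(ratings, k=2)
print(top_two)
```

The tie-breaking rule is an assumption of this sketch; the note does not say how ties are resolved.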
The disconnect is in the explanations. Asked to explain its recommendation procedure to a layperson, the LLM doesn't say "I summed the ratings". Instead it cites averaging (similar but not identical to Additive Utilitarian aggregation), user or item similarity, diversity, undefined popularity metrics, and ad-hoc thresholds. Different LLMs invent different procedures: Llama tends to cite user similarity, while Mistral and Phi cite diversity in the recommendation list. These claimed procedures don't match the behavioral output.
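One way averaging and summing can come apart is when some members leave items unrated. That scenario is an assumption made for illustration here, not something the note states. With complete ratings the two rules produce the same ranking, because the average is just the sum divided by a constant group size. The data and helper names below are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical ratings with a gap: only alice rated item "D".
ratings: Dict[str, Dict[str, int]] = {
    "alice": {"A": 4, "D": 5},
    "bob":   {"A": 4},
    "carol": {"A": 4},
}


def collect(ratings: Dict[str, Dict[str, int]]) -> Dict[str, List[int]]:
    """Group the available ratings by item."""
    per_item: Dict[str, List[int]] = {}
    for member_ratings in ratings.values():
        for item, rating in member_ratings.items():
            per_item.setdefault(item, []).append(rating)
    return per_item


def top_by(ratings: Dict[str, Dict[str, int]], score: Callable[[List[int]], float]) -> str:
    """Return the item that maximizes score(ratings for that item)."""
    per_item = collect(ratings)
    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(per_item), key=lambda item: score(per_item[item]))


add_choice = top_by(ratings, sum)   # A: 12 vs D: 5
avg_choice = top_by(ratings, mean)  # A: 4  vs D: 5
print(add_choice, avg_choice)
```

In this toy case the sum favors the item everyone rated, while the average favors the item with a single high rating, so an explanation that says "I averaged" describes a different procedure from one that sums.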
This makes LLM explainers unreliable narrators. They generate recommendations one way and explain them another, and the explanation is plausible enough that a user accepts it. As the item set grows, explanations mention similarity and diversity more often, which suggests the LLM leans harder on post-hoc justification when more items make the choice less defensible, while mentions of "undefined popularity" decrease. The implication for group recommender systems built on LLMs: the explanation layer cannot be trusted to faithfully describe what the model did, even though that is its stated purpose.
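A rough sketch of how the gap between behavior and explanation could be checked: compare the model's recommended list against what each candidate strategy would have produced, then see whether the best-matching strategy is the one the explanation claims. Everything here is hypothetical, including the strategy names, the lists, and the `overlap_at_k` measure; it is not the evaluation procedure used in the underlying study.

```python
from typing import Dict, List


def overlap_at_k(recommended: List[str], reference: List[str]) -> float:
    """Fraction of recommended items that also appear in the reference list."""
    if not recommended:
        return 0.0
    return len(set(recommended) & set(reference)) / len(recommended)


# Hypothetical top-2 lists each candidate strategy would produce.
strategy_outputs: Dict[str, List[str]] = {
    "additive_utilitarian": ["B", "A"],
    "most_diverse": ["C", "A"],
}

llm_recommendation = ["A", "B"]    # hypothetical model output
claimed_strategy = "most_diverse"  # hypothetical label read off the explanation

scores = {
    name: overlap_at_k(llm_recommendation, items)
    for name, items in strategy_outputs.items()
}
best_match = max(scores, key=scores.get)

# The explanation counts as faithful here only if it names the strategy
# that best reproduces the observed recommendation.
is_faithful = best_match == claimed_strategy
print(best_match, is_faithful)
```

Set overlap ignores ranking order, which keeps the sketch short; a rank-aware measure would be a reasonable alternative.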
Inquiring lines that use this note as a source (5)
This note is a source for these synthesized inquiries.
- Why do LLM explanations cite similarity and diversity more as options increase?
- How does explanation fluency mislead users about actual recommendation procedures?
- Can alignment techniques make LLM explainers match their recommendation behavior?
- Why do LLM stories over-explain themes and favor single-track plots?
- Can LLMs recommend items without seeing the product catalog?
Related concepts in this collection (4)
- Do AI-assisted outputs fool users about their own skills?
When people use AI tools to produce high-quality work, do they mistakenly believe they personally possess the skills that generated it? This matters because such misattribution could mask genuine skill loss and prevent corrective action.
complements: same trust-failure pattern — users (or LLMs themselves) describe a process that does not match the actual procedure used
- Does processing ease mislead users about their own competence?
When AI generates polished output, do users mistake the fluency of that output as evidence of their own understanding or skill? This matters because it could systematically inflate self-assessment across millions of AI interactions.
complements: explainer narrators are convincing because of fluency, not faithfulness — the unreliable explanation is fluently produced
- Can LLMs explain recommenders by mimicking their internal states?
Can training language models to align with both a recommender's outputs and its internal embeddings produce explanations that are both faithful and human-readable? This explores whether dual-access interpretation solves the fundamental tension between behavioral accuracy and interpretability.
tension with: RecExplainer tries to align LLM-explainer behavior with the underlying model — exactly the alignment that LLM-as-explainer fails by default
- Does validating AI output make models more defensive?
When professionals fact-check and push back on GPT-4 reasoning, does the model respond by disclosing limits or by intensifying persuasion? A BCG study of 70+ consultants explores this counterintuitive dynamic.
complements: same structural-honesty failure — LLM produces post-hoc justifications rather than disclosing actual mechanism
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Consistent Explainers or Unreliable Narrators? Understanding LLM-generated Group Recommendations
- Large Language Models for User Interest Journeys
- Prompting Large Language Models for Recommender Systems: A Comprehensive Framework and Empirical Analysis
- Enabling Explainable Recommendation in E-commerce with LLM-powered Product Knowledge Graph
- Exploring the Impact of Large Language Models on Recommender Systems: An Extensive Review
- Large Language Models are Zero-Shot Rankers for Recommender Systems
- Large Language Models as Conversational Movie Recommenders: A User Study
- Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy
Original note title
LLM group recommendations resemble additive utilitarian aggregation but explanations claim multiple criteria — explainers as unreliable narrators