SYNTHESIS NOTE
Psychology, Society, and Alignment · Conversational AI and Personalization · Language, Text, and Discourse

Why do LLM meeting summaries fail to help individuals?

Current LLM summarization treats all meeting participants the same, but organizational contexts require personalized recaps. What barriers prevent systems from learning what matters to each person?

Synthesis note · 2026-02-23 · sourced from Reading Summarizing

LLM-based dialogue summarization shows promise for meeting recap — but a user study with seven participants evaluating real work meetings reveals three specific failure modes that prevent organizational adoption.

The personal relevance gap. LLM recap summarizes what was globally important in the meeting, not what was personally relevant to each participant. A designer cares about the design decisions made. A project manager cares about timeline commitments. The same meeting requires different summaries for different participants, and current summarization has no model of what matters to whom. This is the personalization problem applied to collaborative settings. Following the related note "Do user outputs outperform inputs for LLM personalization?", the system would need to learn what each participant cares about from that participant's interaction history.
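
A minimal sketch of the gap described above, assuming each participant can be represented by a small set of interest keywords (for example, derived from their interaction history). The names `RecapItem`, `personal_relevance`, and `personalized_recap`, the keyword-overlap score, and the `weight` parameter are all hypothetical illustrations, not the method used in the study.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RecapItem:
    text: str
    global_importance: float  # assumed score from a generic summarizer, in [0, 1]


def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {w for w in words if w}


def personal_relevance(item: RecapItem, interests: set[str]) -> float:
    """Fraction of a participant's interest keywords that appear in the item."""
    if not interests:
        return 0.0
    return len(tokenize(item.text) & interests) / len(interests)


def personalized_recap(
    items: list[RecapItem],
    interests: set[str],
    weight: float = 0.7,  # hypothetical trade-off between personal and global score
    top_k: int = 2,
) -> list[RecapItem]:
    """Rank items by a blend of global importance and personal relevance."""

    def score(item: RecapItem) -> float:
        personal = personal_relevance(item, interests)
        return (1 - weight) * item.global_importance + weight * personal

    return sorted(items, key=score, reverse=True)[:top_k]
```

With `weight=0` the ranking collapses to the globally important items for everyone, which is the behavior the note criticizes. Raising `weight` lets a designer's and a project manager's recaps of the same meeting diverge.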

The mis-attribution problem. When the system attributes a statement to the wrong participant, the consequences extend beyond simple factual error. Mis-attributions are detrimental to group dynamics: they can create false impressions about who committed to what, who raised which concern, or who proposed which idea. In organizational settings where credit, accountability, and trust are at stake, getting attribution wrong damages the social fabric the meeting was meant to build. This parallels the related note "Does warmth training make language models less reliable?": errors in social contexts have consequences that accuracy metrics don't capture.
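
One way to make the attribution risk concrete is to check each attributed statement against the transcript before a recap is shown. The sketch below is only an illustration: it assumes the recap carries verbatim quotes and that the transcript is a list of `(speaker, utterance)` pairs. Real summaries paraphrase, so exact substring matching is a simplifying assumption, and the function names are hypothetical.

```python
from __future__ import annotations


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so matching ignores spacing and case."""
    return " ".join(text.lower().split())


def find_speakers(quote: str, transcript: list[tuple[str, str]]) -> set[str]:
    """Return every speaker whose utterance contains the quoted words."""
    needle = normalize(quote)
    return {
        speaker
        for speaker, utterance in transcript
        if needle in normalize(utterance)
    }


def check_attributions(
    claims: list[tuple[str, str]],
    transcript: list[tuple[str, str]],
) -> list[tuple[str, str, list[str]]]:
    """Flag claims whose named speaker did not say the quoted words.

    Each flagged entry is (claimed_speaker, quote, speakers_found_in_transcript).
    """
    flagged = []
    for claimed_speaker, quote in claims:
        actual = find_speakers(quote, transcript)
        if claimed_speaker not in actual:
            flagged.append((claimed_speaker, quote, sorted(actual)))
    return flagged
```

A flagged entry with an empty speaker list means the quote was not found at all. A non-empty list means the words were said, but by someone else, which is the case the note identifies as harmful to group dynamics.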

Context-dependent representation. Two distinct recap formats serve different needs: "highlights" (important moments, key decisions) for quick scanning and cognitive efficiency, and "hierarchical minutes" (structured, ordered, detailed) for reference and alignment. The rationale comes from cognitive science: perception and recall operate differently, and one format cannot serve both. In line with the related note "Do generated interfaces outperform text-based chat for most tasks?", the representation should adapt to context rather than defaulting to a single format.
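
The two formats can be sketched as two renderers over the same underlying meeting moments, selected by the reader's purpose. This is a hedged illustration: the moment fields (`topic`, `text`, `importance`), the `purpose` values `"scan"` and `"reference"`, and all function names are assumptions introduced here, not part of the study.

```python
from __future__ import annotations


def render_highlights(moments: list[dict], limit: int = 3) -> str:
    """Short list of the most important moments, for quick scanning."""
    top = sorted(moments, key=lambda m: m["importance"], reverse=True)[:limit]
    return "\n".join(f"* {m['text']}" for m in top)


def render_minutes(moments: list[dict]) -> str:
    """Numbered outline grouped by topic, in meeting order, for reference."""
    by_topic: dict[str, list[dict]] = {}
    for m in moments:  # dicts preserve insertion order, so topics stay in meeting order
        by_topic.setdefault(m["topic"], []).append(m)
    lines = []
    for i, (topic, entries) in enumerate(by_topic.items(), start=1):
        lines.append(f"{i}. {topic}")
        for j, m in enumerate(entries, start=1):
            lines.append(f"   {i}.{j} {m['text']}")
    return "\n".join(lines)


def render_recap(moments: list[dict], purpose: str) -> str:
    """Pick the representation from the reader's context instead of a fixed default."""
    if purpose == "scan":
        return render_highlights(moments)
    if purpose == "reference":
        return render_minutes(moments)
    raise ValueError(f"unknown purpose: {purpose!r}")
```

Keeping one shared list of moments and varying only the renderer means both formats stay consistent with each other while serving different reading contexts.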

The design implication: AI summarization in collaborative organizational settings must learn from natural interactions what matters to each participant. Pure content summarization — extracting "what happened" — is insufficient when the question is "what happened that matters to me."

Original note title

llm meeting summaries fail on personal relevance and speaker attribution — mis-attributions harm group dynamics in organizational settings