Why do language models engage with conversational distractors?
Explores why state-of-the-art LLMs struggle to maintain topical focus when users introduce off-topic turns, despite having explicit scope instructions. This gap suggests models lack training signals for ignoring irrelevant directions.
CantTalkAboutThis identifies a specific gap in instruction-tuning datasets: they teach models to perform tasks but not to resist topical diversion. When task-oriented chatbots are given a system prompt defining their scope, and users introduce distractor turns that steer the conversation off-topic, even GPT-4-Turbo and Mixtral-Instruct engage with the distractors rather than maintaining focus.
The dataset is notably small (1080 synthetic dialogues) yet fine-tuning on it significantly improves topic resilience. This suggests the capability is easy to acquire — the gap is not in model capacity but in the absence of training signal. No existing instruction-tuning dataset explicitly teaches "ignore this."
The three-step generation process is instructive:
- Generate topic-following prompts across diverse scenarios
- Create dialogues adhering to topical instructions (dialogue inpainting)
- Integrate distractors to test topic following
A limitation is that synthetic distractors tend to be off-topic but simplistic. Real-world distractors may be more subtle — tangentially related topics, emotionally charged redirections, or Socratic questioning that appears on-topic but steers elsewhere.
This connects to the broader passivity/alignment problem. Since Does preference optimization harm conversational understanding?, RLHF trains models to be helpful in each response — and engaging with a user's distractor turn is locally helpful (it addresses what the user said). The globally correct behavior (maintaining topic focus) requires overriding the local helpfulness signal. Topic-following is another case where turn-level optimization conflicts with session-level goals.
The distinction between following instructions about what TO DO vs. what NOT TO DO is underexplored. Models are good at "act as a customer service agent" but poor at "do not discuss topics outside this scope." Negative constraints may require different training signals than positive instructions.
Inquiring lines that use this note as a source 60
This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.
- How do users perceive attention from systems that lack continuous temporal presence?
- Can AI ever lead conversations without the anticipatory presence sustained attention provides?
- What happens when conversational design invites attention it cannot actually deliver?
- How does training data preserve communicative event structure without the actual events?
- Why do LLMs fabricate continuity when users shift conversational frames?
- Can LLMs propose pivots that change what counts as background context?
- How do readers selectively hold frame-related words in mind?
- What makes the frame problem distinct from feature-level shortcuts?
- Can curiosity-driven dialogue incrementally discover user interest journeys in real time?
- Can LLMs use implicit background knowledge the way humans do in ordinary conversation?
- Does transformer attention architecture fundamentally prevent topic-aware memory?
- Why do large language models follow user drift instead of maintaining topic focus?
- Why do conversational queries drift away from what triggered them?
- Why do large language models fail at taking conversational initiative?
- Can reranking candidate summaries improve perspective representation better than prompting?
- How should moderator LLMs decide which speakers to query per topic?
- How do discourse structure and dialogue state management relate to each other?
- Can transformer attention patterns actually prevent topic context loss in practice?
- Why do language models fail when users switch between and return to topics?
- What is the relationship between topic following and topic revisitation in conversation?
- Why do language models fail at pronouns across distant segments?
- Can AMR manipulation reveal where discourse coherence actually breaks down?
- Can explicit connectives compensate for missing intentional tracking in LLMs?
- What does attentional state look like in a static context window?
- How does removing a spurious cue change LLM performance?
- Why do Claude and Llama optimize for different dialogue outcomes?
- Why do next-speaker prediction baselines fail in group conversation settings?
- Can topic planning and response generation reduce dialogue turns?
- What happens to dialogue coherence when topic models use rigid stacks instead of flexible revisitation?
- Why do discourse failures cluster in attention and intentional layers rather than linguistics?
- Why do LLMs perform better on explicit discourse connectives than implicit relations?
- Why do LLMs struggle to update beliefs across multiple conversation turns?
- What prompting strategies most effectively boost long-context LLM performance on retrieval?
- How does distractor persona selection affect consistency enforcement in dialogue?
- Why does attention quality degrade as context length increases?
- Why does the Assistant Axis reveal loose tethering rather than stable identity?
- Does preference optimization actually erode conversational grounding in language models?
- Can language models recognize when to ignore off-topic information in conversations?
- What makes pronouns and demonstratives problematic in conversational retrieval systems?
- What dialogue content gaps remain after review augmentation?
- Can discourse-level structure and conversational-level organization work together?
- Why do conversational systems benefit from post-thinking between user turns?
- How should conversational recommender systems balance task focus with rapport building?
- Why do language models use twice as many words per conversation turn?
- How does preference optimization weaken conversational grounding in LLMs?
- Which conversation types most reliably cause models to drift from Assistant mode?
- How does RLHF alignment training reduce multi-turn conversational capability?
- What makes multi-session context tracking harder than single-turn underspecification problems?
- Why does single-turn Q&A framing not match real user deployment patterns?
- How does local helpfulness per turn conflict with maintaining session-level conversational goals?
- Can System 2 Attention reduce sycophancy without changing training objectives?
- Can attention mechanisms improve on Wide & Deep's static feature crosses?
- How does effort mismatch between user and model appear in conversation geometry?
- Why do models struggle with asking questions in multi-turn conversational reasoning tasks?
- How does treating conversation as a resource change what models learn to do?
- How do turn-level retrieval failures differ from dialogue-level accumulation failures?
- What update rules should govern dialogue-scoped versus turn-scoped memory?
- Why do untrained summarizers focus on topics rather than preference dimensions?
- Why do current large language models fail to entrain with users?
- What structural updates prevent context collapse in evolving conversations?
Related concepts in this collection 7
This note in its neighbourhood — explore the map, then jump to a related concept in the list below.
Click a node to walk · click center to open · click Open in graph to see this note in the full knowledge graph
-
Does preference optimization harm conversational understanding?
Exploring whether RLHF training that rewards confident, complete responses undermines the grounding acts—clarifications, checks, acknowledgments—that actually build shared understanding in dialogue.
engaging with distractors is locally helpful but globally harmful; same alignment tax mechanism
-
Why can't conversational AI agents take the initiative?
Explores whether current LLMs lack the structural ability to lead conversations, set goals, or anticipate user needs—and what architectural changes might enable proactive dialogue.
topic following requires goal awareness: the agent must maintain its own conversational goal against user pressure
-
Can models abandon correct beliefs under conversational pressure?
Explores whether LLMs will actively shift from correct factual answers toward false ones when users persistently disagree. Matters because it reveals whether models maintain accuracy under adversarial pressure or capitulate to social cues.
topic drift and belief drift share a mechanism: social pressure to accommodate the user
-
Does including all conversation history actually help retrieval?
Conversational search systems typically use all previous context to understand current queries. But do topic switches in multi-turn conversations inject noise that degrades performance rather than helps it?
complementary approaches to topic boundary management: topic-following resists diversion at generation time, selective history filters irrelevant context at retrieval time
-
Why do users drift away from their original information need?
When users know their knowledge is incomplete but cannot articulate what's missing, do they unintentionally shift topics? And can real-time systems detect this drift?
bilateral drift problem: users in ASK state drift unintentionally, and models with the topic-following gap follow them; neither party maintains the thread
-
Can models learn when NOT to speak in conversations?
Does training AI to explicitly predict silence—through a dedicated silent token—help models understand when intervention adds value versus when they should stay quiet? This matters for building conversational agents that feel naturally helpful rather than intrusive.
structurally parallel training gap: DiscussLLM trains when not to speak, topic-following trains when not to engage; both are "negative constraint" capabilities absent from standard instruction-tuning
-
Why do dialogue systems lose context when topics return?
Stack-based dialogue management removes topics after they're resolved, making it hard for systems to reference them later. Does this structural rigidity explain why conversational AI struggles with topic revisitation?
complementary aspects of topic structure: topic-following addresses resistance to LEAVING appropriate topics; topic management addresses RETURNING to previous topics; together they define the full problem space of conversational topic continuity
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues
- Intent Mismatch Causes LLMs to Get Lost in Multi-Turn Conversation
- Are LLMs All You Need for Task-Oriented Dialogue?
- DiscussLLM: Teaching Large Language Models When to Speak
- The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation
- LLMs Get Lost In Multi-Turn Conversation
- CollabLLM: From Passive Responders to Active Collaborators
- Proactive Conversational Agents in the Post-ChatGPT World
Original note title
topic-following is a crucial yet overlooked instruction-tuning gap — even SOTA LLMs engage with distractors when they should maintain focus