When should AI agents ask users instead of just searching?
Explores whether tool-enabled LLMs should probe users for clarification when uncertain, rather than silently chaining tool calls that drift from intent. Examines conversation analysis patterns as a formal alternative.
Tool-enabled LLMs have a structural problem: when they can't immediately answer a query, they chain tool calls (search, calculation, code execution) and each intermediate step is conditioned on the output of the previous step. The result is progressive divergence from the user's original intent. The more tools the model uses, the further it drifts.
Conversation Analysis (Schegloff, 2007) offers a formal alternative from human talk-in-interaction. When human speakers can't immediately provide the expected response, they don't silently think harder — they insert a new pair of utterances to bridge the gap. These "insert-expansions" serve three functions: clarifying intent ("Do you mean the downtown location?"), scoping responses ("Are you looking for something under $50?"), and enhancing appeal ("I should mention it also comes in blue").
The key move is the "user-as-a-tool" paradigm: instead of the model consulting external tools and accumulating drift, it consults the user. The user provides necessary details and refines their request. This replicates exactly the structure of human insert-expansions — post-first inserts recover from misunderstandings, pre-second inserts gather information needed to choose the right response.
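A minimal sketch of the user-as-a-tool idea, assuming a slot-filling recommendation setting. The names `InsertKind`, `InsertExpansion`, `ask_user`, `respond`, and the `reply_fn` callback are hypothetical illustrations, not an API from the source paper. The sketch only shows the pre-second case: the agent asks for a missing detail before producing its response instead of calling an external tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Tuple


class InsertKind(Enum):
    POST_FIRST = "post-first"  # recover from a misunderstanding of the user's turn
    PRE_SECOND = "pre-second"  # gather details needed to choose a response


@dataclass
class InsertExpansion:
    kind: InsertKind
    question: str


def ask_user(expansion: InsertExpansion, reply_fn: Callable[[str], str]) -> str:
    """User-as-a-tool: pose one question and return the user's answer."""
    return reply_fn(expansion.question)


def respond(
    request: str,
    known: Dict[str, str],
    required: List[str],
    reply_fn: Callable[[str], str],
) -> Tuple[str, List[InsertExpansion]]:
    """Fill missing details by asking the user, then produce the response."""
    details = dict(known)  # copy so the caller's dict is not mutated
    asked: List[InsertExpansion] = []
    for slot in required:
        if slot not in details:
            # Hypothetical question template; a real agent would generate this.
            expansion = InsertExpansion(
                InsertKind.PRE_SECOND, f"What {slot} do you have in mind?"
            )
            details[slot] = ask_user(expansion, reply_fn)
            asked.append(expansion)
    summary = ", ".join(f"{slot}={details[slot]}" for slot in required)
    return f"Recommendation for '{request}' ({summary})", asked
```

In use, `reply_fn` would wrap whatever channel reaches the user (a chat turn, a console prompt). Passing `input` works for a quick console trial; a scripted dictionary lookup works for testing.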
The empirical evidence from recommendation tasks shows benefits from this approach. But the deeper point is architectural. Conversational agents tend to stay passive (see "Why can't conversational AI agents take the initiative?"), and the insert-expansion framework gives a principled answer to WHEN agents should break that passivity: not by adding unsolicited content, but by asking structured questions when their internal processing would otherwise diverge.
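One way to make "would otherwise diverge" concrete is to check each planned tool step against the original request and hand control to the user at the first step that has drifted. The sketch below is an assumption-heavy illustration: the source does not specify a drift measure, and token-set (Jaccard) overlap is used here only as a crude stand-in for a real semantic similarity score. The function names and the `threshold` default are hypothetical.

```python
import re
from typing import List, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a: str, b: str) -> float:
    """Jaccard overlap between token sets; 0.0 when both are empty."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def first_divergent_step(
    request: str, steps: List[str], threshold: float = 0.2
) -> Optional[int]:
    """Index of the first step whose overlap with the request falls below
    the threshold, or None if no step does. The threshold is an assumed
    value, not one taken from the source."""
    for i, step in enumerate(steps):
        if overlap(request, step) < threshold:
            return i
    return None
```

An agent loop could call `first_divergent_step` before executing a planned chain and, when it returns an index, ask the user a clarifying question at that point rather than running the remaining steps.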
This connects to the distinction between formal and functional linguistic competence: LLMs have formal competence (handling language in itself) but lack functional competence (doing things WITH language: reasoning, using world knowledge, establishing common ground). Insert-expansions are a functional linguistic capability. The source paper argues that natural speech patterns may emerge as a side-effect of more closely imitated reasoning paths, that is, when agents reason through dialogue rather than through silent chains.
Insert-expansions are precisely the kind of conversational work that RLHF training discourages (see "Does preference optimization harm conversational understanding?"): they slow things down, ask questions instead of answering, and score lower on single-turn helpfulness ratings, despite being more effective for multi-turn interaction. Insert-expansions are the PRE-EMPTIVE half of the repair space. Third-position repair (TPR) provides the REACTIVE half, correcting a misunderstanding after it has already been acted on (see "Can AI systems detect and correct misunderstandings after responding?"). Together they cover the full repair lifecycle: insert-expansions prevent, TPR recovers.
The insert-expansion framework also connects to a trainable capability. RL training can raise proactive questioning from 0.15% to 73.98% accuracy (see "Can models learn to ask clarifying questions instead of guessing?"). That result shows a model can learn to detect missing information; the insert-expansion framework adds the conversation-analytic structure for WHEN and HOW to deploy that capability in dialogue.
Inquiring lines that use this note as a source (120)
This note is a source for these synthesized inquiries. Follow a line forward into its question, or open it to trace back to all of its sources.
- Does the same uncertainty-driven logic appear in other conversation systems?
- Can AI systems identify important unanswered questions that emerge during reasoning?
- When should an AI system actively intervene versus remain silent?
- Can timing and context awareness reduce the cognitive cost of AI suggestions?
- How does multi-turn conversation degrade AI intent alignment?
- Why can't users and AI articulate shared goals together?
- Why does preference optimization erode conversational grounding in AI assistants?
- What dialogue patterns do real human recommendation conversations actually contain?
- What role does conversation state tracking play in timing ask versus recommend?
- What makes human-LLM exchange closer to oracle-consultation than dialogue?
- Should LLMs query users back when presented with under-specified scenarios?
- What are the five specific conversation triggers where AI intervention adds value?
- Can parallel agents or complementary mechanisms replace single-human interrogation of LLMs?
- Can curiosity-driven dialogue incrementally discover user interest journeys in real time?
- Can validation procedures interrupt an AI's relationship-maintenance logic?
- How does treating AI as an agent affect user autonomy and decision-making?
- What makes users willing to relinquish control to an agent?
- Can agents learn user intent from unlabeled video without text labels?
- What dialogue dynamics distinguish negotiation from standard information-provision tasks?
- Can language systems learn when to ask for clarification instead of choosing one reading?
- Can systems guide users adaptively without imposing predetermined dialogue structures?
- What memory and planning capabilities do AI companions need for evolving user needs?
- How does user overreliance on model confidence differ between chat and deployed agents?
- Why do conversational queries drift away from what triggered them?
- Why do persistent chatbot companions face novelty decay that ad-hoc supporters avoid?
- Why do dialogue systems fail to detect declarative clarification requests?
- How do humans decide which level of clarification to request?
- How does asymmetric information shape what to ask users first?
- Can designers hide AI context complexity behind a stable user interface?
- Can models infer maintenance operations from conversational text data alone?
- Can users articulate what they want before AI helps them discover it?
- Can real-time detection identify when users have incomplete or underdeveloped intent?
- How do conversational design patterns predict whether dialogue will derail?
- How do users fail to articulate what they actually want?
- Can AI learn when to speak in a conversation?
- Does conversational AI personalization increase behavioral expectations too much?
- What makes LLM agents default to passive helpfulness without curiosity rewards?
- Can curiosity-driven personalization work better than pre-conversation preference elicitation?
- Why can't current AI agents lead conversations with users?
- Why do passive conversational agents fail at collaborative decision-making?
- How should moderator LLMs decide which speakers to query per topic?
- How do conversation repair patterns handle user corrections and interruptions?
- When should agents use clarification commands instead of assuming intent?
- Can API-first interaction replace traditional UI-based agent interfaces?
- What interaction patterns preserve human learning when AI provides domain answers?
- Can AI recognize and support behavior change in users without established commitment?
- Does current empathetic AI misalign with how humans actually ask questions?
- Can prompt engineering overcome the gulf between user intent and AI interpretation?
- Why might text-only interfaces underestimate agent preference elicitation capabilities?
- What interaction controls matter most for effective human-LLM collaboration?
- Can conversation analysis predict when agents should ask users for clarification?
- Can AI systems recover from premature assumptions made early in multi-turn conversations?
- Can conversational AI achieve mutual understanding if trained only on text?
- How should systems learn what each meeting participant actually cares about?
- What structural changes enable agents to ask clarifying questions?
- Can LLMs distinguish between surface requests and underlying mental states in dialogue?
- Do agent frameworks adequately compensate for LLM conversational passivity?
- What interaction design changes would help LLMs handle underspecified requests?
- How do users develop different interaction scripts specifically for machines versus humans?
- Can Socratic questioning replace external evidence verification in multi-agent systems?
- Can users detect and correct an AI's mental model of their preferences?
- What design signals help users know when AI is acting on their behalf?
- Why do chatbots fail to recognize when someone is ambivalent about change?
- How do customer service chatbots get systematically misled by users?
- Can curiosity reward during conversation compete with simulated interaction optimization for alignment?
- Can AI distinguish when validation helps versus when confrontation is needed?
- Do LLM conversational agents currently detect and prevent derailment trajectories?
- Can proactive AI agents deploy politeness strategies without appearing intrusive?
- How can agents detect whether users are willing to follow their topic guidance?
- What makes proactive conversational agents feel intrusive versus helpful to users?
- What social boundaries must proactive agents respect during conversation?
- How can agents learn to estimate user satisfaction in real-time during conversation?
- When should agents accommodate user preferences over their own goals?
- How do conversational agents overcome structural passivity and goal awareness gaps?
- What distinguishes proactive information provision from proactive clarification seeking?
- How does semantic mismatch between user language and API documentation degrade tool retrieval?
- What happens when tools compete for agent invocation rather than human clicks?
- Does proactive agent design improve conversation efficiency or create user frustration?
- Can agents balance goal-driven proactivity with user preference alignment?
- What role does uncertainty reduction play in personalized agent interaction?
- Can users articulate their intent before exploring what an AI system finds?
- How can dialogue structure and trajectory predict social agent performance?
- Why do conversational systems benefit from post-thinking between user turns?
- What training approach enables models to proactively request clarification?
- How do insert-expansions help systems probe users before silently diverging?
- Can users interrogate AI outputs without verifying every single claim?
- Which conversation types most reliably cause models to drift from Assistant mode?
- How do agents discover and select which tools to invoke?
- What expectations does human conversation activate that AI should avoid triggering?
- How does RLHF training push chatbots toward problem-solving over exploration?
- Why do AI models treat user intent as binary rather than evolving?
- Do conversational agents need goal awareness to initiate grounding work themselves?
- Can reinforcement learning teach AI when to ask clarifying questions?
- Why do AI products default to service roles when users seek different kinds of help?
- What tasks do users actually want AI to handle versus what can it automate?
- How does rising AI capability change what users expect from their tools?
- Can a separate mediator layer improve intent understanding before task execution?
- What architectural changes help AI avoid adding interpretations users didn't express?
- How should conversational AI balance world knowledge with avoiding false expertise?
- Can AI take initiative by questioning without being proactive in directive ways?
- How should AI systems model relationship evolution within a specific ongoing conversation history?
- Why does single-turn Q&A framing not match real user deployment patterns?
- Why do conversational agents lack the goal awareness needed to lead rather than just respond?
- How might dual-process dialogue use information gain to trigger clarification?
- What happens to user expectations as AI conversation quality improves?
- How should AI interfaces signal their non-communicative nature to users?
- How does machine agency spectrum explain tool design mismatches with user behavior?
- Can statistical token processing create the accountability needed for dialogue?
- Can structural conversation analysis replace text-based reward signals for AI alignment?
- Can tool access control prevent agents from filling optional personal fields?
- What stops AI from helping users articulate preferences they cannot express?
- Can code-based reasoning replace natural language deliberation in agentic systems?
- How can agents learn user preferences during conversation without pre-calibration?
- What behavioral signals let users detect communicative flexibility in AI?
- How can agents detect missing information before attempting to solve problems?
- What distinguishes communicative acts from operational actions in agentic LLMs?
- How does multi-turn dialogue improve user satisfaction in search interactions?
- Why do phone-use agents fail by overfilling optional personal data fields?
- Why does continuous agent inference differ from human user inference?
- How does context engineering bridge human intent and machine understanding?
Related concepts in this collection (7)
- Why can't advanced AI models take initiative in conversation?
  Despite extraordinary capability in answering and reasoning, LLMs fundamentally cannot initiate, redirect, or guide exchanges. Understanding this gap, and whether it's fixable, matters for building AI that truly collaborates rather than merely responds.
  insert-expansions address one form of passivity: agent should probe when uncertain, not silently diverge
- Does preference optimization harm conversational understanding?
  Exploring whether RLHF training that rewards confident, complete responses undermines the grounding acts (clarifications, checks, acknowledgments) that actually build shared understanding in dialogue.
  RLHF penalizes exactly the conversational work insert-expansions perform
- Do language models actually build shared understanding in conversation?
  When LLMs respond fluently to prompts, do they perform the communicative work humans do to establish mutual understanding? Research suggests they skip the grounding acts that make dialogue reliable.
  insert-expansions are a specific mechanism for building common ground
- Why do language models sound fluent without grounding?
  Explores whether LLM fluency masks the absence of communicative work: the clarifying questions, acknowledgments, and understanding checks that humans perform. Why does skipping these acts make models sound more confident?
  insert-expansions are communicative work that fluent models skip
- Can models learn to ask clarifying questions instead of guessing?
  Exploring whether large language models can be trained to detect incomplete queries and actively request missing information rather than hallucinating answers or refusing to respond. This matters because conversational agents today remain passive, responding only when prompted.
  insert-expansions provide the conversational structure for deploying proactive questioning capability in dialogue
- Can AI systems detect and correct misunderstandings after responding?
  How do conversational systems recognize when their previous response was based on a misunderstanding, and what mechanism allows them to correct it retroactively rather than restart?
  complementary repair mechanism: insert-expansions are pre-emptive, TPR is reactive; together they cover the full repair lifecycle
- Which clarifying questions actually improve user satisfaction?
  Not all clarification helps equally. This explores whether asking users to rephrase their needs works as well as asking targeted questions about specific information gaps.
  insert-expansions define WHEN to probe; this research defines HOW to probe well: specific-facet questions outperform need-rephrasing, providing the content design principles for insert-expansion sequences
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Insert-expansions For Tool-enabled Conversational Agents
- Intent Mismatch Causes LLMs to Get Lost in Multi-Turn Conversation
- Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games
- DiscussLLM: Teaching Large Language Models When to Speak
- Proactive Conversational Agents in the Post-ChatGPT World
- LLMs Get Lost In Multi-Turn Conversation
- Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies
- Conversational Alignment with Artificial Intelligence in Context
Original note title
insert-expansions from conversation analysis provide a formal framework for when tool-enabled agents should probe users instead of silently diverging