SYNTHESIS NOTE
Reasoning, Retrieval, and Evaluation · Psychology, Society, and Alignment · Conversational AI and Personalization

Can models learn to ask clarifying questions instead of guessing?

This note explores whether large language models can be trained to detect incomplete queries and actively request the missing information, rather than hallucinating an answer or refusing to respond. This matters because today's conversational agents remain passive, responding only when prompted.

Synthesis note · 2026-02-22 · sourced from Conversation Agents
Why do AI agents fail to take initiative? How should we allocate compute budget at inference time? How should researchers navigate LLM reasoning research?

Current LLMs face three failure modes when receiving flawed or incomplete queries: they hallucinate an answer, they refuse to respond, or they provide a generic "I need more information" deflection. None of these is productive. The proactive critical thinking paradigm introduces a fourth option: identify specifically what is missing and generate a targeted question to request it.
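The difference between a generic deflection and the fourth option can be shown with a minimal sketch. It assumes a toy setting in which a task declares the named inputs it needs; the `respond` function and the speed/time example are hypothetical illustrations, not the method from the source.

```python
from typing import Dict, List, Optional


def respond(required: List[str], provided: Dict[str, Optional[float]]) -> str:
    """Ask a targeted question when an input is missing; otherwise proceed."""
    # Hypothetical setup: the task lists its required inputs by name.
    missing = [name for name in required if provided.get(name) is None]
    if missing:
        # Name the specific gap instead of a generic "I need more information".
        return f"What is the {missing[0]}?"
    return "ANSWER"  # placeholder: a real system would solve the task here


# Hypothetical query: a distance problem that gives the speed but not the time.
print(respond(["speed", "time"], {"speed": 60.0}))
```

The point of the sketch is only the shape of the output: the reply names the missing variable, which is what separates a targeted question from a refusal or a guess.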

The GSM-MC benchmark tests this by deliberately removing key variables from math problems. The results are dramatic: vanilla models start from a near-zero baseline.

The near-zero baseline reveals something important: despite extensive post-training that makes these models excellent at reasoning, they have almost no ability to detect when a problem is ill-posed and actively seek the missing piece. This is a specific capability gap, not a general reasoning limitation.
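Removing a key variable from a math problem can be illustrated with a small sketch. It assumes a problem is stored as a list of sentences, each tagged with the variable it introduces. The `Problem` layout, `remove_variable`, and the keyword-based `asks_about` check are hypothetical simplifications, not GSM-MC's actual construction pipeline or scoring.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Problem:
    facts: List[Tuple[str, str]]  # (variable name, sentence stating it)
    question: str


def remove_variable(problem: Problem, variable: str) -> Problem:
    """Return a copy of the problem with the sentence for `variable` dropped."""
    kept = [(v, s) for v, s in problem.facts if v != variable]
    if len(kept) == len(problem.facts):
        raise ValueError(f"unknown variable: {variable}")
    return Problem(facts=kept, question=problem.question)


def render(problem: Problem) -> str:
    """Join the remaining facts and the question into one prompt string."""
    return " ".join(s for _, s in problem.facts) + " " + problem.question


def asks_about(response: str, variable: str) -> bool:
    """Crude check: the response is a question that mentions the removed variable."""
    return response.strip().endswith("?") and variable.lower() in response.lower()


# Hypothetical item, invented for illustration only.
original = Problem(
    facts=[("price", "Each notebook costs 3 dollars."),
           ("quantity", "Maya buys 4 notebooks.")],
    question="How much does Maya spend?",
)
incomplete = remove_variable(original, "price")
print(render(incomplete))
```

Under this toy check, a response such as "What is the price of each notebook?" counts as proactive, while a numeric guess or a generic "Can you be more specific?" does not.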

A striking secondary finding: inference-time scaling (activating "thinking mode") actually degrades proactive critical thinking in vanilla models. The extended thinking induces "counterproductive self-doubt rather than useful analysis, leading to a clear drop in performance." But after RL training, thinking mode becomes beneficial — the same mechanism that hurts untrained models helps trained ones.

This finding matters beyond math: a patient omitting critical symptoms, a user providing incomplete specifications, a student asking an ambiguous question. All of these require the agent to identify what is missing and ask, not just refuse or guess. Relative to the broader proactivity gap raised in "Why can't conversational AI agents take the initiative?", proactive critical thinking is a concrete, trainable instance.

ProCoT (Proactive Chain-of-Thought) extends the paradigm from individual queries to multi-turn goal planning: rather than just detecting missing information in a single exchange, models generate explicit reasoning chains about conversation goals and plan proactive interventions across turns. This bridges proactive critical thinking (reactive: "this query is incomplete") with proactive dialogue (strategic: "given the user's goal, I should ask about X before they realize they need it").

The ALFA framework for clinical reasoning extends this by showing that question quality is multidimensional — a question can be clear but irrelevant, or relevant but ambiguous. ALFA decomposes "good question" into theory-grounded attributes (clarity, relevance, specificity) and trains against each via 80K attribute-specific preference pairs. This addresses a gap: proactive critical thinking shows models can learn to ask, but ALFA shows they need attribute-specific training to ask well. Additionally, research on clarifying question design shows that specific-facet questions ("What type of monitor?") consistently outperform need-rephrasing questions ("Can you be more specific?") for user satisfaction — the form of the question matters as much as the decision to ask.
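A sketch of what attribute-specific preference data could look like, assuming each pair holds two candidate questions for the same context and is labeled for exactly one attribute. The `PreferencePair` fields and the `group_by_attribute` helper are hypothetical; the source states only that the pairs are attribute-specific, and the example pair reuses the monitor questions above purely for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List

ATTRIBUTES = ("clarity", "relevance", "specificity")  # attributes named in this note


@dataclass(frozen=True)
class PreferencePair:
    context: str    # conversation so far
    chosen: str     # question judged better on `attribute`
    rejected: str   # question judged worse on `attribute`
    attribute: str  # the single attribute this comparison is about

    def __post_init__(self) -> None:
        if self.attribute not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {self.attribute}")


def group_by_attribute(pairs: List[PreferencePair]) -> Dict[str, List[PreferencePair]]:
    """Bucket pairs so each attribute can be trained against separately."""
    grouped: Dict[str, List[PreferencePair]] = defaultdict(list)
    for pair in pairs:
        grouped[pair.attribute].append(pair)
    return dict(grouped)


pairs = [
    PreferencePair(
        context="I need a new monitor.",
        chosen="What type of monitor?",
        rejected="Can you be more specific?",
        attribute="specificity",
    ),
]
grouped = group_by_attribute(pairs)
print({name: len(items) for name, items in grouped.items()})
```

Keeping one attribute per pair means a question that is clear but irrelevant can be preferred in a clarity comparison and rejected in a relevance comparison without the two signals being averaged together.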

Original note title

proactive critical thinking enables models to identify missing information and actively request clarification rather than passively refusing or hallucinating answers