Can language models ask clarifying questions when sentences are ambiguous?

This explores whether LLMs can detect an ambiguous or underspecified sentence and respond by asking a clarifying question — and what the corpus says about why that behavior is rare and how it can be trained.

The corpus splits the problem into two halves: recognizing the ambiguity, and acting on it. On recognition, the picture is bleak. The AMBIENT benchmark finds that GPT-4 correctly disambiguates deliberately ambiguous sentences only 32% of the time, against 90% for humans, across lexical, structural, and scope ambiguity (see "Can language models recognize when text is deliberately ambiguous?"). The deeper issue isn't vocabulary; it's that the model struggles to hold two readings of a sentence in mind at once. Being a strong reasoner doesn't rescue a model either: models that ace fully specified problems fall to 40-50% accuracy when one needed variable is withheld and the task becomes 'figure out what to ask' (see "Can models identify what information they actually need?"). Asking good questions is a separate skill from answering well.

So why don't models just ask? Part of the answer is how they're trained. Standard RLHF rewards immediate helpfulness on the very next turn, which quietly teaches models to barrel ahead and answer rather than pause to ask. CollabLLM shows that swapping in rewards that estimate the long-term value of an exchange reverses this and enables active intent discovery (see "Why do language models respond passively instead of asking clarifying questions?"). There's also a confound: many models look like they're reasoning about an underspecified problem when they're really just defaulting to the conservative option. Twelve of fourteen models actually got *worse* when constraints were removed, by up to 38.5 percentage points (see "Are models actually reasoning about constraints or just defaulting conservatively?"). Apparent caution can be a reflex, not genuine recognition that something is missing.
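The incentive gap between next-turn and long-horizon rewards can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not CollabLLM's actual reward model: the two candidate actions, the probabilities, and the discount factor are hypothetical values chosen only to show how the preferred action can flip.

```python
# Toy comparison of a next-turn reward and a multi-turn-aware reward.
# All names and numbers are hypothetical; this is not CollabLLM's reward model.

def immediate_reward(action: str, p_guess_correct: float) -> float:
    """Reward observed on the very next turn only."""
    if action == "answer":
        # A direct answer is scored right away; it is right only if the guess is.
        return p_guess_correct
    # A clarifying question delivers no answer on this turn.
    return 0.0


def multiturn_reward(action: str, p_guess_correct: float,
                     p_correct_after_asking: float, discount: float) -> float:
    """Expected value over the whole exchange (at most two turns here)."""
    if action == "answer":
        return p_guess_correct
    # Asking costs one turn, so the eventual answer is discounted.
    return discount * p_correct_after_asking


def best_action(reward_fn, **kwargs) -> str:
    """Pick whichever action the given reward function scores higher."""
    return max(["answer", "ask"], key=lambda a: reward_fn(a, **kwargs))


# An ambiguous request with two equally likely readings.
print(best_action(immediate_reward, p_guess_correct=0.5))  # answer
print(best_action(multiturn_reward, p_guess_correct=0.5,
                  p_correct_after_asking=0.95, discount=0.9))  # ask
```

Under these assumed numbers, the next-turn reward always prefers guessing, because a question earns nothing on the turn it is asked. The longer-horizon estimate prefers asking whenever the discounted payoff of a clarified answer exceeds the payoff of the guess.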

The encouraging news is that the clarifying instinct is teachable. Reinforcement learning pushed proactive 'wait, this problem is flawed' accuracy from essentially zero (0.15%) to 74% on deliberately broken math problems. The ability is fragile without that training, though: letting an untrained model think longer made it worse, while the same inference-time scaling helped once the model had been trained with RL (see "Can models learn to ask clarifying questions instead of guessing?"). More intriguingly, social meta-learning (SML) produces the behavior as an emergent side effect: train models only on complete problems, and they generalize to underspecified ones by spontaneously asking for what they need and delaying their answer (see "Can models learn to ask clarifying questions without explicit training?"). The model learns to treat the conversation itself as a place to gather information.

But asking *a* question isn't the same as asking a *good* one. The ALFA framework breaks question quality into concrete attributes (clarity, relevance, specificity) and trains on attribute-specific preferences rather than a single 'good/bad' score. This matters most in high-stakes settings like clinical reasoning, where the right clarifying question directly changes the decision (see "Can models learn to ask genuinely useful clarifying questions?"). Underlying all of this is a calibration question: a model can only ask when it knows it doesn't know. On conversation forecasting, small models trained with uncertainty-aware objectives and an option to abstain match models ten times their size, which suggests the 'sense of not knowing' that should trigger a clarifying question exists in LLMs but is badly undertrained by default (see "Can models learn to abstain when uncertain about predictions?").
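The link between calibration and asking can be made concrete with a small decision rule. This is a minimal sketch, not a method taken from any of the cited notes: it assumes the model can expose a probability for each candidate reading of a sentence, and the function names, the example probabilities, and the 0.8-bit threshold are all hypothetical.

```python
import math


def entropy_bits(probs):
    """Shannon entropy (in bits) of a distribution over candidate readings."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def decide(readings, threshold_bits=0.8):
    """Return 'ask' when uncertainty over readings is high, else the top reading.

    `readings` maps each interpretation to a (hypothetical) model probability.
    The threshold is an illustrative assumption, not a tuned value.
    """
    total = sum(readings.values())
    probs = [p / total for p in readings.values()]  # normalize defensively
    if entropy_bits(probs) > threshold_bits:
        return "ask"
    return max(readings, key=readings.get)


# Two equally plausible readings: entropy is 1 bit, so the rule asks.
print(decide({"bank = riverside": 0.5, "bank = financial institution": 0.5}))
# One dominant reading: entropy is well below the threshold, so the rule commits.
print(decide({"bank = riverside": 0.05, "bank = financial institution": 0.95}))
```

A rule like this only helps if the probabilities are well calibrated, which is exactly the property the uncertainty-aware training above is meant to improve.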

The thread worth pulling: clarifying questions sit at the intersection of three abilities we usually measure separately — noticing ambiguity, knowing you're uncertain, and valuing a future turn over the present one. Standard training actively suppresses all three. So the honest answer is 'not by default, but yes when trained for it' — and what gets trained is less a new skill than permission to stop guessing.

Sources (8 notes)

Can language models recognize when text is deliberately ambiguous?

The AMBIENT benchmark shows GPT-4 correctly disambiguates only 32% of cases versus 90% for humans. This failure spans lexical, structural, and scope ambiguity, suggesting that LLMs struggle to hold multiple interpretations simultaneously, a fundamental gap hidden by standard benchmarks.

Can models identify what information they actually need?

Models achieving high accuracy on complete reasoning tasks drop to 40-50% accuracy at identifying which clarifying question to ask when one variable is withheld. Information gathering and problem execution are separable cognitive operations.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Are models actually reasoning about constraints or just defaulting conservatively?

Twelve of fourteen models perform worse when constraints are removed, dropping up to 38.5 percentage points. Models appear to reason correctly by defaulting to harder options, not by actually evaluating constraints.

Can models learn to ask clarifying questions instead of guessing?

Reinforcement learning increased proactive critical-thinking accuracy from 0.15% to 73.98% on deliberately flawed math problems. Notably, inference-time scaling degraded this ability in untrained models but improved it after RL training, suggesting the capability is learnable but fragile without explicit training.

Can models learn to ask clarifying questions without explicit training?

Models trained via social meta-learning (SML) on complete problems generalize to underspecified tasks by asking for needed information and delaying answers. The training paradigm instills a meta-strategy of using conversation as an information source, addressing the premature-answering failure mode.

Can models learn to ask genuinely useful clarifying questions?

The ALFA framework breaks down question quality into theory-grounded attributes (clarity, relevance, specificity) and trains models on 80K attribute-specific preference pairs. Attribute-specific optimization outperforms single-score training, especially in clinical reasoning where asking the right clarifying question directly impacts decision quality.

Can models learn to abstain when uncertain about predictions?

Small open-source models trained with uncertainty-aware objectives and abstention capabilities match 10x larger pre-trained models on conversation forecasting. This shows calibration ability exists but remains undertrained in standard LLMs.
