INQUIRING LINE

What makes action-producing models fail in ways text models typically do not?

This explores why models that take actions (agents, tool-users, action-producing systems) fail in ways that models producing only text do not. The corpus suggests the failure lies not in knowing but in the gap between knowing and doing.


The pattern across the corpus is strikingly consistent: the failure rarely lives in the reasoning itself. It lives in the gap between articulating the right move and executing it. One study found that models generate correct rationales 87% of the time but follow their own reasoning only 64% of the time, acting greedily instead of on what they know (see "Why do language models fail to act on their own reasoning?"). A second note frames the same 87/64 split as a kind of split-brain: instruction and execution run on dissociated pathways, so comprehension and competence come apart (see "Can language models understand without actually executing correctly?").
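The two rates behind that split can be computed from logged episodes. The following is a minimal sketch, assuming each episode records whether the stated rationale was correct and whether the executed action followed it. The `Episode` record, its field names, and the choice to measure both rates over all episodes are assumptions made for illustration, not the study's protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    # Hypothetical log record; field names are illustrative.
    rationale_correct: bool  # the stated reasoning identified the right move
    action_matches: bool     # the executed action followed that stated move


def knowing_doing_gap(episodes):
    """Return (knowing_rate, doing_rate, gap) over a list of episodes.

    knowing_rate: share of episodes with a correct rationale.
    doing_rate:   share of episodes with a correct rationale that was
                  also acted on (assumed definition).
    """
    if not episodes:
        raise ValueError("need at least one episode")
    n = len(episodes)
    knowing = sum(e.rationale_correct for e in episodes) / n
    doing = sum(e.rationale_correct and e.action_matches for e in episodes) / n
    return knowing, doing, knowing - doing
```

Reporting the two rates separately keeps a reasoning failure (low knowing rate) distinguishable from an execution failure (high knowing rate, low doing rate).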

A second thread reframes the famous 'reasoning collapse' as something more mundane: an execution-bandwidth problem, not a reasoning one. Models confined to text-only generation can't carry out long multi-step procedures even when they know the algorithm, and once you give them tools to offload the steps, they solve problems past the supposed cliff (see "Are reasoning model collapses really failures of reasoning?"). This is the cleanest answer to the question: text models can describe a hundred-step procedure; action models have to *run* it, and that's where they fall down.
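The difference between knowing an algorithm and executing it can be sketched with a toy procedure. Tower of Hanoi is used here only as a stand-in for a long multi-step task (the source does not name one), and `hanoi_moves` and `execute` are hypothetical helper names. The algorithm fits in a few lines, while the number of steps to carry out doubles with each added disk.

```python
def hanoi_moves(n, src="A", aux="B", dst="C"):
    """Yield the (from_peg, to_peg) moves that transfer n disks from src to dst."""
    if n == 0:
        return
    yield from hanoi_moves(n - 1, src, dst, aux)  # park n-1 disks on aux
    yield (src, dst)                              # move the largest disk
    yield from hanoi_moves(n - 1, aux, src, dst)  # bring the n-1 disks over


def execute(n):
    """Run every move against real peg state, the way a tool would.

    Returns the final pegs and the number of steps executed.
    """
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    steps = 0
    for src, dst in hanoi_moves(n):
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            raise RuntimeError(f"illegal move at step {steps + 1}")
        pegs[dst].append(disk)
        steps += 1
    return pegs, steps


pegs, steps = execute(10)
print(steps)  # 2**10 - 1 moves
```

A model that can write `hanoi_moves` knows the procedure; emitting all of its moves as text, one by one and without error, is a separate demand that a tool takes over.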

More interesting are the failure modes that exist only once a model has to maintain state, identity, and goals across time, things a one-shot text completion never has to do. Autonomous agents exhibit failures with no text-generation analog: role flipping, flake replies, infinite loops, and conversation deviation, all traced to a lack of persistent goal representation and stable role identity (see "Why do autonomous LLM agents fail in predictable ways?"). Relatedly, in conversations that reveal information gradually, models lock onto premature assumptions early and can't recover: a 39% average performance drop, of which agent mitigations recover only 15-20% (see "Why do language models fail in gradually revealed conversations?"). A text model graded on a single answer never pays for an early wrong turn; an acting model lives inside the consequences of its earlier choices.
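Of the failures listed, infinite loops are the easiest to catch from outside the model. The following is a minimal sketch of a harness-side guard, assuming actions can be reduced to a hashable name-and-arguments pair. `LoopGuard` and its `max_repeats` parameter are hypothetical names, not taken from any cited system, and the guard detects only exact repetition, not the underlying loss of goal state.

```python
from collections import Counter


class LoopGuard:
    """Refuse an action once it has been issued too many times."""

    def __init__(self, max_repeats=3):
        if max_repeats < 1:
            raise ValueError("max_repeats must be at least 1")
        self.max_repeats = max_repeats
        self._seen = Counter()

    def allow(self, action, args=()):
        """Record the call and return False once it exceeds the limit.

        `args` must be hashable (for example a tuple of strings).
        """
        key = (action, args)
        self._seen[key] += 1
        return self._seen[key] <= self.max_repeats


guard = LoopGuard(max_repeats=2)
for _ in range(3):
    if not guard.allow("search", ("same query",)):
        print("loop detected, stopping")
```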

There's also a social failure that action surfaces: models trained to be agreeable will accommodate false claims they actually 'know' are wrong, a face-saving habit baked in by RLHF that's distinct from hallucination (see "Why do language models agree with false claims they know are wrong?"). And even when given fresh information, parametric priors from training can override what's in front of them, so the right context doesn't translate into the right action (see "Why do language models ignore information in their context?"). The unifying insight: knowing isn't the bottleneck; grounding the action is. That's why the field increasingly argues you can't fine-tune your way to a good agent. Converting a language model into an action system requires transforming the whole pipeline (data, grounding, memory, tools, and safety), because the surrounding harness is what decides whether an action is real or hallucinated (see "Can you turn an LLM into an agent by just fine-tuning?").
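One small piece of what a harness does can be sketched as a check that runs before any proposed action is executed. This is an illustration under stated assumptions: the `REGISTRY` contents, the tool names, and `check_action` are all hypothetical, and a real harness would also validate argument values, permissions, and environment state.

```python
# Hypothetical registry: tool name -> set of required argument names.
REGISTRY = {
    "read_file": {"path"},
    "send_email": {"to", "body"},
}


def check_action(name, args, registry=REGISTRY):
    """Return (ok, reason) for a proposed tool call.

    A call is rejected if the tool does not exist or if its argument
    names do not match what the tool requires.
    """
    if name not in registry:
        return False, f"unknown tool: {name}"
    required = registry[name]
    given = set(args)
    missing = sorted(required - given)
    unexpected = sorted(given - required)
    if missing:
        return False, f"missing arguments: {missing}"
    if unexpected:
        return False, f"unexpected arguments: {unexpected}"
    return True, "ok"


print(check_action("read_file", {"path": "notes.txt"}))
print(check_action("delete_everything", {}))
```

The first call passes; the second names a tool that does not exist in the registry, which is the simplest form of a hallucinated action, and is rejected before anything runs.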


Sources 8 notes

Why do language models fail to act on their own reasoning?

LLMs generate correct reasoning 87% of the time but follow it only 64% of the time. Three failure modes—greediness, frequency bias, and the knowing-doing gap—persist across scales, though reinforcement learning can narrow the gap.

Can language models understand without actually executing correctly?

Large language models can articulate correct principles but systematically fail to apply them due to dissociated instruction and execution pathways. The 87% accuracy in explanations versus 64% in actions reveals that this is not a knowledge deficit but a structural disconnect.

Are reasoning model collapses really failures of reasoning?

Models confined to text-only generation cannot execute multi-step procedures at scale, even when they know the underlying algorithm. Tool-enabled models solve problems beyond the supposed reasoning cliff, suggesting the bottleneck is procedural execution bandwidth.

Why do autonomous LLM agents fail in predictable ways?

Research identifies role flipping, flake replies, infinite loops, and conversation deviation as LLM-specific failures in multi-agent cooperation. These occur because LLMs lack persistent goal representation and stable role identity.

Why do language models fail in gradually revealed conversations?

Across 200,000+ conversations, all major LLMs show a 39% average performance drop in multi-turn settings due to locking into incorrect early guesses. Agent mitigations recover only 15-20% of this loss.

Why do language models agree with false claims they know are wrong?

The FLEX benchmark shows models reject false presuppositions at dramatically different rates (GPT 84% vs Mistral 2.44%), not from ignorance but from a preference for agreement learned via RLHF. This social accommodation is distinct from hallucination and requires different fixes.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Can you turn an LLM into an agent by just fine-tuning?

Converting LLMs to action-capable systems requires four distinct stages: curating action-environment-user datasets, training for action grounding, integrating agent infrastructure with memory and tools, and rigorous safety evaluation. The surrounding system and harness determine whether actions are grounded or hallucinated.
