Can LLMs coordinate with humans better using different model architectures?
This reads the question as asking whether swapping in different model architectures is the lever that makes LLMs better teammates with humans — and the corpus suggests the lever is mostly somewhere else.
What the collection keeps pointing to is that coordination is a property of training and of the structure surrounding the model, not of the transformer internals. The starkest finding is that frontier models which solve problems well alone actually get *worse* when they collaborate, collapsing into >90% agreement whether or not they're right (Why do language models fail at collaborative reasoning?). The fix there wasn't a new architecture; it was self-play preference training that taught models the social skill of productive disagreement, a 16.7% improvement. So 'coordinate better' looks more like a training objective than a structural choice.
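One way to make "agreement whether or not they're right" concrete is to measure agreement separately for correct and incorrect partner proposals. The following is a minimal sketch under stated assumptions: the `Turn` record, the `agreement_rates` helper, and the toy data are all hypothetical illustrations, not the evaluation used in the cited note.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    """One collaborative turn (hypothetical log format)."""
    proposal_correct: bool  # was the partner's proposal actually right?
    model_agreed: bool      # did the model go along with it?


def agreement_rates(turns: List[Turn]) -> Tuple[float, float]:
    """Return (agreement rate on correct proposals, rate on incorrect ones)."""
    def rate(subset: List[Turn]) -> float:
        if not subset:
            return float("nan")
        return sum(t.model_agreed for t in subset) / len(subset)

    right = [t for t in turns if t.proposal_correct]
    wrong = [t for t in turns if not t.proposal_correct]
    return rate(right), rate(wrong)


# Toy data, made up purely for illustration (not results from any study).
turns = (
    [Turn(True, True)] * 9 + [Turn(True, False)]
    + [Turn(False, True)] * 9 + [Turn(False, False)]
)
on_right, on_wrong = agreement_rates(turns)
print(f"agrees with correct proposals:   {on_right:.0%}")
print(f"agrees with incorrect proposals: {on_wrong:.0%}")
```

If the two rates come out close to each other, agreement carries little information about correctness, which is the pattern the note describes. A model that has learned productive disagreement would show a clear gap between them.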
A second thread suggests the models you already have may be quietly working against coordination. LLMs are *passive* by construction: their training optimizes for answering queries, so they can't initiate topics, plan ahead, or steer a conversation toward a shared goal (Why can't conversational AI agents take the initiative?). Alignment training compounds this by locking a model into one fixed communicative identity that can't switch register or renegotiate with a human partner mid-dialogue (Can language models adapt communication style to different contexts?). These are coordination handicaps baked in by *how models are tuned*, not by attention heads, which is why 'a different architecture' may be aiming at the wrong layer.
Where the corpus does endorse changing structure, it means the structure *around* the model, not inside it. Reliable agents work by externalizing memory, skills, and interaction protocols into a harness layer rather than leaning on raw model scale (Where does agent reliability actually come from?). Turning an LLM into something that can act with humans takes a four-stage pipeline transformation, not just retraining (Can you turn an LLM into an agent by just fine-tuning?). And wrapping the model in an explicit algorithm that feeds it only step-relevant context can control its reasoning better than the model left to its own devices (Can algorithms control LLM reasoning better than LLMs alone?). The most interesting coordination move is hybrid: a test-time learning system that handles its own uncertainty but deliberately *routes contradictions back to a human*, because the right call depends on context the system can't see (Can LLMs learn reliably at test time without human oversight?).
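The idea of an explicit algorithm that owns control flow and state, and shows each model call only the context that step needs, can be sketched as follows. This is a minimal illustration, not the implementation from the cited notes: `run_program`, the step dictionary keys (`needs`, `template`, `writes`), and `stub_llm` are hypothetical names, and the stub stands in for a real model call.

```python
from typing import Callable, Dict, List

# Assumption: a model is any function from prompt text to response text.
LLM = Callable[[str], str]


def run_program(steps: List[dict], llm: LLM, initial: Dict[str, str]) -> Dict[str, str]:
    """Run steps in order; state lives in the harness, not in the model."""
    memory: Dict[str, str] = dict(initial)
    for step in steps:
        # Information hiding: the prompt sees only the keys this step declares.
        context = {key: memory[key] for key in step["needs"]}
        prompt = step["template"].format(**context)
        memory[step["writes"]] = llm(prompt)
    return memory


# Stub model for demonstration: records each prompt and returns a placeholder.
seen_prompts: List[str] = []


def stub_llm(prompt: str) -> str:
    seen_prompts.append(prompt)
    return f"out{len(seen_prompts)}"


steps = [
    {"needs": ["goal"], "template": "List constraints for: {goal}", "writes": "constraints"},
    {"needs": ["constraints"], "template": "Draft a plan given: {constraints}", "writes": "plan"},
    {"needs": ["plan"], "template": "Summarize: {plan}", "writes": "summary"},
]

state = run_program(steps, stub_llm, initial={"goal": "ship the report"})
for p in seen_prompts:
    print(p)
```

The design choice worth noticing is that the third call never sees the original goal or the constraints, only the plan. The surrounding code decides what each call knows, which is what makes each step small enough to inspect and debug on its own.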
There's also a deeper reason architecture alone won't close the gap: humans and LLMs may be doing categorically different things. One line argues LLM text generation and human communication only share surface form: the model emits strings from a probability distribution while humans use language to relate to and address others (Are language models and human speakers doing the same thing?). Another reframes it as a matter of vantage point: from the outside humans and LLMs look utterly different, but inside a shared conversation both draw on the same symbolic substrate, making the difference structural rather than absolute (Do humans and LLMs differ fundamentally or just superficially?). If the gap is partly about *what produces the output*, no architecture swap dissolves it; coordination has to be engineered at the level of training signals, conversational design, and the harness that mediates the partnership.
The thing worth walking away with: the collection barely treats architecture as the coordination knob at all. It treats coordination as a *learned social behavior* (self-play), a *system-design problem* (harness, pipeline, algorithmic scaffolding), and a *human-in-the-loop arrangement* (routing the hard calls back to people) — which suggests that if your goal is a better human-AI team, you'd reach for those before you reach for a different model.
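The human-in-the-loop arrangement, where the system learns on its own but hands contradictions back to a person, can be sketched with a small timestamped rule store. This is an illustrative sketch only, not the ARIA system described in the sources: `Rule`, `KnowledgeBase`, `learn`, and the `ask_human` callback are hypothetical names, and the stub reviewer stands in for a real person.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    key: str        # what the rule is about
    value: str      # what the rule says
    timestamp: int  # when the rule was learned


class KnowledgeBase:
    """Stores one rule per key; escalates contradictions instead of guessing."""

    def __init__(self, ask_human: Callable[[Rule, Rule], Rule]):
        self.rules: Dict[str, Rule] = {}
        self.ask_human = ask_human

    def learn(self, new: Rule) -> Rule:
        old = self.rules.get(new.key)
        if old is not None and old.value != new.value:
            # Contradiction: a fixed policy such as "newest wins" could be
            # wrong, because the right answer depends on outside context.
            new = self.ask_human(old, new)
        self.rules[new.key] = new
        return new


# Stub reviewer for demonstration: logs the conflict and keeps the older rule.
escalations: List[Tuple[Rule, Rule]] = []


def stub_reviewer(old: Rule, new: Rule) -> Rule:
    escalations.append((old, new))
    return old


kb = KnowledgeBase(ask_human=stub_reviewer)
kb.learn(Rule("refund_window", "30 days", timestamp=1))
kb.learn(Rule("refund_window", "14 days", timestamp=2))   # conflicts, escalated
kb.learn(Rule("support_hours", "9 to 5", timestamp=3))    # no conflict
print(kb.rules["refund_window"].value, len(escalations))
```

Here the stub reviewer keeps the older rule, which a "most recent wins" heuristic would have overwritten. The timestamps are stored so the person resolving the conflict can see which rule came first; the code does not use them to decide.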
Sources (9 notes)
Frontier LLMs that solve problems alone fail when collaborating, achieving >90% agreement regardless of correctness. Self-play preference training improves outcomes by 16.7%, suggesting social skills for effective disagreement can be trained.
Research shows LLMs including ChatGPT cannot initiate topics, plan strategically, or lead conversations because their training optimizes for responding to queries, not creating dialogue from agent goals. This passivity is reinforced by alignment objectives and masked by fluent-sounding outputs.
System prompts and RLHF training lock models into one communicative identity across all interactions, preventing the contextual register-switching and value trade-offs that characterize human pragmatics. Users cannot reshape model behavior through dialogue negotiation.
Research shows reliable LLM agents externalize three cognitive burdens—memory (state persistence), skills (procedural components), and protocols (structured interaction)—into a harness layer rather than relying on model scale alone. The harness unifies these externalities and eliminates the need for the model to solve the same problems repeatedly.
Converting LLMs to action-capable systems requires four distinct stages: curating action-environment-user datasets, training for action grounding, integrating agent infrastructure with memory and tools, and rigorous safety evaluation. The surrounding system and harness determine whether actions are grounded or hallucinated.
LLM Programs embed LLMs within explicit algorithms that manage control flow and state, presenting only step-specific context to each LLM call. This information hiding addresses capability and context window limits while treating complex reasoning as modular, debuggable sub-tasks.
ARIA demonstrates that LLMs can adapt during inference through three integrated components: structured self-dialogue for uncertainty assessment, timestamped knowledge bases for conflict detection, and human-mediated resolution queries. Autonomous systems fail at reconciling contradictory rules because the correct choice depends on context outside the system.
LLMs produce strings via probability distributions; humans use language to address and relate to others. They share surface form but differ in what produces output, what it does socially, and what receivers should do with it.
Applies Habermas's observer/participant distinction to AI: from outside, humans and LLMs are utterly different; from within shared discourse, both draw on the same symbolic substrate, making the difference structural rather than absolute.