SYNTHESIS NOTE

Can dialogue format help models reason more diversely?

Explores whether structuring internal reasoning as multi-agent dialogue rather than monologue can improve strategy diversity and coherency across different problem types, using the Compound-QA benchmark.

Synthesis note · 2026-02-22 · sourced from Conversation Architecture Structure

Current reasoning models (such as o1 and DeepSeek-R1) use monologue-style reasoning within a think block: a single continuous chain of internal text. DialogueReason identifies two systematic weaknesses in this approach:

Low diversity: models persistently apply the same fixed strategies across diverse problems. When problems call for different approaches (breadth-first search, BFS, for combinatorial problems; depth-first search, DFS, for geometric proofs), monologue reasoning recycles one strategy.

Low coherency: attention shifts frequently within a single reasoning path, with repetitive hesitations ("Wait...") and unnecessary switches between ideas. The reasoning becomes fragmented, difficult to interpret, and often ineffective, swinging between overcommitting to one strategy and neglecting alternatives.

The Compound-QA task makes this visible: concatenating multiple independently solvable problems into a single prompt forces the model to demonstrate both diverse strategies and maintained coherency. Monologue reasoning fails at exactly this combination.
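The concatenation step described above can be sketched in a few lines. This is a minimal illustration, not the benchmark's actual construction: the function names, the prompt wording, and the strict all-answers-correct scoring rule are assumptions introduced here.

```python
from typing import Sequence


def build_compound_prompt(questions: Sequence[str]) -> str:
    """Concatenate independently solvable questions into one numbered prompt.

    The header wording is a hypothetical choice, not taken from the benchmark.
    """
    if not questions:
        raise ValueError("need at least one question")
    header = (
        f"Solve each of the following {len(questions)} problems "
        "and answer them separately."
    )
    numbered = [
        f"Question {i}: {q.strip()}" for i, q in enumerate(questions, start=1)
    ]
    return "\n".join([header, ""] + numbered)


def compound_correct(predicted: Sequence[str], expected: Sequence[str]) -> bool:
    """Strict scoring (an assumption): the compound item counts as solved
    only if every sub-answer matches, ignoring case and outer whitespace."""
    if len(predicted) != len(expected):
        return False
    return all(
        p.strip().lower() == e.strip().lower()
        for p, e in zip(predicted, expected)
    )


if __name__ == "__main__":
    print(build_compound_prompt([
        "How many ways can 3 books be ordered on a shelf?",
        "Prove that the base angles of an isosceles triangle are equal.",
    ]))
```

Under a strict rule like this, one derailed sub-problem sinks the whole item, which is one way a compound prompt could expose both weaknesses at once.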

DialogueReason proposes dialogue-based internal reasoning structured through three dimensions: agents, environment, and interactions.

The mechanism is scene-switching: the model sets up a dedicated scene for each question ("Quantum Café"), introduces characters with distinct expertise, and resolves through dialogue. When transitioning to the next question, it constructs a new environment ("Theoretical Physics Hall") with different characters. This prevents cross-problem interference while maintaining per-problem coherency.
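One way to picture scene-switching is as a small data structure: each question gets its own scene, its own cast, and its own turns. The sketch below is illustrative only. The `Scene` class, the `render_reasoning` helper, and the character names are hypothetical; only the two scene names come from the note. In DialogueReason the model writes this structure itself as free text, so nothing here reflects its actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Scene:
    """One self-contained reasoning scene for a single question."""
    question: str
    setting: str
    characters: Dict[str, str]  # character name -> area of expertise
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def say(self, speaker: str, text: str) -> None:
        # Only characters introduced in this scene may speak in it,
        # which keeps reasoning from one problem out of another.
        if speaker not in self.characters:
            raise ValueError(f"{speaker!r} is not part of scene {self.setting!r}")
        self.turns.append((speaker, text))


def render_reasoning(scenes: List[Scene]) -> str:
    """Flatten the scenes into one transcript, one block per question."""
    lines: List[str] = []
    for i, scene in enumerate(scenes, start=1):
        lines.append(f"[Scene {i}: {scene.setting}] {scene.question}")
        for name, expertise in scene.characters.items():
            lines.append(f"  ({name}, {expertise})")
        for speaker, text in scene.turns:
            lines.append(f"  {speaker}: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    cafe = Scene("Question 1", "Quantum Café", {"Ada": "quantum mechanics"})
    cafe.say("Ada", "Start from the state vector.")
    hall = Scene("Question 2", "Theoretical Physics Hall", {"Noor": "relativity"})
    hall.say("Noor", "Fix the reference frame first.")
    print(render_reasoning([cafe, hall]))
```

The membership check in `say` is the code-level analogue of the isolation the note describes: a character from the first scene cannot contribute to the second.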

This is distinct from multi-agent debate systems, which use SEPARATE models. DialogueReason is a SINGLE model that reasons in dialogue format: the diversity comes from internal role differentiation, not from aggregating multiple independent models. Relative to the parallel-chain advantage discussed in "Why does parallel reasoning outperform single chain thinking?", DialogueReason achieves a related benefit through a different mechanism: not multiple parallel chains, but structured internal dialogue that naturally explores multiple strategies.
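The structural difference between the two setups shows up in how many model calls each one makes. The sketch below is a rough contrast under stated assumptions: `Generate` is a hypothetical prompt-in, text-out callable, the debate loop is a generic simplification rather than any specific published system, and the prompt instruction in `single_model_dialogue` merely stands in for behaviour that DialogueReason obtains from the model's own reasoning format.

```python
from typing import Callable, List

# Hypothetical interface: a model is any function from prompt text to output text.
Generate = Callable[[str], str]


def multi_agent_debate(
    models: List[Generate], question: str, rounds: int = 2
) -> List[str]:
    """Separate models: each round, every model answers after seeing
    the answers all models gave in the previous round."""
    answers = [""] * len(models)
    for _ in range(rounds):
        context = "\n".join(a for a in answers if a)
        prompt = f"{question}\n{context}".strip()
        answers = [model(prompt) for model in models]
    return answers


def single_model_dialogue(model: Generate, question: str, roles: List[str]) -> str:
    """Single model: one call in which the model writes the whole
    role-differentiated dialogue itself."""
    cast = ", ".join(roles)
    return model(f"{question}\nReason as a dialogue between: {cast}.")
```

With three models and two rounds the debate makes six calls, while the single-model dialogue makes one; any diversity in the latter has to come from the roles inside that one generation.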

The connection to reasoning format effects is direct. Following the argument in "Does training data format shape reasoning strategy more than domain?", having the model reason in dialogue format activates different reasoning strategies than monologue format: the format IS the intervention.

Original note title

dialogue-based reasoning outperforms monologue reasoning on diversity and coherency by structuring internal thought as multi-agent interaction within defined scenes