SYNTHESIS NOTE

Can separating causal models from language models improve reasoning?

Can an explicit formal causal model paired with an LLM translator overcome both spurious-correlation reasoning in LLMs and the reward-without-explanation problem in RL? This note explores whether dividing reasoning labor between the two systems addresses a fundamental weakness in each.

Synthesis note · 2026-06-03 · sourced from Action Models

Two failure modes meet here. LLMs reason fluently but lean on spurious correlations and brittle patterns rather than robust causality; classical RL agents optimize reward without modeling why actions produce outcomes. Causal Reflection proposes a division of labor meant to address both. An explicit, formal causal model represents causality as a dynamic function over state, action, time, and perturbation, which lets it capture delayed and nonlinear effects. A formal Reflect mechanism detects mismatches between predicted and observed outcomes and generates causal hypotheses to revise the model. The LLM is deliberately not the reasoner; it serves as a structured inference engine that translates the formal causal outputs into natural-language explanations and counterfactuals.
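The causal function and the Reflect step described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Causal Reflection implementation: the names `CausalModel`, `predict`, and `reflect`, the tanh response, and the idea of searching over candidate delays are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical toy model: every name and functional form here is an assumption.
@dataclass
class CausalModel:
    gain: float   # strength of the action's effect
    delay: int    # number of steps before an action takes effect

    def predict(self, state: float, actions: List[float], t: int,
                perturbation: float = 0.0) -> float:
        """Causal function over (state, action, time, perturbation).

        The effective action is the one taken `delay` steps earlier (delayed
        effect); tanh makes the response saturate (nonlinear effect).
        """
        past = actions[t - self.delay] if t - self.delay >= 0 else 0.0
        return state + self.gain * math.tanh(past) + perturbation


# One record per step: (state, full action sequence, time index, observed next state)
Step = Tuple[float, List[float], int, float]


def reflect(model: CausalModel, history: List[Step],
            tolerance: float = 1e-9, max_delay: int = 3):
    """Compare predictions with observations; on mismatch, propose hypotheses
    (here: alternative delays) and return the one that best fits the history."""
    def error(m: CausalModel) -> float:
        return sum(abs(m.predict(s, a, t) - obs) for s, a, t, obs in history)

    if error(model) <= tolerance:
        return model, []          # predictions match, nothing to revise
    hypotheses = [CausalModel(model.gain, d) for d in range(max_delay + 1)]
    return min(hypotheses, key=error), hypotheses


# Simulate an environment whose true delay is 2, then start from a wrong model.
true_model = CausalModel(gain=1.0, delay=2)
actions = [1.0, 0.0, -1.0, 0.5, 0.0]
history: List[Step] = []
state = 0.0
for t in range(len(actions)):
    nxt = true_model.predict(state, actions, t)
    history.append((state, actions, t, nxt))
    state = nxt

revised, hypotheses = reflect(CausalModel(gain=1.0, delay=0), history)
print(revised)
```

The point of the sketch is that the revision happens entirely inside the formal model: the mismatch is measured numerically and the hypotheses are explicit objects that can be inspected and tested.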

The idea worth keeping is the architecture, not the (currently theoretical) implementation: keep causal reasoning in a verifiable formal substrate and use the LLM only for its genuine strength, rendering formal results in language. This sidesteps asking the LLM to be a causal reasoner, a role it performs unreliably.
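One way to picture the boundary this architecture draws is as an interface where the LLM receives only finished, structured results. The sketch below is an assumption-laden illustration: `CausalFinding`, `build_prompt`, `explain`, `mentions_all_facts`, and the stub model are hypothetical names, and the faithfulness check is an extra guard added for this example rather than part of the source proposal.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CausalFinding:
    """Structured output of the formal causal model (hypothetical schema)."""
    cause: str
    effect: str
    delay_steps: int
    factual_outcome: float
    counterfactual_outcome: float


def build_prompt(f: CausalFinding) -> str:
    # The LLM is asked to render the finding, not to reason about causality.
    return (
        "Rewrite the following verified causal finding in plain English. "
        "Do not add, remove, or alter any causal claim or number.\n"
        f"cause={f.cause}; effect={f.effect}; delay_steps={f.delay_steps}; "
        f"factual_outcome={f.factual_outcome}; "
        f"counterfactual_outcome={f.counterfactual_outcome}"
    )


def explain(f: CausalFinding, llm: Callable[[str], str]) -> str:
    # `llm` is any text-in, text-out callable; no specific provider is assumed.
    return llm(build_prompt(f))


def mentions_all_facts(f: CausalFinding, text: str) -> bool:
    """Illustrative guard: every fact from the formal model must appear verbatim."""
    facts = [f.cause, f.effect, str(f.delay_steps),
             str(f.factual_outcome), str(f.counterfactual_outcome)]
    return all(fact in text for fact in facts)


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call so the example runs offline.
    return "Finding: " + prompt.splitlines()[-1]


finding = CausalFinding("heater_on", "temperature", 2, 3.0, 1.5)
explanation = explain(finding, stub_llm)
print(explanation)
print(mentions_all_facts(finding, explanation))
```

Because the causal claims arrive as data, the language layer can be swapped or audited without touching the reasoning, and its output can be checked against the formal result.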

This connects two vault threads. It builds on "Why do LLMs handle causal reasoning better than temporal reasoning?": LLMs have causal fluency but not causal rigor, which is exactly the gap a formal model fills. It also shares the externalize-causality move with "Can we extract causal belief networks from interview conversations?", which similarly keeps the causal structure outside the model and uses language only at the interface.

Original note title

separating a formal causal model from the LLM that only translates its outputs addresses both spurious-correlation reasoning and RL's reward-without-why