What formal representation could capture analogical reasoning across domains?
This explores what kind of formal structure—graphs, symbolic rules, formal languages—could represent the analogy-making that lets reasoning transfer from one domain to another, and what the corpus says about why that's hard.
Analogical reasoning is the move where a pattern learned in one area maps onto a structurally similar problem in another. The corpus doesn't offer a single formal representation for it, but it triangulates the problem from several directions, and the most useful starting point is a confession of what *doesn't* work. Causal belief networks, the obvious candidate for modeling structured inference, turn out to capture only one slice of human reasoning. The GenMinds framework explicitly admits that causal graphs handle cause-and-effect well but cannot represent associative links or analogical mappings at all (Can causal models alone capture how humans actually reason?). So analogy is named, in this collection, precisely as the thing the standard formalism leaves out.
Where the corpus gets constructive is in formal *languages* as a substrate for cross-domain transfer. Training models on Prolog and PDDL representations (a logic language and a planning language) improved reasoning on structurally similar but surface-different problems, with reported gains of 4.7% in logical reasoning, 6.3% in planning, and 4.0% in general reasoning (Do formal language prototypes improve reasoning across different domains?). The interesting word there is *structurally*: the prototype languages helped most when problems shared deep structure even when their content differed, which is exactly the signature of analogical transfer. That points toward a representation built on explicit relational structure rather than surface tokens.
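To make "shared deep structure, different surface content" concrete, here is a minimal sketch of analogy as a relation-preserving mapping between two sets of facts. It is an illustration only, not a method described in the corpus: the two toy domains, the fact format, and the function names `entities` and `best_mapping` are all assumptions, and the brute-force search only works for a handful of entities.

```python
from itertools import permutations

# Hypothetical toy domains. Each fact is (relation, subject, object).
solar_system = {
    ("attracts", "sun", "planet"),
    ("revolves_around", "planet", "sun"),
    ("more_massive_than", "sun", "planet"),
}
atom = {
    ("attracts", "nucleus", "electron"),
    ("revolves_around", "electron", "nucleus"),
    ("more_massive_than", "nucleus", "electron"),
}


def entities(facts):
    """Collect every entity named in a set of facts, in sorted order."""
    return sorted({e for _, subj, obj in facts for e in (subj, obj)})


def best_mapping(source, target):
    """Brute-force search for the entity mapping that preserves the most relations.

    Returns (mapping, score), where score counts the source facts that are
    still true in the target after renaming entities through the mapping.
    If the target has fewer entities than the source, no mapping is tried
    and the result is ({}, -1).
    """
    src, tgt = entities(source), entities(target)
    best, best_score = {}, -1
    for perm in permutations(tgt, len(src)):
        mapping = dict(zip(src, perm))
        score = sum(
            (rel, mapping[subj], mapping[obj]) in target
            for rel, subj, obj in source
        )
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score


mapping, score = best_mapping(solar_system, atom)
print(mapping, score)
```

In this toy case the search pairs `sun` with `nucleus` and `planet` with `electron`, preserving all three relations, even though the two domains share no entity names. The scoring looks only at relation names and argument positions, which is the sense in which the match is structural rather than surface-level.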
The graph-shaped notes sharpen what 'relational structure' might mean. Reasoning topologies can be classified formally: chain, tree, and graph reasoning map precisely onto path graphs, trees, and arbitrary directed graphs, and that topology defines real computational structure rather than a metaphor (Can reasoning topologies be formally classified as graph types?). But an edge in an ordinary graph binds only two nodes at a time. Hypergraphs lift that limit: a single hyperedge can bind three or more entities into one relation without breaking it into pairwise pieces, preserving joint constraints across multi-step reasoning (Can hypergraphs capture multi-hop reasoning better than graphs?). Since an analogy is fundamentally a mapping between *systems* of relations, not isolated pairs, the hyperedge is arguably the more honest primitive for it. Complementing this, symbolic rules derived from knowledge-graph topology can be made explicit and reused as navigational plans, capturing structural patterns directly instead of leaning on semantic similarity (Can symbolic rules from knowledge graphs guide complex reasoning?).
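The claim that pairwise edges lose joint constraints can be checked on a small example. The sketch below is an assumption-laden illustration, not the HGMem implementation: the enzyme/substrate/product triples and the function names `pairwise_projection` and `reconstruct` are hypothetical. It stores a three-way relation as hyperedges, flattens it into binary edges, and then asks which triples the binary edges alone would admit.

```python
from itertools import product

# Hypothetical ternary relation: (enzyme, substrate, product) triples
# that hold jointly. Each triple plays the role of one hyperedge.
hyperedges = {
    ("E1", "A", "X"),
    ("E2", "A", "Y"),
    ("E1", "B", "Y"),
}


def pairwise_projection(triples):
    """Flatten each triple into the three binary edges an ordinary graph would store."""
    first_second = {(a, b) for a, b, _ in triples}
    second_third = {(b, c) for _, b, c in triples}
    first_third = {(a, c) for a, _, c in triples}
    return first_second, second_third, first_third


def reconstruct(first_second, second_third, first_third):
    """Return every triple that is consistent with the binary edges alone."""
    firsts = {a for a, _ in first_second}
    seconds = {b for _, b in first_second}
    thirds = {c for _, c in second_third}
    return {
        (a, b, c)
        for a, b, c in product(firsts, seconds, thirds)
        if (a, b) in first_second
        and (b, c) in second_third
        and (a, c) in first_third
    }


reconstructed = reconstruct(*pairwise_projection(hyperedges))
spurious = reconstructed - hyperedges
print(sorted(spurious))  # [('E1', 'A', 'Y')]
```

Every pair inside `("E1", "A", "Y")` appears somewhere in the original data, so the binary graph cannot rule the triple out, yet it was never asserted as a joint fact. Keeping the relation as a single hyperedge avoids that ambiguity, at the cost of a more complex data structure.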
What the corpus adds, and what a reader might not have gone looking for, is the reason a formal representation matters so much: the default mechanism, chain-of-thought (CoT), demonstrably *fails* at analogical transfer. CoT is constrained imitation of reasoning form, reproducing familiar schemata from training rather than performing genuine abstract inference (Does chain-of-thought reasoning reveal genuine inference or pattern matching?), and its accuracy degrades predictably the moment a problem shifts outside the training distribution in task, length, or format (Does chain-of-thought reasoning actually generalize beyond training data?). Cross-domain analogy is *definitionally* a distribution shift. So the case for an explicit formal representation (formal-language prototypes, hypergraphs, extracted symbolic rules) isn't aesthetic. It's that pattern-matched fluency breaks exactly where analogy is supposed to begin, and only a structure that names relations explicitly can travel across the gap that breaks it.
Sources (7 notes)
Causal belief networks excel at modeling causal reasoning but cannot represent associative links, analogical mappings, or emotion-driven belief shifts. The GenMinds framework itself acknowledges this as a tractable starting point rather than a complete theory.
Training on Prolog and PDDL representations improved logical reasoning by 4.7%, planning by 6.3%, and general reasoning by 4.0%. Models exposed to prototype languages generalized better to structurally similar problems than natural language-only training.
CoT, ToT, and GoT map precisely to path graphs, trees, and arbitrary directed graphs respectively. The topology is not metaphorical but defines actual computational structure—GoT's in-degree > 1 enables divide-and-conquer synthesis that trees cannot express.
HGMem organizes retrieved evidence as hyperedges rather than flat lists or binary graphs, allowing three or more entities to bind into single relations without decomposition. This structure accumulates coherent knowledge across retrieval steps, trading representational complexity for constraint expressiveness.
SymAgent derives symbolic rules from KG structure using LLM reasoning to create navigational plans that align natural language with graph topology. This approach captures structural reasoning patterns explicitly, outperforming retrieval methods that rely on semantic similarity alone.
CoT works by constraining models to reproduce familiar reasoning patterns from training, not by enabling novel symbolic reasoning. Performance degrades predictably under distribution shifts—the signature of imitation rather than capability emergence.
DataAlchemy experiments show CoT fails systematically under distributional shifts in task, length, and format. Models produce fluent but logically inconsistent reasoning — imitating reasoning form without valid underlying logic.