How do neural networks decompose tasks into modular subnetworks that transfer?

This line of inquiry explores whether neural networks split tasks into reusable modular pieces on their own, and what determines whether those pieces actually transfer to new tasks rather than just memorizing the old ones.

This question sits on a fault line in the corpus: some work finds that networks spontaneously carve tasks into clean, reusable modules, and other work finds that what looks modular is often memorization in disguise. Both turn out to be true under different conditions, and the gap between them is exactly where 'transfer' lives.

On the optimistic side, pruning experiments show that networks naturally implement compositional subroutines inside isolated subnetworks: you can ablate one and only its corresponding function breaks, leaving the rest intact (Do neural networks naturally learn modular compositional structure?). Crucially, pretraining sharpens this: the modular structure becomes more consistent and reliable across architectures. That reusable scaffolding is what makes transfer possible. The clearest evidence is in length generalization, where models trained jointly on related tasks reuse the *same attention heads*, so a shorter task can borrow machinery to extrapolate beyond its own training length, and pretrained models already ship with that scaffolding built in (Can length generalization transfer between different related tasks?). You can even force this modularity rather than wait for it: training with sparse weights produces compact circuits where neurons map to simple concepts, and ablation confirms each circuit is necessary and sufficient for its task (Can sparse weight training make neural networks interpretable by design?).
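The ablation logic described above can be sketched on a toy network. This is a minimal illustration only, assuming a hand-built network whose hidden units are already split into two disjoint groups; the weights, the two tasks, and the helper names (`forward`, `task_errors`) are hypothetical and are not taken from the cited pruning work, which discovers such subnetworks in trained models.

```python
import random

# Hypothetical toy network: 4 inputs -> 4 ReLU hidden units -> 2 outputs.
# By construction, hidden units 0-1 serve task A (x0 + x1) and
# hidden units 2-3 serve task B (x2 + x3).
W1 = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]
W2 = [[1, 1, 0, 0],   # output A reads hidden units 0 and 1
      [0, 0, 1, 1]]   # output B reads hidden units 2 and 3

def forward(x, ablated=frozenset()):
    """Run the network, zeroing the hidden units listed in `ablated`."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    hidden = [0.0 if i in ablated else h for i, h in enumerate(hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

def task_errors(ablated=frozenset(), n=200, seed=0):
    """Mean absolute error on task A and task B over random positive inputs."""
    rng = random.Random(seed)
    err_a = err_b = 0.0
    for _ in range(n):
        x = [rng.uniform(0.1, 1.0) for _ in range(4)]
        out_a, out_b = forward(x, ablated)
        err_a += abs(out_a - (x[0] + x[1]))
        err_b += abs(out_b - (x[2] + x[3]))
    return err_a / n, err_b / n

baseline = task_errors()
ablate_a = task_errors(ablated=frozenset({0, 1}))  # knock out subnetwork A
ablate_b = task_errors(ablated=frozenset({2, 3}))  # knock out subnetwork B
print("baseline:", baseline)
print("ablate A:", ablate_a)
print("ablate B:", ablate_b)
```

In this constructed case, ablating one group raises the error only on its own task. In a trained network the groups are not known in advance, so the same measurement would be run over candidate masks found by pruning.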

But here's the thing the question doesn't anticipate: *which* part of a decomposed task transfers is not symmetric. When researchers split a reasoning system into a 'decomposer' (breaks the problem into steps) and a 'solver' (executes them), the decomposition skill transfers across domains while the solving skill does not (Does separating planning from execution improve reasoning accuracy?). Planning is the portable module; execution stays parochial. The same flavor of insight drives function-calling work, where carving one task into seven explicit subtasks and training on them jointly generalizes better than a single umbrella dataset (Can breaking function calling into subtasks improve model generalization?). Modularity isn't just emergent: you can engineer it by naming the seams.
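The decomposer/solver split can be pictured as an interface with one shared planning component and swappable execution components. The sketch below is purely structural and rests on loud assumptions: `decompose`, `arithmetic_solver`, `text_solver`, and the " then " step format are all hypothetical stand-ins written as plain functions, whereas the cited work uses separate learned models. It shows where the seam sits, not how transfer is learned.

```python
from typing import Callable, List

Step = str
Solver = Callable[[Step, list], object]

def decompose(problem: str) -> List[Step]:
    """Hypothetical domain-agnostic decomposer: split a task into ordered steps."""
    return [s.strip() for s in problem.split(" then ") if s.strip()]

def arithmetic_solver(step: Step, state: list):
    """Hypothetical domain-specific solver for integer steps."""
    op, value = step.split()
    current = state[-1] if state else 0
    return current + int(value) if op == "add" else current * int(value)

def text_solver(step: Step, state: list):
    """Hypothetical domain-specific solver for string steps."""
    op, value = step.split()
    current = state[-1] if state else ""
    return current + value if op == "append" else value + current

def run(problem: str, solver: Solver):
    """Reuse the same decomposer; only the solver changes per domain."""
    state = []
    for step in decompose(problem):
        state.append(solver(step, state))
    return state[-1]

print(run("add 2 then multiply 5 then add 1", arithmetic_solver))  # 11
print(run("append ab then prepend x then append c", text_solver))  # xabc
```

The design point is that `decompose` never changes between domains while each solver is written for exactly one, which mirrors the reported asymmetry: the planning module is the portable part.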

Now the pessimistic side, which is what makes 'that transfer' the load-bearing phrase in your question. Transformers often *appear* compositional while actually memorizing computation subgraphs from training data: they succeed in-distribution and then fail drastically on novel combinations, with errors compounding step by step (Do transformers actually learn systematic compositional reasoning?). The deeper diagnosis is the binding problem: networks struggle to segregate entities, keep their representations separate, and recombine them in new ways, which are the three things genuine modular transfer requires (Why do neural networks fail at compositional generalization?). And the most unsettling result: two networks can produce *identical outputs on every input* while one has clean structure and the other has 'fractured, entangled' internals, and it's precisely the fractured one that can't transfer to novel contexts or recombine creatively (Can identical outputs hide broken internal representations?). Benchmarks can't see this difference at all (Can AI pass every test while understanding nothing?).
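A loose toy analogy for "identical outputs, different internals" is shown below. It assumes two hand-written functions, `clean_net` and `entangled_net` (both hypothetical), that compute the same identity map, one directly and one through large weights that cancel. It is not the construction used in the Fractured Entangled Representation work; it only illustrates why an output-level test cannot distinguish the two while a small weight perturbation can.

```python
def clean_net(x, w_hidden=1.0, w_out=1.0):
    """Computes y = x through a single well-scaled path."""
    return w_out * (w_hidden * x)

def entangled_net(x, w_pos=100.0, w_neg=99.0):
    """Computes the same y = x, but as the difference of two large paths."""
    return w_pos * x - w_neg * x

inputs = list(range(-5, 6))

# Output-level check: the two networks agree on every tested input.
same_outputs = all(clean_net(x) == entangled_net(x) for x in inputs)

def max_shift(net, perturbed_kwargs):
    """Largest output change when one weight is perturbed."""
    return max(abs(net(x, **perturbed_kwargs) - net(x)) for x in inputs)

# Scale one weight in each network by the same relative amount (1%).
clean_shift = max_shift(clean_net, {"w_hidden": 1.0 * 1.01})
entangled_shift = max_shift(entangled_net, {"w_pos": 100.0 * 1.01})

print("same outputs:", same_outputs)
print("shift under 1% perturbation, clean:", clean_shift)
print("shift under 1% perturbation, entangled:", entangled_shift)
```

The benchmark-style comparison (`same_outputs`) passes, yet the same relative perturbation moves the cancelling network's output far more. Probing internals, rather than scoring outputs, is what exposes the difference.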

The synthesis worth taking away: decomposition into transferable modules is real, but it's a property of *internal structure*, not output behavior, and the two can diverge completely. Scaling helps by making compositional representations emerge when training covers enough of the combination space (Can neural networks learn compositional skills without symbolic mechanisms?), but coverage that produces correct answers does not guarantee the clean, separable circuitry that lets a module move to a task it never saw. The frontier question isn't 'can networks be modular' but 'how do we tell true modularity from a convincing forgery,' since the thing that transfers is invisible to the tests we usually trust.
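One way to make "coverage of the combination space" concrete is to build an evaluation split in which every individual module appears in training but the held-out pairs are combinations never seen together. The following is a minimal sketch under that assumption; `compositional_split`, the `train_fraction` parameter, and the module names in the usage lines are hypothetical and not drawn from the cited papers.

```python
from itertools import product
import random

def compositional_split(modules_a, modules_b, train_fraction=0.6, seed=0):
    """Split all (a, b) pairs into seen and novel combinations.

    Every individual module appears in training at least once; held-out
    pairs combine familiar modules in ways training never showed.
    """
    rng = random.Random(seed)
    pairs = list(product(modules_a, modules_b))
    rng.shuffle(pairs)

    train, held_out = [], []
    seen_a, seen_b = set(), set()
    # First pass: any pair introducing an unseen module goes to training.
    for a, b in pairs:
        if a not in seen_a or b not in seen_b:
            train.append((a, b))
            seen_a.add(a)
            seen_b.add(b)
        else:
            held_out.append((a, b))

    # Second pass: top up training until the requested coverage is reached.
    target = int(train_fraction * len(pairs))
    while len(train) < target and held_out:
        train.append(held_out.pop())

    coverage = len(train) / len(pairs)
    return train, held_out, coverage

ops = ["sort", "filter", "map", "sum"]          # hypothetical module names
types = ["ints", "floats", "strings", "dates"]  # hypothetical module names
train, held_out, coverage = compositional_split(ops, types)
print(f"coverage={coverage:.2f}, novel combinations={len(held_out)}")
```

Accuracy on `held_out` measures recombination rather than recall, and sweeping `train_fraction` shows how much coverage a given model needs. As the paragraph above notes, a good score here is still output-level evidence and does not by itself establish separable circuitry.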

Sources (10 notes)

Do neural networks naturally learn modular compositional structure?

Pruning experiments reveal that neural networks implement compositional subroutines in isolated subnetworks, with ablations affecting only their corresponding function. Pretraining substantially increases the consistency and reliability of this modular structure across architectures and domains.

Can length generalization transfer between different related tasks?

Models trained jointly on related tasks reuse the same attention heads to handle length generalization, allowing shorter tasks to extrapolate beyond their training length. Pretrained models already contain this reusable computational scaffolding.

Can sparse weight training make neural networks interpretable by design?

Training transformers with sparse weights creates compact, human-interpretable circuits where neurons correspond to simple concepts with clear connections. Ablation studies confirm these circuits are necessary and sufficient for task performance, though scaling beyond tens of millions of parameters while maintaining interpretability remains unsolved.

Does separating planning from execution improve reasoning accuracy?

Modular architectures with separate decomposer and solver models outperform monolithic LLMs, with decomposition ability transferring across domains while solving ability does not. The separation prevents planning-execution interference and produces more generalizable skills.

Can breaking function calling into subtasks improve model generalization?

Granite-20B-FunctionCalling shows that explicit training across seven granular subtasks—nested calls, chaining, parallel functions, name detection, parameter detection, next-best function, and response generation—generalizes better than umbrella datasets like ToolLLM. This multi-task approach closes the performance gap with GPT, Claude, and Gemini.

Do transformers actually learn systematic compositional reasoning?

Research shows transformers succeed on in-distribution tasks by memorizing computation subgraphs from training data, not by learning systematic rules. They fail drastically on novel compositions, with errors compounding across reasoning steps.

Why do neural networks fail at compositional generalization?

Greff et al. argue that neural networks cannot dynamically bind distributed information into compositional structures due to three failures: segregating entities from inputs, maintaining representational separation, and reusing learned structure in novel combinations. Scaling can partially overcome this by enabling compositional representations to emerge.

Can identical outputs hide broken internal representations?

Networks trained with SGD reproduce outputs perfectly while having radically different internal structure than evolved networks, with weight perturbations revealing fractured, entangled representations that prevent transfer to novel contexts or creative recombination.

Can AI pass every test while understanding nothing?

The Fractured Entangled Representation hypothesis shows that SGD-trained networks can produce identical outputs across all inputs while maintaining radically different internal representations. Standard benchmarks cannot detect this structural difference.

Can neural networks learn compositional skills without symbolic mechanisms?

Standard MLPs achieve compositional generalization through data and model scaling alone, without architectural modifications, provided the training distribution sufficiently covers combinations of task modules. Linear decodability of constituents from hidden activations reliably predicts success.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a researcher auditing claims about neural network modularity and transfer. The question remains open: *How do networks decompose tasks into modular subnetworks that actually transfer?*

What a curated library found — and when (dated claims, not current truth):
Findings span 2020–2026. Key constraints and decompositions documented:

• Pruning and sparse training produce clean, ablation-verified modular circuits; pretraining sharpens modularity across architectures (2023–2025).
• Length generalization transfers via shared attention heads across related tasks; pretrained models ship with this scaffolding (2025).
• Decomposition skill (breaking problems into steps) transfers across domains; execution skill does not. This asymmetry was observed in explicitly engineered decomposer/solver systems, not in emergent modules (2024).
• Transformers often *appear* compositional while memorizing computation subgraphs; they fail drastically on novel combinations, with errors compounding step-by-step (2023).
• Two networks can produce identical outputs while one has clean structure and the other has fractured, entangled internals — the entangled one cannot transfer to novel contexts (2025).
• Compositional generalization emerges from scaling data and model size, but coverage producing correct answers does not guarantee separable circuitry (2025).

Anchor papers (verify; mind their dates):
• 2020-12: arXiv:2012.05208 — On the Binding Problem
• 2024-02: arXiv:2402.15000 — Divide-or-Conquer? Which Part to Distill
• 2025-05: arXiv:2505.11581 — Fractured Entangled Representations
• 2025-07: arXiv:2507.07207 — Scaling and Compositional Generalization

Your task:
(1) RE-TEST EACH CONSTRAINT. For every finding above, judge whether newer models (Claude 3.5+, o1, Grok-3), training methods (layer-wise distillation, routing mechanisms, mixture-of-experts), or evaluation protocols (adversarial composition benchmarks, mechanistic interpretability tooling) have since RELAXED or OVERTURNED it. Separate the durable question (networks *can* decompose; how do we verify it transfers?) from perishable limitations (e.g., 'entanglement prevents transfer' — has routing or architectural priors solved this?). Cite what resolved it.

(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Has anyone shown that fine-tuning, in-context learning, or multi-agent orchestration bypasses the binding/entanglement problem entirely?

(3) Propose 2 research questions that ASSUME the regime has moved: (a) If modularity is now reliable under scaling/routing, what *new* bottleneck blocks transfer? (b) If entanglement persists, what representation-level interventions (e.g., orthogonal initialization, causality constraints) force clean decomposition?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
