SYNTHESIS NOTE

Do neural networks naturally learn modular compositional structure?

Explores whether neural networks decompose compositional tasks into distinct subroutines without explicit symbolic design. A positive answer challenges the longstanding view that neural networks are fundamentally non-compositional.

Synthesis note · 2026-02-23 · sourced from MechInterp

Structural compositionality is the extent to which neural networks break down compositional tasks into subroutines and implement them in modular subnetworks. The alternative: matching inputs to learned templates without task decomposition.

The evidence supports compositionality. When model pruning is used to isolate subnetworks:

Subnetworks that implement one subroutine can be identified
Ablating a subnetwork harms its corresponding subroutine while leaving others largely intact
This holds across multiple architectures (CNNs, transformers), tasks (vision, language), and scales
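
The ablation logic behind these findings can be illustrated with a minimal sketch. It is a toy, not the paper's procedure: the network is hand-wired rather than trained, the subnetworks are assigned by construction rather than discovered by pruning, and the names WEIGHTS, forward, and accuracy are hypothetical. It only shows what "ablating a subnetwork harms its own subroutine and spares the other" looks like as a measurement.

```python
import random

# Hand-wired hidden layer (an assumption for illustration, not a trained model).
# Units 0-1 serve subroutine A (is x0 > x1?); units 2-3 serve subroutine B (is x2 > x3?).
WEIGHTS = [
    [1, -1, 0, 0],
    [-1, 1, 0, 0],
    [0, 0, 1, -1],
    [0, 0, -1, 1],
]

def relu(v):
    return max(0.0, v)

def forward(x, mask):
    """Run the toy network; mask[i] == 0 ablates hidden unit i."""
    hidden = [
        m * relu(sum(w * xi for w, xi in zip(row, x)))
        for m, row in zip(mask, WEIGHTS)
    ]
    # Each subroutine reads only its own pair of hidden units.
    return hidden[0] - hidden[1] > 0, hidden[2] - hidden[3] > 0

def accuracy(mask, n=1000, seed=0):
    """Return (accuracy on A, accuracy on B) over n random inputs."""
    rng = random.Random(seed)
    correct_a = correct_b = 0
    for _ in range(n):
        x = [rng.random() for _ in range(4)]
        pred_a, pred_b = forward(x, mask)
        correct_a += pred_a == (x[0] > x[1])
        correct_b += pred_b == (x[2] > x[3])
    return correct_a / n, correct_b / n

full = accuracy([1, 1, 1, 1])       # nothing ablated
without_a = accuracy([0, 0, 1, 1])  # subnetwork for A ablated
without_b = accuracy([1, 1, 0, 0])  # subnetwork for B ablated
print("intact:", full)
print("A ablated:", without_a)
print("B ablated:", without_b)
```

In this toy the intact network solves both subroutines, and zeroing one pair of units drops only the matching subroutine to roughly chance. A real network would be expected to show a weaker, graded version of this pattern.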

The pretraining effect: models initialized with pretrained weights more reliably produce modular subnetworks than randomly initialized models. Self-supervised pretraining appears to create internal structure that is more amenable to compositional decomposition. This suggests that the representations learned during pretraining have a modular quality that fine-tuning can exploit.

This evidence counters the longstanding objection that neural networks are fundamentally non-compositional. The finding: "some simple pseudo-symbolic computations might be learned directly from data using standard gradient-based optimization techniques." Explicit symbolic mechanisms may be unnecessary: gradient-based optimization can discover compositional structure when the task demands it and pretraining provides a good initialization.

The result is not perfect: "most do not exhibit perfect task decomposition." Compositionality is partial and graded, not all-or-nothing. Some architecture-task combinations show stronger structural compositionality than others.

This connects to the weight-sparsity finding. The note "Can sparse weight training make neural networks interpretable by design?" shows that enforcing sparsity produces clean decomposition. The structural compositionality paper shows that decomposition also emerges naturally, albeit imperfectly, from standard training. Sparsity amplifies a tendency that already exists.


Related concepts in this collection 3


Can sparse weight training make neural networks interpretable by design? Explores whether constraining most model weights to zero during training produces human-understandable circuits and disentangled representations, rather than attempting to reverse-engineer dense models after training.
sparsity amplifies the compositional decomposition that standard training already partially produces
Do base models already contain hidden reasoning ability? Explores whether reasoning capability emerges during pre-training as a latent feature rather than being created by post-training methods like reinforcement learning or fine-tuning.
pretraining-induced modularity is part of the "latent capability" that minimal signals can activate
Can neural networks learn compositional skills without symbolic mechanisms? Do neural networks need explicit symbolic architecture to compose learned concepts, or can scaling alone enable compositional generalization? This asks whether compositionality is an architectural feature or an emergent property of scale.
complementary evidence: scaling enables compositionality in behavior; pruning reveals it in structure
