Are some problems fundamentally unsolvable by parallel inference methods?
This explores whether some problems can't be cracked by running many reasoning attempts in parallel (sampling, voting, width) — and must instead be worked through step by step.
This explores whether some problems are genuinely beyond the reach of parallel inference — throwing more independent attempts at a question and voting — versus needing sequential, step-by-step reasoning. The corpus gives a surprisingly firm answer: yes, some problems are. The sharpest version comes from complexity theory, which proves that problems requiring polynomial-depth reasoning cannot be solved by parallel architectures like Transformers no matter how much you scale them — real progress requires recurrent structures that add serial computation depth Can parallel architectures solve inherently sequential problems?. That's a wall, not a tuning knob.
You can see the wall in practice, not just theory. On compositional tasks like graph connectivity — where you genuinely have to accumulate intermediate results before the next step makes sense — chain-of-thought beats parallel voting by an *exponential* margin, because short parallel chains simply can't reach the answer they never built up to When does sequential reasoning beat parallel voting?. So the boundary isn't about how clever the model is; it's about whether the problem's solution can be reached by independent guesses or only by a chain where each link depends on the last.
What makes this interesting is that parallel inference is genuinely powerful — just not universally. Under the same token budget, sampling multiple independent reasoning paths and taking a majority vote can beat extending one long chain by up to 22%, because diverse parallel samples explore the solution space more faithfully than a single chain that just inflates variance Why does parallel reasoning outperform single chain thinking?. And reasoning systems can scale in *width* — sampling parallel latent trajectories — to sidestep the latency cost of going deeper Can reasoning systems scale wider instead of only deeper?. So parallelism wins on problems whose answers can be sampled and checked; it loses on problems whose answers must be derived in order. The trap is assuming width can always substitute for depth.
There's also a subtler reframing worth knowing: failures attributed to 'hard' problems may not be about inherent complexity at all. One line of work argues that models break not at complexity thresholds but at *instance-novelty* boundaries — they fit patterns from problems they've seen rather than learning a general algorithm, so any reasoning chain succeeds if the model was trained on similar instances, regardless of length Do language models fail at reasoning due to complexity or novelty?. That suggests two different ceilings are in play: a structural one (parallelism can't fake serial depth) and a learning one (the model never internalized the procedure). And clever decompositions blur the line — methods that contract a problem into a memoryless chain of states Can reasoning systems forget history without losing coherence? or decouple reasoning from tool execution to expose hidden parallelism Can reasoning and tool execution be truly decoupled? show that some apparently-sequential work can be restructured to run wider than it first appears.
The takeaway you didn't know you wanted: 'parallel vs. sequential' isn't a preference or an efficiency trick — for a specific class of problems it's a provable capability boundary, and the engineering art is figuring out which side of that line your problem actually sits on.
Sources 7 notes
Complexity theory proves that problems requiring polynomial-depth reasoning cannot be solved by parallel architectures like Transformers, even with infinite scaling. Progress requires recurrent structures that increase serial computation depth.
On structured tasks requiring sequential multi-step reasoning like graph connectivity, chain-of-thought achieves exponentially higher accuracy than parallel voting. The difference emerges because solutions genuinely require accumulating intermediate results sequentially, which short parallel chains cannot achieve.
Multiple independent reasoning paths with majority voting achieve up to 22% higher accuracy than extending a single chain under the same token budget. Parallel diversity samples reasoning capability more faithfully than sequential extension, which inflates variance without improving correctness.
GRAM shows that stochastic latent transitions enabling parallel trajectory sampling sidestep the serial latency cost of depth-only scaling. Width matches token-level parallelism benefits: independent paths sample the solution space without variance inflation.
LRMs don't break at complexity thresholds but at instance-novelty boundaries. Models fit instance-based patterns rather than generalizable algorithms, so any reasoning chain succeeds if trained on similar instances, regardless of length.
Atom of Thoughts decomposes problems into DAGs and contracts them iteratively, ensuring each state depends only on the current problem—not prior steps. This memoryless approach eliminates historical baggage that bloats reasoning while maintaining answer equivalence.
ReWOO and Chain-of-Abstraction both decouple reasoning from tool responses through different mechanisms—planning-before-execution and abstract placeholders respectively—eliminating quadratic prompt growth and sequential latency while maintaining reasoning quality.