SYNTHESIS NOTE

Does planning direction affect how hard problems become?

Planning research typically goes forward only. But some problems get easier when you work backward from the goal. What makes direction matter, and can language models exploit this?

Synthesis note · 2026-02-22 · sourced from Reasoning Architectures

Most LLM planning research studies the forward direction only: generating steps from the initial state toward the goal. But many planning problems exhibit an inherent directional asymmetry, in which generating the correct final steps leading to the goal can be much easier than generating the correct steps from the beginning. This asymmetry is driven by bottlenecks near the goal.

The canonical example: a robot navigating to a bedroom at the end of a narrow hallway. Planning backward from the bedroom, the first step is constrained by the hallway (one possible path). Planning forward from the start, possibilities fan out quickly before the hallway constraint appears. The backward direction is easier because the bottleneck constrains the search space earlier in the backward chain.
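The hallway example can be made concrete with a small sketch. The floor plan below is a hypothetical toy graph (room names and layout are assumptions, not taken from the source), and counting newly reached rooms per search depth is only a rough proxy for how many options a planner faces at each step.

```python
# Hypothetical toy floor plan (undirected): open rooms near the start,
# then a narrow hallway (h1 - h2) leading to the bedroom.
FLOOR_PLAN = {
    "start": ["kitchen", "living", "office"],
    "kitchen": ["start", "pantry", "living"],
    "living": ["start", "kitchen", "h1"],
    "office": ["start", "closet"],
    "pantry": ["kitchen"],
    "closet": ["office"],
    "h1": ["living", "h2"],
    "h2": ["h1", "bedroom"],
    "bedroom": ["h2"],
}


def frontier_sizes(graph, source, depth):
    """Return how many new nodes a breadth-first search reaches at depths 1..depth."""
    seen = {source}
    frontier = [source]
    sizes = []
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        sizes.append(len(next_frontier))
        frontier = next_frontier
    return sizes


# Forward: options fan out before the hallway is reached.
forward = frontier_sizes(FLOOR_PLAN, "start", 3)
# Backward: the hallway leaves a single option at each early step.
backward = frontier_sizes(FLOOR_PLAN, "bedroom", 3)
print("forward:", forward, "backward:", backward)
```

In this toy layout the backward search has exactly one choice at each of its first three steps, while the forward search has several, which is the asymmetry the example describes.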

The LLM finding: planning performance correlates with the planning complexity of the problem in that direction. This means which direction is easier is problem-specific, not universal. The paper demonstrates this holds for LLM planning, not just analytical planning theory.

However, LLMs show a systematic bias when planning backward: performance degrades when models are asked to plan in the backward direction directly, mirroring the difficulty humans intuitively have with backward reasoning. The solution is to flip the problem: swap the start and goal, then plan forward in the flipped problem. This avoids the backward bias while exploiting the backward direction's structural advantage.
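A minimal sketch of the flip, under stated assumptions: the `Problem` type, the `flip` and `plan_backward_via_flip` helpers, and the toy edges are all hypothetical names introduced for illustration. Reversing each directed edge during the flip is also an assumption, made so that a plan found in the flipped problem maps back to a valid plan in the original. Breadth-first search stands in for the forward planner, which in the note's setting would be a model call.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    start: str
    goal: str
    edges: frozenset  # directed (src, dst) pairs


def flip(problem):
    """Swap start and goal, and reverse every edge (assumption for directed graphs)."""
    reversed_edges = frozenset((dst, src) for src, dst in problem.edges)
    return Problem(problem.goal, problem.start, reversed_edges)


def plan_forward(problem):
    """Stand-in forward planner: BFS from start to goal, returning a node path or None."""
    parents = {problem.start: None}
    queue = deque([problem.start])
    while queue:
        node = queue.popleft()
        if node == problem.goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for src, dst in sorted(problem.edges):
            if src == node and dst not in parents:
                parents[dst] = node
                queue.append(dst)
    return None


def plan_backward_via_flip(problem):
    """Plan forward in the flipped problem, then reverse the plan for the original."""
    flipped_plan = plan_forward(flip(problem))
    return None if flipped_plan is None else flipped_plan[::-1]


original = Problem("start", "bedroom", frozenset({
    ("start", "living"), ("start", "office"),
    ("living", "h1"), ("h1", "bedroom"),
}))
plan = plan_backward_via_flip(original)
print(plan)
```

The planner only ever runs in the forward direction; the backward structure enters through the problem it is given, which is the point of flipping rather than prompting for backward steps directly.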

Results: Combining planning in both directions with self-verification improves overall planning success by 4–24% across three planning domains. The diversity of candidate plans (forward + backward together) exceeds either direction alone.
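One way to picture the combination is a pool of candidates from both directions passed through a verification step. The sketch below is an illustration under assumptions, not the source's procedure: the candidate plans are hard-coded hypothetical outputs, and a programmatic edge check stands in for self-verification, which in the note's setting would be performed by the model.

```python
def is_valid_plan(plan, start, goal, edges):
    """Check that the plan starts at start, ends at goal, and uses only known edges."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    return all((a, b) in edges for a, b in zip(plan, plan[1:]))


def select_plan(candidates, start, goal, edges):
    """Return the first candidate that passes verification, or None if none does."""
    for candidate in candidates:
        if is_valid_plan(candidate, start, goal, edges):
            return candidate
    return None


# Hypothetical toy domain: directed moves between rooms.
EDGES = {
    ("start", "living"), ("start", "office"),
    ("living", "h1"), ("h1", "bedroom"),
}

# Hypothetical sampled outputs; in practice these would come from model calls.
forward_candidates = [["start", "office", "bedroom"]]  # invalid: no office -> bedroom move
flipped_candidates = [["bedroom", "h1", "living", "start"]]  # produced in the flipped problem

# Map flipped plans back to the original direction before verifying.
backward_candidates = [candidate[::-1] for candidate in flipped_candidates]

chosen = select_plan(forward_candidates + backward_candidates, "start", "bedroom", EDGES)
print(chosen)
```

Here the forward candidate fails verification and the candidate from the flipped problem passes, so pooling both directions recovers a plan that a forward-only pool would have missed.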

This connects to the question "How should we balance parallel versus sequential compute at test time?", but here the dimension is directional rather than just parallel versus sequential. Generating diverse candidates by exploring different directions is a form of parallel planning.

Original note title

backward planning reduces difficulty when goal states have bottlenecks by constraining the early search space