How does decomposed prompting formalize prompt libraries as reusable software modules?

This explores whether breaking a big task into small, named, reusable prompt-steps turns prompting from one-off text into something closer to software — with modules, control flow, and debuggable parts — and what the corpus says about why that move works.

The corpus doesn't have a paper titled 'decomposed prompting,' but it maps the conceptual territory well. The clearest anchor is the idea of the LLM Program: instead of asking a model to solve a hard task in one shot, you wrap it in an explicit algorithm that manages control flow and state, and hand each model call only the slice of context that step needs (see: Can algorithms control LLM reasoning better than LLMs alone?). That single move is what makes prompts behave like software modules: each call has a defined input, a defined job, and can be tested and swapped on its own. The 'information hiding' framing is borrowed straight from programming: a module shouldn't see state that isn't its business.
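The LLM Program idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation from any of the cited notes: `PromptModule`, `run_program`, `echo_llm`, and the two-step `PIPELINE` are hypothetical names, and `echo_llm` is a stand-in for a real model call so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion.
LLM = Callable[[str], str]


@dataclass(frozen=True)
class PromptModule:
    name: str           # key under which this module's output is stored
    template: str       # prompt text with {placeholders}
    inputs: List[str]   # the only state keys this module is allowed to see

    def run(self, llm: LLM, state: Dict[str, str]) -> str:
        # Information hiding: the prompt is built from declared inputs only.
        visible = {key: state[key] for key in self.inputs}
        return llm(self.template.format(**visible))


def run_program(modules: List[PromptModule], llm: LLM,
                state: Dict[str, str]) -> Dict[str, str]:
    # Control flow and state live in ordinary code, not inside a prompt.
    state = dict(state)  # copy so the caller's dict is not mutated
    for module in modules:
        state[module.name] = module.run(llm, state)
    return state


def echo_llm(prompt: str) -> str:
    # Assumption: stand-in for a real model call.
    return f"<answer to: {prompt}>"


# Hypothetical two-step pipeline: extract facts, then summarize them.
PIPELINE = [
    PromptModule("facts", "List the key facts in: {document}", ["document"]),
    PromptModule("summary", "Summarize these facts: {facts}", ["facts"]),
]

if __name__ == "__main__":
    final = run_program(PIPELINE, echo_llm,
                        {"document": "raw text", "user_id": "u-17"})
    print(final["summary"])
```

Because each module declares its inputs, the `summary` step never sees `document` or `user_id`, and either step can be tested or replaced with a different template without touching the other.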

The deeper reason this formalization matters is that it rests on a surprising result: a transformer is, in principle, programmable. Research proves that a single finite-size transformer exists that can compute any computable function given the right prompt, which means a prompt is less like a question and more like a program you're feeding to a fixed processor (see: Can a single transformer become universally programmable through prompts?). The same result comes with a caveat: standard training rarely produces models that learn to implement arbitrary programs this way. Decomposed prompting is the engineering discipline that makes that theoretical programmability usable: you don't trust one giant prompt to encode the whole algorithm; you externalize the control flow into code you can read.

Decomposition also pays off the way good module boundaries do: by eliminating redundancy. When reasoning and tool calls are tangled together, every step re-reads everything before it, so total prompt length grows quadratically with the number of steps and the calls run sequentially. Decoupling the planning from the tool observations (plan first and fill in results later, as in ReWOO, or reason over abstract placeholders, as in Chain-of-Abstraction) cuts that waste and unlocks parallelism (see: Can reasoning and tool execution be truly decoupled?). A related instinct shows up in recursive language models, which treat a giant prompt as an external environment to be queried by code rather than stuffed into one context window (see: Can models treat long prompts as external code environments?). Again, the prompt becomes data a program operates on, not a monolith the model must swallow whole.
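The plan-first pattern described above can be sketched as follows. This is a loose illustration in the spirit of ReWOO, not its actual code: the `#E1`-style placeholder syntax, the function names, and the toy tools are assumptions made for the example. The plan is fixed here; in practice it would come from a single planner call that never sees tool output.

```python
import re
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]   # (evidence variable, tool name, tool argument)
Tool = Callable[[str], str]

PLACEHOLDER = re.compile(r"#E\d+")


def execute_plan(plan: List[Step], tools: Dict[str, Tool]) -> Dict[str, str]:
    # Worker stage: run tools in order with no model call in the loop.
    evidence: Dict[str, str] = {}
    for var, tool_name, arg in plan:
        # Swap placeholders such as #E1 for results produced earlier.
        resolved = PLACEHOLDER.sub(lambda m: evidence[m.group(0)], arg)
        evidence[var] = tools[tool_name](resolved)
    return evidence


def build_solver_prompt(task: str, plan: List[Step],
                        evidence: Dict[str, str]) -> str:
    # Solver stage: one prompt that sees the plan and all evidence once.
    lines = [f"Task: {task}"]
    for var, tool_name, arg in plan:
        lines.append(f"{var} = {tool_name}[{arg}] -> {evidence[var]}")
    lines.append("Answer the task using the evidence above.")
    return "\n".join(lines)


# Toy tools and plan, made up for illustration only.
TOY_TOOLS: Dict[str, Tool] = {
    "lookup": lambda q: {"capital of France": "Paris"}.get(q, "unknown"),
    "length": lambda s: str(len(s)),
}
TOY_PLAN: List[Step] = [
    ("#E1", "lookup", "capital of France"),
    ("#E2", "length", "#E1"),
]

if __name__ == "__main__":
    found = execute_plan(TOY_PLAN, TOY_TOOLS)
    print(build_solver_prompt("How long is the capital's name?", TOY_PLAN, found))
```

The model is consulted once to plan and once to solve, instead of once per tool call with the full history re-sent each time. Steps whose arguments contain no placeholders do not depend on each other, so a fuller version could run them concurrently.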

Here's what a curious reader might not expect, though: a reusable prompt module is a leakier abstraction than a software function. Two prompts with identical meaning can produce systematically different output quality, because the model responds to how frequently a phrasing appeared in pre-training, not to its logic (see: Why do semantically identical prompts produce different LLM outputs?). And a prompt module optimized in isolation can systematically underperform, because its quality depends on the inference strategy wrapped around it, such as best-of-N or majority voting. Tune the prompt and the inference strategy together and the reported gain is up to 50% across reasoning and generation tasks (see: Does prompt optimization without inference strategy fail?). So the 'module' has hidden global dependencies that real functions don't.
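The dependency between a prompt and its inference strategy can be made concrete with a small search sketch. Everything here is hypothetical: the function names, the prompt and strategy labels, and above all the numbers in `TOY_SCORES`, which are invented to show the shape of the problem and are not results from the cited work. A real `score` function would run an evaluation set through the model.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical scorer: maps (prompt, strategy) to an accuracy-like number.
Scorer = Callable[[str, str], float]


def best_prompt_in_isolation(prompts: Sequence[str], default_strategy: str,
                             score: Scorer) -> str:
    # Tune the prompt while pretending the inference strategy is fixed.
    return max(prompts, key=lambda p: score(p, default_strategy))


def best_joint(prompts: Sequence[str], strategies: Sequence[str],
               score: Scorer) -> Tuple[str, str]:
    # Search prompt and inference strategy together.
    return max(product(prompts, strategies), key=lambda pair: score(*pair))


# Made-up scores for illustration only; not measured data.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("terse", "single"): 0.60,
    ("terse", "majority_vote"): 0.62,
    ("verbose", "single"): 0.55,
    ("verbose", "majority_vote"): 0.70,
}


def toy_score(prompt: str, strategy: str) -> float:
    return TOY_SCORES[(prompt, strategy)]


if __name__ == "__main__":
    prompts = ["terse", "verbose"]
    strategies = ["single", "majority_vote"]
    print(best_prompt_in_isolation(prompts, "single", toy_score))
    print(best_joint(prompts, strategies, toy_score))
```

In this toy table the prompt that wins under single sampling is not the one that wins once majority voting is applied, which is the kind of hidden coupling the paragraph above describes.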

The honest ceiling: decomposition reorganizes capability; it doesn't create it. Prompt optimization can only activate knowledge already in the model, so no clever module architecture supplies knowledge the model never learned (see: Can prompt optimization teach models knowledge they lack?). Formalizing prompt libraries as software is real and useful for structure, testability, and reuse, but you're composing a fixed processor's existing skills, not writing new instructions into it.

Sources (7 notes)

Can algorithms control LLM reasoning better than LLMs alone?

LLM Programs embed LLMs within explicit algorithms that manage control flow and state, presenting only step-specific context to each LLM call. This information hiding addresses capability and context window limits while treating complex reasoning as modular, debuggable sub-tasks.

Can a single transformer become universally programmable through prompts?

Research proves a single finite-size transformer exists that can compute any computable function given the right prompt, achieving complexity bounds nearly matching unbounded models. However, standard training rarely produces models that learn to implement arbitrary programs this way.

Can reasoning and tool execution be truly decoupled?

ReWOO and Chain-of-Abstraction both decouple reasoning from tool responses through different mechanisms—planning-before-execution and abstract placeholders respectively—eliminating quadratic prompt growth and sequential latency while maintaining reasoning quality.

Can models treat long prompts as external code environments?

Recursive Language Models store long prompts in a Python REPL and query them via code execution, avoiding attention degradation. RLMs outperform base models even on shorter prompts while handling inputs two orders of magnitude beyond context windows.

Why do semantically identical prompts produce different LLM outputs?

Cao et al. and Adam's Law show that semantically identical prompts with different sentence-level frequencies produce systematically different output quality. Higher-frequency phrasings win because models register statistical mass from pre-training, not meaning.

Does prompt optimization without inference strategy fail?

Prompts optimized without knowledge of the inference strategy (best-of-N, majority voting) systematically underperform. Joint optimization of both prompt and inference strategy yields up to 50% improvement across reasoning and generation tasks.

Can prompt optimization teach models knowledge they lack?

Prompting works entirely within a model's pre-existing training distribution and cannot supply domain knowledge absent from training data. This creates a hard ceiling: no prompt strategy can compensate for missing foundational knowledge, only reorganize what already exists.
