Can auxiliary modules preserve reasoning without catastrophic forgetting?

This explores whether you can bolt reasoning ability onto a model through add-on modules — keeping the core model frozen — instead of retraining it and risking the model 'forgetting' what it already knew.

The corpus answers with a fairly confident yes, and the recurring trick is architectural separation: don't touch the weights that hold what the model already knows. The clearest version is SoftCoT, which freezes the main LLM entirely and delegates 'soft' continuous thinking to a small auxiliary model attached alongside it. The big model's pre-trained knowledge stays intact while the small helper supplies the reasoning (see: Can continuous reasoning avoid forgetting in instruction-tuned models?). The reframe worth taking away: forgetting isn't an inherent tax you pay for adaptation; it's a misallocation problem, the result of updating the wrong thing.
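To make the separation concrete, here is a minimal toy sketch in plain Python. It is not SoftCoT's actual architecture: `FrozenBase`, `Auxiliary`, `train_auxiliary`, the scalar weights, and the training data are all hypothetical stand-ins chosen for illustration. The only point it shows is that a new task can be fitted entirely through a small helper's parameters while the base model's weights are never written to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenBase:
    """Stand-in for the frozen LLM; frozen=True makes its weights immutable."""
    w_x: float = 2.0  # weight on the ordinary input
    w_t: float = 1.0  # weight on the extra 'soft thought' input

    def forward(self, x: float, thought: float = 0.0) -> float:
        return self.w_x * x + self.w_t * thought


class Auxiliary:
    """Small trainable helper that emits one soft thought per input."""

    def __init__(self) -> None:
        self.a = 0.0  # the only trainable parameter in this sketch

    def thought(self, x: float) -> float:
        return self.a * x


def train_auxiliary(base, aux, data, lr=0.05, epochs=200):
    """Fit the helper by SGD on squared error; the base is only ever read."""
    for _ in range(epochs):
        for x, target in data:
            err = base.forward(x, aux.thought(x)) - target
            # d(prediction)/d(a) = w_t * x, so this is the gradient of 0.5 * err**2
            aux.a -= lr * err * base.w_t * x
    return aux


base = FrozenBase()
# Hypothetical new task: y = 5x, which the base alone (y = 2x) cannot produce.
aux = train_auxiliary(base, Auxiliary(), data=[(1.0, 5.0), (2.0, 10.0), (-1.0, -5.0)])
```

In this toy, the helper's parameter settles near 3, so the combined system computes 5x, while calling `base.forward(x)` without a thought still returns the original 2x. Nothing was overwritten, so there is nothing to forget.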

That reframe gets stated outright in the Fast-Slow Training work, which splits adaptation into two channels: slow-moving weights and fast-moving textual context. Task-specific lessons get routed into optimized prompts while parameter changes are kept minimal, reaching equivalent performance 1.4–3x faster with substantially less forgetting (see: Can splitting adaptation into two channels reduce forgetting?). Both results say the same thing from different angles: knowledge lives in a frozen or barely touched substrate, and new capability lives in a lightweight, swappable layer beside it.

The most radical version of 'auxiliary module' is to externalize the new skills entirely, out of the network and into a library. VOYAGER stores executable skills in an embedding-indexed library and composes complex skills from simple ones, so an agent learns continuously without any weight updates at all: if no weights change, there is nothing to forget (see: Can agents learn new skills without forgetting old ones?). The ACE framework does the textual equivalent: it treats the context itself as an evolving playbook, growing it through incremental generate-reflect-curate edits rather than full rewrites, which avoids the 'context collapse' where detail gets compressed away over iterations (see: Can context playbooks prevent knowledge loss during iteration?). These are forgetting-prevention strategies that never go near gradient descent.
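The skill-library idea can be sketched in a few lines of plain Python. This is an illustrative toy, not VOYAGER's implementation: the bag-of-words `embed` function stands in for a learned text encoder, and `SkillLibrary`, its methods, and the three example skills are hypothetical names introduced here. It shows the two properties the paragraph describes: skills are retrieved by embedding similarity, and new skills are composed from old ones by appending to the library rather than updating any weights.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real system would use a learned encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SkillLibrary:
    """Append-only store of (description, embedding, callable) entries."""

    def __init__(self) -> None:
        self._skills: List[Tuple[str, Counter, Callable]] = []

    def add(self, description: str, fn: Callable) -> None:
        self._skills.append((description, embed(description), fn))

    def retrieve(self, query: str) -> Tuple[str, Callable]:
        """Return the stored skill whose description is most similar to the query."""
        q = embed(query)
        description, _, fn = max(self._skills, key=lambda s: cosine(q, s[1]))
        return description, fn


library = SkillLibrary()
library.add("double a number", lambda n: n * 2)
library.add("add one to a number", lambda n: n + 1)

# Compose a new skill out of two retrieved ones; the old entries are left untouched.
_, double = library.retrieve("double a number")
_, add_one = library.retrieve("add one to a number")
library.add("double a number then add one", lambda n: add_one(double(n)))
```

Because the library only grows by appending, adding the composed skill cannot degrade the two it was built from; both remain retrievable and behave exactly as before.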

There's a second, quieter family in the corpus that's worth seeing: modules that don't add new knowledge at all, but unlock reasoning the frozen model already had latent. Cognitive Tools wrap reasoning operations as isolated, sandboxed LLM calls and lift GPT-4.1 on AIME 2024 competition math from 26.7% to 43.3% with zero training (see: Can modular cognitive tools unlock reasoning without training?). In the same spirit, a single steering vector extracted from 50 paired examples can shorten chain-of-thought by 67% without retraining (see: Can we steer reasoning toward brevity without retraining?), and a decode-time penalty on premature thought-switching improves accuracy with no fine-tuning (see: Do reasoning models switch between ideas too frequently?). The lesson: if the capability is already in there, an external module can elicit it without ever risking the weights.
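As a rough illustration of what a decode-time penalty on thought-switching might look like, here is a small sketch in plain Python. It is loosely in the spirit of such a penalty and not a reproduction of any published method: the function names, the `alpha` (penalty strength) and `beta` (penalty window) parameters, the switch-token set, and the example logits are all assumptions made for this example. The model's weights play no part; only the scores used to pick the next token are adjusted.

```python
from typing import Dict, Set


def penalize_switch_tokens(
    logits: Dict[str, float],
    switch_tokens: Set[str],
    steps_in_thought: int,
    alpha: float = 3.0,
    beta: int = 4,
) -> Dict[str, float]:
    """Subtract alpha from switch-token logits during the first beta steps of a thought."""
    if steps_in_thought >= beta:
        return dict(logits)  # the thought has had room to develop; no penalty
    return {
        tok: score - alpha if tok in switch_tokens else score
        for tok, score in logits.items()
    }


def greedy_pick(logits: Dict[str, float]) -> str:
    """Pick the highest-scoring token."""
    return max(logits, key=logits.get)


# Hypothetical switch-token set and next-token scores for one decoding step.
SWITCH_TOKENS = {"Alternatively"}
step_logits = {"Alternatively": 2.0, "so": 1.5, "therefore": 1.0}

early = greedy_pick(penalize_switch_tokens(step_logits, SWITCH_TOKENS, steps_in_thought=1))
late = greedy_pick(penalize_switch_tokens(step_logits, SWITCH_TOKENS, steps_in_thought=10))
```

With these made-up scores, the penalty steers the early step toward continuing the current thought (`early` is `"so"`), while the same scores late in a thought are left alone and the switch is allowed (`late` is `"Alternatively"`).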

The honest caveat the corpus supplies is that 'preserving reasoning' is a lower bar than 'reasoning well.' Frozen-backbone tricks keep what the model had, but what it had may not be much. Frontier reasoning models stall at 20–23.6% exact match on constraint-satisfaction problems that need genuine backtracking (see: Can reasoning models actually sustain long-chain reflection?), and reasoning accuracy quietly collapses as inputs lengthen, well before the context window is full (see: Does reasoning ability actually degrade with longer inputs?). So yes, auxiliary modules can preserve reasoning without catastrophic forgetting, but they preserve a ceiling as much as a capability, and the more interesting open question is whether an external module can ever push past what the frozen base could do on its own.

Sources 9 notes

Can continuous reasoning avoid forgetting in instruction-tuned models?

SoftCoT avoids catastrophic forgetting by keeping the main LLM frozen while delegating soft thought generation to a small auxiliary model. This architectural separation maintains pre-trained knowledge while enabling continuous reasoning.

Can splitting adaptation into two channels reduce forgetting?

Fast-Slow Training routes task-specific lessons into optimized prompts while keeping parameter updates minimal, reaching equivalent performance 1.4–3x faster with substantially less catastrophic forgetting and plasticity loss, demonstrating that forgetting is a misallocation problem rather than an inherent cost.

Can agents learn new skills without forgetting old ones?

VOYAGER demonstrates that storing executable skills in an embedding-indexed library and composing complex skills from simpler ones allows agents to learn continuously while avoiding the forgetting that occurs with weight-update-based methods. Environmental feedback refines skills while an automatic curriculum drives continual exploration.

Can context playbooks prevent knowledge loss during iteration?

The ACE framework treats contexts as evolving playbooks using generation-reflection-curation loops rather than full rewrites. This prevents knowledge loss from compression and detail erosion, achieving +10.6% on agentic tasks and +8.6% on finance without labeled supervision.

Can modular cognitive tools unlock reasoning without training?

Four cognitive tools implemented as sandboxed LLM calls improved GPT-4.1 on AIME2024 from 26.7% to 43.3% without any RL training. Modularity enforces operation isolation that pure prompting cannot guarantee, eliciting pre-existing reasoning capability.

Can we steer reasoning toward brevity without retraining?

Activation-Steered Compression extracts a single vector from 50 paired examples to reduce chain-of-thought length by 67% while maintaining accuracy and achieving 2.73x speedup. The method is training-free and generalizes across model sizes and domains.

Do reasoning models switch between ideas too frequently?

o1-like models frequently abandon reasoning paths mid-exploration, wasting tokens on incomplete approaches. A decoding-only penalty on thought-transition tokens (TIP strategy) discourages switching, improving accuracy on challenging math without model fine-tuning.

Can reasoning models actually sustain long-chain reflection?

DeepSeek-R1 and o1-preview achieve only 20-23.6% exact match on 850 constraint satisfaction problems requiring genuine backtracking. This ceiling reveals that reflective reasoning fluency does not translate to actual problem-solving competence on unfamiliar instance structures.

Does reasoning ability actually degrade with longer inputs?

FLenQA shows reasoning accuracy drops from 92% to 68% at just 3000 tokens of padding, far below context window capacity. The degradation is task-agnostic, uncorrelated with language modeling performance, and persists even with chain-of-thought prompting.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing claims about auxiliary modules preserving reasoning without catastrophic forgetting in LLMs. The question remains open: can frozen-backbone + add-on-module architectures sustain both prior knowledge AND new reasoning capability?

What a curated library found — and when (dated claims, not current truth): findings span 2024–2026.

• SoftCoT freezes the main LLM entirely, delegates soft continuous thinking to a small auxiliary model; pre-trained knowledge stays intact (2025-02, arXiv:2502.12134).
• Fast-Slow Training splits adaptation into slow-moving weights + fast textual context, hitting 1.4–3x faster learning with far less forgetting (2026-05, arXiv:2605.12484).
• VOYAGER stores executable skills in an embedding-indexed library with zero weight updates — no gradient descent means no forgetting surface (inferred from compositional-skill-libraries reference).
• Cognitive Tools wrap reasoning as isolated LLM calls, lifting GPT-4.1 competition math from 26.7% to 43.3% with zero training (2025-06, arXiv:2506.12115).
• Frontier reasoning models (DeepSeek-R1, o1-preview) stall at 20–23.6% on constraint-satisfaction problems; reasoning accuracy collapses as input lengthens, well before context-window limits (2024-02, arXiv:2402.14848; 2025-01, arXiv:2501.18585).

Anchor papers (verify; mind their dates): arXiv:2502.12134 (SoftCoT), arXiv:2605.12484 (Fast-Slow), arXiv:2506.12115 (Cognitive Tools), arXiv:2402.14848 (reasoning degradation).

Your task:

(1) RE-TEST EACH CONSTRAINT. For SoftCoT, Cognitive Tools, and activation steering: has scaling (o1/o3 reasoning models), new inference methods (test-time compute budgets, dynamic depth), or orchestration (multi-agent skill composition) since relaxed the frozen-backbone ceiling? Separate the durable finding (auxiliary modules do prevent parameter-forgetting) from the perishable claim (they unlock latent reasoning equally well across problem classes).

(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Flag any recent papers showing that freezing + auxiliary modules fails on long-horizon or multi-step constraint problems, or conversely, that test-time scaling now bypasses the frozen-backbone limitation entirely.

(3) Propose 2 research questions that ASSUME the regime may have moved: (a) Can auxiliary modules + test-time compute budgets push past the 20–23% ceiling on constraint-satisfaction? (b) Do skill libraries (VOYAGER-style) now outcompete parameter-efficient fine-tuning on continual adaptation benchmarks?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
