SYNTHESIS NOTE

Can models dynamically activate expert skills at inference time?

Can language models efficiently discover and compose task-specific capabilities on the fly without modifying base weights? This note explores whether test-time adaptation through expert vector composition outperforms fixed fine-tuning approaches.

Synthesis note · 2026-02-23 · sourced from Novel Architectures

Transformer² introduces Singular Value Fine-tuning (SVF): instead of updating full weight matrices or training low-rank adapters, SVF extracts and tunes only the singular values of a model's weight matrices. This produces compact expert vectors that are inherently composable: they can be dynamically mixed at inference without interference.
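
The idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes an expert vector z acts as a per-singular-value scale, so the adapted matrix is U diag(σ · z) Vᵀ. The function name svf_adapt and the values in z_expert are hypothetical.

```python
import numpy as np

def svf_adapt(W, z):
    """Scale each singular value of W by the matching entry of z.

    Assumed form (for illustration): W' = U @ diag(sigma * z) @ Vt.
    """
    U, sigma, Vt = np.linalg.svd(W, full_matrices=False)
    # Multiplying U's columns by (sigma * z) equals U @ diag(sigma * z).
    return (U * (sigma * z)) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))      # stand-in for one weight matrix
r = min(W.shape)                     # number of singular values

z_identity = np.ones(r)              # all-ones vector leaves W unchanged
z_expert = np.array([1.2, 1.0, 0.5]) # hypothetical tuned expert vector

W_same = svf_adapt(W, z_identity)
W_new = svf_adapt(W, z_expert)

# The expert is r numbers per matrix, versus W.size for full fine-tuning.
print(z_expert.size, "tunable values vs", W.size)
```

Under this assumption an expert stores only as many numbers as the matrix has singular values, which is what makes the vectors compact enough to mix cheaply at inference.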

The inference mechanism has two passes:

  1. First pass (dispatch): The model executes on the input and observes its own test-time behavior, gathering information about what skills the current problem requires.
  2. Second pass (adaptation): The framework combines available expert vectors based on the first-pass analysis, providing a targeted modification to the base weights specifically tailored to the task.
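
The two passes above can be sketched as follows. This is an illustrative toy under stated assumptions, not the framework's actual dispatch logic: the keyword-matching dispatch function stands in for whatever the model observes about its own behavior in the first pass, and the skill names, keywords, and expert vectors are all hypothetical.

```python
import numpy as np

def dispatch(prompt, skills):
    """First pass (stand-in): return mixing weights over the experts.

    Keyword counting is a placeholder for the model's own analysis.
    """
    text = prompt.lower()
    scores = np.array(
        [float(sum(kw in text for kw in kws)) for kws in skills.values()]
    )
    if scores.sum() == 0:
        # No signal: fall back to a uniform mix of all experts.
        return np.full(len(skills), 1.0 / len(skills))
    return scores / scores.sum()

def compose(alphas, expert_vectors):
    """Weighted sum of expert vectors (one row per expert)."""
    return alphas @ expert_vectors

def adapt(W, z):
    """Second pass: rescale the singular values of W by z (assumed form)."""
    U, sigma, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * (sigma * z)) @ Vt

# Hypothetical skills and expert vectors, rows aligned with the dict order.
skills = {
    "math": ("solve", "integral", "equation"),
    "code": ("python", "function", "bug"),
}
expert_vectors = np.array([[1.3, 1.0, 0.7],
                           [0.8, 1.1, 1.2]])

prompt = "Write a Python function to solve the equation"
alphas = dispatch(prompt, skills)          # first pass
z_mix = compose(alphas, expert_vectors)    # combine experts

W = np.random.default_rng(0).standard_normal((4, 3))
W_task = adapt(W, z_mix)                   # second pass uses W_task
```

The base matrix W is never overwritten: each request derives its own W_task, so a different prompt can receive a different mix of the same stored experts.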

Three adaptation strategies are offered. Performance improves monotonically as a strategy is given more access to test-time conditions, so a deployment can choose the strategy that matches how much it knows about the incoming task.

The key properties that make this work:

The neuroscience parallel is deliberate: the brain activates specific regions depending on the task and dynamically reconfigures its functional networks in response to changing demands. Transformer² operationalizes this for LLMs.

The deeper principle: the requisite capabilities for many downstream tasks already exist within pretrained models. The bottleneck is not knowledge but activation, that is, knowing when to deploy which capability. This aligns with the note "Does RL teach reasoning or just when to use it?" and extends it to the architecture level: self-adaptation is about routing to existing capabilities, not creating new ones.

Original note title

self-adaptive LLMs compose expert vectors at inference via two-pass singular value fine-tuning