SYNTHESIS NOTE
Agentic Systems and Tool Use Model Architecture and Internals

Can agents learn new skills without forgetting old ones?

Explores whether externalized skill libraries—storing learned behaviors as retrievable code rather than parameter updates—can solve the catastrophic forgetting problem that plagues continual learning systems.

Synthesis note · 2026-02-23 · sourced from Agents

VOYAGER introduces an architecture for lifelong learning that avoids catastrophic forgetting by storing learned behavior outside the model instead of managing it inside the model's parameters. Three components work together:

  1. Automatic curriculum: proposes tasks based on the agent's current skill level and world state (an agent that finds itself in a desert should learn to harvest sand before iron). Tasks are generated by GPT-4 under the overarching goal of "discovering as many diverse things as possible", an in-context form of novelty search.

  2. Ever-growing skill library — each successfully completed task produces an executable code program stored in the library, indexed by the embedding of its description. When similar situations arise, relevant skills are retrieved by semantic similarity. This externalizes learned behavior as retrievable artifacts rather than weight updates.

  3. Iterative prompting with environment feedback — incorporates execution errors, environment feedback, and self-verification for program improvement. The agent refines skills based on real-world outcomes.
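
The third component can be sketched as a retry loop. This is a minimal illustration, not VOYAGER's implementation: `refine_skill`, `fake_generate`, and `check` are hypothetical names, the code generator is a deterministic stub standing in for an LLM call, and only two of the three feedback signals (execution errors and self-verification) are modeled.

```python
from typing import Callable, Optional


def refine_skill(
    generate: Callable[[str, Optional[str]], str],
    verify: Callable[[dict], bool],
    task: str,
    max_rounds: int = 4,
) -> Optional[str]:
    """Return program source that runs and passes verification, else None."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        # Ask the generator for a candidate, passing back the last feedback.
        source = generate(task, feedback)
        namespace: dict = {}
        try:
            exec(source, namespace)  # run the candidate program
        except Exception as err:
            feedback = f"execution error: {err!r}"
            continue
        if verify(namespace):
            return source  # only verified programs are kept
        feedback = "ran without error but self-verification failed"
    return None


# Hypothetical stub standing in for the LLM: buggy first, fixed after feedback.
def fake_generate(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "result = undefined_name + 1"  # raises NameError when run
    return "result = 2 + 1"


def check(namespace: dict) -> bool:
    return namespace.get("result") == 3


program = refine_skill(fake_generate, check, "add two and one")
```

The first candidate fails at execution, the error text is fed back, and the second candidate is accepted. In a real agent the accepted source would then be committed to the skill library.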

The compounding mechanism is the key insight: complex skills are synthesized by composing simpler programs. Fighting zombies builds on combat primitives; navigating a cave builds on movement and resource-gathering skills. This composition enables rapid capability growth without the forgetting that plagues weight-update-based continual learning methods.
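
A small sketch of this composition, under assumptions: the `SkillLibrary` class and the skill names (`equip_sword`, `attack_nearest`, `fight_zombie`) are hypothetical, and skills are plain Python functions rather than the generated game programs VOYAGER stores. The point it illustrates is that a new skill is added by calling existing entries, which leaves those entries untouched.

```python
from typing import Callable, Dict, List


class SkillLibrary:
    """Hypothetical append-only registry of named, callable skills."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[..., None]] = {}

    def add(self, name: str, fn: Callable[..., None]) -> None:
        self._skills[name] = fn

    def run(self, name: str, log: List[str]) -> None:
        # Each skill receives the library so it can call other skills.
        self._skills[name](self, log)


# Primitive skills (illustrative stand-ins for combat primitives).
def equip_sword(lib: SkillLibrary, log: List[str]) -> None:
    log.append("equip sword")


def attack_nearest(lib: SkillLibrary, log: List[str]) -> None:
    log.append("attack nearest hostile mob")


# Composite skill built only from calls to existing skills.
def fight_zombie(lib: SkillLibrary, log: List[str]) -> None:
    lib.run("equip_sword", log)
    lib.run("attack_nearest", log)


library = SkillLibrary()
library.add("equip_sword", equip_sword)
library.add("attack_nearest", attack_nearest)
library.add("fight_zombie", fight_zombie)

trace: List[str] = []
library.run("fight_zombie", trace)

# The primitive still behaves exactly as before the composite was added.
primitive_trace: List[str] = []
library.run("equip_sword", primitive_trace)
```

Because adding `fight_zombie` writes a new entry instead of modifying shared parameters, earlier skills cannot be overwritten as a side effect of learning it.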

Three lifelong learning requirements are met: (1) propose suitable tasks based on current capability and context, (2) refine skills from environmental feedback and commit them to memory, (3) continually explore in a self-driven manner. These parallel the three requirements named in the note "When should proactive agents push toward their goals versus accommodate users?": goal awareness, context adaptation, and initiative.

The note "Can agents learn from failure without updating their weights?" describes the same principle, and VOYAGER's skill library is a more structured version of it: externalize learning as retrievable artifacts. Because retrieval is indexed by embedding, skills are found by semantic similarity rather than exact match, which enables transfer to novel but related situations.
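
Description-indexed retrieval can be illustrated as follows. This is a toy sketch under loud assumptions: `embed` is a bag-of-words stand-in for a real embedding model (so it captures word overlap only, not true semantic similarity), and `SkillIndex` with its methods is a hypothetical name, not an interface from VOYAGER.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SkillIndex:
    """Hypothetical store of (description embedding, description, code)."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Counter, str, str]] = []

    def add(self, description: str, code: str) -> None:
        self._entries.append((embed(description), description, code))

    def retrieve(self, query: str, k: int = 1) -> List[Tuple[str, str]]:
        # Rank stored skills by similarity of their description to the query.
        q = embed(query)
        ranked = sorted(self._entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [(desc, code) for _, desc, code in ranked[:k]]


index = SkillIndex()
index.add("mine iron ore with a stone pickaxe", "def mine_iron(): ...")
index.add("fight a zombie with a sword", "def fight_zombie(): ...")
index.add("harvest sand in the desert", "def harvest_sand(): ...")

# No stored skill mentions skeletons, yet the closest match is still found.
best_description, best_code = index.retrieve("fight a skeleton with a sword")[0]
```

The query matches no description exactly, but the zombie-fighting skill ranks first, which is the kind of transfer to a related situation that similarity-based lookup allows.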

The note "Can communication pressure drive agents to learn shared abstractions?" suggests the skill library pattern may generalize: agents under performance pressure naturally develop reusable, composable abstractions.


MUSE-Autoskill generalizes VOYAGER's compounding library into an explicit five-stage skill lifecycle (creation, memory, management, evaluation, refinement), turning skills from disposable generation outputs into "long-lived, experience-aware, testable assets." Two extensions matter for the catastrophic-forgetting claim. First, skills are validated through unit tests plus runtime feedback, so the library does not just grow but is continuously checked for reliability. This addresses a gap in VOYAGER, which stores any successfully executed program regardless of later robustness. Second, MUSE adds skill-level memory that accumulates per-skill experience across tasks, so reuse improves over time rather than staying static after first synthesis. On SkillsBench, generated skills reach 87.94% on their tasks and transfer to other agents with minimal accuracy loss, evidence that lifecycle management (not just synthesis) is what makes externalized skills durable infrastructure.
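
The two extensions can be sketched together. Everything here is an assumption made for illustration: the note does not describe MUSE-Autoskill's interfaces, so `ManagedSkill`, `ManagedLibrary`, and their methods are hypothetical, and a skill is reduced to a function from int to int.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

SkillFn = Callable[[int], int]


@dataclass
class ManagedSkill:
    name: str
    fn: SkillFn
    tests: List[Callable[[SkillFn], bool]] = field(default_factory=list)
    experience: List[str] = field(default_factory=list)  # per-skill memory

    def passes_tests(self) -> bool:
        return all(test(self.fn) for test in self.tests)

    def record(self, task: str, succeeded: bool) -> None:
        # Accumulate experience across tasks for later reuse decisions.
        outcome = "ok" if succeeded else "failed"
        self.experience.append(f"{task}: {outcome}")


class ManagedLibrary:
    """Hypothetical library that admits a skill only if its tests pass."""

    def __init__(self) -> None:
        self.skills: Dict[str, ManagedSkill] = {}

    def admit(self, skill: ManagedSkill) -> bool:
        if not skill.passes_tests():
            return False
        self.skills[skill.name] = skill
        return True


def doubles_three(fn: SkillFn) -> bool:
    return fn(3) == 6


library = ManagedLibrary()
draft = ManagedSkill("double", lambda x: x + 2, tests=[doubles_three])

admitted_draft = library.admit(draft)      # rejected: fails its unit test
draft.fn = lambda x: x * 2                 # refinement step
admitted_refined = library.admit(draft)    # accepted after refinement
draft.record("double the ore count", True)
```

The contrast with a store-on-first-success policy is that a program enters the library only after passing its tests, and each later use adds to a record attached to that specific skill.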

Original note title

compositional skill libraries that compound through synthesis enable lifelong learning without catastrophic forgetting