SYNTHESIS NOTE
Agentic Systems and Tool Use

How do agentic AI systems decompose into adaptation paradigms?

What are the core dimensions that distinguish different approaches to adapting agents and tools in agentic systems? Understanding this taxonomy could clarify which adaptation strategy fits which problem.

Synthesis note · 2026-02-23 · sourced from Agents

The adaptation landscape for agentic AI systems reduces to a compact taxonomy. Two binary dimensions, what gets optimized (agent or tool) and what provides the signal (tool execution or agent output), generate four paradigms that cover the principal modes of adaptation:

A1: Tool Execution Signaled Agent Adaptation — The agent is optimized using feedback from external tool execution. When the agent generates a retrieval query and the retriever returns documents, metrics like recall or nDCG computed from retrieval results directly reward the agent. Example: DeepRetrieval optimizes the agent's query generation using retrieval quality scores.

A2: Agent Output Signaled Agent Adaptation — The agent is optimized using evaluation of its final output after incorporating tool results. The full pipeline runs (retrieve → integrate → answer), and the answer's correctness drives the reward signal. Example: Search-R1 rewards based on exact match of the final answer, not the retrieval quality.

T1: Agent-Agnostic Tool Adaptation — Tools are trained independently of any specific agent. Retrievers, domain-specific models, and pretrained components function as plug-and-play modules. The agent remains frozen; the tool improves on its own.

T2: Agent-Supervised Tool Adaptation — Tools are adapted using signals derived from the frozen agent's outputs. Reward-driven retriever tuning, adaptive rerankers, and memory-update modules all fall here — the agent defines what "good" means for the tool.

The taxonomy is practically useful because it maps directly to implementation decisions. A1 vs A2 determines where the loss function sits: at the tool boundary or at the output boundary. T1 vs T2 determines whether tool improvement requires an agent in the loop. The related note "How do knowledge injection methods trade off flexibility and cost?" provides a parallel taxonomy for knowledge injection, so the two frameworks are complementary: one classifies what gets injected, the other classifies how the system adapts.
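The mapping from paradigm to implementation decision can be written down as a small lookup table. The sketch below is illustrative only: the `Paradigm` dataclass, its field names, and the `optimizing` helper are hypothetical names introduced here, and the field values paraphrase the descriptions above rather than quoting any library or paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Paradigm:
    code: str
    optimized: str                # "agent" or "tool"
    signal_source: str            # where the training signal comes from
    loss_boundary: Optional[str]  # the note states this only for A1 and A2
    agent_in_loop: bool           # is an agent needed while training?


# Field values are a paraphrase of the note, not an authoritative schema.
PARADIGMS = {
    p.code: p
    for p in [
        Paradigm("A1", "agent", "tool execution", "tool boundary", True),
        Paradigm("A2", "agent", "agent output", "output boundary", True),
        # T1: the tool is trained without reference to any specific agent.
        Paradigm("T1", "tool", "agent-independent training", None, False),
        # T2: the frozen agent's outputs define what "good" means for the tool.
        Paradigm("T2", "tool", "frozen agent's outputs", None, True),
    ]
}


def optimizing(target: str) -> list:
    """Return the paradigm codes whose optimization target is `target`."""
    return [p.code for p in PARADIGMS.values() if p.optimized == target]


if __name__ == "__main__":
    for code in optimizing("agent"):
        print(code, "-> loss sits at the", PARADIGMS[code].loss_boundary)
    for code in optimizing("tool"):
        print(code, "-> agent in the loop:", PARADIGMS[code].agent_in_loop)
```

Reading the table row by row reproduces the two decisions named above: the A rows differ only in where the loss sits, and the T rows differ only in whether an agent is required during training.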

The RAG setting illustrates the A1/A2 distinction clearly: A1 optimizes the agent to write better queries (retrieval quality as reward), while A2 optimizes the agent to produce better final answers (answer correctness as reward). These are different objectives and can pull in different directions: when the agent has limited ability to integrate retrieved context, the query that retrieves the best documents is not necessarily the query that produces the best final answer.
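A toy comparison makes the divergence concrete. The sketch below assumes recall@k as the A1 signal and normalized exact match as the A2 signal; the function names, document IDs, and example strings are all made up for illustration and are not taken from DeepRetrieval or Search-R1.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant documents found in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


def _normalize(text):
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(text.lower().split())


def a1_reward(retrieved_ids, relevant_ids, k=3):
    """A1-style signal: computed at the tool boundary from retrieval results."""
    return recall_at_k(retrieved_ids, relevant_ids, k)


def a2_reward(final_answer, gold_answer):
    """A2-style signal: computed at the output boundary from the final answer."""
    return 1.0 if _normalize(final_answer) == _normalize(gold_answer) else 0.0


# Hypothetical episode: retrieval is perfect, but the agent fails to
# integrate the retrieved documents and answers incorrectly.
retrieved = ["d1", "d2", "d7"]
relevant = ["d1", "d2"]
episode_a1 = a1_reward(retrieved, relevant)
episode_a2 = a2_reward("Lyon", "Paris")

if __name__ == "__main__":
    print("A1 reward (recall@3):", episode_a1)
    print("A2 reward (exact match):", episode_a2)
```

In this made-up episode the same trajectory earns the full A1 reward and zero A2 reward, which is the case where the two objectives pull apart.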

Original note title

agentic AI adaptation decomposes into four paradigms along two dimensions — agent versus tool optimization target and execution-signaled versus output-signaled feedback