SYNTHESIS NOTE
Reasoning, Retrieval, and Evaluation · Training, RL, and Test-Time Scaling

Do search steps follow the same scaling rules as reasoning tokens?

This note asks whether the overthinking curve observed in reasoning models also appears in deep research agents. It matters because a shared curve would suggest that similar scaling behavior governs more than one kind of inference-time compute.

Synthesis note · 2026-02-21 · sourced from Deep Research

Writing angle — Medium/LinkedIn post.

Hook: The overthinking papers showed that more reasoning tokens help, until they don't. Now the same curve is showing up in a completely different place: search. Deep research agents improve with more search budget and then degrade, the same rise-then-fall shape. Scaling laws aren't just for training anymore. They're for every inference loop.

The claim: Test-time scaling generalizes from single-query reasoning to multi-step retrieval. The "search budget law" (Agentic Deep Research paper) shows that answer quality scales with search steps in a way that mirrors the relationship between reasoning quality and thinking tokens.

Why it matters:

  1. It means inference-compute optimization now has two levers: reasoning budget and search budget. The old question was "how many tokens should we think?" The new question is "how many retrieval rounds should we run, and how much reasoning per round?"
  2. It raises the same ceiling question: if reasoning has an overthinking threshold, does search? ASearcher's turn-limit finding suggests yes: unrestricted per-turn reasoning in iterative search loops degrades multi-turn search quality, which means a search version of overthinking exists too.
  3. It reframes deep research (DR) quality as an infrastructure decision as much as a model decision. A weaker model with a larger search budget can match a stronger model with a smaller search budget.
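
To make the two levers concrete, here is a minimal sketch of a search loop with both budgets exposed as parameters. Everything in it is hypothetical and for illustration only: `Budget`, `Trace`, `run_agent`, and the `toy_think` / `toy_search` stubs are assumed names, not the interface of ASearcher or of any agent in the Agentic Deep Research paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Budget:
    max_search_steps: int      # lever 1: how many retrieval rounds to run
    max_reasoning_tokens: int  # lever 2: reasoning cap per round


@dataclass
class Trace:
    queries: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)
    reasoning_tokens_used: int = 0


# think(question, evidence, token_cap) -> (next query or None, tokens spent)
ThinkFn = Callable[[str, List[str], int], Tuple[Optional[str], int]]
SearchFn = Callable[[str], List[str]]


def run_agent(question: str, budget: Budget,
              think: ThinkFn, search: SearchFn) -> Trace:
    """Alternate reasoning and retrieval until either budget says stop."""
    trace = Trace()
    for _ in range(budget.max_search_steps):
        query, tokens = think(question, trace.evidence,
                              budget.max_reasoning_tokens)
        # Enforce the per-round cap even if the model stub over-reports.
        trace.reasoning_tokens_used += min(tokens, budget.max_reasoning_tokens)
        if query is None:  # the model decided it has enough evidence
            break
        trace.queries.append(query)
        trace.evidence.extend(search(query))
    return trace


# Hypothetical stand-ins for a model call and a retrieval call.
def toy_think(question: str, evidence: List[str], token_cap: int):
    if len(evidence) >= 3:
        return None, 10
    return f"{question} (round {len(evidence) + 1})", token_cap


def toy_search(query: str) -> List[str]:
    return [f"doc for: {query}"]


small = run_agent("q", Budget(2, 256), toy_think, toy_search)
large = run_agent("q", Budget(5, 256), toy_think, toy_search)
```

With the toy stubs, the smaller budget cuts the run off after two rounds, while the larger one lets the agent stop on its own after three. Keeping the two caps as separate fields is the point of the sketch: each can be tuned or swept independently.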

The synthesis: Two questions, "Does search budget scale like reasoning tokens for answer quality?" and "Does limiting reasoning per turn improve multi-turn search quality?", together make the full argument: search has its own test-time scaling (TTS) curve, the curve has a similar shape, and it has its own overthinking variant.
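
One way to check for a search-side overthinking threshold is to sweep the search budget and look for the point where quality stops rising. The sketch below assumes you already have one quality score per budget. The helper name `find_peak` and the numbers in `example_scores` are made up for illustration and are not results from any paper.

```python
from typing import Dict, Optional, Tuple


def find_peak(scores: Dict[int, float],
              tolerance: float = 0.0) -> Tuple[int, Optional[int]]:
    """Return (best_budget, first_degraded_budget).

    best_budget is the smallest budget with the highest score.
    first_degraded_budget is the first larger budget whose score falls
    more than `tolerance` below the peak, or None if quality never drops.
    """
    budgets = sorted(scores)
    # max() keeps the first maximum, so ties resolve to the cheaper budget.
    best = max(budgets, key=lambda b: scores[b])
    degraded = next(
        (b for b in budgets
         if b > best and scores[b] < scores[best] - tolerance),
        None,
    )
    return best, degraded


# Made-up scores (search steps -> answer quality), illustration only.
example_scores = {1: 0.40, 2: 0.52, 4: 0.61, 8: 0.66, 16: 0.64, 32: 0.58}

peak, drop = find_peak(example_scores)
peak_tol, drop_tol = find_peak(example_scores, tolerance=0.03)
```

The `tolerance` parameter is there because a small dip after the peak may be evaluation noise rather than real degradation. With a tolerance of 0.03, the toy sweep only flags the 32-step budget as degraded.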

Original note title

the search budget law — why deep research agents follow the same scaling rules as reasoning models