SYNTHESIS NOTE
Agentic Systems and Tool Use · Reasoning, Retrieval, and Evaluation · Training, RL, and Test-Time Scaling

Can we automatically generate formal verifiers from policy text?

Verifier scarcity blocks process verification in most domains. Can language models synthesize correct-by-construction formal checkers directly from natural-language policies, bridging informal rules and rigorous proof?

Synthesis note · 2026-05-28 · sourced from Test Time Compute

The chronic obstacle to process verification beyond math and code is verifier scarcity: someone has to write the checker, and most domains lack one. interwhen's second contribution attacks this directly by synthesizing verifiers automatically from natural-language policy documents. Given a policy stated in prose, the system generates code-based verifiers, and in the strong case it produces provably correct verifiers expressed in Lean (a proof assistant) or Z3 (an SMT solver). Once a verifier exists, a language model extracts the verifier's input variables from a partial reasoning trace at runtime. Because extraction is separate from checking, any formal or code-based verifier can be plugged in.
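As a rough illustration of what a synthesized code-based verifier might look like, the sketch below hand-writes one for an invented policy: "Refunds are allowed only within 30 days of purchase, and refunds above 500 require manager approval." The policy text, the `RefundState` fields, and the `verify_refund` name are all assumptions made for this example; none of them come from interwhen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundState:
    """Input variables the verifier needs (hypothetical, for illustration)."""
    days_since_purchase: int
    amount: float
    manager_approved: bool


def verify_refund(state: RefundState) -> list[str]:
    """Return the list of policy violations; an empty list means compliant."""
    violations = []
    # Clause 1 of the invented policy: refunds only within 30 days.
    if state.days_since_purchase > 30:
        violations.append("refund requested more than 30 days after purchase")
    # Clause 2: refunds above 500 require manager approval.
    if state.amount > 500 and not state.manager_approved:
        violations.append("refund above 500 lacks manager approval")
    return violations
```

Returning the violated clauses rather than a bare boolean is a design choice of this sketch: it lets the caller report which part of the policy a reasoning step broke.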

The significance is the inversion of the usual neuro-symbolic division of labor. Typically the LLM does fuzzy reasoning while a hand-built formal system checks it. Here the LLM is also the bridge that builds and feeds the formal system: it translates prose policy into formal verifier code, and it extracts the formal verifier's inputs from informal trace text. The formal layer (Lean or Z3) supplies the provable-correctness guarantee; the LLM supplies the translation between informal policy or trace and formal specification.
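The division of labor described above can be sketched as a small pipeline in which extraction and verification are separate, swappable functions. This is a minimal sketch under stated assumptions: `regex_extractor` is a regex stand-in for the LLM extraction step, and `within_policy`, the variable names, and the three-way result are hypothetical choices for this example rather than interwhen's interface.

```python
import re
from typing import Callable, Optional

Variables = dict[str, int]
Extractor = Callable[[str], Optional[Variables]]
Verifier = Callable[[Variables], bool]


def regex_extractor(trace: str) -> Optional[Variables]:
    """Stand-in for the LLM step that pulls verifier inputs out of trace text.

    Returns None when the partial trace does not yet fix every variable.
    """
    days = re.search(r"(\d+) days", trace)
    amount = re.search(r"\$(\d+)", trace)
    if days is None or amount is None:
        return None
    return {"days": int(days.group(1)), "amount": int(amount.group(1))}


def within_policy(v: Variables) -> bool:
    """Hypothetical plug-in verifier: at most 30 days and at most 500."""
    return v["days"] <= 30 and v["amount"] <= 500


def check_partial_trace(trace: str, extract: Extractor, verify: Verifier) -> str:
    """Run any verifier on whatever the extractor recovers from the trace."""
    variables = extract(trace)
    if variables is None:
        return "undetermined"  # not enough of the trace exists to check yet
    return "ok" if verify(variables) else "violation"
```

Because `check_partial_trace` only depends on the two callables, a formal checker could replace `within_policy` without touching the extraction side.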

This extends process verification from domains with native checkers (theorem proving, code execution) to any domain whose rules can be written down in prose. It connects to the vault's broader thread on offloading reliability to deterministic systems: following the note "Can symbolic solvers fix how LLMs reason about logic?", the established pattern is "LLM formulates, solver executes"; interwhen extends it to "LLM formulates the verifier itself from policy, then solver executes the verifier."

Counterpoint and risk: the provable-correctness guarantee covers the verifier's logic, not the LLM's translation of prose into that logic. A mis-synthesized verifier is confidently wrong, so the weakest link migrates from the checker to the prose-to-formal step.

Why it matters: it makes formal policy compliance achievable for ordinary agentic tasks without a human writing every verifier.
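To make the translation risk concrete, the sketch below contrasts two readings of an invented clause, "refunds are allowed within 30 days." Both functions are internally consistent checkers, yet they disagree on the boundary case. The clause, the function names, and the idea of comparing against a reference reading are assumptions made for this example, not something the source describes.

```python
from typing import Callable, Iterable


def intended(days: int) -> bool:
    """Reading the policy author meant: day 30 is still allowed."""
    return days <= 30


def mis_synthesized(days: int) -> bool:
    """Plausible mistranslation: a strict inequality excludes day 30."""
    return days < 30


def disagreements(
    a: Callable[[int], bool], b: Callable[[int], bool], inputs: Iterable[int]
) -> list[int]:
    """Inputs on which two candidate verifiers give different verdicts."""
    return [x for x in inputs if a(x) != b(x)]


print(disagreements(intended, mis_synthesized, range(0, 61)))  # prints [30]
```

Proving properties of `mis_synthesized` would not reveal the problem, since the error lies in which formula was chosen, not in how that formula behaves.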

Original note title

formal verifiers can be auto-synthesized from natural-language policy documents into lean or z3