Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models

Paper · arXiv 2503.03669 · Published March 5, 2025

We present Attentive Reasoning Queries (ARQs), a novel structured reasoning approach that significantly improves instruction-following in Large Language Models through domain-specialized reasoning blueprints. While LLMs demonstrate remarkable capabilities across diverse tasks, they often fail to maintain adherence to complex, use-case-specific instructions during multi-turn conversations, presenting challenges for business-critical applications. ARQs address this limitation by guiding LLMs through systematic reasoning steps with targeted queries that reinstate critical instructions and facilitate intermediate reasoning throughout the completion process. We tested ARQs extensively within Parlant, our framework for reliable customer-facing agents, where ARQs were born out of necessity. There, ARQs achieved a 90.2% success rate across 87 test scenarios, outperforming both Chain-of-Thought reasoning (86.1%) and direct response generation (81.5%). ARQs showed particular strength in addressing persistent failure modes such as guideline re-application and hallucination prevention. Our analysis also revealed that carefully designed ARQs can potentially be more computationally efficient than free-form reasoning.

Introduction. Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse tasks, from knowledge retrieval to creative content generation [1, 2]. However, ensuring that these models perform systematic, reliable reasoning, particularly in multi-turn conversational settings, remains challenging [3]. LLMs often hallucinate, lose track of instructions, and fail to maintain consistent reasoning patterns across complex tasks. These challenges are especially pronounced in high-stakes customer-facing applications, such as a bank’s customer service, where a dynamic, time-aware understanding of the context and adherence to specific behavioral guidelines in relation to it are critical. Traditional approaches to enhancing LLM reasoning, such as free-form chain-of-thought prompting [4] or step-by-step instruction, have shown promise but offer limited control over how models process information. While these methods encourage models to “think aloud,” they provide minimal structure to guide the reasoning process through domain-specific considerations or known failure modes.

Discussion / Conclusion. In this work, we introduced Attentive Reasoning Queries (ARQs), a structured approach to guide the reasoning processes of Large Language Models. ARQs utilize targeted, domain-specific questions organized within a predefined JSON schema to direct model attention to critical instructions and decision points. We implemented and evaluated ARQs within the Parlant framework, testing their effectiveness in conversational agent applications that require strict adherence to behavioral guidelines. Our evaluation compared ARQs against Chain-of-Thought (CoT) reasoning, demonstrating that ARQs improve performance across the system’s core modules. While both Chain-of-Thought and ARQs aim to enhance LLM reasoning capabilities, they differ fundamentally in their structure and implementation. CoT prompting encourages models to generate intermediate reasoning steps in a free-form manner before producing a final answer.
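To make the idea of questions organized within a predefined JSON schema more concrete, the following is a minimal sketch in Python. It is not Parlant's actual implementation or API: the field names, the question wording, and the helper functions `build_arq_prompt` and `parse_arq_completion` are all hypothetical, chosen only to illustrate how an ordered set of queries might restate instructions before the final response is produced.

```python
import json

# Hypothetical ARQ blueprint: each key is a reasoning query the model must
# answer, in order, before it writes the final response. The keys and the
# questions are illustrative assumptions, not taken from Parlant.
ARQ_BLUEPRINT = [
    ("active_guidelines",
     "Which behavioral guidelines apply to the latest customer message?"),
    ("already_applied",
     "Which of those guidelines were already applied earlier in the conversation?"),
    ("facts_available",
     "Which facts needed for the answer are present in the provided context?"),
    ("response",
     "The final reply to the customer, using only the facts listed above."),
]


def build_arq_prompt(conversation: str, guidelines: list[str]) -> str:
    """Render a prompt asking for a JSON object with one field per query."""
    schema = {key: f"<{question}>" for key, question in ARQ_BLUEPRINT}
    guideline_text = "\n".join(f"- {g}" for g in guidelines)
    return (
        f"Guidelines:\n{guideline_text}\n\n"
        f"Conversation:\n{conversation}\n\n"
        "Fill in every field of this JSON object, in order:\n"
        f"{json.dumps(schema, indent=2)}"
    )


def parse_arq_completion(completion: str) -> dict:
    """Parse a completion and check that it answers every query in order."""
    data = json.loads(completion)
    expected = [key for key, _ in ARQ_BLUEPRINT]
    if list(data.keys()) != expected:
        raise ValueError(f"expected fields {expected}, got {list(data.keys())}")
    return data
```

In this sketch the final `response` field comes last, so the model has to restate the applicable guidelines and the available facts before it answers. A free-form CoT prompt, by contrast, would leave the content and order of the intermediate steps up to the model.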

Synthesis notes from this paper's topics