Cumulative Reasoning with Large Language Models

Paper · arXiv 2308.04371 · Published August 8, 2023
Chain-of-Thought and Reasoning Methods · Reasoning Architectures

Despite the recent advancements in language models (LMs), their ability to solve complex problems remains limited. This paper introduces Cumulative Reasoning (CR), a novel approach that utilizes LMs cumulatively and iteratively, mirroring human thought processes for problem-solving. CR decomposes tasks into smaller, manageable components and leverages previous propositions for effective composition, significantly enhancing problem-solving capabilities. We demonstrate CR’s superiority through several complex reasoning tasks: it outperforms existing methods in logical inference tasks with up to a 9.3% improvement, achieving 98.04% accuracy on the curated FOLIO wiki dataset. In the Game of 24, it achieves 98% accuracy, marking a 24% improvement over the prior state of the art. Additionally, CR sets a new state of the art on the MATH dataset, achieving a 4.2% increase over previous methods and a 43% relative improvement on the most challenging problems. By extending CR to incorporate a code environment without external aids like retrieval or web browsing, we further harness the computational and logical reasoning capabilities of LMs, achieving a remarkable 72.2% accuracy on the MATH dataset and outperforming the PAL/PoT method by 38.8%.

Introduction. Despite the remarkable advances made by large language models (LLMs) in a variety of applications [3, 9, 40–42, 44], they still struggle to provide stable and accurate answers when faced with highly complex tasks. For instance, it has been observed that language models have difficulty directly generating correct answers for high school math problems [29]. Drawing from Kahneman’s dual-process theory [24], which distinguishes between fast, intuitive thought (System 1) and slower, more deliberate thought (System 2), current LLMs are predominantly aligned with System 1, which restricts their ability to engage in the systematic and logical reasoning required for complex problem-solving tasks. Recent efforts to bridge this gap include Chain-of-Thought (CoT) prompting [58] and Tree-of-Thought (ToT) methodologies [32, 63], which guide LLMs through a more structured reasoning process. However, these [...] In this work, we introduce Cumulative Reasoning (CR), a novel framework that characterizes a more holistic representation of the thinking process.

Discussion / Conclusion. In this work, we introduce Cumulative Reasoning (CR), a novel approach leveraging LLMs in a structured, iterative process that mirrors human cognitive strategies. By orchestrating the roles of proposer, verifier(s), and reporter, CR not only decomposes complex problems into manageable tasks but also effectively recomposes the validated steps into comprehensive solutions. This methodology has demonstrated superior performance across various domains, including logical inference, the Game of 24, and MATH problems, showcasing the versatility and potential of CR in advancing the capabilities of LLMs in complex problem-solving scenarios.
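The interplay of proposer, verifier, and reporter described above can be illustrated with a small toy. The following is a minimal sketch, not the authors' implementation: in the paper each role is played by a language model, whereas here the three roles are replaced by hypothetical deterministic stand-ins (`propose`, `verify`, `report`) operating on simple if-then rules. The rule format, the function names, and the weather example are all assumptions made for illustration. The point it shows is the cumulative part: each verified proposition is added to the shared context and can be used by later proposals.

```python
from typing import FrozenSet, List, Set, Tuple

# Hypothetical rule format: "if every fact in `body` holds, then `head` holds".
Rule = Tuple[FrozenSet[str], str]

# Toy knowledge base (an assumption for illustration, not from the paper).
RULES: List[Rule] = [
    (frozenset({"rain"}), "wet_ground"),
    (frozenset({"wet_ground", "cold"}), "ice"),
    (frozenset({"ice"}), "slippery"),
]


def propose(facts: Set[str], rules: List[Rule]) -> List[str]:
    """Proposer stand-in: suggest new propositions from the current context.

    Deliberately loose: it proposes the head of any rule that shares at least
    one fact with the context, so some proposals are premature and must be
    filtered out by the verifier.
    """
    return [head for body, head in rules if head not in facts and body & facts]


def verify(candidate: str, facts: Set[str], rules: List[Rule]) -> bool:
    """Verifier stand-in: accept a proposition only if some rule fully derives it."""
    return any(head == candidate and body <= facts for body, head in rules)


def report(facts: Set[str], goal: str) -> bool:
    """Reporter stand-in: decide whether the accumulated context settles the goal."""
    return goal in facts


def cumulative_reasoning(
    premises: Set[str], rules: List[Rule], goal: str, max_rounds: int = 10
) -> Tuple[bool, List[str]]:
    """Run the propose/verify/accumulate loop until the reporter can answer."""
    facts = set(premises)      # shared context that grows over time
    trace: List[str] = []      # verified propositions, in order of acceptance
    for _ in range(max_rounds):
        if report(facts, goal):
            return True, trace
        accepted = [c for c in propose(facts, rules) if verify(c, facts, rules)]
        if not accepted:       # nothing new can be verified: stop early
            break
        for c in accepted:
            if c not in facts:
                facts.add(c)   # cumulative step: later rounds may build on c
                trace.append(c)
    return report(facts, goal), trace


if __name__ == "__main__":
    print(cumulative_reasoning({"rain", "cold"}, RULES, "slippery"))
    print(cumulative_reasoning({"rain"}, RULES, "slippery"))
```

In the first call, `ice` is proposed in the first round but rejected, because `wet_ground` has not yet been established; it is accepted in the next round once `wet_ground` is part of the context. In the second call the loop stops without reaching the goal, since `cold` is never available.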