PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers

Paper · arXiv 2406.12430 · Published June 18, 2024

In this paper, we study the use of LLMs for decision making that requires complex data analysis. We define Decision QA as the task of answering the best decision, d_best, for a decision-making question Q, business rules R and a database D. Since there is no benchmark that can examine Decision QA, we propose a Decision QA benchmark, DQA. It has two scenarios, Locating and Building, constructed from two video games (Europa Universalis IV and Victoria 3) that have almost the same goal as Decision QA. To address Decision QA effectively, we also propose a new RAG technique called iterative plan-then-retrieval augmented generation (PlanRAG). Our PlanRAG-based LM generates the plan for decision making as the first step, and the retriever generates the queries for data analysis as the second step. The proposed method outperforms the state-of-the-art iterative RAG method by 15.8% in the Locating scenario and by 7.4% in the Building scenario, respectively. We release our code and benchmark at https://github.com/myeon9h/PlanRAG.
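To make the plan-then-retrieve control flow concrete, the following is a minimal sketch, not the released implementation. It assumes the LM and the database are hidden behind four hypothetical callables (`make_plan`, `run_query`, `needs_replan`, `decide`), and it replaces them with toy stand-ins over a two-entry dictionary so that the loop can be run as is. The re-planning check after each retrieval and the `max_replans` cap are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Evidence = List[Tuple[str, str]]  # (query, result) pairs gathered so far


def plan_rag(
    question: str,
    rules: str,
    make_plan: Callable[[str, str, Evidence], List[str]],
    run_query: Callable[[str], str],
    needs_replan: Callable[[List[str], Evidence], bool],
    decide: Callable[[str, str, Evidence], str],
    max_replans: int = 2,  # assumed cap so the loop always terminates
) -> str:
    """Plan first, retrieve per plan step, re-plan when the plan falls short."""
    evidence: Evidence = []
    plan = make_plan(question, rules, evidence)  # step 1: plan before retrieving
    replans, i = 0, 0
    while i < len(plan):
        query = plan[i]
        evidence.append((query, run_query(query)))  # step 2: retrieve
        i += 1
        if replans < max_replans and needs_replan(plan, evidence):
            # Build a new plan from what has been learned and restart it.
            plan = make_plan(question, rules, evidence)
            replans, i = replans + 1, 0
    return decide(question, rules, evidence)


# --- Toy stand-ins (hypothetical; a real system would call an LM and a DB) ---
TOY_DB = {"A": "12", "B": "30"}  # city -> profit; keys stand in for real queries


def toy_make_plan(question: str, rules: str, evidence: Evidence) -> List[str]:
    # Plan one lookup at a time, skipping cities that were already queried.
    seen = {query for query, _ in evidence}
    return [city for city in TOY_DB if city not in seen][:1]


def toy_needs_replan(plan: List[str], evidence: Evidence) -> bool:
    # The plan is insufficient until every city has been inspected.
    return len(evidence) < len(TOY_DB)


def toy_decide(question: str, rules: str, evidence: Evidence) -> str:
    # Pick the city with the highest retrieved profit.
    return max(evidence, key=lambda pair: int(pair[1]))[0]


answer = plan_rag(
    "Which city should we trade in?",
    "Prefer the city with the highest profit.",
    toy_make_plan,
    TOY_DB.__getitem__,
    toy_needs_replan,
    toy_decide,
)
print(answer)  # B
```

In this toy run the first plan only inspects city A, the re-planning check notices that the evidence is incomplete, and the second plan retrieves city B before the decision is made.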

Introduction. In many business situations, decision making plays a crucial role in the success of organizations (Kasie et al., 2017; Gupta et al., 2002). Here, decision making involves analyzing data, ultimately leading to the selection of the most suitable alternative to achieve a specific goal (Provost and Fawcett, 2013; Diván, 2017). For example, suppose that one of the goals of the pharmaceutical company “Pfizer” is to minimize the production cost while maintaining on-time delivery from plants to customers in the pharmaceutical distribution network (Gupta et al., 2002), and that the production cost is proportional to the operation time and the number of employees of a plant. Then, Pfizer may face the following decision-making problems: (P1) which plants it should operate or stop, and (P2) how many employees it should hire for each plant.

Discussion / Conclusion. In this paper, we explored the capability of LLMs as a solution for decision making. We proposed a new decision-making task, Decision QA, which answers the best decision for a given complex decision-making question that requires considering both the business rules and the business situation represented in a large database (in either RDB or GDB). We built a benchmark for Decision QA, called DQA, by extracting 301 sets of a database (in both RDB and GDB), a question, and an answer (ground truth) from two popular video games imitating real business situations that require decision making. We also proposed a new RAG technique called PlanRAG, which performs planning before retrieving and re-planning if the initial plan is not good enough. Through extensive experiments, we demonstrated that PlanRAG significantly outperforms the SOTA iterative RAG for the Decision QA task. However, our study still has several limitations. First, in this study, we focused on Decision QA using a graph database or a relational database. Decision making based on other databases, such as a hybrid form of database and vector database, could be explored in future research.

Synthesis notes from this paper's topics