Knowledge Graph Prompting for Multi-Document Question Answering

Paper · arXiv 2308.11730 · Published August 22, 2023
Knowledge Graphs · Question Answering and Search

The 'pre-train, prompt, predict' paradigm of large language models (LLMs) has achieved remarkable success in open-domain question answering (OD-QA). However, few works explore this paradigm in the scenario of multi-document question answering (MD-QA), a task demanding a thorough understanding of the logical associations among the contents and structures of different documents. To fill this crucial gap, we propose a Knowledge Graph Prompting (KGP) method to formulate the right context in prompting LLMs for MD-QA, which consists of a graph construction module and a graph traversal module. For graph construction, we create a knowledge graph (KG) over multiple documents with nodes symbolizing passages or document structures (e.g., pages/tables), and edges denoting the semantic/lexical similarity between passages or intra-document structural relations. For graph traversal, we design an LM-guided graph traverser that navigates across nodes and gathers supporting passages assisting LLMs in MD-QA. The constructed graph serves as the global ruler that regulates the transitional space among passages and reduces retrieval latency.
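The graph construction step can be illustrated with a minimal sketch. It assumes Jaccard token overlap as the lexical similarity and treats consecutive passages of the same document as a simplified structural relation. The function names, the threshold value, and the similarity measure are illustrative assumptions, not the paper's implementation.

```python
import re
from itertools import combinations


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real keyword extractor."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Lexical overlap between two token sets, in [0, 1]."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_passage_graph(docs, threshold=0.2):
    """Build an undirected passage graph.

    docs maps a document id to its list of passages.
    Returns (nodes, adjacency): nodes maps (doc_id, index) to passage text,
    adjacency maps each node id to the set of its neighbor ids.
    """
    nodes = {
        (doc_id, i): passage
        for doc_id, passages in docs.items()
        for i, passage in enumerate(passages)
    }
    tokens = {nid: tokenize(text) for nid, text in nodes.items()}
    adjacency = {nid: set() for nid in nodes}
    for u, v in combinations(nodes, 2):
        # Lexical edge: token overlap above the (assumed) threshold.
        similar = jaccard(tokens[u], tokens[v]) >= threshold
        # Structural edge (assumption): neighboring passages in one document.
        adjacent = u[0] == v[0] and abs(u[1] - v[1]) == 1
        if similar or adjacent:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return nodes, adjacency


# Hypothetical two-document corpus used only for illustration.
docs = {
    "a": ["Paris is the capital of France.", "The Eiffel Tower is in Paris."],
    "b": ["France has its capital in Paris, the city of lights.",
          "Bananas are yellow."],
}
nodes, adjacency = build_passage_graph(docs)
```

In this toy corpus the first passages of the two documents share enough tokens to be linked across documents, while the unrelated passage is reachable only through its structural edge. A semantic variant would replace `jaccard` with a similarity over passage embeddings.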

Introduction. Due to the emergence of large language models (LLMs), the "pre-train, prompt, predict" paradigm has revolutionized natural language processing (NLP) in real-world applications, such as open-domain question answering (OD-QA), fact-checking (FC), and arithmetic reasoning (AR) (Chen et al. 2017; Asai et al. 2019; Karpukhin et al. 2020; Thorne et al. 2018; Aly et al. 2021; Qin et al. 2023). However, no significant efforts have investigated this framework in the scenario of multi-document question answering (MD-QA), which has practical uses in academic research, customer support, and financial/legal inquiries that require analysis and insights derived from multiple documents (Tessuto 2011; Bolino, Long, and Turnley 2016). To investigate the capability of LLMs for MD-QA, we randomly sample multi-document questions from the

Discussion / Conclusion. Answering multi-document questions demands reasoning over and retrieving knowledge from different documents across various modalities, presenting challenges for applying the 'pre-train, prompt, predict' paradigm with LLMs. Recognizing that the logical associations among passages and the structural relations within documents can be unified into a graph representation, we propose a Knowledge Graph Prompting (KGP) method for aiding LLMs in MD-QA. KGP constructs KGs from documents, with nodes depicting sentences or document structures and edges denoting their lexical/semantic similarity or structural relations. Since the constructed KGs may contain irrelevant neighbor information, we further design an LM-guided graph traverser that selectively visits the most promising node for answering the question. In the future, we plan to investigate the capability of LLMs in understanding graph topology and explore the potential of fine-tuning/prompting LLMs to encode complex topological signals hidden in the graph.
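The idea of selectively visiting the most promising neighbor can be sketched as a greedy best-first walk. In this sketch a toy token-overlap function stands in for the language model, and the scorer is conditioned on the question plus the passages gathered so far. The names `overlap_score`, `traverse`, and `budget`, as well as the sample data, are hypothetical and not taken from the paper.

```python
import re


def overlap_score(query, passage):
    """Toy stand-in for the LM: fraction of query tokens found in the passage."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    p = set(re.findall(r"[a-z0-9]+", passage.lower()))
    return len(q & p) / len(q) if q else 0.0


def traverse(question, nodes, adjacency, score=overlap_score, budget=3):
    """Greedy best-first walk; returns visited node ids in visiting order."""
    if not nodes or budget <= 0:
        return []
    # Seed with the passage that best matches the question alone.
    start = max(nodes, key=lambda n: score(question, nodes[n]))
    visited = [start]
    frontier = set(adjacency[start])
    while frontier and len(visited) < budget:
        # Condition the scorer on the question plus the evidence so far.
        context = question + " " + " ".join(nodes[n] for n in visited)
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(frontier), key=lambda n: score(context, nodes[n]))
        visited.append(best)
        frontier.discard(best)
        frontier |= adjacency[best] - set(visited)
    return visited


# Hypothetical passage graph used only for illustration.
nodes = {
    "p1": "Marie Curie was born in Warsaw.",
    "p2": "Warsaw is the capital of Poland.",
    "p3": "Bananas contain a lot of potassium.",
}
adjacency = {"p1": {"p2", "p3"}, "p2": {"p1"}, "p3": {"p1"}}
question = "In which country was Marie Curie born?"
visited = traverse(question, nodes, adjacency, budget=2)
```

Here `p2` shares no tokens with the question itself, yet it is selected second because the context already contains "Warsaw" from `p1`. This is the kind of bridging step that restricting candidates to graph neighbors is meant to support. A real traverser would pass an LM-based scoring function as `score`.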