Dynamic Planning with a LLM
While Large Language Models (LLMs) can solve many NLP tasks in zero-shot settings, applications involving embodied agents remain problematic. In particular, complex plans that require multi-step reasoning become difficult and too costly as the context window grows. Planning requires understanding the likely effects of one’s actions and identifying whether the current environment satisfies the goal state. While symbolic planners find optimal solutions quickly, they require a complete and accurate representation of the planning problem, severely limiting their use in practical scenarios. In contrast, modern LLMs cope with noisy observations and high levels of uncertainty when reasoning about a task. Our work presents LLM Dynamic Planner (LLM-DP): a neurosymbolic framework where an LLM works hand-in-hand with a traditional planner to solve an embodied task. Given action-descriptions, LLM-DP solves Alfworld faster and more efficiently than a naive LLM ReAct baseline.
Introduction. Large Language Models (LLMs), like GPT-4 (OpenAI, 2023), have proven remarkably effective at various natural language processing tasks, particularly in zero-shot or few-shot settings (Brown et al., 2020). However, employing LLMs in embodied agents, which interact with dynamic environments, presents substantial challenges. LLMs tend to generate incorrect or spurious information, a phenomenon known as hallucination, and their performance is brittle to the phrasing of prompts (Ji et al., 2022). Moreover, LLMs are ill-equipped for naive long-term planning since managing an extensive context over multiple steps is complex and resource-consuming (Silver et al., 2022; Liu et al., 2023). Various approaches have aimed to mitigate some of these limitations. For instance, methods like Chain-of-Thought (Wei et al., 2022) and Self-Consistency (Wang et al., 2023b) augment the context with reasoning traces.
Discussion / Conclusion. The LLM-DP agent effectively integrates language understanding, symbolic planning, and state tracking in a dynamic environment. It uses the language model to understand tasks and scenes expressed in natural language, constructs and solves planning problems to decide on a course of action, and keeps track of the world state to adapt to changes and make informed decisions. This workflow enables the agent to perform complex tasks in the Alfworld environment, making it a promising approach for embodied tasks that involve language understanding, reasoning, and decision-making. LLM-DP offers a cost and efficiency trade-off between a wholly symbolic solution and an LLM-only model. The LLM’s semantic knowledge of the world is leveraged to translate the problem into PDDL while guiding the search process through belief instantiation. We find that not only is LLM-DP cheaper, on a per-token comparison, but it is also faster and more successful at long-term planning in an embodied environment. LLM-DP validates the need for LLM research to incorporate specialised tools, such as PDDL solvers, in embodied agents to promote valid plans. Despite these promising results, numerous topics and unresolved issues remain open for future investigation.
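The closed loop summarised in the conclusion (instantiate a belief about the unknown parts of the world, plan symbolically, act, observe, and replan when an observation contradicts the belief) can be illustrated with a toy sketch. This is not the LLM-DP implementation: every name here (LOCATIONS, plan, run_episode, propose_beliefs) is hypothetical, a breadth-first search stands in for the PDDL solver, and a plain ranking function stands in for the LLM that proposes likely object locations.

```python
from collections import deque

# Hypothetical toy world: the goal object is hidden in one of these places.
LOCATIONS = ["cabinet", "countertop", "drawer"]


def plan(agent_at, believed_loc, goal_obj):
    """Breadth-first search over (location, holding) states.

    Stand-in for a symbolic PDDL solver: it assumes the belief is true
    and returns the shortest action sequence that ends holding the object.
    """
    start = (agent_at, False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (loc, holding), steps = queue.popleft()
        if holding:
            return steps
        successors = [((dest, False), ("goto", dest))
                      for dest in LOCATIONS if dest != loc]
        if loc == believed_loc:
            successors.append(((loc, True), ("take", goal_obj)))
        for state, action in successors:
            if state not in seen:
                seen.add(state)
                queue.append((state, steps + [action]))
    return None


def run_episode(true_loc, goal_obj="mug", propose_beliefs=None):
    """Plan, act, observe, and replan until the object is taken.

    propose_beliefs ranks the remaining candidate locations; it is a
    placeholder for the LLM's belief instantiation (assumption).
    """
    propose = propose_beliefs or (lambda candidates: list(candidates))
    candidates = list(LOCATIONS)      # locations not yet ruled out
    agent_at, trace = "start", []
    while candidates:
        believed = propose(candidates)[0]
        for action, arg in plan(agent_at, believed, goal_obj):
            trace.append((action, arg))
            if action == "goto":
                agent_at = arg
                if agent_at != true_loc:
                    # Observation contradicts the belief: update and replan.
                    candidates.remove(agent_at)
                    break
            elif action == "take":
                return trace
    return trace
```

With the default ranking the agent checks locations in list order, so an object in the drawer costs four actions. A ranking that places the drawer first, as a well-informed prior might, reaches the goal in two. In this toy setting, that gap is what better belief instantiation buys.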