Ask an Expert: Leveraging Language Models to Improve Strategic Reasoning in Goal-Oriented Dialogue Models

Paper · arXiv 2305.17878 · Published May 29, 2023
Conversation Architecture and Structure · Domain Specialization in LLMs

Existing dialogue models may encounter scenarios which are not well-represented in the training data, and as a result generate responses that are unnatural, inappropriate, or unhelpful. We propose the “Ask an Expert” framework in which the model is trained with access to an “expert” which it can consult at each turn. Advice is solicited via a structured dialogue with the expert, and the model is optimized to selectively utilize (or ignore) it given the context and dialogue history. In this work the expert takes the form of an LLM. We evaluate this framework in a mental health support domain, where the structure of the expert conversation is outlined by pre-specified prompts which reflect a reasoning strategy taught to practitioners in the field. BlenderBot models utilizing “Ask an Expert” show quality improvements across all expert sizes, including those with fewer parameters than the dialogue model itself. Our best model provides a ∼10% improvement over baselines, approaching human-level scores on “engagingness” and “helpfulness” metrics.
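The consult-then-respond loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the prompt wording in `EXPERT_PROMPTS`, the `[ADVICE]` separator, and the names `consult_expert`, `build_model_input`, and `stub_expert` are all hypothetical, and the expert is a stand-in callable rather than a real LLM.

```python
from typing import Callable, Sequence

# Hypothetical prompt templates (assumption): the paper's actual
# pre-specified prompts are not reproduced here.
EXPERT_PROMPTS = (
    "What is the person feeling?",
    "Why might the person feel this way?",
    "What could the supporter say next?",
)

# An "expert" is any callable mapping a text query to a text answer.
Expert = Callable[[str], str]


def consult_expert(expert: Expert,
                   history: Sequence[str],
                   prompts: Sequence[str] = EXPERT_PROMPTS) -> str:
    """Hold a short structured conversation with the expert.

    Each question is appended to the running expert dialogue, so later
    questions see the earlier answers. Returns the answers joined as advice.
    """
    expert_dialogue = "\n".join(history)
    answers = []
    for question in prompts:
        expert_dialogue += f"\nQ: {question}\nA:"
        answer = expert(expert_dialogue).strip()
        expert_dialogue += f" {answer}"
        answers.append(answer)
    return " ".join(answers)


def build_model_input(history: Sequence[str], advice: str,
                      sep: str = " [ADVICE] ") -> str:
    """Place the advice in the dialogue model's context after the history.

    The separator token is an assumption made for this sketch.
    """
    return "\n".join(history) + sep + advice


def stub_expert(query: str) -> str:
    """Stand-in for an LLM expert: echoes the most recent question."""
    question = query.rsplit("Q: ", 1)[1].split("\nA:")[0]
    return f"(answer to: {question})"


history = ["seeker: I failed my exam.", "supporter: I'm sorry to hear that."]
advice = consult_expert(stub_expert, history)
model_input = build_model_input(history, advice)
```

In this sketch the dialogue model would be fine-tuned on inputs shaped like `model_input`, which is where it could learn to use or ignore the advice depending on the context; swapping `stub_expert` for a call to an actual LLM would leave the rest unchanged.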

Introduction. Dialogue systems based on pre-trained language models (PLMs) can be easily tailored via finetuning to exhibit particular characteristics, such as empathy (Roller et al., 2021) and emotion (Adiwardana et al., 2020). However, it has been previously observed that such models tend to produce vacuous “fallback” responses when presented with unfamiliar situations (e.g., extraneous (Li et al., 2016; Adiwardana et al., 2020)). For instance, we observe that fine-tuned BlenderBot (Roller et al., 2021) models have a propensity to use the response, “Do you have any hobbies?” as a substitute for furthering the conversation in helpful ways when the situation becomes too complicated. For goal-directed dialogues, where the discourse should consistently move towards a desired resolution or effect (Ham et al., 2020), frequent reliance on such fallback responses may cause these models to perform poorly. We hypothesize that the use of fallback responses may stem from the model being unable to formulate a more suitable reply in the absence of appropriate knowledge of the situation.

Discussion / Conclusion. What are the advantages of utilizing LLMs for strategic reasoning? Goal-oriented dialogue systems not based upon LLMs often rely on inferring dialogue states to carry out only meaningful conversations, and thus depend heavily on the definition of the task and an ontology of possible dialogue trajectories (Xie et al., 2022). This makes the systems brittle and open to catastrophic errors when the dialogue breaks significantly from the categories of the ontology. LLMs show similar ontological knowledge and planning ability in many domains, but are more flexible. Because LLM experts are language models, interfacing with them is as straightforward as establishing a short goal-oriented conversation, and incorporating their responses into the dialogue model via the model’s context is similarly easy. In that sense, utilizing LLMs greatly reduces the effort of defining a complicated ontology and a dialogue state tracking module, since the LLM provides the necessary reasoning power and knowledge. Why not use GPT-3 directly for dialogue generation? Is the dialogue model still necessary when there is an expert model?