Semantic Parsing for Task Oriented Dialog using Hierarchical Representations

Task oriented dialog systems typically first parse user utterances into semantic frames composed of intents and slots. Previous task oriented intent and slot-filling work has been restricted to one intent per query and one slot label per token, and thus cannot model complex compositional requests. Alternative semantic parsing systems have represented queries as logical forms, but these are challenging to annotate and parse. We propose a hierarchical annotation scheme for semantic parsing that allows the representation of compositional queries, and can be efficiently and accurately parsed by standard constituency parsing models. We release a dataset of 44k annotated queries, and show that parsing models outperform sequence-to-sequence approaches on this dataset.
Introduction. Intelligent personal assistants are now ubiquitous, but modeling the semantics of complex compositional natural language queries remains challenging. Typical systems classify the intent of a query (e.g. GET DIRECTIONS) and tag the necessary slots (e.g. San Francisco) (Mesnil et al., 2013; Liu and Lane, 2016). It is difficult for such representations to adequately represent nested queries such as “Driving directions to the Eagles game”, which is composed of GET DIRECTIONS and GET EVENT intents. We explore a hierarchical representation for such queries, which dramatically improves the expressive power while remaining accurate and efficient to annotate and parse (see Figure 1). We introduce a Task Oriented Parsing (TOP) representation for intent-slot based dialog systems. This hierarchical representation is expressive enough to capture the semantics of complex nested queries, but is easier to annotate and parse than alternative representations such as logical forms or dependency graphs.
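To make the nesting concrete, the following is a minimal Python sketch of how a tree for the example query might be stored and serialized. It is an illustration under stated assumptions, not the released annotation format: the `Node` class, the `IN:`/`SL:` label prefixes, the bracketed linearization, the underscore spelling of the intent names, and the slot labels `DESTINATION`, `NAME_EVENT` and `CAT_EVENT` are all hypothetical; only the two intents are named in the text above.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """An intent ("IN:") or slot ("SL:") node; children are tokens or nested nodes.

    The label prefixes and this class are illustrative assumptions.
    """
    label: str
    children: List[Union[str, "Node"]] = field(default_factory=list)

    def tokens(self) -> List[str]:
        """Return the utterance tokens covered by this node, left to right."""
        out: List[str] = []
        for child in self.children:
            if isinstance(child, Node):
                out.extend(child.tokens())
            else:
                out.append(child)
        return out

    def intents(self) -> List[str]:
        """Return all intent labels in the subtree, in pre-order."""
        found = [self.label] if self.label.startswith("IN:") else []
        for child in self.children:
            if isinstance(child, Node):
                found.extend(child.intents())
        return found

    def bracketed(self) -> str:
        """Serialize the subtree as a bracketed string (assumed format)."""
        parts = [c.bracketed() if isinstance(c, Node) else c for c in self.children]
        return "[" + " ".join([self.label] + parts) + "]"


# "Driving directions to the Eagles game": a GET_EVENT intent nested inside
# a slot of GET_DIRECTIONS. Slot labels here are hypothetical.
tree = Node("IN:GET_DIRECTIONS", [
    "Driving", "directions", "to",
    Node("SL:DESTINATION", [
        Node("IN:GET_EVENT", [
            "the",
            Node("SL:NAME_EVENT", ["Eagles"]),
            Node("SL:CAT_EVENT", ["game"]),
        ]),
    ]),
])

print(tree.bracketed())
print(tree.intents())
```

A flat scheme with one intent per query and one slot label per token has no place to put the inner intent; in this sketch it is simply a child of a slot node, and the leaves of the tree still spell out the original utterance.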
Discussion / Conclusion. While sequence-to-sequence models have shown strong parsing performance when trained on very large amounts of data (Vinyals et al., 2015), in our setting the inductive bias provided by the RNNG model is crucial to achieving high performance. The model has several useful biases, such as guaranteeing a well-formed output tree, and shortening the dependencies between intents and their slots. A further advantage of RNNG is that inference has linear time complexity, whereas seq2seq models are quadratic because attention is recomputed at every time step. Drawing on ideas from slot-filling and semantic parsing, we introduce a hierarchical generalization of traditional intents and slots that allows the representation of complex nested queries, leading to 30% higher coverage of user requests. We show that the representation can be annotated with high agreement. We are releasing a large dataset of annotated utterances at http://fb.me/semanticparsingdialog. The representation allows the use of existing constituency parsing algorithms, resulting in higher accuracy than sequence-to-sequence models.
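The well-formedness guarantee matters because a model that emits a tree as a flat token sequence can produce unbalanced or stray brackets, which then have to be detected and rejected or repaired. The check below is a minimal Python sketch of such a filter. It assumes a bracketed linearization in which `[LABEL` opens a node and `]` closes it; both that format and the function name `is_well_formed` are our illustrative assumptions and are not taken from any released code.

```python
def is_well_formed(linearized: str) -> bool:
    """Return True if the string encodes exactly one balanced bracketed tree.

    Assumes a hypothetical linearization: "[LABEL" opens a node, "]" closes
    it, and tokens are separated by whitespace.
    """
    # Pad closing brackets so that "]]]" splits into three separate tokens.
    tokens = linearized.replace("]", " ] ").split()
    depth = 0   # number of currently open nodes
    roots = 0   # number of top-level nodes seen so far
    for tok in tokens:
        if tok.startswith("["):
            if len(tok) == 1:
                return False          # opening bracket with no label
            if depth == 0:
                roots += 1
            depth += 1
        elif tok == "]":
            depth -= 1
            if depth < 0:
                return False          # closing bracket with nothing open
        elif depth == 0:
            return False              # utterance token outside any node
    return depth == 0 and roots == 1


print(is_well_formed("[IN:GET_DIRECTIONS Driving directions to [SL:DESTINATION the Eagles game]]"))
print(is_well_formed("[IN:GET_DIRECTIONS Driving directions to [SL:DESTINATION the Eagles game]"))
```

A parser that builds the tree through explicit open and close actions never needs this filter, since it cannot close a node that was not opened or stop with nodes still open; a sequence decoder has to learn that constraint from data.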