Informed Named Entity Recognition Decoding For Generative Language Models
Ever-larger language models with ever-increasing capabilities are by now well-established text processing tools. Alas, information extraction tasks such as named entity recognition are still largely unaffected by this progress as they are primarily based on the previous generation of encoder-only transformer models. Here, we propose a simple yet effective approach, Informed Named Entity Recognition Decoding (iNERD), which treats named entity recognition as a generative process. It leverages the language understanding capabilities of recent generative models in a future-proof manner and employs an informed decoding scheme incorporating the restricted nature of information extraction into open-ended text generation, improving performance and eliminating any risk of hallucinations. We coarse-tune our model on a merged named entity corpus to strengthen its performance, evaluate five generative language models on eight named entity recognition datasets, and achieve remarkable results, especially in an environment with an unknown entity class set, demonstrating the adaptability of the approach.
Introduction. Recent public releases of large language models (LLMs) with human-like writing skills have drawn unprecedented attention to natural language processing (NLP). Indeed, transformer-based LLMs develop “emergent abilities”, i.e. their performance increases significantly once their number of parameters exceeds a certain level (Wei et al., 2022). On the other hand, tasks not based on generative transformers, such as sentiment analysis, contradiction detection, or named entity recognition, have taken a back seat in this latest push in NLP. As of this writing, they are usually tackled using “encoder-only” language models (Heinsen, 2022; Deußer et al., 2023; Verma et al., 2023), which are typically much smaller than their “decoder-only” counterparts. Here, we intend to narrow the gap between generative and extractive NLP and introduce a novel named entity recognition (NER) framework. Our Informed Named Entity Recognition Decoding (iNERD) approach has three main features: First, it leverages proven capabilities of “decoder-only” models.
Discussion / Conclusion. We introduced a novel approach for named entity recognition (NER) which leverages the outstanding language understanding capabilities of modern large language models (LLMs). Our Informed Named Entity Recognition Decoding (iNERD) algorithm is easy to implement and arguably as simple as the “encoder-only” transformer plus multilayer-perceptron classifier approach proposed in the seminal BERT paper (Devlin et al., 2019). It builds on top of recent LLMs and is thus future-proof, as the employed LLMs can easily be replaced by improved models whenever they become available. It also incorporates an informed decoding scheme which further improves performance, eliminates any risk of hallucinations, and significantly increases adaptability. This informed scheme leverages the named entity decoding structure proposed herein to mask out disallowed tokens during the prediction phase. Extensive experimental validation shows the performance of our framework to be mostly on par with competing “encoder-only” approaches, if not better.
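To illustrate the idea of masking out disallowed tokens during prediction, the following is a minimal sketch and not the implementation evaluated in this work. It assumes a toy vocabulary and hand-written logits, and it assumes a simplified output structure in which an entity span may only use tokens from the input sentence and an entity class may only use tokens from the known class set. The names mask_logits, greedy_pick, VOCAB, INPUT_TOKEN_IDS and CLASS_TOKEN_IDS are hypothetical and chosen for illustration only.

```python
import math
from typing import List, Sequence, Set


def mask_logits(logits: Sequence[float], allowed: Set[int]) -> List[float]:
    """Return a copy of `logits` with every token id outside `allowed` set to -inf."""
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_pick(logits: Sequence[float], allowed: Set[int]) -> int:
    """Pick the highest-scoring token id among the allowed ones."""
    if not allowed:
        raise ValueError("allowed set must not be empty")
    masked = mask_logits(logits, allowed)
    return max(range(len(masked)), key=masked.__getitem__)


# Toy vocabulary (assumption): ids 0-2 occur in the input sentence,
# ids 3-4 are entity class labels, id 5 is a word that is NOT in the input.
VOCAB = ["Berlin", "is", "nice", "LOC", "PER", "Paris"]
INPUT_TOKEN_IDS = {0, 1, 2}
CLASS_TOKEN_IDS = {3, 4}

# Hand-written scores standing in for a language model's next-token logits.
# Without masking, the span step would prefer "Paris" (id 5), a hallucination.
span_logits = [2.0, 0.1, 0.3, 0.5, 0.2, 3.0]
# Without masking, the class step would prefer "Berlin" (id 0), not a class.
class_logits = [1.0, 0.0, 0.0, 0.9, 0.4, 0.0]

# Entity span step: only tokens copied from the input are allowed.
span_token = greedy_pick(span_logits, INPUT_TOKEN_IDS)
# Entity class step: only tokens from the class set are allowed.
class_token = greedy_pick(class_logits, CLASS_TOKEN_IDS)

print(VOCAB[span_token], VOCAB[class_token])
```

Because the class set enters only through the set of allowed token ids, swapping in a different class set in this sketch requires no change to the scoring model, which is one way to read the adaptability claim above.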