Exploiting Rationale Data for Explainable NLP Models

Date
2021-09-15
Publisher
Hannover : Gottfried Wilhelm Leibniz Universität
Abstract

In recent years, deep learning models have become very powerful, even outperforming humans on a variety of tasks. This enables more real-world applications, including in sensitive fields such as medical diagnosis or the legal domain. Beyond achieving sufficiently good performance, the ability to justify and explain a model's decisions is becoming increasingly important.

This work aims to enable a broader application of a specific model class that is inherently interpretable, namely explain-then-predict models, by reducing the annotation cost of the explanations. We focus on the ExPred model as a representative of explain-then-predict models.

We investigate its dependency on rationale annotations, a special kind of explanation, by training with progressively fewer rationale-labeled instances. Furthermore, we explore approaches that aim to reduce the number of human-labeled instances required during training, such as active learning and weak supervision. Our results show that even when only a fraction of the instances in the original dataset are annotated with rationales, ExPred still achieves good performance (within 95% of the performance obtained with 100% annotation). Depending on the dataset, only a few thousand annotated rationales are required. With weak supervision, this number can be reduced further, at least in specific settings: on the Movie Reviews dataset, we achieve good performance with only 5% of the original rationale labels. The tested off-the-shelf active learning methods do not provide any benefit over randomly selecting instances to label. However, the extensive behavior analysis enables the future design of active learning methods that are tailored to explain-then-predict models. As a first step, we propose an active learning method that outperforms the random baseline on the Movie Reviews dataset.
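
The reduced-annotation setup described above can be illustrated with a minimal sketch. It is not the thesis code: the instance layout (a dict with a "rationale" key), the function names, and the use of uniform random sampling are assumptions made for illustration. The first function keeps rationale annotations for a random fraction of instances, and the second picks the smallest fraction whose score stays within 95% of the full-annotation score.

```python
import random


def subsample_rationales(instances, fraction, seed=0):
    """Keep rationale annotations for a random `fraction` of instances.

    Each instance is assumed to be a dict with a hypothetical "rationale"
    key. Instances outside the sample keep all other fields (e.g. the task
    label) but have their rationale set to None.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)  # seeded for reproducible subsets
    n_keep = round(fraction * len(instances))
    keep = set(rng.sample(range(len(instances)), n_keep))
    return [
        inst if i in keep else {**inst, "rationale": None}
        for i, inst in enumerate(instances)
    ]


def smallest_sufficient_fraction(score_by_fraction, threshold=0.95):
    """Return the smallest annotation fraction that is still 'good enough'.

    `score_by_fraction` maps annotation fraction -> evaluation score
    (assumed non-negative) and must contain the key 1.0 for full
    annotation. A fraction qualifies if its score is at least
    `threshold` times the full-annotation score.
    """
    full_score = score_by_fraction[1.0]
    sufficient = [
        frac
        for frac, score in score_by_fraction.items()
        if score >= threshold * full_score
    ]
    return min(sufficient)
```

In such a setup, one would train a model on the output of `subsample_rationales` for each fraction of interest, record the evaluation score per fraction, and pass the resulting mapping to `smallest_sufficient_fraction`.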

License
CC BY 3.0 DE