Interpreting Text Classification with Human-Understandable Counterfactual Instances

Li, Teng

dc.identifier.uri	http://dx.doi.org/10.15488/11892	eng
dc.identifier.uri	https://www.repo.uni-hannover.de/handle/123456789/11987
dc.contributor.advisor	Anand, Avishek
dc.contributor.advisor	Lindauer, Marius
dc.contributor.author	Li, Teng	eng
dc.date.accessioned	2022-03-18T14:22:21Z
dc.date.available	2022-03-18T14:22:21Z
dc.date.issued	2022
dc.identifier.citation	Li, Teng: Interpreting Text Classification with Human-Understandable Counterfactual Instances. Hannover : Gottfried Wilhelm Leibniz Universität, Master Thesis, 2022, 26 S. DOI: http://doi.org/10.15488/11892	eng
dc.description.abstract	As machine learning models play increasingly important roles in our society, powerful interpretation tools are needed to open up these black boxes. At the same time, psychological research has shown that humans learn new concepts more readily when they are presented with contrastive instances. Interpreting ML models through the contrast between an original data instance and its counterfactuals has therefore become a popular research problem. Traditional counterfactual interpretation approaches tend to generate counterfactuals that are faithful to the ML model, but they place little or no constraint on the meaningfulness of the generated counterfactuals. This thesis proposes an approach that generates meaningful counterfactual interpretations of text classification models by constraining token replacements with cosine similarity and the POS (part-of-speech) properties of tokens. The model to be interpreted is a text CNN based on Kim's CNN \cite{KimsCnn} with a fine-tuned Word2Vec embedding layer. For counterfactual generation, I leverage token-level HotFlip \cite{hotflip} and replace tokens under several constraints. Finally, I show with several examples that my approach yields more meaningful counterfactual interpretations than the vanilla HotFlip approach.	eng
dc.language.iso	eng	eng
dc.publisher	Hannover : Gottfried Wilhelm Leibniz Universität Hannover
dc.rights	CC BY 3.0 DE	eng
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/de/	eng
dc.subject	Artificial Intelligence	eng
dc.subject	Interpretability	eng
dc.subject	Machine Learning	eng
dc.subject	Natural Language Processing	eng
dc.subject	AI	eng
dc.subject	NLP	eng
dc.subject	Künstliche Intelligenz, Interpretierbarkeit, Maschinelles Lernen, Verarbeitung natürlicher Sprache	ger
dc.subject.ddc	500 | Natural sciences	eng
dc.title	Interpreting Text Classification with Human-Understandable Counterfactual Instances	eng
dc.type	MasterThesis	eng
dc.type	Text	eng
dcterms.extent	26 S.
dc.description.version	publishedVersion	eng
tib.accessRights	freely accessible	eng
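
The constrained token replacement described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name, the `min_cos` threshold, and the one-POS-tag-per-vocabulary-entry lookup are assumptions made for illustration (in practice POS tags depend on context). The sketch uses the first-order HotFlip estimate, where the change in loss from swapping token a for token b is approximated by the gradient dotted with (e_b - e_a), and keeps only candidates that are close to the original token in cosine similarity and share its POS tag.

```python
import numpy as np

def constrained_hotflip_candidates(grad, emb, token_id, pos_tags, min_cos=0.5):
    """Rank replacement tokens for a single position (illustrative sketch).

    grad     : gradient of the loss w.r.t. the embedding at this position, shape (d,)
    emb      : embedding matrix, shape (V, d)
    token_id : vocabulary id of the token currently at this position
    pos_tags : length-V sequence with one POS tag per vocabulary entry
               (an assumed simplification; real POS tags are context dependent)
    min_cos  : assumed cosine-similarity threshold, not taken from the thesis
    """
    e_a = emb[token_id]

    # First-order HotFlip estimate of the loss change for every candidate b.
    scores = (emb - e_a) @ grad

    # Cosine similarity between the original token and every candidate.
    norms = np.linalg.norm(emb, axis=1) * np.linalg.norm(e_a)
    cos = (emb @ e_a) / np.maximum(norms, 1e-12)

    # Constraints: similar enough, same POS tag, and not the token itself.
    same_pos = np.array([tag == pos_tags[token_id] for tag in pos_tags])
    mask = (cos >= min_cos) & same_pos
    mask[token_id] = False

    candidates = np.flatnonzero(mask)
    order = np.argsort(-scores[candidates])  # largest estimated loss increase first
    return [(int(c), float(scores[c])) for c in candidates[order]]


# Toy usage with a four-word vocabulary and two-dimensional embeddings.
toy_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.6]])
toy_pos = ["ADJ", "ADJ", "ADJ", "NOUN"]
toy_grad = np.array([-1.0, 1.0])
ranked = constrained_hotflip_candidates(toy_grad, toy_emb, 0, toy_pos)
```

Dropping both constraints (a threshold of -1.0 and identical POS tags) reduces this to unconstrained, vanilla-style HotFlip ranking, which makes the effect of the constraints easy to compare on the same input.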

Name: master.pdf

Size: 1.502 MB

Format: PDF

This publication appears in the following collection(s):

Faculty of Electrical Engineering and Computer Science
Open-access publications from the Faculty of Electrical Engineering and Computer Science
