Feature-based detection of automated language models: Tackling GPT-2, GPT-3 and Grover

Fröhling, Leon; Zubiaga, Arkaitz

dc.identifier.uri	http://dx.doi.org/10.15488/15750
dc.identifier.uri	https://www.repo.uni-hannover.de/handle/123456789/15874
dc.contributor.author	Fröhling, Leon
dc.contributor.author	Zubiaga, Arkaitz
dc.date.accessioned	2023-12-14T06:39:40Z
dc.date.available	2023-12-14T06:39:40Z
dc.date.issued	2021
dc.identifier.citation	Fröhling, L.; Zubiaga, A.: Feature-based detection of automated language models: Tackling GPT-2, GPT-3 and Grover. In: PeerJ Computer Science 7 (2021), e443. DOI: https://doi.org/10.7717/peerj-cs.443
dc.description.abstract	The recent improvements of language models have drawn much attention to potential cases of use and abuse of automatically generated text. Great effort is put into the development of methods to detect machine generations among human-written text in order to avoid scenarios in which the large-scale generation of text with minimal cost and effort undermines the trust in human interaction and factual information online. While most of the current approaches rely on the availability of expensive language models, we propose a simple feature-based classifier for the detection problem, using carefully crafted features that attempt to model intrinsic differences between human and machine text. Our research contributes to the field in producing a detection method that achieves performance competitive with far more expensive methods, offering an accessible “first line-of-defense” against the abuse of language models. Furthermore, our experiments show that different sampling methods lead to different types of flaws in generated text.	eng
dc.language.iso	eng
dc.publisher	London : PeerJ, Ltd.
dc.relation.ispartofseries	PeerJ Computer Science 7 (2021)
dc.rights	CC BY 4.0 Unported
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Feature-based detection	eng
dc.subject	Language generation	eng
dc.subject	Language models	eng
dc.subject	NLP	eng
dc.subject	Text classification	eng
dc.subject.ddc	004 | Computer science
dc.title	Feature-based detection of automated language models: Tackling GPT-2, GPT-3 and Grover	eng
dc.type	Article
dc.type	Text
dc.relation.essn	2376-5992
dc.relation.doi	https://doi.org/10.7717/peerj-cs.443
dc.bibliographicCitation.volume	7
dc.bibliographicCitation.firstPage	e443
dc.description.version	publishedVersion
tib.accessRights	freely accessible

Name: Feature-based_det ...

Size: 1.299Mb

Format: PDF

This publication appears in collection(s):

Wirtschaftswissenschaftliche Fakultät (Faculty of Economics)
Freely accessible publications from the Wirtschaftswissenschaftliche Fakultät
