NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature

D'Souza, Jennifer; Auer, Sören

Download statistics - Document (COUNTER):

D'Souza, J.; Auer, S.: NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature. In: Zhang, Chengzhi; Mayr, Philipp; Lu, Wei; Zhang, Yi (Eds.): EEKE 2020, 1st Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents : proceedings of the 1st Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents, co-located with the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL 2020). Aachen, Germany : RWTH Aachen, 2020 (CEUR Workshop Proceedings ; 2658), pp. 16-27.

Repository version

To cite the version in the repository, please use this identifier: https://doi.org/10.15488/16292

Sum total of downloads: 20

File:	NLPContributions.pdf
Size:	2.49 MB
Format:	Adobe PDF

Abstract:
We describe an annotation initiative to capture the scholarly contributions in natural language processing (NLP) articles, particularly articles that discuss machine learning (ML) approaches for various information extraction tasks. We develop the annotation task based on a pilot annotation exercise on 50 NLP-ML scholarly articles presenting contributions to five information extraction tasks: 1) machine translation, 2) named entity recognition, 3) question answering, 4) relation classification, and 5) text classification. In this article, we describe the outcomes of this pilot annotation phase. Through the exercise we obtained an annotation methodology and found ten core information units that reflect the contribution of the NLP-ML scholarly investigations. The annotation scheme we developed based on these information units is called NLPContributions. The overarching goal of our endeavor is four-fold: 1) to find a systematic set of patterns of subject-predicate-object statements for the semantic structuring of scholarly contributions that are more or less generically applicable to NLP-ML research articles; 2) to apply the discovered patterns in the creation of a larger annotated dataset for training machine readers [18] of research contributions; 3) to ingest the dataset into the Open Research Knowledge Graph (ORKG) infrastructure as a showcase for creating user-friendly state-of-the-art overviews; and 4) to integrate the machine readers into the ORKG to assist users in the manual curation of their respective article contributions. We envision that the NLPContributions methodology will engender a wider discussion on the topic toward its further refinement and development. Our pilot dataset of 50 NLP-ML scholarly articles, annotated according to the NLPContributions scheme, is openly available to the research community at https://doi.org/10.25835/0019761.
License of this version:	CC BY 4.0 Unported
Document Type:	BookPart
Publishing status:	publishedVersion
Issue Date:	2020
Appears in Collections:	Zentrale Einrichtungen Forschungszentren
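
The abstract refers to subject-predicate-object statements for structuring scholarly contributions. As a rough illustration of what such statements might look like in code, the following minimal Python sketch stores a few triples and groups them by subject. The `Triple` type, the `group_by_subject` helper, the unit name "Results", and all predicates and objects are hypothetical placeholders chosen for illustration; they are not taken from the NLPContributions scheme or from the ORKG data model.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# Hypothetical statements for an imaginary article. The unit name
# "Results" and every predicate here are illustrative assumptions,
# not names defined by the annotation scheme.
triples = [
    Triple("Contribution", "has", "Results"),
    Triple("Results", "evaluated on", "example dataset"),
    Triple("Results", "uses metric", "F1"),
]


def group_by_subject(statements):
    """Index statements by subject so each node lists its outgoing edges."""
    graph = defaultdict(list)
    for t in statements:
        graph[t.subject].append((t.predicate, t.obj))
    return dict(graph)


graph = group_by_subject(triples)
```

Grouping by subject gives a simple adjacency view of the statements, which is one plausible starting point for loading them into a knowledge graph; the actual ingestion format used by the authors is not described on this page.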

downloads by country:

pos.	country	downloads (total)	downloads (perc.)
1	Germany	15	75.00%
2	United States	3	15.00%
3	Russian Federation	2	10.00%

Note

Download statistics are collected in accordance with the internationally recognized rules and standards of the "COUNTER Code of Practice for e-Resources". COUNTER is an international non-profit organization in which library associations, database providers, and publishers work together on standards for collecting, storing, and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the corresponding full texts are counted, not visits to the website itself.
