Citation needed: A taxonomy and algorithmic assessment of Wikipedia's verifiability


dc.identifier.uri http://dx.doi.org/10.15488/5061
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/5105
dc.contributor.author Redi, Miriam
dc.contributor.author Morgan, Jonathan
dc.contributor.author Fetahu, Besnik
dc.contributor.author Taraborelli, Dario
dc.date.accessioned 2019-07-02T07:58:23Z
dc.date.available 2019-07-02T07:58:23Z
dc.date.issued 2019
dc.identifier.citation Redi, M.; Morgan, J.; Fetahu, B.; Taraborelli, D.: Citation needed: A taxonomy and algorithmic assessment of Wikipedia's verifiability. In: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019, S. 1567-1578. DOI: https://doi.org/10.1145/3308558.3313618
dc.description.abstract Wikipedia is playing an increasingly central role on the web, and the policies its contributors follow when sourcing and fact-checking content affect millions of readers. Among these core guiding principles, verifiability policies have a particularly important role. Verifiability requires that information included in a Wikipedia article be corroborated against reliable secondary sources. Because of the manual labor needed to curate Wikipedia at scale, however, its contents do not always evenly comply with these policies. Citations (i.e., references to external sources) may not conform to verifiability requirements or may be missing altogether, potentially weakening the reliability of specific topic areas of the free encyclopedia. In this paper, we aim to provide an empirical characterization of the reasons why and how Wikipedia cites external sources to comply with its own verifiability guidelines. First, we construct a taxonomy of reasons why inline citations are required, by collecting labeled data from editors of multiple Wikipedia language editions. We then crowdsource a large-scale dataset of Wikipedia sentences annotated with categories derived from this taxonomy. Finally, we design algorithmic models to determine if a statement requires a citation, and to predict the citation reason. We evaluate the accuracy of such models across different classes of Wikipedia articles of varying quality, and on external datasets of claims annotated for fact-checking purposes. eng
dc.language.iso eng
dc.publisher New York, NY : Association for Computing Machinery, Inc
dc.relation.ispartofseries The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
dc.rights CC BY 4.0
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject Citations eng
dc.subject Crowdsourcing eng
dc.subject Neural Networks eng
dc.subject Wikipedia eng
dc.subject Large dataset eng
dc.subject Taxonomies eng
dc.subject Algorithmic model eng
dc.subject External sources eng
dc.subject Guiding principles eng
dc.subject Large-scale dataset eng
dc.subject Secondary sources eng
dc.subject Wikipedia articles eng
dc.subject Websites eng
dc.subject.classification Konferenzschrift ger
dc.subject.ddc 004 | Informatik ger
dc.title Citation needed: A taxonomy and algorithmic assessment of Wikipedia's verifiability
dc.type BookPart
dc.type Text
dc.relation.isbn 978-1-4503-6674-8
dc.relation.doi https://doi.org/10.1145/3308558.3313618
dc.bibliographicCitation.firstPage 1567
dc.bibliographicCitation.lastPage 1578
dc.description.version publishedVersion
tib.accessRights frei zugänglich

