Retrieval, crawling and fusion of entity-centric data on the web

Download statistics - Document (COUNTER):

Dietze, S.: Retrieval, crawling and fusion of entity-centric data on the web. In: Lecture Notes in Computer Science 10151 (2017), pp. 3-16. DOI:

Repository version

To cite the version in the repository, please use this identifier:

Sum total of downloads: 250

While the Web of (entity-centric) data has seen tremendous growth over the past years, take-up and re-use are still limited. Data vary heavily with respect to their scale, quality, coverage and dynamics, which poses challenges for tasks such as entity retrieval or search. This chapter provides an overview of approaches to deal with the increasing heterogeneity of Web data. On the one hand, recommendation, linking, profiling and retrieval can provide efficient means to enable discovery and search of entity-centric data, specifically when dealing with traditional knowledge graphs and linked data. On the other hand, embedded markup such as Microdata and RDFa has emerged as a novel, Web-scale source of entity-centric knowledge. Markup has seen increasing adoption over the last few years, driven by dedicated initiatives, and it constitutes an increasingly important source of entity-centric data on the Web, being in the same order of magnitude as the Web itself with regard to dynamics and scale. Markup data therefore lends itself as a data source for tasks such as knowledge base augmentation, where data fusion techniques are required to address the inherent characteristics of markup data, such as its redundancy, heterogeneity and lack of links. Future directions are concerned with the exploitation of the complementary nature of markup data and traditional knowledge graphs. The final publication is available at Springer via 10.1007/978-3-319-53640-8_1.
License of this version: German copyright law applies. The document may be used free of charge for personal use, but it may not be made available on the Internet or passed on to third parties.
Document Type: Text
Publishing status: acceptedVersion
Issue Date: 2017
Appears in Collections: Fakultät für Elektrotechnik und Informatik

downloads by country:

pos.  country              downloads (total)  downloads (perc.)
1     Germany              105                42.00%
2     Algeria              81                 32.40%
3     China                21                 8.40%
4     Russian Federation   12                 4.80%
5     India                4                  1.60%
6     Greece               4                  1.60%
7     United Kingdom       4                  1.60%
8     United States        3                  1.20%
9     France               3                  1.20%
10    Austria              2                  0.80%
      other countries      11                 4.40%

The download statistics are collected according to internationally recognized rules and standards, in line with the "COUNTER Code of Practice for e-Resources". COUNTER is an international non-profit organization in which library associations, database providers and publishers work together on standards for collecting, storing and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the corresponding full texts are counted, not visits to the website itself.
