Pereira Nunes, B.; Mera, A.; Casanova, M.A.; Fetahu, B.; Paes Leme, L.A.P.; Dietze, S.: Complex matching of RDF datatype properties. In: Decker, H.; Lhotská, L.; Link, S.; Basl, J.; Tjoa, A M. (Eds.): Database and Expert Systems Applications : 24th International Conference, DEXA 2013, Prague, Czech Republic, August 26-29, 2013, Proceedings, Part I. Heidelberg : Springer, 2013 (Lecture Notes in Computer Science ; 8055), pp. 195-208. DOI: https://doi.org/10.1007/978-3-642-40285-2_18
Abstract: | Property mapping is a fundamental component of ontology matching, and yet there is little support that goes beyond the identification of single property matches. Real data often requires some degree of composition, ranging from the trivial mapping of "first name" and "last name" to "full name" at one end of the spectrum to complex matchings, such as parsing and pairing symbol/digit strings to SSN numbers, at the other. In this paper, we propose a two-phase instance-based technique for complex datatype property matching. Phase 1 computes the Estimated Mutual Information matrix of the property values to (1) find simple, 1:1 matches, and (2) compute a list of possible complex matches. Phase 2 applies Genetic Programming to the much reduced search space of candidate matches to find complex matches. We conclude with experimental results that illustrate how the technique works. Furthermore, we show that the proposed technique greatly improves results over those obtained if the Estimated Mutual Information matrix or the Genetic Programming techniques were to be used independently. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40285-2_18. |
License of this version: | German copyright law applies. The document may be used free of charge for personal use, but may not be made available on the Internet or passed on to third parties. |
Document Type: | BookPart |
Publishing status: | acceptedVersion |
Issue Date: | 2013 |
Appears in Collections: | Fakultät für Elektrotechnik und Informatik |
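The first phase described in the abstract scores pairs of properties by how much information their values share. The following is a minimal sketch of that idea, not the authors' formulation: it assumes the instances of the two datasets are already aligned row by row, and it uses the standard plug-in (frequency-based) estimate of mutual information. The function names `estimated_mutual_information` and `mi_matrix` and the sample property names are hypothetical.

```python
import math
from collections import Counter


def estimated_mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equally long lists of paired values. Assumption: xs[i] and ys[i]
    describe the same instance."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # co-occurrence counts of value pairs
    count_x = Counter(xs)         # marginal counts for the first property
    count_y = Counter(ys)         # marginal counts for the second property
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def mi_matrix(source, target):
    """Score every (source property, target property) pair.
    Both arguments map a property name to its list of values."""
    return {
        (s, t): estimated_mutual_information(s_values, t_values)
        for s, s_values in source.items()
        for t, t_values in target.items()
    }


# Hypothetical sample data: four aligned instances in two datasets.
source = {"firstName": ["Ana", "Bo", "Ana", "Bo"],
          "city": ["Rio", "Rio", "Oslo", "Oslo"]}
target = {"givenName": ["Ana", "Bo", "Ana", "Bo"]}
scores = mi_matrix(source, target)
best = max(scores, key=scores.get)
```

In this toy data, `firstName` and `givenName` always change together, so that pair receives the highest score, while `city` is statistically independent of `givenName` and scores zero. A high score only flags a candidate 1:1 match; finding compositions such as concatenations is left to the second, Genetic Programming phase.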
pos. | country | downloads (total) | downloads (perc.)
---|---|---|---
1 | Germany | 94 | 44.76%
2 | United States | 35 | 16.67%
3 | China | 11 | 5.24%
4 | France | 6 | 2.86%
5 | Brazil | 6 | 2.86%
6 | Taiwan | 5 | 2.38%
7 | Netherlands | 5 | 2.38%
8 | Korea, Republic of | 5 | 2.38%
9 | Ireland | 5 | 2.38%
10 | Spain | 4 | 1.90%
 | other countries | 34 | 16.19%
Note
Download statistics are collected in accordance with the internationally recognized rules and standards of the "COUNTER Code of Practice for e-Resources". COUNTER is an international non-profit organization in which library associations, database providers, and publishers jointly develop standards for collecting, storing, and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the full texts are counted, not visits to the website itself.