FAIR data representation in times of eScience: a comparison of instance-based and class-based semantic representations of empirical data using phenotype descriptions as example

dc.identifier.uri http://dx.doi.org/10.15488/12211
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/12309
dc.contributor.author Vogt, Lars
dc.date.accessioned 2022-06-09T07:10:55Z
dc.date.available 2022-06-09T07:10:55Z
dc.date.issued 2021
dc.identifier.citation Vogt, L.: FAIR data representation in times of eScience: a comparison of instance-based and class-based semantic representations of empirical data using phenotype descriptions as example. In: Journal of biomedical semantics 12 (2021), 20. DOI: https://doi.org/10.1186/s13326-021-00254-0
dc.description.abstract Background: The size, velocity, and heterogeneity of Big Data outclass conventional data management tools and require data and metadata to be fully machine-actionable (i.e., eScience-compliant) and thus findable, accessible, interoperable, and reusable (FAIR). This can be achieved by using ontologies and by representing data and metadata as semantic graphs. Here, we discuss two different semantic graph approaches to representing empirical data and metadata in a knowledge graph, with phenotype descriptions as an example. Almost all phenotype descriptions are still being published as unstructured natural language texts, with far-reaching consequences for their FAIRness, substantially impeding their overall usability within the life sciences. However, with a growing number of anatomy ontologies and emerging semantic applications, a solution to this problem becomes available. Researchers are starting to document and communicate phenotype descriptions through the Web in the form of highly formalized and structured semantic graphs that use ontology terms and Uniform Resource Identifiers (URIs) to circumvent the problems connected with unstructured texts. Results: Using phenotype descriptions as an example, we compare and evaluate two basic representations of empirical data and their accompanying metadata in the form of semantic graphs: the class-based TBox semantic graph approach called Semantic Phenotype and the instance-based ABox semantic graph approach called Phenotype Knowledge Graph. Their main difference is that only the ABox approach allows for identifying every individual part and property mentioned in the description in a knowledge graph. This technical difference has substantial practical consequences that significantly affect the overall usability of empirical data. The consequences affect findability, accessibility, and explorability of empirical data as well as their comparability, expandability, universal usability and reusability, and overall machine-actionability. Moreover, TBox semantic graphs often require querying under entailment regimes, which is computationally more complex. Conclusions: We conclude that, from a conceptual point of view, the advantages of the instance-based ABox semantic graph approach outweigh its shortcomings and outweigh the advantages of the class-based TBox semantic graph approach. Therefore, we recommend the instance-based ABox approach as a FAIR approach for documenting and communicating empirical data and metadata in a knowledge graph. eng
dc.language.iso eng
dc.publisher London : BioMed Central
dc.relation.ispartofseries Journal of biomedical semantics 12 (2021)
dc.rights CC BY 4.0 Unported
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject Phenotype data eng
dc.subject Phenotype knowledge graph eng
dc.subject Semantic phenotype eng
dc.subject Ontology eng
dc.subject Knowledge management eng
dc.subject Semantic graph eng
dc.subject Data representation eng
dc.subject FAIR data eng
dc.subject ABox expression eng
dc.subject TBox expression eng
dc.subject.ddc 570 | Life sciences, biology eng
dc.subject.ddc 610 | Medicine, health eng
dc.title FAIR data representation in times of eScience: a comparison of instance-based and class-based semantic representations of empirical data using phenotype descriptions as example
dc.type Article
dc.type Text
dc.relation.essn 2041-1480
dc.relation.doi https://doi.org/10.1186/s13326-021-00254-0
dc.bibliographicCitation.volume 12
dc.bibliographicCitation.firstPage 20
dc.description.version publishedVersion
tib.accessRights freely accessible


This publication appears in the following collection(s):

  • Central Facilities
    Freely accessible publications from the Central Facilities of Leibniz Universität Hannover
