Characterization and classification of semantic image-text relations

dc.identifier.uri http://dx.doi.org/10.15488/10705
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/10783
dc.contributor.author Otto, Christian
dc.contributor.author Springstein, Matthias
dc.contributor.author Anand, Avishek
dc.contributor.author Ewerth, Ralph
dc.date.accessioned 2021-03-30T11:22:30Z
dc.date.available 2021-03-30T11:22:30Z
dc.date.issued 2020
dc.identifier.citation Otto, C.; Springstein, M.; Anand, A.; Ewerth, R.: Characterization and classification of semantic image-text relations. In: International Journal of Multimedia Information Retrieval 9 (2020), pp. 31-45. DOI: https://doi.org/10.1007/s13735-019-00187-6
dc.description.abstract The beneficial, complementary nature of visual and textual information to convey information is widely known, for example, in entertainment, news, advertisements, science, or education. While the complex interplay of image and text to form semantic meaning has been thoroughly studied in linguistics and communication sciences for several decades, computer vision and multimedia research have largely remained on the surface of the problem. An exception is previous work that introduced the two metrics Cross-Modal Mutual Information and Semantic Correlation in order to model complex image-text relations. In this paper, we motivate the necessity of an additional metric called Status in order to cover complex image-text relations more completely. This set of metrics enables us to derive a novel categorization of eight semantic image-text classes based on three dimensions. In addition, we demonstrate how to automatically gather and augment a dataset for these classes from the Web. Further, we present a deep learning system to automatically predict any of the three metrics, as well as a system to directly predict the eight image-text classes. Experimental results show the feasibility of the approach, whereby the predict-all approach outperforms the cascaded approach of the metric classifiers. © 2020, The Author(s). eng
dc.language.iso eng
dc.publisher London : Springer
dc.relation.ispartofseries International Journal of Multimedia Information Retrieval 9 (2020)
dc.rights CC BY 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject data augmentation eng
dc.subject image-text class eng
dc.subject multimodality eng
dc.subject semantic gap eng
dc.subject.ddc 004 | Computer science eng
dc.subject.ddc 020 | Library and information sciences eng
dc.title Characterization and classification of semantic image-text relations
dc.type Article
dc.type Text
dc.relation.essn 2192-662X
dc.relation.issn 2192-6611
dc.relation.doi https://doi.org/10.1007/s13735-019-00187-6
dc.bibliographicCitation.volume 9
dc.bibliographicCitation.firstPage 31
dc.bibliographicCitation.lastPage 45
dc.description.version publishedVersion
tib.accessRights freely accessible