Characterization and classification of semantic image-text relations

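dc.identifier.uri
dc.contributor.author Otto, Christian
dc.contributor.author Springstein, Matthias
dc.contributor.author Anand, Avishek
dc.contributor.author Ewerth, Ralph
dc.date.accessioned 2021-03-30T11:22:30Z
dc.date.available 2021-03-30T11:22:30Z
dc.date.issued 2020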
dc.identifier.citation Otto, C.; Springstein, M.; Anand, A.; Ewerth, R.: Characterization and classification of semantic image-text relations. In: International Journal of Multimedia Information Retrieval 9 (2020), pp. 31-45. DOI:
dc.description.abstract Visual and textual information are widely known to complement each other in conveying information, for example, in entertainment, news, advertisements, science, or education. While the complex interplay of image and text to form semantic meaning has been thoroughly studied in linguistics and communication sciences for several decades, computer vision and multimedia research have largely remained on the surface of the problem. An exception is previous work that introduced the two metrics Cross-Modal Mutual Information and Semantic Correlation in order to model complex image-text relations. In this paper, we motivate the necessity of an additional metric called Status in order to cover complex image-text relations more completely. This set of metrics enables us to derive a novel categorization of eight semantic image-text classes based on three dimensions. In addition, we demonstrate how to automatically gather and augment a dataset for these classes from the Web. Further, we present a deep learning system to automatically predict each of the three metrics, as well as a system to directly predict the eight image-text classes. Experimental results show the feasibility of the approach, whereby the predict-all approach outperforms the cascaded approach of the metric classifiers. © 2020, The Author(s). eng
dc.language.iso eng
dc.publisher London : Springer
dc.relation.ispartofseries International Journal of Multimedia Information Retrieval 9 (2020)
dc.rights CC BY 4.0 Unported
dc.subject data augmentation eng
dc.subject image-text class eng
dc.subject multimodality eng
dc.subject semantic gap eng
dc.subject.ddc 004 | Computer science eng
dc.subject.ddc 020 | Library and information sciences eng
dc.title Characterization and classification of semantic image-text relations
dc.type Article
dc.type Text
dc.relation.essn 2192-662X
dc.relation.issn 2192-6611
dc.bibliographicCitation.volume 9
dc.bibliographicCitation.firstPage 31
dc.bibliographicCitation.lastPage 45
dc.description.version publishedVersion
tib.accessRights freely accessible
