Multimodal news analytics using measures of cross-modal entity and context consistency


dc.identifier.uri http://dx.doi.org/10.15488/12349
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/12448
dc.contributor.author Müller-Budack, Eric
dc.contributor.author Theiner, Jonas
dc.contributor.author Diering, Sebastian
dc.contributor.author Idahl, Maximilian
dc.contributor.author Hakimov, Sherzod
dc.contributor.author Ewerth, Ralph
dc.date.accessioned 2022-06-27T04:37:00Z
dc.date.available 2022-06-27T04:37:00Z
dc.date.issued 2021
dc.identifier.citation Müller-Budack, E.; Theiner, J.; Diering, S.; Idahl, M.; Hakimov, S. et al.: Multimodal news analytics using measures of cross-modal entity and context consistency. In: International Journal of Multimedia Information Retrieval 10 (2021), Nr. 2, S. 111-125. DOI: https://doi.org/10.1007/s13735-021-00207-4
dc.description.abstract The World Wide Web has become a popular source to gather information and news. Multimodal information, e.g., text supplemented with photographs, is typically used to convey the news more effectively or to attract attention. The photographs can be decorative or depict additional details, but they might also contain misleading information. The quantification of the cross-modal consistency of entity representations can assist human assessors in evaluating the overall multimodal message. In some cases, such measures might give hints to detect fake news, which is an increasingly important topic in today’s society. In this paper, we present a multimodal approach to quantify the entity coherence between image and text in real-world news. Named entity linking is applied to extract persons, locations, and events from news texts. Several measures are suggested to calculate the cross-modal similarity of the entities in text and photograph by exploiting state-of-the-art computer vision approaches. In contrast to previous work, our system automatically acquires example data from the Web and is applicable to real-world news. Moreover, an approach that quantifies contextual image-text relations is introduced. The feasibility is demonstrated on two datasets that cover different languages, topics, and domains. © 2021, The Author(s). eng
dc.language.iso eng
dc.publisher London : Springer
dc.relation.ispartofseries International Journal of Multimedia Information Retrieval 10 (2021), Nr. 2
dc.rights CC BY 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject Cross-modal consistency eng
dc.subject Image repurposing detection eng
dc.subject Image-text relations eng
dc.subject News analytics eng
dc.subject.ddc 660 | Technische Chemie ger
dc.subject.ddc 070 | Nachrichtenmedien, Journalismus, Verlagswesen ger
dc.subject.ddc 020 | Bibliotheks- und Informationswissenschaft ger
dc.subject.ddc 004 | Informatik ger
dc.title Multimodal news analytics using measures of cross-modal entity and context consistency
dc.type Article
dc.type Text
dc.relation.essn 2192-662X
dc.relation.doi https://doi.org/10.1007/s13735-021-00207-4
dc.bibliographicCitation.issue 2
dc.bibliographicCitation.volume 10
dc.bibliographicCitation.firstPage 111
dc.bibliographicCitation.lastPage 125
dc.description.version publishedVersion
tib.accessRights frei zugänglich
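The abstract above outlines how cross-modal entity consistency could be quantified: persons, locations, and events are extracted from the text via named entity linking, and their visual counterparts in the photograph are compared using computer-vision features. Below is a minimal sketch of such a person-consistency measure, assuming hypothetical helpers that return L2-normalised face embeddings as NumPy arrays; it illustrates the general idea and is not the authors' implementation.

# Sketch of a cross-modal person-consistency measure (illustration only,
# not the authors' code). Assumes face embeddings for persons named in the
# text (gathered from Web example images) and for faces detected in the
# news photograph are already available as NumPy vectors.
from typing import Dict, List
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity of two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def person_consistency(
    reference_embeddings: Dict[str, List[np.ndarray]],  # entity -> example face embeddings from the Web
    photo_embeddings: List[np.ndarray],                  # face embeddings detected in the news photo
) -> float:
    """Average, over the persons named in the text, of the best match between
    any reference face of that person and any face found in the photograph."""
    if not reference_embeddings or not photo_embeddings:
        return 0.0
    per_entity = []
    for refs in reference_embeddings.values():
        if not refs:
            continue  # no example images could be retrieved for this entity
        best = max(cosine(r, p) for r in refs for p in photo_embeddings)
        per_entity.append(best)
    return float(np.mean(per_entity)) if per_entity else 0.0

Analogous measures for locations or events could replace the face embeddings with geolocation or scene/event embeddings, and a contextual image-text score could be combined with these entity scores to support a human assessor's judgement.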

