A Multimodal Approach for Semantic Patent Image Retrieval

dc.identifier.uri http://dx.doi.org/10.15488/16876
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/17003
dc.contributor.author Pustu-Iren, Kader
dc.contributor.author Bruns, Gerrit
dc.contributor.author Ewerth, Ralph
dc.contributor.editor Krestel, Ralf
dc.contributor.editor Aras, Hidir
dc.contributor.editor Andersson, Linda
dc.contributor.editor Piroi, Florina
dc.contributor.editor Hanbury, Allan
dc.contributor.editor Alderucci, Dean
dc.date.accessioned 2024-04-04T08:54:05Z
dc.date.available 2024-04-04T08:54:05Z
dc.date.issued 2021
dc.identifier.citation Pustu-Iren, K.; Bruns, G.; Ewerth, R.: A Multimodal Approach for Semantic Patent Image Retrieval. In: Krestel, Ralf; Aras, Hidir; Andersson, Linda; Piroi, Florina; Hanbury, Allan; Alderucci, Dean (Eds.): PatentSemTech 2021: Patent Text Mining and Semantic Technologies 2021 : proceedings of the 2nd Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech) 2021, co-located with the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021). Aachen, Germany : RWTH Aachen, 2021 (CEUR Workshop Proceedings ; 2909), S. 45-49.
dc.description.abstract Patent images such as technical drawings contain valuable information and are frequently used by experts to compare patents. However, current approaches to patent information retrieval largely focus on textual information. Consequently, we review previous work on patent retrieval with a focus on illustrations in figures. In this paper, we report on work in progress on a novel approach for patent image retrieval that uses deep multimodal features. Scene text spotting and optical character recognition are employed to extract reference numerals from an image and subsequently identify the corresponding sentences in the patent document. Furthermore, we use the state-of-the-art neural model CLIP to extract structural features from illustrations and additionally derive textual features from the related patent text using a sentence transformer model. To fuse our multimodal features for similarity search, we apply re-ranking according to averaged or maximum scores. In our experiments, we compare the impact of the different modalities on the task of similarity search for patent images. The experimental results suggest that patent image retrieval can be successfully performed using the proposed feature sets, with the best results achieved when combining the features of both modalities. eng
dc.language.iso eng
dc.publisher Aachen, Germany : RWTH Aachen
dc.relation.ispartof PatentSemTech 2021: Patent Text Mining and Semantic Technologies 2021 : proceedings of the 2nd Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech) 2021, co-located with the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021)
dc.relation.ispartofseries CEUR Workshop Proceedings ; 2909
dc.relation.uri https://ceur-ws.org/Vol-2909/paper6.pdf
dc.rights CC BY 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject Patent Image Similarity Search eng
dc.subject Deep Learning eng
dc.subject Multimodal Feature Representations eng
dc.subject Scene Text Spotting eng
dc.subject.classification Konferenzschrift ger
dc.subject.ddc 004 | Informatik
dc.subject.ddc 020 | Bibliotheks- und Informationswissenschaft
dc.title A Multimodal Approach for Semantic Patent Image Retrieval eng
dc.type BookPart
dc.type Text
dc.relation.essn 1613-0073
dc.bibliographicCitation.volume 2909
dc.bibliographicCitation.firstPage 45
dc.bibliographicCitation.lastPage 49
dc.description.version publishedVersion
tib.accessRights frei zugänglich
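
Note: The abstract above describes extracting CLIP-based visual features and sentence-transformer text features and fusing them for similarity search by re-ranking with averaged or maximum scores. The following minimal Python sketch illustrates only that late-fusion idea; it is not the authors' implementation. The scene text spotting / OCR step that links reference numerals to patent sentences is omitted, and the model names, file names, and toy data are illustrative assumptions.

import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# Visual/structural features from patent drawings (CLIP) and textual features
# from the description sentences the drawings refer to (sentence transformer).
img_model = SentenceTransformer("clip-ViT-B-32")
txt_model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical corpus: one drawing per patent plus its referenced description text.
corpus_images = [Image.open(p) for p in ["patent_a_fig1.png", "patent_b_fig3.png"]]
corpus_texts = ["A valve 12 is connected to a pump 14 via a conduit 16 ...",
                "The housing 3 encloses a rotor 5 mounted on a shaft 7 ..."]

query_image = Image.open("query_fig.png")            # hypothetical query drawing
query_text = "Hydraulic pump with a control valve"   # text linked to the query figure

# Cosine similarity of the query against the corpus, computed per modality.
sim_img = util.cos_sim(img_model.encode(query_image), img_model.encode(corpus_images))[0]
sim_txt = util.cos_sim(txt_model.encode(query_text), txt_model.encode(corpus_texts))[0]

# Late fusion by re-ranking: average the two modality scores, or take their maximum.
scores_avg = (sim_img + sim_txt) / 2
scores_max = np.maximum(sim_img.numpy(), sim_txt.numpy())

print("Ranking by averaged scores:", scores_avg.argsort(descending=True).tolist())
print("Ranking by maximum scores:", np.argsort(-scores_max).tolist())

Averaging treats the visual and textual evidence as equally reliable, while taking the maximum lets the stronger modality decide for each candidate; these correspond to the two re-ranking variants mentioned in the abstract.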

