A Multimodal Approach for Semantic Patent Image Retrieval

Download statistics - Document (COUNTER):

Pustu-Iren, K.; Bruns, G.; Ewerth, R.: A Multimodal Approach for Semantic Patent Image Retrieval. In: Krestel, Ralf; Aras, Hidir; Andersson, Linda; Piroi, Florina; Hanbury, Allan; Alderucci, Dean (Eds.): PatentSemTech 2021: Patent Text Mining and Semantic Technologies 2021 : proceedings of the 2nd Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech) 2021, co-located with the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021). Aachen, Germany : RWTH Aachen, 2021 (CEUR Workshop Proceedings ; 2909), pp. 45-49.

Repository version

To cite the version in the repository, please use this identifier: https://doi.org/10.15488/16876

Sum total of downloads: 26

Patent images such as technical drawings contain valuable information and are frequently used by experts to compare patents. However, current approaches to patent information retrieval focus largely on textual information. We therefore review previous work on patent retrieval with a focus on illustrations in figures. In this paper, we report on work in progress on a novel approach to patent image retrieval that uses deep multimodal features. Scene text spotting and optical character recognition are employed to extract numerals from an image, which are then used to identify references to the corresponding sentences in the patent document. Furthermore, we use a state-of-the-art neural CLIP model to extract structural features from illustrations, and we derive textual features from the related patent text using a sentence transformer model. To fuse our multimodal features for similarity search, we apply re-ranking according to averaged or maximum scores. In our experiments, we compare the impact of the different modalities on the task of similarity search for patent images. The experimental results suggest that patent image retrieval can be performed successfully using the proposed feature sets, and that the best results are achieved when the features of both modalities are combined.
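
The fusion step named in the abstract (re-ranking by averaged or maximum scores) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_and_rerank`, the candidate identifiers, the score values, and the handling of candidates that are missing in one modality are all assumptions made for illustration. The sketch assumes that each modality has already produced one similarity score per candidate image.

```python
from typing import Dict, List, Tuple


def fuse_and_rerank(
    image_scores: Dict[str, float],
    text_scores: Dict[str, float],
    mode: str = "avg",
) -> List[Tuple[str, float]]:
    """Fuse per-candidate similarity scores from two modalities and re-rank.

    Hypothetical helper: `mode` is "avg" (mean of the scores) or "max".
    """
    if mode not in ("avg", "max"):
        raise ValueError("mode must be 'avg' or 'max'")

    fused = {}
    for cand in set(image_scores) | set(text_scores):
        # Assumption: a candidate missing in one modality is scored
        # by the remaining modality alone.
        available = [s[cand] for s in (image_scores, text_scores) if cand in s]
        if mode == "max":
            fused[cand] = max(available)
        else:
            fused[cand] = sum(available) / len(available)

    # Highest fused score first; ties are broken by candidate id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up similarity scores of three candidate figures to one query figure.
image_scores = {"fig_a": 0.875, "fig_b": 0.5, "fig_c": 0.25}
text_scores = {"fig_a": 0.125, "fig_b": 0.75, "fig_c": 0.25}

ranking_avg = fuse_and_rerank(image_scores, text_scores, mode="avg")
ranking_max = fuse_and_rerank(image_scores, text_scores, mode="max")
```

With these made-up scores the two modes order the candidates differently: averaging favours `fig_b`, which scores reasonably in both modalities, while the maximum favours `fig_a`, which scores highly in only one.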
License of this version: CC BY 4.0 Unported
Document Type: BookPart
Publishing status: publishedVersion
Issue Date: 2021
Appears in Collections: Zentrale Einrichtungen

Downloads by country:

Pos.  Country        Downloads (total)  Downloads (perc.)
1     Germany        12                 46.15%
2     Switzerland    7                  26.92%
3     United States  4                  15.38%
4     Indonesia      1                  3.85%
5     China          1                  3.85%
6     Austria        1                  3.85%

Download statistics are collected in accordance with the internationally recognized rules and standards of the "COUNTER Code of Practice for e-Resources". COUNTER is an international non-profit organization in which library associations, database providers, and publishers work together on standards for collecting, storing, and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the full texts themselves are counted, not visits to the website as such.

