The Sweet Spot between Inverted Indices and Metric-Space Indexing for Top-K–List Similarity Search

Milchevski, Evica; Anand, Avishek; Michel, Sebastian


Original publication

Milchevski, E.; Anand, A.; Michel, S.: The Sweet Spot between Inverted Indices and Metric-Space Indexing for Top-K–List Similarity Search. In: Advances in Database Technology - EDBT 2015 Proceedings, pp. 253-264

Version in the repository

To cite the version in the repository, please use this DOI: https://doi.org/10.15488/5478

Name: paper-65.pdf

Size: 2.697Mb

Format: PDF


Abstract:
We consider the problem of processing similarity queries over a set of top-k rankings where the query ranking and the similarity threshold are provided at query time. Spearman’s Footrule distance is used to compute the similarity between rankings, considering how well rankings agree on the positions (ranks) of ranked items (i.e., the L1 distance). This setup allows the application of metric index structures such as M- or BK-trees and, alternatively, enables the use of traditional inverted indices for retrieving rankings that overlap (in items) with the query. Although both techniques are reasonable, they come with individual drawbacks for our specific problem. In this paper, we propose a hybrid indexing strategy, which blends inverted indices and metric space indexing, resulting in a structure that resembles both indexing methods with tunable emphasis on one or the other. To find the sweet spot, we propose an assumption-lean but highly accurate (empirically validated) cost model through theoretical analysis. We further present optimizations to the inverted index component, for early termination and minimizing bookkeeping. The performance of the proposed algorithms, hybrid variants, and competitors is studied in a comprehensive evaluation using real-world benchmark data consisting of Web-search–result rankings and entity rankings based on Wikipedia.
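To make the query setup in the abstract concrete, here is a minimal Python sketch of Spearman's Footrule distance over top-k lists, together with the plain inverted-index baseline that retrieves rankings overlapping with the query and then validates them against the threshold. It is an illustration under stated assumptions, not the paper's hybrid structure or its optimizations. The names `footrule`, `build_inverted_index`, and `similarity_query` are hypothetical, and the handling of missing items (0-based ranks, with an absent item given the artificial rank k) is an assumption rather than something the abstract specifies.

```python
from collections import defaultdict


def footrule(a, b):
    """Spearman's Footrule (L1) distance between two top-k lists.

    Assumption: ranks are 0-based, and an item that is missing from one
    list is given the artificial rank k in that list.
    """
    k = len(a)
    if len(b) != k:
        raise ValueError("both rankings must have the same length k")
    rank_a = {item: r for r, item in enumerate(a)}
    rank_b = {item: r for r, item in enumerate(b)}
    items = rank_a.keys() | rank_b.keys()
    return sum(abs(rank_a.get(x, k) - rank_b.get(x, k)) for x in items)


def build_inverted_index(rankings):
    """Map each item to the ids of the rankings that contain it."""
    index = defaultdict(set)
    for rid, ranking in enumerate(rankings):
        for item in ranking:
            index[item].add(rid)
    return index


def similarity_query(query, theta, rankings, index):
    """Return ids of rankings whose Footrule distance to `query` is <= theta.

    Filter step: collect rankings sharing at least one item with the query.
    Under the assumption above, two disjoint top-k lists are exactly
    k * (k + 1) apart, so this filter loses no result while theta is below
    that value. For larger thresholds every ranking qualifies.
    """
    k = len(query)
    if theta >= k * (k + 1):
        candidates = set(range(len(rankings)))
    else:
        candidates = set()
        for item in query:
            candidates |= index.get(item, set())
    # Validation step: compute the exact distance for each candidate.
    return sorted(rid for rid in candidates
                  if footrule(query, rankings[rid]) <= theta)


rankings = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["c", "b", "a"],
    ["x", "y", "z"],
]
index = build_inverted_index(rankings)
result = similarity_query(["a", "b", "c"], 2, rankings, index)
```

In this toy collection, `result` holds the ids of the first two rankings. The third ranking overlaps fully with the query but lies at distance 4, so the validation step rejects it; the fourth shares no items and is never considered as a candidate.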
License:	CC BY-NC-ND 4.0 Unported - https://creativecommons.org/licenses/by-nc-nd/4.0/
Publication type:	BookPart
Publication status:	publishedVersion
First published:	2015
Keywords (English):	top-k ranking, Spearman’s Footrule, similarity
Subject classification (DDC):	004 | Computer science
Controlled keywords:	Conference proceedings


This publication appears in the following collection(s):

Faculty of Electrical Engineering and Computer Science
Open-access publications from the Faculty of Electrical Engineering and Computer Science
