The Sweet Spot between Inverted Indices and Metric-Space Indexing for Top-K–List Similarity Search

Zur Kurzanzeige

dc.identifier.uri http://dx.doi.org/10.15488/5478
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/5525
dc.contributor.author Milchevski, Evica ger
dc.contributor.author Anand, Avishek ger
dc.contributor.author Michel, Sebastian ger
dc.date.accessioned 2019-09-19T08:59:53Z
dc.date.available 2019-09-19T08:59:53Z
dc.date.issued 2015
dc.identifier.citation Milchevski, E.; Anand, A.; Michel, S.: The Sweet Spot between Inverted Indices and Metric-Space Indexing for Top-K–List Similarity Search. In: Advances in Database Technology - EDBT 2015 Proceedings, S. 253-264 ger
dc.description.abstract We consider the problem of processing similarity queries over a set of top-k rankings where the query ranking and the similarity threshold are provided at query time. Spearman’s Footrule distance is used to compute the similarity between rankings, considering how well rankings agree on the positions (ranks) of ranked items (i.e., the L1 distance). This setup allows the application of metric index structures such as M- or BK-trees and, alternatively, enables the use of traditional inverted indices for retrieving rankings that overlap (in items) with the query. Although both techniques are reasonable, they come with individual drawbacks for our specific problem. In this paper, we propose a hybrid indexing strategy, which blends inverted indices and metric space indexing, resulting in a structure that resembles both indexing methods with tunable emphasis on one or the other. To find the sweet spot, we propose an assumption-lean but highly accurate (empirically validated) cost model through theoretical analysis. We further present optimizations to the inverted index component, for early termination and minimizing bookkeeping. The performance of the proposed algorithms, hybrid variants, and competitors is studied in a comprehensive evaluation using real-world benchmark data consisting of Web-search–result rankings and entity rankings based on Wikipedia. ger
dc.language.iso eng ger
dc.publisher Konstanz : OpenProceedings
dc.relation.ispartof Advances in Database Technology - EDBT 2015 Proceedings ger
dc.rights CC BY-NC-ND 4.0 Unported
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/ ger
dc.subject top-k ranking eng
dc.subject Spearman’s Footrule eng
dc.subject similarity eng
dc.subject.classification Konferenzschrift ger
dc.subject.ddc 004 | Informatik ger
dc.title The Sweet Spot between Inverted Indices and Metric-Space Indexing for Top-K–List Similarity Search eng
dc.type BookPart ger
dc.type Text ger
dc.relation.isbn 978-3-89318-067-7
dc.description.version publishedVersion ger
tib.accessRights frei zug�nglich ger


Die Publikation erscheint in Sammlung(en):

Zur Kurzanzeige

 

Suche im Repositorium


Durchblättern

Mein Nutzer/innenkonto

Nutzungsstatistiken