Efficient and Explainable Neural Ranking

Leonhardt, Lutz Jurek

Download statistics for this document (evaluated according to COUNTER):

Leonhardt, Lutz Jurek: Efficient and Explainable Neural Ranking. Hannover: Gottfried Wilhelm Leibniz Universität, doctoral thesis, 2023, xii, 165 pp., DOI: https://doi.org/10.15488/15769

Total downloads: 349

Name: phdthesis_leonhar ...

Size: 1.86 MB

Format: Adobe PDF

Abstract:
The recent availability of increasingly powerful hardware has caused a shift from traditional information retrieval (IR) approaches based on term matching, which remained the state of the art for several decades, to large pre-trained neural language models. These neural rankers achieve substantial improvements in performance, as their complexity and extensive pre-training give them the ability to understand natural language to some extent. As a result, neural rankers go beyond term matching by performing relevance estimation based on the semantics of queries and documents.

However, these improvements in performance do not come without sacrifice. In this thesis, we focus on two fundamental challenges of neural ranking models, specifically those based on large language models. On the one hand, due to their complexity, the models are inefficient; they require considerable amounts of computational power, which often comes in the form of specialized hardware, such as GPUs or TPUs. Consequently, the carbon footprint is an increasingly important aspect of systems using neural IR. This effect is amplified when low latency is required, for example in web search. On the other hand, neural models are known for being inherently unexplainable; in other words, it is often not comprehensible to humans why a neural model produced a specific output. In general, explainability is deemed important in order to identify undesired behavior, such as bias.

We tackle the efficiency challenge of neural rankers by proposing Fast-Forward indexes, which are simple vector forward indexes that heavily utilize pre-computation techniques. Our approach substantially reduces the computational load during query processing, enabling efficient ranking solely on CPUs without requiring hardware acceleration. Furthermore, we introduce BERT-DMN to show that the training efficiency of neural rankers can be improved by training only parts of the model.

In order to improve the explainability of neural ranking, we propose the Select-and-Rank paradigm to make ranking models explainable by design: First, a query-dependent subset of the input document is extracted to serve as an explanation; second, the ranking model makes its decision based only on the extracted subset, rather than the complete document. We show that our models exhibit performance similar to models that are not explainable by design and conduct a user study to determine the faithfulness of the explanations.

Finally, we introduce BoilerNet, a web content extraction technique that allows the removal of boilerplate from web pages, leaving only the main content in plain text. Our method requires no feature engineering and can be used to aid in the process of creating new document corpora from the web.
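
The two steps of the Select-and-Rank paradigm described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not say how the subset is selected or scored, so the sketch assumes sentence-level selection and uses a toy term-overlap score as a stand-in for both neural components. All function names (tokenize, overlap_score, select, rank) and the example documents are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score (assumption): fraction of query terms found in the text."""
    q = tokenize(query)
    return len(q & tokenize(text)) / len(q) if q else 0.0


def select(query: str, document: str, k: int = 2) -> list[str]:
    """Step 1: pick the k sentences that best match the query (the explanation)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    ranked = sorted(sentences, key=lambda s: overlap_score(query, s), reverse=True)
    return ranked[:k]


def rank(query: str, documents: dict[str, str], k: int = 2):
    """Step 2: score each document using only its selected sentences."""
    results = []
    for doc_id, text in documents.items():
        explanation = select(query, text, k)
        score = overlap_score(query, " ".join(explanation))
        results.append((doc_id, score, explanation))
    # Highest score first; each result carries the sentences that produced it.
    return sorted(results, key=lambda r: r[1], reverse=True)


# Made-up example documents, for illustration only.
docs = {
    "d1": "Neural rankers use language models. They need GPUs. Cats are nice.",
    "d2": "Term matching was the state of the art. It is cheap to run.",
}
for doc_id, score, explanation in rank("neural language models", docs, k=1):
    print(doc_id, round(score, 2), explanation)
```

Because the score in step 2 only ever sees the selected sentences, the returned explanation is by construction the full input to the ranking decision, which is the property the paradigm relies on.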
License:	CC BY 3.0 DE
Publication type:	DoctoralThesis
Publication status:	publishedVersion
First published:	2023
Appears in collection(s):	Fakultät für Elektrotechnik und Informatik Dissertationen

Downloads by country of origin:

Rank	Country	Downloads	Share
1	Netherlands	146	41.83%
2	Germany	93	26.65%
3	United States	37	10.60%
4	Canada	9	2.58%
5	Korea, Republic of	8	2.29%
6	United Kingdom	8	2.29%
7	Russian Federation	7	2.01%
8	Taiwan	4	1.15%
9	India	4	1.15%
10	France	4	1.15%
	Other	29	8.31%
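
The percentage column can be reproduced from the download counts. The short sketch below recomputes each share as the count divided by the total of 349 and rounds to two decimals. The counts are taken from the table; the variable names are illustrative only.

```python
# Download counts per country, copied from the table above.
counts = {
    "Netherlands": 146,
    "Germany": 93,
    "United States": 37,
    "Canada": 9,
    "Korea, Republic of": 8,
    "United Kingdom": 8,
    "Russian Federation": 7,
    "Taiwan": 4,
    "India": 4,
    "France": 4,
    "Other": 29,
}

total = sum(counts.values())  # should match the reported total of 349

# Share of each country in percent, rounded to two decimals as in the table.
shares = {country: round(100 * n / total, 2) for country, n in counts.items()}

for country, share in shares.items():
    print(f"{country}\t{counts[country]}\t{share:.2f}%")
```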

Note

Download statistics are collected in accordance with the "COUNTER Code of Practice for e-Resources", an internationally recognized set of rules and standards. COUNTER is an international non-profit organization in which library associations, database providers, and publishers jointly develop standards for collecting, storing, and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the full texts themselves are counted, not visits to the website.
