Semi-supervised learning and fairness-aware learning under class imbalance

Download statistics - Document (COUNTER):

Iosifidis, Vasileios: Semi-supervised learning and fairness-aware learning under class imbalance. Hannover : Gottfried Wilhelm Leibniz Universität, Diss., 2020, xviii, 130 S. DOI:

Sum total of downloads: 271

With the advent of Web 2.0 and rapid technological advances, there is a plethora of data in every field; however, more data does not necessarily imply more information. Rather, the quality of the data (the veracity aspect) plays a key role. Data quality is a major issue, since machine learning algorithms rely solely on historical data to derive novel hypotheses. Data may contain noise, outliers, missing values and/or class labels, and skewed data distributions. The latter case, the so-called class-imbalance problem, is quite old and still dramatically affects machine learning algorithms. Class imbalance causes classification models to learn one particular class (the majority) effectively while ignoring the other classes (the minority). As an extension of this issue, machine learning models applied in domains of high societal impact have become biased against groups of people or individuals who are not well represented in the data. Direct and indirect discriminatory behavior is prohibited by international laws; thus, there is an urgent need to mitigate discriminatory outcomes of machine learning algorithms.

In this thesis, we address the aforementioned issues and propose methods that tackle class imbalance and mitigate discriminatory outcomes in machine learning algorithms. As part of this thesis, we make the following contributions:

• Tackling class imbalance in semi-supervised learning – The class-imbalance problem is very often encountered in classification. A variety of methods tackle this problem; however, there is a lack of methods that deal with class imbalance in semi-supervised learning. We address this problem by employing data augmentation in the semi-supervised learning process in order to equalize class distributions. We show that semi-supervised learning coupled with data augmentation methods can overcome class-imbalance propagation and significantly outperform the standard semi-supervised annotation process.

• Mitigating unfairness in supervised models – Fairness in supervised learning has received a lot of attention in recent years. A growing body of pre-, in- and post-processing approaches has been proposed to mitigate algorithmic bias; however, these methods use the overall error rate as the performance measure of the machine learning algorithm, which leads to high error rates on the under-represented class. To deal with this problem, we propose approaches that operate at the pre-, in- and post-processing layers while accounting for all classes. Our proposed methods outperform state-of-the-art methods in terms of performance while being able to mitigate unfair outcomes.
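The two effects described in the abstract can be illustrated with a small, self-contained sketch. It is not the method proposed in the thesis: the toy data, the function names `error_rates` and `oversample_to_balance`, and the use of plain random oversampling as a stand-in for data augmentation are all assumptions made for illustration. The first function shows how a low overall error rate can hide a very high error rate on the minority class; the second equalizes class frequencies by duplicating minority samples.

```python
import random
from collections import Counter


def error_rates(y_true, y_pred):
    """Return (overall error rate, per-class error rates, balanced error rate)."""
    overall = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class[c] = sum(y_pred[i] != c for i in idx) / len(idx)
    # Balanced error: unweighted mean of the per-class error rates.
    balanced = sum(per_class.values()) / len(per_class)
    return overall, per_class, balanced


def oversample_to_balance(samples, labels, seed=0):
    """Randomly duplicate samples of smaller classes until all classes are equally frequent.

    Hypothetical helper: random oversampling is used here only as a simple
    stand-in for the data augmentation mentioned in the abstract.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for c, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == c]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(c)
    return out_x, out_y


# Toy data (assumed): 95 majority samples (class 0), 5 minority samples (class 1).
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a classifier that always predicts the majority class

overall, per_class, balanced = error_rates(y_true, y_pred)
print(overall, per_class, balanced)

# Equalize the class distribution of the toy data.
x = list(range(100))
x_bal, y_bal = oversample_to_balance(x, y_true)
print(Counter(y_bal))
```

On this toy data the always-majority classifier has an overall error rate of 5% while misclassifying every minority sample, which is why a measure that accounts for all classes, such as the balanced error, gives a more informative picture.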
License of this version: CC BY 3.0 DE
Document Type: doctoralThesis
Publishing status: publishedVersion
Issue Date: 2020
Appears in Collections: Fakultät für Elektrotechnik und Informatik

Downloads by country:

Pos.  Country                      Downloads (total)  Downloads (perc.)
1     Germany                      102                37.64%
2     United States                44                 16.24%
3     China                        33                 12.18%
4     Taiwan                       9                  3.32%
5     India                        8                  2.95%
6     Korea, Republic of           7                  2.58%
7     Iran, Islamic Republic of    6                  2.21%
8     Italy                        5                  1.85%
9     United Kingdom               5                  1.85%
10    Australia                    5                  1.85%
      Other countries              47                 17.34%

Download statistics are collected according to internationally recognized rules and standards, in line with the "COUNTER Code of Practice for e-Resources". COUNTER is an international non-profit organization in which library associations, database providers, and publishers work together on standards for collecting, storing, and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the corresponding full texts are counted, not visits to the website itself.
