AdaCC: cumulative cost-sensitive boosting for imbalanced classification

dc.identifier.uri http://dx.doi.org/10.15488/13683
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/13793
dc.contributor.author Iosifidis, Vasileios
dc.contributor.author Papadopoulos, Symeon
dc.contributor.author Rosenhahn, Bodo
dc.contributor.author Ntoutsi, Eirini
dc.date.accessioned 2023-05-12T06:32:48Z
dc.date.available 2023-05-12T06:32:48Z
dc.date.issued 2022
dc.identifier.citation Iosifidis, V.; Papadopoulos, S.; Rosenhahn, B.; Ntoutsi, E.: AdaCC: cumulative cost-sensitive boosting for imbalanced classification. In: Knowledge and Information Systems 65 (2023), No. 2, pp. 789-826. DOI: https://doi.org/10.1007/s10115-022-01780-8
dc.description.abstract Class imbalance poses a major challenge for machine learning, as most supervised learning models tend to exhibit bias towards the majority class and under-perform on the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, typically via a user-defined, fixed misclassification cost matrix provided as input to the learner. Tuning these costs is a challenging task that requires domain knowledge; moreover, wrong adjustments may deteriorate overall predictive performance. In this work, we propose a novel cost-sensitive boosting approach for imbalanced data that dynamically adjusts the misclassification costs over the boosting rounds in response to the model's performance, instead of using a fixed misclassification cost matrix. Our method, called AdaCC, is parameter-free, as it relies on the cumulative behavior of the boosting model to adjust the misclassification costs for the next boosting round, and it comes with theoretical guarantees regarding the training error. Experiments on 27 real-world datasets from different domains with high class imbalance demonstrate the superiority of our method over 12 state-of-the-art cost-sensitive boosting approaches, with consistent improvements across measures, for instance, in the range of [0.3–28.56%] for AUC, [3.4–21.4%] for balanced accuracy, [4.8–45%] for gmean and [7.4–85.5%] for recall. eng
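
To make the idea in the abstract concrete, the following is a minimal Python sketch of a cost-sensitive AdaBoost-style loop in which the minority-class misclassification cost is re-estimated every round from the cumulative error of the ensemble built so far, rather than being fixed in advance. The specific cost schedule (1 + cumulative minority error), the decision stumps, and the synthetic data are illustrative assumptions only; the actual AdaCC cost and weight updates, and their training-error guarantees, are those defined in the cited paper.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Illustrative only: a plain boosting loop whose minority-class cost is
# re-estimated each round from the cumulative ensemble error, mimicking the
# *idea* described in the abstract. It is not the authors' AdaCC update rule.

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
y_pm = np.where(y == 1, 1, -1)                     # labels in {-1, +1}, +1 = minority

n_rounds = 50
w = np.full(len(y), 1.0 / len(y))                  # boosting sample weights
learners, alphas = [], []
F = np.zeros(len(y))                               # cumulative ensemble margin

for t in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y_pm, sample_weight=w)
    pred = stump.predict(X)

    err = np.clip(np.sum(w[pred != y_pm]), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)
    learners.append(stump); alphas.append(alpha)
    F += alpha * pred                               # ensemble built so far

    # Cumulative, parameter-free cost: how badly does the ensemble-so-far
    # treat the minority class? (a stand-in for AdaCC's cost schedule)
    minority = y_pm == 1
    minority_err = np.mean(np.sign(F[minority]) != 1)
    cost = 1.0 + minority_err                       # grows when the minority suffers

    # Cost-sensitive weight update: misclassified minority samples are
    # up-weighted by the current cumulative cost, all others as in AdaBoost.
    scale = np.where(minority, cost, 1.0)
    w *= np.exp(-alpha * y_pm * pred * np.where(pred != y_pm, scale, 1.0))
    w /= w.sum()

print("training accuracy on minority class:",
      np.mean(np.sign(F[y_pm == 1]) == 1))

Run on the imbalanced synthetic data, the sketch illustrates how a cost that tracks the ensemble's cumulative mistakes keeps up-weighting the minority samples it still gets wrong; AdaCC formalizes this intuition with a principled, parameter-free update and the training-error bounds reported in the paper.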
dc.language.iso eng
dc.publisher London : Springer
dc.relation.ispartofseries Knowledge and Information Systems 65 (2023), No. 2
dc.rights CC BY 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by/4.0
dc.subject Boosting eng
dc.subject Class imbalance eng
dc.subject Cost-sensitive learning eng
dc.subject Cumulative costs eng
dc.subject Dynamic costs eng
dc.subject.ddc 004 | Informatik (computer science) ger
dc.subject.ddc 070 | Nachrichtenmedien, Journalismus, Verlagswesen (news media, journalism, publishing) ger
dc.title AdaCC: cumulative cost-sensitive boosting for imbalanced classification eng
dc.type Article
dc.type Text
dc.relation.essn 0219-3116
dc.relation.issn 0219-1377
dc.relation.doi https://doi.org/10.1007/s10115-022-01780-8
dc.bibliographicCitation.issue 2
dc.bibliographicCitation.volume 65
dc.bibliographicCitation.date 2023
dc.bibliographicCitation.firstPage 789
dc.bibliographicCitation.lastPage 826
dc.description.version publishedVersion
tib.accessRights frei zugänglich (freely accessible)

