Resource management for model learning at entity level


dc.identifier.uri http://dx.doi.org/10.15488/12632
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/12732
dc.contributor.author Beyer, Christian
dc.contributor.author Unnikrishnan, Vishnu
dc.contributor.author Brüggemann, Robert
dc.contributor.author Toulouse, Vincent
dc.contributor.author Omar, Hafez Kader
dc.contributor.author Ntoutsi, Eirini
dc.contributor.author Spiliopoulou, Myra
dc.date.accessioned 2022-08-04T08:31:56Z
dc.date.available 2022-08-04T08:31:56Z
dc.date.issued 2020
dc.identifier.citation Beyer, C.; Unnikrishnan, V.; Brüggemann, R.; Toulouse, V.; Omar, H.K. et al.: Resource management for model learning at entity level. In: Annales des Telecommunications/Annals of Telecommunications 75 (2020), Nr. 9-10, S. 549-561. DOI: https://doi.org/10.1007/s12243-020-00800-4
dc.description.abstract Many current and future applications aim to provide entity-specific predictions, ranging from individualized healthcare applications to user-specific purchase recommendations. In our previous stream-based work on Amazon review data, we showed that error-weighted ensembles combining entity-centric classifiers, which are trained only on reviews of one particular product (entity), with entity-ignorant classifiers, which are trained on all reviews irrespective of the product, can improve prediction quality. This came at the cost of storing multiple entity-centric models in primary memory, many of which would never be used again because their entities would receive no future instances in the stream. To overcome this drawback and make entity-centric learning viable in such scenarios, we investigated two methods of reducing the primary memory requirement of our entity-centric approach. Our first method uses the lossy counting algorithm for data streams to identify entities whose instances make up at least a certain percentage of the total data stream, within an error margin. All models that do not fulfil this requirement are stored in secondary memory, from which they can be retrieved should future instances belonging to them arrive later in the stream. The second method replaces entity-centric models with a much more naive model that only stores the past labels and predicts the majority label seen so far. We applied our methods to the previously used Amazon data sets, which contain up to 1.4M reviews, and added two subsets of the Yelp data set, which contain up to 4.2M reviews. Both methods succeeded in reducing the primary memory requirements while still outperforming an entity-ignorant model. © 2020, The Author(s). eng
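
The abstract describes two memory-reduction mechanisms. As a rough, hypothetical illustration (not the authors' implementation), the Python sketch below shows how the lossy counting algorithm of Manku and Motwani could decide which entity-centric models stay in primary memory, together with the naive majority-label fallback model; all class, method, and parameter names (LossyCounter, MajorityLabelModel, support, epsilon) are illustrative assumptions.

import math
from collections import Counter

class LossyCounter:
    # Lossy counting (Manku & Motwani, VLDB 2002): approximates, within an
    # error margin `epsilon`, which entities make up at least a `support`
    # fraction of the stream seen so far. Defaults are illustrative.
    def __init__(self, support=0.01, epsilon=0.001):
        self.support = support
        self.epsilon = epsilon
        self.width = math.ceil(1.0 / epsilon)  # bucket width w = ceil(1/eps)
        self.n = 0                             # stream length so far
        self.table = {}                        # entity -> (count, delta)

    def add(self, entity):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # current bucket id
        count, delta = self.table.get(entity, (0, bucket - 1))
        self.table[entity] = (count + 1, delta)
        if self.n % self.width == 0:  # bucket boundary: prune rare entries
            self.table = {e: (c, d) for e, (c, d) in self.table.items()
                          if c + d > bucket}

    def is_frequent(self, entity):
        # Returns True for every entity whose true frequency is >= support;
        # may also return True for frequencies down to support - epsilon.
        count, _ = self.table.get(entity, (0, 0))
        return count >= (self.support - self.epsilon) * self.n

class MajorityLabelModel:
    # Naive per-entity "model": stores the past labels and predicts the
    # majority label seen so far, as described in the abstract.
    def __init__(self):
        self.labels = Counter()

    def update(self, label):
        self.labels[label] += 1

    def predict(self, default=None):
        return self.labels.most_common(1)[0][0] if self.labels else default

# Hypothetical usage: keep a full entity-centric model in primary memory
# only while counter.is_frequent(entity) holds; otherwise offload it to
# secondary memory or fall back to a MajorityLabelModel for that entity.

In such a setup, models for entities flagged as infrequent would be serialized to secondary storage and reloaded only if their entity reappears in the stream, which matches the retrieval behavior the abstract outlines.
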
dc.language.iso eng
dc.publisher Berlin : Springer
dc.relation.ispartofseries Annales des Telecommunications/Annals of Telecommunications 75 (2020), Nr. 9-10
dc.rights CC BY 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject Data streams eng
dc.subject Error margins eng
dc.subject Future applications eng
dc.subject Health care application eng
dc.subject Model learning eng
dc.subject Prediction quality eng
dc.subject Primary memory eng
dc.subject Resource management eng
dc.subject Secondary memory eng
dc.subject Learning systems eng
dc.subject Document prediction eng
dc.subject Entity-centric learning eng
dc.subject Memory reduction eng
dc.subject Stream classification eng
dc.subject Text-ignorant models eng
dc.subject.ddc 620 | Ingenieurwissenschaften und Maschinenbau ger
dc.title Resource management for model learning at entity level
dc.type Article
dc.type Text
dc.relation.essn 1958-9395
dc.relation.issn 0003-4347
dc.relation.doi https://doi.org/10.1007/s12243-020-00800-4
dc.bibliographicCitation.issue 9-10
dc.bibliographicCitation.volume 75
dc.bibliographicCitation.firstPage 549
dc.bibliographicCitation.lastPage 561
dc.description.version publishedVersion
tib.accessRights frei zugänglich

