Resource management for model learning at entity level


dc.identifier.uri http://dx.doi.org/10.15488/12632
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/12732
dc.contributor.author Beyer, Christian
dc.contributor.author Unnikrishnan, Vishnu
dc.contributor.author Brüggemann, Robert
dc.contributor.author Toulouse, Vincent
dc.contributor.author Omar, Hafez Kader
dc.contributor.author Ntoutsi, Eirini
dc.contributor.author Spiliopoulou, Myra
dc.date.accessioned 2022-08-04T08:31:56Z
dc.date.available 2022-08-04T08:31:56Z
dc.date.issued 2020
dc.identifier.citation Beyer, C.; Unnikrishnan, V.; Brüggemann, R.; Toulouse, V.; Omar, H.K. et al.: Resource management for model learning at entity level. In: Annales des Telecommunications/Annals of Telecommunications 75 (2020), Nr. 9-10, S. 549-561. DOI: https://doi.org/10.1007/s12243-020-00800-4
dc.description.abstract Many current and future applications aim to provide entity-specific predictions, ranging from individualized healthcare applications to user-specific purchase recommendations. In our previous stream-based work on Amazon review data, we showed that error-weighted ensembles combining entity-centric classifiers, which are trained only on reviews of one particular product (entity), with entity-ignorant classifiers, which are trained on all reviews irrespective of the product, can improve prediction quality. This came at the cost of storing multiple entity-centric models in primary memory, many of which would never be used again because their entities would receive no future instances in the stream. To overcome this drawback and make entity-centric learning viable in such scenarios, we investigated two methods of reducing the primary memory requirement of our entity-centric approach. Our first method uses the lossy counting algorithm for data streams to identify entities whose instances make up at least a certain percentage of the total data stream, within an error margin. All models that do not fulfil this requirement are stored in secondary memory, from which they can be retrieved should future instances belonging to them arrive later in the stream. The second method replaces entity-centric models with a much more naive model that only stores the past labels and predicts the majority label seen so far. We applied our methods to the previously used Amazon data sets, which contain up to 1.4M reviews, and added two subsets of the Yelp data set, which contain up to 4.2M reviews. Both methods succeeded in reducing the primary memory requirements while still outperforming an entity-ignorant model. © 2020, The Author(s). eng
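
The abstract describes two memory-reduction mechanisms. As a rough, hypothetical illustration (not the authors' implementation), the Python sketch below shows how the lossy counting algorithm of Manku and Motwani could decide which entity-centric models stay in primary memory, together with the naive majority-label fallback model; all class, method, and parameter names (LossyCounter, MajorityLabelModel, support, epsilon) are illustrative assumptions.

import math
from collections import Counter

class LossyCounter:
    # Lossy counting (Manku & Motwani, VLDB 2002): approximates, within an
    # error margin `epsilon`, which entities make up at least a `support`
    # fraction of the stream seen so far. Defaults are illustrative.
    def __init__(self, support=0.01, epsilon=0.001):
        self.support = support
        self.epsilon = epsilon
        self.width = math.ceil(1.0 / epsilon)  # bucket width w = ceil(1/eps)
        self.n = 0                             # stream length so far
        self.table = {}                        # entity -> (count, delta)

    def add(self, entity):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # current bucket id
        count, delta = self.table.get(entity, (0, bucket - 1))
        self.table[entity] = (count + 1, delta)
        if self.n % self.width == 0:  # bucket boundary: prune rare entries
            self.table = {e: (c, d) for e, (c, d) in self.table.items()
                          if c + d > bucket}

    def is_frequent(self, entity):
        # Returns True for every entity whose true frequency is >= support;
        # may also return True for frequencies down to support - epsilon.
        count, _ = self.table.get(entity, (0, 0))
        return count >= (self.support - self.epsilon) * self.n

class MajorityLabelModel:
    # Naive per-entity "model": stores the past labels and predicts the
    # majority label seen so far, as described in the abstract.
    def __init__(self):
        self.labels = Counter()

    def update(self, label):
        self.labels[label] += 1

    def predict(self, default=None):
        return self.labels.most_common(1)[0][0] if self.labels else default

# Hypothetical usage: keep a full entity-centric model in primary memory
# only while counter.is_frequent(entity) holds; otherwise offload it to
# secondary memory or fall back to a MajorityLabelModel for that entity.

In such a setup, models for entities flagged as infrequent would be serialized to secondary storage and reloaded only if their entity reappears in the stream, which matches the retrieval behavior the abstract outlines.
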
dc.language.iso eng
dc.publisher Berlin : Springer
dc.relation.ispartofseries Annales des Telecommunications/Annals of Telecommunications 75 (2020), Nr. 9-10
dc.rights CC BY 4.0 International
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject Data streams eng
dc.subject Error margins eng
dc.subject Future applications eng
dc.subject Health care application eng
dc.subject Model learning eng
dc.subject Prediction quality eng
dc.subject Primary memory eng
dc.subject Resource management eng
dc.subject Secondary memory eng
dc.subject Learning systems eng
dc.subject Document prediction eng
dc.subject Entity-centric learning eng
dc.subject Memory reduction eng
dc.subject Stream classification eng
dc.subject Text-ignorant models eng
dc.subject.ddc 620 | Ingenieurwissenschaften und Maschinenbau ger
dc.title Resource management for model learning at entity level
dc.type Article
dc.type Text
dc.relation.essn 1958-9395
dc.relation.issn 0003-4347
dc.relation.doi https://doi.org/10.1007/s12243-020-00800-4
dc.bibliographicCitation.issue 9-10
dc.bibliographicCitation.volume 75
dc.bibliographicCitation.firstPage 549
dc.bibliographicCitation.lastPage 561
dc.description.version publishedVersion
tib.accessRights frei zugänglich

