Efficiently identifying top k similar entities

Hanasoge Sudheendra, Supreetha

dc.identifier.uri	http://dx.doi.org/10.15488/10466
dc.identifier.uri	https://www.repo.uni-hannover.de/handle/123456789/10542
dc.contributor.advisor	Vidal, Maria-Esther
dc.contributor.author	Hanasoge Sudheendra, Supreetha	eng
dc.date.accessioned	2021-03-01T12:57:04Z
dc.date.available	2021-03-01T12:57:04Z
dc.date.issued	2020-12-28
dc.identifier.citation	Hanasoge Sudheendra, Supreetha: Efficiently identifying top k similar entities. Hannover : Gottfried Wilhelm Leibniz Universität, Master's thesis, 2020, 81 pp. DOI: https://doi.org/10.15488/10466	eng
dc.description.abstract	With the rapid growth of genomic studies, a growing body of research successfully integrates tools and technologies from interdisciplinary sciences. Computational biology, or bioinformatics, is one such field that applies computational tools to capture and transcribe biological data. In genomic studies specifically, the detection and analysis of co-occurring mutations is a leading area of study. Concurrently, in recent years, computer science and information technology have seen increased interest in association analysis and co-occurrence computation. The traditional method of finding the top similar entities examines every possible pair of entities, which leads to prohibitive quadratic time complexity. Most existing approaches also require a similarity measure and a threshold to be chosen beforehand in order to retrieve the top similar entities, and these parameters are not always easy to tune. Heuristically, an adaptive method can have wider applications for identifying the most similar pairs of mutations (or of entities in general). In this thesis, we present an algorithm that efficiently identifies the top k similar pairs of mutations using co-occurrence as the similarity measure. Our approach uses an upper-bound condition to iteratively prune the search space and thereby tackles the quadratic complexity. The empirical evaluation shows that the proposed approach is efficient in terms of execution time and accurate, particularly on large datasets. In addition, we evaluate the impact of parameters such as the input size and k on the execution time of top k approaches. This study concludes that systematic pruning of the search space using an adaptive threshold condition optimizes the process of identifying the top similar pairs of entities.	eng
dc.language.iso	eng	eng
dc.publisher	Hannover : Gottfried Wilhelm Leibniz Universität
dc.rights	German copyright law applies. The document may be used free of charge for personal use, but may not be made available on the Internet or passed on to third parties.	eng
dc.subject	Bioinformatics	eng
dc.subject	Genomic studies	eng
dc.subject	Similarity	eng
dc.subject	Co-occurrence computation	eng
dc.subject	Time complexity	eng
dc.subject	Algorithm	eng
dc.subject.classification	Algorithm	eng
dc.subject.classification	Bioinformatics	eng
dc.subject.classification	Similarity	eng
dc.subject.ddc	004 \| Computer science	eng
dc.title	Efficiently identifying top k similar entities	eng
dc.type	MasterThesis	eng
dc.type	Text	eng
dcterms.extent	81 pp.
dc.description.version	publishedVersion	eng
tib.accessRights	freely accessible	eng
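
The abstract describes pruning the quadratic pair search with an upper-bound condition and an adaptive threshold. The following is a minimal sketch of that general idea, not the algorithm from the thesis itself, whose details are not given in this record. It assumes that each input record is a collection of entity names, that similarity is the number of records in which two entities co-occur, and that the bound used is co-occurrence(a, b) <= min(freq(a), freq(b)). The function name `top_k_cooccurring_pairs` and the sample data are hypothetical.

```python
import heapq


def top_k_cooccurring_pairs(records, k):
    """Return up to k (count, (a, b)) tuples with the highest co-occurrence.

    Illustrative sketch only: the bound and data layout are assumptions,
    not the method described in the thesis.
    """
    if k <= 0:
        return []

    # Inverted index: entity -> set of ids of the records that contain it.
    index = {}
    for rid, entities in enumerate(records):
        for e in set(entities):
            index.setdefault(e, set()).add(rid)

    # Most frequent entities first; ties broken by name for determinism.
    order = sorted(index, key=lambda e: (-len(index[e]), e))

    # Min-heap of the best pairs so far; heap[0][0] is the adaptive threshold.
    heap = []
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            # b is never more frequent than a, so freq(b) bounds the pair.
            upper_bound = len(index[b])
            if len(heap) == k and upper_bound <= heap[0][0]:
                # Every later b is at most as frequent: prune the rest.
                break
            count = len(index[a] & index[b])
            if len(heap) < k:
                heapq.heappush(heap, (count, (a, b)))
            elif count > heap[0][0]:
                heapq.heapreplace(heap, (count, (a, b)))

    # Highest count first.
    return sorted(heap, key=lambda t: (-t[0], t[1]))


# Hypothetical sample: each record lists the mutations observed together.
sample = [["A", "B", "C"], ["A", "B"], ["A", "B", "D"], ["C", "D"], ["A", "C"]]
print(top_k_cooccurring_pairs(sample, 2))
```

No similarity threshold is supplied by the caller: the threshold is the smallest count currently held in the heap, so it tightens as better pairs are found. Pairs tied with that threshold are not guaranteed to be reported in any particular order of preference.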

Name: Efficiently_ident ...

Size: 3.094 MB

Format: PDF

This publication appears in the following collection(s):

Fakultät für Elektrotechnik und Informatik
Freely accessible publications from the Fakultät für Elektrotechnik und Informatik