Methods for improving entity linking and exploiting social media messages across crises

dc.identifier.uri http://dx.doi.org/10.15488/13888
dc.identifier.uri https://www.repo.uni-hannover.de/handle/123456789/14002
dc.contributor.author Stoffalette Joao, Renato eng
dc.date.accessioned 2023-06-22T07:30:14Z
dc.date.available 2023-06-22T07:30:14Z
dc.date.issued 2023
dc.identifier.citation Stoffalette Joao, Renato: Methods for improving entity linking and exploiting social media messages across crises. Hannover : Gottfried Wilhelm Leibniz Universität, Diss., 2023, xviii, 117 S., DOI: https://doi.org/10.15488/13888 eng
dc.description.abstract Entity Linking (EL) is the task of automatically identifying entity mentions in texts and resolving them to a corresponding entity in a reference knowledge base (KB). A large number of tools are available for different types of documents and domains; however, the entity linking literature has shown that the quality of a tool varies across corpora and depends on specific characteristics of the corpus it is applied to. Moreover, the lack of precision on particularly ambiguous mentions often spoils the usefulness of automated disambiguation results in real-world applications. In the first part of this thesis I explore an approximation of the difficulty of linking entity mentions and frame it as a supervised classification task. Classifying difficult-to-disambiguate entity mentions can facilitate identifying critical cases as part of a semi-automated system, while detecting latent corpus characteristics that affect entity linking performance. Moreover, despite the large number of entity linking tools that have been proposed over the past years, some tools work better on short mentions while others perform better when more contextual information is available. To this end, I propose a solution that exploits the results of distinct entity linking tools on the same corpus, leveraging their individual strengths on a per-mention basis. The proposed solution proved effective and outperformed the individual entity linking systems employed in a series of experiments. An important component of the majority of entity linking tools is the probability that a mention links to a given entity in a reference knowledge base, and this probability is usually computed over a static snapshot of the reference KB. However, an entity's popularity is temporally sensitive and may change due to short-term events. Moreover, these changes may then be reflected in a KB, so EL tools can produce different results for a given mention at different times. I investigate how the prior probability changes over time and how overall disambiguation performance varies when using KBs from different time periods.
The second part of this thesis is mainly concerned with short texts. Social media has become an integral part of modern society. Twitter, for instance, is one of the most popular social media platforms in the world and enables people to share their opinions and post short messages about any subject on a daily basis. First, I present an approach to identifying informative messages during catastrophic events using deep learning techniques. By automatically detecting informative messages posted by users during major events, this approach can enable professionals involved in crisis management to better estimate damage using only the relevant information posted on social media channels, and to act immediately. I have also performed an analysis of Twitter messages posted during the Covid-19 pandemic. I collected 4 million tweets posted in Portuguese since the beginning of the pandemic and analyzed the debate around the pandemic. I used topic modeling, sentiment analysis and hashtag recommendation techniques to provide insights into the online discussion of the Covid-19 pandemic. eng
dc.language.iso eng eng
dc.publisher Hannover : Institutionelles Repositorium der Gottfried Wilhelm Leibniz Universität Hannover
dc.rights German copyright law applies. The document may be used free of charge for personal use, but may not be made available on the internet or passed on to third parties. eng
dc.subject Entity Linking eng
dc.subject Ensemble Learning eng
dc.subject Knowledge Base eng
dc.subject Deep Learning eng
dc.subject Entity Linking ger
dc.subject Ensemble Learning ger
dc.subject Wissensbasis ger
dc.subject Deep Learning ger
dc.subject.ddc 000 | Computer science, information science, general works eng
dc.title Methods for improving entity linking and exploiting social media messages across crises eng
dc.type DoctoralThesis eng
dc.type Text eng
dcterms.extent xviii, 117 S. eng
dc.description.version publishedVersion eng
tib.accessRights freely accessible eng

