Abstract: | |
Events have always been fundamental building blocks of individual lives as well as of the world as a whole. Thanks to the technological advances of the digital age, capturing, describing and spreading events has never been so simple and intuitive. This results in a ubiquitous presence of event-related information, digitally embedded in every form of media. Both the pervasiveness of such information and the benefits of exploiting it have fostered decades of research on detecting and summarizing it. However, several issues emerge at subsequent stages and must be addressed to support the proper exploitation and consumption of event-related information. The work presented in this thesis is committed to this goal. Because events are ubiquitous, they exhibit different characteristics and appear in a diverse range of scenarios. We therefore categorize events according to three main aspects that come into play when considering the management and usage of event-related information over time, once it has been created. These are the degree of privacy, as events can be of public domain or pertain to a more personal sphere; the type of description, which is the form (e.g. textual or visual) in which events are described; and the time of usage, namely the temporal horizon over which event-related information is expected to be accessed and used. The problems addressed in this thesis concern different combinations of these aspects, each subject to specific issues. Concerning the private sphere, we aim at properly managing large amounts of photographs taken during personal events, so that they can be easily revisited and enjoyed in the future.
The common habit of dumping every single picture into storage, encouraged by the availability of cheap storage devices, seriously hampers future revisiting and calls for more selective strategies that identify the most important pictures in a collection, making the future reminiscence of the related events more enjoyable and less tedious. In fact, going through an entire stored photo collection can be so cumbersome that it discourages revisiting altogether. We present a selection method that learns to identify the photos that the collection owner would like to keep for future reminiscence, outperforming approaches based on clustering and on the concept of coverage. Then, moving towards more public settings, we consider the problem of validating the real-world occurrence of public events based on the information contained in textual document collections. When events are detected from large amounts of natural language text by automatic procedures, which may introduce false positive detections, retaining true events while discarding false ones becomes fundamental for properly exploiting the detected event-related information in any subsequent task. We therefore validate events by checking whether they are reported within a set of documents that serves as ground truth, reaching substantial agreement with human evaluators. Moreover, when performing event validation as a post-processing step of event detection, we observed an increase in precision within the set of detected events. Finally, we make a temporal jump and consider a scenario where descriptive information about public events (e.g. news articles) is read after a few decades.
Since the original context of an event, needed for its proper comprehension, might have been forgotten or never known at all after such a long time, we aim at retrieving contextualizing information to support the understanding of old events in the presence of wide temporal and contextual gaps. We investigate methods to formulate queries from event descriptions as seeds for retrieving topically and temporally relevant information from a context source, particularly aiming at high recall. Targeting recall as the query performance criterion makes the set of retrieved results a favorable starting point for pursuing additional objectives at subsequent stages.
|
|
License of this version: | CC BY 3.0 DE - http://creativecommons.org/licenses/by/3.0/de/ |
Publication type: | DoctoralThesis |
Publishing status: | publishedVersion |
Publication date: | 2018 |
Keywords german: | Persönliche Fotoauswahl, Ereignisvalidierung, Recall-basierte Anfragenformulierung |
Keywords english: | personal photo selection, event validation, recall-based query formulation |
DDC: | 004 | Informatik (computer science) |