Using Natural Language Processing to Enable In-depth Analysis of Clinical Messages Posted to an Internet Mailing List: A Feasibility Study

Bekhuis, Tanja; Kreinacke, Marcos; Spallek, Heiko; Song, Mei; O'Donnell, Jean A.

Using Natural Language Processing to Enable In-depth Analysis of Clinical Messages Posted to an Internet Mailing List: A Feasibility Study

Services

Deutsch English

Über das Repositorium Suchen und Entdecken Publizieren

Downloadstatistik des Dokuments (Auswertung nach COUNTER):

Bekhuis, T. et al.: Using Natural Language Processing to Enable In-depth Analysis of Clinical Messages Posted to an Internet Mailing List: A Feasibility Study. In: Journal of Medical Internet Research 13 (2011), Nr. 4, e98. DOI: https://doi.org/10.2196/jmir.1799

Version im Repositorium

Zum Zitieren der Version im Repositorium verwenden Sie bitte diesen DOI: https://doi.org/10.15488/4867

Zeitraum, für den die Download-Zahlen angezeigt werden:

Summe der Downloads: 157

Verteilung der Downloads über den gewählten Zeitraum
Herkunft der Downloads nach Ländern

zurück zum Einzeltitelnachweis (Ansicht Nutzungsstatistik schließen)

NameUsing Natural ...

Größe2,39 MB

FormatAdobe PDF

Öffnen

Zusammenfassung:
Background: An Internet mailing list may be characterized as a virtual community of practice that serves as an information hub with easy access to expert advice and opportunities for social networking. We are interested in mining messages posted to a list for dental practitioners to identify clinical topics. Once we understand the topical domain, we can study dentists' real information needs and the nature of their shared expertise, and can avoid delivering useless content at the point of care in future informatics applications. However, a necessary first step involves developing procedures to identify messages that are worth studying given our resources for planned, labor-intensive research. Objectives: The primary objective of this study was to develop a workflow for finding a manageable number of clinically relevant messages from a much larger corpus of messages posted to an Internet mailing list, and to demonstrate the potential usefulness of our procedures for investigators by retrieving a set of messages tailored to the research question of a qualitative research team. Methods: We mined 14,576 messages posted to an Internet mailing list from April 2008 to May 2009. The list has about 450 subscribers, mostly dentists from North America interested in clinical practice. After extensive preprocessing, we used the Natural Language Toolkit to identify clinical phrases and keywords in the messages. Two academic dentists classified collocated phrases in an iterative, consensus-based process to describe the topics discussed by dental practitioners who subscribe to the list. We then consulted with qualitative researchers regarding their research question to develop a plan for targeted retrieval. We used selected phrases and keywords as search strings to identify clinically relevant messages and delivered the messages in a reusable database. Results: About half of the subscribers (245/450, 54.4%) posted messages. Natural language processing (NLP) yielded 279,193 clinically relevant tokens or processed words (19% of all tokens). Of these, 2.02% (5634 unique tokens) represent the vocabulary for dental practitioners. Based on pointwise mutual information score and clinical relevance, 325 collocated phrases (eg, fistula filled obturation and herpes zoster) with 108 keywords (eg, mercury) were classified into 13 broad categories with subcategories. In the demonstration, we identified 305 relevant messages (2.1% of all messages) over 10 selected categories with instances of collocated phrases, and 299 messages (2.1%) with instances of phrases or keywords for the category systemic disease. Conclusions: A workflow with a sequence of machine-based steps and human classification of NLP-discovered phrases can support researchers who need to identify relevant messages in a much larger corpus. Discovered phrases and keywords are useful search strings to aid targeted retrieval. We demonstrate the potential value of our procedures for qualitative researchers by retrieving a manageable set of messages concerning systemic and oral disease.
Lizenzbestimmungen:	CC BY 2.0 Unported
Publikationstyp:	Article
Publikationsstatus:	publishedVersion
Erstveröffentlichung:	2011
Die Publikation erscheint in Sammlung(en):	Wirtschaftswissenschaftliche Fakultät

nach oben

Verteilung der Downloads über den gewählten Zeitraum:

nach oben

Herkunft der Downloads nach Ländern:

Pos.	Land		Downloads
Pos.	Land		Anzahl	Proz.
1		Germany	47	29,94%
2		United States	43	27,39%
3		United Kingdom	9	5,73%
4		China	8	5,10%
5		No geo information available	7	4,46%
6		Vietnam	6	3,82%
7		France	6	3,82%
8		Australia	5	3,18%
9		Indonesia	4	2,55%
10		Canada	2	1,27%
		andere	20	12,74%

nach oben

Weitere Download-Zahlen und Ranglisten:

Hinweis

Zur Erhebung der Downloadstatistiken kommen entsprechend dem „COUNTER Code of Practice for e-Resources“ international anerkannte Regeln und Normen zur Anwendung. COUNTER ist eine internationale Non-Profit-Organisation, in der Bibliotheksverbände, Datenbankanbieter und Verlage gemeinsam an Standards zur Erhebung, Speicherung und Verarbeitung von Nutzungsdaten elektronischer Ressourcen arbeiten, welche so Objektivität und Vergleichbarkeit gewährleisten sollen. Es werden hierbei ausschließlich Zugriffe auf die entsprechenden Volltexte ausgewertet, keine Aufrufe der Website an sich.

Suche im Repositorium

Durchblättern

Gesamter Bestand
Diese Sammlung

Using Natural Language Processing to Enable In-depth Analysis of Clinical Messages Posted to an Internet Mailing List: A Feasibility Study

Downloadstatistik des Dokuments (Auswertung nach COUNTER):

Version im Repositorium

Zeitraum, für den die Download-Zahlen angezeigt werden:

Summe der Downloads: 157

Verteilung der Downloads über den gewählten Zeitraum:

Herkunft der Downloads nach Ländern:

Weitere Download-Zahlen und Ranglisten:

Suche im Repositorium

Durchblättern

Gesamter Bestand

Diese Sammlung