How to sort out uncategorisable documents for interpretive social science? On limits of currently employed text mining techniques

Philipps, Axel


Original version

Philipps, Axel: How to sort out uncategorisable documents for interpretive social science? On limits of currently employed text mining techniques. In: Proceedings of the 2nd International Conference on Advanced Research Methods and Analytics (CARMA) 2018, pp. 19-27. DOI: https://doi.org/10.4995/carma2018.2018.8301

Repository version

To cite the version in the repository, please use this identifier: https://doi.org/10.15488/5182

Files in this item

Name: 8301-23254-1-PB.pdf

Size: 873.8Kb

Format: PDF


Abstract:
Current text mining applications operate statistically on the basis of linguistic models and theories and particular parameter settings. This enables researchers to classify, group and rank a large textual corpus, a useful feature for scholars who study all forms of written text. However, these underlying conditions differ from the way interpretively oriented social scientists approach textual data. They aim to understand the meaning of text by heuristically using known categorisations, concepts and other formal methods. More importantly, they are primarily interested in documents that are incomprehensible with our current knowledge, because these documents offer a chance to formulate new empirically grounded typifications, hypotheses, and theories. In this paper, therefore, I propose a text mining technique with different aims and procedures. It involves a shift away from methods of grouping and clustering the whole text corpus towards a process that sorts out uncategorisable documents. Such an approach is demonstrated using a simple example. While more elaborate text mining techniques might become tools for more complex tasks, the given example presents only the essence of a possible working principle. As such, it supports social inquiries that search for and examine unfamiliar patterns and regularities.
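The working principle named in the abstract, setting aside whatever the known categories cannot account for instead of clustering the whole corpus, can be illustrated with a minimal sketch. This is not the example from the paper. The category names, indicator terms, `min_hits` threshold and sample documents below are all hypothetical, chosen only to show the shape of such a filter.

```python
import re

# Hypothetical known categories, each described by a few indicator terms.
# These are illustrative assumptions, not taken from the paper.
KNOWN_CATEGORIES = {
    "economy": {"market", "price", "trade"},
    "health": {"doctor", "hospital", "patient"},
}


def tokenize(text):
    """Lower-case the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def sort_out_uncategorisable(documents, categories=KNOWN_CATEGORIES, min_hits=1):
    """Return the documents that match none of the known categories.

    A document counts as matching a category when it shares at least
    `min_hits` indicator terms with it (an assumed, deliberately crude rule).
    """
    leftover = []
    for doc in documents:
        tokens = tokenize(doc)
        matched = any(
            len(tokens & terms) >= min_hits for terms in categories.values()
        )
        if not matched:
            leftover.append(doc)
    return leftover


docs = [
    "The market price of wheat rose.",
    "The patient saw a doctor.",
    "Nobody could explain the humming at dusk.",
]

# Only the documents that fit no known category are kept for close reading.
uncategorised = sort_out_uncategorisable(docs)
```

Under these assumptions the filter returns the residue of the corpus rather than a grouping of it, which is the part an interpretive researcher would then read closely.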
License of this version:	CC BY-NC-ND 4.0 Unported - https://creativecommons.org/licenses/by-nc-nd/4.0/
Publication type:	Article
Publishing status:	publishedVersion
Publication date:	2018
Keywords (English):	Qualitative research, Data science, Computer science, Text mining, Big data, sort
DDC:	300 | Social sciences, sociology, anthropology
Controlled keywords (GND):	Konferenzschrift

This item appears in the following Collection(s):

Philosophische Fakultät
Frei zugängliche Publikationen aus der Philosophischen Fakultät
