How to sort out uncategorisable documents for interpretive social science? On limits of currently employed text mining techniques

Download statistics - Document (COUNTER):

Philipps, Axel: How to sort out uncategorisable documents for interpretive social science? On limits of currently employed text mining techniques. In: Proceedings of the 2nd International Conference on Advanced Research Methods and Analytics (CARMA) 2018, pp. 19-27. DOI:

Repository version

To cite the version in the repository, please use this identifier:


Sum total of downloads: 95

Current text mining applications work statistically on the basis of linguistic models and theories and certain parameter settings. This enables researchers to classify, group and rank a large textual corpus, a useful feature for scholars who study all forms of written text. However, these underlying conditions differ from the way interpretively oriented social scientists approach textual data. They aim to understand the meaning of text by heuristically using known categorisations, concepts and other formal methods. More importantly, they are primarily interested in documents that are incomprehensible with our current knowledge, because these documents offer a chance to formulate new empirically grounded typifications, hypotheses, and theories. In this paper, therefore, I propose a text mining technique with different aims and procedures. It involves a shift away from methods of grouping and clustering the whole text corpus towards a process that sorts out uncategorisable documents. Such an approach is demonstrated using a simple example. While more elaborate text mining techniques might become tools for more complex tasks, the given example merely presents the essence of a possible working principle. As such, it supports social inquiries that search for and examine unfamiliar patterns and regularities.
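The working principle named in the abstract, sorting out documents that fit no known category instead of grouping the whole corpus, might be sketched roughly as follows. This is a minimal illustration and not the procedure from the paper: the keyword dictionaries in `KNOWN_CATEGORIES`, the `min_hits` threshold, the function names and the sample documents are all assumptions made up for this example.

```python
import re

# Hypothetical category dictionaries (illustrative only, not from the paper).
KNOWN_CATEGORIES = {
    "economy": {"market", "price", "trade"},
    "politics": {"election", "parliament", "vote"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def sort_out_uncategorisable(documents, categories=KNOWN_CATEGORIES, min_hits=1):
    """Return the documents that match none of the known categories.

    A document counts as categorisable if it shares at least `min_hits`
    terms with any single category dictionary (an assumed, simple rule).
    """
    leftovers = []
    for doc in documents:
        tokens = tokenize(doc)
        matched = any(
            len(tokens & terms) >= min_hits for terms in categories.values()
        )
        if not matched:
            leftovers.append(doc)
    return leftovers


# Made-up sample corpus for demonstration.
docs = [
    "The market reacted to the new trade deal.",
    "Parliament will vote next week.",
    "Graffiti appeared overnight on the old bridge.",
]
leftovers = sort_out_uncategorisable(docs)
```

In this sketch the leftovers, not the matched documents, are the output of interest: they are the candidates an interpretive researcher would then read closely.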
License of this version: CC BY-NC-ND 4.0 Unported
Document Type: Article
Publishing status: publishedVersion
Issue Date: 2018
Appears in Collections: Philosophische Fakultät

Downloads by country:

Pos.  Country          Downloads (total)  Downloads (perc.)
1     Germany          56                 58.95%
2     United States    18                 18.95%
3     Netherlands      8                  8.42%
4     China            6                  6.32%
5     Taiwan           1                  1.05%
6     Malaysia         1                  1.05%
7     Israel           1                  1.05%
8     Ireland          1                  1.05%
9     Indonesia        1                  1.05%
10    United Kingdom   1                  1.05%
      Other countries  1                  1.05%


Download statistics are collected in accordance with the "COUNTER Code of Practice for e-Resources", applying internationally recognised rules and standards. COUNTER is an international non-profit organisation in which library associations, database providers and publishers work together on standards for collecting, storing and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the corresponding full texts are counted, not visits to the website itself.

