
Do you want to gain a deeper understanding of how big tech analyzes and exploits our text data, or investigate how political parties differ by analyzing textual styles in documents? This book explores how to apply state-of-the-art text analytics methods to detect and visualize phenomena in text data. Solidly based on methods from corpus linguistics, natural language processing, text analytics and digital humanities, this book shows readers how to conduct experiments with their own corpora and research questions, underpin their theories, quantify differences and pinpoint characteristics. Case studies and experiments are detailed in every chapter using real-world and open access corpora from politics, World English, history, and literature. The results are interpreted and put into perspective, pitfalls are pointed out, and necessary pre-processing steps are demonstrated. The book also demonstrates by example how to use the programming language R, along with simple alternatives and additions to R, to conduct experiments and create visualisations, providing extensible R code, recipes, links to corpora, and a wide range of methods. The methods introduced can be applied to texts from all disciplines, from history or literature to party manifestos and patient reports.
11476 Digital Society Initiative, UFSP13-9 Digital Religion(s), corpus linguistics, R, 10097 English Department, liri Linguistic Research Infrastructure (LiRI), 11551 Zurich Center for Linguistics, computational linguistics, 10105 Institute of Computational Linguistics, on R programming, history, hand, digital humanities, 820 English & Old English literatures
| Indicator | Description | Value |
|---|---|---|
| Selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 4 |
| Popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
