Filtering Tweets for Social Unrest

Publication: Article, Preprint, Conference object, Other literature type

Date: 01 Jan 2017

Embargo end date: 01 Jan 2017

Publisher: IEEE

Journal: 2017 IEEE 11th International Conference on Semantic Computing (ICSC)

Authors: Alan Mishler; Kevin Wonus; Wendy Chambers; Michael Bloodgood

doi: 10.1109/icsc.2017.75, 10.13016/m2jz6j, 10.48550/arxiv.1702.06216

arXiv: 1702.06216

handle: 1903/19182

Abstract

Since the events of the Arab Spring, there has been increased interest in using social media to anticipate social unrest. While efforts have been made toward automated unrest prediction, we focus on filtering the vast volume of tweets to identify tweets relevant to unrest, which can be provided to downstream users for further analysis. We train a supervised classifier that is able to label Arabic language tweets as relevant to unrest with high reliability. We examine the relationship between training data size and performance and investigate ways to optimize the model building process while minimizing cost. We also explore how confidence thresholds can be set to achieve desired levels of performance.
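The abstract does not say how the confidence thresholds are chosen, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes a held-out set of classifier confidence scores with gold relevance labels (1 = relevant to unrest, 0 = not relevant) and picks the lowest threshold that meets a target precision, which keeps recall as high as possible at that precision. The function names `precision_recall_at` and `lowest_threshold_for_precision` are hypothetical.

```python
from typing import List, Optional, Tuple


def precision_recall_at(
    scores: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when items scoring >= threshold are labeled relevant."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    # Convention (an assumption): precision is 1.0 when nothing is predicted relevant.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def lowest_threshold_for_precision(
    scores: List[float], labels: List[int], target_precision: float
) -> Optional[float]:
    """Smallest observed score whose precision meets the target, or None.

    Recall never increases as the threshold rises, so the smallest
    qualifying threshold also gives the highest recall at that precision.
    """
    for t in sorted(set(scores)):
        precision, _ = precision_recall_at(scores, labels, t)
        if precision >= target_precision:
            return t
    return None
```

A downstream user who needs fewer false positives would raise `target_precision` and accept lower recall; the scores and labels passed in should come from data the classifier was not trained on.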

Related Organizations

Carnegie Mellon University, United States
University of Maryland, College Park, United States
College of New Jersey, United States
University of Maryland, United States

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, text classification, 330, social media, social unrest, stopping criteria, Machine Learning (stat.ML), computer science, Computer Science - Information Retrieval, Machine Learning (cs.LG), text filtering, computational linguistics, H.3.3, Statistics - Machine Learning, active learning, natural language processing, selective sampling, Computer Science - Computation and Language, I.2.6, I.2.7, I.5.4, artificial intelligence, H.3.3; I.2.6; I.2.7; I.5.4, 004, human language technology, machine learning, stopping methods, statistical methods, text processing, Computation and Language (cs.CL), Information Retrieval (cs.IR)

Impact by BIP!

selected citations: 9
These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

popularity: Top 10%
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.

influence: Top 10%
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).

impulse: Top 10%
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.