Mean Birds: Detecting Aggression and Bullying on Twitter

Conference paper / preprint (open access)
Despoina Chatzakou ; Nicolas Kourtellis ; Jeremy Blackburn ; Emiliano De Cristofaro ; Gianluca Stringhini ; Athena Vakali (2017)
  • Publisher: ACM
  • Related identifiers: doi: 10.1145/3091478.3091487
  • Subject: Computer Science - Social and Information Networks | Computer Science - Computers and Society

In recent years, bullying and aggression against users on social media have grown significantly, with serious consequences for victims of all demographics. In particular, cyberbullying affects more than half of young social media users worldwide, and prolonged and/or coordinated digital harassment has even led to teenage suicides. Nonetheless, tools and technologies for understanding and mitigating such behavior are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detecting bullying and aggressive behavior on Twitter. We propose a robust methodology for extracting text-, user-, and network-based attributes, and for studying the properties of cyberbullies and aggressors and the features that distinguish them from regular users. We find that bullies post less, participate in fewer online communities, and are less popular than normal users, while aggressors are quite popular and tend to include more negativity in their posts. We evaluate our methodology on a corpus of 1.6M tweets posted over 3 months, and show that machine learning classifiers can accurately detect users exhibiting bullying and aggressive behavior, achieving over 90% AUC.
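The evaluation metric reported above, AUC (area under the ROC curve), equals the probability that a randomly chosen positive example (a bully/aggressor) receives a higher classifier score than a randomly chosen negative one (a normal user). As a minimal stdlib sketch (not the authors' pipeline; `scores` and `labels` are hypothetical classifier outputs), it can be computed directly from that rank-based definition:

```python
def roc_auc(scores, labels):
    """AUC via the Mann-Whitney rank statistic: the fraction of
    (positive, negative) pairs where the positive example is scored
    higher, counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 3 positives, 2 negatives; one positive is ranked
# below a negative, so 5 of the 6 pairs are ordered correctly.
print(roc_auc([0.9, 0.8, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0]))  # 5/6 ~= 0.833
```

An AUC above 0.9, as reported in the abstract, means that for over 90% of such pairs the classifier ranks the abusive user above the normal one.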
