publication . Conference object . Preprint . Other literature type . 2018

A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research

Krishnaprasad Thirunarayan; Valerie L. Shalin; Amit P. Sheth; Lakshika Balasuriya; Mohammadreza Rezvan; Saeedeh Shekarpour
Open Access
  • Published: 26 Feb 2018
  • Publisher: ACM
Abstract
A quality annotated corpus is essential, especially for applied research. Despite the Web science community's recent focus on cyberbullying research, the community still does not have standard benchmarks. In this paper, we publish, first, a quality annotated corpus and, second, an offensive-words lexicon capturing different types of harassment: (i) sexual harassment, (ii) racial harassment, (iii) appearance-related harassment, (iv) intellectual harassment, and (v) political harassment. We crawled data from Twitter using our offensive lexicon. We then relied on the human judge to annotate the collected tweets w.r.t. the contextual types because u...
Subjects
ACM Computing Classification System: Information Systems / Information Storage and Retrieval
free text keywords: Computer Science - Computation and Language, World Wide Web, Computer science, Web science, Harassment, Lexicon