
In recent times, verbal aggression and related phenomena such as hate speech, abusive language, and trolling have become a major problem on social media. In this paper, I present the results of a large-scale quantitative study of aggression, based on a target-based typology, in a manually annotated multilingual dataset of over 20,000 Facebook comments and tweets, each written in Hindi, English, or code-mixed Hindi-English. Drawing on insights from this study, I develop two different classifiers for detecting aggression in Hindi, English, and Hindi-English mixed Facebook and Twitter conversations. The classifiers are developed using an annotated corpus of approximately 9,000 Facebook comments and 5,000 tweets. Since a phenomenon like aggression is highly subjective, the study shows a comparatively modest inter-annotator agreement of 0.72 and an overall F1 score of 0.64 for both Facebook and Twitter. Consequently, I also carried out two user studies, in which humans were asked to evaluate the classifier's annotations, to test the actual 'acceptance' of the classifier's judgments. I discuss the results of these user studies and give an analysis of the overall performance of the system.
Social and Behavioral Sciences, Linguistics, Computational Linguistics, Discourse and Text Linguistics, Semantics and Pragmatics
