Effects of Feature Extraction and Classification Methods on Cyberbully Detection

Article OPEN
ÖZEL, Selma Ayşe ; SARAÇ, Esra (2016)
  • Publisher: Süleyman Demirel Üniversitesi
  • Journal: (issn: 1308-6529, eissn: 1308-6529)
  • Related identifiers: doi: 10.19113/sdufbed.20964
  • Subject: Cyberbullying | Preprocessing | Feature selection | Classification | Agriculture | Agriculture (General) | S | S1-972 | Technology (General) | T1-995

<p>Cyberbullying is defined as an aggressive, intentional act carried out against a defenseless person through the Internet or other electronic media. Researchers have found that many bullying cases have ended tragically in suicide; hence, automatic detection of cyberbullying has become important. In this study we show how the choice of feature extraction, feature selection, and classification methods affects the performance of automatic cyberbullying detection. The experiments use the FormSpring.me dataset and investigate the effects of preprocessing methods; of several classifiers, namely C4.5, Naïve Bayes, kNN, and SVM; and of the information gain and chi-square feature selection methods. The experimental results indicate that the best classification results are obtained with alphabetic tokenization, no stemming, and no stopword removal. Feature selection also improves cyberbully detection performance. Among the classifiers compared, C4.5 performs best on this dataset.</p>
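The preprocessing and feature selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy posts, and their labels are hypothetical, and the chi-square score is the standard 2x2 contingency form computed over document-level term presence. Tokenization is alphabetic only, with no stemming and no stopword removal, which mirrors the configuration the abstract reports as best.

```python
import re
from collections import Counter


def alphabetic_tokens(text):
    """Lowercase the text and keep only runs of letters.

    No stemming and no stopword removal are applied.
    """
    return re.findall(r"[a-z]+", text.lower())


def chi_square_scores(docs, labels):
    """Score each term against a binary label with the chi-square statistic.

    docs   -- list of raw text strings
    labels -- list of 0/1 values (1 = bullying), same length as docs
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    pos_df, neg_df = Counter(), Counter()
    for text, label in zip(docs, labels):
        terms = set(alphabetic_tokens(text))
        (pos_df if label else neg_df).update(terms)

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a = pos_df[term]   # bullying posts containing the term
        b = neg_df[term]   # other posts containing the term
        c = n_pos - a      # bullying posts without the term
        d = n_neg - b      # other posts without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A zero denominator means the term carries no class information.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Hypothetical toy posts, not taken from the FormSpring.me dataset.
    posts = [
        "you are so stupid",
        "nobody likes you loser",
        "what is your favorite movie",
        "are you going to the game",
    ]
    is_bullying = [1, 1, 0, 0]
    scores = chi_square_scores(posts, is_bullying)
    print(select_top_k(scores, 5))
```

The selected terms would then form the feature vector passed to a classifier such as C4.5, Naïve Bayes, kNN, or SVM. Information gain can be substituted for chi-square by changing only the scoring function.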