RP-Mod & RP-Crowd: Moderator- and Crowd-Annotated German News Comment Datasets

Keywords: 16. Peace & justice

Research data: Dataset | 24 Aug 2021 | German | Publisher: Zenodo

Authors: Assenmacher, Dennis; Niemann, Marco; Müller, Kilian; Seiler, Moritz V.; Riehle, Dennis M.; Trautmann, Heike

doi: 10.5281/zenodo.5242915, 10.5281/zenodo.5291339, 10.5281/zenodo.5242916

Abstract

Abuse and hate are penetrating social media and the comment sections of many news media companies. These platform providers invest considerable effort in moderating user-generated contributions to avoid losing readers who are appalled by inappropriate texts. Legislation reinforces this pressure by making failure to remove such comments a punishable offence. While (semi-)automated solutions using Natural Language Processing and advanced Machine Learning techniques are becoming increasingly sophisticated, abusive language detection still struggles because large, well-curated non-English datasets are scarce or not publicly available. With this work, we publish and analyse the largest annotated German abusive language comment datasets to date. In contrast to existing datasets, we achieve a high labelling standard by conducting a thorough crowd-based annotation study that complements professional moderators' decisions, which are also included in the dataset. We compare and cross-evaluate the performance of baseline algorithms and state-of-the-art transformer-based language models, which are fine-tuned on our datasets and an existing alternative, demonstrating the datasets' usefulness for the community.

The research leading to these results received funding from the federal state of North Rhine-Westphalia and the European Regional Development Fund (EFRE.NRW 2014-2020), Project: MODERAT! (No. CM-2-2-036a).

Related Organizations

University of Koblenz and Landau, Germany
University of Münster, Germany

Impact by BIP!

citations: 1. An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. Reflects the initial momentum of an article directly after its publication, based on the underlying citation network.