Powered by OpenAIRE graph
ZENODO
Dataset . 2023
License: CC BY
Data sources: ZENODO

GPT-3 Curie generated synthetic datasets based on the datasets: Founta, Stormfront, HatEval 2019, Davidson, GermEval 2021, SemEval 2022 Task 4

Author: Schmidhuber, Maximilian

Abstract

This dataset is a composition of six synthetic datasets of toxic or hateful text, based on the datasets published in:

"Large scale crowdsourcing and characterization of twitter abusive behavior"
"Hate Speech Dataset from a White Supremacy Forum"
"Automated hate speech detection and the problem of offensive language"
"Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter"
"Overview of the GermEval 2021 shared task on the identification of toxic, engaging, and fact-claiming comments"
"Don't patronize me! An annotated dataset with patronizing and condescending language towards vulnerable communities"

Each portion was generated by a separate GPT-3 Curie model fine-tuned on one label of the corresponding source dataset. The data is unfiltered and will likely need processing before it is useful.
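Because the generated data is unfiltered, a first cleaning pass could look like the sketch below. It is only an illustration: the record does not describe the file format or any recommended preprocessing, so the function name `clean_generated_texts`, the `min_chars` threshold, and the choice of steps (whitespace normalization, dropping very short samples, case-insensitive deduplication) are all assumptions, not the author's method.

```python
import re


def clean_generated_texts(texts, min_chars=10):
    """Hypothetical cleaning pass for generated samples.

    Collapses whitespace, drops samples shorter than `min_chars`
    (an assumed threshold), and removes case-insensitive duplicates
    while keeping the first occurrence of each text.
    """
    seen = set()
    cleaned = []
    for text in texts:
        # Collapse runs of whitespace (including newlines) into single spaces.
        normalized = re.sub(r"\s+", " ", text).strip()
        # Skip empty or very short generations.
        if len(normalized) < min_chars:
            continue
        # Deduplicate on a case-folded key; keep the original casing in the output.
        key = normalized.casefold()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(normalized)
    return cleaned
```

In practice the input would be the text column read from the downloaded files, and further steps such as language checks or classifier-based label verification may be needed depending on the intended use.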

Keywords

Synthetic Data, Data Augmentation

EOSC Subjects

Twitter Data
