
Dataset for paper: Disturbed YouTube for Kids: Characterizing and Detecting Inappropriate Videos Targeting Young Children

The dataset consists of five files:

1. groundtruth_videos.json: This is the ground truth dataset. We have 4797 manually annotated videos (1513 suitable, 929 disturbing, 419 restricted, and 1936 irrelevant). You can distinguish among the different labels by observing the 'classification_label' field.
2. elsagate_related_videos.json: Contains the data for 233K elsagate-related YouTube videos (1K seed and 232K recommended) that were obtained as described in the paper.
3. other_child_related_videos.json: Contains the data for 155K other child-related YouTube videos (2K seed and 153K recommended) that were obtained as described in the paper.
4. random_videos.json: Contains the data for 482K random YouTube videos (8K seed and 474K recommended) that were obtained as described in the paper.
5. popular_videos.json: Contains the data for 11K popular YouTube videos (500 seed and 10.5K recommended) that were obtained between November 18 and November 21, 2018, as described in the paper.

For each video in all sets, you can check the predicted label of our classifier by observing the 'prediction' field.
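As a rough illustration of how the 'classification_label' and 'prediction' fields might be inspected, the following Python sketch loads one of the files and tallies the values of a given field. It rests on assumptions not stated in the description: that each file is either a single JSON array of video objects or one JSON object per line, and that the two fields sit at the top level of each video object. The helper names load_records and count_field are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Load video records from a file (hypothetical helper).

    Assumption: the file is either one JSON array of objects or
    JSON Lines (one object per line); the description does not say which.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Not a single JSON document: fall back to one object per line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # Wrap a lone top-level object so callers always receive a list.
    return data if isinstance(data, list) else [data]


def count_field(records, field):
    """Count the values of `field`; records lacking it are grouped under None.

    Assumption: the field is a top-level key of each video object.
    """
    return Counter(rec.get(field) for rec in records)
```

If these assumptions hold, count_field(load_records("groundtruth_videos.json"), "classification_label") should give the per-label totals listed for the ground truth file, and passing "prediction" instead would summarize the classifier output. The sketch reads the whole file into memory, which may be impractical for the larger sets; a streaming parser would be a reasonable substitute there.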
Acknowledgments: This project has received funding from the European Union's Horizon 2020 Research and Innovation program under the Marie Skłodowska-Curie ENCASE project (Grant Agreement No. 691025) and from the National Science Foundation under grant CNS-1942610.