Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2019
License: CC BY
Data sources: Datacite
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2019
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2019
License: CC BY
Data sources: Datacite
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2019
License: CC BY
Data sources: ZENODO
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2019
License: CC BY
Data sources: Datacite
versions View all 3 versions
addClaim

Augmented dataset of rumours and non-rumours for rumour detection

Authors: Sooji Han; Gao, Jie; Ciravegna, Fabio;

Augmented dataset of rumours and non-rumours for rumour detection

Abstract

This data set contains a collection of Twitter rumours and non-rumours during six real-world events: 1) 2013 Boston marathon bombings, 2) 2014 Ottawa shooting, 3) 2014 Sydney siege, 4) 2015 Charlie Hebdo Attack, 5) 2014 Ferguson unrest, and 6) 2015 Germanwings plane crash The data set is an augmented data set of the PHEME dataset of rumours and non-rumours based on two data sets: the PHEME data [2] (downloaded via https://figshare.com/articles/PHEME_dataset_for_Rumour_Detection_and_Veracity_Classification/6392078), and the CrisisLexT26 data [3] (downloaded via https://github.com/sajao/CrisisLex/tree/master/data/CrisisLexT26/2013_Boston_bombings). PHEME-Aug v2.0 (aug-rnr-data_filtered.tar.bz2 and aur-rnr-data_full.tar.bz2) contains augmented data for all six events. aug-rnr-data_full.tar.bz2 contains source tweets and replies without temporal filtering. Please refer to [1] for details about temporal filtering. The statistics are as follows: * 2013 Boston marathon bombings: 392 rumours and 784 non-rumours * 2014 Ottawa shooting: 1,047 rumours and 2,072 non-rumours * 2014 Sydney siege: 1,764 rumours and 3,530 non-rumours * 2015 Charlie Hebdo Attack: 1,225 rumours and 2,450 non-rumours * 2014 Ferguson unrest: 737 rumours and 1,476 non-rumours * 2015 Germanwings plane crash: 502 rumours and 604 non-rumours aug-rnr-data_filtered.tar.bz2 contains source tweets, replies, and retweets after temporal filtering and deduplication. Please refer to [1] for details. The statistics are as follows: * 2013 Boston marathon bombings: 323 rumours and 645 non-rumours * 2014 Ottawa shooting: 713 rumours and 1,420 non-rumours * 2014 Sydney siege: 1,134 rumours and 2,262 non-rumours * 2015 Charlie Hebdo Attack: 812 rumours and 1,673 non-rumours * 2014 Ferguson unrest: 471 rumours and 949 non-rumours * 2015 Germanwings plane crash: 375 rumours and 402 non-rumours The data structure follows the format of the PHEME data [2]. Each event has a directory, with two subfolders, rumours and non-rumours. These two folders have folders named with a tweet ID. The tweet itself can be found on the 'source-tweet' directory of the tweet in question, and the directory 'reactions' has the set of tweets responding to that source tweet. Also each folder contains ‘aug_complete.csv’ and ‘reference.csv'. 'aug_complete.csv' file contains the metadata (tweet ID, tweet text, timestamp, and rumour label) of augmented tweets before deduplication and filtering tweets without context (i.e., replies). 'reference.csv' file contains manually annotated reference tweets [2, 3]. If you use our augmented data (PHEME-Aug v2.0), please also cite: [1] Han S., Gao, J., Ciravegna, F. (2019). "Neural Language Model Based Training Data Augmentation for Weakly Supervised Early Rumor Detection", The 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2019), Vancouver, Canada, 27-30 August, 2019 ============================================================================================== [2] Kochkina, E., Liakata, M., & Zubiaga, A. (2018). All-in-one: Multi-task Learning for Rumour Verification. COLING. [3] Olteanu, A., Vieweg, S., & Castillo, C. (2015, February). What to expect when the unexpected happens: Social media communications across crises. In Proceedings of the 18th ACM conference on computer supported cooperative work & social computing (pp. 994-1009). ACM

{"references": ["Han, S., Gao, J., Ciravegna, F. (2019). \"Data Augmentation for Rumor Detection Using Context-Sensitive Neural Language Model With Large-Scale Credibility Corpus\",\u00a0Seventh International Conference on Learning Representations (ICLR) LLD, May 2019, New Orleans, Louisiana, US", "Kochkina, E., Liakata, M., & Zubiaga, A. (2018). All-in-one: Multi-task Learning for Rumour Verification. COLING.", "Olteanu, A., Vieweg, S., & Castillo, C. (2015, February). What to expect when the unexpected happens: Social media communications across crises. In\u00a0Proceedings of the 18th ACM conference on computer supported cooperative work & social computing\u00a0(pp. 994-1009). ACM."]}

Related Organizations
Keywords

crisis, social media, pheme, rumour, rumor detection, tweets, rumor

EOSC Subjects

Twitter Data

  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
    OpenAIRE UsageCounts
    Usage byUsageCounts
    visibility views 49
    download downloads 17
  • 49
    views
    17
    downloads
    Powered byOpenAIRE UsageCounts
Powered by OpenAIRE graph
Found an issue? Give us feedback
visibility
download
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
views
OpenAIRE UsageCountsViews provided by UsageCounts
downloads
OpenAIRE UsageCountsDownloads provided by UsageCounts
0
Average
Average
Average
49
17