Downloads provided by UsageCounts
This dataset consists of tweets collected during 48 disasters across 10 disaster types, with human annotations indicating whether each tweet is related to the disaster or not. The collection is intended as a benchmarking dataset for filtering algorithms.

Dataset Specification

Tweets are separated into files by individual disaster, and each file contains a balanced number of positive and negative examples. The naming scheme is as follows:

<disaster type>-<name or region>[-<sub-type>]-<year>.ndjson

Each line in the data files is a complete JSON object containing the tweet ID, the text, and the annotations, as follows:

{"id": "12345", "text": "let's all pray for nepal!", "relevance": 1}

References

To reference this collection as a whole, please use the following citation:

Wiegmann, M., Kersten, J., Klan, F., Potthast, M., Stein, B. (2020). Analysis of Filtering Models for Disaster-Related Tweets. Proceedings of the 17th ISCRAM.

This dataset compiles tweets collected, annotated, and published in several other works. Please consider citing those as well:

1. Imran, M., Castillo, C., Lucas, J., Meier, P., and Vieweg, S. (2014). AIDR: artificial intelligence for disaster response. In: WWW (Companion Volume).
2. Olteanu, A., Castillo, C., Diaz, F., and Vieweg, S. (2014). CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises. Proceedings of the 8th ICWSM.
3. Olteanu, A., Vieweg, S., and Castillo, C. (2015). What to Expect When the Unexpected Happens: Social Media Communications Across Crises. Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing.
4. Imran, M., Mitra, P., and Srivastava, J. (2016). Enabling Rapid Classification of Social Media Communications During Crises. IJISCRAM 8.
5. Alam, F., Ofli, F., and Imran, M. (2018). CrisisMMD: Multimodal Twitter Datasets from Natural Disasters. Proceedings of the 12th ICWSM.
6. Stowe, K., Palmer, M., Anderson, J., Kogan, M., Palen, L., Anderson, K. M., Morss, R., Demuth, J., and Lazrus, H. (2018). Developing and Evaluating Annotation Procedures for Twitter Data during Hazard Events. Proceedings of the LAW-MWE-CxG-2018.
7. McMinn, A. J., Moshfeghi, Y., and Jose, J. M. (2013). Building a Large-scale Corpus for Evaluating Event Detection on Twitter. Proceedings of the 22nd ACM CIKM.
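The file naming scheme and line format given in the dataset specification can be handled with the standard library alone. The following is a minimal sketch, not an official loader. The function names, the example filenames, and the second sample tweet are hypothetical. It also assumes that no filename component contains a hyphen and that negative examples carry "relevance": 0, neither of which the description states.

```python
import json
import re
from collections import Counter
from pathlib import Path

# Assumed pattern for <disaster type>-<name or region>[-<sub-type>]-<year>.ndjson.
# Assumption: no component contains a hyphen and the year has four digits.
FILENAME_RE = re.compile(
    r"^(?P<disaster_type>[^-]+)"
    r"-(?P<name>[^-]+)"
    r"(?:-(?P<sub_type>[^-]+))?"
    r"-(?P<year>\d{4})\.ndjson$"
)


def parse_filename(filename):
    """Split a data file name into its components (sub_type may be None)."""
    match = FILENAME_RE.match(Path(filename).name)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return match.groupdict()


def read_tweets(lines):
    """Yield one dict per non-empty line; each line is a complete JSON object."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def label_counts(tweets):
    """Count tweets per relevance label, e.g. to check that a file is balanced."""
    return Counter(tweet["relevance"] for tweet in tweets)


# Hypothetical file name, used only to illustrate the naming scheme.
info = parse_filename("earthquake-nepal-2015.ndjson")

# First line is the sample from the description; the second is invented and
# assumes that negative examples are labelled 0.
sample_lines = [
    '{"id": "12345", "text": "let\'s all pray for nepal!", "relevance": 1}',
    '{"id": "67890", "text": "hypothetical unrelated tweet", "relevance": 0}',
]
counts = label_counts(read_tweets(sample_lines))
print(info)
print(counts)
```

To read a real file, pass an open file handle to read_tweets, for example read_tweets(open(path, encoding="utf-8")), since iterating over a text file yields one line at a time.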
tweets, disaster, crisis, relevance, filter
Twitter Data
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 295 |
| downloads | | 80 |

Views provided by UsageCounts