research data . Dataset . Under curation

A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration

Banda, Juan M.; Tekumalla, Ramya; Wang, Guanyu; Yu, Jingyuan; Liu, Tuo; Ding, Yuning; Artemova, Katya; Tutubalina, Elena; Chowell, Gerardo
Open Access English
  • Publisher: Zenodo
Abstract
In version 27 of the dataset, we have refactored the full_dataset.tsv and full_dataset_clean.tsv files (since version 20) to include two additional columns: language and place country code (when available). This change now includes language and country code for ALL the tweets in the dataset, not only clean tweets. With this change we have removed the clean_place_country.tar.gz and clean_languages.tar.gz files. While refactoring the dataset-generating code we also found a small bug that caused some retweets to be counted incorrectly, hence the additional increase in available tweets.

Due...
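As a rough illustration of how the two added columns might be used, the sketch below filters a tab-separated file by language and country code using only the Python standard library. The column names `tweet_id`, `lang` and `country_code`, the helper name `filter_tweets`, and the use of an empty string for a missing country code are all assumptions made for this example; the abstract does not state the actual header names, so check the header row of full_dataset.tsv before relying on them. The sample rows are synthetic.

```python
import csv
import io
from typing import Dict, Iterator, Optional, TextIO


def filter_tweets(
    handle: TextIO,
    lang: Optional[str] = None,
    country: Optional[str] = None,
    lang_col: str = "lang",                # assumed column name
    country_col: str = "country_code",     # assumed column name
) -> Iterator[Dict[str, str]]:
    """Yield rows of a TSV stream that match the requested language/country."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # Skip rows whose language does not match, when a language is requested.
        if lang is not None and row.get(lang_col) != lang:
            continue
        # Skip rows whose country code does not match, when one is requested.
        if country is not None and row.get(country_col) != country:
            continue
        yield row


# Synthetic stand-in for a few lines of the real file (not actual data).
sample = io.StringIO(
    "tweet_id\tlang\tcountry_code\n"
    "1\ten\tUS\n"
    "2\tes\t\n"
    "3\ten\t\n"
)

english = list(filter_tweets(sample, lang="en"))
print([row["tweet_id"] for row in english])
```

For the real file, the same function could be called on `open("full_dataset.tsv", newline="", encoding="utf-8")`. Streaming row by row avoids loading the whole file into memory, which matters for a dataset of this size.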