research data . Dataset . 2020

A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration

Banda, Juan M.; Tekumalla, Ramya; Wang, Guanyu; Yu, Jingyuan; Liu, Tuo; Ding, Yuning; Artemova, Katya; Tutubalina, Elena; Chowell, Gerardo
Open Access English
  • Published: 01 Jan 2020
  • Publisher: Zenodo
Abstract
In version 21 of the dataset, we have refactored the full_dataset.tsv and full_dataset_clean.tsv files (since version 20) to include two additional columns: language and place country code (when available). This change now includes language and country code for ALL the tweets in the dataset, not only clean tweets. With this change we have removed the clean_place_country.tar.gz and clean_languages.tar.gz files. While refactoring the dataset-generating code we also found a small bug that caused some retweets to be counted incorrectly, hence the additional increase in available tweets. Due ...
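As a rough illustration of how the two added columns might be used, the sketch below tallies rows per language and per country code in a tab-separated file. The column names `tweet_id`, `lang`, and `country_code` and the sample rows are assumptions made for this example; they are not taken from the dataset documentation, so check the actual header of full_dataset.tsv before relying on them.

```python
import csv
import io
from collections import Counter


def count_by_column(tsv_file, column):
    """Count rows per value of `column` in a tab-separated file with a header row.

    Empty or missing values are grouped under "unknown", since the country
    code is only present when available.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    counts = Counter()
    for row in reader:
        value = row.get(column) or "unknown"
        counts[value] += 1
    return counts


# Made-up sample with hypothetical column names (not real dataset rows).
SAMPLE = (
    "tweet_id\tlang\tcountry_code\n"
    "1\ten\tUS\n"
    "2\tes\t\n"
    "3\ten\tGB\n"
)

lang_counts = count_by_column(io.StringIO(SAMPLE), "lang")
country_counts = count_by_column(io.StringIO(SAMPLE), "country_code")
```

For the real file, the same function could be called on a handle opened with `open("full_dataset.tsv", newline="", encoding="utf-8")`, passing whatever column names the file header actually uses.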
Subjects
free text keywords: social media, twitter, nlp, covid-19, covid19
Communities
COVID-19
Zenodo
Dataset . 2020
Provider: Zenodo