publication . Other literature type . Article . Conference object . 2021

H-TFIDF: What makes areas specific over time in the massive flow of tweets related to the covid pandemic?

Rémy Decoupes; Rodrique Kafando; Mathieu Roche; Maguelonne Teisseire
Open Access
  • Published: 04 Jun 2021 Journal: AGILE: GIScience Series, volume 2, pages 1-8 (eISSN: 2700-8150)
  • Publisher: Copernicus GmbH
  • Country: France
Data produced by social networks may contain weak signals of possible epidemic outbreaks. In this paper, we focus on Twitter data from the waiting period before the first COVID-19 cases appeared outside China. From the huge flow of tweets reflecting a growing concern in all countries, we propose to analyze such data with an adaptation of the TF-IDF measure, which allows users to extract the discriminant vocabularies used across time and space. The results are then discussed to show how the specific spatio-temporal anchoring of the extracted terms makes it possible to follow the crisis dynamics at different scales of time and space.
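As a rough illustration of the idea summarized in the abstract, the sketch below scores terms with plain TF-IDF after pooling tweets into one "document" per (area, period) pair. It is not the authors' H-TFIDF, whose hierarchical adaptation is not described in this record. The function names `tfidf_by_group` and `top_terms`, the whitespace tokenizer, and the toy tweets are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

# Invented toy data (area, period, text); not taken from the paper's corpus.
TOY_TWEETS = [
    ("FR", "2020-W04", "virus masque masque"),
    ("FR", "2020-W04", "virus paris"),
    ("UK", "2020-W04", "virus lockdown nhs nhs"),
]


def tfidf_by_group(tweets):
    """Return {(area, period): {term: tf-idf score}} for an iterable of
    (area, period, text) tuples. Hypothetical helper, plain TF-IDF only."""
    # Pool all tweets of one area and period into a single bag of words.
    counts = defaultdict(Counter)
    for area, period, text in tweets:
        counts[(area, period)].update(text.lower().split())

    n_groups = len(counts)

    # Document frequency: number of groups in which each term occurs.
    df = Counter()
    for bag in counts.values():
        df.update(bag.keys())

    scores = {}
    for key, bag in counts.items():
        total = sum(bag.values())
        scores[key] = {
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_groups / df[term])
            for term, count in bag.items()
        }
    return scores


def top_terms(scores, key, k=3):
    """Return the k highest-scoring terms for one (area, period) group,
    breaking ties alphabetically."""
    ranked = sorted(scores[key].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

Under these assumptions, a term that occurs in every group (here "virus") scores zero, while terms confined to one area and period rank highest for that group, which is the sense in which the extracted vocabulary is "discriminant".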
free text keywords: General Earth and Planetary Sciences, General Environmental Science, H-TFIDF, Pandemic situation, Hierarchical analysis, TF-IDF, [INFO]Computer Science [cs], [SDE]Environmental Sciences, Coronavirus disease 2019 (COVID-19), Flow (mathematics), Social network, business.industry, business, Focus (computing), Waiting period, Information retrieval, tf–idf, Adaptation (computer science), Computer science, Pandemic
Funded by
MOnitoring Outbreak events for Disease surveillance in a data science context
  • Funder: European Commission (EC)
  • Project Code: 874850
  • Funding stream: H2020 | RIA
Validated by funder