The WE1S Twitter dataset contains 5,024,756 tweets posted to Twitter between December 6th, 2013 and June 30th, 2019. The dataset is divided into subcollections based on the query terms "humanities", "liberal arts", "stem", "science", and "science-es" (a query for the presence of either "science" or "sciences"). Subcollections can be identified in the dataset from the value of each tweet's metapath property.

The number of tweets in each subcollection is as follows:
- humanities: 1,705,038
- liberal-arts: 7,663
- stem: 865,156
- science: 2,089,985
- science-es: 356,914

The tweets are distributed by year as follows:
- 2013: 16,335
- 2014: 862,746
- 2015: 1,711,823
- 2016: 947,561
- 2017: 976,971
- 2018: 324,133
- 2019: 185,187

Collectively, the tweets were posted by 1,886,739 distinct usernames. Each tweet's mentions, hashtags, and links are recorded, as well as the number of likes and retweets.

Unlike most other WE1S datasets, the Twitter dataset does not contain extracted features. Instead, it contains the original text of each tweet (the value of the content property), along with a tidy_tweet property, which contains the text of the tweet after preprocessing. Tweets were preprocessed using a modified form of the WE1S preprocessing algorithm. Details can be found in the WE1S Tweet-Suite repository. (See WE1S Research Materials Overview for the relation between the project's "datasets" and "collections.")
The data has been archived in JSONL format (one JSON document per line, delimited by line breaks).
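Because each line is a standalone JSON document, the archive can be streamed one tweet at a time rather than loaded whole. The following is a minimal sketch, not the project's own tooling: the property names metapath, content, and tidy_tweet come from the description above, but the helper names and the sample metapath values are invented for illustration, and the real metapath values may look quite different.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_tweets(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed tweet per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


def count_by_metapath(tweets: Iterable[dict]) -> Counter:
    """Tally tweets by the raw value of their metapath property."""
    return Counter(tweet.get("metapath", "<missing>") for tweet in tweets)


# Made-up sample lines: the metapath values here are placeholders,
# not values taken from the actual dataset.
sample_lines = [
    '{"metapath": "placeholder/humanities", "content": "Raw text A", "tidy_tweet": "raw text a"}',
    '{"metapath": "placeholder/stem", "content": "Raw text B", "tidy_tweet": "raw text b"}',
    "",
    '{"metapath": "placeholder/humanities", "content": "Raw text C", "tidy_tweet": "raw text c"}',
]

for metapath, n in count_by_metapath(iter_tweets(sample_lines)).most_common():
    print(metapath, n)
```

To run the same tally over the archive itself, pass an open file handle (for example `open(path, encoding="utf-8")`, where `path` points to the downloaded file) to `iter_tweets` in place of `sample_lines`. Grouping on the raw metapath value avoids guessing how the subcollection name is encoded within it.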
Twitter Data
| Indicator | Description | Value |
| --- | --- | --- |
| citations | An alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 360 |
| downloads | | 115 |

Views provided by UsageCounts
Downloads provided by UsageCounts