research data . Dataset . 2020

Temporally-Informed Analysis of Named Entity Recognition

Rijhwani, Shruti; Preoțiuc-Pietro, Daniel
Open Access English
  • Published: 17 Jun 2020
  • Publisher: Zenodo
Abstract
This repository contains the data set developed for the paper: “Shruti Rijhwani and Daniel Preoțiuc-Pietro. Temporally-Informed Analysis of Named Entity Recognition. In Proceedings of the Association for Computational Linguistics (ACL). 2020.” It includes 12,000 tweets annotated for the named entity recognition task. The tweets are uniformly distributed over the years 2014-2019, with 2,000 tweets from each year. The goal is to have a temporally diverse corpus to account for data drift over time when building NER models. The entity types annotated are locations (LOC), persons (PER) and organizations (ORG). The tweets are preprocessed to replace userna...
Subjects
free text keywords: named entity recognition, twitter, ner, twitter ner, tweets, temporal analysis, information extraction
Download from
Open Access
Zenodo
Dataset . 2020
Provider: Datacite
Open Access
Zenodo
Dataset . 2020
Provider: Zenodo