The ACL-cite dataset was created for the paper “On the Use of Context for Predicting Citation Worthiness of Sentences in Scholarly Articles”, published at NAACL 2021. The dataset contains over 2.7 million sentences extracted from scholarly articles (from the ACL Anthology [Bird et al.]) and their corresponding citation worthiness labels. The goal of the citation worthiness task is to determine whether a given sentence requires a citation.

There are three CSV files in the dataset:

- train.csv: 1,625,268 rows
- dev.csv: 539,085 rows
- test.csv: 542,081 rows

Each CSV file contains the following columns:

- document_id: identifier of the paper the sentence was extracted from
- section: name of the section the sentence was extracted from (e.g., Abstract, Introduction)
- section_id: sequential identifier of the section in the paper
- paragraph_id: sequential identifier of the paragraph the sentence was extracted from
- sentence: the sentence with the citations removed
- raw_sentence: the raw sentence including the citations
- sentence_id: sequential identifier of the sentence in the paper
- label: citation worthiness label

Note: The train/dev/test splits are made at the document_id level, so all sentences from a given paper belong to the same split.
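As a rough illustration of how the files described above might be inspected, the sketch below reads one split with Python's standard `csv` module, counts rows and label values, and checks that no `document_id` appears in more than one split. It assumes the files are comma-separated, UTF-8 encoded, and start with a header row using the column names listed in the description. The helper names `summarize_split` and `shared_documents` are hypothetical and not part of the dataset. Because the description does not state how `label` is encoded, label values are counted as raw strings.

```python
import csv
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = {
    "document_id", "section", "section_id", "paragraph_id",
    "sentence", "raw_sentence", "sentence_id", "label",
}


def summarize_split(path):
    """Return (row_count, document_ids, label_counts) for one split file.

    Assumes a comma-separated, UTF-8 file with a header row (an assumption,
    not something the dataset description states explicitly).
    """
    document_ids = set()
    label_counts = Counter()
    row_count = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"{path} is missing columns: {sorted(missing)}")
        for row in reader:
            row_count += 1
            document_ids.add(row["document_id"])
            # Labels are kept as raw strings; their encoding is not documented.
            label_counts[row["label"]] += 1
    return row_count, document_ids, label_counts


def shared_documents(*id_sets):
    """Return document ids that appear in more than one of the given sets."""
    seen, shared = set(), set()
    for ids in id_sets:
        shared |= seen & ids
        seen |= ids
    return shared
```

If the downloaded files match the description, `summarize_split("train.csv")` should report 1,625,268 rows, and calling `shared_documents` on the id sets of the three splits should return an empty set, since the splits are made at the `document_id` level.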
citation, citation worthiness
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| Popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| Views | | 9 |
| Downloads | | 1 |

Views provided by UsageCounts
Downloads provided by UsageCounts