EvaNIL: silver standard dataset for large-scale NIL entity linking evaluation

The EvaNIL dataset can be used to train or evaluate approaches developed for NIL entity linking. It was built from several Biomedical and Life Sciences corpora: PubMed DS, the CRAFT corpus, and MedMentions.

These corpora contain entities associated with knowledge base concepts. To build the EvaNIL dataset, we assumed that those knowledge base concepts did not exist in the respective knowledge bases, so each entity is associated instead with the direct ancestors of those original concepts.

The EvaNIL dataset is divided into 6 partitions including annotations from several knowledge bases: "medic" (CTD-MEDIC), "ctd_anatomy" (CTD-Anatomy), "ctd_chemicals" (CTD-Chemicals), "chebi" (ChEBI), "go_bp" (GO-Biological Process), and "hp" (HPO).

Size of the uncompressed dataset: 957.5 MB
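
The relinking idea described in this record (treating each original concept as absent from its knowledge base and associating the entity with that concept's direct ancestors) could be sketched roughly as follows. This is only an illustrative sketch, not the code used to build EvaNIL: the concept identifiers, the `PARENTS` mapping, the function name `relink_to_direct_ancestors`, and the choice to skip concepts that have no ancestors are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy hierarchy: concept ID -> list of direct ancestors (parents).
# These identifiers are invented for illustration and belong to no real knowledge base.
PARENTS: Dict[str, List[str]] = {
    "X:0001": [],                    # root concept, no ancestors
    "X:0002": ["X:0001"],
    "X:0003": ["X:0002"],
    "X:0004": ["X:0002", "X:0001"],  # a concept with two direct ancestors
}


def relink_to_direct_ancestors(
    annotations: List[Tuple[str, str]],
    parents: Dict[str, List[str]],
) -> List[Tuple[str, List[str]]]:
    """Replace each (mention, concept) pair with (mention, direct ancestors).

    Assumption: annotations whose concept is unknown or has no direct
    ancestor are dropped, since there is nothing to relink them to.
    """
    relinked = []
    for mention, concept_id in annotations:
        ancestors = parents.get(concept_id, [])
        if ancestors:
            relinked.append((mention, list(ancestors)))
    return relinked


# Hypothetical annotations: (entity mention, original concept ID).
original = [
    ("mention A", "X:0003"),
    ("mention B", "X:0004"),
    ("mention C", "X:0001"),
]

silver = relink_to_direct_ancestors(original, PARENTS)
for mention, ancestors in silver:
    print(mention, "->", ancestors)
```

In this sketch the original concept is never kept as a label, which mirrors the assumption that it does not exist in the knowledge base; how the actual dataset handles root concepts or concepts with multiple ancestors is not stated in this record.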

Funded by Fundação para a Ciência e a Tecnologia (FCT) through the following grants: 2020.05393.BD, PTDC/CCI-BIO/28685/2017, UIDB/00408/2020, UIDP/00408/2020.

Related Organizations

University of Lisbon
Portugal

Keywords

NIL entity, Entity Linking, Natural Language Processing, Text Mining, Biomedical text
