ZENODO
Collection . 2022
License: CC BY
Data sources: Datacite
Data sources: ZENODO

LivingNER corpus: Named entity recognition, normalization & classification of species, pathogens and food

Authors: Miranda-Escalada, Antonio; Farré-Maduell, Eulàlia; González Gacio, Gloria; Krallinger, Martin

Abstract

LivingNER corpus - training, validation, test and background sets + MULTILINGUAL RESOURCES

The LivingNER corpus is a collection of 2000 clinical cases from over 10 different medical areas, annotated with SPECIES mentions that are mapped to NCBI Taxonomy. It is used for the LivingNER Shared Task on detection, normalization and classification of species mentions in Spanish medical documents, held as part of IberLEF 2022.

Please cite the following if you use this dataset:

A. Miranda-Escalada, E. Farré-Maduell, S. Lima-López, D. Estrada, L. Gascó, M. Krallinger, Mention detection, normalization & classification of species, pathogens, humans and food in clinical documents: Overview of LivingNER shared task and resources, Procesamiento del Lenguaje Natural (2022)

@article{amiranda2022nlp,
  title={Mention detection, normalization \& classification of species, pathogens, humans and food in clinical documents: Overview of LivingNER shared task and resources},
  author={Miranda-Escalada, Antonio and Farr{\'e}-Maduell, Eul{\`a}lia and Lima-L{\'o}pez, Salvador and Estrada, Darryl and Gasc{\'o}, Luis and Krallinger, Martin},
  journal = {Procesamiento del Lenguaje Natural},
  year={2022}
}

Training and validation sets

The training set is composed of 1000 clinical case reports extracted from miscellaneous medical specialties, including COVID, oncology, infectious diseases, tropical medicine, urology, pediatrics, and others. The files are distributed as follows:

- For subtask 1 (LivingNER-Species NER track), annotations are distributed in a tab-separated values (TSV) file. It has one row per annotation, with the following columns:
    filename: document name
    mark: mention identifier mark
    label: mention type (SPECIES or HUMAN)
    off0: starting position of the mention in the document
    off1: ending position of the mention in the document
    span: textual span
- For subtask 2 (LivingNER-Species Norm track), annotations are distributed in a TSV file. It has one row per annotation, with the same columns as the previous one, plus:
    isH: whether the span is narrower than the assigned NCBITax code
    isN: whether the mention corresponds to a nosocomial infection
    iscomplex: whether the span is assigned a combination of NCBITax codes
    NCBITax: mention code in the NCBI Taxonomy
- For subtask 3 (LivingNER-Clinical IMPACT track), annotations are distributed in a TSV file. It has one row per clinical case report, with the following columns:
    filename
    isPet (Yes/No)
    PetIDs (NCBITaxonomy codes of pet & farm animals present in document)
    isAnimalInjury (Yes/No)
    AnimalInjuryIDs (NCBITaxonomy codes of animals causing injuries present in document)
    IsFood (Yes/No)
    FoodIDs (NCBITaxonomy codes of food mentions present in document)
    isNosocomial (Yes/No)
    NosocomialIDs (NCBITaxonomy codes of nosocomial species mentions present in document)

The validation set contains 500 clinical case reports in the same format as the training ones.

Test and background sets

The test+background set is a collection of 13467 clinical case reports. The goal of the LivingNER task is to make predictions for this set. Among the 13467 clinical case reports, 485 will be used for evaluation (this is the test set). The rest (the background set) are added to prevent manual annotations and to create a Silver Standard.

Multilingual resources

We have generated the annotated training and validation sets in 6 languages: English, Portuguese, Catalan, Italian, French and Romanian. The process was:

- The text files were translated with a neural machine translation system.
- The annotations were translated with the same neural machine translation system.
- The translated annotations were transferred to the translated text files using an annotation transfer technology.

The text files are stored in the multilingual_resources/training-text-files and multilingual_resources/validation-text-files subfolders. The annotated TSV files are stored in the multilingual_resources/annotation_transfer subfolder. For the sake of comparison, we also include the annotations produced by the LINNAEUS tool in the multilingual_resources/linneaus subfolder.

If you want to visualise the multilingual resources, check out this Brat server: https://temu.bsc.es/mLivingNER/#/translations/ For instance, you can see the parallel annotations in English vs in French, or in Spanish (the gold standard) vs in Catalan.

Important notes about subtask 3 (LivingNER-Clinical IMPACT track):

- Fewer clinical case reports. Subtask 3 (LivingNER-Clinical IMPACT track) contains half of the clinical case reports (500 in the training partition, 250 in the validation partition). The list of valid clinical case reports for subtask 3 is included in the data (train_files_task3.txt and validation_files_task3.txt).
- Enriched dataset. The GS format is the one described above (a TSV with one line per clinical case report). However, we believe participants may find an enriched dataset useful, so we provide an additional dataset with the mentions of the NER track classified into the 4 Clinical impact categories (food, pet & farm animals, animals causing injuries and nosocomial). It is a TSV file with one row per annotation, and with the following columns: filename, mark, label, off0, off1, span, isPet, isAnimalInjury, isFood, isNosocomial, isH, iscomplex, code

All text files are distributed as plain UTF-8 text files.

Resources

- Web
- Annotation guidelines
- Evaluation library
- LivingNER terminology

For further information, please visit https://temu.bsc.es/livingner/ or email us at encargo-pln-life@bsc.es
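
As an illustration of the subtask 1 annotation format described above, the following minimal Python sketch reads a TSV with the columns filename, mark, label, off0, off1 and span, and checks each span against the document text. It rests on several assumptions that the description does not confirm: that the TSV has a header row with exactly those column names, that off0 and off1 are character offsets with off1 exclusive (so that text[off0:off1] equals span), and that documents are looked up by the filename value. The sample document and rows are invented for the example and are not taken from the corpus.

```python
import csv
import io


def read_ner_annotations(handle):
    """Parse a subtask 1 style TSV into a list of dicts.

    Assumption: the file starts with a header row naming the columns
    filename, mark, label, off0, off1, span.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        row["off0"] = int(row["off0"])
        row["off1"] = int(row["off1"])
        rows.append(row)
    return rows


def find_offset_mismatches(rows, texts):
    """Return the rows whose span differs from texts[filename][off0:off1].

    Assumption: off1 is exclusive, as in Python slicing.
    """
    return [
        row
        for row in rows
        if texts[row["filename"]][row["off0"]:row["off1"]] != row["span"]
    ]


# Hypothetical document and annotations, invented for illustration only.
sample_text = "Paciente con infección por Escherichia coli."
sample_tsv = (
    "filename\tmark\tlabel\toff0\toff1\tspan\n"
    "caso_001\tT1\tHUMAN\t0\t8\tPaciente\n"
    "caso_001\tT2\tSPECIES\t27\t43\tEscherichia coli\n"
)

rows = read_ner_annotations(io.StringIO(sample_tsv))
mismatches = find_offset_mismatches(rows, {"caso_001": sample_text})
print(f"{len(rows)} annotations read, {len(mismatches)} offset mismatches")
```

With real data, the in-memory sample would be replaced by the distributed TSV opened with encoding="utf-8" and newline="", and the texts dictionary would be filled from the plain UTF-8 text files. If many mismatches appear, the offset convention assumed here is probably wrong for the corpus and should be checked against the annotation guidelines.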

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Keywords

normalization, NER, gold standard, species, corpus, NLP
