
Abstract: Automatic extraction and normalization of human phenotypes from unstructured physical examination reports is a crucial and challenging task in clinical genetics. This paper presents the system submitted by IKMLab for BioCreative VIII Task 3, Genetic Phenotype Extraction and Normalization. We target Subtask 3b and aim to accurately locate human phenotype findings in a given observation text. Our system consists of two stages. In the first stage, we use the output of an existing baseline (e.g., PhenoTagger) to obtain a preliminary set of Human Phenotype Ontology (HPO) terms for each observation. In the second stage, we design a sequence tagging schema based on a pre-trained language model and perform token classification to locate the spans for those HPO terms. In the final evaluation, our best system achieved Exact and Overlapping F1 scores of 60.4% and 64.2%, respectively. Further experiments show that our approach better locates both separated and consecutive spans that describe HPO terms in observations. This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.
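To make the second stage more concrete, the following is a minimal sketch of how token-level tags could be decoded into character spans. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the tagging schema, so a plain B/I/O scheme is assumed here, whitespace tokenization stands in for the language model's subword tokenizer, and the example text, tags, and the helper names `token_offsets` and `bio_to_spans` are hypothetical.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]


def token_offsets(text: str) -> List[Span]:
    """Character (start, end) offsets of whitespace-separated tokens.

    Assumption: a real system would use the subword offsets of the
    pre-trained language model's tokenizer instead.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def bio_to_spans(offsets: List[Span], tags: List[str]) -> List[Span]:
    """Merge per-token B/I/O tags into character spans.

    "B" opens a new span (closing any open one), "I" extends the open
    span, and "O" closes it. An "I" with no open span is treated as "O".
    """
    spans: List[Span] = []
    current = None  # [start, end] of the span being built, if any
    for (start, end), tag in zip(offsets, tags):
        if tag == "B":
            if current is not None:
                spans.append((current[0], current[1]))
            current = [start, end]
        elif tag == "I" and current is not None:
            current[1] = end
        else:
            if current is not None:
                spans.append((current[0], current[1]))
            current = None
    if current is not None:
        spans.append((current[0], current[1]))
    return spans


# Hypothetical observation text and hypothetical model predictions.
text = "short neck and low set ears"
tags = ["B", "I", "O", "B", "I", "I"]
spans = bio_to_spans(token_offsets(text), tags)
mentions = [text[s:e] for s, e in spans]
```

Because a "B" tag always starts a fresh span, two findings that directly follow each other are kept apart rather than merged, which is one simple way to handle consecutive spans. Handling a single HPO term whose description is split across separated spans would additionally require linking spans to terms, which this sketch does not attempt.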
ner, phenotypes, entity linking, bionlp
