handle: 10261/368392
Entity normalization is a common strategy to resolve ambiguities by mapping all synonymous mentions to a single concept identifier in a standard terminology. Normalizing medical entities is challenging, especially for languages other than English, where lexical variation is considerably under-represented. Here, we report a new linguistic resource for medical entity normalization in Spanish. We applied a UMLS-based medical lexicon (MedLexSp) to automatically normalize mentions from 2000 medical referrals of the Chilean Waiting List Corpus. Three medical students manually revised the automatic normalization. The inter-coder agreement was computed, and the distribution of concepts, errors, and linguistic sources of variation was analyzed. The automatic method normalized 52% of the mentions, compared to 91% after manual revision. The lowest agreement between automatic and automatic-manual normalization was observed for Finding, Disease, and Procedure entities. Errors in normalization were associated with ortho-typographic, semantic, and grammatical linguistic inadequacies, mainly of the hyponymy/hyperonymy, polysemy/metonymy, and acronym-abbreviation types. This new resource can enrich dictionaries and lexicons with new mentions to improve the functioning of modern entity normalization methods. The linguistic analysis offers insight into the sources of lexical variety in the Spanish clinical environment related to error generation using lexicon-based normalization methods. This article also introduces a workflow that can serve as a benchmark for comparison in studies replicating our analysis in Romance languages.
This study was supported by ANID Postdoctoral FONDECYT 3210395 and FONDECYT 11201250, Basal Funds for Center of Excellence FB210005 (CMM), Millennium Science Initiative Program ICN2021_004 (iHealth) and ICN17_002 (IMFD), and partly supported by the CLARA-MeD project (PID2020-116001RA-C33, funded by MICIU/AEI/10.13039/501100011033/ in call “Retos de investigación”)
Peer reviewed
Linguistic resources, Entity linking, Normalization, Lexical variation, Education, 370, Linguistic research, Medical lexicon, Clinical text, 400
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 48 |
| downloads | | 82 |

Views provided by UsageCounts
Downloads provided by UsageCounts