Automatic Annotation of Narrative Radiology Reports

Publication: Article, Other literature type. Published: 01 Apr 2020. Countries: Germany, Croatia. Language: English. Publisher: MDPI AG. Journal: Diagnostics, volume 10, page 196 (eISSN: 2075-4418).

Authors: Ivan Krsnik; Goran Glavaš; Marina Krsnik; Damir Miletić; Ivan Štajduhar;

doi: 10.3390/diagnostics10040196

pmid: 32244833

pmc: PMC7235892

Abstract

Narrative texts in electronic health records can be used efficiently to build clinical decision support systems only if they are correctly and automatically interpreted in accordance with a specified standard. This paper tackles the problem of developing an automated method for labeling free-form radiology reports, as a precursor to building query-capable report databases in hospitals. The analyzed dataset consists of 1295 radiology reports concerning the condition of a knee, retrospectively gathered at the Clinical Hospital Centre Rijeka, Croatia. Reports were manually labeled with one or more labels from a set of the 10 most commonly occurring clinical conditions. After primary preprocessing of the texts, two sets of text classification methods were compared: (1) traditional classification models, namely Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), and Random Forests (RF), coupled with Bag-of-Words (BoW) features (i.e., a symbolic text representation), and (2) a Convolutional Neural Network (CNN) coupled with dense word vectors (i.e., word embeddings as a semantic text representation) as input features. We used nested 10-fold cross-validation to evaluate the performance of the competing methods in terms of accuracy, precision, recall, and F1 score. The CNN with semantic word representations as input yielded the best overall performance, with a micro-averaged F1 score of 86.7%. The CNN classifier yielded particularly encouraging results for the most represented conditions: degenerative disease (95.9%), arthrosis (93.3%), and injury (89.2%). As a data-hungry deep learning model, however, the CNN performed notably worse than the competing models on underrepresented classes with fewer training instances, such as multicausal disease or metabolic disease. LR, RF, and SVM performed comparably well, with micro-averaged F1 scores of 84.6%, 82.2%, and 82.1%, respectively.
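To make the reported metric concrete, the following is a minimal sketch of how a micro-averaged F1 score can be computed for multi-label predictions, alongside per-label F1 scores. It is not the authors' evaluation code: the function names `micro_f1` and `per_label_f1` and the four toy reports are assumptions made purely for illustration, and only the label names are taken from the abstract.

```python
from typing import Dict, List, Set


def micro_f1(true_labels: List[Set[str]], pred_labels: List[Set[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all reports and labels."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not present
        fn += len(truth - pred)   # labels present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def per_label_f1(true_labels: List[Set[str]], pred_labels: List[Set[str]]) -> Dict[str, float]:
    """F1 computed separately for each label that occurs in the data."""
    labels = set().union(*true_labels, *pred_labels)
    scores = {}
    for label in labels:
        pairs = list(zip(true_labels, pred_labels))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical toy data: four reports, each with a set of condition labels.
truth = [
    {"injury"},
    {"arthrosis", "degenerative disease"},
    {"injury", "arthrosis"},
    {"metabolic disease"},
]
predicted = [
    {"injury"},
    {"arthrosis"},
    {"injury", "arthrosis"},
    {"injury"},
]

overall = micro_f1(truth, predicted)
by_label = per_label_f1(truth, predicted)
print(f"micro-averaged F1: {overall:.3f}")
for name in sorted(by_label):
    print(f"{name}: {by_label[name]:.3f}")
```

Because micro-averaging pools counts over all labels, frequent conditions dominate the overall score. In this toy example the rare label is missed entirely, yet the pooled score stays comparatively high, which is one reason per-class results are worth inspecting next to the micro-average.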

Countries

Germany, Croatia

Related Organizations

SVEUCILISTE U ZAGREBU FAKULTET ELEKTROTEHNIKE I RACUNARSTVA (Croatia)
University of Zagreb (Croatia)
University Hospital Centre Zagreb (Croatia)
University of Rijeka (Croatia)
University of Mannheim (Germany)

Keywords

automatic labeling, decision support system, free-form radiology report, knee, machine learning, natural language processing, word embedding, Medicine (General), R5-920, 004

Impact by BIP!

Selected citations: 9. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.