Highdicom: a Python Library for Standardized Encoding of Image Annotations and Machine Learning Model Outputs in Pathology and Radiology

Publication: Article, Other literature type, Preprint

Date: 22 Aug 2022 (embargo end date: 01 Jan 2021)

Language: English

Publisher: Springer Science and Business Media LLC

Journal: Journal of Digital Imaging, volume 35, pages 1719-1737 (ISSN: 0897-1889, eISSN: 1618-727X)

Funded by: NIH | Neuroimaging Analysis Cen..., NIH | Advanced Technologies - N..., NIH | Lymph Node Quantification... +3 projects

Authors: Christopher P. Bridge; Chris Gorman; Steven D. Pieper; Sean W. Doyle; Jochen K. Lennerz; Jayashree Kalpathy-Cramer; David A. Clunie; +2 Authors

doi: 10.1007/s10278-022-00683-y , 10.48550/arxiv.2106.07806

pmid: 35995898

pmc: PMC9712874

arXiv: 2106.07806

Abstract

Machine learning (ML) is revolutionizing image-based diagnostics in pathology and radiology. ML models have shown promising results in research settings, but the lack of interoperability between ML systems and enterprise medical imaging systems has been a major barrier for clinical integration and evaluation. The DICOM® standard specifies information object definitions (IODs) and services for the representation and communication of digital images and related information, including image-derived annotations and analysis results. However, the complexity of the standard represents an obstacle for its adoption in the ML community and creates a need for software libraries and tools that simplify working with datasets in DICOM format. Here we present the highdicom library, which provides a high-level application programming interface (API) for the Python programming language that abstracts low-level details of the standard and enables encoding and decoding of image-derived information in DICOM format in a few lines of Python code. The highdicom library leverages NumPy arrays for efficient data representation and ties into the extensive Python ecosystem for image processing and machine learning. Simultaneously, by simplifying creation and parsing of DICOM-compliant files, highdicom achieves interoperability with the medical imaging systems that hold the data used to train and run ML models, and ultimately communicate and store model outputs for clinical use. We demonstrate through experiments with slide microscopy and computed tomography imaging that, by bridging these two ecosystems, highdicom enables developers and researchers to train and evaluate state-of-the-art ML models in pathology and radiology while remaining compliant with the DICOM standard and interoperable with clinical systems at all stages.
To promote standardization of ML research and streamline the ML model development and deployment process, we made the library available free and open-source at https://github.com/herrmannlab/highdicom.

Related Organizations

Harvard University (United States)
Massachusetts General Hospital (United States)
Harvard Medical School (United States)
Brigham and Women's Hospital (United States)

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Methods Paper, Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, Machine Learning (cs.LG), Machine Learning, Radiology Information Systems, FOS: Electrical engineering, electronic engineering, information engineering, Humans, Radiology, Tomography, X-Ray Computed, Ecosystem, Data Curation

Related research (8 research products)

- Sample "LABELMAP" DICOM Segmentation Files (2025): IsSourceOf
- dicomweb-client software on GitHub: IsRelatedTo
- dcm4chee-arc-light software on GitHub: IsRelatedTo
- highdicom software on GitHub: IsRelatedTo
- slim software on GitHub: IsRelatedTo
- lidc2dicom software on GitHub: IsRelatedTo
- viewerjs software on GitHub: IsRelatedTo
- slim software on GitHub: IsRelatedTo

Impact by BIP!

- selected citations: 15. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.