
AbstractMachine learning (ML) is revolutionizing image-based diagnostics in pathology and radiology. ML models have shown promising results in research settings, but the lack of interoperability between ML systems and enterprise medical imaging systems has been a major barrier for clinical integration and evaluation. The DICOM® standard specifies information object definitions (IODs) and services for the representation and communication of digital images and related information, including image-derived annotations and analysis results. However, the complexity of the standard represents an obstacle for its adoption in the ML community and creates a need for software libraries and tools that simplify working with datasets in DICOM format. Here we present the highdicom library, which provides a high-level application programming interface (API) for the Python programming language that abstracts low-level details of the standard and enables encoding and decoding of image-derived information in DICOM format in a few lines of Python code. The highdicom library leverages NumPy arrays for efficient data representation and ties into the extensive Python ecosystem for image processing and machine learning. Simultaneously, by simplifying creation and parsing of DICOM-compliant files, highdicom achieves interoperability with the medical imaging systems that hold the data used to train and run ML models, and ultimately communicate and store model outputs for clinical use. We demonstrate through experiments with slide microscopy and computed tomography imaging, that, by bridging these two ecosystems, highdicom enables developers and researchers to train and evaluate state-of-the-art ML models in pathology and radiology while remaining compliant with the DICOM standard and interoperable with clinical systems at all stages. To promote standardization of ML research and streamline the ML model development and deployment process, we made the library available free and open-source at https://github.com/herrmannlab/highdicom.
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Methods Paper, Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, Machine Learning (cs.LG), Machine Learning, Radiology Information Systems, FOS: Electrical engineering, electronic engineering, information engineering, Humans, Radiology, Tomography, X-Ray Computed, Ecosystem, Data Curation
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Methods Paper, Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, Electrical Engineering and Systems Science - Image and Video Processing, Machine Learning (cs.LG), Machine Learning, Radiology Information Systems, FOS: Electrical engineering, electronic engineering, information engineering, Humans, Radiology, Tomography, X-Ray Computed, Ecosystem, Data Curation
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 15 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
