Downloads provided by UsageCounts
doi: 10.5061/dryad.pv85m
The categorization of intraductal proliferative lesions of the breast based on routine light microscopic examination of histopathologic sections is in many cases challenging, even for experienced pathologists. The development of computational tools to aid pathologists in the characterization of these lesions would have great diagnostic and clinical value. As a first step to address this issue, we evaluated the ability of computational image analysis to accurately classify ductal carcinoma in situ (DCIS) and usual ductal hyperplasia (UDH) and to stratify nuclear grade within DCIS. Using 116 breast biopsies diagnosed as DCIS or UDH from the Massachusetts General Hospital (MGH), we developed a computational method to extract 392 features corresponding to the mean and standard deviation in nuclear size and shape, intensity, and texture across 8 color channels. We used L1-regularized logistic regression to build classification models to discriminate DCIS from UDH. The top-performing model contained 22 active features and achieved an AUC of 0.95 in cross-validation on the MGH data set. We applied this model to an external validation set of 51 breast biopsies diagnosed as DCIS or UDH from the Beth Israel Deaconess Medical Center (BIDMC), and the model achieved an AUC of 0.86. The top-performing model contained active features from all color spaces and from the three classes of features (morphology, intensity, and texture), suggesting the value of each for prediction. We built models to stratify grade within DCIS and obtained strong performance for stratifying low nuclear grade vs. high nuclear grade DCIS (AUC = 0.98 in cross-validation), with only moderate performance for discriminating low nuclear grade vs. intermediate nuclear grade and intermediate nuclear grade vs. high nuclear grade DCIS (AUC = 0.83 and 0.69, respectively).
These data show that computational pathology models can robustly discriminate benign from malignant intraductal proliferative lesions of the breast and may aid pathologists in the diagnosis and classification of these lesions.
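As a rough illustration of two quantities the abstract relies on, the sketch below collapses per-nucleus measurements into per-image mean and standard-deviation features and computes a rank-based AUC from classifier scores. It is not the authors' Analysis.R or C++ code: the function names, the feature names, the use of the sample standard deviation, the label coding (1 = DCIS, 0 = UDH), and the toy numbers are all assumptions made for this example.

```python
from statistics import mean, stdev


def aggregate_nuclei(nuclei):
    """Collapse per-nucleus measurements into per-image mean/SD features.

    `nuclei` is a list of dicts, one per segmented nucleus, all sharing the
    same (hypothetical) feature names.
    """
    features = {}
    for name in sorted(nuclei[0]):
        values = [nucleus[name] for nucleus in nuclei]
        features[name + "_mean"] = mean(values)
        # Sample SD needs at least two nuclei; fall back to 0.0 otherwise.
        features[name + "_sd"] = stdev(values) if len(values) > 1 else 0.0
    return features


def auc(scores, labels):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive case receives the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two nuclei from one image.
nuclei = [
    {"area": 40.0, "intensity": 0.50},
    {"area": 60.0, "intensity": 0.70},
]
image_features = aggregate_nuclei(nuclei)

# Hypothetical model scores for five cases (assumed coding: 1 = DCIS, 0 = UDH).
scores = [0.9, 0.8, 0.4, 0.3, 0.35]
labels = [1, 1, 0, 0, 1]
example_auc = auc(scores, labels)
```

In this toy case one DCIS score (0.35) falls below one UDH score (0.4), so five of the six positive/negative pairs are ordered correctly. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.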
Data: Original Images. This contains two data sets (MGH - Training and BIDMC - Evaluation). File: OriginalImages.zip
Data: Segmented Images. This contains nuclei-segmented images. File: SegmentedImages.zip
Data: Original and Segmented Images. This contains both original and segmented images. File: CombinedImages.rar
Fig: Analysis Figures. This contains analysis figures that describe the framework's performance on the data sets. File: Figures.rar
Code: Texture Features Computation. This is C++ code that computes intensity and texture features of each segmented nucleus. It requires ITK version 4 or above and the Boost library to compile and run. It also requires a library for color transformation into different color spaces, available at https://github.com/midas-journal/midas-journal-780.git. File: TextureFeaturesComputation.rar
Code: Nuclei Segmentation & Morphological Features. This contains a Fiji (ImageJ) macro that segments nuclei and computes morphological features. File: NucleiSegmentation_MorphologicalFeatures.ijm
File: Selected Features List. This file contains a list of selected features. File: _SelectedFeatures.csv
File: Computed Features with Class Label. This file contains all computed features, including morphological, intensity, and texture features, with class labels. File: ComputedFeatures_Label.csv
File: Breast Cancer Cases (UDH & DCIS). This file contains a list of all cases with the clinical data that were used for class labelling. File: BreastCancerCases_UDH_DCIS_167.xls
Code: Analysis in R. This code takes the computed features as input and generates analysis figures as output that describe the framework's performance. File: Analysis.R
Breast cancer, ductal carcinoma in situ, pathology informatics, usual ductal hyperplasia, computational pathology
| selected citations: These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity: This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence: This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse: This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | 44 |
| downloads | 19 |

Views provided by UsageCounts