ZENODO
Dataset . 2024
License: CC BY
Data sources: Datacite

Expert and AI-generated annotations of the tissue types for the RMS-Mutation-Prediction microscopy images

Authors: Bridge, Christopher; Brown, G. Thomas; Jung, Hyun; Lisle, Curtis; Clunie, David; Milewski, David; Liu, Yanling; +6 Authors

Abstract

This dataset corresponds to a collection of images and/or image-derived data available from the National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using the IDC Portal here: https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

Collection description

This dataset contains 2 components:

  • Annotations of multiple regions of interest performed by an expert pathologist with eight years of experience for a subset of hematoxylin and eosin (H&E) stained images from the RMS-Mutation-Prediction image collection [1,2]. Annotations were generated manually, using the Aperio ImageScope tool, to delineate regions of alveolar rhabdomyosarcoma (ARMS), embryonal rhabdomyosarcoma (ERMS), stroma, and necrosis [3]. The resulting planar contour annotations were originally stored in ImageScope-specific XML format and subsequently converted into Digital Imaging and Communications in Medicine (DICOM) Structured Report (SR) representation using the open source conversion tool [4].
  • AI-generated annotations stored as probabilistic segmentations.

WARNING: After the release of v20, it was discovered that a mistake had been made during data conversion that affected the newly released segmentations accompanying the "RMS-Mutation-Prediction" collection. Segmentations released in v20 for this collection have the segment labels for alveolar rhabdomyosarcoma (ARMS) and embryonal rhabdomyosarcoma (ERMS) switched in the metadata relative to the correct labels.
Thus segment 3 in the released files is labelled in the metadata (the SegmentSequence) as ARMS but should correctly be interpreted as ERMS, and conversely segment 4 in the released files is labelled as ERMS but should correctly be interpreted as ARMS. We apologize for the mistake and any confusion that it has caused, and will be releasing a corrected version of the files in the next release as soon as possible.

Many pixels from the whole slide images annotated by this dataset are not contained inside any annotation contours and are considered to belong to the background class. Other pixels are contained inside only one annotation contour and are assigned to a single class. However, cases also exist in this dataset where annotation contours overlap. In these cases, the pixels contained in multiple contours could be assigned membership in multiple classes. One example is a necrotic tissue contour overlapping an internal subregion of an area designated by a larger ARMS or ERMS annotation.

The ordering of annotations in this DICOM dataset preserves the order in the original XML generated using ImageScope. These annotations were converted, in sequence, into segmentation masks and used in the training of several machine learning models. Details on the training methods and model results are presented in [1]. In the case of overlapping contours, the order in which annotations are processed may affect the generated segmentation mask if prior contours are overwritten by later contours in the sequence. It is up to the application consuming this data to decide how to interpret tissue regions annotated with multiple classes.

The annotations included in this dataset are available for visualization and exploration from the National Cancer Institute Imaging Data Commons (IDC) [5] (also see IDC Portal at https://imaging.datacommons.cancer.gov) as of data release v18.
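Two points above may be easier to follow with a small sketch: the v20 label swap for segments 3 and 4, and the effect of processing order on overlapping contours. The Python below is only an illustration under stated assumptions. All function and variable names are hypothetical, contours are simplified to axis-aligned rectangles given as (row_start, row_stop, col_start, col_stop), and no DICOM files are read. It is not the conversion or training code used for this dataset.

```python
# Part 1: v20 label swap.
# Segment numbers and labels are taken from the warning above; the names
# below are hypothetical. Apply this only to files from the v20 release.
V20_CORRECT_LABELS = {3: "ERMS", 4: "ARMS"}


def corrected_label(segment_number, stored_label):
    """Return the label to use for a segment read from a v20 file.

    Segments 3 and 4 are swapped; any other segment keeps its stored label.
    """
    return V20_CORRECT_LABELS.get(segment_number, stored_label)


# Part 2: overlapping contours.
# Rectangles stand in for real contours (a simplifying assumption).
def paint_in_order(shape, regions):
    """Single-label mask: later regions overwrite earlier ones."""
    rows, cols = shape
    mask = [["background"] * cols for _ in range(rows)]
    for label, (r0, r1, c0, c1) in regions:
        for r in range(r0, r1):
            for c in range(c0, c1):
                mask[r][c] = label
    return mask


def collect_all_labels(shape, regions):
    """Multi-label mask: each pixel keeps every class that covers it."""
    rows, cols = shape
    mask = [[set() for _ in range(cols)] for _ in range(rows)]
    for label, (r0, r1, c0, c1) in regions:
        for r in range(r0, r1):
            for c in range(c0, c1):
                mask[r][c].add(label)
    return mask


# A necrosis region nested inside a larger ARMS region, as in the example
# given in the text.
example_regions = [("ARMS", (0, 4, 0, 4)), ("necrosis", (1, 3, 1, 3))]
```

With `example_regions`, `paint_in_order` assigns the nested pixels to necrosis, while reversing the list assigns them to ARMS. `collect_all_labels` keeps both classes for those pixels, which leaves the interpretation to the consuming application.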
Direct link to open the collection in IDC Portal: https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations.

Files included

A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, pan_cancer_nuclei_seg_dicom-collection_id-idc_v19-aws.s5cmd corresponds to the annotations for the images in the collection_id collection introduced in IDC data release v19. DICOM Binary segmentations were introduced in IDC v20. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

For each of the collections, the following manifest files are provided:

  • rms_mutation_prediction_expert_annotations-idc_v20-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets
  • rms_mutation_prediction_expert_annotations-idc_v20-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets
  • rms_mutation_prediction_expert_annotations-idc_v20-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while those that end in -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.

Download instructions

Each of the manifests includes instructions in the header on how to download the included files.

To download the files using .s5cmd manifests:

  • install the idc-index package: pip install --upgrade idc-index
  • download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd

To download the files using the .dcf manifest, see the manifest header.
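As a rough illustration of the naming convention and download steps described above, the following Python sketch parses a manifest file name and assembles the idc download command quoted in the instructions. The regular expression, function names, and dictionary keys are assumptions made for this example and are not part of idc-index. The pattern is inferred only from the manifest names listed in this record, not from an official specification.

```python
import os
import re
import subprocess

# Pattern inferred from the manifest names listed in this record
# (an assumption): <collection>-idc_v<release>-<target>.<extension>
MANIFEST_NAME = re.compile(
    r"^(?P<collection>.+)-idc_v(?P<release>\d+)-(?P<target>aws|gcs|dcf)"
    r"\.(?P<ext>s5cmd|dcf)$"
)


def parse_manifest_name(path):
    """Split a manifest file name into collection, IDC release, and target."""
    name = os.path.basename(path)
    match = MANIFEST_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized manifest name: {name}")
    return {
        "collection": match.group("collection"),
        "release": int(match.group("release")),
        "target": match.group("target"),
    }


def download_command(path):
    """Build the command from the download instructions for .s5cmd manifests."""
    info = parse_manifest_name(path)
    if info["target"] == "dcf":
        # The record defers to the manifest header for .dcf manifests.
        raise ValueError("for .dcf manifests, follow the manifest header")
    return ["idc", "download", path]


def download(path):
    """Run the download; assumes idc-index was installed with pip beforehand."""
    subprocess.run(download_command(path), check=True)
```

Calling `download("rms_mutation_prediction_expert_annotations-idc_v20-aws.s5cmd")` would run the same command as the instructions, using the AWS mirror. Swapping in the -gcs.s5cmd manifest should fetch identical files from Google Cloud Storage.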
Acknowledgments

Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

If you use the files referenced in the attached manifests, we ask you to cite this dataset, as well as the publication describing the original dataset [2] and publication acknowledging IDC [5].

References

[1] D. Milewski et al., "Predicting molecular subtype and survival of rhabdomyosarcoma patients using deep learning of H&E images: A report from the Children's Oncology Group," Clin. Cancer Res., vol. 29, no. 2, pp. 364–378, Jan. 2023, doi: 10.1158/1078-0432.CCR-22-1663.

[2] Clunie, D., Khan, J., Milewski, D., Jung, H., Bowen, J., Lisle, C., Brown, T., Liu, Y., Collins, J., Linardic, C. M., Hawkins, D. S., Venkatramani, R., Clifford, W., Pot, D., Wagner, U., Farahani, K., Kim, E., & Fedorov, A. (2023). DICOM converted whole slide hematoxylin and eosin images of rhabdomyosarcoma from Children's Oncology Group trials [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8225132

[3] Agaram NP. Evolving classification of rhabdomyosarcoma. Histopathology. 2022 Jan;80(1):98-108. doi: 10.1111/his.14449. PMID: 34958505; PMCID: PMC9425116, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9425116/

[4] Chris Bridge. (2024). ImagingDataCommons/idc-sm-annotations-conversion: v1.0.0 (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.10632182

[5] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W. L., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence. Radiographics 43, (2023).

Related to Research communities
Cancer Research