ZENODO
Dataset . 2021
Data sources: Datacite
ZENODO
Dataset . 2021
Data sources: ZENODO


Histology images from uniform tumor regions in TCGA Whole Slide Images (TCGA-UT)

Authors: Daisuke Komura; Shumpei Ishikawa


Abstract

TCGA-UT Dataset Documentation

Quick Links

  • Dataset on Hugging Face: For users interested in benchmarking foundation models or feature extractors, please visit TCGA-UT on Hugging Face
  • Original Paper: Universal encoding of pan-cancer histology by deep texture representations

Dataset Overview

The TCGA-UT dataset is a large-scale collection of histopathological image patches from human cancer tissues. It contains 1,608,060 image patches extracted from hematoxylin & eosin (H&E) stained histological samples across 32 different types of solid cancers.

Key Features

  • Size: Over 1.6 million image patches
  • Resolution: All patches are standardized to 256 x 256 pixels
  • Source: Derived from The Cancer Genome Atlas (TCGA) dataset
  • Quality: Curated by trained pathologists
  • Coverage: 32 different cancer types
  • Patient Base: 7,175 patients from 8,736 diagnostic slides

Data Collection Process

  • Image Source: Whole Slide Images (WSI) were downloaded from the GDC legacy database between December 2016 and June 2017
  • Expert Annotation: Two trained pathologists selected at least three representative tumor regions per slide
  • Quality Control: 926 slides were removed due to various quality issues (poor staining, low resolution, focus problems, etc.)
  • Patch Extraction: 10 patches were randomly cropped at 6 different magnification levels from each annotated region

File Structure

Files are organized using the following format:

    [cancer_type]/[resolution]/[TCGA Barcode]/[region]-[number]-[pixel resolution].jpg

Resolution Key

  • 0: 0.5 μm/pixel
  • 1: 0.6 μm/pixel
  • 2: 0.7 μm/pixel
  • 3: 0.8 μm/pixel
  • 4: 0.9 μm/pixel
  • 5: 1.0 μm/pixel

License

  • Non-Commercial Use: CC-BY-NC-SA 4.0
  • Commercial Use: Please contact ishum-prm@m.u-tokyo.ac.jp for licensing

Citation

If you use this dataset in your research, please cite:

    Komura, D., et al. (2022). Universal encoding of pan-cancer histology by deep texture representations. Cell Reports 38, 110424. https://doi.org/10.1016/j.celrep.2022.110424

For Model Benchmarking

If you're interested in using this dataset for benchmarking foundation models or feature extractors, we recommend accessing the dataset through the Hugging Face Hub at dakomura/tcga-ut. The Hugging Face version provides:

  • Predefined train/validation/test splits (both internal and external facility-based splits)
  • Ready-to-use benchmarking framework for foundation models
  • WebDataset format support for efficient data loading
  • Example implementations for state-of-the-art model evaluation
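
The file layout and resolution key given in the abstract can be read programmatically. The following is a minimal Python sketch, not part of the dataset's own tooling: the names `PatchInfo` and `parse_patch_path` are hypothetical, and the example path is made up for illustration. It assumes that the file name always has exactly three hyphen-separated fields (region, number, pixel resolution) and keeps those fields as strings, because the abstract does not specify their types.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Resolution key from the dataset description:
# directory index -> micrometres per pixel.
RESOLUTION_UM_PER_PIXEL = {0: 0.5, 1: 0.6, 2: 0.7, 3: 0.8, 4: 0.9, 5: 1.0}


@dataclass(frozen=True)
class PatchInfo:
    """Fields recovered from one patch path (hypothetical helper type)."""
    cancer_type: str
    resolution_index: int
    um_per_pixel: float
    barcode: str
    region: str
    number: str
    pixel_resolution: str


def parse_patch_path(path: str) -> PatchInfo:
    """Split a path of the documented form
    [cancer_type]/[resolution]/[TCGA Barcode]/[region]-[number]-[pixel resolution].jpg
    into its components. Any leading directories are ignored."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 path components: {path!r}")
    cancer_type, resolution, barcode, filename = parts[-4:]

    if not filename.lower().endswith(".jpg"):
        raise ValueError(f"expected a .jpg file: {filename!r}")
    stem = filename[: -len(".jpg")]

    # Assumption: exactly three hyphen-separated fields in the file name.
    fields = stem.split("-")
    if len(fields) != 3:
        raise ValueError(f"expected region-number-resolution: {filename!r}")
    region, number, pixel_resolution = fields

    index = int(resolution)  # raises ValueError if not an integer
    if index not in RESOLUTION_UM_PER_PIXEL:
        raise ValueError(f"unknown resolution index: {index}")

    return PatchInfo(
        cancer_type=cancer_type,
        resolution_index=index,
        um_per_pixel=RESOLUTION_UM_PER_PIXEL[index],
        barcode=barcode,
        region=region,
        number=number,
        pixel_resolution=pixel_resolution,
    )


if __name__ == "__main__":
    # Made-up path for illustration only; real directory names may differ.
    example = "Hypothetical_type/3/TCGA-XX-XXXX-01Z-00-DX1/0-7-256.jpg"
    print(parse_patch_path(example))
```

Because the TCGA barcode itself contains hyphens, the sketch splits on `/` first and only then splits the file name on `-`, so the barcode is never mistaken for part of the region or patch number.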

Keywords

histopathology, cancer

  • BIP!
    Impact by BIP!
    citations: 2
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
Related to Research communities
Cancer Research