Powered by OpenAIRE graph
ZENODO
Article . 2019
License: CC BY
Data sources: Datacite

Biological datasets for SMBA

Authors: Nardone, Davide;

Abstract

The following is a brief description of all datasets employed in the experiments.

1. The ALLAML dataset contains 72 samples in 2 classes, ALL and AML, which have 47 and 25 samples, respectively. Every sample contains 7,129 gene expression values.
2. The LEUKEMIA dataset contains 72 samples in 2 classes: acute lymphoblastic and acute myeloid. Of the 7,129 genes, the baseline genes were cut off before further analysis, leaving 7,070 genes for the classification task.
3. The CLL_SUB_111 dataset contains gene expressions from high-density oligonucleotide arrays covering genetically and clinically distinct subgroups of B-cell chronic lymphocytic leukemia (B-CLL). It consists of 11,340 attributes, 111 instances and 3 classes.
4. The GLIOMA dataset contains 50 samples in 4 classes: cancer glioblastomas, non-cancer glioblastomas, cancer oligodendrogliomas and non-cancer oligodendrogliomas, which have 14, 14, 7 and 15 samples, respectively. Each sample originally has 12,625 genes. After preprocessing, the dataset has 50 samples and 4,433 genes.
5. The LUNG dataset contains 203 samples in 5 classes: adenocarcinomas, squamous cell lung carcinomas, pulmonary carcinoids, small-cell lung carcinomas and normal lung, with 139, 21, 20, 6 and 17 samples, respectively. Genes with standard deviations smaller than 50 expression units were removed, yielding a dataset with 203 samples and 3,312 genes.
6. The LUNG_DISCRETE dataset contains 73 samples in 7 classes, where each sample consists of 325 gene expressions. The class cardinalities are 6, 5, 5, 16, 7, 13 and 21, respectively.
7. The DLBCL dataset is a modified version of the original DLBCL dataset. It consists of 96 samples in 9 classes, where each sample is defined by the expression of 4,026 genes. The class cardinalities are 46, 10, 9, 11, 6, 6, 4, 2 and 2, respectively.
8. The CARCINOM dataset contains 174 samples in 11 classes: prostate, bladder/ureter, breast, colorectal, gastroesophagus, kidney, liver, ovary, pancreas, lung adenocarcinomas and lung squamous cell carcinoma, with 26, 8, 26, 23, 12, 11, 7, 27, 6, 14 and 14 samples, respectively. After preprocessing, the dataset has 174 samples and 9,182 genes.
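As a quick sanity check, the per-class sample counts given in the abstract can be summed and compared with the stated totals. The sketch below is illustrative only: the dictionary layout and the `summarize` helper are assumptions introduced here, not part of the published record, and only the datasets whose class sizes are listed in the abstract are included.

```python
# Class sizes copied from the abstract. The names CLASS_SIZES,
# STATED_SAMPLES and summarize are hypothetical, chosen for this sketch.
CLASS_SIZES = {
    "ALLAML": [47, 25],
    "GLIOMA": [14, 14, 7, 15],
    "LUNG": [139, 21, 20, 6, 17],
    "LUNG_DISCRETE": [6, 5, 5, 16, 7, 13, 21],
    "DLBCL": [46, 10, 9, 11, 6, 6, 4, 2, 2],
    "CARCINOM": [26, 8, 26, 23, 12, 11, 7, 27, 6, 14, 14],
}

# Total sample counts as stated in the abstract.
STATED_SAMPLES = {
    "ALLAML": 72,
    "GLIOMA": 50,
    "LUNG": 203,
    "LUNG_DISCRETE": 73,
    "DLBCL": 96,
    "CARCINOM": 174,
}


def summarize(name):
    """Return class count, sample count and a consistency flag for one dataset."""
    sizes = CLASS_SIZES[name]
    total = sum(sizes)
    return {
        "classes": len(sizes),
        "samples": total,
        "matches_stated_total": total == STATED_SAMPLES[name],
        # Share of the largest class, a rough indicator of class imbalance.
        "largest_class_share": max(sizes) / total,
    }


if __name__ == "__main__":
    for name in CLASS_SIZES:
        print(name, summarize(name))
```

The `largest_class_share` field is included because several of these datasets are visibly imbalanced (for example LUNG, where one class holds 139 of 203 samples), which matters when interpreting classification accuracy.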

Keywords

Biological, DNA, expression, genes

  • Impact by BIP!
    selected citations: 2
    popularity: Average
    influence: Average
    impulse: Average
  • Usage by OpenAIRE UsageCounts
    views: 129
    downloads: 53
Green
Related to Research communities
Cancer Research