Introducing a Stable Bootstrap Validation Framework for Reliable Genomic Signature Extraction

Publication: Article, Other literature type. 01 Jan 2018. Germany. Publisher: Institute of Electrical and Electronics Engineers (IEEE). Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics, volume 15, pages 181-190 (ISSN: 1545-5963)

Authors: Nikolaos-Kosmas Chlis; Ekaterini S. Bei; Michalis E. Zervakis

doi: 10.1109/tcbb.2016.2633267

pmid: 27913357

Abstract

The application of machine learning methods to identify candidate genes responsible for phenotypes of interest, such as cancer, is a major challenge in bioinformatics. These lists of genes are often called genomic signatures, and linking them to phenotypes may be a significant step toward discovering causal relationships between genotypes and phenotypes. Traditional methods that produce genomic signatures from DNA microarray data tend to extract significantly different lists under relatively small variations of the training data. This instability undermines the validity of research findings and raises skepticism about the reliability of such methods. In this study, a complete framework for the extraction of stable and reliable lists of candidate genes is presented. The proposed methodology enforces stability of results at the validation step and, as a result, is independent of the feature selection and classification methods used. Furthermore, two different statistical tests are performed to assess the statistical significance of the observed results, and the consistency of the signatures extracted by independent executions of the proposed method is evaluated. The results of this study highlight the importance of stability issues in genomic signatures, beyond their prediction capabilities.
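To make the stability idea in the abstract concrete, the following is a minimal sketch of bootstrap-based signature selection on synthetic data. It is not the authors' framework: the mean-difference filter, the stratified resampling, the 0.8 selection-frequency threshold, the Jaccard consistency check, and all function names (`top_k_genes`, `bootstrap_signature`, `jaccard`, `make_toy_data`) are assumptions chosen for illustration only.

```python
import random
from collections import Counter


def top_k_genes(X, y, k):
    """Rank genes by absolute difference of class means (a stand-in filter, assumed)."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for g in range(len(X[0])):
        mean_pos = sum(r[g] for r in pos) / len(pos)
        mean_neg = sum(r[g] for r in neg) / len(neg)
        scores.append((abs(mean_pos - mean_neg), g))
    scores.sort(reverse=True)
    return {g for _, g in scores[:k]}


def bootstrap_signature(X, y, k=5, n_boot=200, min_freq=0.8, seed=0):
    """Keep genes selected in at least min_freq of the bootstrap resamples."""
    rng = random.Random(seed)
    # Resample within each class so every resample contains both classes.
    idx_by_class = {c: [i for i, label in enumerate(y) if label == c] for c in (0, 1)}
    counts = Counter()
    for _ in range(n_boot):
        sample = []
        for idx in idx_by_class.values():
            sample.extend(rng.choices(idx, k=len(idx)))
        Xb = [X[i] for i in sample]
        yb = [y[i] for i in sample]
        counts.update(top_k_genes(Xb, yb, k))
    signature = {g for g, c in counts.items() if c / n_boot >= min_freq}
    return signature, counts


def jaccard(a, b):
    """Overlap between two gene sets, from 0 (disjoint) to 1 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_toy_data(n_per_class=30, n_genes=50, informative=(0, 1, 2), shift=3.0, seed=42):
    """Synthetic expression matrix: Gaussian noise plus a shift on informative genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in informative:
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
# Two independent executions (different seeds) on the same data.
sig_a, counts_a = bootstrap_signature(X, y, seed=1)
sig_b, counts_b = bootstrap_signature(X, y, seed=2)
print(sorted(sig_a), sorted(sig_b), jaccard(sig_a, sig_b))
```

Because the selection-frequency threshold is applied to the resampling loop and not to the ranking function, `top_k_genes` could in principle be swapped for any other feature selection method without changing the rest of the sketch, which mirrors the method-independence claimed in the abstract.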

Country

Germany

Related Organizations

Technical University of Crete, Greece
Helmholtz Zentrum München, Germany

Keywords

Genome; Support Vector Machine (SVM); Gene Expression Profiling; Reproducibility of Results; Genomics; Bioinformatics; Classification; DNA Microarrays; Feature Selection; Machine Learning; Relevance Vector Machine (RVM); Databases, Genetic; Cluster Analysis; Humans; Oligonucleotide Array Sequence Analysis

Impact by BIP!

- Selected citations: 16 (derived from selected sources; an alternative to the "Influence" indicator)
- Popularity: Top 10% (the "current" impact or attention of an article in the research community at large, based on the underlying citation network)
- Influence: Average (the overall impact of an article in the research community at large, based on the underlying citation network, diachronically)
- Impulse: Top 10% (the initial momentum of an article directly after its publication, based on the underlying citation network)