publication . Article . Preprint . Other literature type . 2018

A practical guide to methods controlling false discoveries in computational biology

Ayshwarya Subramanian; Keegan Korthauer; Alejandro Reyes; Eric J. Alm; Mingxiang Teng; Claire Duvallet; Stephanie C. Hicks; Chinmay J. Shukla; Patrick K. Kimes
Open Access
  • Published: 31 Oct 2018
  • Journal: Genome Biology, volume 20 (eISSN: 1474-760X)
  • Publisher: Springer Science and Business Media LLC
Abstract
Background: In high-throughput studies, hundreds to millions of hypotheses are typically tested. Statistical methods that control the false discovery rate (FDR) have emerged as popular and powerful tools for error rate control. While classic FDR methods use only p-values as input, more modern FDR methods have been shown to increase power by incorporating complementary information as “informative covariates” to prioritize, weight, and group hypotheses. However, there is currently no consensus on how the modern methods compare to one another. We investigated the ac...
Subjects
free text keywords: Research, Multiple hypothesis testing, RNA-seq, ScRNA-seq, ChIP-seq, Microbiome, GWAS, Biology (General), QH301-705.5, Genetics, QH426-470, Computational biology, Covariate, False discovery rate, Statistical hypothesis testing, Gene set analysis, Usability, business.industry, business, Word error rate, Multiple comparisons problem, Biology, Computer science
Funded by
NIH| MOFFITT CANCER CENTER SUPPORT GRANT
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 3P30CA076292-08S4
  • Funding stream: NATIONAL CANCER INSTITUTE
NIH| MOLECULAR ANALYSIS OF THE YEAST HEAT SHOCK TRANSCRIPTION
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 1F32GM013039-01
  • Funding stream: NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
NIH| Overcoming bias and unwanted variability in next generation sequencing
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5R01HG005220-08
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE
NIH| Software for the statistical analysis of microarray probe level data
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 8R01GM103552-05
  • Funding stream: NATIONAL INSTITUTE OF GENERAL MEDICAL SCIENCES
NIH| Bioconductor: An Open Computing Resource for Genomics
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5U41HG004059-10
  • Funding stream: NATIONAL HUMAN GENOME RESEARCH INSTITUTE