publication . Article . 2017

Two‐phase designs for joint quantitative‐trait‐dependent and genotype‐dependent sampling in post‐GWAS regional sequencing

Espin‐Garcia, Osvaldo; Craiu, Radu V.; Bull, Shelley B.
Open Access English
  • Published: 01 Dec 2017; Journal: Genetic Epidemiology, volume 42, issue 1, pages 104-116 (ISSN: 0741-0395, eISSN: 1098-2272)
  • Publisher: John Wiley and Sons Inc.
Abstract
We evaluate two‐phase designs for following up findings from a genome‐wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation‐maximization‐based inference under a semiparametric maximum likelihood formulation tailored for post‐GWAS inference. A GWAS‐SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT‐SNP‐dependent sampling and analysis under alternative sample allocations by simulati...
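As an illustration of what joint QT‐SNP‐dependent sampling can look like, the sketch below selects a phase 2 (sequencing) subsample from a phase 1 cohort by crossing QT strata with GWAS‐SNP genotype. It is a minimal sketch under stated assumptions, not the authors' design: the three QT strata (lower tail, middle, upper tail), the `tail_frac` cutoff, the equal per‐cell allocation, and the function names `qt_snp_strata` and `select_phase2` are all hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def qt_snp_strata(qt, snp, tail_frac=0.2):
    """Group subject indices into (QT stratum, SNP genotype) cells.

    Assumption: QT strata are the lower tail, middle, and upper tail,
    with each tail holding roughly `tail_frac` of the cohort.
    """
    ranked = sorted(qt)
    n = len(ranked)
    k = max(1, int(tail_frac * n))
    low_cut = ranked[k - 1]   # largest QT value in the lower tail
    high_cut = ranked[n - k]  # smallest QT value in the upper tail
    strata = defaultdict(list)
    for i, (y, g) in enumerate(zip(qt, snp)):
        if y <= low_cut:
            tail = "low"
        elif y >= high_cut:
            tail = "high"
        else:
            tail = "mid"
        strata[(tail, g)].append(i)
    return strata


def select_phase2(qt, snp, n_per_stratum, tail_frac=0.2, seed=0):
    """Draw up to `n_per_stratum` subjects from every QT-by-SNP cell.

    Assumption: a balanced allocation across cells; other allocations
    would change only the per-cell sample sizes.
    """
    rng = random.Random(seed)
    strata = qt_snp_strata(qt, snp, tail_frac)
    selected = []
    for cell in sorted(strata):
        members = strata[cell]
        take = min(n_per_stratum, len(members))
        selected.extend(rng.sample(members, take))
    return sorted(selected)


if __name__ == "__main__":
    # Simulated phase 1 cohort: genotype coded 0/1/2, QT normal given genotype.
    sim = random.Random(1)
    snp = [sim.choice([0, 1, 2]) for _ in range(300)]
    qt = [sim.gauss(0.3 * g, 1.0) for g in snp]
    chosen = select_phase2(qt, snp, n_per_stratum=5, seed=42)
    print(len(chosen), "subjects selected for sequencing")
```

Only the selected subjects would be sequenced; the QT and GWAS‐SNP remain available for the whole cohort, which is what makes the sampling depend on both jointly.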
Subjects
free text keywords: fine‐mapping, Genetic Analysis Workshop 19, genetic association studies, joint outcome covariate dependent sampling, outcome‐/covariate‐dependent sampling
Funded by
NIH| Identifying T2D Variants by DNA Sequencing in Multiethnic Samples
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 1U01DK085584-01
  • Funding stream: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES
CIHR
Project
  • Funder: Canadian Institutes of Health Research (CIHR)
NIH| Multiethnic Study of Type 2 Diabetes Genes
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5U01DK085526-05
  • Funding stream: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES
NIH| Identifying variants causal for Type 2 Diabetes in Major human populations
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5U01DK085545-02
  • Funding stream: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES
NIH| GENETICS OF GALLBLADDER DISEASE IN MEXICAN AMERICANS
Project
  • Funder: National Institutes of Health (NIH)
  • Project Code: 5R01DK053889-04
  • Funding stream: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES