Inclusion of individuals with diverse or admixed genetic ancestries is crucial for discovering novel findings that may be missed by genomic analyses rooted solely in Caucasian populations. Here, we present an analysis framework, SPAmix, which scales to large biobank data analyses including hundreds of thousands of admixed individuals and is universally applicable to various types of complex traits, including binary, quantitative, time-to-event, and longitudinal traits. For each genetic variant, SPAmix uses genotype data and genetic principal components (PCs) to estimate individual-level allele frequencies, which are subsequently used to calibrate p values via a retrospective analysis. A hybrid strategy including saddlepoint approximation (SPA) greatly increases accuracy when analyzing rare genetic variants, especially if the phenotypic distribution is unbalanced or extremely unbalanced. Unlike Tractor, SPAmix does not require local ancestry information and is straightforwardly applicable to multi-way admixed populations. SPAmix can also be extended to SPAmixlocal, which incorporates local ancestry when it is available. In addition, we propose SPAmixCCT, which combines the p values of SPAmix and SPAmixlocal via the Cauchy combination test (CCT). SPAmixlocal performs close to Tractor when analyzing quantitative traits and is more accurate when analyzing binary traits with an unbalanced case-control ratio, and SPAmixCCT is an optimal unified approach for various cross-ancestry genetic architectures. Extensive simulation studies and real data analyses of 369,314 UK Biobank individuals from multiple ancestries demonstrated that SPAmix is scalable and can discover novel hits while controlling type I error rates well.
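To make the combination step concrete, the following is a minimal sketch of a generic Cauchy combination of p values, as might be applied to the two p values from SPAmix and SPAmixlocal. It is not taken from the SPAmix software: the function name `cauchy_combination`, the equal default weights, and the input checks are assumptions made for illustration. The sketch uses only the standard form of the test, in which each p value is mapped to a Cauchy variate, the variates are averaged, and the average is mapped back to a p value.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p values with a Cauchy combination test.

    Hypothetical helper for illustration; equal weights are assumed
    unless weights are supplied. Weights are normalized to sum to 1.
    """
    if not p_values:
        raise ValueError("at least one p value is required")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    total_weight = sum(weights)
    statistic = 0.0
    for p, w in zip(p_values, weights):
        if not 0.0 < p < 1.0:
            raise ValueError("p values must lie strictly between 0 and 1")
        # Map each p value to a standard Cauchy variate and accumulate
        # the weighted average.
        statistic += (w / total_weight) * math.tan((0.5 - p) * math.pi)

    # Upper tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi


# Example: combine a p value from a global test with one from a
# local-ancestry-aware test (the values here are made up).
combined = cauchy_combination([1e-6, 0.5])
print(combined)
```

With equal weights, a single very small p value dominates the combined result, which is why this kind of combination can stay powerful when only one of the two component tests matches the underlying genetic architecture.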
GWAS; admixed population; retrospective association analysis; saddlepoint approximation; complex traits