Multi-ancestry transcriptome prediction with functionally informed variants in TOPMed MESA improves performance of transcriptome-wide association studies

This Zenodo file collection includes three transcriptome prediction models built with functionally informed variants (FIVs). The models were trained on TOPMed MESA multi-ancestry participants, using TOPMed Freeze 8 whole-genome sequencing (WGS) data and RNA-seq data from peripheral blood mononuclear cells (PBMCs). They can be used for transcriptome-wide association study (TWAS) analysis by integrating them with GWAS summary statistics.

EN-FM: Elastic Net with Fine-Mapped variants. We first used the code "SuSiE_fine_mapping.R" to perform SuSiE fine-mapping (PMID:37220626) for each gene on 1,287 TOPMed MESA multi-ancestry participants to obtain fine-mapped variants. We then built EN models on the fine-mapped variants using "EN-FM_model.R". Examples of input data for the models are provided at https://github.com/hakyimlab/PredictDB-Tutorial.

PUMICE: Prediction Using Models Informed by Chromatin conformation and Epigenomics (PMID:35672318). 3D genomic data and epigenomic annotations from EBV-transformed lymphocytes were used to construct the PUMICE models. We built the PUMICE models by following the code provided in the PUMICE GitHub repository (https://github.com/ckhunsr1/PUMICE). Specifically, we first ran "PUMICE_nested_cv.sh" to find the optimal parameter values for each gene, and then ran "PUMICE_compute_weights.sh" to obtain the weights of the SNPs included in each gene's model. Before using this code, follow the instructions on the PUMICE GitHub repository to install the required R packages. Examples of input data for the models are provided at https://github.com/ckhunsr1/PUMICE/blob/master/examples/example_input.zip.

PUMICE-FM: PUMICE with Fine-Mapped variants. The PUMICE-FM model is a variation of the PUMICE model that replaces the epigenomic annotations with fine-mapping data and the 3D genome windows with a constant window size (e.g., 250 kb).
The fine-mapped variants used for the PUMICE-FM models are the same as those obtained from the SuSiE fine-mapping described above. The procedure for building PUMICE-FM models is similar to that for PUMICE models. Specifically, we first ran "PUMICE-FM_nested_cv.sh" to find the optimal parameter values for each gene, and then ran "PUMICE-FM_compute_weights.sh" to obtain the weights of the SNPs included in each gene's model.

TWAS: We applied the Summary-PrediXcan (S-PrediXcan, PMID:29739930) pipeline to integrate our prediction models with publicly available GWAS summary statistics and obtain TWAS results. Before running our TWAS code, please follow the S-PrediXcan tutorial (https://github.com/hakyimlab/MetaXcan/wiki/S-PrediXcan-Command-Line-Tutorial) to install the required software. We first ran "TWAS.sh" to obtain TWAS results for each GWAS trait via the S-PrediXcan framework. We then applied "postTWAS.R" for follow-up analyses of the TWAS results, including correcting TWAS inflation and applying the omnibus approach.

Figures: We provide the code "Figures.R" for generating our main figures.

The code files mentioned above and the model files are provided below. The manuscript is accepted in principle at AJHG. The preprint is available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5194962.
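
For readers unfamiliar with how the TWAS step described above combines prediction-model weights with GWAS summary statistics, the following Python sketch illustrates the gene-level z-score approximation described in the S-PrediXcan paper (PMID:29739930): each SNP's GWAS z-score is weighted by its model weight and its reference-panel standard deviation, and the sum is scaled by the standard deviation of the predicted expression. The sketch is not part of the deposited code and does not reproduce "TWAS.sh". The function name, argument names, and toy numbers are illustrative assumptions.

```python
import math


def spredixcan_zscore(weights, gwas_z, snp_cov):
    """Approximate a gene-level TWAS z-score from SNP-level GWAS z-scores.

    weights : list of prediction-model weights, one per SNP in the gene model
    gwas_z  : list of GWAS z-scores (beta / se), aligned to the same effect alleles
    snp_cov : square list-of-lists, SNP dosage covariance from a reference panel

    All argument names are hypothetical; real S-PrediXcan reads these
    quantities from a model database, a covariance file, and GWAS files.
    """
    m = len(weights)

    # Per-SNP standard deviation is the square root of the covariance diagonal.
    snp_sd = [math.sqrt(snp_cov[i][i]) for i in range(m)]

    # Variance of predicted expression: w' * Cov * w.
    gene_var = sum(
        weights[i] * snp_cov[i][j] * weights[j]
        for i in range(m)
        for j in range(m)
    )
    if gene_var <= 0:
        # No usable prediction (e.g., all weights are zero).
        return float("nan")

    # Z_g ~ sum_l w_l * (sd_l / sd_g) * Z_l
    numerator = sum(w * sd * z for w, sd, z in zip(weights, snp_sd, gwas_z))
    return numerator / math.sqrt(gene_var)


# Toy example with made-up numbers (two SNPs in modest LD).
toy_weights = [0.30, -0.10]
toy_gwas_z = [2.5, -1.0]
toy_cov = [[0.40, 0.05],
           [0.05, 0.30]]
print(spredixcan_zscore(toy_weights, toy_gwas_z, toy_cov))
```

In practice the alignment of effect alleles between the model and the GWAS, and the handling of SNPs missing from either source, are carried out by the S-PrediXcan software itself; the sketch assumes both have already been resolved.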

Related Organizations

University of Virginia
United States

Keywords

TOPMed, MESA, Multi-ancestry, Transcriptome prediction, TWAS
