
doi: 10.7302/5963
handle: 2027.42/174232
Heart disease is the leading cause of death globally. Advances in precision medicine can help identify high-risk individuals so that preventive treatment can begin early. The growth of electronic health record (EHR)-linked biobanks around the world provides an opportunity to integrate clinical and genetic information to improve risk prediction. In this dissertation, I have illustrated how leveraging both clinical and genetic data from the Michigan Medicine biobank could identify patients at high cardiovascular risk using a well-powered polygenic risk score (PRS) and a novel clinical risk score (ClinRS), created using an adapted natural language processing (NLP) method. In chapter 2, I analyzed the Michigan Medicine Precision Health COVID-19 Survey, deployed by our research group in May 2020 to study the impact of the ‘Stay Home Stay Safe’ Executive Order on health behavior changes that could potentially lead to an increase in cardiovascular risk. This study found that African Americans, women, and the lowest income group reported worsening health behaviors during the Executive Order in Michigan. In chapter 3, I investigated the power of genetic diversity in creating a PRS for heart failure risk estimation. In this study, I evaluated the association between the heart failure PRS and phenotypic subtypes (heart failure with reduced ejection fraction [HFrEF] and heart failure with preserved ejection fraction [HFpEF]). The heart failure PRS was calculated using both single- and multi-ancestry genome-wide association study (GWAS) summary statistics meta-analyzed by the Global Biobank Meta-analysis Initiative (GBMI). The GBMI multi-ancestry heart failure GWAS meta-analysis included a total of 1,354,739 individuals (5% cases) from 5 ancestral populations and 13 biobanks. Of the 1.35 million participants, 24.7% were of non-European ancestry.
The results showed that the multi-ancestry GWAS-based PRS was the most powerful genetic risk score, significantly associated with both HFrEF and HFpEF in European American ancestry samples and with HFrEF in African American ancestry samples. In chapter 4, I developed a novel clinical risk score using NLP to learn the co-occurrence patterns within the EHR system and to further extract independent information that summarizes the EHR data into low-dimensional features. Next, I evaluated the performance of heart failure prediction models using baseline demographic information, PRS, ClinRS, and a model with both PRS and ClinRS as predictors. The results showed that the model including both PRS and ClinRS yielded superior accuracy in predicting future heart failure events up to 10 years in advance, showing the additive power of integrating clinical and genetic information in precision health. This dissertation developed risk scores using novel methodology and demonstrated the benefits of incorporating clinical and genetic data using large-scale EHR-linked biobanks. Together, the research conducted in this dissertation can enhance precision medicine by improving disease prediction and by enabling earlier preventive care that may modify disease progression.
electronic health records, cardiovascular disease, precision medicine, Health Sciences, genetics, Public Health
