Objective: Understanding the molecular drivers of disease is a vital component of personalized medicine. Unfortunately, molecular data are not currently available in most electronic health records (EHRs). To solve this problem we created Med2Mech, a joint learning framework for inferring molecular characterizations of patients from clinical data and publicly available biomedical data.

Methods: Med2Mech was evaluated using pediatric EHR data from a subset of rare disease and other similarly medically complex patients. First, patient-level clinical embeddings were generated. Then, a PheKnowLator knowledge graph (KG) was used to generate mechanism embeddings. Finally, patient-level mechanism embeddings were derived by summing or averaging each patient’s unique set of mechanism embeddings. A one-vs-the-rest multiclass classification strategy, with five-fold cross-validation, was used to evaluate the discriminatory ability of the mechanism and clinical embeddings. Rare disease subphenotype differences, using both clinical and mechanism embeddings, were further investigated using K-Means clustering, and the resulting clusters were verified by PhD- and MD-level domain experts. As external validation, the ability to infer the genotype of the rare disease patients using an independent sample of publicly available transcriptomic data was examined.

Results: Clinical embeddings were built for four rare disease groups (n = 2,646) and 10,000 similarly complex patients using 6,382 conditions, 2,334 medications, and 272 measurements. Mechanism embeddings were generated from a PheKnowLator KG with 129,875 nodes and 3,838,935 edges. On classification, the mechanism embeddings outperformed 82.2% of the clinical embedding parameterizations. Domain expert review confirmed that the mechanism embeddings produced more clinically relevant clusters of comorbidities for each rare disease subphenotype than the clinical embeddings. External validation further demonstrated the utility of this framework by accurately inferring the genotype and phenotype of EHR-derived rare disease patients from publicly available molecular data.

Conclusion: These results illustrate the translational utility of PheKnowLator KGs and demonstrate that it is possible to derive clinically meaningful and biologically relevant patient representations from disparate sources of EHR data and expert-curated publicly available transcriptomic data.
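The aggregation step described in the Methods, in which a patient-level mechanism embedding is derived by summing or averaging the patient's unique set of mechanism embeddings, can be illustrated with a minimal sketch. This is not the Med2Mech implementation: the function name `aggregate_patient_embedding`, the concept identifiers, and the two-dimensional vectors below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from typing import Dict, Iterable, List


def aggregate_patient_embedding(
    concept_ids: Iterable[str],
    mechanism_embeddings: Dict[str, List[float]],
    mode: str = "mean",
) -> List[float]:
    """Combine a patient's unique mechanism embeddings into one vector.

    Hypothetical helper for illustration only; it is not taken from Med2Mech.
    """
    # Use each concept once, and skip concepts that have no embedding.
    unique_ids = [c for c in sorted(set(concept_ids)) if c in mechanism_embeddings]
    if not unique_ids:
        raise ValueError("patient has no concepts with a mechanism embedding")

    vectors = [mechanism_embeddings[c] for c in unique_ids]
    # Element-wise sum across all of the patient's vectors.
    summed = [sum(dim) for dim in zip(*vectors)]

    if mode == "sum":
        return summed
    if mode == "mean":
        return [value / len(vectors) for value in summed]
    raise ValueError(f"unknown mode: {mode!r}")


# Toy data: identifiers and values are made up for this example.
toy_embeddings = {
    "concept_a": [1.0, 2.0],
    "concept_b": [3.0, 4.0],
}
# "concept_a" appears twice in the record but is counted once.
patient_concepts = ["concept_a", "concept_b", "concept_a"]

patient_sum = aggregate_patient_embedding(patient_concepts, toy_embeddings, mode="sum")
patient_mean = aggregate_patient_embedding(patient_concepts, toy_embeddings, mode="mean")
```

Deduplicating before aggregation mirrors the abstract's wording ("unique set of mechanism embeddings"); whether the original work handles concepts without an embedding by skipping them, as done here, is an assumption.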
This submission serves as a placeholder for a preprint that is being submitted to arXiv. As soon as a valid DOI has been produced, this submission will be updated with the preprint PDF, the DOI, and the submission authors.
Knowledge Graph, Pediatric Rare Disease, PheKnowLator, Patient Representation Learning, OMOP2OBO, Open Biomedical Ontologies, Neurosymbolic Representation Learning, Graph Representation Learning, Deep Computational Phenotyping, OMOP
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 13 |
| downloads | | 20 |

Views provided by UsageCounts
Downloads provided by UsageCounts