
ABSTRACT Electronic health record (EHR) analysis can yield valuable insights for improving the quality of human healthcare. However, data quality problems such as missing values, inconsistencies, and errors in the data set severely hinder building robust machine learning models for data analysis. In this paper, we develop an artificial intelligence (AI)-based data governance methodology that predicts missing values and verifies whether existing values are correct, proposing what they should be when they are wrong. We demonstrate the performance of this methodology through a case study of patient gender prediction and verification. Experimental results show that a convolutional neural network (CNN), a deep learning algorithm, performs very well on the test data as measured by the F1-score, and that it outperforms support vector machine (SVM) models built on different vector representations of documents.

KEYWORDS EHR Analysis, Data Governance, Vector Space Model, Word Embeddings, Machine Learning, Convolutional Neural Networks, Deep Learning.

Original Source URL: http://aircconline.com/ijdkp/V8N3/8318ijdkp03.pdf
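The abstract reports performance through the F1-score, the harmonic mean of precision and recall. The sketch below shows how that metric could be computed for a two-class gender label. The function name `f1_score`, the label values "F" and "M", and the sample predictions are illustrative assumptions, not data or code from the paper.

```python
def f1_score(y_true, y_pred, positive):
    """Return the F1-score for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and model predictions (not from the paper).
y_true = ["F", "M", "F", "F", "M", "M"]
y_pred = ["F", "M", "M", "F", "M", "F"]

f1_female = f1_score(y_true, y_pred, positive="F")
f1_male = f1_score(y_true, y_pred, positive="M")
macro_f1 = (f1_female + f1_male) / 2  # unweighted mean over both classes
print(f"F1(F)={f1_female:.3f}  F1(M)={f1_male:.3f}  macro={macro_f1:.3f}")
```

Whether the paper reports per-class, macro-averaged, or micro-averaged F1 is not stated in the abstract, so the macro average here is only one possible choice.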
