Publication . Article . 2018

Using predicate and provenance information from a knowledge graph for drug efficacy screening

Wytze J. Vlietstra; Rein Vos; Anneke M. Sijbers; Erik M. van Mulligen; Jan A. Kors
Open Access
Published: 06 Sep 2018
Journal: Journal of Biomedical Semantics, volume 9, issue 1, pages 1-10 (ISSN: 2041-1480)
Publisher: BMC
Country: Netherlands

Background: Biomedical knowledge graphs have become important tools for computationally analysing the comprehensive body of biomedical knowledge. They represent knowledge as subject-predicate-object triples, in which the predicate indicates the relationship between subject and object. A triple can also contain provenance information, which consists of references to the sources of the triple (e.g. scientific publications or database entries). Knowledge graphs have been used to classify drug-disease pairs for drug efficacy screening, but existing computational methods have often ignored predicate and provenance information. Using this information, we aimed to develop a supervised machine learning classifier and to determine the added value of predicate and provenance information for drug efficacy screening. To ensure the biological plausibility of our method, we performed our research at the protein level, where drugs are represented by their drug target proteins and diseases by their disease proteins.

Results: Using random forests with repeated 10-fold cross-validation, our method achieved an area under the ROC curve (AUC) of 78.1% and 74.3% for two reference sets. We benchmarked against a state-of-the-art knowledge-graph technique that does not use predicate and provenance information, which obtained AUCs of 65.6% and 64.6%, respectively. Classifiers that used only predicate information outperformed classifiers that used only provenance information, but classifiers that used both performed best.

Conclusion: We conclude that both predicate and provenance information provide added value for drug efficacy screening.

Electronic supplementary material: The online version of this article (10.1186/s13326-018-0189-6) contains supplementary material, which is available to authorized users.
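To make the triple representation in the abstract concrete, the following is a minimal Python sketch of how predicate and provenance information could be turned into features for one drug-disease pair at the protein level. It is an illustration only and not the authors' implementation: the abstract does not specify the feature set, so the `Triple` class, the `pair_features` function, the restriction to direct protein-protein triples, and all identifiers in the sample data are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object triple with optional provenance."""
    subject: str
    predicate: str
    obj: str
    provenance: tuple = ()  # references to sources (publications, database entries)


def pair_features(triples, drug_targets, disease_proteins):
    """Count predicates and distinct sources on triples that directly link
    a drug target protein to a disease protein (in either direction)."""
    predicate_counts = Counter()
    sources = set()
    for t in triples:
        forward = t.subject in drug_targets and t.obj in disease_proteins
        backward = t.subject in disease_proteins and t.obj in drug_targets
        if forward or backward:
            predicate_counts[t.predicate] += 1
            sources.update(t.provenance)
    features = {f"pred:{p}": n for p, n in predicate_counts.items()}
    features["n_sources"] = len(sources)
    return features


# Made-up sample data: T* are drug target proteins, D* are disease proteins.
triples = [
    Triple("T1", "inhibits", "D1", ("src:a", "src:b")),
    Triple("D2", "interacts_with", "T1", ("src:b",)),
    Triple("T2", "inhibits", "D1", ("src:c",)),
    Triple("T1", "inhibits", "X9", ("src:d",)),  # X9 is in neither set, so ignored
]
features = pair_features(triples, {"T1", "T2"}, {"D1", "D2"})
print(features)
```

Feature dictionaries of this kind, one per drug-disease pair, could then be vectorized and passed to a supervised classifier such as the random forest named in the abstract, with performance estimated by repeated cross-validation.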

Subjects by Vocabulary

Library of Congress Subject Headings: lcsh:Computer applications to medicine. Medical informatics lcsh:R858-859.7

ACM Computing Classification System: ComputingMethodologies_PATTERNRECOGNITION

Microsoft Academic Graph classification: Knowledge graph, Learning classifier system, Artificial intelligence, Biological plausibility, Biomedical knowledge, Predicate (grammar), Provenance, Natural language processing, Added value, Computer science, Random forest


Predicate, Provenance, Drug efficacy screening, Machine learning, Knowledge graph, Drug repurposing, Biological Ontologies, Computer Graphics, Drug Evaluation, Preclinical, False Negative Reactions, ROC Curve, Research, Systems pharmacology, Computational pharmacology, Computer Networks and Communications, Health Informatics, Computer Science Applications, Information Systems, OTITIS-MEDIA, DISEASE, NETWORK, COMBINATION, NARCOLEPSY, GENES


Funded by
NWO| XCiDE: Crossing the Combustion modes in Diesel Engines
  • Funder: Netherlands Organisation for Scientific Research (NWO)
  • Project Code: 2300153186
NWO| ODEX4all Open Discovery and Exchange for all
  • Funder: Netherlands Organisation for Scientific Research (NWO)
  • Project Code: 11200