Accuracy of protein-level disorder predictions

descriptionPublicationkeyboard_double_arrow_right Article 15 Oct 2019 English Publisher:Oxford University Press (OUP)Journal:Briefings in Bioinformatics, volume 21, pages 1,509-1,522 (eissn: 1477-4054,

Copyright policy )

Authors: Akila Katuwawala; Christopher J. Oldfield; Lukasz A. Kurgan;

doi: 10.1093/bib/bbz100

pmid: 31616935

Accuracy of protein-level disorder predictions

- Summary
- Subjects
- Metrics

Abstract

AbstractExperimental annotations of intrinsic disorder are available for 0.1% of 147 000 000 of currently sequenced proteins. Over 60 sequence-based disorder predictors were developed to help bridge this gap. Current benchmarks of these methods assess predictive performance on datasets of proteins; however, predictions are often interpreted for individual proteins. We demonstrate that the protein-level predictive performance varies substantially from the dataset-level benchmarks. Thus, we perform first-of-its-kind protein-level assessment for 13 popular disorder predictors using 6200 disorder-annotated proteins. We show that the protein-level distributions are substantially skewed toward high predictive quality while having long tails of poor predictions. Consequently, between 57% and 75% proteins secure higher predictive performance than the currently used dataset-level assessment suggests, but as many as 30% of proteins that are located in the long tails suffer low predictive performance. These proteins typically have relatively high amounts of disorder, in contrast to the mostly structured proteins that are predicted accurately by all 13 methods. Interestingly, each predictor provides the most accurate results for some number of proteins, while the best-performing at the dataset-level method is in fact the best for only about 30% of proteins. Moreover, the majority of proteins are predicted more accurately than the dataset-level performance of the most accurate tool by at least four disorder predictors. While these results suggests that disorder predictors outperform their current benchmark performance for the majority of proteins and that they complement each other, novel tools that accurately identify the hard-to-predict proteins and that make accurate predictions for these proteins are needed.

Related Organizations

Virginia Commonwealth University
United States

Keywords

Intrinsically Disordered Proteins, Protein Conformation, Sequence Analysis, Protein, Computational Biology, Datasets as Topic, Crystallography, X-Ray, Databases, Protein, Nuclear Magnetic Resonance, Biomolecular, Algorithms

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	47
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Top 10%
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Top 10%

Found an issue? Give us feedback

47

Top 10%

gold

Fields of Science (3) View all

medical and health sciences

basic medicine

Fields of Science

medical and health sciences

basic medicine

View all