Protein Function Prediction with Incomplete Annotations

Publication: Article, 01 May 2014

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics, volume 11, pages 579-591 (issn: 1545-5963)

Authors: Guo-Xian Yu; Huzefa Rangwala; Carlotta Domeniconi; Guoji Zhang; Zhiwen Yu 0002

doi: 10.1109/tcbb.2013.142

pmid: 26356025


Abstract

Automated protein function prediction is one of the grand challenges in computational biology. Multi-label learning is widely used to predict the functions of proteins. Most multi-label learning methods make predictions for unlabeled proteins under the assumption that the labeled proteins are completely annotated, i.e., without any missing functions. In practice, however, we may have only a subset of the ground-truth functions for a protein, and whether the protein has other functions is unknown. To predict protein functions with incomplete annotations, we propose a Protein Function Prediction method with Weak-label Learning (ProWL) and its variant ProWL-IF. Both ProWL and ProWL-IF can replenish the missing functions of proteins. In addition, ProWL-IF makes use of the knowledge that a protein cannot have certain functions, which can further boost the performance of protein function prediction. Our experimental results on protein-protein interaction networks and gene expression benchmarks validate the effectiveness of both ProWL and ProWL-IF.
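
To make the weak-label setting concrete, the following is a minimal sketch in Python. It is not the ProWL or ProWL-IF algorithm, which the abstract does not specify; it is a simple neighbor-voting heuristic swapped in for illustration. The function name `replenish_scores`, the toy interaction weights `W`, the annotation matrix `Y`, and the `excluded` set are all hypothetical. A 0 in `Y` means "unknown", not "absent", and `excluded` stands in for the kind of knowledge that a protein cannot have a given function.

```python
# Illustrative sketch only: a neighbor-voting heuristic, NOT the paper's ProWL method.
# All names and data below are hypothetical.

def replenish_scores(weights, labels, excluded=frozenset()):
    """Score each (protein, function) pair.

    weights[i][j]: interaction weight between proteins i and j.
    labels[i][f]:  1 if function f is a known annotation of protein i,
                   0 if unknown (possibly a missing annotation).
    excluded:      set of (protein, function) pairs known to be impossible.
    """
    n, m = len(labels), len(labels[0])
    scores = [[0.0] * m for _ in range(n)]
    for i in range(n):
        total = sum(weights[i][j] for j in range(n) if j != i)
        for f in range(m):
            if (i, f) in excluded:
                scores[i][f] = 0.0   # known negative: never suggested
            elif labels[i][f] == 1:
                scores[i][f] = 1.0   # known annotation is kept
            elif total > 0:
                # Weighted fraction of interaction partners annotated with f.
                support = sum(weights[i][j] * labels[j][f]
                              for j in range(n) if j != i)
                scores[i][f] = support / total
    return scores

# Toy network of 4 proteins (symmetric weights) and 3 functions.
W = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
Y = [
    [1, 1, 0],
    [1, 0, 0],   # function 1 may be a missing annotation here
    [1, 1, 0],
    [0, 0, 1],
]

scores = replenish_scores(W, Y)
scores_neg = replenish_scores(W, Y, excluded={(3, 1)})
```

In this toy example, protein 1 receives a high score for function 1 because both of its interaction partners carry it, while passing `excluded={(3, 1)}` forces the score of function 1 for protein 3 to zero regardless of its neighbors.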

Related Organizations

South China University of Technology, China (People's Republic of)
Guangzhou Higher Education Mega Center, China (People's Republic of)
George Mason University, United States
Southern University of Science and Technology, China (People's Republic of)
Southeast University, China (People's Republic of)

Keywords

Models, Statistical; Computational Biology; Proteins; Molecular Sequence Annotation; Protein Interaction Maps; Transcriptome

Impact by BIP!

selected citations: 25. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.