
As genome sequencing outstrips the rate of high-quality, low-throughput biochemical and genetic experimentation, accurate annotation of protein function becomes a bottleneck in the progress of the biomolecular sciences. Most gene products are now annotated by homology, in which an experimentally determined function is applied to a similar sequence. This procedure becomes error-prone between more divergent sequences and can contaminate biomolecular databases. Here, we propose a computational method of assignment of function, termed Generalized Functional Linkages (GFL), that combines nonhomology-based methods with other types of data. Functional linkages describe pairwise relationships between proteins that work together to perform a biological task. GFL provides a Bayesian framework that improves annotation by arbitrating a competition among biological process annotations to best describe the target protein. GFL addresses the unequal strengths of functional linkages among proteins, the quality of existing annotations, and the similarity among them while incorporating available knowledge about the cellular location or individual molecular function of the target protein. We demonstrate GFL with functional linkages defined by an algorithm known as zorch that quantifies connectivity in protein–protein interaction networks. Even when using proteins linked only by indirect or high-throughput interactions, GFL predicts the biological processes of many proteins in Saccharomyces cerevisiae, improving the accuracy of annotation by 20% over majority voting.
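To make the contrast with majority voting concrete, the following is a minimal sketch and not the GFL model itself. It assumes a simplified likelihood in which a linked neighbor shares the target's biological process with probability equal to its linkage strength `w`, and otherwise carries an annotation drawn from a background prior. The function names, the likelihood, the example annotations, and all numbers are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def majority_vote(neighbors):
    """Pick the most common annotation among linked proteins.

    neighbors: list of (annotation, linkage_strength) pairs.
    Linkage strengths are ignored, which is the baseline's weakness.
    """
    counts = Counter(annotation for annotation, _ in neighbors)
    return counts.most_common(1)[0][0]


def weighted_posterior(neighbors, prior):
    """Naive-Bayes-style posterior over candidate annotations (assumed model).

    For a candidate process c, each neighbor with annotation a and
    linkage strength w contributes
        P(a | c) = w * [a == c] + (1 - w) * prior[a]
    and the posterior is proportional to prior[c] times the product
    of these terms over all neighbors.
    """
    scores = {}
    for candidate, p_candidate in prior.items():
        score = p_candidate
        for annotation, w in neighbors:
            match = 1.0 if annotation == candidate else 0.0
            score *= w * match + (1.0 - w) * prior[annotation]
        scores[candidate] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical target protein: two weak links and one strong link.
neighbors = [("transport", 0.2), ("transport", 0.2), ("DNA repair", 0.9)]
prior = {"transport": 0.5, "DNA repair": 0.5}

vote = majority_vote(neighbors)
posterior = weighted_posterior(neighbors, prior)
best = max(posterior, key=posterior.get)
print(vote, best, posterior)
```

In this toy case the two weak links outvote the strong one under majority voting, while the weighted score favors the annotation supported by the strong link. The full method described above additionally accounts for annotation quality, similarity among annotations, and knowledge of cellular location or molecular function, none of which this sketch models.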
Proteins, Algorithms
