<script type="text/javascript">
<!--
document.write('<div id="oa_widget"></div>');
document.write('<script type="text/javascript" src="https://www.openaire.eu/index.php?option=com_openaire&view=widget&format=raw&projectId=undefined&type=result"></script>');
-->
</script>
Using computational database searches, we have demonstrated previously that no gene sequences could be found for at least 36% of enzyme activities that have been assigned an Enzyme Commission number. Here we present a follow-up literature-based survey involving a statistically significant sample of such "orphan" activities. The survey was intended to determine whether sequences for these enzyme activities are truly unknown, or whether these sequences are absent from the public sequence databases but can be found in the literature.We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles.This survey points toward significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.
Proteomics, Databases, Factual, QH301-705.5, Data Collection, Computer applications to medicine. Medical informatics, R858-859.7, Computational Biology, Reproducibility of Results, Genomics, Biochemistry, Computer Science Applications, Enzymes, Species Specificity, Databases, Genetic, Biology (General), Databases, Protein, Molecular Biology, Research Article
Proteomics, Databases, Factual, QH301-705.5, Data Collection, Computer applications to medicine. Medical informatics, R858-859.7, Computational Biology, Reproducibility of Results, Genomics, Biochemistry, Computer Science Applications, Enzymes, Species Specificity, Databases, Genetic, Biology (General), Databases, Protein, Molecular Biology, Research Article
citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 37 | |
popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |