Publication: Article, Other literature type
Date: 01 Sep 2003
Language: English
Publisher: Springer Science and Business Media LLC
Journal: Nature Reviews Genetics, volume 4, pages 741-749 (issn: 1471-0056, eissn: 1471-0064)

Authors: Jun Yu; Yong Zhang; Zhao Xu; Hongkun Zheng; +7 Authors

doi: 10.1038/nrg1160 , 10.7939/r3w37kx55

pmid: 12951575

Vertebrate gene predictions and the problem of large genes

Abstract

To find unknown protein-coding genes, annotation pipelines use a combination of ab initio gene prediction and similarity to experimentally confirmed genes or proteins. Here, we show that although the ab initio predictions have an intrinsically high false-positive rate, they also have a consistently low false-negative rate. The incorporation of similarity information is meant to reduce the false-positive rate, but in doing so it increases the false-negative rate. The crucial variable is gene size (including introns): genes of the most extreme sizes, especially very large genes, are most likely to be incorrectly predicted.
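
The trade-off described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the false-positive rate is the fraction of predictions matching no confirmed gene and the false-negative rate is the fraction of confirmed genes that are missed; the article may define these rates differently. The function name `error_rates` and all gene identifiers are hypothetical, and the numbers are toy values, not results from the article.

```python
def error_rates(predicted, confirmed):
    """Return (false_positive_rate, false_negative_rate) for two collections of gene IDs.

    Assumed definitions (not taken from the article):
      false-positive rate = predictions with no confirmed match / all predictions
      false-negative rate = confirmed genes with no prediction / all confirmed genes
    """
    predicted, confirmed = set(predicted), set(confirmed)
    false_positives = predicted - confirmed
    false_negatives = confirmed - predicted
    fp_rate = len(false_positives) / len(predicted) if predicted else 0.0
    fn_rate = len(false_negatives) / len(confirmed) if confirmed else 0.0
    return fp_rate, fn_rate


# Toy, made-up identifiers: "g*" are confirmed genes, "x*" are spurious predictions.
confirmed = {"g1", "g2", "g3", "g4"}

# Ab initio alone: finds every confirmed gene but also many spurious ones.
ab_initio = {"g1", "g2", "g3", "g4", "x1", "x2", "x3", "x4"}

# After a similarity filter: fewer spurious predictions, but one real gene is lost.
with_similarity = {"g1", "g2", "g3", "x1"}

ab_initio_rates = error_rates(ab_initio, confirmed)
similarity_rates = error_rates(with_similarity, confirmed)

print("ab initio:       FP=%.2f FN=%.2f" % ab_initio_rates)
print("with similarity: FP=%.2f FN=%.2f" % similarity_rates)
```

In this toy case the similarity filter lowers the false-positive rate while raising the false-negative rate, which is the direction of the effect the abstract reports.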

Related Organizations

Peking University
China (People's Republic of)
University of Washington
United States
University of Alberta
Canada
University of Southern Denmark
Denmark

Keywords

Genome; Models, Genetic; DNA-sequences; Genome, Human; Annotation; Human Genome; Gene Expression; Exons; Introns; Resources; Transcriptomes; Experimental-verification; Genetic; Genetic Techniques; Models; Organ Specificity; Predictive Value of Tests; Vertebrates; Animals; Humans; Mouse Genome; Human

Impact by BIP!

citations: 57. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.