Publication: Article, Part of book or chapter of book
01 Apr 1999
English
Publisher: Mary Ann Liebert Inc
Journal: Journal of Computational Biology, volume 6, pages 419-430 (issn: 1066-5277, eissn: 1557-8666)

Authors: Pachter, Lior; Batzoglou, Serafim; Spitkovsky, Valentin I.; Banks, Eric; Lander, Eric S.; Kleitman, Daniel J.; Berger, Bonnie

doi: 10.1089/106652799318364 , 10.1145/299432.299504

pmid: 10582576

A Dictionary-Based Approach for Gene Annotation

Abstract

This paper describes a fast and fully automated dictionary-based approach to gene annotation and exon prediction. Two dictionaries are constructed, one from the nonredundant protein OWL database and the other from the dbEST database. These dictionaries are used to obtain O(1) time lookups of tuples in the dictionaries (4-tuples for the OWL database and 11-tuples for the dbEST database). These tuples can be used to rapidly find the longest matches at every position in an input sequence to the database sequences. Such matches provide very useful information for locating common segments between exons and alternative splice sites, and supply frequency data on long tuples for statistical purposes. These dictionaries also provide the basis for both homology determination and statistical approaches to exon prediction.
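The tuple-lookup idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `build_dictionary` and `longest_matches` are hypothetical, a plain hash map stands in for whatever dictionary structure the paper uses (giving O(1) lookups only on average), and each hit is extended by direct character comparison.

```python
from collections import defaultdict


def build_dictionary(sequences, k):
    """Map every k-tuple to the (sequence index, offset) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for offset in range(len(seq) - k + 1):
            index[seq[offset:offset + k]].append((seq_id, offset))
    return index


def longest_matches(query, sequences, index, k):
    """For each query position, return the length of the longest exact match
    (at least k long) starting there, or 0 if its k-tuple is not in the index."""
    result = []
    for pos in range(len(query)):
        best = 0
        # Average O(1) hash lookup of the k-tuple starting at this position.
        for seq_id, offset in index.get(query[pos:pos + k], ()):
            seq = sequences[seq_id]
            length = k
            # Extend the seed match one character at a time (assumed strategy).
            while (pos + length < len(query)
                   and offset + length < len(seq)
                   and query[pos + length] == seq[offset + length]):
                length += 1
            best = max(best, length)
        result.append(best)
    return result
```

With toy database sequences `["ACGTACGGA", "TTACGTAC"]`, `k = 4`, and the query `"GACGTACT"`, this sketch reports a longest match of 6 starting at the second query position. In the setting the abstract describes, `k` would be 4 for protein sequences (OWL) and 11 for nucleotide sequences (dbEST).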

Related Organizations

Massachusetts Institute of Technology
United States
California Institute of Technology
United States

Keywords

Expressed Sequence Tags, Databases, Factual, Dictionaries as Topic, Proteins, Exons, 004, Alternative Splicing, Genes, Genetic Techniques, Animals, Humans, Sequence Alignment, Software

Impact by BIP!

citations: 18. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Fields of Science (3)

medical and health sciences

basic medicine

