Publication: Article, Other literature type

Date: 09 Apr 2008

Language: English

Publisher: Springer Science and Business Media LLC

Journal: BMC Bioinformatics, volume 9 (eissn: 1471-2105,

Authors: Ralf Schmid; Mark Blaxter

doi: 10.1186/1471-2105-9-180

pmid: 18400082

pmc: PMC2324097

handle: 20.500.11820/7ef83d8b-1bbb-42db-acde-86483af743c3 , 2381/14339

annot8r: GO, EC and KEGG annotation of EST datasets

Abstract

The expressed sequence tag (EST) methodology is an attractive option for the generation of sequence data for species for which no completely sequenced genome is available. The annotation and comparative analysis of such datasets pose a formidable challenge for research groups that do not have the bioinformatics infrastructure of major genome sequencing centres. Therefore, there is a need for user-friendly tools to facilitate the annotation of non-model species EST datasets with well-defined ontologies that enable meaningful cross-species comparisons. To address this, we have developed annot8r, a platform for the rapid annotation of EST datasets with GO-terms, EC-numbers and KEGG-pathways.

annot8r automatically downloads all files relevant for the annotation process and generates a reference database that stores UniProt entries, their associated Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopaedia of Genes and Genomes (KEGG) annotation and additional relevant data. For each of GO, EC and KEGG, annot8r extracts a specific sequence subset from the UniProt dataset based on the information stored in the reference database. These three subsets are then formatted for BLAST searches. The user provides the protein or nucleotide sequences to be annotated and annot8r runs BLAST searches against these three subsets. The BLAST results are parsed and the corresponding annotations retrieved from the reference database. The annotations are saved both as flat files and in a relational PostgreSQL results database to facilitate more advanced searches within the results. annot8r is integrated with the PartiGene suite of EST analysis tools.

annot8r is a tool that assigns GO, EC and KEGG annotations for data sets resulting from EST sequencing projects both rapidly and efficiently. The benefits of an underlying relational database, flexibility and the ease of use of the program make it ideally suited for non-model species EST-sequencing projects.

Related Organizations

University of Edinburgh
United Kingdom
University of Leicester
United Kingdom

Keywords

/dk/atira/pure/subjectarea/asjc/1300/1312; QH301-705.5; Computer applications to medicine. Medical informatics; Molecular Sequence Data; R858-859.7; Information Storage and Retrieval; Documentation; Biochemistry; Databases, Genetic; Biology (General); Molecular Biology; Expressed Sequence Tags; /dk/atira/pure/subjectarea/asjc/1300/1303; /dk/atira/pure/subjectarea/asjc/1700/1706; Base Sequence; 005; Chromosome Mapping; DNA; Sequence Analysis, DNA; 004; Computer Science Applications; Database Management Systems; Sequence Analysis; Software

Related research (4 research products)

- Annot8r: GO, EC and KEGG annotation of EST datasets-1 (2011)
- Annot8r: GO, EC and KEGG annotation of EST datasets-0 (2011)
- Additional file 5 of Gene discovery in EST sequences from the wheat leaf rust fungus Puccinia triticina sexual spores, asexual spores and haustoria, compared to other rust and corn smut fungi (2020)
- Annot8r: GO, EC and KEGG annotation of EST datasets-2 (2011)

Impact by BIP!

- citations: 90. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.