Publication: Article, Other literature type
Date: 28 Jun 2005
Language: English
Publisher: Springer Science and Business Media LLC
Journal: BMC Bioinformatics, volume 6 (eissn: 1471-2105,

Authors: Stocsits, Roman R.; Hofacker, Ivo L.; Fried, Claudia; Stadler, Peter F.

doi: 10.1186/1471-2105-6-160

pmid: 15985156

pmc: PMC1182351

Multiple sequence alignments of partially coding nucleic acid sequences


Abstract

Background: High quality sequence alignments of RNA and DNA sequences are an important prerequisite for the comparative analysis of genomic sequence data. Nucleic acid sequences, however, exhibit a much larger sequence heterogeneity compared to their encoded protein sequences due to the redundancy of the genetic code. It is desirable, therefore, to make use of the amino acid sequence when aligning coding nucleic acid sequences. In many cases, however, only a part of the sequence of interest is translated. On the other hand, overlapping reading frames may encode multiple alternative proteins, possibly with intermittent non-coding parts. Examples are, in particular, RNA virus genomes.

Results: The standard scoring scheme for nucleic acid alignments can be extended to incorporate simultaneously information on translation products in one or more reading frames. Here we present a multiple alignment tool, codaln, that implements a combined nucleic acid plus amino acid scoring model for pairwise and progressive multiple alignments that allows arbitrary weighting for almost all scoring parameters. Resource requirements of codaln are comparable with those of standard tools such as ClustalW.

Conclusion: We demonstrate the applicability of codaln to various biologically relevant types of sequences (bacteriophage Levivirus and Vertebrate Hox clusters) and show that the combination of nucleic acid and amino acid sequence information leads to improved alignments. These, in turn, increase the performance of analysis tools that depend strictly on good input alignments such as methods for detecting conserved RNA secondary structure elements.
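The idea of a combined nucleic acid plus amino acid score can be illustrated with a small pairwise example. The sketch below is not codaln's scoring model or code; it is a minimal illustration under stated assumptions: a single annotated reading frame per sequence (given as a hypothetical half-open `(start, end)` interval), a simple identity score for amino acids instead of a substitution matrix, linear gap penalties, and uppercase DNA input. All function names and parameter values are made up for this example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_at(seq, i, coding):
    """Return (phase, amino acid) for position i, or None if i is non-coding.

    `coding` is an assumed (start, end) half-open interval marking one
    reading frame; None means the whole sequence is treated as non-coding.
    """
    if coding is None:
        return None
    start, end = coding
    if not (start <= i < end):
        return None
    phase = (i - start) % 3
    codon = seq[i - phase : i - phase + 3]
    return phase, CODON_TABLE.get(codon)  # amino acid is None for partial codons


def column_score(x, i, y, j, cod_x, cod_y, w_aa, match=1.0, mismatch=-1.0):
    """Nucleotide score plus a weighted bonus when both positions sit at the
    same codon phase and their codons translate to the same amino acid."""
    score = match if x[i] == y[j] else mismatch
    cx, cy = codon_at(x, i, cod_x), codon_at(y, j, cod_y)
    if cx is not None and cy is not None:
        same_phase = cx[0] == cy[0]
        same_aa = cx[1] is not None and cx[1] == cy[1]
        if same_phase and same_aa:
            score += w_aa
    return score


def align_score(x, y, cod_x=None, cod_y=None, w_aa=1.0, gap=-2.0):
    """Global (Needleman-Wunsch) alignment score with linear gap costs."""
    n, m = len(x), len(y)
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            diag = prev[j - 1] + column_score(x, i - 1, y, j - 1, cod_x, cod_y, w_aa)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]


# Two toy sequences that both encode Met-Leu-Glu with different Leu codons.
x = "ATGCTTGAA"
y = "ATGTTAGAA"
print(align_score(x, y))                  # nucleotide-only score: 5.0
print(align_score(x, y, (0, 9), (0, 9)))  # with amino acid bonus: 14.0
```

In this toy case the synonymous leucine codons (CTT and TTA) are penalised at the nucleotide level but still earn the amino acid bonus, which is the kind of signal a combined score is meant to capture. Setting `w_aa=0.0` reduces the sketch to plain nucleotide scoring.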

Related Organizations

Universität Wien
Austria
University of Leipzig, Interdisciplinary Centre for Bioinformatics
Germany
University of Vienna
Austria
Leipzig University
Germany


Keywords

Models, Molecular; QH301-705.5; Computer applications to medicine. Medical informatics; R858-859.7; Protein Structure, Secondary; Open Reading Frames; Sequence Homology, Nucleic Acid; Amino Acid Sequence; Biology (General); Codon; Conserved Sequence; Levivirus; 1040 Chemistry; Homeodomain Proteins; info:eu-repo/classification/ddc/572.8; Reproducibility of Results; Sequence Analysis, DNA; Biochemistry; Evolutionary biology; RNA; DNA; Nucleic acid sequence; ddc:572.8; Sequence Alignment; Software; Algorithms

Impact by BIP!

citations: 26
popularity: Top 10%
influence: Top 10%
impulse: Top 10%


Open access routes: green, gold

Fields of Science (3)

medical and health sciences

basic medicine
