
doi: 10.1093/bioinformatics/btv306, 10.3929/ethz-b-000103876, 10.21256/zhaw-3884, 10.5167/uzh-155197
pmid: 25987568
handle: 20.500.11850/103876
Abstract
Motivation: Currently, more than 40 sequence tandem repeat detectors have been published, providing heterogeneous, partly complementary, partly conflicting results.
Results: We present TRAL, a tandem repeat annotation library that supports running various detectors and parsing their outputs, clustering of redundant or overlapping annotations, several statistical frameworks for filtering false positive annotations, and, importantly, a tandem repeat annotation and refinement module based on circular profile hidden Markov models (cpHMMs). Using TRAL, we evaluated the performance of a multi-step tandem repeat annotation workflow on 547 085 sequences in UniProtKB/Swiss-Prot. Researchers can use these results to predict run-times for specific datasets and to choose annotation complexity accordingly.
Availability and implementation: TRAL is an open-source Python 3 library and is available, together with documentation and tutorials, via http://www.vital-it.ch/software/tral.
Contact: elke.schaper@isb-sib.ch
FOS: Computer and information sciences, 1303 Biochemistry, Bioinformatics, Knowledge Bases, Molecular Sequence Data, Documentation, Molecular sequence, 142-005, 1312 Molecular Biology, 1706 Computer Science Applications, Profile hidden Markov models, Cluster Analysis, Humans, Amino Acid Sequence, 2613 Statistics and Probability, Databases, Protein, Gene Library, 572: Biochemistry, Molecular Sequence Annotation, Tandem repeat, Tandem Repeat Sequences, Tandem Repeat Sequences/genetics, 004: Computer science, 2605 Computational Mathematics, Software, 1703 Computational Theory and Mathematics
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator. | 16 |
| Popularity | The "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | The overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | The initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
