
doi: 10.1109/tcbb.2005.24
pmid: 17044180
A key step in protein structure prediction by threading is choosing the best overall template for a given target sequence once all the optimal sequence-template alignments have been generated. The chosen template should have the best alignment with the target sequence, because the three-dimensional structure of the target is built from the sequence-template alignment. The traditional method for template selection, the Z-score, uses a statistical test to rank all the sequence-template alignments and then chooses the top-ranked template for the sequence. However, calculating Z-scores is time-consuming and not suitable for genome-scale structure prediction. Z-scores are also hard to interpret when the threading scoring function is a weighted sum of several energy terms with different physical meanings. This paper presents a Support Vector Machine (SVM) regression approach that directly predicts the alignment accuracy of a sequence-template alignment; the predicted accuracy is then used to rank all the templates for a specific target sequence. Experimental results on a large-scale benchmark demonstrate that SVM regression performs much better than the composition-corrected Z-score method. SVM regression also runs much faster than the Z-score method.
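To make the cost argument concrete, the following is a minimal Python sketch of Z-score-based template ranking. It is not the paper's implementation: `toy_score` is a hypothetical stand-in for a real threading score, the shuffling scheme and the "higher is better" convention are assumptions, and all function names are illustrative. The point it shows is that each Z-score requires rescoring many shuffled sequences per template, which is the expense the abstract refers to.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence


def z_score(raw_score: float, background_scores: Sequence[float]) -> float:
    """Standardize a raw score against a background distribution."""
    mean = statistics.fmean(background_scores)
    std = statistics.pstdev(background_scores)
    if std == 0:
        return 0.0  # degenerate background: no spread to normalize by
    return (raw_score - mean) / std


def toy_score(target: str, template: str) -> float:
    """Hypothetical stand-in for a threading score: count of identical
    positions in an ungapped comparison (higher is better, by assumption)."""
    return float(sum(a == b for a, b in zip(target, template)))


def shuffled_z_score(
    target: str,
    template: str,
    score_fn: Callable[[str, str], float] = toy_score,
    n_shuffles: int = 100,
    seed: int = 0,
) -> float:
    """Z-score of the real score against scores of shuffled targets.

    Every shuffle needs a fresh scoring pass, so the cost grows as
    n_shuffles * number_of_templates.
    """
    rng = random.Random(seed)
    raw = score_fn(target, template)
    residues = list(target)
    background = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition, destroys order
        background.append(score_fn("".join(residues), template))
    return z_score(raw, background)


def rank_templates(scores: Dict[str, float]) -> List[str]:
    """Return template names sorted from best to worst score."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)
```

A regression-based selector, as described in the abstract, would replace `shuffled_z_score` with a single call to a trained model that predicts alignment accuracy, and then pass those predictions to the same `rank_templates` step.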
Models, Molecular; Protein Folding; Binding Sites; Sequence Homology, Amino Acid; Molecular Sequence Data; Proteins; Reproducibility of Results; Sensitivity and Specificity; Models, Chemical; Sequence Analysis, Protein; Computer Simulation; Amino Acid Sequence; Sequence Alignment; Algorithms; Protein Binding
| selected citations | 33 |
| popularity | Average |
| influence | Top 10% |
| impulse | Top 10% |
