
pmid: 11847077
Abstract Motivation: The prediction of localization sites of various proteins is an important and challenging problem in the field of molecular biology. TargetP, by Emanuelsson et al. (J. Mol. Biol., 300, 1005–1016, 2000), is a neural network-based system which is currently the best predictor in the literature for N-terminal sorting signals. One drawback of neural networks, however, is that it is generally difficult to understand and interpret how and why they make such predictions. In this paper, we aim to generate simple and interpretable rules as predictors, and still achieve a practical prediction accuracy. We adopt an approach which consists of an extensive search for simple rules and various attributes which is partially guided by human intuition. Results: We have succeeded in finding rules whose prediction accuracies come close to that of TargetP, while still retaining a very simple and interpretable form. We also discuss and interpret the discovered rules. Availability: An (experimental) web service using rules obtained by our method is provided at http://hypothesiscreator.net/iPSORT/. Contact: bannai@ims.u-tokyo.ac.jp
Computational Biology, Proteins, Amino Acid Sequence, Neural Networks, Computer, Protein Sorting Signals, Software, Subcellular Fractions
