
Predicting virus‒host protein‒lncRNA interactions (VHPLIs) is critical for decoding the molecular mechanisms of viral pathogens and host immune processes. Although protein‒lncRNA interactions have been predicted in both plants and animals, they have not been extensively studied in viruses. For the first time, we propose CBIL‒VHPLI, a deep learning-based approach that combines convolutional neural network and bidirectional long short-term memory network modules with transfer learning to predict virus‒host protein‒lncRNA interactions. The model was first pretrained on large and diverse datasets (including plant and animal data, among others). Protein sequence features were extracted using a k-mer method combined with one-hot encoding and the composition-transition-distribution (CTD) method, and lncRNA sequence features were extracted using a k-mer method combined with one-hot encoding and the Z curve method. On three independent external validation datasets, the pretrained CBIL‒VHPLI model performed the best, with an accuracy of approximately 0.9. Pretraining was followed by transfer learning on a viral protein-human lncRNA dataset; after fine-tuning, CBIL‒VHPLI reached an accuracy of 0.946, which was significantly greater than that of previous models. In a final case study, CBIL‒VHPLI achieved a prediction reproducibility rate of 91.6% against RIP-Seq experimental screening results. The model was then used to predict the interaction between the human lncRNA PIK3CD-AS2 and nonstructural protein 1 (NS1) of the H5N1 virus, and RNA pull-down experiments were used to verify the reliability of the model's predictions. The source code of CBIL‒VHPLI and the datasets used in this work are available at https://github.com/Liu-Lab-Lnu/CBIL-VHPLI for academic usage.
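The lncRNA feature extraction named above (k-mer, one-hot encoding, Z curve) can be illustrated with a minimal sketch. It is not taken from the CBIL‒VHPLI repository: the function names, the choice of k, and the handling of ambiguous bases are assumptions made for illustration, and the abstract does not state how the three representations are combined before they reach the network. The CTD protein features and the CNN/BiLSTM modules are not shown.

```python
from itertools import product

RNA_ALPHABET = "ACGU"  # assumption: sequences are given as RNA (U rather than T)


def kmer_frequencies(seq, k=3, alphabet=RNA_ALPHABET):
    """Normalized counts of overlapping k-mers, in fixed lexicographic order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows with symbols outside the alphabet are skipped
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def one_hot(seq, alphabet=RNA_ALPHABET):
    """One row per residue; unknown symbols become an all-zero row."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in seq:
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        rows.append(row)
    return rows


# Per-base steps of the Z curve:
#   x: purine (A, G) vs pyrimidine (C, U)
#   y: amino (A, C) vs keto (G, U)
#   z: weak hydrogen bonding (A, U) vs strong (G, C)
Z_STEP = {"A": (1, 1, 1), "C": (-1, 1, -1), "G": (1, -1, -1), "U": (-1, -1, 1)}


def z_curve(seq):
    """Cumulative Z-curve coordinates (x, y, z) after each position."""
    x = y = z = 0
    coords = []
    for ch in seq:
        dx, dy, dz = Z_STEP.get(ch, (0, 0, 0))  # unknown symbols leave the curve flat
        x, y, z = x + dx, y + dy, z + dz
        coords.append((x, y, z))
    return coords


if __name__ == "__main__":
    example = "ACGUACGU"  # toy sequence, not from the paper's datasets
    print(len(kmer_frequencies(example, k=2)))  # one value per possible 2-mer
    print(one_hot(example)[:2])
    print(z_curve(example)[-1])
```

The k-mer vector has a fixed length (4^k for RNA) regardless of sequence length, whereas the one-hot and Z-curve outputs grow with the sequence and would need padding or truncation before being fed to a convolutional layer.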
Convolutional neural network; Bidirectional long short-term memory; Transfer learning methods; Four-sequence preprocessing; LncRNA–protein interactions; Deep Learning; Machine Learning; Neural Networks, Computer; Computational Biology - methods; Host-Pathogen Interactions - genetics; RNA, Long Noncoding - genetics - metabolism; Viral Proteins - metabolism - genetics; Humans; Science; Medicine; Q; R; 610; 630; Article
