
doi: 10.1101/2022.02.11.479495 , 10.3389/fgene.2022.876721 , 10.60692/5t1vq-c0q27 , 10.60692/rhefr-8mw24
pmid: 35685437
pmc: PMC9173695
Abstract: Long non-coding RNAs (lncRNAs) play crucial roles in many biological processes and are implicated in several diseases. With next-generation sequencing technologies, a substantial number of unannotated transcripts have been discovered. Classifying unannotated transcripts through biological experiments is more time-consuming and expensive than using computational approaches. Several tools for identifying long non-coding RNAs are available. These tools, however, do not explain which features contributed to their prediction results. Here, we present Xlnc1DCNN, a tool for distinguishing lncRNAs from protein-coding transcripts (PCTs) using a one-dimensional convolutional neural network with prediction explanations. Evaluation on the human test set showed that Xlnc1DCNN outperformed other state-of-the-art tools in terms of accuracy and F1-score. The explanation results revealed that lncRNA transcripts were mainly identified as sequences with no conserved regions or with a transmembrane helix region, while protein-coding transcripts were mostly classified by conserved protein domains or families. The explanation results also pointed to probable annotation inconsistencies among public databases, lncRNA transcripts that contain protein domains or families, and protein-coding transcripts that are nonsense-mediated decay or processed transcripts. Xlnc1DCNN is freely available at https://github.com/cucpbioinfo/Xlnc1DCNN.
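To make the idea of a one-dimensional convolution over a transcript concrete, here is a minimal pure-Python sketch. It is not the Xlnc1DCNN architecture: the one-hot encoding, the `ACGT` alphabet, the single hand-written `atg_filter`, and the helper names `one_hot`, `conv1d`, and `global_max_pool` are all assumptions made for illustration. A trained network would learn many such filters from data.

```python
from typing import List

NUCLEOTIDES = "ACGT"  # assumed alphabet; the abstract does not state the encoding used


def one_hot(seq: str) -> List[List[float]]:
    """Encode each base as a 4-element indicator vector; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]


def conv1d(x: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Slide one filter along the sequence axis (no padding, stride 1)."""
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        total = 0.0
        for i in range(k):
            for c in range(len(NUCLEOTIDES)):
                total += x[start + i][c] * kernel[i][c]
        out.append(total)
    return out


def global_max_pool(activations: List[float]) -> float:
    """Keep only the strongest response of the filter anywhere in the sequence."""
    return max(activations)


# Hypothetical hand-written filter that responds to the motif ATG.
atg_filter = [
    [1.0, 0.0, 0.0, 0.0],  # A
    [0.0, 0.0, 0.0, 1.0],  # T
    [0.0, 0.0, 1.0, 0.0],  # G
]

encoded = one_hot("CCATGGA")
activations = conv1d(encoded, atg_filter)  # one value per 3-base window
score = global_max_pool(activations)       # peaks where the window is exactly ATG
```

The position of the largest activation indicates which part of the sequence drove the filter's response. That intuition is only loosely related to the SHAP-based explanations listed in the keywords, which attribute a model's prediction to its input features.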
explainable artificial intelligence (XAI), Cancer Research, Artificial intelligence, Non-coding RNA Networks, one-dimensional convolutional neural network (1D CNN), Role of Long Noncoding RNAs in Cancer and Development, Convolutional neural network, QH426-470, long non-coding RNA (lncRNA), Gene, Computational biology, SHAP (SHapley additive exPlanations), Long Noncoding RNAs, Identification (biology), Biochemistry, Genetics and Molecular Biology, Genetics, FOS: Mathematics, RNA Sequencing Data Analysis, Molecular Biology, tRNA, Biology, Coding (social sciences), Statistics, Botany, deep learning, Life Sciences, Computer science, RNA Methylation and Modification in Gene Expression, FOS: Biological sciences, Long non-coding RNA, RNA, Mathematics
| selected citations | 2 |
| popularity | Average |
| influence | Average |
| impulse | Average |
