Downloads provided by UsageCounts
The human body produces more distinct proteins than it has protein-coding genes. This diversity stems from alternative splicing, in which the RNA transcribed from a gene can be cut and reassembled in different ways. Each alternative transcript can produce a different protein, allowing a single gene to yield a range of proteins. Some alternative transcripts, however, contain splicing errors and produce faulty proteins that are involved in genetic diseases. Understanding splicing patterns and profiles therefore has wide implications for our understanding of healthy and diseased tissue states. Currently, little is known about the splicing profiles of healthy tissue, which vary across individuals and, within an individual, by tissue type. This project therefore explored the use of RNA splicing data from chromosome 1 to predict the tissue type of non-cancerous samples using distribution analysis and supervised learning methods. The Kolmogorov-Smirnov test, used to classify samples based on their empirical cumulative distribution functions, was not able to reliably distinguish between tissue types. However, support vector machines (SVMs) achieved high classification accuracy, even when different splice junction representations were used. Overall, the findings suggest the utility of splice junction data in biological classification and set the foundation for future work mapping splicing patterns to phenotype.
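The abstract does not say how the Kolmogorov-Smirnov test was turned into a classifier, so the following is only a minimal sketch of one plausible reading: compute the two-sample KS statistic (the largest gap between two empirical cumulative distribution functions) between a new sample and a reference set for each tissue, then assign the tissue with the smallest statistic. The function names `ks_statistic` and `classify_by_ks`, the tissue labels, and the synthetic beta-distributed values are all hypothetical and stand in for real splice junction data.

```python
import bisect
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two ECDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # Both ECDFs are step functions that change only at observed values,
    # so the largest gap is attained at one of those values.
    for x in a_sorted + b_sorted:
        f_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        f_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(f_a - f_b))
    return d


def classify_by_ks(sample, references):
    """Return the label whose reference values are closest to the sample in KS distance."""
    return min(references, key=lambda label: ks_statistic(sample, references[label]))


# Synthetic, hypothetical data: per-junction values for two made-up tissue labels.
rng = random.Random(0)
references = {
    "tissue_a": [rng.betavariate(2, 5) for _ in range(500)],
    "tissue_b": [rng.betavariate(5, 2) for _ in range(500)],
}
new_sample = [rng.betavariate(2, 5) for _ in range(200)]
predicted = classify_by_ks(new_sample, references)
print(predicted)
```

The synthetic distributions here are deliberately well separated, so the nearest-reference rule is easy to satisfy. The abstract reports that on real splicing data this kind of distribution-based classification did not reliably distinguish tissue types, which is consistent with tissue-level distributions overlapping far more than in this toy setup.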
splice junctions, tissue classification, UVA MSDS 2023, RNA-seq, distribution analysis, support vector machines
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | 9 |
| downloads | 10 |

Views provided by UsageCounts