Powered by OpenAIRE graph
Padua Thesis and Dis...

Deep Learning For Genomic Sequences

Authors: GUARNIERI, LUCA

Abstract

Thanks to advanced techniques such as Next Generation Sequencing (NGS), the time and cost of DNA sequencing are steadily decreasing. These techniques offer high throughput, speed and scalability, so the amount of available DNA sequence data is greater than ever before. Nevertheless, decoding and understanding what is encoded within this great number of sequences urgently requires new technologies that can keep up with data production and comprehend the contextual information of genes scattered over long sequences of DNA. In the field of Genomics, Artificial Intelligence and Deep Learning promise great advances in the interpretation and classification of genomic sequences. Such models can learn and recognize significant genomic sequences and patterns without the need for expensive, time-consuming, complicated wet-lab experiments. Moreover, they have been shown to do so even when trained on limited data. This study describes the state-of-the-art deep-learning model architecture, namely the Transformer, and how it works. Afterward, two examples of its application to biological problems are presented: Nucleotide Transformers and Gena-LM, both of which implement advanced foundational DNA language models capable of high performance in numerous sequence prediction tasks. These works are described and compared. Lastly, both models are tested by fine-tuning them and assessing the performance of each on different datasets. All results and fine-tuned models can be found on the author's HuggingFace page: https://huggingface.co/LiukG

Country
Italy