
Background: Transcriptome profiling by RNA-seq has enhanced scientific understanding of gene regulation. Despite the benefits these data have brought in terms of transcriptome coverage and accuracy, considerable barriers to entry remain for novice computational biologists who wish to analyse these large data sets. There is a clear need for a repository of uniformly processed RNA-seq data that is easy to use and covers the major model organisms. Findings: To address these obstacles, we developed Digital Expression Explorer 2 (DEE2), a web-based repository of RNA-seq data in the form of gene-level and transcript-level expression counts. DEE2 contains over 400,000 RNA-seq data sets from several species, including yeast, Arabidopsis, worm, fruit fly, zebrafish, rat, mouse and human. Base-space sequence data downloaded from the NCBI Sequence Read Archive underwent quality analysis, filtering and trimming prior to transcriptome and genome alignment and read counting, all using open-source tools. Uniform reference genomes and data processing methods ensure consistency across experiments, facilitating fast and reproducible meta-analyses. Conclusions: The web interface enables users to quickly identify data sets of interest through accession number and keyword searches. These data can also be accessed programmatically using a specifically designed R script. We demonstrate that DEE2 data are compatible with statistical packages such as edgeR or DESeq. DEE2 can be found at http://dee2.io
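To make concrete what "gene-level expression counts" look like before they reach a package such as edgeR or DESeq, the following is a minimal Python sketch of a common pre-filtering step: computing counts per million (CPM) and dropping lowly expressed genes. It is not the DEE2 R script and not the edgeR implementation. The gene names, run accessions, count values, thresholds and function names are all hypothetical, and the nested-dictionary layout is an assumption made for illustration rather than the DEE2 file format.

```python
# Hypothetical gene-level count matrix: outer keys are genes, inner keys are
# sequencing runs. All names and numbers are made up for illustration only.
counts = {
    "geneA": {"run1": 5, "run2": 8},
    "geneB": {"run1": 0, "run2": 1},
    "geneC": {"run1": 999_995, "run2": 999_991},
}


def library_sizes(counts):
    """Total number of counted reads in each run."""
    sizes = {}
    for per_run in counts.values():
        for run, n in per_run.items():
            sizes[run] = sizes.get(run, 0) + n
    return sizes


def counts_per_million(counts):
    """Scale raw counts by library size so that runs are comparable."""
    sizes = library_sizes(counts)
    return {
        gene: {
            run: (n * 1e6 / sizes[run]) if sizes[run] else 0.0
            for run, n in per_run.items()
        }
        for gene, per_run in counts.items()
    }


def filter_low_expression(counts, min_cpm=1.0, min_runs=2):
    """Keep genes whose CPM reaches min_cpm in at least min_runs runs."""
    cpm = counts_per_million(counts)
    return {
        gene: per_run
        for gene, per_run in counts.items()
        if sum(value >= min_cpm for value in cpm[gene].values()) >= min_runs
    }


kept = filter_low_expression(counts)
print(sorted(kept))
```

The filter returns raw counts rather than CPM values because count-based statistical packages generally expect unnormalised integer counts as input and apply their own normalisation.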