
pmid: 30184049
Abstract

Summary: A number of limiting factors mean that traditional genome annotation tools either fail or perform sub-optimally when trying to detect coding sequences in poor-quality genome assemblies/genome reports. As a result, potentially useful data are accessible only to those with specific skills and expertise in assembly and annotation. We present an Assembled-Genome mIning pipeLinE (AGILE), written in Perl, that combines bioinformatics tools with a number of steps to overcome the limitations that highly fragmented genome assemblies impose. Our methodology uses user-specified query genes from a closely related species to mine and annotate coding sequences that standard annotation packages would traditionally miss. Despite a focus on mammalian genomes, the generalized implementation means that it may be applied to any genome assembly, providing a means for non-specialists to gather gene sequences for downstream analyses.

Availability and implementation: Source code and associated files are available at https://github.com/batlabucd/GenomeMining and https://bitbucket.org/BatlabUCD/genomemining/src. Singularity and VirtualBox images are available at https://figshare.com/s/a0004bf93dc43484b0c0.

Supplementary information: Supplementary data are available at Bioinformatics online.
Genome, Animals, Data Mining, Exons, Genomics, Software
| selected citations | 5 |
| popularity | Average |
| influence | Average |
| impulse | Top 10% |
