
With the advent of short-read RNA-seq technologies, transcriptome assembly has become both more accessible and more complicated. Assembling transcripts without a reference genome, known as de novo transcriptome assembly, remains the only option for transcriptomic exploration in most non-model organisms, where no reference genome is available or where existing references are too divergent. Inexact repeats in the transcriptome generate complex regions in the assembly graph that are difficult to resolve. Among the most problematic repeats are transposable elements (TEs), mobile sequences capable of copying and inserting themselves throughout the genome. Their high copy number and sequence similarity introduce ambiguities in read mapping and transcript structure inference. These issues are especially severe in de novo assemblies, where no reference exists to anchor and disambiguate repetitive reads, leading to tangled graph structures and misassemblies. We work with De Bruijn graphs, an efficient data structure in which each transcript corresponds to a path. Our research focuses on characterising complex regions that contain families of repeats and replacing them with consensus nodes. This novel method is designed to operate de novo, without relying on genomic references or repeat consensus sequences. It thereby aims to avoid the ambiguous mapping of TEs while using widely available short-read data, which makes it applicable to non-model species.
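To make the graph view concrete, the following minimal sketch builds a toy De Bruijn graph and shows how a shared repeat creates a branching node. It is an illustration under stated assumptions, not the method described above: the function names (`build_de_bruijn`, `is_path`, `branching_nodes`), the two toy sequences, and the choice k = 4 are all hypothetical, and the repeat here is exact, whereas the abstract concerns inexact repeats.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Return a dict mapping each (k-1)-mer node to the set of its successors.

    Every k-mer found in a read contributes one edge, from its (k-1)-mer
    prefix to its (k-1)-mer suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def is_path(graph, sequence, k):
    """True if every k-mer of `sequence` is an edge of the graph."""
    return all(
        sequence[i + 1:i + k] in graph.get(sequence[i:i + k - 1], set())
        for i in range(len(sequence) - k + 1)
    )


def branching_nodes(graph):
    """Nodes with more than one successor, where a traversal must choose."""
    return sorted(node for node, succs in graph.items() if len(succs) > 1)


# Toy data (hypothetical): two transcripts sharing the repeat "GATTACA".
# For simplicity each transcript is used as a single error-free read.
transcript_a = "AAGGATTACACC"
transcript_b = "TTCGATTACAGT"
graph = build_de_bruijn([transcript_a, transcript_b], k=4)

# Both transcripts are paths in the graph.
print(is_path(graph, transcript_a, 4), is_path(graph, transcript_b, 4))

# The shared repeat ends in a node with two possible continuations.
print(branching_nodes(graph))

# A chimera (start of A, end of B) is also a valid path: the graph alone
# cannot tell it apart from a real transcript.
chimera = "AAGGATTACAGT"
print(is_path(graph, chimera, 4))
```

In this toy case the graph has a single branching node at the end of the repeat, and the chimeric sequence is accepted as a path. This is the kind of ambiguity that repeat families introduce, here reduced to its smallest form.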
Algorithm, Short reads, [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS], [SDV.GEN] Life Sciences [q-bio]/Genetics, RNA-seq, De Bruijn Graph, k-mers, [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]
