Models and Algorithms for Managing Repeats in the De Novo Assembly of Transcriptomes

Keywords: Algorithm, Short reads, [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS], [SDV.GEN] Life Sciences [q-bio]/Genetics, RNA-seq, De Bruijn Graph, k-mers, [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]

Darmon, Sasha; Mary, Arnaud; Lacroix, Vincent

Conference object . 2025

License: CC BY NC SA

Data sources: INRIA2; HAL INRAE; INRIA a CCSD electronic archive server

Publication: Conference object, 01 Jan 2025, France, English

Abstract

With the advent of short-read RNA-seq technologies, transcriptome assembly has become both more accessible and more complicated. Assembling a transcriptome without a reference, known as de novo transcriptome assembly, remains the only option for transcriptomic exploration in most non-model organisms, where no reference genome is available or where existing references are too divergent. Inexact repeats in the transcriptome generate complex regions in the assembly graph that are difficult to resolve. Among the most problematic repeats are transposable elements (TEs), mobile sequences capable of copying and inserting themselves throughout the genome. Their high copy number and sequence similarity introduce ambiguities in read mapping and transcript structure inference. These issues are especially severe in de novo assembly, where no reference exists to anchor and disambiguate repetitive reads, leading to tangled graph structures and misassemblies. We use De Bruijn graphs, an efficient data structure in which each transcript corresponds to a path in the graph. Our research focuses on characterising complex regions that contain families of repeats and replacing them with consensus nodes. The method is designed to operate de novo, without relying on genomic references or repeat consensus sequences. It thereby aims to avoid the ambiguous mapping of TEs while using widely available short-read data, which makes it applicable to non-model species.
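To make the graph vocabulary of the abstract concrete, the following is a minimal sketch of a De Bruijn graph in Python. It is not the authors' method: it does not characterise repeat families or build consensus nodes. It only illustrates, on two invented toy transcripts that share an exact repeat, how a transcript corresponds to a path and how a shared repeat creates branching nodes. The sequences, the value of `K`, and the helper names `build_de_bruijn`, `is_path`, and `branching_nodes` are all assumptions made for illustration. The sketch also ignores reverse complements, sequencing errors, and k-mer coverage, and it uses an exact repeat where the abstract is concerned with inexact ones.

```python
from collections import defaultdict

def build_de_bruijn(sequences, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def is_path(graph, seq, k):
    """Return True if every k-mer of seq is an edge of the graph."""
    return all(
        seq[i + 1:i + k] in graph.get(seq[i:i + k - 1], ())
        for i in range(len(seq) - k + 1)
    )

def branching_nodes(graph):
    """Return nodes with more than one successor or more than one predecessor."""
    indegree = defaultdict(int)
    for successors in graph.values():
        for nxt in successors:
            indegree[nxt] += 1
    nodes = set(graph) | set(indegree)
    return {n for n in nodes if len(graph.get(n, ())) > 1 or indegree[n] > 1}

# Hypothetical toy data: two transcripts sharing the same repeat.
K = 4
REPEAT = "TGATTACA"
t1 = "CCG" + REPEAT + "GGTC"
t2 = "AAC" + REPEAT + "TCGG"
graph = build_de_bruijn([t1, t2], K)

# Each input transcript is a path in the graph.
print(is_path(graph, t1, K), is_path(graph, t2, K))  # True True

# A chimera (start of t1, end of t2) is also a path: the repeat makes
# the graph ambiguous about which flanks belong together.
chimera = "CCG" + REPEAT + "TCGG"
print(is_path(graph, chimera, K))  # True

# The repeat is bounded by the nodes where paths merge and diverge.
print(sorted(branching_nodes(graph)))  # ['ACA', 'TGA']
```

In this toy graph the two transcripts merge at node `TGA` and diverge at node `ACA`, so the repeat shows up as a shared stretch between two branching nodes. With many inexact copies, as with TE families, such regions become far more tangled, which is the situation the abstract describes.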

Country

France

Related to Research communities

INRIA

INRAE