
Molecular phylogenetic techniques do not generally account for such common evolutionary events as site insertions and deletions (known as indels). Instead, tree-building algorithms and ancestral state inference procedures typically rely on substitution-only models of sequence evolution. In practice, these methods are extended beyond this simplified setting through heuristics that produce global alignments of the input sequences, an important problem that has no rigorous model-based solution. In this paper we consider a new version of the multiple sequence alignment problem in the context of stochastic indel models. More precisely, we introduce the following trace reconstruction problem on a tree (TRPT): a binary sequence is broadcast through a tree channel in which substitutions, deletions, and insertions are allowed; we seek to reconstruct the original sequence from the sequences received at the leaves of the tree. We give a recursive procedure for this problem with strong reconstruction guarantees at low mutation rates, which also provides an alignment of the sequences at the leaves of the tree. The TRPT without indels has been studied in previous work (Mossel 2004, Daskalakis et al. 2006) as a bootstrapping step towards obtaining optimal phylogenetic reconstruction methods. The present work sets up a framework for extending these works to evolutionary models with indels.
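To make the TRPT setting concrete, the following is a minimal Python sketch of the forward channel only: a binary sequence is copied down a complete binary tree, and each edge independently applies substitutions, deletions, and insertions. The per-site, per-edge probabilities `p_sub`, `p_del`, and `p_ins`, the complete binary tree shape, and the function names `mutate` and `broadcast` are illustrative assumptions, not the model or notation of the paper, and the sketch does not attempt the recursive reconstruction procedure.

```python
import random


def mutate(seq, p_sub, p_del, p_ins, rng):
    """Pass one binary sequence through a single edge (assumed i.i.d. per-site model)."""
    out = []
    for bit in seq:
        if rng.random() < p_del:
            continue                      # deletion: the site is dropped
        if rng.random() < p_sub:
            bit ^= 1                      # substitution: the bit is flipped
        out.append(bit)
        if rng.random() < p_ins:
            out.append(rng.randrange(2))  # insertion: a uniform random bit follows the site
    return out


def broadcast(root, depth, p_sub, p_del, p_ins, seed=0):
    """Broadcast `root` down a complete binary tree; return the 2**depth leaf sequences."""
    rng = random.Random(seed)
    level = [list(root)]
    for _ in range(depth):
        # Each parent sequence produces two independently mutated children.
        level = [mutate(parent, p_sub, p_del, p_ins, rng)
                 for parent in level
                 for _child in range(2)]
    return level


if __name__ == "__main__":
    leaves = broadcast([0, 1, 1, 0, 1, 0, 0, 1], depth=3,
                       p_sub=0.05, p_del=0.05, p_ins=0.05, seed=1)
    for leaf in leaves:
        print("".join(map(str, leaf)))
```

Because of deletions and insertions, the leaf sequences generally differ in length, which is why recovering the root also calls for aligning the leaves rather than comparing them position by position.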
Statistics and Probability, FOS: Computer and information sciences, Systems biology, networks, branching processes, Mathematics - Statistics Theory, Statistics Theory (math.ST), Quantitative Biology - Quantitative Methods, Problems related to evolution, Modelling and Simulation, Applications of branching processes, phylogenetic inference, Computer Science - Data Structures and Algorithms, FOS: Mathematics, Data Structures and Algorithms (cs.DS), Computational methods for problems pertaining to biology, Quantitative Biology - Populations and Evolution, Quantitative Methods (q-bio.QM), Applied Mathematics, Probability (math.PR), Populations and Evolution (q-bio.PE), Branching processes, FOS: Biological sciences, Phylogenetic inference, Markov models on trees, Mathematics - Probability
