
Abstract: Whole genome sequencing is increasingly used to identify Mendelian variants in clinical pipelines. These pipelines focus on single nucleotide variants (SNVs) and structural variants, while ignoring more complex repeat sequence variants. We consider the problem of genotyping Variable Number Tandem Repeats (VNTRs), which are composed of inexact tandem duplications of short (6-100 bp) repeating units. VNTRs span 3% of the human genome, are frequently present in coding regions, and have been implicated in multiple Mendelian disorders. While existing tools recognize VNTR-carrying sequence, genotyping VNTRs (determining repeat unit count and sequence variation) from whole genome sequencing reads remains challenging. We describe a method, adVNTR, that uses Hidden Markov Models to model each VNTR, count repeat units, and detect sequence variation. adVNTR models can be developed for short-read (Illumina) and single-molecule (PacBio) whole genome and exome sequencing, and show good results on multiple simulated and real data sets. adVNTR is available at https://github.com/mehrdadbakhtiari/adVNTR
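The repeat-unit counting task described in the abstract can be illustrated with a toy sketch. This is not adVNTR's Hidden Markov Model and does not use the adVNTR code base; it substitutes a much simpler stand-in (plain edit distance against k exact copies of the unit) only to show what "counting inexact repeat units" means. The function names `edit_distance` and `estimate_repeat_count`, the `max_units` parameter, and the example sequences are all hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def estimate_repeat_count(region: str, unit: str, max_units: int = 50) -> int:
    """Return the k in 0..max_units whose k exact copies of `unit` are closest to `region`.

    Toy stand-in for a probabilistic model: it tolerates mismatches and
    small indels inside units, but assumes `region` contains only the
    repeat (no flanking sequence) and that the unit is known in advance.
    """
    best_k, best_d = 0, len(region)  # k = 0 means comparing against the empty string
    for k in range(1, max_units + 1):
        d = edit_distance(region, unit * k)
        if d < best_d:
            best_k, best_d = k, d
    return best_k


# Hypothetical 6 bp unit; the middle copy carries one substitution (T -> A).
unit = "ACGTTG"
region = "ACGTTG" + "ACGATG" + "ACGTTG"
print(estimate_repeat_count(region, unit))
```

A real genotyper also has to handle flanking sequence, reads that only partially span the repeat, and platform-specific error profiles, which is where a per-VNTR probabilistic model such as the HMM described above becomes useful.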
Keywords: Genome, Human; Human Genome; Polymorphism, Genetic; Genotyping Techniques; Minisatellite Repeats; Markov Chains; Humans; Genetics; Bioinformatics; Bioinformatics and Computational Biology; Biological Sciences; Medical and Health Sciences; Method
