
Abstract

Motivation: Identifying somatic changes from tumor and matched normal sequences has become a standard approach in cancer research. More specifically, this requires accurate detection of somatic point mutations with low allele frequencies in impure and heterogeneous cancer samples. Although haplotype phasing information derived by using heterozygous germ line variants near candidate mutations would improve accuracy, no somatic mutation caller that uses such information is currently available.

Results: We propose a Bayesian hierarchical method, termed HapMuC, in which power is increased by using available information on heterozygous germ line variants located near candidate mutations. We first constructed two generative models (the mutation model and the error model). In the generative models, we prepared candidate haplotypes, considering a heterozygous germ line variant if available, and the observed reads were realigned to the haplotypes. We then inferred the haplotype frequencies and computed the marginal likelihoods using a variational Bayesian algorithm. Finally, we derived a Bayes factor for evaluating the possibility of the existence of somatic mutations. We also demonstrated that our algorithm has superior specificity and sensitivity compared with existing methods, as determined based on a simulation, the TCGA Mutation Calling Benchmark 4 datasets and data from the COLO-829 cell line.

Availability and implementation: The HapMuC source code is available from http://github.com/usuyama/hapmuc.

Contact: imoto@ims.u-tokyo.ac.jp

Supplementary information: Supplementary data are available at Bioinformatics online.
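The intuition behind comparing a mutation model with an error model can be illustrated with a toy Bayes factor. The sketch below is not the HapMuC implementation: it assumes reads have already been assigned to one of the two germ line haplotypes, fixes a hypothetical background error rate `eps`, uses uniform priors, and replaces the variational Bayesian algorithm with simple grid integration. All function and parameter names are illustrative. Under the toy mutation model, non-reference reads concentrate on a single haplotype; under the toy error model, they appear on both haplotypes at a shared rate.

```python
import math


def _log_mean_exp(log_values):
    """Numerically stable log of the mean of exp(log_values)."""
    m = max(log_values)
    return m + math.log(sum(math.exp(v - m) for v in log_values) / len(log_values))


def _log_lik(counts, p_alt_h1, p_alt_h2):
    """Log-likelihood of (ref, alt) read counts on each haplotype.

    Binomial coefficients are omitted because they cancel in the Bayes factor.
    """
    (ref1, alt1), (ref2, alt2) = counts
    return (alt1 * math.log(p_alt_h1) + ref1 * math.log(1.0 - p_alt_h1)
            + alt2 * math.log(p_alt_h2) + ref2 * math.log(1.0 - p_alt_h2))


def log_bayes_factor(counts, eps=0.01, grid=2000):
    """Toy log Bayes factor: mutation model versus error model.

    counts: ((ref_h1, alt_h1), (ref_h2, alt_h2)), reads phased by a nearby
            heterozygous germ line variant (assumed given).
    eps:    assumed background sequencing error rate (hypothetical value).
    grid:   number of midpoints used for numerical integration.
    """
    mids = [(i + 0.5) / grid for i in range(grid)]

    # Mutation model: a fraction f ~ Uniform(0, 1) of one haplotype carries
    # the mutation; the other haplotype shows alt reads only through errors.
    # The carrying haplotype is unknown, so both choices are averaged.
    mut_terms = []
    for f in mids:
        p = eps + f * (1.0 - 2.0 * eps)
        mut_terms.append(_log_lik(counts, p, eps))
        mut_terms.append(_log_lik(counts, eps, p))
    log_m_mutation = _log_mean_exp(mut_terms)

    # Error model: alt reads arise on both haplotypes at a shared rate
    # e ~ Uniform(0, 0.5), independent of haplotype.
    err_terms = [_log_lik(counts, 0.5 * e, 0.5 * e) for e in mids]
    log_m_error = _log_mean_exp(err_terms)

    return log_m_mutation - log_m_error


if __name__ == "__main__":
    # Alt reads confined to haplotype 1 versus spread over both haplotypes.
    print(log_bayes_factor(((20, 5), (25, 0))))
    print(log_bayes_factor(((22, 3), (23, 2))))
```

In this toy setting, the same total number of non-reference reads yields a positive log Bayes factor when they all fall on one haplotype and a negative one when they are split across both, which is the kind of evidence that phasing information contributes.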
Heterozygote, DNA Mutational Analysis, Bayes Theorem, Original Papers, Polymorphism, Single Nucleotide, Gene Frequency, Haplotypes, Cell Line, Tumor, Neoplasms, Mutation, Humans, Algorithms
