
pmid: 30308178
Abstract The first terms of the Wright-Fisher (WF) site frequency spectrum that follow the coalescent approximation are determined precisely, with a view to understanding the accuracy of the coalescent approximation for large samples. The perturbing terms show that the probability of a single mutant in the sample (singleton probability) is elevated in WF but the rest of the frequency spectrum is lowered. A part of the perturbation can be attributed to a mismatch in rates of merger between WF and the coalescent. The rest of it can be attributed to the difference in the way WF and the coalescent partition children between parents. In particular, the number of children of a parent is approximately Poisson under WF and approximately geometric under the coalescent. Whereas the mismatch in rates raises the probability of singletons under WF, its offspring distribution being approximately Poisson lowers it. The two effects are of opposite sense everywhere except at the tail of the frequency spectrum. The WF frequency spectrum begins to depart from that of the coalescent only for sample sizes that are comparable to the population size. These conclusions are confirmed by a separate analysis that assumes the sample size n to be equal to the population size N. Partly thanks to the canceling effects, the total variation distance between WF and the coalescent is 0.12 / log N for a population-sized sample with n = N, which is only about 1% for N = 2 × 10^4.
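The closing figure in the abstract can be checked with a few lines of arithmetic. The sketch below assumes that "log" means the natural logarithm (the abstract does not say so explicitly, but that reading reproduces the quoted 1%); the function name `tv_distance_estimate` is hypothetical and only evaluates the stated 0.12 / log N expression.

```python
import math


def tv_distance_estimate(population_size: float) -> float:
    """Evaluate the abstract's 0.12 / log N estimate for a sample with n = N.

    Assumption: 'log' is the natural logarithm. The constant 0.12 is taken
    directly from the abstract; nothing here is derived independently.
    """
    if population_size <= 1:
        raise ValueError("population_size must be greater than 1")
    return 0.12 / math.log(population_size)


if __name__ == "__main__":
    n_pop = 2e4  # N = 2 x 10^4, the value quoted in the abstract
    print(f"N = {n_pop:.0f}: estimated TV distance = {tv_distance_estimate(n_pop):.4f}")
```

For N = 2 × 10^4 this evaluates to roughly 0.012, in line with the "only about 1%" statement. Because the estimate decays like 1 / log N, it shrinks slowly as N grows.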
Population Density, sample frequency spectrum, Models, Genetic, Reproduction, Protein sequences, DNA sequences, coalescent, Applications of statistics to biology and medical sciences; meta analysis, Genetics, Population, Problems related to evolution, Gene Frequency, Mutation, Humans, Computer Simulation, Poisson Distribution, Genetics and epigenetics, Wright-Fisher, Applications of Brownian motions and diffusion theory (population genetics, absorption problems, etc.), multiple mergers, Probability
| selected citations | 5 |
| popularity | Top 10% |
| influence | Average |
| impulse | Average |
