
Abstract: We propose models for describing replacement rate variation in genes and proteins, in which the profile of relative replacement rates along a given sequence is defined as a function of the site number. We consider two types of functions, one derived from the cosine Fourier series and the other from discrete wavelet transforms. The number of parameters used to characterize the substitution rates along the sequences can be changed flexibly; in their most parameter-rich versions, both the Fourier and wavelet models become equivalent to the unrestricted-rates model, in which each site of a sequence alignment evolves at its own rate. When applied to a few real data sets, the new models appeared to fit the data better than the discrete gamma model as judged by the Akaike information criterion and the likelihood-ratio test, although the parametric bootstrap version of the Cox test, performed for one of the data sets, indicated that the difference in likelihoods between the two models is not significant. The new models are applicable to testing biological hypotheses such as the statistical identity of rate variation profiles among homologous protein families. They are also useful for identifying regions in genes and proteins that evolve significantly faster or slower than the sequence average. We illustrate the application of the new method by analyzing human immunoglobulin and Drosophilid alcohol dehydrogenase sequences.
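The idea of a rate profile defined as a function of the site number can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the parameterization used in the paper: the abstract does not give the exact functional form, so the log link (to keep rates positive), the mapping of site numbers into the interval (0, 1), the rescaling to a mean rate of 1, and the names `cosine_rate_profile` and `aic` are all hypothetical choices made here. The `aic` helper uses the standard definition, AIC = 2k - 2 ln L.

```python
import math


def cosine_rate_profile(coeffs, n_sites):
    """Relative rates for sites 1..n_sites from cosine-series coefficients.

    Assumption (not taken from the paper): the log of the rate is a cosine
    series in the site position, and rates are rescaled to average 1.
    """
    raw = []
    for site in range(1, n_sites + 1):
        x = (site - 0.5) / n_sites  # site position mapped into (0, 1)
        log_rate = sum(a * math.cos(math.pi * k * x)
                       for k, a in enumerate(coeffs, start=1))
        raw.append(math.exp(log_rate))  # exponentiate so every rate is positive
    mean = sum(raw) / n_sites
    return [r / mean for r in raw]  # relative rates with mean 1


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


# Two coefficients give a smooth profile; more coefficients allow finer detail.
rates = cosine_rate_profile([0.5, -0.2], n_sites=10)
print([round(r, 3) for r in rates])
```

In this sketch the number of coefficients controls how flexible the profile is: an empty coefficient list gives equal rates at every site, and adding terms lets the profile follow finer-scale variation, which mirrors the abstract's point that the parameter count can be adjusted up to the unrestricted-rates limit.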
570, 510, Confidence Intervals, Genetic Vectors, Models (Genetic), Fourier Analysis, Genes (Immunoglobulin), Likelihood Functions, Animals, Humans, Mammals, Drosophila, Alcohol Dehydrogenase, Genetic Variation, SUPPORT-U-S-GOVT-P-H-S, SUPPORT-U-S-GOVT-NON-P-H-S
