
AbstractAnimal age data are valuable for management of wildlife populations. Yet, for most species, there is no practical method for determining the age of unknown individuals. However, epigenetic clocks, a molecular‐based method, are capable of age prediction by sampling specific tissue types and measuring DNA methylation levels at specific loci. Developing an epigenetic clock requires a large number of samples from animals of known ages. For most species, there are no individuals whose exact ages are known, making epigenetic clock calibration inaccurate or impossible. For many epigenetic clocks, calibration samples with inaccurate age estimates introduce a degree of error to epigenetic clock calibration. In this study, we investigated how much error in the training data set of an epigenetic clock can be tolerated before it resulted in an unacceptable increase in error for age prediction. Using four publicly available data sets, we artificially increased the training data age error by iterations of 1% and then tested the model against an independent set of known ages. A small effect size increase (Cohen's d >0.2) was detected when the error in age was higher than 22%. The effect size increased linearly with age error. This threshold was independent of sample size. Downstream applications for age data may have a more important role in deciding how much error can be tolerated for age prediction. If highly precise age estimates are required, then it may be futile to embark on the development of an epigenetic clock when there is no accurately aged calibration population to work with. However, for other problems, such as determining the relative age order of pairs of individuals, a lower‐quality calibration data set may be adequate.
bioinfomatics/phyloinfomatics, molecular evolution, Evolution, QH359-425, wildlife management, Original Articles
bioinfomatics/phyloinfomatics, molecular evolution, Evolution, QH359-425, wildlife management, Original Articles
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 7 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
