
arXiv: 2312.01870
Abstract: Citizen science mobilizes many observers and gathers huge datasets but often without strict sampling protocols, resulting in observation biases due to heterogeneous sampling effort, which can lead to biased predictions. We develop a spatio-temporal Bayesian hierarchical model for bias-corrected estimation of arrival dates of the first migratory bird individuals at their breeding sites. Higher sampling effort could be correlated with earlier observed dates. We implement data fusion of two citizen-science datasets with fundamentally different protocols (Breeding Bird Survey, eBird) and obtain posterior distributions of the latent process, which contains four spatial components endowed with Gaussian process priors: species niche; sampling effort; position and scale parameters of annual first arrival date. The data layer consists of four response variables: counts of observed eBird locations (Poisson); presence–absence at observed eBird locations (Binomial); BBS occurrence counts (Poisson); first arrival dates (generalized extreme-value). We devise a Markov chain Monte Carlo scheme and check by simulation that the latent process components are identifiable. We apply our model to several migratory bird species in the northeastern US for 2001–2021 and find that the sampling effort significantly modulates the observed first arrival dates. We exploit this relationship to effectively bias-correct predictions of the true first arrivals.
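The link between sampling effort and earlier observed first arrival dates can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes, purely for illustration, that each observation detects a bird on a day drawn from a Gaussian distribution, and that the recorded first arrival is the minimum over all observations at a site. The function names, the mean day of 120, and the 10-day spread are hypothetical. Under these assumptions, more observations push the recorded minimum earlier even though the underlying arrival distribution is unchanged, which is the kind of bias the abstract describes.

```python
import random
import statistics


def simulate_first_observed_day(n_observations, true_mean_day=120.0,
                                spread_days=10.0, rng=None):
    """Earliest of n simulated detection days (hypothetical toy model).

    Each detection day is an independent Gaussian draw; this is an
    illustrative assumption, not the paper's data model.
    """
    rng = rng or random.Random()
    return min(rng.gauss(true_mean_day, spread_days)
               for _ in range(n_observations))


def mean_first_day(n_observations, n_replicates=2000, seed=0):
    """Average recorded first-arrival day over many simulated seasons."""
    rng = random.Random(seed)
    return statistics.fmean(
        simulate_first_observed_day(n_observations, rng=rng)
        for _ in range(n_replicates)
    )


if __name__ == "__main__":
    # Higher effort (more observations) yields an earlier recorded first day.
    for effort in (1, 10, 100):
        print(f"effort={effort:>3}  mean first observed day={mean_first_day(effort):.1f}")
```

Because the recorded date is a sample minimum, its distribution for large samples is governed by extreme-value theory, which is consistent with the abstract's choice of a generalized extreme-value response for first arrival dates.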
Methodology (stat.ME), FOS: Computer and information sciences, opportunistic data, species distribution, Applications (stat.AP), sampling effort, [INFO.INFO-MO] Computer Science [cs]/Modeling and Simulation, bias correction, Statistics - Applications, Bayesian hierarchical model, Statistics - Methodology, bird phenology
