
doi: 10.1002/sta4.70061
ABSTRACT
Modelling time series data with complex dependencies poses significant challenges. While higher‐order Markov chains can capture longer memory of the process, they can suffer from the curse of dimensionality. Variable length Markov chains (VLMCs) offer a more efficient alternative by focusing on relevant past information. This paper introduces a novel variable selection method for covariates in VLMCs. The goal is to identify significant covariates and their optimal lag structures, resulting in improved model interpretability and predictive performance. By leveraging the connection between model testing and variable selection, we develop a parsimonious and sparse framework that is consistent in the sense of recovering the true dependence of the past and covariates with high probability as the sample size increases. Simulation studies demonstrate the superior performance of the proposed method in recovering the true model structure compared with existing approaches. A real‐data analysis of Dengue incidence in Brazil is conducted to demonstrate the practicality of the proposed method.
Keywords: variable length Markov chains, Statistics, exogenous covariates, Bonferroni, false discovery rate, variable selection
