
handle: 1959.4/101493
Many alternative approaches for selecting mortality models and calibration periods have been proposed. The usual practice is to base forecasts on a single mortality model selected using in-sample goodness-of-fit measures and an arbitrarily chosen calibration period. However, cross-validation measures are increasingly used for model selection and calibration period selection, and model combination methods are becoming a common alternative to relying on a single mortality model and calibration period.
First, we propose a stacked regression ensemble that optimally combines different mortality models to reduce out-of-sample mean squared errors and mitigate model uncertainty. Stacked regression uses a meta-learner to approximate horizon-specific weights by minimizing a cross-validation criterion for each forecasting horizon. The horizon-specific weights determine a mortality model combination customized to each horizon. We use 44 populations from the Human Mortality Database (HMD) to compare the stacked regression ensemble with alternative methods. Based on one-year-ahead to 15-year-ahead out-of-sample mean squared errors, we show that the stacked regression ensemble improves mortality forecast accuracy over individual mortality models by 13% to 49% for males and 19% to 90% for females.
Second, we propose an automated procedure that selects multiple diverse starting points for fitting a mortality model, together with weights for combining the out-of-sample forecasts from the selected periods, using methods such as lasso regression. Using male mortality data for 19 populations from the HMD, we find that combining forecasts from multiple fitting periods produces a lower mean squared error of mortality rate forecasts than fitting the mortality models to the longest calibration period. For example, for the Age-Period-Cohort model, the gain in forecast accuracy of the combined mortality rate forecasts relative to the longest fitting period is between 11.7% and 31.5% across forecast horizons for the 19 male populations.
Lastly, we propose a new interpretable stacked regression ensemble (ISRE), which expresses a standard stacked regression ensemble in terms of diverse mortality features from individual mortality models. This directly allows the contribution of each feature to the out-of-sample mortality forecasts to be interpreted. Our empirical experiments based on US male data from the HMD show that ISRE can attain out-of-sample forecast errors similar to those of the standard stacked regression ensemble while allowing for model interpretability.
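As an illustration of the horizon-specific weighting described above, the following is a minimal sketch rather than the method used in the thesis. It assumes that held-out (cross-validation) forecasts from each model are already available for each horizon, and it uses unconstrained least squares as a stand-in meta-learner; the thesis's actual meta-learner and any constraints on the weights are not stated here. The function names `horizon_weights` and `combine`, the array shapes, and the synthetic data are all hypothetical.

```python
import numpy as np

def horizon_weights(cv_forecasts, cv_actuals):
    """Fit one weight vector per forecasting horizon.

    cv_forecasts: array of shape (H, N, M) holding held-out forecasts from
                  M models for N observations at each of H horizons.
    cv_actuals:   array of shape (H, N) holding the matching observed values.
    Returns an array of shape (H, M).
    Assumption: plain least squares stands in for the meta-learner.
    """
    n_horizons, _, n_models = cv_forecasts.shape
    weights = np.empty((n_horizons, n_models))
    for h in range(n_horizons):
        # Minimize the squared error of the weighted combination at horizon h.
        w, *_ = np.linalg.lstsq(cv_forecasts[h], cv_actuals[h], rcond=None)
        weights[h] = w
    return weights

def combine(forecasts, weights):
    """Apply horizon-specific weights: (H, N, M) and (H, M) give (H, N)."""
    return np.einsum("hnm,hm->hn", forecasts, weights)

# Synthetic illustration (not HMD data): model 0 is accurate at the shortest
# horizon, model 1 at the longest.
rng = np.random.default_rng(0)
H, N, M = 3, 200, 2
truth = rng.normal(size=(H, N))
noise_scale = np.array([[0.1, 1.0], [0.5, 0.5], [1.0, 0.1]])
forecasts = truth[:, :, None] + rng.normal(size=(H, N, M)) * noise_scale[:, None, :]

weights = horizon_weights(forecasts, truth)
stacked = combine(forecasts, weights)

mse_stacked = ((stacked - truth) ** 2).mean(axis=1)             # shape (H,)
mse_single = ((forecasts - truth[:, :, None]) ** 2).mean(axis=1)  # shape (H, M)
```

In this toy setting the fitted weights shift from the first model to the second as the horizon grows, which mirrors the idea of a model combination customized to each horizon. The errors computed here are measured on the same data used to fit the weights, so they only illustrate the mechanics and say nothing about out-of-sample gains.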
Average forecasts across models and windows, 330, age-period-cohort model, Stacked regression, ensemble learning, model uncertainty, mortality forecasting, Structural breaks and forecasting, anzsrc-for: 490508 Statistical data science, model combination, uncertainty, 490508 Statistical data science, cross-validation
