A quasi-Monte Carlo comparison of developments in parametric and semi-parametric regression methods for heavy-tailed and non-normal data : with an application to healthcare costs
Jones, Andrew Michael
Health econometrics; healthcare costs; heavy tails; quasi-Monte Carlo
We conduct a quasi-Monte Carlo comparison of the recent developments in parametric and semi-parametric regression methods for healthcare costs against each other and against standard practice. The population of English NHS hospital inpatient episodes for the financial year 2007-2008 (summed for each patient: 6,164,114 observations in total) is randomly divided into two equally sized sub-populations to form an estimation and a validation set. Evaluating out-of-sample using the validation set, a conditional density estimator shows considerable promise in forecasting conditional means, performing best for accuracy of forecasting and amongst the best four (of sixteen compared) for bias and goodness-of-fit. The best performing model for bias is linear regression with square root transformed dependent variable, and a generalised linear model with square root link function and Poisson distribution performs best in terms of goodness-of-fit. Commonly used models utilising a log-link are shown to perform badly relative to other models considered in our comparison.
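The split-sample design described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic (a single covariate and gamma-distributed costs stand in for the NHS records), the function names are hypothetical, and the three error measures (mean prediction error, mean absolute prediction error, root mean squared error) are common choices assumed here rather than the paper's stated definitions of bias, accuracy and goodness-of-fit. Only one of the sixteen models, linear regression on square-root-transformed costs, is shown, with retransformation under an assumed homoskedastic error.

```python
import numpy as np


def split_sample(n, rng):
    """Randomly divide indices 0..n-1 into two equally sized, disjoint halves."""
    perm = rng.permutation(n)
    half = n // 2
    return perm[:half], perm[half:2 * half]


def fit_sqrt_ols(x, y):
    """OLS of sqrt(y) on x with an intercept; returns (beta, residual variance)."""
    design = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(design, np.sqrt(y), rcond=None)
    resid = np.sqrt(y) - design @ beta
    return beta, resid @ resid / len(y)


def predict_sqrt_ols(x, beta, sigma2):
    """Retransform to the cost scale.

    If sqrt(y) = x'b + e with E[e] = 0 and Var(e) = sigma2 (assumed constant),
    then E[y | x] = (x'b)^2 + sigma2.
    """
    design = np.column_stack([np.ones(len(x)), x])
    return (design @ beta) ** 2 + sigma2


# Synthetic, right-skewed "costs" (an assumption; not the NHS data).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
mean_cost = np.exp(7.0 + 0.5 * x)
y = rng.gamma(shape=0.5, scale=mean_cost / 0.5)

# Estimation / validation split, fit on one half, evaluate on the other.
est_idx, val_idx = split_sample(n, rng)
beta, sigma2 = fit_sqrt_ols(x[est_idx], y[est_idx])
y_hat = predict_sqrt_ols(x[val_idx], beta, sigma2)

errors = y[val_idx] - y_hat
mpe = errors.mean()                  # signed error: an illustrative bias measure
mape = np.abs(errors).mean()         # illustrative accuracy measure
rmse = np.sqrt((errors ** 2).mean())

print(f"MPE={mpe:.1f}  MAPE={mape:.1f}  RMSE={rmse:.1f}")
```

In a comparison like the one in the abstract, the same validation-set measures would be computed for each competing model and the models ranked on each measure separately.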