A quasi-Monte Carlo comparison of developments in parametric and semi-parametric regression methods for heavy-tailed and non-normal data : with an application to healthcare costs

Article, Preprint English OPEN
Jones, Andrew Michael ; Lomas, James ; Moore, Peter ; Rice, Nigel (2015)
  • Subject: Health econometrics; healthcare costs; heavy tails; quasi-Monte Carlo
    • jel: jel:C1 | jel:C5

We conduct a quasi-Monte Carlo comparison of the recent developments in parametric and semi-parametric regression methods for healthcare costs against each other and against standard practice. The population of English NHS hospital inpatient episodes for the financial year 2007-2008 (summed for each patient: 6,164,114 observations in total) is randomly divided into two equally sized sub-populations to form an estimation set and a validation set. Evaluated out-of-sample using the validation set, a conditional density estimator shows considerable promise in forecasting conditional means, performing best for accuracy of forecasting and amongst the best four (of sixteen compared) for bias and goodness-of-fit. The best performing model for bias is linear regression with a square-root-transformed dependent variable, and a generalised linear model with square root link function and Poisson distribution performs best in terms of goodness-of-fit. Commonly used models utilising a log-link are shown to perform badly relative to other models considered in our comparison.
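The split-sample design described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are simulated (not the NHS episodes), the function names are hypothetical, the only model fitted is linear regression on the square root of cost with a homoskedastic retransformation, and the three summary measures (mean prediction error for bias, mean absolute prediction error and root mean squared error for accuracy) are common choices rather than the paper's confirmed criteria.

```python
import numpy as np

def split_half(n, rng):
    """Randomly divide indices 0..n-1 into two equally sized, disjoint halves."""
    perm = rng.permutation(n)
    half = n // 2
    return perm[:half], perm[half:2 * half]

def fit_sqrt_ols(X, y):
    """OLS of sqrt(y) on X with an intercept.

    Returns the coefficients and the mean squared residual on the sqrt scale,
    which is needed to retransform predictions back to the cost scale.
    """
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, np.sqrt(y), rcond=None)
    resid = np.sqrt(y) - A @ beta
    return beta, np.mean(resid ** 2)

def predict_mean(X, beta, s2):
    """Predicted mean cost: E[y|x] = (x'b)^2 + s2, assuming homoskedastic errors."""
    A = np.column_stack([np.ones(len(X)), X])
    return (A @ beta) ** 2 + s2

def evaluate(y, yhat):
    """Out-of-sample summaries (illustrative choices, not the paper's exact set)."""
    err = y - yhat
    return {
        "mpe": float(np.mean(err)),            # bias
        "mape": float(np.mean(np.abs(err))),   # accuracy
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }

# Simulated heavy-tailed, strictly positive "costs" (hypothetical data).
rng = np.random.default_rng(0)
n = 20_000
X = rng.normal(size=(n, 2))
y = rng.gamma(shape=0.5, scale=np.exp(1.0 + 0.5 * X[:, 0] + 0.2 * X[:, 1]))

est_idx, val_idx = split_half(n, rng)
beta, s2 = fit_sqrt_ols(X[est_idx], y[est_idx])
metrics = evaluate(y[val_idx], predict_mean(X[val_idx], beta, s2))
print(metrics)
```

Fitting on one half and scoring on the other keeps the evaluation out-of-sample, so a model that overfits the estimation set is penalised. Comparing several estimators would mean swapping `fit_sqrt_ols` and `predict_mean` for each candidate while holding the split and the summaries fixed.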
