Equitability revisited: why the “equitable threat score” is not equitable

Hogan, Robin J. ; Ferro, Christopher A. T. ; Jolliffe, Ian T. ; Stephenson, David B. (2010)

In the forecasting of binary events, verification measures that are “equitable” were defined by Gandin and Murphy to satisfy two requirements: 1) they award all random forecasting systems, including those that always issue the same forecast, the same expected score (typically zero), and 2) they are expressible as the linear weighted sum of the elements of the contingency table, where the weights are independent of the entries in the table, apart from the base rate. The authors demonstrate that the widely used “equitable threat score” (ETS), as well as numerous others, satisfies neither of these requirements and only satisfies the first requirement in the limit of an infinite sample size. Such measures are referred to as “asymptotically equitable.” In the case of ETS, the expected score of a random forecasting system is always positive and only falls below 0.01 when the number of samples is greater than around 30. Two other asymptotically equitable measures are the odds ratio skill score and the symmetric extreme dependency score, which are more strongly inequitable than ETS, particularly for rare events; for example, when the base rate is 2% and the sample size is 1000, random but unbiased forecasting systems yield an expected score of around −0.5, reducing in magnitude to −0.01 or smaller only for sample sizes exceeding 25 000. This presents a problem since these nonlinear measures have other desirable properties, in particular being reliable indicators of skill for rare events (provided that the sample size is large enough). A potential way to reconcile these properties with equitability is to recognize that Gandin and Murphy’s two requirements are independent, and the second can be safely discarded without losing the key advantages of equitability that are embodied in the first. 
This enables inequitable and asymptotically equitable measures to be scaled to make them equitable, while retaining their nonlinearity and other properties such as being reliable indicators of skill for rare events. It also opens up the possibility of designing new equitable verification measures.
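The claim that a random forecasting system earns a positive expected ETS in finite samples can be checked numerically. The sketch below is illustrative only and is not taken from the paper. It assumes that each of the n forecast-observation pairs is drawn independently, so that the contingency table (hits a, false alarms b, misses c, correct negatives d) follows a multinomial distribution, and it assumes that the undefined 0/0 case is scored as zero. The function names `ets` and `expected_ets_random` are hypothetical, and the authors' own treatment of random forecasts and of degenerate tables may differ.

```python
from math import factorial


def ets(a, b, c, d):
    """Equitable threat score for hits a, false alarms b, misses c, correct negatives d."""
    n = a + b + c + d
    a_r = (a + b) * (a + c) / n      # hits expected by chance given the margins
    denom = a + b + c - a_r
    if denom == 0:
        return 0.0                   # assumption: score the 0/0 case as no skill
    return (a - a_r) / denom


def expected_ets_random(n, base_rate, forecast_rate):
    """Exact expected ETS of a random forecaster, by enumerating all tables of size n.

    Assumption: forecasts are independent of observations, so the four cell
    probabilities are products of the marginal rates and the table is multinomial.
    """
    p, q = base_rate, forecast_rate
    pa, pb, pc, pd = p * q, (1 - p) * q, p * (1 - q), (1 - p) * (1 - q)
    total = 0.0
    for a in range(n + 1):
        for b in range(n - a + 1):
            for c in range(n - a - b + 1):
                d = n - a - b - c
                # multinomial coefficient n! / (a! b! c! d!), computed exactly
                coeff = factorial(n) // (
                    factorial(a) * factorial(b) * factorial(c) * factorial(d)
                )
                prob = coeff * pa**a * pb**b * pc**c * pd**d
                total += prob * ets(a, b, c, d)
    return total


if __name__ == "__main__":
    # Expected ETS of an unbiased random forecaster at a 50% base rate
    for n in (2, 10, 30, 60):
        print(n, expected_ets_random(n, 0.5, 0.5))
```

Under these assumptions the expectation is positive because, for fixed margins, the ETS denominator shrinks as the number of hits grows, so positive chance deviations are amplified more than negative ones. Enumeration costs on the order of n cubed evaluations, so this approach is practical only for modest sample sizes.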
  • References (first 10 of 29 shown)

    Baldwin, M. E., and J. S. Kain, 2006: Sensitivity of several performance measures to displacement error, bias, and event frequency. Wea. Forecasting, 21, 636-648.

    Brill, K. F., 2009: A general analytic method for assessing sensitivity to bias of performance measures for dichotomous forecasts. Wea. Forecasting, 24, 307-318.

    Donaldson, R. J., R. M. Dyer, and M. J. Kraus, 1975: An objective evaluator of techniques for predicting severe weather events. Preprints, Ninth Conf. on Severe Local Storms, Norman, OK, Amer. Meteor. Soc., 321-326.

    Doswell, C. A., III, R. Davies-Jones, and D. L. Keller, 1990: On summary measures of skill in rare event forecasting based on contingency tables. Wea. Forecasting, 5, 576-586.

    Finley, J. P., 1884: Tornado predictions. Amer. Meteor. J., 1, 85-88.

    Gandin, L. S., and A. H. Murphy, 1992: Equitable scores for categorical forecasts. Mon. Wea. Rev., 120, 361-370.

    Gilbert, G. K., 1884: Finley's tornado predictions. Amer. Meteor. J., 1, 166-172.

    Gringorten, I. I., 1967: Verification to determine and measure forecasting skill. J. Appl. Meteor., 6, 742-747.

    Heidke, P., 1926: Calculation of the success and goodness of strong wind forecasts in the storm warning service. Geogr. Ann. Stockholm, 8, 301-349.

    Hilliker, J. L., 2004: The sensitivity of the number of correctly forecasted events to the threat score: A practical application. Wea. Forecasting, 19, 646-650.
