
doi: 10.1002/sim.8991
pmid: 33866577
Polytomous regression models generalize logistic models for the case of a categorical outcome variable with more than two distinct categories. These models are currently used in clinical research, and it is essential to measure their abilities to distinguish between the categories of the outcome. In 2012, van Calster et al proposed the polytomous discrimination index (PDI) as an extension of the binary discrimination c‐statistic to unordered polytomous regression. The PDI is a summary of the simultaneous discrimination between all outcome categories. Previous implementations of the PDI are not capable of running on “big data.” This article shows that the PDI formula can be manipulated to depend only on the distributions of the predicted probabilities evaluated for each outcome category and within each observed level of the outcome, which substantially improves the computation time. We present a SAS macro and R function that can rapidly evaluate the PDI and its components. The routines are evaluated on several simulated datasets after varying the number of categories of the outcome and size of the data and two real‐world large administrative health datasets. We compare PDI with two other discrimination indices: M‐index and hypervolume under the manifold (HUM) on simulated examples. We describe situations where the PDI and HUM, indices based on multiple comparisons, are superior to the M‐index, an index based on pairwise comparisons, to detect predictions that are no different than random selection or erroneous due to incorrect ranking.
R function, Logistic Models, SAS macro, Humans, polytomous discrimination index, polytomous regression, Applications of statistics to biology and medical sciences; meta analysis, discrimination
R function, Logistic Models, SAS macro, Humans, polytomous discrimination index, polytomous regression, Applications of statistics to biology and medical sciences; meta analysis, discrimination
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 12 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
