
Prediction models are gaining importance in many areas, such as medicine, meteorology, finance and toxicology. In this context, the response variable commonly follows a binomial distribution, and hence logistic regression is a widely used modelling approach. Although it is not recommended from a statistical point of view because of the loss of information and power, the categorisation of continuous variables is common practice in the development of prediction models. However, there are no unified criteria for selecting the cut points in the categorisation process. To provide valid cut points whenever a categorisation is to be performed, we have developed a methodology for categorising continuous variables in a logistic regression model based on the maximisation of the AUC (area under the ROC curve). This methodology has been implemented in an R package called CatPredi, a set of R functions that allows the user to categorise a continuous predictor variable in a univariate or multiple logistic regression model. It provides the optimal location of the cut points for a chosen number of cut points, fits the prediction model with the categorised predictor variable and returns the estimated and bias-corrected discriminative ability index for this model. Additionally, it allows two categorisation proposals with different numbers of cut points to be compared and the optimal number of cut points to be selected.
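The idea of choosing cut points by maximising the AUC can be illustrated with a minimal Python sketch. It is not the CatPredi implementation: the package is written in R, and its search algorithms and bias correction are not reproduced here. The sketch assumes a single cut point, a univariate model and an exhaustive search over hypothetical candidate values. It also relies on the fact that a logistic model whose only predictor is a categorical variable ranks subjects by the observed event rate of their category, so no model-fitting library is needed. All function names and the toy data are illustrative assumptions.

```python
def auc(scores, labels):
    """Mann-Whitney estimate of the AUC; ties count as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def categorised_auc(x, y, cuts):
    """AUC of a model that uses x only through the categories defined by cuts."""
    # Category index = number of cut points strictly below the value.
    cats = [sum(xi > c for c in cuts) for xi in x]
    totals, events = {}, {}
    for c, yi in zip(cats, y):
        totals[c] = totals.get(c, 0) + 1
        events[c] = events.get(c, 0) + yi
    # Score each subject by the observed event rate of its category
    # (assumption: univariate model with the categorised predictor only).
    scores = [events[c] / totals[c] for c in cats]
    return auc(scores, y)


def best_single_cut(x, y, candidates):
    """Exhaustive search for the candidate cut point with the highest AUC."""
    return max(candidates, key=lambda c: categorised_auc(x, y, [c]))


# Toy data (hypothetical): a continuous predictor and a binary outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
candidates = [v + 0.5 for v in range(1, 10)]  # midpoints 1.5, 2.5, ..., 9.5

best_cut = best_single_cut(x, y, candidates)
best_auc = categorised_auc(x, y, [best_cut])
print(best_cut, best_auc)  # prints 4.5 0.9
```

An AUC selected in this way is optimistic, because the cut point is tuned on the same data used to evaluate it; this is why a bias-corrected estimate is reported alongside the apparent one. With several cut points or additional covariates, the event-rate shortcut no longer applies and an actual logistic fit is required at each step.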
