Categorisation of continuous variables in a logistic regression model using the R package CatPredi

Irantzu Barrio; María Xosé Rodríguez-Álvarez; Inmaculada Arostegui

Article . 2015 . Peer-reviewed

Data sources: Crossref

Other literature type

Data sources: Microsoft Academic Graph

Publication: Article, Other literature type
Date: 04 Dec 2015
Publisher: MDPI AG
Journal: Proceedings of MOL2NET, International Conference on Multidisciplinary Sciences

doi: 10.3390/mol2net-1-e004

Abstract

Prediction models are gaining importance in many areas such as medicine, meteorology, finance and toxicology. In this context, a common distribution for the response variable is the binomial distribution, and hence the logistic regression model is a commonly used regression modelling approach. Although it is not recommended from a statistical point of view because of the loss of information and power, the categorisation of continuous variables is common practice in the development of prediction models. However, there are no unified criteria for the selection of the cut points in the categorisation process. In order to provide valid cut points whenever a categorisation is going to be performed, we have developed a methodology to categorise continuous variables in a logistic regression model based on the maximisation of the AUC. This methodology has been implemented in an R package called CatPredi, a package of R functions that allows the user to categorise a continuous predictor variable in a univariate or multiple logistic regression model. It provides the optimal location of cut points for a chosen number of cut points, fits the prediction model with the categorised predictor variable and returns the estimated and bias-corrected discriminative ability index for this model. Additionally, it allows a comparison of two categorisation proposals for different numbers of cut points and the selection of the optimal number of cut points.
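
The idea of choosing cut points by maximising the AUC can be illustrated with a minimal Python sketch. It is not the CatPredi implementation, and it does not use the package's API: the function names (`auc`, `categorised_auc`, `best_cut_points`), the `grid_size` parameter and the toy data are all assumptions made for illustration. The search is a plain brute-force scan over candidate cut points, swapped in here because the abstract does not describe the package's own search algorithm. The sketch covers only the univariate case, where the fitted probabilities of a logistic model with a single categorical predictor equal the observed event proportion in each category, so no model-fitting routine is needed. It returns the apparent AUC, not a bias-corrected one.

```python
import bisect
import itertools


def auc(scores, labels):
    """Mann-Whitney estimate of the AUC; tied scores count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    wins = 0.0
    for s in pos:
        below = bisect.bisect_left(neg, s)          # negatives with a lower score
        ties = bisect.bisect_right(neg, s) - below  # negatives with the same score
        wins += below + 0.5 * ties
    return wins / (len(pos) * len(neg))


def categorised_auc(x, y, cuts):
    """Apparent AUC of a univariate logistic model on x categorised at `cuts`.

    `cuts` must be sorted. A value v falls in category j when exactly j
    cut points are <= v. The score of each observation is the event
    proportion of its category, which is what the logistic fit would give.
    """
    cats = [bisect.bisect_right(cuts, v) for v in x]
    totals = [0] * (len(cuts) + 1)
    events = [0] * (len(cuts) + 1)
    for c, label in zip(cats, y):
        totals[c] += 1
        events[c] += label
    rates = [e / n if n else 0.0 for e, n in zip(events, totals)]
    return auc([rates[c] for c in cats], y)


def best_cut_points(x, y, k, grid_size=20):
    """Brute-force search (illustrative only) for k cut points maximising the AUC."""
    xs = sorted(x)
    # Candidate cut points: roughly equally spaced order statistics of x.
    grid = sorted({xs[(i * len(xs)) // (grid_size + 1)]
                   for i in range(1, grid_size + 1)})
    best = max(itertools.combinations(grid, k),
               key=lambda c: categorised_auc(x, y, list(c)))
    return list(best), categorised_auc(x, y, list(best))


# Toy data, made up for illustration: the event occurs only for x >= 5.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 0, 1, 1, 1, 1]
cuts, value = best_cut_points(x, y, k=1, grid_size=7)
print(cuts, value)
```

On this toy data the search returns the single cut point 5 with an AUC of 1.0, since splitting at 5 separates events from non-events perfectly. The cost of the exhaustive scan grows quickly with the number of cut points, so it is only practical for small `k` and a coarse grid.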

Related Organizations

Universidade de Vigo, Spain
Basque Center for Applied Mathematics, Spain
University of the Basque Country, Spain
