
Class imbalance is quite common in real-world data, and traditional state-of-the-art classifiers do not work well on data sets with an imbalanced class distribution. In this paper, we apply the logistic regression model to the class-imbalance problem and propose a novel algorithm called CILR (Class Imbalance oriented Logistic Regression) to tackle imbalanced data sets. Unlike traditional logistic regression, which optimizes the likelihood function used in maximum likelihood estimation (MLE), CILR optimizes a proposed objective function based on both MLE and the recall metric. This objective function makes full use of the characteristics of the majority class and the minority class simultaneously, which guarantees that CILR enhances the classification performance of logistic regression on the rare class without decreasing overall accuracy. Experimental results on 16 data sets show that CILR performs significantly better than traditional logistic regression, under-sampled logistic regression, and over-sampled logistic regression.
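
The abstract does not state the CILR objective itself, so the following is only a minimal sketch of the general idea of combining a likelihood term with a recall term, not the authors' formulation. It assumes a differentiable "soft recall" surrogate (the mean predicted probability over minority-class examples) and a hypothetical trade-off parameter `recall_weight`; the synthetic data, function names, and hyperparameters are likewise illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Append a constant column so the last weight acts as the intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_logreg(X, y, recall_weight=0.0, lr=0.1, n_iter=2000):
    """Gradient descent on: mean negative log-likelihood - recall_weight * soft recall.

    ASSUMPTION: soft recall = sum(y * p) / sum(y), a smooth stand-in for recall.
    With recall_weight=0 this reduces to ordinary logistic regression.
    """
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    n_pos = y.sum()
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad_nll = Xb.T @ (p - y) / len(y)
        # Gradient of soft recall with respect to w.
        grad_recall = Xb.T @ (y * p * (1.0 - p)) / n_pos
        w -= lr * (grad_nll - recall_weight * grad_recall)
    return w

def recall(X, y, w, threshold=0.5):
    pred = sigmoid(add_bias(X) @ w) >= threshold
    return (pred & (y == 1)).sum() / (y == 1).sum()

# Synthetic imbalanced data: 950 majority points, 50 minority points (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(950, 2)),
               rng.normal(1.5, 1.0, size=(50, 2))])
y = np.concatenate([np.zeros(950), np.ones(50)])

w_plain = fit_logreg(X, y, recall_weight=0.0)
w_weighted = fit_logreg(X, y, recall_weight=1.0)

recall_plain = recall(X, y, w_plain)
recall_weighted = recall(X, y, w_weighted)
print("minority recall, plain logistic regression:", recall_plain)
print("minority recall, recall-augmented objective:", recall_weighted)
```

In this sketch, raising `recall_weight` pulls the decision boundary toward the majority class, so minority recall tends to rise at some cost in majority-class errors; how CILR itself balances the two terms is specified in the paper, not here.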
