Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning

Article, Preprint English OPEN
Lemaitre, Guillaume; Nogueira, Fernando; Aridas, Christos
(2017)
  • Publisher: Journal of Machine Learning Research
  • Subject: Ensemble Learning | Python | Imbalanced Dataset | [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing | [STAT.ML] Statistics [stat]/Machine Learning [stat.ML] | Over-Sampling | Under-Sampling | Machine Learning | Computer Science - Learning

imbalanced-learn is an open-source Python toolbox that provides a wide range of methods to cope with the problem of imbalanced datasets, which is frequently encountered in machine learning and pattern recognition. The implemented state-of-the-art meth...
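To make the idea of rebalancing concrete, the following minimal sketch shows random over-sampling, one of the simplest strategies in the over-sampling family named in the subject keywords. It is written with the Python standard library only and is not the toolbox's own API; the function name `random_over_sample` and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest one.

    Illustrative helper (hypothetical name), not part of imbalanced-learn.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_res, y_res = list(X), list(y)
    for label, count in counts.items():
        # Indices of the original samples that belong to this class.
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Draw with replacement to fill the gap to the majority class size.
        for _ in range(target - count):
            j = rng.choice(idx)
            X_res.append(X[j])
            y_res.append(y[j])
    return X_res, y_res


# Hypothetical toy data: 8 majority samples (label 0), 2 minority samples (label 1).
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2
X_res, y_res = random_over_sample(X, y)
balanced_counts = Counter(y_res)
```

After resampling, both classes in `balanced_counts` have the same number of samples, and every added row is a copy of an existing minority sample. Under-sampling works in the opposite direction, discarding majority samples instead of duplicating minority ones.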