
The rapid spread of mobile banking and e-commerce in recent years has coincided with a dramatic increase in fraudulent online payments. Although machine learning and deep learning are widely used in credit card fraud detection, typical credit card transaction datasets are imbalanced: fraudulent transactions are far fewer than normal ones, which limits the effectiveness of traditional binary classification algorithms. To overcome this issue, researchers oversample the minority class and use ensemble learning classification algorithms. However, existing oversampling methods still have drawbacks. Hence, we improve the generator of the Variational Autoencoder Generative Adversarial Network (VAEGAN) and propose a new oversampling method that generates convincing and diverse minority class data. The generated minority class (fraud) samples are added to the training set, which is then used to train the ensemble learning classification model. The method is tested on an open credit card dataset, and the experimental results demonstrate that oversampling with the improved VAEGAN is superior to oversampling with a Generative Adversarial Network (GAN), a Variational Autoencoder (VAE), and the Synthetic Minority Oversampling Technique (SMOTE) in terms of precision, F1 score, and other metrics. The oversampling method based on the improved VAEGAN thus deals effectively with the classification problem of imbalanced data.
oversampling, Credit card fraud, variational autoencoder generative adversarial network, ensemble learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
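
As an illustration of the oversampling idea the abstract compares against, the following is a minimal sketch of SMOTE-style interpolation together with the precision and F1 metrics it mentions. It is not the improved VAEGAN method, whose details the abstract does not give. The function names `smote_like_oversample` and `precision_recall_f1`, the parameter `k`, and the toy data are all hypothetical and chosen only for this example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Hypothetical helper for illustration, not the paper's method."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy usage: four "fraud" points in 2-D, oversampled with ten synthetic ones.
fraud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
augmented = fraud + smote_like_oversample(fraud, n_new=10, k=2, seed=1)
```

Because each synthetic point lies on a segment between two real minority samples, the generated data stay inside the region the minority class already covers; the abstract's motivation for a generative model is to produce samples that are more diverse than this.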
