https://doi.org/10.51415/10321...
Doctoral thesis . 2022 . Peer-reviewed
Data sources: Crossref

Machine learning : a data-point approach to solving misclassifications in the imbalanced Credit Card Datasets

Authors: Mqadi, Nhlakanipho Michael;

Abstract

Machine learning (ML) uses algorithms that iterate over massive datasets, analysing past behaviour in order to predict future outcomes. Financial institutions use ML to detect Credit Card Fraud (CCF) by learning, from historical credit card transaction data, the patterns that distinguish legitimate from fraudulent actions. CCF has disrupted the market economic order, contributing to low consumer confidence in financial institutions and a loss of investor interest. Despite existing fraud-prevention efforts, CCF losses continue to increase every year and amount to billions of dollars annually. ML techniques consume large volumes of historical credit card transaction data as examples for learning. In ordinary credit card datasets, there are far fewer fraudulent transactions than legitimate ones. An ideal solution to this data imbalance problem must have low bias, low variance, and high accuracy. The aim of this study was to provide an in-depth experimental investigation of the effect of using the data-point approach to resolve the class misclassification problem in imbalanced credit card datasets. The study focused on finding a novel way to handle imbalanced data, in order to improve the performance of ML algorithms in identifying fraud or anomaly patterns in massive amounts of financial transaction records with an imbalanced class distribution. The experiment led to the introduction of two unique multi-level hybrid data-point solutions, namely Feature Selection with Near Miss Undersampling and Feature Selection with SMOTE-based Oversampling. The classifiers were built using four widely used ML algorithms: Random Forest, Support Vector Machine, Decision Tree, and Logistic Regression. These algorithms were applied to the classification of credit card datasets, and their performance was assessed using selected performance metrics. The findings show that using the data-point approach improved the predictive accuracy of the ML fraud detection solution.
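
The two resampling ideas named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the feature selection step is omitted, the function names (`smote_like_oversample`, `near_miss_undersample`) and the toy transactions are hypothetical, and the undersampler follows the common NearMiss-1 variant as an assumption. It also assumes at least two minority points. In practice a maintained library such as imbalanced-learn would normally be used instead.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (SMOTE-style sketch).

    Each synthetic point lies on the line segment between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: euclidean(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def near_miss_undersample(majority, minority, n_keep, k=3):
    """Keep n_keep majority points (NearMiss-1 style sketch, assumed variant).

    A majority point is scored by its mean distance to its k nearest
    minority points; the points with the smallest scores are kept.
    """
    def score(p):
        nearest = sorted(euclidean(p, m) for m in minority)[:k]
        return sum(nearest) / len(nearest)

    return sorted(majority, key=score)[:n_keep]


# Hypothetical toy data: 20 "legitimate" and 4 "fraudulent" transactions,
# each described by two made-up numeric features.
legit = [[float(i), float(i % 5)] for i in range(20)]
fraud = [[50.0, 1.0], [51.0, 2.0], [52.0, 1.5], [50.5, 2.5]]

# Oversampling: grow the fraud class to the size of the legitimate class.
synthetic_fraud = smote_like_oversample(fraud, n_new=len(legit) - len(fraud))
oversampled_fraud = fraud + synthetic_fraud

# Undersampling: shrink the legitimate class to the size of the fraud class.
undersampled_legit = near_miss_undersample(legit, fraud, n_keep=len(fraud))
```

Either balanced set could then be passed to any of the four classifiers listed above. Because accuracy alone can look high on imbalanced data even when most fraud is missed, per-class metrics such as precision and recall are usually reported alongside it.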

Keywords

Credit Card Fraud (CCF), Credit cards, Database management, Credit card fraud, Identity theft--South Africa--Prevention, ML fraud detection, Credit card datasets, Credit cards--Security measures--South Africa, 006, Machine learning (ML), Data sets
