Powered by OpenAIRE graph
IEEE Access · Article · 2025 · Peer-reviewed · Open Access
License: CC BY
Data sources: Crossref, DOAJ, DBLP

Unbalancing Datasets to Enhance CNN Models Learnability: A Class-Wise Metrics-Based Closed-Loop Strategy Proposal

Authors: Somayeh Shahrabadi; Victor Alves; Emanuel Peres; Raul Morais dos Santos; Telmo Adão

Abstract

Developing deep learning models often involves working with balanced datasets. However, in real-world scenarios, class equilibrium is rarely observed. To address this, oversampling and undersampling techniques are commonly employed, with appropriate monitoring and countermeasures to mitigate the risk of overfitting. Yet, even when datasets are balanced, they can still introduce biases, leading to better classification performance for some classes than for others. One significant cause of such biases is class underrepresentation, which is challenging to control and often arises from uneven feature distributions shaped by endogenous factors and environmental conditions. In such cases, random data augmentation can inadvertently amplify these discrepancies, exacerbating underrepresentation. To address this challenge, this paper proposes a metrics-based closed-loop (MbCL) approach that strategically unbalances the dataset during training to enhance class-wise performance and model generalization. The proposed method iteratively adjusts and improves model performance across classes by employing data augmentation techniques, both classical and GAN-based, to mitigate class underrepresentation. To compare regular (linear) and MbCL-based approaches, two contextually distinct datasets were considered, broadening the consistency analysis across domains. Using these datasets, 72 models with varying configurations – including different convolutional neural network architectures, initial learning rates, and optimizers – were trained and then evaluated against imagery test sets. In this first assessment, MbCL methods matched or outperformed linear training in approximately 85% of the cases. Additionally, all the models were tested on an external image set acquired under different contextual conditions but with labels matching the training datasets. The results showed that the proposed MbCL methods consistently outperformed linear training, achieving top accuracies of 32% and 72% in the two dataset contexts, which also indicates that classic and GAN-based augmentation, used as resampling strategies, had a positive impact on inference performance. Furthermore, complementary indicators, including gradient-weighted class activation mapping (Grad-CAM) and intersection over union (IoU), were analyzed to explore the relationship between performance and visualizable features.
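The closed-loop idea described in the abstract can be sketched in plain Python: after each train/evaluate round, classes with weaker metrics receive a larger share of newly augmented samples, deliberately unbalancing the training set toward them. This is a minimal illustrative sketch, assuming per-class recall as the driving metric and a deficit-proportional allocation rule; function names, the threshold, and the allocation formula are assumptions for illustration, not the authors' exact MbCL formulation.

```python
def plan_augmentation(class_recall, budget, threshold=0.8):
    """Allocate `budget` new augmented samples across classes whose recall
    falls below `threshold`, proportionally to their performance deficit."""
    deficits = {c: threshold - r for c, r in class_recall.items() if r < threshold}
    total = sum(deficits.values())
    plan = {c: 0 for c in class_recall}
    if total == 0:
        return plan  # every class already meets the target metric
    for c, d in deficits.items():
        plan[c] = round(budget * d / total)  # weaker classes get more samples
    return plan


def closed_loop(train_eval, initial_counts, rounds=3, budget=100):
    """Iterate: train/evaluate, then unbalance the training set toward
    underperforming classes via (classical or GAN-based) augmentation.
    `train_eval` is a user-supplied callback mapping class sample counts
    to per-class recall; augmentation itself is abstracted away here."""
    counts = dict(initial_counts)  # class label -> number of training samples
    history = []
    for _ in range(rounds):
        recall = train_eval(counts)
        plan = plan_augmentation(recall, budget)
        for c, extra in plan.items():
            counts[c] += extra  # oversample only the weak classes
        history.append(recall)
    return counts, history
```

In a real pipeline, the `train_eval` callback would train a CNN and compute class-wise metrics on a validation set, and the extra samples would come from classical transforms or a GAN, as the paper describes.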

Keywords

convolutional neural networks (CNNs), class imbalance, closed-loop optimization, deep learning, GAN-based augmentation, data augmentation
