
Deep neural networks (DNNs) have proven highly effective across a wide range of computational tasks, but their success depends largely on access to large datasets with accurate labels. Obtaining such data can be challenging and costly in real-world scenarios. Common alternatives, such as web search engines and crowdsourcing, often yield datasets with inaccurately labeled, or "noisy," samples. This noise can significantly reduce the ability of DNNs to generalize and remain reliable. Traditional methods for learning with noisy labels mitigate this drawback by training DNNs selectively on reliable data, but they often underutilize the available data. Although data augmentation techniques are useful, they do not directly address the noisy-label problem and are of limited benefit in such settings. This paper proposes ConfidentMix, a confidence-guided Mixup strategy that dynamically adjusts the intensity of data augmentation according to label confidence, protecting DNNs from the detrimental effects of noisy labels while maximizing the learning potential of the most reliable portions of the dataset. By combining label-confidence assessment with customized data augmentation, ConfidentMix improves model resilience and generalizability. Results on standard benchmarks with synthetic noise, such as CIFAR-10 and CIFAR-100, demonstrate the superiority of ConfidentMix in high-noise settings, and extensive experiments on Clothing1M and mini-WebVision confirm that it surpasses state-of-the-art methods in handling real-world noise.
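The abstract does not give the exact formulation of the method, so the following is only a minimal sketch of how a confidence-guided Mixup could scale augmentation intensity with per-sample label confidence. The function name `confident_mixup`, the `base_alpha` parameter, and the rule of shrinking the Beta-distribution parameter for low-confidence pairs are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import torch


def confident_mixup(x, y, confidence, base_alpha=4.0):
    """Illustrative confidence-guided Mixup (a sketch, not the paper's exact method).

    x:          batch of inputs,  shape (B, ...)
    y:          one-hot labels,   shape (B, C)
    confidence: per-sample label-confidence scores in [0, 1], shape (B,)
    base_alpha: Beta-distribution parameter used for fully trusted labels
    """
    batch_size = x.size(0)
    index = torch.randperm(batch_size)

    # Scale the mixing strength by the confidence of the pair being mixed:
    # low-confidence (likely noisy) pairs get a smaller alpha, which pushes the
    # mixing coefficient toward 0 or 1, so unreliable labels contribute less.
    pair_conf = torch.minimum(confidence, confidence[index])
    alphas = base_alpha * pair_conf.clamp(min=1e-3)

    lam = torch.tensor(
        [np.random.beta(a.item(), a.item()) for a in alphas],
        dtype=x.dtype,
    )
    lam = torch.maximum(lam, 1.0 - lam)  # keep the original sample dominant

    lam_x = lam.view(-1, *([1] * (x.dim() - 1)))
    mixed_x = lam_x * x + (1.0 - lam_x) * x[index]
    mixed_y = lam.view(-1, 1) * y + (1.0 - lam.view(-1, 1)) * y[index]
    return mixed_x, mixed_y
```

Under these assumptions, confident samples are mixed aggressively (coefficients near 0.5), while suspected noisy samples are left nearly unchanged, which matches the abstract's description of augmentation intensity being adjusted by label confidence.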
Keywords: semi-supervised learning, data augmentation, learning with noisy labels, deep learning
