
Small particles that contaminate hard disk drives (HDDs) can damage the device and lead to data loss. The hard disk industry therefore pays close attention to identifying the types and sources of these contaminants. However, precise identification requires expensive analytical procedures when test samples are relatively small and scarce. Raman spectroscopy, a traditional tool, yields spectra with poor signal-to-noise ratios when applied to sub-micron particles, so identifying noisy Raman spectra is a real burden for human experts. In this study, we propose a practically applicable pipeline consisting of a denoising autoencoder combined with spectral gradient correlation for the classification task, followed by a novel validation step based on an ensemble of CNN models that removes predictions with low certainty. In the experiments, three backbone models for the denoising autoencoder are studied: a multilayer perceptron (MLP), a convolutional neural network (CNN), and U-Net. The ensemble consists of eight different CNN models that act as independent machine experts whose votes indicate agreement with the correlation approach. When agreement is low, the sample is labeled unidentified and rejected from the classification task. With our validation step, the pipeline achieves classification accuracies of 0.965, 0.955, and 0.976 for spectra denoised with the MLP, CNN, and U-Net autoencoder models, respectively. This highlights the effectiveness of the proposed pipeline in practical application.
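The classification and validation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Pearson correlation on first derivatives as the "spectral gradient correlation", the synthetic Gaussian spectra, and the threshold of 5 agreeing votes out of 8 are all assumptions made for illustration. The denoising autoencoder and the eight CNN experts are not modeled; their outputs are stood in for by a lightly noised spectrum and a hard-coded list of votes.

```python
import numpy as np


def gradient_correlation(spectrum, reference):
    """Pearson correlation between the first derivatives of two spectra (assumed metric)."""
    g_spec = np.gradient(np.asarray(spectrum, dtype=float))
    g_ref = np.gradient(np.asarray(reference, dtype=float))
    return float(np.corrcoef(g_spec, g_ref)[0, 1])


def classify_by_gradient_correlation(spectrum, library):
    """Return the index of the reference spectrum with the highest gradient correlation."""
    scores = [gradient_correlation(spectrum, ref) for ref in library]
    return int(np.argmax(scores))


def validate_with_votes(predicted_label, expert_labels, min_agree=5):
    """Keep the prediction only if enough experts agree; otherwise return None (unidentified).

    min_agree is a hypothetical threshold; the abstract does not state the actual rule.
    """
    agree = sum(1 for vote in expert_labels if vote == predicted_label)
    return predicted_label if agree >= min_agree else None


# Synthetic reference library: three single-peak "spectra" (illustrative only).
x = np.linspace(0.0, 1.0, 200)
library = [np.exp(-((x - center) / 0.03) ** 2) for center in (0.25, 0.50, 0.75)]

# Stand-in for the output of a denoising autoencoder: reference 1 plus residual noise.
rng = np.random.default_rng(0)
denoised = library[1] + 0.02 * rng.normal(size=x.size)

label = classify_by_gradient_correlation(denoised, library)

# Stand-ins for the votes of eight CNN experts.
accepted = validate_with_votes(label, [1, 1, 1, 1, 1, 1, 0, 2])  # 6 of 8 agree
rejected = validate_with_votes(label, [1, 1, 0, 0, 2, 2, 0, 2])  # 2 of 8 agree
```

In this sketch a rejected sample is represented by `None`, which keeps it out of any downstream accuracy computation, mirroring the idea that low-agreement samples are treated as unidentified.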
machine learning, Raman spectroscopy, denoising, deep learning, nanoparticles, Electrical engineering. Electronics. Nuclear engineering, polymers, TK1-9971
