
arXiv: 2402.07619
Aim: COVID-19 has affected more than 223 countries worldwide, and in the post-COVID era there is a pressing need for non-invasive, low-cost, and highly scalable methods of detecting it. This study analyzes voice features and machine learning models for the automatic detection of COVID-19.
Methods: We develop deep learning models that identify COVID-19 from voice recordings alone, which is the novelty of this work. We use the Cambridge COVID-19 Sound database, which contains 893 speech samples crowd-sourced from 4,352 participants via a COVID-19 Sounds app. From the recordings we extract Mel-spectrograms, Mel-frequency cepstral coefficients (MFCC), and convolutional neural network (CNN) encoder features. On these features we train deep learning classifiers to detect COVID-19 cases: long short-term memory (LSTM), CNN, and Hidden-Unit BERT (HuBERT).
Results: We compare the predictive power of these models with that of baseline machine learning models. HuBERT achieves the highest accuracy (86%) and the highest AUC (0.93).
Conclusions: Compared with state-of-the-art results, the proposed models show promise for COVID-19 diagnosis from voice recordings.
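Mel-spectrograms and MFCCs both rest on the mel frequency scale, which spaces filters more densely at low frequencies. The record does not state the feature-extraction settings, so the following is only a minimal sketch of that spacing. It assumes the common HTK-style mel formula, and the function names, filter count, and frequency range are hypothetical choices for illustration, not the authors' configuration.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list[float]:
    """Return filter center frequencies (Hz) equally spaced on the mel scale.

    n_filters, f_min and f_max are illustrative parameters; the centers
    exclude the two band edges f_min and f_max.
    """
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_filters + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_filters)]


# Hypothetical settings: 10 filters between 0 Hz and 8 kHz.
centers = mel_center_frequencies(10, 0.0, 8000.0)
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

The gaps between consecutive centers grow with frequency, which is the property that lets mel-based features devote more resolution to the lower part of the speech band.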
FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Artificial Intelligence, mfcc, R, deep learning, Information technology, T58.5-58.64, Computer Science - Sound, covid-19 diagnosis, voice analysis, machine learning, Artificial Intelligence (cs.AI), Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Medicine, mel-spectrogram, Electrical Engineering and Systems Science - Audio and Speech Processing
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources; an alternative to the "Influence" indicator. | 4 |
| Popularity | The "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| Influence | The overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | The initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
