
The physical basis of speech production in humans requires the coordination of multiple anatomical systems, with the inhalation and exhalation of air through the lungs at the core of the phenomenon. Vocalization occurs during exhalation, while inhalation typically occurs during pauses in speech. We use deep learning models to predict respiratory signals during speech-breathing, from which the respiration rate is estimated. Bilingual data from a large clinical study (N = 1,005) are used to develop and evaluate a multivariate time series transformer model that takes speech encoder embeddings as input. The best model predicts respiration rate from speech to within ±3 BPM for 82% of test subjects. A noise-aware algorithm was also tested in a simulated hospital environment with varying noise levels to evaluate the impact of noise on performance. This work proposes and validates speech as a virtual sensor for respiration rate, which can serve as an efficient and cost-effective enabler of remote patient monitoring and telehealth solutions.
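The abstract gives only a high-level view of the pipeline (speech encoder embeddings → multivariate time series transformer → respiratory signal → respiration rate). The following is a minimal sketch of that pipeline, not the authors' implementation: the embedding dimension, layer counts, frame rate, and the zero-crossing rate estimator are illustrative assumptions, and the speech encoder itself is stood in for by precomputed embeddings.

```python
# Minimal sketch: transformer regressor over speech-encoder embeddings that
# predicts a per-frame respiratory signal, followed by a simple respiration
# rate estimate. All hyperparameters below are assumptions, not the paper's.
import torch
import torch.nn as nn


class RespirationRegressor(nn.Module):
    """Maps a sequence of speech-encoder embeddings to a respiratory signal."""

    def __init__(self, embed_dim: int = 768, n_heads: int = 8, n_layers: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(embed_dim, 1)  # one respiratory value per frame

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, frames, embed_dim), e.g. from a pretrained speech encoder
        hidden = self.encoder(embeddings)
        return self.head(hidden).squeeze(-1)  # (batch, frames)


def respiration_rate_bpm(signal: torch.Tensor, frame_rate_hz: float) -> float:
    """Estimate breaths per minute by counting positive zero crossings
    of the zero-centered predicted respiratory signal (an assumed estimator)."""
    centered = signal - signal.mean()
    crossings = ((centered[:-1] < 0) & (centered[1:] >= 0)).sum().item()
    duration_min = len(signal) / frame_rate_hz / 60.0
    return crossings / duration_min


if __name__ == "__main__":
    model = RespirationRegressor()
    # 60 s of hypothetical 50 Hz embeddings (3000 frames), batch of 1, random input
    fake_embeddings = torch.randn(1, 3000, 768)
    waveform = model(fake_embeddings)[0].detach()
    print(f"Estimated respiration rate: {respiration_rate_bpm(waveform, 50.0):.1f} BPM")
```

With real data, the reported accuracy criterion could be checked per subject by comparing the estimated rate against a reference sensor and counting how many subjects fall within ±3 BPM.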
QH301-705.5, Validation, Biology (General)
