
Far-field automatic speech recognition (ASR) is challenging, mainly because of the strong reverberation in the recordings. A linear sparse prediction model has recently been proposed to estimate and suppress reverberation. This model treats reverberation as a mixture of early and late reflections of the direct signal and estimates the late reflections with Lasso. The approach has been shown to improve perceptual intelligibility; however, it is unknown whether that improvement carries over to ASR tasks. This paper applies the Lasso-based dereverberation approach to far-field speech recognition and shows that it delivers a significant performance improvement for ASR based on deep neural networks (DNN). In particular, we demonstrate that an utterance-based Lasso is sufficient to obtain good performance, which is important for applying Lasso-based dereverberation to real-time ASR systems.
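The idea of estimating late reflections with Lasso can be illustrated with a minimal sketch. The abstract does not give the feature domain, the prediction delay, the filter order, or the penalty weight, so everything below is an assumption made for illustration: a one-dimensional toy signal, a delayed linear predictor whose taps are fitted with an L1 penalty by plain coordinate descent, and hypothetical helper names (`add_recursive_echo`, `lasso_coordinate_descent`, `suppress_late_reverb`). It is not the model evaluated in the paper.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(rows, targets, lam, n_iters=50):
    """Minimise (1/2n) * ||targets - rows @ w||^2 + lam * ||w||_1."""
    n, p = len(rows), len(rows[0])
    weights = [0.0] * p
    residual = list(targets)  # equals targets - rows @ weights while w = 0
    col_sq = [sum(row[j] ** 2 for row in rows) / n for j in range(p)]
    for _ in range(n_iters):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes tap j.
            rho = sum(rows[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * weights[j]
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * rows[i][j]
                weights[j] = new_w
    return weights


def suppress_late_reverb(signal, delay, order, lam):
    """Predict each sample from samples at least `delay` steps back
    (assumed here to stand in for late reflections) and subtract it."""
    start = delay + order - 1
    rows = [[signal[t - delay - k] for k in range(order)]
            for t in range(start, len(signal))]
    targets = signal[start:]
    weights = lasso_coordinate_descent(rows, targets, lam)
    cleaned = list(signal[:start])  # too little history to predict these
    for row, observed in zip(rows, targets):
        late = sum(w * x for w, x in zip(weights, row))
        cleaned.append(observed - late)
    return cleaned, weights


def add_recursive_echo(dry, lag, gain):
    """Toy reverberation: each sample receives a decaying echo from `lag` back."""
    wet = []
    for t, sample in enumerate(dry):
        echo = gain * wet[t - lag] if t >= lag else 0.0
        wet.append(sample + echo)
    return wet


# Toy usage: all weights are fitted once over the whole signal,
# loosely mirroring the utterance-based setting.
random.seed(0)
dry = [random.gauss(0.0, 1.0) for _ in range(1000)]
wet = add_recursive_echo(dry, lag=4, gain=0.5)
cleaned, weights = suppress_late_reverb(wet, delay=3, order=6, lam=0.01)
print([round(w, 3) for w in weights])
```

In this toy setup the echo sits at lag 4, so the second tap (delay 3 plus offset 1) should carry most of the weight while the L1 penalty pushes the remaining taps toward zero. Fitting a single set of weights over the whole signal is the design choice that corresponds most closely to an utterance-based estimate.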
