publication . Part of book or chapter of book . 2007

Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

Ramirez, J.; Gorriz, J. M.; Segura, J. C.
Open Access
  • Published: 01 Jun 2007
  • Publisher: I-Tech Education and Publishing
Environmental noise and its harmful effect on performance is a major drawback of most speech processing systems. Examples of such systems include new wireless voice communication services and digital hearing aids. In speech recognition, technical barriers still prevent such systems from meeting the demands of modern applications. Numerous noise reduction techniques have been developed to mitigate the effect of noise on system performance; they often require an estimate of the noise statistics, obtained by means of a precise voice activity detector (VAD). Speech/non-speech detection is an unsolved problem...
Free text keywords: Environmental noise, Speech processing, Voice activity detection, Background noise, Speech recognition, Engineering, Pitch detection algorithm, Speech enhancement, Speech coding, Robustness (computer science)