Recognizing GSM digital speech

Publication: Article, 01 Nov 2005
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Journal: IEEE Transactions on Speech and Audio Processing, volume 13, pages 1186-1205 (ISSN: 1063-6676)

Authors: Ascensión Gallardo-Antolín; Carmen Peláez-Moreno; Fernando Díaz-de-María

doi: 10.1109/tsa.2005.853210

handle: 10016/2317

Abstract

The Global System for Mobile (GSM) environment poses three main problems for automatic speech recognition (ASR) systems: noisy scenarios, source coding distortion, and transmission errors. The first has already received much attention; source coding distortion and transmission errors, however, still need to be addressed explicitly. In this paper, we propose an alternative front-end for speech recognition over GSM networks, specifically designed to be effective against source coding distortion and transmission errors. We suggest extracting the recognition feature vectors directly from the encoded speech (i.e., the bitstream) instead of decoding it and then extracting the feature vectors from the decoded signal. This approach offers two significant advantages. First, the recognition system is affected only by the quantization distortion of the spectral envelope, so it avoids the other sources of distortion introduced by the encoding-decoding process. Second, when transmission errors occur, our front-end becomes more effective because it is not affected by errors in the bits allocated to the excitation signal. We have considered the half-rate and full-rate standard codecs and compared the proposed front-end with the conventional approach in two ASR tasks: speaker-independent isolated digit recognition and speaker-independent continuous speech recognition. In general, our approach outperforms the conventional procedure for a variety of simulated channel conditions, and the performance gap widens as the network conditions worsen.
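As a rough illustration of what deriving features from the spectral-envelope parameters alone can look like, the sketch below converts a frame's reflection coefficients into LPC cepstral coefficients. It is not taken from the paper: the abstract does not say which features the authors use, so the choice of LPC cepstra, the function names `reflection_to_lpc` and `lpc_to_cepstrum`, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the assumption that the envelope parameters have already been unpacked from the bitstream into reflection coefficients are all assumptions made for illustration. The excitation bits are never read, which is the property the abstract relies on.

```python
def reflection_to_lpc(k):
    """Step-up recursion: reflection coefficients k_1..k_p to the
    predictor coefficients a_1..a_p of A(z) = 1 + sum_i a_i z^-i.
    (Sign convention is an assumption; codecs differ.)"""
    a = []
    for m, km in enumerate(k, start=1):
        prev = a
        # a_i^(m) = a_i^(m-1) + k_m * a_(m-i)^(m-1) for i < m, and a_m^(m) = k_m
        a = [prev[i] + km * prev[m - 2 - i] for i in range(m - 1)] + [km]
    return a


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of the all-pole envelope 1/A(z),
    computed with the standard recursion (gain term c_0 omitted)."""
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # sum over k of k * c_k * a_(n-k), restricted to 1 <= n-k <= p
        acc = sum(k * c[k - 1] * a[n - k - 1] for k in range(max(1, n - p), n))
        c.append((-a[n - 1] if n <= p else 0.0) - acc / n)
    return c


# Hypothetical frame: two reflection coefficients standing in for the
# quantized envelope parameters recovered from one speech frame.
frame_reflection = [0.5, 0.2]
features = lpc_to_cepstrum(reflection_to_lpc(frame_reflection), n_ceps=4)
```

In a conventional front-end the same kind of feature vector would be computed from the decoded waveform, so it would also inherit any distortion or bit errors in the excitation parameters.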

Related Organizations

Carlos III University of Madrid
Spain

Keywords

Tandeming, Telecommunications, Combined source-channel coding, Decoding, Coding distortion, Speech coding, Transmission errors, Distortion, Global System for Mobile (GSM) networks, Speech codecs, Speech recognition, Code standards, Radio networks, Feature extraction, Quantisation (signal), Wireless networks, Cellular radio

Impact by BIP!

Selected citations: 21. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.