publication . Preprint . 2016

Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition

Zhang, Zewang; Sun, Zheng; Liu, Jiaqi; Chen, Jingwen; Huo, Zhao; Zhang, Xiao
Open Access English
  • Published: 22 Nov 2016
Abstract
Comment: 11 pages, 13 figures
Subjects
free text keywords: Computer Science - Computation and Language, Computer Science - Learning
45 references, page 1 of 3

[1] Bahl, Lalit R., Frederick Jelinek, and Robert L. Mercer, “A maximum likelihood approach to continuous speech recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 1983, pp. 179-190.

[2] Bahl, L. R., et al., “Maximum mutual information estimation of hidden Markov model parameters for speech recognition,” Proc. ICASSP, vol. 86, 1986.

[3] Levinson, S. E., L. R. Rabiner, and M. M. Sondhi, “An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition,” Bell Labs Technical Journal, vol. 62, no. 4, 1983, pp. 1035-1074.

[4] Rabiner, Lawrence R., “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, 1989, pp. 257-286.

[5] Levinson, S. E., L. R. Rabiner, and M. M. Sondhi, “An Introduction to the Application of the Theory of Probabilistic Functions of a Markov Process to Automatic Speech Recognition,” Bell System Technical Journal, vol. 62, no. 4, 1983, pp. 1035-1074.

[6] Deng, Li, et al., “Recent advances in deep learning for speech research at Microsoft,” 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pp. 8604-8608.

[7] Dahl, George E., et al., “Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, 2012, pp. 30-42.

[8] Hinton, Geoffrey, et al., “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Processing Magazine, vol. 29, no. 6, 2012, pp. 82-97.

[9] Weng, Chao, et al., “Deep neural networks for single-channel multitalker speech recognition,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 10, 2015, pp. 1670-1679.

[10] Dahl, G. E., T. N. Sainath, and G. E. Hinton, “Improving deep neural networks for LVCSR using rectified linear units and dropout,” 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pp. 8609-8613.

[11] Miao, Y., and F. Metze, “Improving low-resource CD-DNN-HMM using dropout and multilingual DNN training,” Proc. Interspeech, 2013.

[12] Graves, Alex, and Navdeep Jaitly, “Towards end-to-end speech recognition with recurrent neural networks,” International Conference on Machine Learning, vol. 14, 2014.

[13] Amodei, Dario, et al., “Deep speech 2: End-to-end speech recognition in English and Mandarin,” arXiv preprint arXiv:1512.02595, 2015.

[14] Hannun, Awni, et al., “Deep speech: Scaling up end-to-end speech recognition,” arXiv preprint arXiv:1412.5567, 2014.

[15] Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton, “Speech recognition with deep recurrent neural networks,” 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.
