publication . Doctoral thesis . Other literature type . Preprint . 2017

Deep Learning for Distant Speech Recognition

Ravanelli, Mirco;
Open Access
  • Published: 15 Dec 2017
  • Publisher: University of Trento
  • Country: Italy
Abstract
Comment: PhD Thesis Unitn, 2017
Subjects
free text keywords: ING-INF/05 SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI, Computer Science - Computation and Language, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
17 references, page 1 of 2

C Experimental Setups
C.1 Data Contamination - Setup 1
C.2 Data Contamination - Setup 2
C.3 Data Contamination - Setup 3
C.4 Managing Time Contexts - Setup 1
C.5 Managing Time Contexts - Setup 2
C.6 Networks of DNNs - Setup 1
C.7 Networks of DNNs - Setup 2

1. M. Ravanelli, P. Brakel, M. Omologo, Y. Bengio, "Light Gated Recurrent Units for Speech Recognition", in IEEE Transactions on Emerging Topics in Computational Intelligence (to appear).

2. M. Ravanelli, P. Brakel, M. Omologo, Y. Bengio, "A network of deep neural networks for distant speech recognition", in Proceedings of ICASSP 2017 (best IBM student paper award). [OpenAIRE]

3. M. Ravanelli, P. Brakel, M. Omologo, Y. Bengio, "Improving Speech Recognition by Revising Gated Recurrent Units", in Proceedings of Interspeech 2017. [OpenAIRE]

4. M. Ravanelli, P. Brakel, M. Omologo, Y. Bengio, "Batch-normalized joint training for DNN-based distant speech recognition", in Proceedings of SLT 2016.

5. M. Ravanelli, P. Svaizer, M. Omologo, "Realistic Multi-Microphone Data Simulation for Distant Speech Recognition", in Proceedings of Interspeech 2016. [OpenAIRE]

6. M. Matassoni, M. Ravanelli, S. Jalalvand, A. Brutti, "The FBK system for the CHiME-4 challenge", in Proceedings of the CHiME 4 challenge.

9. E. Zwyssig, M. Ravanelli, P. Svaizer, M. Omologo, "A multi-channel corpus for distant-speech interaction in presence of known Interferences", in Proceedings of ICASSP 2015. [OpenAIRE]

10. M. Ravanelli, B. Elizalde, J. Bernd, G. Friedland, "Insights into Audio-Based Multimedia Event Classification with Neural Networks", in Proceedings of ACM-MMCOMMONS.

11. M. Ravanelli, M. Omologo, "On the selection of the impulse responses for distant-speech recognition based on contaminated speech training", in Proceedings of INTERSPEECH 2014. [OpenAIRE]

12. L. Cristoforetti, M. Ravanelli, M. Omologo, A. Sosi, A. Abad, M. Hagmueller, P. Maragos, "The DIRHA simulated corpus", in Proceedings of LREC 2014. [OpenAIRE]

13. M. Matassoni, R. Astudillo, A. Katsamanis, M. Ravanelli, "The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones", in Proceedings of INTERSPEECH 2014.

14. A. Brutti, M. Ravanelli, M. Omologo, "SASLODOM: Speech Activity detection and Speaker LOcalization in DOMestic environments", in Proceedings of Evalita 2014.

15. A. Brutti, M. Ravanelli, P. Svaizer, M. Omologo, "A speech event detection and localization task for multiroom environments", in Proceedings of HSCMA 2014. [OpenAIRE]

16. M. Ravanelli, V.H. Do, A. Janin, "TANDEM-Bottleneck Feature Combination using Hierarchical Deep Neural Networks", in Proceedings of ISCSLP 2014. [OpenAIRE]
