A step beyond local observations with a dialog aware bidirectional GRU network for Spoken Language Understanding

Conference object English OPEN
Vukotic, Vedran; Raymond, Christian; Gravier, Guillaume (2016)
  • Publisher: HAL CCSD
  • Subject: spoken language understanding | gated recurrent units | [INFO.INFO-CL] Computer Science [cs] / Computation and Language [cs.CL] | GRU | bidirectional LSTM | dialog history | dialog | deep learning | LSTM | RNN | bidirectional GRU | recurrent neural networks | SLU | long short-term memory

Recurrent Neural Network (RNN) architectures have recently become a very popular choice for Spoken Language Understanding (SLU) problems; however, they form a large family of different architectures that can furthermore be combined into more complex neural networks. In this work, we compare different recurrent networks, such as simple Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRU) and their bidirectional versions, on the popular ATIS dataset and on MEDIA, a more complex French dataset. Additionally, we propose a novel method in which information about the presence of relevant word classes in the dialog history is combined with a bidirectional GRU, and we show that adding this dialog-history information improves performance over recurrent networks that analyze only the current sentence.
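
To make the idea of a dialog-aware bidirectional GRU more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not say how the dialog-history information is combined with the network, so the sketch assumes the simplest option: a binary vector marking which word classes appeared earlier in the dialog is concatenated to every word vector of the current sentence. The names `GRUCell`, `bigru_with_history` and `history_flags`, the dimensions, the random untrained weights, and the omission of bias terms are all illustrative assumptions. The gate equations follow one common GRU convention.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Bias-free GRU cell with random weights (hypothetical, untrained)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, h = candidate state.
        self.W = {g: init_matrix(hidden_dim, input_dim, rng) for g in "zrh"}
        self.U = {g: init_matrix(hidden_dim, hidden_dim, rng) for g in "zrh"}

    def step(self, x, h_prev):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h_prev))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h_prev))]
        gated = [ri * hi for ri, hi in zip(r, h_prev)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(self.W["h"], x), matvec(self.U["h"], gated))]
        # Interpolate between the previous state and the candidate state.
        return [(1.0 - zi) * hi + zi * ci
                for zi, hi, ci in zip(z, h_prev, cand)]


def run(cell, inputs):
    """Run a cell over a sequence and return the hidden state at each step."""
    h = [0.0] * cell.hidden_dim
    states = []
    for x in inputs:
        h = cell.step(x, h)
        states.append(h)
    return states


def bigru_with_history(word_vectors, history_flags, fwd, bwd):
    """Per-word features from a bidirectional GRU over history-augmented inputs.

    Assumption: the dialog-history vector is appended to every word vector.
    """
    inputs = [x + history_flags for x in word_vectors]  # list concatenation
    forward = run(fwd, inputs)
    # Process the sentence right to left, then restore the original order.
    backward = run(bwd, inputs[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]


rng = random.Random(0)
embed_dim, n_classes, hidden_dim = 4, 3, 5
# Stand-in for the embeddings of a six-word sentence.
sentence = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(6)]
# Hypothetical flags: word classes 0 and 2 were seen earlier in the dialog.
history_flags = [1.0, 0.0, 1.0]
fwd = GRUCell(embed_dim + n_classes, hidden_dim, rng)
bwd = GRUCell(embed_dim + n_classes, hidden_dim, rng)
features = bigru_with_history(sentence, history_flags, fwd, bwd)
print(len(features), len(features[0]))
```

In a full tagger, each per-word feature vector would be passed to an output layer that predicts a concept label for that word. Because the history flags enter both directions at every position, the prediction for a word can depend on earlier dialog turns as well as on the whole current sentence.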