Publication · Preprint · 2019

MacNet: Transferring Knowledge from Machine Comprehension to Sequence-to-Sequence Models

Pan, Boyuan; Yang, Yazheng; Li, Hao; Zhao, Zhou; Zhuang, Yueting; Cai, Deng; He, Xiaofei
Open Access · English
  • Published: 23 Jul 2019
Abstract
Machine Comprehension (MC) is one of the core problems in natural language processing, requiring both understanding of natural language and knowledge about the world. Rapid progress has been made since the release of several benchmark datasets, and state-of-the-art models now even surpass human performance on the well-known SQuAD evaluation. In this paper, we transfer knowledge learned from machine comprehension to sequence-to-sequence tasks to deepen the model's understanding of the text. We propose MacNet, a novel encoder-decoder supplementary architecture for the widely used attention-based sequence-to-sequence models. Experiments on neural machine translation ...
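The abstract describes MacNet only at a high level: a model pretrained on machine comprehension supplements an attention-based sequence-to-sequence model. As a minimal, hypothetical PyTorch sketch of that general idea (not the paper's actual interface; all module and parameter names below are illustrative assumptions), a frozen MC encoder can be reused as a feature extractor whose outputs are fused with the baseline encoder states before the decoder attends over them:

```python
import torch
import torch.nn as nn

class MCEncoder(nn.Module):
    """Stand-in for a pretrained machine-comprehension encoder (hypothetical).
    In practice this would be the trained encoding layers of an MC model,
    e.g. one trained on SQuAD."""
    def __init__(self, emb_dim=300, mc_dim=128):
        super().__init__()
        self.rnn = nn.GRU(emb_dim, mc_dim, batch_first=True, bidirectional=True)

    def forward(self, emb):                  # emb: (batch, src_len, emb_dim)
        out, _ = self.rnn(emb)               # (batch, src_len, 2*mc_dim)
        return out

class MacNetStyleEncoder(nn.Module):
    """Sketch of a 'supplementary' seq2seq encoder: baseline RNN states are
    concatenated with frozen MC features and projected back down, so the
    decoder's attention sees MC-informed source representations."""
    def __init__(self, vocab_size, emb_dim=300, hid=256, mc_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid, batch_first=True, bidirectional=True)
        self.mc = MCEncoder(emb_dim, mc_dim)
        for p in self.mc.parameters():       # freeze the transferred MC knowledge
            p.requires_grad = False
        self.fuse = nn.Linear(2 * hid + 2 * mc_dim, 2 * hid)

    def forward(self, src_ids):              # src_ids: (batch, src_len)
        emb = self.embed(src_ids)
        base, _ = self.rnn(emb)              # (batch, src_len, 2*hid)
        mc_feats = self.mc(emb)              # (batch, src_len, 2*mc_dim)
        return torch.tanh(self.fuse(torch.cat([base, mc_feats], dim=-1)))

# Usage: feed token ids, get attention-ready source states.
encoder = MacNetStyleEncoder(vocab_size=32000)
states = encoder(torch.randint(0, 32000, (4, 20)))  # shape: (4, 20, 512)
```

Freezing the MC encoder is one of several plausible transfer choices; fine-tuning it jointly with the sequence-to-sequence objective would be an equally reasonable variant.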
Subjects
Free-text keywords: Computer Science - Computation and Language; Computer Science - Machine Learning