
Transformer-based Cascaded Multimodal Speech Translation

Wu, Zixiu; Caglayan, Ozan; Ive, Julia; Wang, Josiah; Specia, Lucia;
Open Access English
  • Published: 02 Nov 2019
  • Publisher: Zenodo
Abstract
Comment: Accepted to IWSLT 2019
Subjects
free text keywords: Computer Science - Computation and Language
Funded by
EC| MultiMT
Project
MultiMT
Multi-modal Context Modelling for Machine Translation
  • Funder: European Commission (EC)
  • Project Code: 678017
  • Funding stream: H2020 | ERC | ERC-STG
Download from (4 versions):
Zenodo
Other literature type . 2019
Provider: Datacite
ZENODO
Conference object . 2019
Provider: ZENODO
33 references, page 1 of 3

[2] H. Hassan, A. Aue, C. Chen, V. Chowdhary, J. Clark, C. Federmann, X. Huang, M. Junczys-Dowmunt, W. Lewis, M. Li, et al., “Achieving human parity on automatic Chinese to English news translation,” arXiv preprint arXiv:1803.05567, 2018.

[3] R. Sanabria, O. Caglayan, S. Palaskar, D. Elliott, L. Barrault, L. Specia, and F. Metze, “How2: a large-scale dataset for multimodal language understanding,” in Proceedings of the Workshop on Visually Grounded Interaction and Language (ViGIL). NeurIPS, 2018. [Online]. Available: http://arxiv.org/abs/1811.00347

[4] Z. Wu, J. Ive, J. Wang, P. Madhyastha, and L. Specia, “Predicting actions to help predict translations,” arXiv preprint arXiv:1908.01665, 2019.

[5] F. Casacuberta, M. Federico, H. Ney, and E. Vidal, “Recent efforts in spoken language translation,” IEEE Signal Processing Magazine, vol. 25, no. 3, pp. 80-88, May 2008.

[6] A. Waibel and C. Fügen, “Spoken language translation,” IEEE Signal Processing Magazine, vol. 25, no. 3, pp. 70-79, 2008.

[7] L. Specia, S. Frank, K. Sima'an, and D. Elliott, “A shared task on multimodal machine translation and crosslingual image description,” in Proceedings of the First Conference on Machine Translation, Berlin, Germany, August 2016, pp. 543-553. [Online]. Available: http://www.aclweb.org/anthology/W/W16/W16-2346

[8] J. Ive, P. S. Madhyastha, and L. Specia, “Distilling translations with visual awareness,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019.

[9] O. Caglayan, P. Madhyastha, L. Specia, and L. Barrault, “Probing the need for visual context in multimodal machine translation,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), 2019, pp. 4159-4170. [Online]. Available: https://www.aclweb.org/anthology/N19-1422/

[10] O. Caglayan, W. Aransa, Y. Wang, M. Masana, M. García-Martínez, F. Bougares, L. Barrault, and J. van de Weijer, “Does multimodality help human and machine for translation and image captioning?” in Proceedings of the First Conference on Machine Translation. Berlin, Germany: Association for Computational Linguistics, August 2016, pp. 627-633. [Online]. Available: http://www.aclweb.org/anthology/W/W16/W16-2358

[11] J. Libovický and J. Helcl, “Attention strategies for multi-source sequence-to-sequence learning,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, 2017, pp. 196-202. [Online]. Available: http://aclweb.org/anthology/P17-2031

[12] O. Caglayan, W. Aransa, A. Bardet, M. García-Martínez, F. Bougares, L. Barrault, M. Masana, L. Herranz, and J. van de Weijer, “LIUM-CVC submissions for WMT17 multimodal translation task,” in Proceedings of the Second Conference on Machine Translation, Volume 2: Shared Task Papers. Copenhagen, Denmark: Association for Computational Linguistics, September 2017, pp. 432-439. [Online]. Available: http://www.aclweb.org/anthology/W17-4746.pdf

[13] I. Calixto and Q. Liu, “Incorporating global visual features into attention-based neural machine translation,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen, Denmark: Association for Computational Linguistics, September 2017, pp. 992-1003. [Online]. Available: https://www.aclweb.org/anthology/D17-1105

[14] P. S. Madhyastha, J. Wang, and L. Specia, “Sheffield MultiMT: Using object posterior predictions for multimodal machine translation,” in Proceedings of the Second Conference on Machine Translation, 2017.

[15] S.-A. Grönroos, B. Huet, M. Kurimo, J. Laaksonen, B. Merialdo, P. Pham, M. Sjöberg, U. Sulubacak, J. Tiedemann, R. Troncy, and R. Vázquez, “The MeMAD submission to the WMT18 multimodal translation task,” in Proceedings of the Third Conference on Machine Translation. Brussels, Belgium: Association for Computational Linguistics, October 2018, pp. 609-617. [Online]. Available: http://www.aclweb.org/anthology/W18-64066
