publication . Preprint . Part of book or chapter of book . Other literature type . 2016

Bi-directional LSTM Recurrent Neural Network for Chinese Word Segmentation

Yushi Yao; Zheng Huang;
Open Access English
  • Published: 15 Feb 2016
Abstract
Recurrent neural networks (RNNs) have been broadly applied to natural language processing (NLP) problems. This kind of neural network is designed for modeling sequential data and has been shown to be quite efficient in sequential tagging tasks. In this paper, we propose to use a bi-directional RNN with long short-term memory (LSTM) units for Chinese word segmentation, which is a crucial preprocessing task for modeling Chinese sentences and articles. Classical methods focus on designing and combining hand-crafted features from context, whereas the bi-directional LSTM network (BLSTM) does not need any prior knowledge or pre-designing, and it is expert in keeping the contextual ...
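The abstract frames Chinese word segmentation as a sequential tagging task. As a minimal sketch of what that framing usually means, the Python below converts a segmented sentence into one tag per character and back again. The B/M/E/S tag set (begin, middle, end, single) is a common convention assumed here for illustration; the truncated abstract does not state which tag set the authors use, and the function names `words_to_tags` and `tags_to_words` are hypothetical. A tagger such as a BLSTM would predict the tag sequence; this sketch only covers the encoding around it.

```python
from typing import List


def words_to_tags(words: List[str]) -> List[str]:
    """Map a segmented sentence to one tag per character.

    Assumed tag set (not confirmed by the source):
    B = first character of a multi-character word,
    M = middle character, E = last character, S = single-character word.
    """
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # skip empty strings defensively
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild words from characters and their per-character tags."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends on E or S
            words.append(current)
            current = ""
    if current:  # flush a trailing word left open by an ill-formed tag sequence
        words.append(current)
    return words


sentence = ["我", "喜欢", "自然语言"]
tags = words_to_tags(sentence)
restored = tags_to_words("".join(sentence), tags)
print(tags)
print(restored)
```

With this encoding, segmentation reduces to predicting one label per character, which is the kind of problem a sequence model reading context in both directions is suited to.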
Subjects
free text keywords: Computer Science - Learning, Computer Science - Computation and Language, Contextual information, Chinese word, Sequential data, Natural language processing, Segmentation, Text segmentation, Recurrent neural network, Natural language, Artificial intelligence, Artificial neural network, Computer science
Communities
Digital Humanities and Cultural Heritage
Download from
http://arxiv.org/pdf/1602.0487...
Part of book or chapter of book
Provider: UnpayWall
http://dx.doi.org/10.1007/978-...
Other literature type . 2016
Provider: Datacite