Publication · Preprint · 2016

Information Extraction with Character-level Neural Networks and Free Noisy Supervision

Meerkamp, Philipp; Zhou, Zhengyi
Open Access English
  • Published: 13 Dec 2016
Abstract
We present an architecture for information extraction from text that augments an existing parser with a character-level neural network. The network is trained using a measure of consistency of extracted data with existing databases as a form of noisy supervision. Our architecture combines the ability of constraint-based information extraction systems to easily incorporate domain knowledge and constraints with the ability of deep neural networks to leverage large amounts of data to learn complex features. Boosting the existing parser's precision, the system led to large improvements over a mature and highly tuned constraint-based production information extraction...
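The abstract describes deriving noisy supervision by checking extracted data for consistency against existing databases. As a minimal illustrative sketch (not the authors' code; all names and data below are hypothetical), this labeling step might look like:

```python
# Sketch of noisy-supervision labeling: each parser extraction is labeled 1 if
# it agrees with a reference database, 0 otherwise. These noisy labels could
# then train a character-level network that scores candidate extractions.

def consistency_labels(extractions, database):
    """Return a 0/1 noisy label per extracted record, by database lookup."""
    labels = []
    for record in extractions:
        key = (record["entity"], record["field"])
        # Label is 1 only when the database holds the same value for this key.
        labels.append(1 if database.get(key) == record["value"] else 0)
    return labels

# Hypothetical parser output and reference database.
extractions = [
    {"entity": "ACME Corp", "field": "price", "value": "101.25"},
    {"entity": "ACME Corp", "field": "maturity", "value": "2026-01-01"},
]
database = {
    ("ACME Corp", "price"): "101.25",
    ("ACME Corp", "maturity"): "2025-01-01",  # disagrees with the extraction
}

print(consistency_labels(extractions, database))  # [1, 0]
```

Such labels are noisy because a database mismatch may reflect a stale database rather than a parsing error, which is why the paper treats them as weak rather than gold supervision.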
Subjects
free text keywords: Computer Science - Computation and Language, Computer Science - Information Retrieval, Computer Science - Learning
20 references, page 1 of 2

[Ballesteros, Dyer, and Smith 2015] Ballesteros, M.; Dyer, C.; and Smith, N. A. 2015. Improved transition-based dependency parsing by modeling characters instead of words with LSTMs. In Proceedings of EMNLP.

[Baum and Petrie 1966] Baum, L. E., and Petrie, T. 1966. Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics 37(6):1554-1563.

[Chang, Ratinov, and Roth 2012] Chang, M.; Ratinov, L.; and Roth, D. 2012. Structured learning with constrained conditional models. Machine Learning 88(3):399-431.

[Chiticariu, Li, and Reiss 2013] Chiticariu, L.; Li, Y.; and Reiss, F. R. 2013. Rule-based information extraction is dead! Long live rule-based information extraction systems! In Proceedings of EMNLP.

[Chiu and Nichols 2015] Chiu, J. P., and Nichols, E. 2015. Named entity recognition with bidirectional LSTM-CNNs. http://arxiv.org/pdf/1511.08308v4.pdf.

[Chollet 2015] Chollet, F. 2015. Keras. https://github.com/fchollet/keras.

[Schuster and Paliwal 1997] Schuster, M., and Paliwal, K. K. 1997. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing 45:2673-2681.

[Socher et al. 2013] Socher, R.; Bauer, J.; Manning, C. D.; and Ng, A. Y. 2013. Parsing with compositional vector grammars. In Proceedings of the ACL.

[Srivastava et al. 2014] Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; and Salakhutdinov, R.

2014. Dropout: a simple way to prevent neural networks from overfitting. J. of Machine Learning Research 15.

[Sutskever, Vinyals, and Le 2014] Sutskever, I.; Vinyals, O.; and Le, Q. V. 2014. Sequence to sequence learning with neural networks. In Adv. in Neural Information Processing Systems, volume 27.

[Täckström, Ganchev, and Das 2015] Täckström, O.; Ganchev, K.; and Das, D. 2015. Efficient inference and structured learning for semantic role labeling. Transactions of the Association for Computational Linguistics 3:29-41.

[Theano Development Team 2016] Theano Development Team. 2016. Theano: A Python framework for fast computation of mathematical expressions. arXiv e-prints.

[Toutanova et al. 2008] Toutanova, K.; Haghighi, A.; and Manning, C. D. 2008. A global joint model for semantic role labeling. Computational Linguistics 34:161-191.
