publication . Conference object . Preprint . 2018

Learning Combinations of Activation Functions

Manessi, Franco; Rozza, Alessandro;
Open Access
  • Published: 29 Jan 2018
  • Publisher: IEEE
Abstract
Comment: 6 pages, 3 figures. Published as a conference paper at ICPR 2018. Code: https://bitbucket.org/francux/learning_combinations_of_activation_functions
Subjects
free text keywords: Computer Science - Machine Learning
References (15 of 35 shown)

[1] W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, and G. Zweig, “Achieving human parity in conversational speech recognition,” arXiv preprint arXiv:1610.05256, 2016. [OpenAIRE]

[2] D. C. Ciresan, U. Meier, J. Masci, and J. Schmidhuber, “A committee of neural networks for traffic sign classification.” in IJCNN. IEEE, 2011, pp. 1918-1921. [OpenAIRE]

[3] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE transactions on neural networks, vol. 5, no. 2, pp. 157-166, 1994. [OpenAIRE]

[4] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 249-256.

[5] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.

[6] K. Jarrett, K. Kavukcuoglu, Y. LeCun et al., “What is the best multistage architecture for object recognition?” in Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009, pp. 2146-2153.

[7] X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, pp. 315-323. [OpenAIRE]

[8] A. L. Maas, A. Y. Hannun, and A. Y. Ng, “Rectifier Nonlinearities Improve Neural Network Acoustic Models,” in ICML Workshop on Deep Learning for Audio, Speech and Language Processing, 2013. [OpenAIRE]

[9] K. Konda, R. Memisevic, and D. Krueger, “Zero-bias autoencoders and the benefits of co-adapting features,” arXiv preprint arXiv:1402.3337, 2014. [OpenAIRE]

[10] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” in Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), ser. ICCV '15. Washington, DC, USA: IEEE Computer Society, 2015, pp. 1026-1034. [Online]. Available: http://dx.doi.org/10.1109/ICCV.2015.123

[11] C. Dugas, Y. Bengio, F. Bélisle, C. Nadeau, and R. Garcia, “Incorporating second-order functional knowledge for better option pricing,” in Advances in neural information processing systems, 2001, pp. 472-478.

[12] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. Courville, and Y. Bengio, “Maxout networks,” in Proceedings of the 30th International Conference on Machine Learning - Volume 28. JMLR.org, 2013, pp. III-1319.

[13] W. Sun, F. Su, and L. Wang, “Improving deep neural networks with multilayer maxout networks,” 2014 IEEE Visual Communications and Image Processing Conference, VCIP 2014, pp. 334-337, dec 2015.

[14] C. Gulcehre, K. Cho, R. Pascanu, and Y. Bengio, “Learned-norm pooling for deep feedforward and recurrent neural networks,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2014, pp. 530-546. [OpenAIRE]

[15] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, “Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs),” Proceedings of ICLR 2016, nov 2016.
