Publication type: part of book or chapter of book; preprint (2016)

Identity Mappings in Deep Residual Networks

He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian
Open Access
  • Published: 16 Mar 2016
  • Publisher: Springer International Publishing
Abstract
Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors. In this paper, we analyze the propagation formulations behind the residual building blocks, which suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation. A series of ablation experiments support the importance of these identity mappings. This motivates us to propose a new residual unit, which makes training easier and improves generalization. We report improved results using a 1001-layer Res...
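The abstract's key claim can be illustrated with a small sketch: in the proposed pre-activation residual unit (activation before the weight layers, identity skip connection, no after-addition activation), the input passes through the addition unchanged, so when the residual branch outputs zero the unit is exactly the identity. The sketch below is an illustrative toy, not the paper's code: batch normalization is omitted, and the dense "weight" layers and shapes are assumptions for demonstration only.

```python
import numpy as np

def relu(x):
    """Elementwise rectified linear activation."""
    return np.maximum(x, 0.0)

def pre_activation_unit(x, w1, w2):
    """Toy pre-activation residual unit: x + F(x),
    where F = weight2(relu(weight1(relu(x)))).

    BatchNorm is omitted for brevity; the point is the ordering
    (activation before weights) and the clean identity skip path,
    with no activation applied after the addition.
    """
    out = relu(x) @ w1
    out = relu(out) @ w2
    return x + out  # identity skip: the input reaches the output unchanged

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w1 = rng.standard_normal((8, 8)) * 0.01
w2 = rng.standard_normal((8, 8)) * 0.01
y = pre_activation_unit(x, w1, w2)

# With zero weights the residual branch contributes nothing, so the
# unit reduces to the identity mapping the abstract describes.
zeros = np.zeros((8, 8))
identity_out = pre_activation_unit(x, zeros, zeros)
```

Because the skip path is a pure identity, stacking such units lets the forward signal (and, symmetrically, the gradient) propagate directly from any block to any other block, which is the propagation property the paper's ablations examine.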
Subjects
free text keywords: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Learning
Download from (3 versions available):
http://arxiv.org/pdf/1603.0502...
Part of book or chapter of book
Provider: UnpayWall
http://link.springer.com/conte...
Part of book or chapter of book . 2016
Provider: Crossref
19 references, page 1 of 2

1. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR. (2016)

2. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: ICML. (2010)

3. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet Large Scale Visual Recognition Challenge. IJCV (2015)

4. Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: Common objects in context. In: ECCV. (2014)

5. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural computation (1997)

6. Srivastava, R.K., Greff, K., Schmidhuber, J.: Highway networks. In: ICML workshop. (2015)

7. Srivastava, R.K., Greff, K., Schmidhuber, J.: Training very deep networks. In: NIPS. (2015)

8. Ioffe, S., Szegedy, C.: Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: ICML. (2015)

9. LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., Hubbard, W., Jackel, L.D.: Backpropagation applied to handwritten zip code recognition. Neural computation (1989)

10. Krizhevsky, A.: Learning multiple layers of features from tiny images. Tech Report (2009)

11. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R.: Improving neural networks by preventing co-adaptation of feature detectors. arXiv:1207.0580 (2012)

12. Clevert, D.A., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (ELUs). In: ICLR. (2016)

13. Lin, M., Chen, Q., Yan, S.: Network in network. In: ICLR. (2014)

14. Lee, C.Y., Xie, S., Gallagher, P., Zhang, Z., Tu, Z.: Deeply-supervised nets. In: AISTATS. (2015)

15. Romero, A., Ballas, N., Kahou, S.E., Chassang, A., Gatta, C., Bengio, Y.: FitNets: Hints for thin deep nets. In: ICLR. (2015)
