Preprint, 2018

Compression of Deep Convolutional Neural Networks under Joint Sparsity Constraints

Choi, Yoojin; El-Khamy, Mostafa; Lee, Jungwon
Open Access, English
Published: 21 May 2018
Abstract
We consider the optimization of deep convolutional neural networks (CNNs) such that they provide good performance while having reduced complexity if deployed on either conventional systems utilizing spatial-domain convolution or lower-complexity systems designed for Winograd convolution. Furthermore, we explore the universal quantization and compression of these networks. In particular, the proposed framework produces one compressed model whose convolutional filters can be made sparse either in the spatial domain or in the Winograd domain. Hence, one compressed model can be deployed universally on any platform, without the need for re-training on the deployed platform ...
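The central observation behind the abstract is that a single 3x3 convolutional filter has two equivalent representations: its 3x3 spatial-domain form and its 4x4 Winograd-domain form obtained through the standard F(2x2, 3x3) filter transform, and sparsity can be enforced in either one. The sketch below is only a minimal illustration of that correspondence, not the authors' training framework; the transform matrix G is the standard Winograd filter transform, while the random filter and the pruning threshold are arbitrary assumptions chosen for demonstration.

```python
# Minimal sketch (not the paper's implementation): one 3x3 filter, two domains,
# and magnitude pruning applied independently in each domain.
import numpy as np

# Standard filter-transform matrix G for Winograd F(2x2, 3x3) convolution.
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])

def to_winograd(g):
    """Map a 3x3 spatial-domain filter g to its 4x4 Winograd-domain form G g G^T."""
    return G @ g @ G.T

def prune(w, threshold):
    """Zero out coefficients whose magnitude falls below the (illustrative) threshold."""
    return np.where(np.abs(w) < threshold, 0.0, w)

rng = np.random.default_rng(0)
g = rng.normal(size=(3, 3))                    # a random 3x3 spatial-domain filter

spatial_sparse = prune(g, 0.5)                 # sparsity for spatial-domain convolution
winograd_sparse = prune(to_winograd(g), 0.5)   # sparsity for Winograd convolution

print("spatial-domain zeros: ", int(np.sum(spatial_sparse == 0)))
print("Winograd-domain zeros:", int(np.sum(winograd_sparse == 0)))
```

In the paper's framework, one compressed model is produced whose filters can be made sparse in whichever of these two domains the deployment platform uses, so pruning in the chosen domain happens after decompression rather than through platform-specific re-training.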
Subjects
ACM Computing Classification System: Data: Coding and Information Theory
Free-text keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning; Computer Science - Neural and Evolutionary Computing
