publication . Preprint . 2017

Learning Sparse Visual Representations with Leaky Capped Norm Regularizers

Wangni, Jianqiao; Lin, Dahua
Open Access English
  • Published: 08 Nov 2017
Abstract
Sparsity-inducing regularization is an important part of learning over-complete visual representations. Despite the popularity of $\ell_1$ regularization, in this paper we investigate the use of non-convex regularizers for this problem. Our contribution consists of three parts. First, we propose the leaky capped norm regularization (LCNR), which regularizes model weights below a certain threshold more strongly than those above it, and therefore imposes strong sparsity while introducing only a controllable estimation bias. We propose a majorization-minimization algorithm to optimize the joint objective function. Second, our study over monocular 3D...
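The abstract does not give the LCNR formula, so the following is only an illustrative sketch of the idea it describes, not the authors' definition or algorithm. It assumes a piecewise-linear penalty with slope `lam` for magnitudes below a threshold `theta` and a smaller slope `lam * leak` above it, and applies a majorization-minimization loop to a simple per-coordinate denoising objective. The names `theta`, `lam`, `leak`, and every function below are hypothetical.

```python
import math


def leaky_capped_penalty(w, theta=1.0, lam=1.0, leak=0.1):
    # ASSUMED form: slope lam for |w| below theta, slope lam * leak above it.
    # With 0 <= leak < 1 the penalty is concave in |w|.
    a = abs(w)
    return lam * min(a, theta) + lam * leak * max(a - theta, 0.0)


def mm_slope(w_prev, theta=1.0, lam=1.0, leak=0.1):
    # Slope of a linear majorizer of the concave penalty at |w_prev|.
    # This turns the penalty into a weighted l1 term for the next step.
    return lam if abs(w_prev) < theta else lam * leak


def soft_threshold(z, t):
    # Closed-form minimizer of 0.5 * (w - z)**2 + t * |w|.
    return math.copysign(max(abs(z) - t, 0.0), z)


def mm_denoise(y, theta=1.0, lam=1.0, leak=0.1, iters=20):
    # Minimize sum_i 0.5 * (w_i - y_i)**2 + penalty(w_i) by repeatedly
    # majorizing the penalty at the current iterate and solving the
    # resulting weighted-l1 problem exactly.
    w = list(y)
    for _ in range(iters):
        w = [
            soft_threshold(yi, mm_slope(wi, theta, lam, leak))
            for yi, wi in zip(y, w)
        ]
    return w


if __name__ == "__main__":
    # Small entries are shrunk hard; the large entry is barely biased.
    print(mm_denoise([0.3, 0.9, 3.0], theta=1.0, lam=0.5, leak=0.1))
```

Under these assumptions, the two small inputs are shrunk by the full `lam`, so the smallest is set exactly to zero, while the input above the threshold is shrunk only by `lam * leak`. This matches the stronger regularization below the threshold and the controllable bias above it that the abstract describes.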
Subjects
free text keywords: Computer Science - Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Mathematics - Numerical Analysis, Statistics - Machine Learning