publication . Article . Preprint . 2018

Learning Depth with Convolutional Spatial Propagation Network

Cheng, Xinjing; Wang, Peng; Yang, Ruigang
Open Access
  • Published: 04 Oct 2018
  • Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence (ISSN: 0162-8828, eISSN: 1939-3539)
  • Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Depth prediction is one of the fundamental problems in computer vision. In this paper, we propose a simple yet effective convolutional spatial propagation network (CSPN) that learns the affinity matrix for various depth estimation tasks. Specifically, it is an efficient linear propagation model in which propagation is performed by a recurrent convolutional operation, and the affinity among neighboring pixels is learned through a deep convolutional neural network (CNN). This module can be appended to the output of any state-of-the-art (SOTA) depth estimation network to improve its performance. In practice, we further extend CSPN in two aspects: 1)...
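The linear propagation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 3x3 neighborhood, the normalization of the affinities by their absolute sum, the border handling, and the names `propagate_step` and `refine` are all assumptions made for illustration. In CSPN the affinities would be predicted by a CNN; here they are passed in as plain numbers.

```python
from typing import List

Grid = List[List[float]]

# Offsets of the 8 neighbours in a 3x3 window (window size is an assumption).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def propagate_step(depth: Grid, affinity: List[List[List[float]]]) -> Grid:
    """Apply one linear propagation step to a depth map.

    affinity[i][j] holds 8 raw weights for pixel (i, j), one per entry in
    OFFSETS. A CNN would normally predict them; here they are inputs.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            raw = affinity[i][j]
            # Normalise by the absolute sum of the raw weights (assumed scheme);
            # fall back to 1.0 so all-zero affinities leave the pixel unchanged.
            norm = sum(abs(k) for k in raw) or 1.0
            acc = 0.0   # weighted sum of neighbouring depths
            used = 0.0  # total weight given to in-image neighbours
            for (di, dj), k in zip(OFFSETS, raw):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:  # skip out-of-image neighbours
                    weight = k / norm
                    acc += weight * depth[ni][nj]
                    used += weight
            # The centre pixel keeps the remaining weight, so all weights sum to 1.
            out[i][j] = (1.0 - used) * depth[i][j] + acc
    return out


def refine(depth: Grid, affinity: List[List[List[float]]], steps: int) -> Grid:
    """Repeat the same propagation step `steps` times (the recurrent part)."""
    for _ in range(steps):
        depth = propagate_step(depth, affinity)
    return depth


# Usage: a single depth spike in a 3x3 map with uniform affinities.
spike = [[0.0, 0.0, 0.0], [0.0, 8.0, 0.0], [0.0, 0.0, 0.0]]
uniform = [[[1.0] * 8 for _ in range(3)] for _ in range(3)]
once = refine(spike, uniform, steps=1)
```

Because the weights at each pixel sum to one, a constant depth map passes through unchanged, while local differences are spread to neighbouring pixels according to the affinities.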
Subjects
free text keywords: Computational Theory and Mathematics, Software, Applied Mathematics, Artificial Intelligence, Computer Vision and Pattern Recognition, Pattern recognition, Computer vision, Append, Stereo matching, business.industry, business, Computer science, Affinity matrix, Depth map, Computer Science - Computer Vision and Pattern Recognition