Publication · Preprint · 2018

Self-Attention Recurrent Network for Saliency Detection

Sun, Fengdong; Li, Wenhui; Guan, Yuanyuan
Open Access English
  • Published: 05 Aug 2018
Abstract
Feature maps in deep neural networks generally carry different semantics. Existing methods often ignore these characteristics, which may lead to sub-optimal results. In this paper, we propose a novel end-to-end deep saliency network that effectively utilizes multi-scale feature maps according to their characteristics. Shallow layers often contain more local information, while deep layers are stronger in global semantics. The network therefore generates detailed saliency maps by enhancing the local and global information of feature maps in different layers. On one hand, local information of shallow layers is enhanced by a recurrent structure which shared convol...
Subjects
free text keywords: Computer Science - Computer Vision and Pattern Recognition