Feedback Networks

Preprint English OPEN
Zamir, Amir R. ; Wu, Te-Lin ; Sun, Lin ; Shen, William ; Malik, Jitendra ; Savarese, Silvio (2016)
  • Subject: Computer Science - Computer Vision and Pattern Recognition

Currently, the most successful learning models in computer vision are based on learning successive representations followed by a decision layer. This is usually actualized through feedforward multilayer neural networks, e.g. ConvNets, where each layer forms one of such successive representations. However, an alternative that can achieve the same goal is a feedback based approach in which the representation is formed in an iterative manner based on a feedback received from previous iteration's output. We establish that a feedback based approach has several fundamental advantages over feedforward: it enables making early predictions at the query time, its output naturally conforms to a hierarchical structure in the label space (e.g. a taxonomy), and it provides a new basis for Curriculum Learning. We observe that feedback networks develop a considerably different representation compared to feedforward counterparts, in line with the aforementioned advantages. We put forth a general feedback based learning architecture with the endpoint results on par or better than existing feedforward networks with the addition of the above advantages. We also investigate several mechanisms in feedback architectures (e.g. skip connections in time) and design choices (e.g. feedback length). We hope this study offers new perspectives in quest for more natural and practical learning models.
  • References (48)
    48 references, page 1 of 5

    [1] M. Andriluka, L. Pishchulin, P. Gehler, and B. Schiele. 2d human pose estimation: New benchmark and state of the art analysis. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 3686-3693. IEEE, 2014.

    [2] S. J. Ashford and L. L. Cummings. Feedback as an individual resource: Personal strategies of creating information. Organizational behavior and human performance, 32(3):370- 398, 1983.

    [3] V. Belagiannis and A. Zisserman. Recurrent human pose estimation. arXiv preprint arXiv:1605.02914, 2016.

    [4] Y. Bengio, J. Louradour, R. Collobert, and J. Weston. Curriculum learning. In Proceedings of the 26th annual international conference on machine learning, pages 41-48. ACM, 2009.

    [5] C. Cao, X. Liu, Y. Yang, Y. Yu, J. Wang, Z. Wang, Y. Huang, L. Wang, C. Huang, W. Xu, et al. Look and think twice: Capturing top-down visual attention with feedback convolutional neural networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 2956-2964, 2015.

    [6] J. Carreira, P. Agrawal, K. Fragkiadaki, and J. Malik. Human pose estimation with iterative error feedback. arXiv preprint arXiv:1507.06550, 2015.

    [7] R. M. Cichy, D. Pantazis, and A. Oliva. Resolving human object recognition in space and time. Nature neuroscience, 17(3):455-462, 2014.

    [8] J. Deng, N. Ding, Y. Jia, A. Frome, K. Murphy, S. Bengio, Y. Li, H. Neven, and H. Adam. Large-scale object classification using label relation graphs. In European Conference on Computer Vision, pages 48-64. Springer, 2014.

    [9] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. FeiFei. Imagenet: A large-scale hierarchical image database. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 248-255. IEEE, 2009.

    [10] N. Ding, J. Deng, K. P. Murphy, and H. Neven. Probabilistic label relation graphs with ising models. In Proceedings of the IEEE International Conference on Computer Vision, pages 1161-1169, 2015.

  • Metrics
    No metrics available
Share - Bookmark