Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network

Preprint English OPEN
Su, Yu-Chuan; Chiu, Tzu-Hsuan; Yeh, Chun-Yen; Huang, Hsin-Fu; Hsu, Winston H.
  • Subject: Computer Science - Computer Vision and Pattern Recognition | Computer Science - Learning

Unconstrained video recognition and Deep Convolutional Networks (DCNs) have recently become two active topics in computer vision. In this work, we apply DCNs as frame-based recognizers for video recognition. Our preliminary studies, however, show that video corpora with complete gr...