publication . Other literature type . Preprint . Conference object . 2013

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick; Jeff Donahue; Trevor Darrell; Jitendra Malik
  • Published: 11 Nov 2013
  • Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training ...
Subjects
free text keywords: Computer Science - Computer Vision and Pattern Recognition, Convolutional neural network, Source code, Object detection, Ensemble systems, Pattern recognition, Feature (computer vision), Computer vision, Segmentation, Artificial intelligence, Hierarchy, Machine learning, Scalability, Computer science