Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors

Publication type: Article, Preprint, Conference object

Publication date: 01 Jul 2017

Embargo end date: 01 Jan 2016

Publisher: IEEE

Journal: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jonathan Huang; Vivek Rathod; Chen Sun; Menglong Zhu; Anoop Korattikara; Alireza Fathi; Ian Fischer; +4 Authors

doi: 10.1109/cvpr.2017.351, 10.48550/arxiv.1611.10012

arXiv: 1611.10012

Abstract

The goal of this paper is to serve as a guide for selecting a detection architecture that achieves the right speed/memory/accuracy balance for a given application and platform. To this end, we investigate various ways to trade accuracy for speed and memory usage in modern convolutional object detection systems. A number of successful systems have been proposed in recent years, but apples-to-apples comparisons are difficult due to different base feature extractors (e.g., VGG, Residual Networks), different default image resolutions, and different hardware and software platforms. We present a unified implementation of the Faster R-CNN [Ren et al., 2015], R-FCN [Dai et al., 2016] and SSD [Liu et al., 2015] systems, which we view as "meta-architectures", and trace out the speed/accuracy trade-off curve created by using alternative feature extractors and varying other critical parameters, such as image size, within each of these meta-architectures. At one extreme of this spectrum, where speed and memory are critical, we present a detector that achieves real-time speeds and can be deployed on a mobile device. At the opposite extreme, where accuracy is critical, we present a detector that achieves state-of-the-art performance on the COCO detection task.

Accepted to CVPR 2017
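
To make the idea of a speed/accuracy trade-off curve concrete, the following is a minimal sketch of how one might filter a set of measured detector configurations down to the non-dominated ones and then pick one under a latency budget. It is not the authors' code. The `DetectorConfig` type, the function names, and every number in `EXAMPLE_CONFIGS` are invented placeholders for illustration, not results from the paper.

```python
from typing import NamedTuple, Optional


class DetectorConfig(NamedTuple):
    name: str          # hypothetical label for a meta-architecture/extractor/image-size combination
    latency_ms: float  # placeholder per-image inference time, lower is better
    accuracy: float    # placeholder accuracy score, higher is better


# Invented placeholder measurements, NOT numbers from the paper.
EXAMPLE_CONFIGS = [
    DetectorConfig("config_a", 30.0, 0.20),
    DetectorConfig("config_b", 80.0, 0.25),
    DetectorConfig("config_c", 90.0, 0.24),   # slower and less accurate than config_b
    DetectorConfig("config_d", 400.0, 0.35),
]


def pareto_frontier(configs: list[DetectorConfig]) -> list[DetectorConfig]:
    """Keep only configs that no faster config matches or beats in accuracy."""
    frontier = []
    best_accuracy = float("-inf")
    # Walk from fastest to slowest; on equal latency, visit the more accurate first.
    for cfg in sorted(configs, key=lambda c: (c.latency_ms, -c.accuracy)):
        if cfg.accuracy > best_accuracy:
            frontier.append(cfg)
            best_accuracy = cfg.accuracy
    return frontier


def pick_for_budget(
    configs: list[DetectorConfig], max_latency_ms: float
) -> Optional[DetectorConfig]:
    """Return the most accurate config within the latency budget, or None."""
    feasible = [c for c in configs if c.latency_ms <= max_latency_ms]
    return max(feasible, key=lambda c: c.accuracy) if feasible else None


if __name__ == "__main__":
    for cfg in pareto_frontier(EXAMPLE_CONFIGS):
        print(cfg.name, cfg.latency_ms, cfg.accuracy)
    print(pick_for_budget(EXAMPLE_CONFIGS, max_latency_ms=100.0))
```

In practice the latency and accuracy values would come from benchmarking each configuration on the target hardware; memory usage could be added as a third field and handled the same way.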

Related Organizations

University of California, Irvine, United States
Cardiff University, United Kingdom
University of Pennsylvania, United States
University College London, United Kingdom
Siberian Branch of the Russian Academy of Sciences, Russian Federation


Keywords

FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition

Impact by BIP!

selected citations: 2K
popularity: Top 0.01%
influence: Top 0.01%
impulse: Top 0.01%

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
