Name: The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding
Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), open-vocabulary detection, fine-grained understanding, benchmark, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)

Publication: Article, Conference object, Preprint
Date: 16 Jun 2024
Embargo end date: 01 Jan 2023
Publisher: IEEE
Journal: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Funded by: EC | SUN

Authors: Bianchi, Lorenzo; Carrara, Fabio; Messina, Nicola; Gennaro, Claudio; Falchi, Fabrizio

doi: 10.1109/cvpr52733.2024.02125, 10.5281/zenodo.13269555, 10.5281/zenodo.13269556, 10.48550/arxiv.2311.17518

arXiv: http://arxiv.org/abs/2311.17518

handle: 20.500.14243/468224

Abstract

Recent advancements in large vision-language models have enabled visual object detection in open-vocabulary scenarios, where object classes are defined in free-text formats during inference. In this paper, we probe state-of-the-art methods for open-vocabulary object detection to determine to what extent they understand fine-grained properties of objects and their parts. To this end, we introduce an evaluation protocol based on dynamic vocabulary generation to test whether models detect, discern, and assign the correct fine-grained description to objects in the presence of hard-negative classes. We contribute a benchmark suite of increasing difficulty that probes different properties such as color, pattern, and material. We then evaluate several state-of-the-art open-vocabulary object detectors using the proposed protocol and find that most existing solutions, which shine in standard open-vocabulary benchmarks, struggle to accurately capture and distinguish finer object details. We conclude the paper by highlighting the limitations of current methodologies and exploring promising research directions to overcome the discovered drawbacks. Data and code are available at https://lorebianchi98.github.io/FG-OVD/.
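
As a rough illustration of what testing a detector against hard-negative classes can look like, the sketch below builds a small vocabulary for one object: the correct description plus negatives that differ from it in a single attribute. This is not the authors' implementation. The attribute pools, the function names `make_vocabulary` and `picks_positive`, and the substitution strategy are all assumptions made for illustration; the actual protocol is defined in the paper and its released code.

```python
import random

# Hypothetical attribute pools; the benchmark's real vocabularies are not
# listed on this page, so these values are placeholders.
ATTRIBUTE_POOLS = {
    "color": ["red", "blue", "green", "brown", "white"],
    "material": ["wooden", "metal", "plastic", "glass"],
    "pattern": ["striped", "dotted", "plain", "checkered"],
}


def make_vocabulary(positive, attribute, kind, n_negatives, rng):
    """Return the positive caption followed by n_negatives hard negatives.

    Each negative is the positive caption with one attribute word swapped
    for a different value of the same kind (assumed strategy).
    """
    candidates = [a for a in ATTRIBUTE_POOLS[kind] if a != attribute]
    replacements = rng.sample(candidates, n_negatives)
    negatives = [positive.replace(attribute, r, 1) for r in replacements]
    return [positive] + negatives


def picks_positive(scores):
    """True if the first score (the positive caption) is strictly the highest."""
    return all(scores[0] > s for s in scores[1:])


rng = random.Random(0)
vocab = make_vocabulary("a brown wooden chair", "brown", "color", 3, rng)
for caption in vocab:
    print(caption)
```

A detector would then score every caption in `vocab` for the same object, and `picks_positive` checks whether the correct description received the top score. Ties count as failures here, which is a design choice of this sketch rather than something stated in the abstract.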

Related Organizations

National Research Council, Italy
Institute of Information Science and Technologies "A. Faedo", Italy
University of Pisa, Italy

Related research

EpisodicMmemory software on GitHub (relation: IsRelatedTo)

Impact by BIP!

- citations: 2. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Open access route: Green
