Publication: Article, Preprint, 01 Sep 2024
Embargo end date: 01 Jan 2024
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Journal: IEEE Robotics and Automation Letters, volume 9, pages 7389-7396, eISSN: 2377-3774

Authors: Ji Ma; Hongming Dai; Yao Mu; Pengying Wu; Hao Wang; Xiaowei Chi; Yang Fei; +2 Authors

doi: 10.1109/lra.2024.3426381, 10.48550/arxiv.2402.19007

arXiv: http://arxiv.org/abs/2402.19007

DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments

Abstract

Zero-Shot Object Navigation (ZSON) requires agents to autonomously locate and approach unseen objects in unfamiliar environments and has emerged as a particularly challenging task within the domain of Embodied AI. Existing datasets for developing ZSON algorithms lack consideration of dynamic obstacles, object attribute diversity, and scene texts, thus exhibiting noticeable discrepancies from real-world situations. To address these issues, we propose a Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments (DOZE) that comprises ten high-fidelity 3D scenes with over 18k tasks, aiming to mimic complex, dynamic real-world scenarios. Specifically, DOZE scenes feature multiple moving humanoid obstacles, a wide array of open-vocabulary objects, diverse distinct-attribute objects, and valuable textual hints. Besides, different from existing datasets that only provide collision checking between the agent and static obstacles, we enhance DOZE by integrating capabilities for detecting collisions between the agent and moving obstacles. This novel functionality enables the evaluation of the agents' collision avoidance abilities in dynamic environments. We test four representative ZSON methods on DOZE, revealing substantial room for improvement in existing approaches concerning navigation efficiency, safety, and object recognition accuracy. Our dataset can be found at https://DOZE-Dataset.github.io/.
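The abstract mentions collision detection between the agent and moving obstacles but does not say how it is implemented. As an illustration only, the following minimal Python sketch shows one common way such a check could be scored: the agent and each humanoid are approximated as 2D discs, and an episode is scored by counting the timesteps at which the agent touches any obstacle. The names `Disc`, `collides`, and `count_dynamic_collisions`, the disc approximation, and the toy trajectories are all assumptions made here and are not taken from DOZE.

```python
import math
from dataclasses import dataclass


@dataclass
class Disc:
    """Hypothetical 2D footprint: centre (x, y) and radius, in metres."""
    x: float
    y: float
    radius: float


def collides(a: Disc, b: Disc) -> bool:
    """Return True when the two footprints overlap or touch."""
    return math.hypot(a.x - b.x, a.y - b.y) <= a.radius + b.radius


def count_dynamic_collisions(agent_path, obstacle_paths):
    """Count timesteps at which the agent touches any moving obstacle.

    agent_path: list of Disc, one per timestep.
    obstacle_paths: list of lists of Disc, one inner list per obstacle,
    each aligned with agent_path by timestep.
    """
    hits = 0
    for t, agent in enumerate(agent_path):
        if any(collides(agent, path[t]) for path in obstacle_paths):
            hits += 1
    return hits


# Toy episode (made-up data): the agent walks along +x while one
# obstacle walks toward it along the same line.
agent_path = [Disc(float(t), 0.0, 0.3) for t in range(5)]
obstacle_paths = [[Disc(4.0 - float(t), 0.0, 0.3) for t in range(5)]]
print(count_dynamic_collisions(agent_path, obstacle_paths))
```

A per-timestep count like this is only one possible safety measure; a real simulator would more likely rely on its physics engine's contact events than on a disc approximation.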

This version of the paper has been accepted for publication in IEEE Robotics and Automation Letters (RA-L).

Related Organizations

- University of Hong Kong, China (People's Republic of)
- Hong Kong University of Science and Technology (香港科技大學), China (People's Republic of)
- Hong Kong Polytechnic University, China (People's Republic of)
- Peking University, China (People's Republic of)

Keywords

FOS: Computer and information sciences, Computer Science - Robotics, Data sets for robot learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Embodied AI, Zero-shot object navigation, Semantic scene understanding, Robotics (cs.RO), Data sets for robotic vision, 004

Impact by BIP!

- selected citations: 2 (derived from selected sources; an alternative to the "Influence" indicator)
- popularity: Top 10% (the "current" impact or attention of the article in the research community, based on the underlying citation network)
- influence: Average (the overall impact of the article in the research community, based on the underlying citation network, diachronically)
- impulse: Average (the initial momentum of the article directly after its publication, based on the underlying citation network)

SDGs: 11. Sustainability

Related to Research communities

Knowmad Institut