YCB-M: A Multi-Camera RGB-D Dataset for Object Recognition and 6DoF Pose Estimation

While a great variety of 3D cameras have been introduced in recent years, most publicly available datasets for object recognition and pose estimation focus on a single camera. This dataset consists of 32 scenes that have been captured by 7 different 3D cameras, totaling 49,294 frames. This makes it possible to evaluate how sensitive pose estimation algorithms are to the specifics of the camera used, and to develop more robust algorithms that depend less on the camera model. Conversely, our dataset enables researchers to quantitatively compare the data from several different cameras and depth sensing technologies and to evaluate their algorithms before selecting a camera for their specific task. The scenes in our dataset contain 20 different objects from the YCB object and model set, a common benchmark. We provide full ground truth 6DoF poses for each object, per-pixel segmentation, 2D and 3D bounding boxes, and a measure of the amount of occlusion of each object.

If you use this dataset in your research, please cite the following publication: T. Grenzdörffer, M. Günther, and J. Hertzberg, “YCB-M: A Multi-Camera RGB-D Dataset for Object Recognition and 6DoF Pose Estimation,” in 2020 IEEE International Conference on Robotics and Automation, ICRA 2020, Paris, France, May 31-June 4, 2020. IEEE, 2020.
@InProceedings{Grenzdoerffer2020ycbm,
  title = {{YCB-M}: A Multi-Camera {RGB-D} Dataset for Object Recognition and {6DoF} Pose Estimation},
  author = {Grenzd{\"{o}}rffer, Till and G{\"{u}}nther, Martin and Hertzberg, Joachim},
  booktitle = {2020 {IEEE} International Conference on Robotics and Automation, {ICRA} 2020, Paris, France, May 31-June 4, 2020},
  year = {2020},
  publisher = {{IEEE}}
}

This paper is also available on arXiv: https://arxiv.org/abs/2004.11657

To visualize the dataset, follow these instructions (tested on Ubuntu Xenial 16.04):

    # IMPORTANT: the ROS setup.bash must NOT be sourced, otherwise the following error occurs:
    #   ImportError: /opt/ros/kinetic/lib/python2.7/dist-packages/cv2.so: undefined symbol: PyCObject_Type

    # nvdu requires Python 3.5 or 3.6
    sudo add-apt-repository -y ppa:deadsnakes/ppa  # to get python3.6 on Ubuntu Xenial
    sudo apt-get update
    sudo apt-get install -y python3.6 libsm6 libxext6 libxrender1 python-virtualenv python-pip

    # create a new virtual environment
    virtualenv -p python3.6 venv_nvdu
    cd venv_nvdu/
    source bin/activate

    # clone our fork of NVIDIA's Dataset Utilities that incorporates some essential fixes
    pip install -e 'git+https://github.com/mintar/Dataset_Utilities.git#egg=nvdu'

    # download and transform the meshes
    # (alternatively, unzip the meshes contained in the dataset
    # to <path to venv_nvdu>/lib/python3.6/site-packages/nvdu/data/ycb/aligned_cm)
    nvdu_ycb -s

    # run nvdu_viz to visualize the dataset
    cd <a subdirectory of the YCB-M dataset with some frames>
    nvdu_viz --name_filters '*.jpg'

For further details, see README.md.
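As a rough illustration of how the ground truth 6DoF poses might be consumed, the following Python sketch turns a translation plus a quaternion into a 4x4 homogeneous transform and filters objects by their occlusion measure. The field names used here (`objects`, `class`, `location`, `quaternion_xyzw`, `visibility`) and the convention that a larger `visibility` value means less occlusion are assumptions modeled on the annotation style commonly read by NVIDIA's Dataset Utilities; they are not confirmed by this description, so check README.md and an actual annotation file before relying on them. The sample annotation and its values are made up for the example.

```python
import json
import math


def quat_xyzw_to_matrix(q):
    """Convert a quaternion given as (x, y, z, w) into a 3x3 rotation matrix."""
    x, y, z, w = q
    norm = math.sqrt(x * x + y * y + z * z + w * w)
    if norm == 0.0:
        raise ValueError("zero-norm quaternion")
    x, y, z, w = x / norm, y / norm, z / norm, w / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def pose_to_transform(location, quaternion_xyzw):
    """Build a 4x4 homogeneous transform from a translation and a quaternion."""
    rot = quat_xyzw_to_matrix(quaternion_xyzw)
    rows = [rot[i] + [float(location[i])] for i in range(3)]
    return rows + [[0.0, 0.0, 0.0, 1.0]]


def transform_point(transform, point):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    return [
        sum(transform[i][j] * point[j] for j in range(3)) + transform[i][3]
        for i in range(3)
    ]


def visible_object_poses(annotation, min_visibility=0.5):
    """Return (class name, transform) pairs for sufficiently visible objects.

    Assumes (hypothetically) that each object entry carries 'class',
    'location', 'quaternion_xyzw' and 'visibility' keys.
    """
    poses = []
    for obj in annotation.get("objects", []):
        if obj.get("visibility", 1.0) >= min_visibility:
            transform = pose_to_transform(obj["location"], obj["quaternion_xyzw"])
            poses.append((obj["class"], transform))
    return poses


# Made-up annotation: one object rotated 90 degrees about z, one heavily occluded.
HALF_SQRT2 = math.sqrt(0.5)
EXAMPLE_ANNOTATION = json.loads(json.dumps({
    "objects": [
        {"class": "object_a", "visibility": 0.9,
         "location": [1.0, 2.0, 3.0],
         "quaternion_xyzw": [0.0, 0.0, HALF_SQRT2, HALF_SQRT2]},
        {"class": "object_b", "visibility": 0.1,
         "location": [0.0, 0.0, 1.0],
         "quaternion_xyzw": [0.0, 0.0, 0.0, 1.0]},
    ]
}))

POSES = visible_object_poses(EXAMPLE_ANNOTATION, min_visibility=0.5)
```

With a real frame, `EXAMPLE_ANNOTATION` would be replaced by the result of `json.load` on that frame's annotation file, and `transform_point` could then map model points into the camera frame, for example to compare pose errors of the same scene across the different cameras.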

Keywords

robotics, multi-camera, rgb-d camera, pose estimation, object recognition, computer vision
