3D reconstruction is mostly evaluated qualitatively. With this dataset, we introduce a new, difficult quantitative task: the 3D IQ test task (3D-IQTT). It is designed to resemble the mental rotation questions found in some IQ tests. Each element in the dataset consists of 4 images: a reference object and answers 1-3. One of the answers shows the reference object, randomly rotated. For every question, dataset users have to use their model to pick the rotated object out of the 3 possible answers.

The dataset encourages semi-supervised or unsupervised 3D reconstruction because it contains a large corpus of unlabeled data and only a small set of labeled data for which the correct answer is known. All images show blocky 3D shapes floating in space in front of a black background.

Demo scripts for loading/processing the dataset can be found at https://github.com/fgolemo/3D-IQTT

The dataset consists of:

3diqtt-v2-train.h5 (XZ-compressed) (Training Dataset)
    /labeled
        /questions format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
        /answers format: [10,000], corresponding to (10k answers), np.uint8, one of the following three items: [0,1,2]
    /unlabeled
        /questions format: [100,000 x 4 x 128 x 128 x 3], corresponding to (100k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]

3diqtt-v2-test.h5 (Test Dataset)
    /questions format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
    Important! This is what you have to evaluate yourself on. We have the correct answers but they are not public.

3diqtt-v2-val.h5 (Validation Dataset)
    /questions format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
    /answers format: [10,000], corresponding to (10k answers), np.uint8, one of the following three items: [0,1,2]

Important: Before use, the main training dataset (3diqtt-v2-train.h5.xz) needs to be decompressed. This can take up to 24h depending on your hardware. We apologize for any inconvenience caused by this. The uncompressed file has a size of ~74GB. The file was compressed because of a restriction on the size of individual files. The command for decompression is "unxz 3diqtt-v2-train.h5.xz" on Unix machines.

If you use this dataset, please cite it.
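Because the uncompressed training file is far larger than typical RAM, it may help to read it in slices rather than all at once. The following is a minimal sketch, not the authors' demo code (see the repository linked above for that). It assumes the third-party h5py package, the group/dataset layout listed above, and that index 0 along the second axis is the reference image, so that answer label k refers to image 1 + k. The function names load_batch, accuracy and random_guess are hypothetical helpers introduced here for illustration.

```python
import random


def load_batch(path, start, stop, group="labeled"):
    """Read items [start, stop) from the training file without loading it whole.

    Assumes the layout /<group>/questions and, for the labeled group,
    /<group>/answers. Assumes image 0 of each item is the reference.
    """
    import h5py  # third-party; imported here so the helpers below need no extras

    with h5py.File(path, "r") as f:
        grp = f[group]
        # Slicing an h5py dataset reads only the requested rows from disk.
        questions = grp["questions"][start:stop]  # (n, 4, 128, 128, 3)
        answers = grp["answers"][start:stop] if "answers" in grp else None
    references = questions[:, 0]   # (n, 128, 128, 3)
    candidates = questions[:, 1:]  # (n, 3, 128, 128, 3)
    return references, candidates, answers


def accuracy(predictions, answers):
    """Fraction of questions whose predicted index (0, 1 or 2) is correct."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers differ in length")
    if len(answers) == 0:
        raise ValueError("no answers to score")
    correct = sum(int(p) == int(a) for p, a in zip(predictions, answers))
    return correct / len(answers)


def random_guess(n_items, seed=0):
    """Chance baseline: pick one of the three answers uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(3) for _ in range(n_items)]
```

A call such as load_batch("3diqtt-v2-train.h5", 0, 64) would then return the first 64 labeled items, and load_batch(..., group="unlabeled") would return None in place of the answers. Since each question has three candidates, a model is only informative if its accuracy on the validation file is clearly above the roughly one-in-three rate that random_guess achieves.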
This project was funded in part by the CHIST-ERA project "IGLU" through the Agence Nationale de la Recherche (ANR) and in part by the Canadian Institute for Advanced Research (CIFAR).
Mental Rotation, 3D, Dataset