Mask R-CNN on NYUv2

This repository mainly contains the output of running the Mask R-CNN network [1] on images from the NYUv2 dataset [2], together with additional metadata. It was created to analyze the output of Mask R-CNN and to post-process it using contextual information in order to improve its performance. This work was carried out by Dr. Jose-Raul Ruiz-Sarmiento (MAPIR group, University of Málaga) and Dr. Shuda Li (AVG group, University of Oxford) within the scope of the European project MoveCare: Multiple-actOrs Virtual Empathic CARegiver for the Elder (Ref: 732158).

Concretely, this repository includes:

- metadata:
  + coco_nyu_mapping.txt: Mapping between the categories in the COCO dataset and those in NYUv2.
  + coco_object_categories.txt: Object categories considered in the COCO dataset.
  + nyu_object_categories.txt: Object categories used in the NYUv2 dataset.
  + nyu_scene_categories.txt: Scene categories considered in NYUv2.
  + objects_and_categories_in_images.txt: For each image in NYUv2, the categories of the objects appearing in it.
- nyu_content:
  + masks_in_X: Where X is the image index.
    - Y.png: Where Y is the object index in the image. Binary mask of that object.
    - pixels_labelled.png: Binary mask indicating the labelled pixels in image X.
  + bboxesX.txt: Where X is the image index. Ground truth bounding boxes of the objects in the image. Format: min_x min_y max_x max_y.
- preds:
  + X: Where X is the image index.
    - Y.png: Where Y is the index of an object detected by Mask R-CNN in the image. Binary image containing the mask of that detected object.
  + X.txt: Where X is the image index. File listing the objects detected by Mask R-CNN, one per entry, with the fields: idx class score min_x min_y max_x max_y masks_file. Here min_x, min_y, max_x and max_y define the bounding box, and masks_file refers to X/Y.png as described above.
  + result_X.png: Where X is the image index. Image showing the detections with a score higher than 0.3.
  + gt_iou_X: Where X is the image index.
    - Y: Where Y is the index of the detected object.
      + Z.png: Where Z is the index of the object in the ground truth. Image showing the masks of both objects, Y and Z, for visually checking their overlap.
    - Y.txt: Where Y is the index of the detected object. File containing:
      + The intersection ratio of the mask of object Y with the labelled part of the image.
      + The IoU value between the mask of object Y and those of the ground truth objects.

References:

[1] He, Kaiming, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. "Mask R-CNN." In Proceedings of the IEEE International Conference on Computer Vision, pp. 2961-2969. 2017.
[2] Silberman, Nathan, Derek Hoiem, Pushmeet Kohli, and Rob Fergus. "Indoor segmentation and support inference from RGBD images." In European Conference on Computer Vision, pp. 746-760. Springer, Berlin, Heidelberg, 2012.
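As an illustration of how the prediction files and the overlap measures described above could be handled, here is a minimal Python sketch. It is not code shipped with the repository. It assumes that each entry of preds/X.txt is one whitespace-separated line, that a class name may contain spaces, and that masks have already been loaded as rows of 0/1 values. The names Detection, parse_detection, mask_iou and labelled_ratio are hypothetical, the sample lines are invented for the example, and the definition used for the intersection ratio (fraction of mask pixels that fall on labelled pixels) is an assumption about what Y.txt stores.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    idx: int
    label: str
    score: float
    box: tuple  # (min_x, min_y, max_x, max_y)
    masks_file: str


def parse_detection(line):
    """Parse one entry: idx class score min_x min_y max_x max_y masks_file."""
    tokens = line.split()
    if len(tokens) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(tokens)}")
    # Assumption: the class name may contain spaces (e.g. "dining table"),
    # so the fixed-position fields are read from both ends of the line.
    score, min_x, min_y, max_x, max_y = (float(t) for t in tokens[-6:-1])
    return Detection(
        idx=int(tokens[0]),
        label=" ".join(tokens[1:-6]),
        score=score,
        box=(min_x, min_y, max_x, max_y),
        masks_file=tokens[-1],
    )


def mask_iou(mask_a, mask_b):
    """Intersection over union of two same-sized binary masks (rows of 0/1)."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0


def labelled_ratio(mask, labelled):
    """Assumed definition: share of mask pixels lying on labelled pixels."""
    area = inside = 0
    for row_m, row_l in zip(mask, labelled):
        for m, l in zip(row_m, row_l):
            if m:
                area += 1
                inside += 1 if l else 0
    return inside / area if area else 0.0


# Invented sample entries, only to exercise the parser.
sample = [
    "0 chair 0.92 10 20 110 220 5/0.png",
    "1 dining table 0.25 0 0 50 50 5/1.png",
]
detections = [parse_detection(s) for s in sample]
# Same threshold as the one used for result_X.png.
kept = [d for d in detections if d.score > 0.3]

pred_mask = [[1, 1], [0, 0]]
gt_mask = [[1, 0], [1, 0]]
labelled = [[1, 0], [1, 1]]
iou = mask_iou(pred_mask, gt_mask)
ratio = labelled_ratio(pred_mask, labelled)
```

In practice the PNG masks would first be loaded with an image library and converted to 0/1 rows. The pure-Python loops are kept here only to make the definitions explicit.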

Work partially supported by a postdoc contract from the I-PPIT-UMA program, financed by the University of Málaga.

Related Organizations

University of Malaga
Spain
University of Oxford
United Kingdom

Keywords

MoveCare, Object detection, Mask R-CNN, NYUv2
