For any questions, contact nathan (dot) silberman (at) gmail (dot) com
Dataset: The semantic segmentation training set consists of 1800 RGB-D images. Each pixel in each image is labeled with one of the following 23 classes: background, bathtub, bed, blinds, cabinet, ceiling, chair, counter, curtain, desk, dresser, floor, night stand, picture, pillow, refrigerator, shelves, sofa, table, television, toilet, wall, window.
Evaluation: To evaluate the segmentation accuracy, we use the Jaccard index: the mean of the per-class intersection-over-union (IoU) scores.
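As a sketch of the metric above (not the official development kit code; the function name and the skipping of classes absent from both maps are our assumptions), the mean per-class Jaccard index can be computed from two label maps like so:

```python
import numpy as np

def mean_class_iou(pred, gt, num_classes=23):
    """Mean of the per-class intersection-over-union (Jaccard) scores.

    pred, gt: integer label maps of identical shape.
    Classes that appear in neither map are skipped (an assumption;
    the official evaluation may handle them differently).
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```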
Normals from RGB
Dataset: The Normals from RGB dataset consists of 4105 RGB and surface normal pairs.
Evaluation: To evaluate the surface normal prediction accuracy, we compute the mean angular distance between each predicted surface normal and the corresponding ground-truth surface normal.
- Training Data: each part contains a set of images, their corresponding surface normals, and a map of the pixels whose normals could not be accurately computed.
- Development Kit
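A minimal sketch of the normals metric (our own helper, not the development kit; reporting the angle in degrees and masking out invalid pixels with the provided map are assumptions):

```python
import numpy as np

def mean_angular_error(pred, gt, valid_mask=None):
    """Mean angle (in degrees) between predicted and ground-truth normals.

    pred, gt: arrays of shape (H, W, 3); vectors are re-normalized first.
    valid_mask: optional boolean (H, W) mask of pixels whose ground-truth
    normals are reliable (e.g. derived from the provided invalid-pixel map).
    """
    p = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    g = gt / np.linalg.norm(gt, axis=-1, keepdims=True)
    # Clip to guard against tiny numerical overshoots outside [-1, 1].
    cos = np.clip(np.sum(p * g, axis=-1), -1.0, 1.0)
    angles = np.degrees(np.arccos(cos))
    if valid_mask is not None:
        angles = angles[valid_mask]
    return float(angles.mean())
```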
Depth from RGB
Dataset: The depth from RGB dataset consists of 4105 RGB and Depth pairs. To deal with the inherent ambiguity in the scene scale, we have also provided the absolute height of each scene along with the RGB and Depth pairs.
Evaluation: To evaluate the depth prediction accuracy, we compute the mean absolute relative error (per pixel). A depth pixel's relative error is the absolute difference between the predicted and ground-truth depth divided by the ground-truth depth.
- Training Data
- Development Kit
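The depth metric is a one-liner; this sketch (our helper, not the official scoring script) assumes both inputs are absolute depth maps with strictly positive ground-truth values:

```python
import numpy as np

def mean_abs_rel_error(pred, gt):
    """Mean absolute relative error: mean over pixels of |pred - gt| / gt.

    pred, gt: absolute depth maps of identical shape; gt is assumed
    to be strictly positive everywhere it is evaluated.
    """
    return float(np.mean(np.abs(pred - gt) / gt))
```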
(Taster) RGBD Instance Segmentation
Dataset: The instance segmentation dataset consists of 1449 RGB and Depth frames.
Evaluation: To evaluate the instance segmentation accuracy, we measure the coverage score for each image and average it over all images.
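One common definition of per-image coverage weights each ground-truth instance's best overlap with any predicted instance by the instance's size; the sketch below implements that reading, but the exact definition used for scoring is the one in the development kit, and the map encoding (unique positive ids, 0 for unlabeled) is an assumption:

```python
import numpy as np

def coverage(pred_instances, gt_instances):
    """Size-weighted best-overlap coverage of ground-truth instances.

    pred_instances, gt_instances: integer maps where each instance has a
    unique positive id and 0 marks unlabeled pixels (an assumption).
    For each ground-truth instance, take the best IoU against any
    overlapping predicted instance, then average weighted by instance size.
    """
    total, score = 0, 0.0
    for g in np.unique(gt_instances):
        if g == 0:
            continue
        g_mask = gt_instances == g
        best = 0.0
        # Only predictions that overlap this instance can have IoU > 0.
        for p in np.unique(pred_instances[g_mask]):
            if p == 0:
                continue
            p_mask = pred_instances == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            best = max(best, inter / union)
        score += g_mask.sum() * best
        total += g_mask.sum()
    return score / total if total else 0.0
```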