Depth Map Prediction from a Single Image using a Multi-Scale Deep Network

David Eigen      Christian Puhrsch      Rob Fergus

Paper PDF

Predicting depth is an essential component in understanding the 3D geometry of a scene. While for stereo images local correspondences suffice for estimation, finding depth relations from a single image requires integration of both global and local information. We address this by employing two deep network stacks: one that makes a coarse global prediction based on the entire image, and another that refines this prediction locally. Our method achieves state-of-the-art results on both NYU Depth and KITTI single-image depth prediction, and matches detailed depth boundaries without the need for superpixelation.
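As a rough illustration of the two-stack idea described above, here is a minimal NumPy sketch: a toy "coarse" stage predicts a low-resolution depth map from global pooling of the whole image, and a toy "fine" stage makes local edits on top of it. The function names, the block-averaging "network", and the 3x3 refinement rule are hypothetical simplifications for illustration, not the paper's actual convolutional networks. The `scale_invariant_error` function does follow the scale-invariant log loss introduced in the paper.

```python
import numpy as np

def coarse_stack(image, grid=(8, 8)):
    """Toy stand-in for the coarse network (hypothetical): predict a
    low-resolution depth map by block-averaging the whole image (a
    global view), then upsample back to full resolution by repetition."""
    H, W = image.shape
    h, w = grid
    pooled = image.reshape(h, H // h, w, W // w).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, H // h, axis=0), W // w, axis=1)

def fine_stack(image, coarse_pred):
    """Toy stand-in for the fine network (hypothetical): nudge the
    coarse prediction using 3x3 local contrast in the input image."""
    refined = coarse_pred.copy()
    pad = np.pad(image, 1, mode="edge")
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            refined[i, j] += 0.1 * (image[i, j] - pad[i:i + 3, j:j + 3].mean())
    return refined

def scale_invariant_error(pred, target, lam=0.5):
    """Scale-invariant log loss from the paper:
    L = mean(d^2) - lam * (sum d)^2 / n^2, with d = log pred - log target.
    With lam = 1 the value is invariant to a global scaling of pred;
    the paper trains with lam = 0.5."""
    d = np.log(pred) - np.log(target)
    return (d ** 2).mean() - lam * d.sum() ** 2 / d.size ** 2

rng = np.random.default_rng(0)
image = rng.random((32, 32))
depth = fine_stack(image, coarse_stack(image))  # full-resolution prediction
```

The key structural point survives the simplification: the coarse stage sees the entire image at once (here, every output pixel depends on a whole pooled block), while the fine stage only edits the coarse output using local neighborhoods, mirroring the global-plus-local split the abstract describes.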

Please also see our newer work predicting depth, surface normals, and semantic labels.

NIPS 2014 Paper:
Code and trained models:
  • Inference code with network weights for NYU Depth model (251 MB): download tgz
  • Weights and model files for KITTI (uses inference code in above link) (326 MB): download tgz