picture of David David Fouhey
Quick info: CV   Google Scholar  
Email: david.fouhey@nyu.edu (Please read if you are asking about RA/TA/GA positions)
Physical Locations: You can find me at either:
  • 60 Fifth Avenue, New York, NY 10011
  • 370 Jay Street, Brooklyn, NY 11201
Name FAQ: `foe'-`eee'. It rhymes with snowy or Joey: the key is to forget how it is spelled. It (but not me) is from County Cork, Ireland.
Photos: One picture is hard to identify a person with. Here are some more (but dated).

Summary: I am an Assistant Professor at NYU, jointly appointed between Computer Science in the Courant Institute of Mathematical Sciences and Electrical and Computer Engineering in the Tandon School of Engineering.

From 2019 to 2023, I was an Assistant Professor at CSE at the University of Michigan. Before that, I was a postdoctoral fellow at UC Berkeley, working with Alyosha Efros and Jitendra Malik. Before that, I received a Ph.D. in robotics from CMU where I worked with Abhinav Gupta and Martial Hebert.

Executive Summary

Research: I work on learning-based computer vision. I want to develop autonomous systems that can learn to build representations of the underlying state and dynamics of the world through observation. I am particularly interested in the following three areas:
Student Collaborators

Postdoc:
Current PhD student collaborators: PhD Graduates: I am also proud of my past student UG/MS collaborators who are off doing great things elsewhere! For a complete list, please see my CV.

Teaching

NYU: University of Michigan:
while I no longer teach at UM, I am keeping an archived version of the final form of each course in case they are useful for others.
Selected Publications
2024
teaser image for paper Sarah Jabbour, Gregory Kondas, Ela Kazerooni, Michael Sjoding, David Fouhey, Jenna Wiens
DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks
ECCV 2024
If you're careful about it, you can use diffusion models to let you perform classic permutation importance testing (as in Random Forests), but on image classifiers.
teaser image for paper Ruoyu Wang, David Fouhey, Richard Higgins, Spiro K. Antiochos, Graham Barnes, J. Todd Hoeksema, K.D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi
SuperSynthIA: Physics-Ready Full-Disk Vector Magnetograms from HMI, Hinode, and Machine Learning
To appear in the Astrophysical Journal, 2024
We produce solar magnetograms combining the best features of multiple instruments and producing data products that can be immediately used in downstream systems (e.g., surface flux transport or solar wind forecasting)
teaser image for paper Chris Rockwell, Nilesh Kulkarni, Linyi Jin, JJ Park, Justin Johnson, David Fouhey
FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation
CVPR 2024
[Project Page]   [Code]
By combining deep learning with classic geometry in the right way, you can get a method that's super accurate like geometry, but robust like learning-based methods
teaser image for paper Linyi Jin, Nilesh Kulkarni, David Fouhey
3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface
CVPR 2024
[Project Page]   [Arxiv (PDF)]
You can train a model to take one or a few posed images as input and produce scenes, including hidden surfaces
teaser image for paper Georgios Pavlakos, Dandan Shan, Ilija Radosavovic, Angjoo Kanazawa, David Fouhey, Jitendra Malik
Reconstructing Hands in 3D with Transformers
CVPR 2024
[Project Page]
Scaling models + data up works really well for reconstructing hands in 3D
2023
teaser image for paper Sarah Jabbour, David Fouhey, Stephanie Shepard, Thomas S. Valley, Ella A. Kazerooni, Nikola Banovic, Jenna Wiens, Michael W. Sjoding
Measuring the Impact of AI in the Diagnosis of Hospitalized Patients A Randomized Clinical Vignette Survey Study
JAMA 330(23)
Showing explanations doesn't help clinicians recover from being negatively influenced by systematically biased models.
teaser image for paper Tianyi Cheng*, Dandan Shan*, Ayda Sultan, Richard Higgins, David Fouhey
Towards A Richer 2D Understanding of Hands at Scale
NeurIPS 2023
[Project Page] [Code] [Data]
New dataset, tasks, and model for understanding more complex hand interactions, including bimanual manipulation and tool use
teaser image for paper Vadim Tschernezki*, Ahmad Darkhalil*, Zhifan Zhu*, David Fouhey Iro Laina Diane Larlus Dima Damen Andrea Vedaldi
EPIC Fields Marrying 3D Geometry and Video Understanding
NeurIPS Datasets & Benchmarks 2023
[Project Page]
Accurate camera poses for EPIC Kitchens leads to a number of exciting new challenges
teaser image for paper Shengyi Qian, David Fouhey
Understanding 3D Object Interaction from a Single Image.
ICCV 2023
[Project Page]
We use human judgments to get a first pass understanding of potential interaction in 3D from a single image
teaser image for paper Richard Higgins, David Fouhey
MOVES: Moving Objects in Video Enable Segmentation
CVPR 2023
[Project Page] Disagreement with a really simple background model provides surprisingly effective pseudolabel cues for performing grouping and hand-object association
teaser image for paper Nilesh Kulkarni, Linyi Jin, Justin Johnson, David Fouhey
Learning to Predict Scene-Level Implicit 3D from Posed RGBD Data
CVPR 2023
[Project Page] We can learn to predict implicit function-based 3D from posed RGBD.
teaser image for paper Linyi Jin, Jianming Zhang, Yannick Hold-Geoffroy, Oliver Wang, Kevin Matzen, Matthew Sticha, David Fouhey
Perspective Fields for Single Image Camera Calibration
CVPR 2023 (Highlight -- 2.5% accept rate)
[Project Page]
We develop a effective representation for camera geometry that distributes the parameters throughout the image, resulting in robustness and lots of fun applications
teaser image for paper David F. Fouhey, Richard E.L. Higgins, Spiro K. Antiochos, Graham Barnes, Marc DeRosa, J. Todd Hoeksema, K.D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi
Large-Scale Spatial Cross-Calibration of Hinode/SOT-SP and SDO/HMI
Accepted in The Astrophysics Journal Supplement Series
[Arxiv]

We fix a more than decade-long issue with pointing and pixel scale in the spectropolarimeter onboard Hinode (which gets cold during eclipse season)
2022
teaser image for paper Ahmad Darkhalil*, Dandan Shan*, Bin Zhu*, Jian Ma*, Amlan Kar, Richard E.L. Higgins, Sanja Fidler, David F. Fouhey, Dima Damen.
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations
NeurIPS Datasets and Benchmarks 2022
[Paper and Reviews]   [Project Webpage]   [Download]   [Trailer]

A new large-scale dataset of segments of people engaged in interaction with objects, including three new challenges and loads of data.
teaser image for paper Chris Rockwell, Justin Johnson, David F. Fouhey
The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs
3DV 2022
[Project Page]   [Paper]   [Bibtex]

Small tweaks let vision transformers imitate much of the 8-pt algorithm, which facilitates learning to estimate full 6D relative camera pose, especially in few-sample settings
teaser image for paper Nilesh Kulkarni, Justin Johnson, David F. Fouhey
What's behind the couch? Directed Ray Distance Functions for 3D Scene Reconstruction
ECCV 2022
[Arxiv]   [PDF]   [Project Page]

We can produce high-quality 3D reconstructions from a single RGB image via implicit function by carefully analyzing what we expect networks to produce during training.
teaser image for paper Samir Agarwala, Linyi Jin, Chris Rockwell, David F. Fouhey
PlaneFormers: From Sparse View Planes to 3D Reconstruction
ECCV 2022
[Arxiv]   [PDF]   [Project Page]   [Bibtex]

Transformers are really good at integrating evidence across multiple views and producing a planar reconstruction.
teaser image for paper Shengyi Qian, Linyi Jin, Chris Rockwell, Siyi Chen, David F. Fouhey
Understanding 3D Object Articulation in Internet Videos
CVPR 2022
[Arxiv]   [Arxiv PDF]   [Bibtex]  

By training on both video data and 3D reconstructions in the right way, we can build models of articulations of 3D objects on ordinary video data.
teaser image for paper Brian C. Weeks, Zhizhuo Zhou, Bruce K. O'Brien, Rachel Darling, Morgan Dean, Tiffany Dias, Gemmechu Hassena, Mingyu Zhang, and David F. Fouhey
A deep neural network for high throughput measurement of functional traits on museum skeletal specimens.
Accepted in Methods in Ecology and Evolution.
[Paper]  

Bird sizes correlate with temperature. We reduce measurement time of museum specimens by ≈10x, leading to datasets at previously unexplored scales.
teaser image for paper Richard E.L. Higgins, David F. Fouhey, Spiro K. Antiochos, Graham Barnes, Mark C.M. Cheung, J. Todd Hoeksema, K.D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi
SynthIA: A Synthetic Inversion Approximation for the Stokes Vector Fusing SDO and Hinode into a Virtual Observatory
Accepted in The Astrophysics Journal Supplement Series
[Arxiv]   [Open Access]   [Video of SynthIA outputs from May 5 to June 24, 2016]  

Our system produces synthetic solar magnetograms that combine the best aspects of multiple instruments. This system formed the basis of a successful NASA Heliophysics Division Tools and Method grant to integrate the system into SDO/HMI's Joint Science Center.
2021
teaser image for paper Dandan Shan*, Richard E.L. Higgins*, David F. Fouhey
COHESIV: Contrastive Object and Hand Embedding Segmentation In Video
NeurIPS 2021
[PDF]   [Bibtex]

By applying the Gestalt principle of common fate at scale, we can learn how to segment hand-held objects with fairly minimal supervision.
teaser image for paper Alexander Raistrick, Nilesh Kulkarni, David F. Fouhey
Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry?
BMVC 2021 (Oral)
[PDF]   [Supplement (PDF)]   [Supplement Video (MP4)]   [Bibtex]

Collisions with the world are usually seen as a nuisance. At scale and with a random-walk-inspired formulation, they can be used to learn a depth sensor
teaser image for paper Linyi Jin, Shengyi Qian, Andrew Owens, David F. Fouhey
Planar Surface Reconstruction from Sparse Views
ICCV 2021 (Oral)
[Arxiv]   [PDF]   [Project Page]

We can learn to reconstruct scenes from a handful of views with an unknown relationship. Humans seem to do this fine, but it poses serious challenges for computers.
teaser image for paper Chris Rockwell, David F. Fouhey, Justin C. Johnson
PixelSynth: Generating a 3D-Consistent Experience from a Single Image
ICCV 2021
[Arxiv [PDF [Project Page]   [Bibtex]   [CSE News Piece]

PixelSynth fuses the complementary strengths of 3D reasoning and autoregressive modeling to create an immersive experience from a single image.
teaser image for paper Zhizhuo Zhou, Gemmechu Hassena, Brian C. Weeks, David F. Fouhey
Quantifying Bird Skeletons
CV4Animals Workshop
[PDF]   [Bibtex]  

We can measure bird skeleton specimens extraordinarily accurately and quite fast with deep learning. This system can unlock datasets of birds at unprecedented scales.
teaser image for paper Richard E.L. Higgins, David F. Fouhey, Dichang Zhang, Spiro K. Antiochos, Graham Barnes, J. Todd Hoeksema, K.D. Leka, Yang Liu, Peter W. Schuck, Tamas I. Gombosi
Fast and Accurate Emulation of the SDO/HMI Stokes Inversion with Uncertainty Quantification
The Astrophysical Journal (ApJ), Volume 911, Number 2, 2021
[Arxiv]   [Published PDF]   [Bibtex]   [Project Page]   [HMI Nugget]

We can emulate the magnetogram production pipeline of SDO/HMI, a key NASA mission. This system lays the ground-work for SynthIA, which produces best-of-both-worlds style magnetograms
2020
teaser image for paper S. Qian*, L. Jin*, D. F. Fouhey
Associative3D: Volumetric Reconstruction from Sparse Views
ECCV 2020
[Arxiv]   [Project Page]   [Code]   [Bibtex]

We can build a voxel-based reconstruction of images from two views, even without access to the relative camera positions
teaser image for paper C. Rockwell, D. F. Fouhey
Full-Body Awareness from Partial Observations
ECCV 2020
[Arxiv]   [Project Page]   [Bibtex]   [Code]

Human 3D pose estimation systems work poorly on people as they are usually depicted in video. A self-training method works well at fixing this problem.
teaser image for paper S. Jabbour, D.F. Fouhey, E. Kazerooni, M.W. Sjoding, J. Wiens
Deep Learning Applied to Chest X-Rays: Exploiting and Preventing Shortcuts
MLHC 2020
[PDF]   [Bibtex]   [Code]

Deep nets can easily exploit shortcuts (e.g., apparent bone density), but a simple transfer learning approach can help mitigate the use of shortcuts.
teaser image for paper D. Shan, J. Geng*, M. Shu*, D.F. Fouhey
Understanding Human Hands in Contact at Internet Scale
CVPR 2020 (Oral)
[PDF]   [Bibtex]   [Project Page & Code]

We built a new dataset and model that enables really accurate recognition of basic hand information. Since hands are key to interaction, this basic information unlocks tons of useful new problems.
teaser image for paper M. El Banani, J. Corso, D.F. Fouhey
Novel Object Viewpoint Estimation through Reconstruction Alignment
CVPR 2020
[PDF]   [Supp.]   [Bibtex]   [Code and Project Page]

We can learn to do relative pose estimation by aligning reconstructions
teaser image for paper N. Kulkarni, A. Gupta, D.F. Fouhey, S. Tulsiani
Articulation-aware Canonical Surface Mapping
CVPR 2020
[Arxiv]   [PDF]   [Bibtex]

We can build canonical surface maps for objects that articulate, such as elephants and horses
2019
teaser image for paper A. Szenicer*, D.F. Fouhey*, A. Munoz-Jaramillo, P.J. Wright, R. Thomas, R. Galvez, M. Jin, M.C.M. Cheung
A Deep Learning Virtual Instrument for Monitoring Extreme UV Solar Spectral Irradiance
Science Advances, Vol. 5, Number 10, 2019
[Open Acess]   [Bibtex]   [Prediction Video]   [Activations Video]   [Overview Video]  

We built a virtual version of the EVE MEGS-A instrument that can serve as a replacement after its electrical short
Press coverage/releases:
icon for press coverage for paper(Inverse.com)   icon for press coverage for paper(SETI Institute)
icon for press coverage for paper(theregister.co.uk)   icon for press coverage for paper(phys.org)
icon for press coverage for paper(TechExplorist.com)   icon for press coverage for paper(ScientificAmerican.com)
icon for press coverage for paper(Earth & Space Science News)   icon for press coverage for paper(hpcwire.com)
icon for press coverage for paper(sciencedaily.com)
teaser image for paper D. Zhukov, J.-B. Alayrac, G. Cinbis, D.F. Fouhey, I. Laptev, J. Sivic
Cross-task weakly-supervised learning from instructional videos
CVPR 2019
[PDF]   [Project Page]   [Arxiv]   [Bibtex]

By accounting for the compositional nature of language, we can learn better models from instructional videos
teaser image for paper R. Galvez*, D.F. Fouhey*, M. Jin, A. Szenicer, A. Munoz-Jaramillo, M.C.M. Cheung, P.J. Wright, M.G. Bobra, Y. Liu, J. Mason, R. Thomas
A Machine Learning Dataset Prepared From the NASA Solar Dynamics Observatory Mission
The Astrophysical Journal Supplement, 242:1, 2019
[PDF]   [Arxiv]   [Bibtex]   [Movie & Explanation]   [Small dataset + demo]

We produced a machine-learning-ready dataset that merges the three instruments aboard the NASA SDO mission
Earlier
teaser image for paper A. Kumar, S. Gupta, D. F. Fouhey, S. Levine, J. Malik
Visual Memory for Robust Path Following
NeurIPS 2018 (Oral)
[Project Page]   [PDF]   [Bibtex]  
teaser image for paper D. F. Fouhey, W. Kuo, A. A. Efros, J. Malik
From Lifestyle VLOGs to Everyday Interactions
CVPR 2018
[Project Page]   [Arxiv]   [Bibtex]  
teaser image for paper S. Tulsiani, S. Gupta, D. F. Fouhey, A. A. Efros, J. Malik
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
CVPR 2018
[Project Page]   [Arxiv]   [Bibtex]
teaser image for paper M. Lescroart, D. F. Fouhey, J. Malik
Convolutional neural networks represent shape dimensions -- but not as accurately as humans
Abstract at VSS 2018
[Abstract]
teaser image for paper S. Gupta, D.F. Fouhey, S. Levine, J. Malik
Unifying Map and Landmark Based Representations for Visual Navigation
Arxiv 2017
[Project Page]   [Arxiv]   [Bibtex]
teaser image for paper D. F. Fouhey, A. Gupta, A. Zisserman
From Images to 3D Shape Attributes
TPAMI (Pre-print on Arxiv)
The TPAMI version has ugly typesetting (full-width tables on the bottom?) that I was unable to change. Read the Arxiv one.
[Arxiv]   [Bibtex]
teaser image for paper R. Girdhar, D. F. Fouhey, M. Rodriguez, A. Gupta
Learning a Predictable and Generative Vector Representation for Objects
ECCV 2016 (Spotlight)
[Publication (PDF)]   [Bibtex]
[Project Page]
teaser image for paper D. F. Fouhey
Factoring Scenes into 3D Structure and Style
Doctoral Dissertation
[Dissertation (PDF)]   [Bibtex]
[Defense Slides (PDF)]
teaser image for paper D. F. Fouhey, A. Gupta, A. Zisserman
3D Shape Attributes
CVPR 2016 (Oral - Watch the presentation on Youtube)
[Publication (PDF)]   [Bibtex]
[Project Page]   [Poster (PDF)]   [Talk (PPTX)]   [Talk (PDF)]  
teaser image for paper R. Girdhar, D. F. Fouhey, K. M. Kitani, A. Gupta, M. Hebert
Cutting through the Clutter: Task-Relevant Features for Image Matching
WACV 2016
[Publication (PDF)]   [Bibtex]
teaser image for paper D. F. Fouhey, W. Hussain, A. Gupta, M. Hebert
Single Image 3D Without a Single 3D Image
ICCV 2015
[Publication (PDF)]   [Bibtex]
[Poster (PDF)]   [Supplemental (PDF)]   [Bonus Details (PDF)]
teaser image for paper X. Wang, D. F. Fouhey, A. Gupta
Designing Deep Networks for Surface Normal Estimation
CVPR 2015
[Publication (PDF)]   [Bibtex]
teaser image for paper D. F. Fouhey, A. Gupta, and M. Hebert
Unfolding an Indoor Origami World
ECCV 2014 (Oral - Watch the presentation on VideoLectures.net)
[Publication (PDF)]   [Bibtex]
[Project Page]   [Extended Results (PDF)]  
teaser image for paper D. F. Fouhey and C.L. Zitnick
Predicting Object Dynamics in Scenes
CVPR 2014
[Publication (PDF)]   [Bibtex]
[Poster (PDF)]   [Supplemental (PDF)]
teaser image for paper D. F. Fouhey, V. Delaitre, A. Gupta, A. Efros, I. Laptev, and J. Sivic.
People Watching: Human Actions as a Cue for Single View Geometry.
IJCV (extended version of ECCV 2012 paper)
[Preprint (PDF)]   [Final version (via Springer)]
teaser image for paper D. F. Fouhey, A. Gupta, and M. Hebert
Data-Driven 3D Primitives for Single Image Understanding
ICCV 2013
[Publication (PDF)]   [Bibtex]
[Project Page]   [Poster (PDF)]  
teaser image for paper D. F. Fouhey, V. Delaitre, A. Gupta, A. Efros, I. Laptev, and J. Sivic.
People Watching: Human Actions as a Cue for Single View Geometry.
ECCV 2012 (Oral - Watch the presentation on VideoLectures.net )
[Publication (PDF)]   [Bibtex]
[Project Page]  
teaser image for paper V. Delaitre, D. F. Fouhey, I. Laptev, J. Sivic, A. Gupta, and A. Efros.
Scene Semantics from Long-term Observation of People.
ECCV 2012
[Publication (PDF)]   [Bibtex]
[Project Page]
teaser image for paper D. F. Fouhey, A. Collet, M. Hebert, and S. Srinivasa
Object Recognition Robust to Imperfect Depth Data.
CDC4CV 2012 Workshop at ECCV 2012
[Publication (PDF)]   [Bibtex]
[Supplemental(PDF)]   [Supp. Video 1]   [Supp. Video 2]  
teaser image for paper M. Costanza-Robinson, B. Estabrook, and D. F. Fouhey
Representative elementary volume estimation for porosity, moisture saturation, and air-water interfacial areas in unsaturated porous media: Data quality implications
(Sorry for not posting a pre-print!)
In Water Resources Research, Volume 47, 2011
[Official Version]   [Bibtex]
teaser image for paper D. F. Fouhey, D. Scharstein, and A. Briggs.
Multiple plane detection in image pairs using J-linkage.
ICPR 2010
[Publication (PDF)]   [Bibtex]
Implementation (Python and C) [Code (Zip)]   [Poster (PDF)]  
Miscellaneous

About Joining My Group: Due to the volume of emails I receive and limited time, I unfortunately do not have time (and generally do not) respond to inquiries about joining my group. I want to post information that, I hope, addresses questions you may have: Writing (arranged in chronological order): Miscellaneous
Joke Papers
Sometimes when I feel a creative itch, the end result is a joke publication (which despite the name often have a serious point to make). These are all done with the one and only Daniel Maturana.
Fun & Games