Perceiving the World in 2D and 3D
Speaker: Georgia Gkioxari, Meta AI
Date: March 30, 2022, noon
Images are powerful storytellers as they capture events, memorable or mundane, from our everyday lives. Humans perceive images effortlessly, but for machines to do the same, they need to build an understanding of the world, a world composed of complex objects, humans, and their rich interactions. In this talk, I will present my work toward enabling machines to recognize and localize objects and their interactions from images, work that powers industry products used by millions of people, such as Portal. The advances in 2D visual understanding are unprecedented, but the world is 3D, and objects have 3D properties that modern recognition models ignore. Toward 3D perception, I will present my work on inferring 3D object shapes from real-world images and understanding 3D scenes via multi-view 2D supervision. To this end, I will present PyTorch3D, our efficient and modular 3D deep learning library, which fuses advances in deep learning with geometry and is widely adopted within the academic and industry research community.
Georgia Gkioxari is a research scientist at Meta AI. She received her PhD in computer science and electrical engineering from the University of California at Berkeley under the supervision of Jitendra Malik in 2016. Her research interests lie in computer vision, with a focus on object recognition from images and videos. In 2017, Georgia received the Marr Prize at ICCV for "Mask R-CNN". In 2019, she was named one of the 30 Influential Women Advancing AI by ReWork and was nominated for the Women in AI awards by VentureBeat. In 2021, Georgia received the PAMI Young Researcher Award and the Mark Everingham Prize for Detectron.