Colloquium Details
Making Sense of the Multimodal World
Speaker: Carl Vondrick, Columbia University
Location: 60 Fifth Avenue, Room 150
Date: December 4, 2024, 2 p.m.
Host: David Fouhey
Synopsis:
People experience the world through modalities of sight, sound, words, touch, and more. By leveraging the natural relationships between these modalities and developing multimodal learning methods, my research creates artificial perception systems with diverse skills, including spatial, physical, logical, and cognitive abilities, for flexibly analyzing visual data. This multimodal approach provides versatile representations for tasks like 3D reconstruction, visual question answering, and object recognition, while offering inherent explainability and excellent zero-shot generalization across tasks. By closely integrating diverse modalities, we can overcome key challenges in machine learning and enable new capabilities for computer vision, especially for the many upcoming applications where physical interaction is required.
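For readers unfamiliar with multimodal representation learning, the sketch below illustrates one common recipe the synopsis alludes to: contrastively aligning embeddings from two modalities (here, images and text) so that matched pairs score higher than mismatched ones, which is what enables zero-shot transfer across tasks. This is a generic, minimal illustration in the style of CLIP, not the speaker's specific method; the embedding dimension, batch size, and temperature are illustrative assumptions.

import torch
import torch.nn.functional as F

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss pulling matched image/text pairs together.

    A generic multimodal alignment objective, not the speaker's method.
    """
    # Normalize so that dot products are cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    # Pairwise similarity matrix: entry (i, j) compares image i to text j.
    logits = image_emb @ text_emb.t() / temperature
    # Matched pairs sit on the diagonal.
    targets = torch.arange(logits.size(0))
    # Classify each image against all texts, and each text against all images.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy usage: random stand-ins for encoder outputs over 8 paired examples.
image_emb = torch.randn(8, 512)
text_emb = torch.randn(8, 512)
print(contrastive_alignment_loss(image_emb, text_emb))

Once two encoders are trained with an objective like this, the shared embedding space can be reused across tasks (e.g., recognition by comparing an image embedding against embeddings of class names), which is the kind of flexible, zero-shot behavior the synopsis describes.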
Speaker Bio:
Carl Vondrick is the YM Associate Professor of Computer Science at Columbia University. Previously, he was a Research Scientist at Google, and he received his PhD from MIT. His research interests are in computer vision, machine learning, and their applications. He is the recipient of the PAMI Young Researcher Award and the NSF CAREER Award. His research is supported by the NSF, DARPA, Amazon, Google, and Toyota. For more information, please visit his website at https://www.cs.columbia.edu/~vondrick/
Notes:
In-person attendance is only available to those with active NYU ID cards.