Training a Computer to See People
Speaker: Deva Ramanan, University of California at Irvine
Location: Warren Weaver Hall 1302
Date: March 4, 2011, 11:30 a.m.
Host: Denis Zorin
One of the great open challenges in machine vision is to train a computer to "see people." A reliable solution would open up tremendous possibilities, from automated persistent surveillance and next-generation image search to more intuitive computer interfaces. Analyzing people, and objects in general, is difficult because their appearance varies with a number of "nuisance" factors (including viewpoint, body pose, and clothing) and because real-world images contain clutter. I will describe machine learning algorithms that accomplish such tasks by encoding image statistics of the visual world learned from large-scale training data. I will focus on predictive models that produce rich, structured descriptions of images and videos (How many people are present? What are they doing?) and on models that compensate for nuisance factors through the use of latent variables. I will illustrate these approaches on the tasks of object detection, people tracking, and activity recognition, producing state-of-the-art systems as evidenced by recent benchmark competitions.
Deva Ramanan is an assistant professor of Computer Science and the co-director of the Computational Vision Lab at the University of California at Irvine. Prior to joining UCI, he was a Research Assistant Professor at the Toyota Technological Institute at Chicago (2005-2007). He also held visiting researcher positions in the Robotics Institute at Carnegie Mellon University in 2006 and at Microsoft Research in 2008. He received his B.S. degree with distinction in computer engineering from the University of Delaware in 2000, graduating summa cum laude. He received his Ph.D. in Electrical Engineering and Computer Science with a Designated Emphasis in Communication, Computation, and Statistics from UC Berkeley in 2005. His research interests span computer vision, machine learning, and computer graphics, with a focus on understanding people through images and video. His past work focused on articulated tracking, while his recent work has focused on object recognition. His work in this area won or received special recognition at the PASCAL Visual Object Classes Challenge, 2007-2010, including a Lifetime Achievement Prize in 2010. His work on contextual object modeling won the 2009 David Marr Prize. He was awarded an NSF CAREER Award in 2010. His work is supported by NSF, ONR, and DARPA, as well as by industrial collaborations with the Intel Science and Technology Center for Visual Computing, Google Research, and Microsoft Research. He serves on the editorial board of the International Journal of Computer Vision (IJCV), is a senior program committee member for the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and has served on multiple NSF panels for computer vision and machine learning.
In-person attendance is available only to those with active NYU ID cards.