Learning Models for Scalable Visual Recognition
Speaker: Lorenzo Torresani, Dartmouth College
Location: Warren Weaver Hall 1302
Date: February 12, 2010, 11:30 a.m.
Host: Chris Bregler
Visual recognition, the task of determining what is present in a picture, remains one of the fundamental open problems of computer vision. Over the last two decades, recognition models learned from examples have progressively supplanted traditional hand-constructed systems. However, two scalability challenges hinder the applicability of learning approaches to modern recognition problems. First, the typical learning-to-recognize pipeline requires training on large collections of manually labeled data, such as carefully selected image segmentations. This reliance on time-consuming human labeling effectively prevents these methods from scaling to problems involving a large number of different object classes. The second challenge is linked to the dramatic recent growth of image collections, such as those on Internet-based photo-sharing services, which have opened a wide array of potential new applications of visual recognition. However, current learning-based recognition models are too computationally expensive to support search and recognition in these ever-expanding image collections.
In this talk, I will discuss work aimed at addressing these two scalability problems. I will begin with an algorithm that can learn to detect and recognize objects from images annotated with labels indicating the presence of an object, but not its location. As the method requires very little supervision, it can be used to learn to recognize many different object classes. In the second part of my talk, I will discuss a learning algorithm for real-time visual recognition in large image collections. Efficiency is achieved by constraining the recognition model to be of a form directly executable on traditional text-search engines, which perform real-time search in databases of several billion documents.
Lorenzo Torresani is an Assistant Professor in the Computer Science Department at Dartmouth College. He received a Laurea degree in Computer Science with summa cum laude honors from the University of Milan (Italy) in 1996, and an M.S. and a Ph.D. in Computer Science from Stanford University in 2001 and 2005, respectively. He has previously worked at several industrial research labs, including Microsoft Research Cambridge, Riya, and Digital Persona. His research interests are in computer vision, computer animation, and machine learning. In 2001, Torresani and his coauthors received the Best Student Paper Award at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.