Research Overview


I am interested in modeling the statistics of images, from high-level representations of scenes and objects to low-level cues such as image gradients. Such models may be used for a range of applications within Computer Vision and Computational Photography. Below are brief descriptions of my research, along with links to pages giving more details.

 

Object Recognition

One of my main areas of research is Object Recognition. Here the goal is to give computers the ability to "see" just as humans do. A computer should be able to know where it is and what surrounds it just by looking. Today, cameras are ubiquitous, but we lack the computational algorithms to turn the images they capture into more useful representations.

I have focused on the problem of recognizing object categories. While there are now viable methods for finding specific objects (e.g. a can of Coke) in images, the more general problem of finding categories of objects (e.g. all cans of soda) is harder. My co-authors and I have proposed various probabilistic representations that can be used in conjunction with machine learning methods to learn object models from a set of images containing the desired class of object. The resulting model can then be used to recognize instances of the class in novel images. Link
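
The toy sketch below illustrates the general recipe only, not any specific model from the papers: images are reduced to histograms of quantized local features ("visual words"), a simple per-class probabilistic model is learned from labelled examples, and a novel image is scored by a likelihood ratio. The vocabulary size, the naive-Bayes form, and the random placeholder data are all assumptions made purely for illustration.

```python
# Minimal sketch (not the models from the papers): a naive-Bayes classifier
# over bag-of-visual-words histograms, trained on placeholder data.
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 50          # assumed number of quantized local-feature "visual words"

def extract_histogram(image_features):
    """Count how often each visual word occurs in one image."""
    return np.bincount(image_features, minlength=VOCAB_SIZE)

# Placeholder training data: lists of visual-word indices, one array per image.
positive_images = [rng.integers(0, VOCAB_SIZE, size=200) for _ in range(30)]
negative_images = [rng.integers(0, VOCAB_SIZE, size=200) for _ in range(30)]

def fit_word_distribution(images):
    """Per-class multinomial over visual words, with Laplace smoothing."""
    counts = sum(extract_histogram(f) for f in images) + 1.0
    return counts / counts.sum()

p_word_pos = fit_word_distribution(positive_images)
p_word_neg = fit_word_distribution(negative_images)

def log_likelihood_ratio(image_features):
    """Score a novel image; a positive value favours the object class."""
    h = extract_histogram(image_features)
    return float(h @ (np.log(p_word_pos) - np.log(p_word_neg)))

novel = rng.integers(0, VOCAB_SIZE, size=200)
print("log-likelihood ratio for a novel image:", log_likelihood_ratio(novel))
```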


Leveraging the Internet for Object Recognition

The Internet is an incredibly rich resource of information that I am interested in using in conjunction with object recognition algorithms. One possibility is to use Internet image search engines to provide a diverse set of images from which object category models may be trained. In turn, these models may be applied to collections of images from the Internet or elsewhere, enabling search by visual content (known as content-based image retrieval, CBIR), rather than the text-based searches that are currently employed. Link
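
As a rough illustration of the retrieval step only (it is not a description of any particular system), once a category model has been trained, content-based search amounts to scoring every image in a collection with the model and returning the best-scoring ones. The linear scorer, descriptor dimension, and random data below are placeholders.

```python
# Toy sketch of content-based image retrieval: rank a collection by the
# response of a (placeholder) trained category model.
import numpy as np

rng = np.random.default_rng(1)
DESCRIPTOR_DIM = 128                                     # assumed feature size

model_weights = rng.normal(size=DESCRIPTOR_DIM)          # stands in for a trained model
collection = rng.normal(size=(10_000, DESCRIPTOR_DIM))   # stands in for pre-computed image descriptors

scores = collection @ model_weights      # model response for every image
ranking = np.argsort(-scores)            # best-matching images first
print("indices of the top 5 images:", ranking[:5])
```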

Another, more recent, direction of investigation is "brute force" approaches to recognition, using hundreds of millions of images gathered from the web. With overwhelming amounts of data, very simple algorithms can perform surprisingly well. Link
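
A small sketch of what "brute force" can mean in practice: represent every image by a tiny descriptor, find a query's nearest neighbours by plain sum-of-squared-differences, and vote over their labels. The dataset here is random placeholder data standing in for web collections that are several orders of magnitude larger.

```python
# Toy sketch of brute-force nearest-neighbour recognition over tiny images.
import numpy as np

rng = np.random.default_rng(2)
N, H, W = 20_000, 32, 32                             # placeholder; the real collections are far larger
dataset = rng.random((N, H * W), dtype=np.float32)   # stands in for tiny downsampled web images
labels = rng.integers(0, 10, size=N)                 # placeholder category labels

def nearest_neighbour_label(query, k=25):
    """SSD search over the whole collection, then a majority vote over labels."""
    distances = ((dataset - query) ** 2).sum(axis=1)
    nearest = np.argpartition(distances, k)[:k]
    return int(np.bincount(labels[nearest]).argmax())

query = rng.random(H * W, dtype=np.float32)
print("predicted label:", nearest_neighbour_label(query))
```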

 

Computational Photography

Computational Photography is an area at the convergence of Computer Graphics and Vision. Through the use of computation, its goal is to overcome the limitations of traditional cameras, producing a richer, more informative representation of the visual world. My co-authors and I have used low-level statistical models of image gradients in two applications that extend the capabilities of conventional cameras.

1. Image deblurring. Many photos are spoiled by the user's hand moving while the camera shutter is open. Points in the scene are smeared out over the exposure interval, resulting in a blurry photo. My co-authors and I pose the problem as one of blind deconvolution: we assume the blur function is constant over the image and so aim to recover the blur kernel (the motion of the user's hand) together with the underlying sharp image. We use heavy-tailed priors on the image gradients, in conjunction with sophisticated machine learning tools, to solve the blind deconvolution problem. Applying the algorithm to real photos, we obtain what we believe are the first convincing results on this difficult problem. Link
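
The flavour of the approach can be conveyed with a heavily simplified, one-dimensional sketch of the non-blind half of the problem: with the blur kernel assumed known, a sharp signal is recovered by gradient descent on a data-fit term plus a heavy-tailed penalty on its gradients. The signal, kernel, step size, and prior exponent below are invented for illustration, and the hard part of the real method, inferring the unknown kernel itself, is omitted.

```python
# Hypothetical 1D illustration: non-blind deconvolution with a heavy-tailed
# prior on gradients. The blind kernel-estimation step is omitted.
import numpy as np

rng = np.random.default_rng(3)
n = 256
# Piecewise-constant "scene": sparse jumps, so its gradients are heavy-tailed.
sharp = np.cumsum(rng.standard_normal(n) * (rng.random(n) < 0.05))
kernel = np.ones(9) / 9.0                          # assumed-known blur kernel
blurry = np.convolve(sharp, kernel, mode="same") + 0.01 * rng.standard_normal(n)

lam, alpha, eps = 0.01, 0.8, 1e-3                  # prior weight and exponent
x = blurry.copy()
for _ in range(500):
    # Gradient of the data term 0.5*||k*x - b||^2 (the kernel is symmetric,
    # so convolving the residual with the flipped kernel is its adjoint).
    residual = np.convolve(x, kernel, mode="same") - blurry
    data_grad = np.convolve(residual, kernel[::-1], mode="same")
    # Gradient of the heavy-tailed penalty sum |dx|^alpha on finite differences.
    dx = np.diff(x, append=x[-1])
    g = alpha * np.sign(dx) * (np.abs(dx) + eps) ** (alpha - 1.0)
    prior_grad = np.concatenate(([-g[0]], g[:-2] - g[1:-1], [g[-2]]))
    x -= 0.5 * (data_grad + lam * prior_grad)

print("MSE blurry vs sharp:  ", float(np.mean((blurry - sharp) ** 2)))
print("MSE restored vs sharp:", float(np.mean((x - sharp) ** 2)))
```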


2. Acquiring depth and an image using a conventional camera with a coded aperture. Together with Anat Levin, we present a technique that permits both a high-resolution image and depth information to be recovered from a single shot. To do this we make a simple modification to a conventional lens: we insert into the aperture a pattern cut from a piece of cardboard. This pattern changes the shape of the out-of-focus blur, making depth information easier to acquire: the local defocus patterns can be analyzed to recover a depth map. Then, using the depth map, a high-resolution, all-focus image can also be recovered. Link
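
A much-reduced, one-dimensional sketch of why the coded aperture helps is given below: the code stamps scale-dependent frequency nulls onto the defocus blur, so the blur scale (a stand-in for depth) can be identified by testing which candidate scale best explains the observed spectrum. The aperture code, scene model, and Gaussian gradient prior used here are simplifying assumptions; they are not the actual pattern or the sparse-prior machinery of the real technique.

```python
# Hypothetical 1D illustration: identify the defocus scale of a coded blur by
# comparing the observation against each candidate scale under a Gaussian
# image model (a stand-in for depth-from-defocus with a coded aperture).
import numpy as np

rng = np.random.default_rng(4)
n = 1024
code = np.array([1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1], float)   # made-up 1D aperture code

def kernel_spectrum(scale):
    """DFT of the aperture code stretched by `scale` samples (circular model)."""
    k = np.repeat(code, scale)
    k /= k.sum()
    padded = np.zeros(n)
    padded[: k.size] = k
    return np.fft.fft(padded)

# Simulate an observation: a piecewise-constant scene, circularly blurred at
# the (unknown to the estimator) true scale, plus sensor noise.
jump_prob, sigma_noise, true_scale = 0.05, 0.005, 4
sharp = np.cumsum(rng.standard_normal(n) * (rng.random(n) < jump_prob))
blurry = np.real(np.fft.ifft(kernel_spectrum(true_scale) * np.fft.fft(sharp)))
blurry += sigma_noise * rng.standard_normal(n)
B2 = np.abs(np.fft.fft(blurry)) ** 2

# Rough prior power spectrum of the scene (sparse jumps => ~1/|f|^2 spectrum)
# and the matching DFT-domain noise power.
D2 = np.abs(1.0 - np.exp(-2j * np.pi * np.arange(n) / n)) ** 2
S_x = n * jump_prob / (D2 + 1e-6)
noise_power = n * sigma_noise ** 2

def neg_log_likelihood(scale):
    """Marginal negative log-likelihood of the observation at one blur scale."""
    v = np.abs(kernel_spectrum(scale)) ** 2 * S_x + noise_power
    return float(np.sum(np.log(v) + B2 / v))

scores = {s: neg_log_likelihood(s) for s in range(1, 9)}
print("true scale:", true_scale, " best-scoring scale:", min(scores, key=scores.get))
```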