Low-latency Image Recognition with GPU-accelerated Convolutional Networks for Web-based Services

Candidate: Fu Jie Huang
Advisor: Yann LeCun


In this work, we describe an application of convolutional networks to object classification and detection in images. The task of image based object recognition is surveyed in the first chapter. Its application in internet advertisement is one of the main motivations of this work.

The architecture of the convolutional networks is described in details in the following chapter. Stochastic gradient descent is used to train the networks.

We then describe the data collection and labelling process. The set of training data labelled basically decides what kind of recognizer is being built. Four binary classifers are trained for the object types of sailboat, car, motorbike, and dog.

GPU based massive parallel implementation of the convolutional networks is built. This enables us to run the convolution operations at close to 40 times faster than running on a traditional CPU. Details about how to implement the convolutional operation on NVIDIA GPUs using CUDA is disscused.

In order to apply the object recognizer in a production environment where millions of images are processed daily, we have built a platform with cloud computing. We describe how large scale and low latency image processing can be achieved with such a system.