- Time Period: September 2003-June 2004.
- Participants: Margarita Osadchy (Technion), Matt Miller (NEC
Labs), Yann LeCun (Courant Institute/CBLL).
- Video: watch a video of the system in action.
- Slides: Synergistic Face Detection and Pose Estimation.
Slides of a talk delivered at the Object Recognition workshop,
Taormina, Sicily, October 2004.
- Poster: Synergistic Face Detection and Pose Estimation.
Poster presented at NIPS 2004, Vancouver, December 2004.
- [Osadchy, Miller, and LeCun, 2004].
Synergistic Face Detection and Pose Estimation,
Proc. NIPS 2004.
- [LeCun and Huang, 2005].
Loss Functions for Discriminative Training of Energy-Based Models.
Proc. AI Stats 2005. This paper is not specifically about face
detection, but about the general concept of Energy-Based Models.
The loss function used for the face detector derives from
- [Vaillant, Monrocq, LeCun, 1994].
An original approach for the localisation of objects in
images, IEE Proc. on Vision, Image, and Signal Processing 1994.
This is an older paper about Yann LeCun's early work on face
detection using convolutional networks.
[Figure: Everyone is detected.]
We developed a novel method for real-time, simultaneous multi-view
face detection and facial pose estimation. The method employs a
convolutional network to map face images to points on a manifold
parameterized by pose, and non-face images to points away from that
manifold. This network is trained by optimizing an energy function
of three variables: image, pose, and face/non-face label.
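The idea of an energy over image, pose, and label can be sketched in a few lines. This is only an illustration under stated assumptions, not the trained system: the manifold is restricted to a single pose parameter (yaw), the cosine parameterization in `manifold_point` and the basis angles in `ALPHAS` are assumed for the example, the threshold value is arbitrary, and the convolutional network's output is stood in for by a hand-made 3-vector called `net_output`. All function names are hypothetical.

```python
import math

# Assumed basis angles for a yaw-only face manifold (illustrative choice).
ALPHAS = (-math.pi / 3, 0.0, math.pi / 3)


def manifold_point(yaw):
    """Point on the assumed face manifold for a yaw angle in radians."""
    return [math.cos(yaw - a) for a in ALPHAS]


def energy(net_output, yaw, is_face, threshold):
    """Energy of (image, pose, label); the image enters via net_output."""
    if not is_face:
        # Assumed: the non-face label has a constant energy.
        return threshold
    target = manifold_point(yaw)
    # Face label: distance from the network output to the manifold point.
    return math.sqrt(sum((g - f) ** 2 for g, f in zip(net_output, target)))


def infer(net_output, threshold=0.5, steps=181):
    """Minimize the energy over the label and a grid of yaw values."""
    best_yaw, best_energy = None, float("inf")
    for k in range(steps):
        # Grid over yaw in [-90, +90] degrees.
        yaw = -math.pi / 2 + math.pi * k / (steps - 1)
        e = energy(net_output, yaw, True, threshold)
        if e < best_energy:
            best_yaw, best_energy = yaw, e
    # The label with the lower energy wins.
    is_face = best_energy < energy(net_output, best_yaw, False, threshold)
    return is_face, best_yaw, best_energy


if __name__ == "__main__":
    # A vector lying on the manifold, and one far away from it.
    print(infer(manifold_point(math.radians(30))))
    print(infer([3.0, -3.0, 3.0]))
```

In this sketch, a vector close to the manifold is labeled as a face and its pose is read off from the nearest manifold point, while a vector far from the manifold falls back to the non-face label.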
The system was trained with 53,000 grayscale images of faces, manually
annotated with the pose (position, size, pitch, yaw, roll), and
53,000 images of non-faces.
We tested the resulting system, in a single configuration, on three
standard data sets -- one for frontal pose, one for rotated faces, and
one for profiles -- and found that its performance on each set is
comparable to that of previous multi-view face detectors that can only handle
one form of pose variation.
We also show experimentally that the system's accuracy on both face
detection and pose estimation is improved by training for the two
tasks together.
The main conceptual difference between the convolutional net approach
and the popular "Viola-Jones" approach is that convolutional networks
have a fixed number of feature detectors that are highly optimized
through gradient-based learning, while the V-J system generates a very
large number of very simple (binary) features, and uses AdaBoost to
select a good subset.
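To make "very simple features" concrete, here is a minimal sketch of the kind of rectangle feature commonly associated with the Viola-Jones approach, computed from an integral image. It is a generic illustration, not code from either system; the function names and the choice of a left-minus-right two-rectangle feature are assumptions for the example.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a window (width assumed even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


if __name__ == "__main__":
    # Tiny image: bright left half, dark right half.
    img = [[1, 1, 0, 0],
           [1, 1, 0, 0]]
    ii = integral_image(img)
    print(two_rect_feature(ii, 0, 0, 2, 4))
```

Each such feature costs only a handful of lookups, which is why a boosting procedure can afford to search a very large pool of them; a convolutional network instead learns a small, fixed set of filters by gradient descent.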
Our system runs in real time (a few frames per second) on a laptop,
processing each full frame independently of the others.
It operates on grayscale images, and therefore does not rely on the
[Figure: The usual suspects are rounded up.]
Members of our group have used convolutional nets for object
spotting in the past:
- [Matan, Burges, LeCun, Denker 1992]:
Multi-Digit Recognition Using a Space Displacement Neural
Network, NIPS 4, 1992. This paper describes our early work using
convolutional nets for spotting and recognizing handwritten
characters in an image.
- Back in 1989/1990, we applied convolutional nets to the detection
of the ID numbers that are painted on the side of railroad cars.
Sadly, this work was never published.
Other groups have used convolutional networks for object detection:
- Christophe Garcia
and Manolis Delakis have a nice paper in the Nov 2004 issue of IEEE PAMI
on a face detector based on convolutional nets. They also have
an online demo.
- In 1995, Steve Nowlan and John Platt used convolutional nets
to do hand detection and tracking: [Steven Nowlan, John Platt 1995]:
A Convolutional Neural Network Hand Tracker, Proc. NIPS 7, 1995.
- In 1994, Wolf and Platt used convolutional nets to locate
addresses on postal envelopes: [Wolf, Platt 1994]:
Postal Address Block Location Using a Convolutional Locator
Network, Proc. NIPS 6, 1994.
If you want to experiment with convolutional nets,
a full implementation is included
in the gblearn2 library distributed with the Lush language.
The Torch C++ Library by Ronan
Collobert, Samy Bengio and Johnny Mariethoz also has an implementation
of convolutional networks (somewhat inspired by the Lush version).
[Figure: Somewhat unusual suspects.]
Seriously deviant stuff: