Computer Science Colloquium

Detection and Pose Estimation of People in Challenging Real-world Scenes

Micha Andriluka, Max Planck Institute for Informatics, Saarbrücken

June 21, 2013 11:30AM
Warren Weaver Hall, 102
251 Mercer Street
New York, NY 10012

Host

Arjun Jain
ajain@nyu.edu

Synopsis

In this talk I will describe our recent efforts towards effective methods for people detection and
pose estimation in realistic conditions such as crowded street scenes. I will first illustrate the
complexity of the problem using the task of 3D articulated tracking of multiple people. Despite
recent success in this area, the proposed solutions work well only for people with a limited degree
of occlusion and typically make strong assumptions about people's poses and motions. I will then
discuss our recent work that addresses these two challenges.

I will first describe our approach to people detection, which jointly models the observations of two
people occluding each other and is able to incorporate the patterns of person/person occlusions as
features for detection. I will show that our approach significantly boosts detection performance in
crowded scenes with frequent occlusions while performing comparably to standard methods in scenes
without occlusions.

In the second part of the talk I will present our recent approach to pose estimation of strongly
articulated people. Our approach relies on the popular pictorial structures architecture, which
represents the human body as a collection of rigid parts that are typically connected in a tree
structure resembling a kinematic tree. In our work we propose a new model that incorporates
higher-order dependencies between model parts while remaining efficient. We achieve this by defining
a conditional model in which all body parts are connected a priori, but which becomes a tractable
tree-structured pictorial structures model once the image observations are available. As the set of
conditioning variables we use the recent "Poselet" representation, which is able to effectively
capture relationships between multiple non-adjacent body parts. We demonstrate the effectiveness of
our approach on several publicly available pose estimation benchmarks, improving on or matching the
state of the art in each case.
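
To illustrate why a tree-structured pictorial structures model is tractable, here is a minimal
sketch of exact max-sum inference on a tree, written in Python with NumPy. It is a generic textbook
dynamic program, not the speaker's model: it does not include the Poselet conditioning, and the
function name, the argument layout, and the toy scores are all assumptions made for this example.

```python
import numpy as np

def map_tree_inference(unary, parent, pairwise):
    """Exact max-sum (MAP) inference on a tree-structured part model.

    All names and conventions here are hypothetical, chosen for illustration.
    unary:    list of 1-D arrays; unary[i][s] scores part i in state s.
    parent:   parent[i] is the parent of part i, -1 for the root (part 0).
              Parts are assumed to be ordered so that parent[i] < i.
    pairwise: dict child -> 2-D array; pairwise[i][sp, sc] scores the
              parent being in state sp while child i is in state sc.
    Returns (best_score, best_states).
    """
    n = len(unary)
    belief = [np.asarray(u, dtype=float).copy() for u in unary]
    best_child_state = [None] * n

    # Upward pass (leaves to root). Since parent[i] < i, iterating in
    # reverse order processes every child before its parent.
    for i in range(n - 1, 0, -1):
        scores = pairwise[i] + belief[i][np.newaxis, :]  # (S_parent, S_child)
        best_child_state[i] = scores.argmax(axis=1)
        belief[parent[i]] += scores.max(axis=1)

    # Downward pass (root to leaves): read off the maximizing states.
    states = [0] * n
    states[0] = int(belief[0].argmax())
    for i in range(1, n):
        states[i] = int(best_child_state[i][states[parent[i]]])
    return float(belief[0].max()), states

# Toy chain torso -> upper arm -> lower arm, two candidate states per part.
unary = [np.array([1.0, 0.0]), np.array([0.0, 0.5]), np.array([0.2, 0.0])]
parent = [-1, 0, 1]
agree = np.array([[0.0, -1.0], [-1.0, 0.0]])  # penalize disagreeing states
pairwise = {1: agree, 2: agree}

score, states = map_tree_inference(unary, parent, pairwise)
print(score, states)
```

For N parts with S candidate states each, this runs in O(N * S^2) time, whereas a model in which all
parts are connected would in general require searching over S^N joint configurations.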

