Detection and Pose Estimation of People in Challenging Real-world Scenes
Speaker: Micha Andriluka, Max Planck Institute for Informatics, Saarbrücken
Location: Warren Weaver Hall 102
Date: June 21, 2013, 11:30 a.m.
Host: Arjun Jain
In this talk I will describe our recent efforts towards effective methods for people detection and pose estimation in realistic conditions such as crowded street scenes. I will first illustrate the complexity of the problem on the task of 3D articulated tracking of multiple people. Despite recent success in this area, the proposed solutions work well only for people with a limited degree of occlusion and typically make strong assumptions about people's poses and motions. I will then discuss our recent work that addresses these two challenges.
I will first describe our approach to people detection, which jointly models the observations of two people occluding each other and is able to incorporate the patterns of person/person occlusions as features for detection. I will show that our approach significantly boosts detection performance in crowded scenes with frequent occlusions while performing comparably to standard methods in scenes without occlusions.
In the second part of the talk I will present our recent approach to pose estimation of strongly articulated people. Our approach relies on the popular pictorial structures architecture, which represents the human body using a collection of rigid parts that are typically connected in a tree structure resembling a kinematic tree. In our work we propose a new model that incorporates higher-order dependencies between model parts while remaining efficient. We achieve this by defining a conditional model in which all body parts are connected a priori, but which becomes a tractable tree-structured pictorial structures model once the image observations are available. As a set of conditioning variables we use the recent "Poselet" representation, which is able to effectively capture relationships between multiple non-adjacent body parts. We demonstrate the effectiveness of our approach on several publicly available pose estimation benchmarks, improving on or matching the state of the art in each case.
Refreshments will be offered starting 15 minutes prior to the scheduled start of the talk.