
Action Recognition in Video


  • Time Period: 2009 - present.
  • Participants: Graham Taylor, Rob Fergus, Chris Bregler, Yann LeCun (Courant Institute/CBLL).
  • Sponsors: DARPA, ONR.
  • Description: A trainable system was built to recognize actions in video. The first layer is a Convolutional Gated Restricted Boltzmann Machine (ConvGRBM), trained in an unsupervised manner, which automatically learns features that primarily encode motion. The second layer uses sparse coding to learn mid-level features, also unsupervised. The resulting feature vectors are pooled over time with a max-pooling operation and fed to a Support Vector Machine. Excellent performance was obtained on the Hollywood-2 dataset. A similar system was built to recognize actions on the KTH dataset; it also uses a ConvGRBM at the first layer, but a 3D (spatio-temporal) convolutional network architecture for the subsequent layers.
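The final stages of the pipeline described above (temporal max-pooling of per-frame features, followed by a linear classifier) can be sketched as follows. This is an illustrative toy example, not the authors' actual code: the per-frame feature vectors are random stand-ins for the output of the ConvGRBM + sparse-coding layers, and the function and variable names are assumptions.

```python
# Illustrative sketch of the pipeline's last two stages: per-frame
# mid-level features are max-pooled over time into one descriptor per
# clip, which is then classified with a linear SVM. Toy data only.
import numpy as np
from sklearn.svm import LinearSVC

def temporal_max_pool(frame_features):
    # frame_features: (T, D) array of per-frame feature vectors.
    # Taking the max over the time axis yields a single D-dim
    # descriptor per clip, invariant to when the motion occurs.
    return frame_features.max(axis=0)

rng = np.random.default_rng(0)

# Toy data: 20 clips with variable frame counts and 8-dim features.
clips = [rng.normal(size=(rng.integers(5, 15), 8)) for _ in range(20)]
labels = np.array([i % 2 for i in range(20)])
# Shift class-1 clips so the two "actions" are separable.
for i, c in enumerate(clips):
    if labels[i] == 1:
        c += 2.0

# Pool each clip over time, then train the SVM on the pooled vectors.
X = np.stack([temporal_max_pool(c) for c in clips])
clf = LinearSVC().fit(X, labels)
train_acc = clf.score(X, labels)
```

In the real system the pooled descriptors come from learned motion features rather than random vectors, but the pooling-plus-SVM structure is the same.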

Latest Video

Watch the real-time demo of our action recognition system (August 2010):

Publications

Graham W. Taylor, Rob Fergus, Yann LeCun and Christoph Bregler: Convolutional Learning of Spatio-temporal Features, Proc. European Conference on Computer Vision (ECCV'10), 2010. [DjVu, 467KB] [PDF, 857KB] [PS.GZ, 557KB]