Pierre Sermanet
PhD in deep learning for vision, speech and robotics

Dogs vs. Cats Kaggle challenge

  • Using my OverFeat library, I obtained 1st place in the Kaggle competition against 214 other teams.
  • My system reached 1.09% error, vs 1.69% for the second place.

OverFeat: Feature Extractor and world records

  • OverFeat is a feature extractor for natural images. It also contains a 1000-way classifier. It is written in C but contains wrappers for Lua/Torch, Python and Matlab.
  • OverFeat is also the name of our team which participated in the 2013 ImageNet Large Scale Visual Recognition Challenge (ILSVRC13).
  • The OverFeat team set a world record on the object detection and localization tasks of the ILSVRC13 datasets and ranked 5th in the classification task.

NYU LAGR featured on Discovery Science Channel

  • Our work on the LAGR program was featured on the Discovery Science channel in January 2010 in the Sci-Fi Science series by Michio Kaku:
    Physics of the Impossible: How to build an Intelligent Robot
  • In this episode, Michio Kaku's goal is to design an intelligent robot. His design includes a learning brain that constantly rewires itself as it performs new tasks.
  • The learning brain is illustrated by our convolutional neural network which has learned to recognize obstacles in natural scenes and also adapts constantly to new obstacles it has never seen before.

EBLearn: Open-Source Machine Learning C++ Library

  • The eblearn library is a C++ implementation of the machine learning algorithms used in the LAGR project, and in particular the supervised training of convolutional neural networks. It includes demos and tutorials.

NYU Robotics Class 2009 (Teaching Assistant): Rovio, 3PI and Arduino robots

  • Rovio Soccer: students programmed (in Lush) Rovio robots to find a yellow tennis ball and push it between red goal posts. To recognize colors robustly, images are converted from RGB to YUV space to reduce sensitivity to lighting conditions, and the squared Euclidean distance computed on the U and V components is then thresholded. To find colored objects robustly, noise is removed by eroding and dilating the binary image obtained after color detection, and objects are identified by a connected-components analysis. Finally, the distance to each object is estimated by transforming distances from pixel space to real-world space.
  • 3PI Line following: students competed to follow a black line as fast as possible (see video) with the 3PI robots. Those robots have 5 reflectivity sensors on the front side, covering an angle of approximately 45 degrees. The fastest robots used a PID controller.

  • 3PI Dead Reckoning: for this project, we aim for accuracy rather than speed. Robots have to follow a black line until its end, then return exactly to the starting point.
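
The color-detection steps described for Rovio Soccer can be sketched as follows. This is a minimal illustration in Python, not the students' Lush code: the function names, the BT.601 conversion coefficients, the 3x3 structuring element, the 4-connectivity and the threshold value are all assumptions made for the example. The final pixel-to-world distance step is left out because it depends on the camera calibration.

```python
from collections import deque

def rgb_to_uv(r, g, b):
    """Return the (U, V) chroma of an RGB pixel (BT.601 coefficients, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return 0.492 * (b - y), 0.877 * (r - y)

def color_mask(image, target_rgb, max_sq_dist):
    """Mark pixels whose squared (U, V) distance to the target is small enough."""
    tu, tv = rgb_to_uv(*target_rgb)
    mask = []
    for row in image:
        out = []
        for r, g, b in row:
            u, v = rgb_to_uv(r, g, b)
            out.append((u - tu) ** 2 + (v - tv) ** 2 <= max_sq_dist)
        mask.append(out)
    return mask

def _morph(mask, combine):
    """Apply a 3x3 operator; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    def at(y, x):
        return 0 <= y < h and 0 <= x < w and mask[y][x]
    return [[combine(at(y + dy, x + dx)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(w)] for y in range(h)]

def erode(mask):
    """Keep a pixel only if its whole 3x3 neighborhood is set."""
    return _morph(mask, all)

def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighborhood is set."""
    return _morph(mask, any)

def connected_components(mask):
    """Group set pixels into 4-connected components, each a list of (y, x)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, component = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components

def find_objects(image, target_rgb, max_sq_dist):
    """Threshold on chroma, erode then dilate to remove noise, then label."""
    mask = dilate(erode(color_mask(image, target_rgb, max_sq_dist)))
    return connected_components(mask)
```

Here `image` is assumed to be a list of rows of `(r, g, b)` tuples. Calling `find_objects(image, (255, 255, 0), 400.0)` would return one pixel list per yellow blob that survives the erosion, so isolated noisy pixels are discarded before labeling.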

LAGR 2005-2008: Learning Applied to Ground Robots

  • This video gives an overview of the robot's intelligence developed by New York University for the DARPA LAGR program. The robot is able to learn new obstacle representations thanks to neural networks. It can see and navigate from more than 200 meters away.
  • The entire system is fully described in the journal papers "A Multi-Range Architecture for Collision-Free Off-Road Robot Navigation" and "Learning Long-Range Vision for Autonomous Off-Road Driving", published in the Journal of Field Robotics.
  • This video was selected for the video competitions of the AAAI 08 conference on artificial intelligence as well as the LAB-RS 08 symposium on learning and adaptive behaviors in robotics.

EUROBOT / E=M6 2004 Robot Competition

  • For this contest we built a robot from scratch that can play rugby against another robot. It grabs and shoots balls to score points. Our approach is a robust and reactive robot that does not assume its position on the playground, relying instead on vision and infrared sensors.
Last update: July 12th, 2010