VLG Group
Group Meetings
Y. LeCun's website
CS at Courant
Courant Institute

LAGR: Learning Applied to Ground Robotics

A collaboration between CBLL (Courant Institute) and Net-Scale Technologies.

LAGR Robot #13 "Coureuse".
The purpose of the LAGR project is to design vision and learning algorithms to allow the robot to navigate in complex outdoors environment solely from camera input.

Our team is one of 10 participants in the program funded by DARPA.

Each LAGR team received identical copies of the LAGR robot, built be the National Robotics Engineering Consortium at Carnegie Mellon University.

The government periodically runs competitions between the teams. The software from each team is loaded and run by the goverment team on their robot.

The robot is given the GPS coordinates of a goal to which it must drive as fast as possible. The terrain is unknown in advance. The robot is run three times through the test course.

The software can use the knowledge acquired during the early runs to improve the performance on the latter runs.

LAGR Robot #27 "Coureur".

Latest Video

Watch a video describing the final version of our system (version 4.0)

Click here for a high quality 640x480 AVI version of the video

Click here to watch more videos of the robot in action



10. Raia Hadsell, Pierre Sermanet, Marco Scoffier, Ayse Erkan, Koray Kavackuoglu, Urs Muller and Yann LeCun: Learning Long-Range Vision for Autonomous Off-Road Driving, Journal of Field Robotics, 26(2):120-144, February 2009, \cite{hadsell-jfr-09}. 1330KBDjVu

9. Pierre Sermanet, Raia Hadsell, Marco Scoffier, Matt Grimes, Jan Ben, Ayse Erkan, Chris Crudele, Urs Muller and Yann LeCun: A Multi-Range Architecture for Collision-Free Off-Road Robot Navigation, Journal of Field Robotics, 26(1):58-87, January 2009, \cite{sermanet-jfr-09}. 5401KBDjVu

8. Pierre Sermanet, Marco Scoffier, Chris Crudele, Urs Muller and Yann LeCun: Learning Maneuver Dictionaries for Ground Robot Planning, Proc. 39th International Symposium on Robotics (ISR'08), 2008, \cite{sermanet-isr-08}. 1990KBDjVu

7. Raia Hadsell, Ayse Erkan, Pierre Sermanet, Marco Scoffier, Urs Muller and Yann LeCun: Deep Belief Net Learning in a Long-Range Vision System for Autonomous Off-Road Driving, Proc. Intelligent Robots and Systems (IROS'08), 2008, \cite{hadsell-iros-08}. 517KBDjVu

6. Pierre Sermanet, Raia Hadsell, Marco Scoffier, Urs Muller and Yann LeCun: Mapping and Planning under Uncertainty in Mobile Robots with Long-Range Perception, Proc. Intelligent Robots and Systems (IROS'08), 2008, \cite{sermanet-iros-08}. 205KBDjVu

5. Ayse Erkan, Raia Hadsell, Pierre Sermanet, Jan Ben, Urs Muller and Yann LeCun: Adaptive Long Range Vision in Unstructured Terrain, Proc. Intelligent Robots and Systems (IROS'07), 2007, [key=erkan-iros-07]. 211KBDjVu

4. Raia Hadsell, Ayse Erkan, Pierre Sermanet, Jan Ben, Koray Kavukcuoglu, Urs Muller and Yann LeCun: A Multi-Range Vision Strategy for Autonomous Offroad Navigation, Proc. Robotics and Applications (RA'07), 2007, [key=hadsell-ra-07]. 416KBDjVu

2. Pierre Sermanet, Raia Hadsell, Jan Ben, Ayse Naz Erkan, Beat Flepp, Urs Muller and Yann LeCun: Speed-Range Dilemmas for Vision-Based Navigation in Unstructured Terrain, Proc. 6th IFAC Symposium on Intelligent Autonomous Vehicles, 2007, [key=sermanet-07]. 149KBDjVu

2. Raia Hadsell, Pierre Sermanet, Ayse Erkan, Jan Ben, Jeff Han, Beat Flepp, Urs Muller and Yann LeCun: On-line Learning for Offroad Robots: Using Spatial Label Propagation to Learn Long-Range traversability, Proc. Robotics Science and Systems 07, 2007, [key=hadsell-rss-07]. 359KBDjVu

1. Raia Hadsell, Pierre Sermanet, Jan Ben, Ause Erkan, Jeff Han, Matt Grimes, Sumit Chopra, Yury Sulsky, Beat Flepp, Urs Muller and Yann LeCun: On-line Learning for Offroad Robots: Using Spatial Label Propagation to Learn Long-Range traversability, Computational and Biological Learning Lab, Courant Institute, NYU, 2007, [key=hadsell-lagrtr-07]. 203KBDjVu

The LAGR Robot

The LAGR robot, was designed and built by a team at the National Robotics Engineering Consortium at Carnegie Mellon University. The lead designer was Herman Herman, hence the the robot's nickname: hHerminator.

The robot has two independently controled, electric-powered front wheels and two caster wheels at the rear. The sensors are:

  • Two "bumblebee" 1024x768 color stereo camera pairs mounted on the mast, providing a 110 degree field of view.
  • a 6-degree of freedom Inertial Measurement Units (IMU) wheel encoders, and a GPS for pose estimation.
  • a front bumper with left and right switch sensors.
  • two short-range infrared sensors that can (unreliably) detect obstacles up to 1.5 meters away.
In its original version, the robot contained three 1.4GHz Pentium-M based computers running Linux in its belly, plus a 4th processor for low-level control tasks. The new version has three dual-core 2.0GHz Pentium-M. The computers are connected through a Gigabit ethernet network. One computer is connected to the left camera pair, one to the right camera pair, and one is for general use (called the planner).

Communication with the outside world is provided through WiFi.

The Netscale/NYU software to run the robot is entirely implemented in the Lush language, a dialect of Lisp specifically designed for numerical taks, particularly computer vision and machine learning.

Separate Lush processes running left eye and right eye computers perform image acquisition, preprocessing, rectification, stereo, short-range obstacle detection, and long-range obstacle detection with on-line self learning. Each process sends a local map of the environment to the planner computer through a socket. The planner computer runs several Lush processes for combining the various maps, planning a route, and driving the robot.

The Task: Vision-Based Navigation with Machine Learning

Robot navigation in outdoors environments solely from vision is a very challenging problem.

The purpose of the LAGR project is to improve the state of the art in vision-based outdoors robot navigation by using learning methods.

A typical test run of the LAGR robot consists in placing the robot in an unknown terrain, and letting it drive autonomously to a prescribed GPS location as fast as possible. The robot is run three times through the course. Information can be gathered during during one run so as to improve the performance on subsequent runs.

A typical course is 100 to 200 meters through various outdoors environments with a variety of terrains (grass, dirt, sand, roads, forest grounds), and natural and artificial obstacles (trees, bushes, rocks, hay bales, fences,...).

The Problem: The Short-Sightedness of Stereo Vision

The method of choice for vision-based driving in off-road mobile robots is to construct a traversibility map of the environment using stereo vision. In the most common approach, a stereo matching algorithm, applied to images from a pair of stereo cameras, produces a "point-cloud", in which the visible pixels are given an XYZ position relative to the robot. A traversibility map can then be derived using various heuristics, such as counting the number of points that are above the ground plane in a given map cell. Maps from multiple frames are assembled in a global map in which path finding algorithms are run.

One problem with real-time map building from stereo is that the accuracy of range estimates degrades quickly above 7 to 10 meters. This causes robots to be "short-sighted", or as if in a self-imposed ``fog''. The robot would often drive into dead-ends, and take excessive time discovering pathways that, while beyond stereo range, would be obvious to a human observer.

Top view of a map generated from stereo (stereo is run at 320x240 resolution). The map is "smeared out" and sparse at long range because range estimates from stereo become inaccurate above 8 to 10 meters. Traversibility maps built from this are either short range or inaccurate. (click on image for high res version)

NSNYU System v4.0: Final Version

This system first became operational in November 2007.


The system is composed of the following components:

System architecture.

Convolutional Network

Convolutional networks are trainable machines composed of multiple layers of trainable non-linear convolutional filters. Convolutional networks have been used with great success (and deployed commercially) for such applications as handwriting recognition (e.g. check reading), face and people detection in images, and such.

The main advantage of convolutional networks is that they can be fed with "raw" pixels, without requiring extensive hand-crafted. pre-processing. The training process in convolutional nets automatically produces relevant features. In this system, the convolutional network is fed with the horizon-leveled image bands from the multiscale image pyramid.

The convolutional net can be seen as extracting a 100-dimensional feature vector for every 25x12 pixel window in the image bands (the windows are stepped every 4 pixels horizontally).

These feature vectors are fed to a logistic regression classifier that is trained on-line (as the robot runs) using the labels produced by the medium-range stereo module. Feature vectors from the last few frames are kept in a short-term memory (a ring buffer) to train the classifier. This module runs at about 1 frame per second.

Convolutional network architecture for long-range obstacle detection.

Results of Long-Range Obstacle Detection

Below is a series of triplets of images showing the result of the medium-range and long-range vision systems.

Each row shows three images:

Green indicates a traversable area, purple indicates an obstacle footline, and red indicates an obstacle.

Results of long-range obstacle detection

Hyperbolic Polar Map

The results from the various obstacle labeling systems are combined and accumulated into a map of the surroundings of the robot.

The map is a hyperbolic-polar map centered on the robot. In a hyperbolic representation, the entire world is mapped to a finite-radius disk using a hyperbolic mapping. Label information from multiple successive frames are accumulated in category histograms stored at each map cell. Before planning, the category histograms are transformed into traversability costs. Planning a route to the goal map performed using a A* path planning algorithm.

The picture below shows examples of map produced at various points along a trajectory followed by the robot.

Example of maps produced by the perception system.

Links to more information

For other details, pictures and videos, visit Raia Hadsell's LAGR page with more details, pictures, and the latest videos.

NSNYU System v3.0: Fast OD - Far OD - Convolutional Net

This system was operational between Winter 2007 and Summer 2007. While it worked satisfactorily, it was found to be too sensitive to the ground plane assumption.

The v2.0 system had a major flaw: the long-range obstacle detector took too long to run, and created long latencies (sometimes as long as 1/2 second) in the visual control loop. These delays, combined with built-in delays in the low-level control loop of the robot firmware, caused the robot to crash into obstacles before having time to avoid them.

We designed a new system composed of a very fast, very short range, stereo-based obstacle avoidance system (FastOD), combined with a slower, learning-based, long-range obstacle detection system (FarOD).

FastOD operates on 160x120 resolution images, is purely based on stereo, produces a map with 5 meter radius, and runs at 10 frames per second.

FarOD uperates on 320x240 resolution images, combines stereo and a trainable obstacle detector based on convolutional nets, produces a map with 35 meter radius, and runs at 3 frames per second.

The FastOD had a range of 5 meters, while the FarOD had a range of 35 meters.

The RBF-based long-range obstacle detector of v2.0 was replaced by new architecture based on the convolutional network method.

The convolutional network obstacle detector operates on the size-normalized bands from the pyramid. It has 4 layers and operates on YUV components (with contrast normalization on the Y component). The network is first trained off-line on log files. Each column of pixel in the bands that are withing stereo range are assigned a traversibility label by the stereo vision system. The convolutional net is trained map each band to this corresponding row of stereo labels.

The long-range obstacle detector works very well, as shown in the pictures below. Each panel shows (from left to right) the original image, the traversibility labels produced by stereo (up to 10-12 meters), and the traversibility labels produced by the convolutional net long-range obstacle detector trained on-line (up to 30-35 meters).

The v3.0 system worked quite well, but the distance estimated it provided were very dependent on the ground plane assumption. Furthermore, there were various issues with the robot-centered cartesian map populated by the long-range obstacle detector (FarOD). First, it was prohibitively expensive to extend this map beyond the range of 40 meters or so (the cartesian map would become prohibitively large). Second, the map was very "blurry" in the far range because range estimates vary considerably from one frame to the next. Obstacles estimated to be at 25 meters in one frame could very well be placed at 35 meters in the next frame.

Hence, we decided to implement a robot-centered map for FarOD whose cell shape would somewhat reflect the distance uncertainties produced by FarOD.

This resulted in the so-called Hyperbolic-Polar Map, or H-Polar Map for short.

NSNYU System v2.0: Long-Range Map with On-Line Learning

This version was operational between summer 2006 and winter 2006. The long-range obstacle detection system with on-line learning worked very well. The overall system worked OK, but never quite fulfilled our expectations. The main reason was excessive latency in the control loop.

Humans can easily locate pathways from monocular views, e.g. trails in a forest, holes in a row of bushes. Our second system represents an attempt to use on-line learning to provide the same capability to a mobile robot. Although supervised learning can be used for robot driving (see the Dave project), autonomous learning is far preferable. One long-advocated idea for autonomous learning in robots is to use the output of reliable modules (such as traversibility from stereo at short range) to provide labels for a trainable module (such as a long-range obstacle detector).

This has come to be known as "near to far learning".

Our proposed approach, designed for the LAGR robot, builds a distance-invariant pyramid of images at multiple scales, such that the appearance in the image at scale X of an object sitting on the ground X meters away is identical to the appearance in the image of scale Y of the same object when sitting on the ground Y meters away. The process is depicted at right. Windows are extracted from the image. The size of a window is inversely proportional to the estimated distance of the foot of obstacles that fall in the bottom third of the window (the distance is estimated through a ground plane fitting). The windows are then normalized to a fixed height.

More precisely, an pyramid of distance-normalized bands is built. Each band is parallel to the horizon, and contains the feet of obstacles at a given distance from the robot.

Each extracted band is then resized to make it a uniform height (12 pixels), resulting in image bands in which the appearance of an object on the ground is independent of its distance from the camera (only the band in which it appears varies). These uniform-height, variable-width bands form a size-normalized image pyramid whose 20 scales are separated by a factor of 2^(1/4). In order to trace the virtual lines on the ground on which the bands are centered, a ground plane is fitted to the stereo point cloud using a Hough transform.

The image bands from the pyramid are fed to a 2-layer classifier whose first layer is a series of 120 radial basis functions trained with K-means (unsupervised), and whose second layer is a linear classifier trained online as the robot runs.

The last layer of the convolutional net is trained on-line while the robot is running. It is trained on nearby bands (up to 10-12 meters) using labels provided by stereo vision, and is then applied to all the bands, including the ones that are outside of stereo range (up to 30-35 meters).

The classifier properly generalizes to far away bands since the objects in the distance-normalized bands appear with the same size regardless of their distance.

By classifying windows taken from the bands of the image pyramid, traversibility information can be directly mapped to specific world coordinates, since the distance to the center of each band is known. Thus, the pyramid provides the structure so that the long-range obstacle detector (OD) can generate accurate range maps.

The long-range OD goes through a labeling, training, and classification cycle on every frame. First, each overlapping 12x3 pixel RGB window from the right camera is assigned a traversibility label (ground or obstacle) if it is within stereo range (less than 10 meters) and if stereo data is available. Then feature vectors are computed for all windows over the entire pyramid. Each feature vector is comprised of euclidean distances or correlations between a 12x3 RGB window and 100 fixed prototypes trained in advance with an unsupervised learning algorithm. A logistic regression classifier is applied to the feature vectors, and trained using the labels provided by stereo. The resulting classifier is then applied to all feature vectors in the pyramid, including those with stereo labels.

The picture below shows examples of the maps generated by the long-range obstacle detector. The long-range OD not only yields surprisingly accurate traversibility information at distance up to 30 meters (far beyond stereo range), but also produces smooth, dense traversibility maps for areas that are within stereo range. The stereo-derived maps often have noisy spots or holes - disastrous to a path planner - but the adaptive long-range OD produces maps that are smooth and accurate, without holes or noise.

NSNYU System v1.0: Building a Polar Map from Stereo

Version 1.0 of the NSNYU system was operational from February 2006 to July 2006.

This system drives significantly faster than the LAGR baseline system (by a factor of 2.5), and was the fastest of all the LAGR contestants as of March 2006.

Our first system, while conventional in its use of stereo, attempts to solve the short-sightedness problem by separating the "tactical" aspect of driving and avoiding obstacles from the "strategic" aspect of map building and path planning. The basic design philosophy is to trust our eyes more than we trust our map.

Picking good candidate directions in which to drive must be done by looking around, not by looking at a map, and can be done without good distance estimates. Consider the image on the right, generating candidate directions can clearly be done, even though the absolute distance of the various obstacles may be unknown. It's enough to see that the right edge of the bush is straight ahead to figure out we should veer right.

The distance of the obstacles may be unknown, but the direction to take is obvious.
The main idea is to use stereo to build a local map of the visible environment in polar coordinates. The polar representation allows to represent angles (directions) with high accuracy (independently of distance), while increasing the surface area of map cells with distance, avoiding the problem created by empty map cells at large distances.

Once candidate directions are generated in the polar map, picking the best candidate can be done using traditional path planning on a map, starting from each of the candidate points.

System Architecture. Tactical driving decisions a performed in a 15-meter radius map in polar coordinates of the visible environment. This process runs at roughly 6 frames per second. Candidate waypoints from that local planning are sent to a global planner in a long-range cartesian map. The local planning finds candidates that are as close a possible to the current global route.

Our system is therefore based on the following steps:

Ground plane fitting. The ground plane is found by a robust fit of the point cloud obtained from stereo. Each yellow line on the right represents an imaginary line on the ground at a given distance from the camera. Point above the ground plane are colored red, and points on the ground plane are colored green.

Traversability Map. Once the ground plane is found, obstacles that are out of the groundplane (shown on red) can be discriminated from points near the ground plane (shown in turquoise). Click on the image for a larger view.

Click here for A video of the traversibility map extraction [2.6MB MPEG4 .AVI]

The practical range of simple stereo-based map building is limited for two reasons:
  • 1 it is difficult to estimate whether far-away points are near the ground or above the ground;
  • 2 the distance estimates are quite inaccurate for points more than 7 or 8 meters away from the camera.
To solve problem 1, we estimate the parameters of the ground plane by fitting a plane through the stereo point cloud. Two methods were used: Hough transform on point clouds in elevation, azimuth, disparity space; and EM-like robust plane fitting on point clouds in XYZ space. The traversibility of an area is estimated by measuring the density of points that are above the ground plane in that area.

Problem 2 is approached by noting that, while absolute range estimates of distant points are inaccurate, relative range estimates are relatively accurate, and azimuth estimates are very accurate.

This suggests that searching for good direction in which to drive the robot is better performed using a map of the visible environment represented in polar coordinates, rather than using a cartesian map of the entire ground. Our system identifies candidate waypoints up to 15 meters away in this local polar map, and uses them as starting points of a path finding algorithm in a global cartesian map.

An example of a 15-meter range stereo map is shown at right. The candidate waypoints are shown as hollow circles. Candidate waypoints are generated using a ray-casting algorithm followed by various selection heuristics. For example, candidate waypoints are generated near edges of obstacles.

Once candidate waypoint have been generated, a path-finding algorithm is run in the global map starting from each candidate waypoint. Given the current global map, the path finding algorithm finds the approximate shortest path from each condidate waypoint to the goal. The candidate waypoint with the shortest associated path to the goal is chosen.

Since the path-finding algorithm can take a long time to compute a path, it is run as a separate process. While the path is being updated, a fast-looping "autopilot" algorithm is used to steer the robot towards the candidate waypoint that is closest to the latest best path.

Example of polar map with candidate waypoints. The map has a range of 15 meters

Path Finding with Raystar

The path finding algorithm is a novel approximate algorithm based on ray casting, dubbed ``Raystar''.

The algorithm is built around the classical A* algorithm, except that candidate waypoints are generated from the current waypoints by casting rays at a predetermined distance. If a ray hits an obstacle, no candidate waypoint is generated on that ray. Various heuristics are used to place waypoints on rays. Those candidate waypoints are place on the A* "open list".

While the algorithm is suboptimal, it is considerably faster than traditional A*, in which new candidate waypoints are immediate neighboring cells of the current waypoints.

Example of run of the "raystar" path-planning algorithm on a synthetic map (click in image for larger version).

Example map building and path planning produced by the system on a real run (click in image for larger version).

Software Architecture

The LAGR robot contains three 1.4GHz Pentium M computers with flash drives (and no hard drive) running Linux. A fourth computer, which is not accessible to high-level software, takes care of low-level tasks (reading the sensors and preprocessing the signals, interacting with the radio control, and sending commands to the motors). The computers are connected through a Gigabit ethernet network and can talk to the outside world throug a WiFi bridge.

The first computers, called lagr-lefteye, is connected to the left stereo camera pair. The second one, called lagr-righteye to the right camera pair, and the third, lagr-planner is for general use.

Our software is entirely implemented in the Lush language. Lush is an object-oriented dialect of Lisp with numerous libraries for numerical computing, image processing, computer vision, machine learning. Lush has two main advantages for this project:

  • The Lush compiler can generate very efficient code for numerical functions.
  • It is extremely easy to integrate C/C++ code with Lush code. In fact, it is even possible to mix C/C++ code with Lush code within a single Lush function.
This allowed us to use interpreted Lush code for all the high-level processing, compiled Lush code for complex mid-level functions, C/C++ code for all the low-level numerical operations, and highly optimized library functions called from Lush for all the heavily repetitive numerical operations, such as low-level image processing and matrix function. We made heavy usage of the Intel Performance Library for this.

The system runs four Lush processes that communicate through Lush's "remote-lush" class. This mechanism allows two Lush processes to communicate data structures through Lush's built-in data serialization protocol. It also allows one Lush process to request the evaluation of a Lush expression by another Lush process (with blocking or non-blocking synchronization).

The main Lush process runs on the lagr-planner computer. Its main loop runs at an average of about 20Hz. Given the current best path to the goal and the current candidate waypoints, it picks the candidate waypoint that is closest to the best path, and steers the robot towards it, adjusting the speed if necessary. The speed is adjusted as a function of the curvature of the trajectory and the estimated distance to the neareast obstacle.

The main process controls Lush processes running on each eye computer. It sends them requests to compute the portion of the local map that falls within their field of view. Whenever the eye computers complete the processing of one frame, the main process integrates the to half maps coming from the two eye computers, builds the composite local map in polar representation, and computes the candidate waypoints. The process on each eye computer runs at about 7 frames per second.

The main process also sends requests to a fourth Lush process, running on the lagr-planner computer, to compute the best path to the goal, given the current set of candidate waypoints (using the "raystar" algorithm). The time taken by this path-finding process is highly variable. It depends on the size and complexity of the global map and can vary between 0.1 second and 5 seconds.

In addition, the main process also detects wheel slips and bumper hits. If a bumper hit or a long wheel slip is detected, the robot enters an obstacle avoidance behavior. It puts a lethal obstacle in its maps at the location of the hit, and backs up on its tracks for a given distance. It then recomputes candidate waypoints.

In addition to all this, a lightweight thread writes diagnostic information to a ramdisk file that is read by a display software running on a remote laptop computer while the robot is running.

Excessive latency in the vision control loop can lead to catastrophic behavior patterns, such as running into obstacles at high speed, or oscillating wildly around the prescribed driving direction. Therefore, the fast frame rate of video processing, and the quick update of the driving direction by the main process are key to the overall performance of the system. This is one of the reasons why we chose to run the stereo at relatively low resolution (320x240) compared to the maximum resolution of the cameras (1024x768).

Videos of NSNYU System v1.0 on Test Courses

These videos show the robot running along various courses autonomously. It is being controled by the v1.0 System, which uses stereo vision to build a 15-meter range polar map. No learning is involved here, beyond remembering the map from run to run.

This is faster than the LAGR Baseline System by a factor of 2 to 4.

finding its way between rows of trees (Sandy Hook Park, NJ, 02/2006) [AVI/MPEG4] (4.8MB)
Avoiding brushes to find a pathway leading to the goal (Sandy Hook, NJ, 02/2006) [AVI/MPEG4] (5.2MB)
Avoiding trees, fences, and other obstacles: the long way (Holmdel Park, NJ, 02/2006) [AVI/MPEG4] (18.6MB)
Avoiding trees, fences, and other obstacles: a shorter way (Holmdel Park, NJ, 02/2006) [AVI/MPEG4] (16.8MB)
Avoiding trees, fences, and other obstacles: a better way, with a tree hit (Holmdel Park, NJ, 02/2006) [AVI/MPEG4] (12.9MB)

example of map generated as the robot drives around the park near the tennis courts.

Videos of NSNYU System v1.0 during the Official LAGR Test 11

These videos shows the three official runs of the robot running the record-breaking NYU/Net-Scale software during the LAGR Test 11 campaign in Feb 2006.
Run 1. Time: 5:10. The robot took the "shortcut" on the right and wasted time in the cul-de-sac. At one point, the circuit breakers of the motors tripped and had to be reset. [WMV] (50.7MB)
Run 2. Time: 2:55. Remembering the map from the previous run, the robot takes the mulch path. Wheel slips towards the end were interpreted by the robot as caused by hard obstacle hits. This caused the robot to back up and waste a bit of time. /td> [WMV] (25.6MB)
Run 3. Time: 2:29. The software decided to erase its map because the GPS fix had drifted too much. The robot took the "shortcut" again. It fell into the cul-de-sac but got out of it quite quickly. This was the fastest run among all the teams for Test 11. [WMV] (23.7MB)

"Unofficial Race: NYU/Net-Scale versus U. Penn"

On March 31 2006, an informal race was organized between the Net-Scale/NYU team and the U. Penn team. The race took place on the grounds of the Crawford Hill Bell Labs location, near the radio-telescope antenna used by Wilson and Penzias to discover the background cosmic radiation.


(click on pictures for larger versions).
Some distinguished members of the cheering audience. Many are current and former AT&T Labs and Bell Labs scientists who wouldn't miss an afternoon of uber-geeky fun.
The two robots on the starting line. NYU/Net-Scale on the left, U. Penn on the right. Jan Ben from the NS/NYU team holds the transmitter, ready to switch his robot to autonomous mode
The robots are neck-to-neck during most of Heat 1. U. Penn is slightly ahead, and will exit the fenced area ahead of NS/NYU.


The judge with the red hat near the finish line is Vladimir Vapnik.
Heat 1. U. Penn wins: U. Penn pulls ahead and blocks the passage to the NYU/NS robot. U. Penn pulls out of the cul-de-sac in the fenced area ahead of NS/NYU and wins the heat. [MPG] (30.6MB)
another shot of the same heat (lower quality): [AVI] (19.3MB)
Heat 2. NS/NYU wins: NS/NYU is off to a great start and easily wins the heat. [MPG] (20.5MB)
Heat 3. NS/NYU wins: In a slightly easier course, NS/NYU pulls ahead at the beginning. U. Penn takes a wrong turn and gets snagged in front of the fence area. [MPG] (10.0MB)
Heat 4. U.Penn wins: NS/NYU is off to a good start and pulls ahead quickly while U. Penn slowly finds its way through the first row of obtacles. Unexpectedly, NS/NYU goes to the back of the fence area and traps itself in a mound of dead leaves, spinning it wheel hopelessly. U. Penn joins it briefly, but eventually frees itself from the leaves and wins the heat, while NS/NYU still digs its wheels deeper and deeper into the leaves. [MPG] (36.5MB)
Heat 5. NS/NYU wins: NS/NYU pulls ahead and wins the heat, while U. Penn slowly finds its way through the first row of obstacles. [MPG] (14.5MB)

After a close race, NS/NYU wins over U. Penn 3 to 2.

The two robots in front of the Big Bang Antenna. Yann LeCun (left) receives the winner's trophy from the DARPA/LAGR program manager Larry Jackel (wearing the red hat)
The NS/NYU robot wears permanent scars of the fierce battle with its rival from Philadelphia.

More Pictures

The Joy of Field Robotics in Winter

LAGR in New York City

Pictures are video clips are available here, and here.