Graham Taylor

Conditional Restricted Boltzmann Machines

We have emphasized that models with distributed hidden state are necessary for efficiently modeling complex time series. But using distributed representations for hidden state in directed models of time series (Bayes nets) makes inference difficult in all but the simplest models. If, however, we use a Restricted Boltzmann Machine to model the probability distribution of the observation vector at each time frame, the posterior over latent variables factorizes completely, making inference easy. In this chapter, we introduce the Conditional Restricted Boltzmann Machine (CRBM). The CRBM extends the RBM to capture temporal dependencies yet maintains its most important computational properties: simple, exact inference and efficient approximate learning using the contrastive divergence algorithm.
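To make the "posterior factorizes completely" claim concrete, here is a minimal NumPy sketch. It assumes the common CRBM parameterization with unit-variance Gaussian visible units and binary hidden units, where the past frames enter only through a dynamic hidden bias. The names W, B, b, hidden_posterior and brute_force_joint are illustrative and not taken from the original code. The sketch compares the closed-form per-unit probabilities against a brute-force enumeration of every hidden configuration.

```python
import itertools
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_posterior(v_t, history, W, B, b):
    """p(h_j = 1 | v_t, history) for every hidden unit j.

    Assumed shapes: v_t (n_vis,), history (n_vis * order,),
    W (n_vis, n_hid), B (n_vis * order, n_hid), b (n_hid,).
    """
    b_dyn = b + history @ B          # dynamic bias contributed by past frames
    return sigmoid(b_dyn + v_t @ W)  # one independent Bernoulli per hidden unit

def brute_force_joint(v_t, history, W, B, b):
    """Joint posterior over all 2**n_hid hidden configurations (tiny models only)."""
    act = b + history @ B + v_t @ W
    configs = np.array(list(itertools.product([0.0, 1.0], repeat=act.size)))
    joint = np.exp(configs @ act)    # unnormalized probability of each configuration
    return configs, joint / joint.sum()

rng = np.random.default_rng(0)
n_vis, n_hid, order = 4, 3, 2        # toy sizes, chosen only for illustration
W = 0.1 * rng.standard_normal((n_vis, n_hid))
B = 0.1 * rng.standard_normal((n_vis * order, n_hid))
b = np.zeros(n_hid)
v_t = rng.standard_normal(n_vis)
history = rng.standard_normal(n_vis * order)

p_h = hidden_posterior(v_t, history, W, B, b)
configs, joint = brute_force_joint(v_t, history, W, B, b)

# Product of the per-unit Bernoulli probabilities for each configuration.
factorized = np.prod(np.where(configs == 1.0, p_h, 1.0 - p_h), axis=1)
print(np.allclose(joint, factorized))
```

The enumeration costs 2^n_hid operations while the factorized form costs a single matrix product, which is why inference stays cheap even with hundreds of hidden units.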


4.5.1 Generation of walking and running sequences from a single model

A third-order CRBM with 200 hidden units is trained on 2813 frames of walking and running data from subject 35 in the CMU Motion Capture Database. The model can generate both walking and running depending on how it is initialized.
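The role of initialization can be sketched as follows. This is a simplified illustration, not the code used to produce the demo: it assumes unit-variance Gaussian visible units, sets each visible frame to its conditional mean instead of sampling it, and uses random toy weights in place of a trained model. The names generate, params and n_gibbs are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def generate(init_frames, n_new, params, n_gibbs=30, rng=None):
    """Roll a CRBM forward from `order` initial frames.

    init_frames: (order, n_vis). params = (W, A, B, a, b) with assumed shapes
    W (n_vis, n_hid), A (n_vis * order, n_vis), B (n_vis * order, n_hid).
    """
    W, A, B, a, b = params
    if rng is None:
        rng = np.random.default_rng()
    order = init_frames.shape[0]
    frames = [f.copy() for f in init_frames]
    for _ in range(n_new):
        # Concatenate the last `order` frames, most recent first.
        history = np.concatenate(frames[-order:][::-1])
        a_dyn = a + history @ A      # dynamic visible bias
        b_dyn = b + history @ B      # dynamic hidden bias
        v = frames[-1].copy()        # start the chain at the previous frame
        for _ in range(n_gibbs):
            p_h = sigmoid(b_dyn + v @ W)
            h = (rng.random(p_h.shape) < p_h).astype(float)
            v = a_dyn + W @ h        # conditional mean of the visible units
        frames.append(v)
    return np.stack(frames)

rng = np.random.default_rng(0)
n_vis, n_hid, order = 5, 8, 3        # third-order model, toy sizes
params = (
    0.1 * rng.standard_normal((n_vis, n_hid)),
    0.1 * rng.standard_normal((n_vis * order, n_vis)),
    0.1 * rng.standard_normal((n_vis * order, n_hid)),
    np.zeros(n_vis),
    np.zeros(n_hid),
)
init_a = rng.standard_normal((order, n_vis))  # stand-in for three walking frames
init_b = rng.standard_normal((order, n_vis))  # stand-in for three running frames
seq_a = generate(init_a, 10, params, rng=np.random.default_rng(1))
seq_b = generate(init_b, 10, params, rng=np.random.default_rng(1))
```

Because the initial frames enter every subsequent step through the dynamic biases, the same parameters produce different sequences from different starting frames.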


4.5.2 Learning transitions between walking and running

A CDBN with two hidden layers (each layer is third-order and has 200 stochastic binary units) is trained on 2515 frames of walking and jogging from Eugene Hsu's dataset. The model is able to stochastically transition between the two gaits.


4.5.3 Introducing transitions using noise

We demonstrate how transitions between different types of motion content can be generated even when such transitions are absent in the data.


4.5.4 Learning motion style

Model with 1 hidden layer

Here we show that a single CRBM can capture many different styles of walking and generate each one depending on how it is initialized. We use 1200 hidden units, a penalty term that encourages sparse hidden activations, and 12 delay taps for this 60 fps motion. The model is trained on data from subject 137 in the CMU Motion Capture Database.
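As an illustration of what 12 delay taps means in practice, the sketch below pairs each frame with the concatenation of the 12 frames preceding it. At 60 fps this window covers 12 / 60 = 0.2 seconds of motion. The helper name stack_delay_taps and the toy data are assumptions made for this example.

```python
import numpy as np

def stack_delay_taps(frames, n_taps):
    """Pair every frame with its n_taps predecessors.

    frames: (n_frames, n_dims). Returns targets (n_frames - n_taps, n_dims)
    and histories (n_frames - n_taps, n_taps * n_dims), most recent tap first.
    """
    n_frames = frames.shape[0]
    targets = frames[n_taps:]
    histories = np.stack([
        frames[t - n_taps:t][::-1].reshape(-1)
        for t in range(n_taps, n_frames)
    ])
    return targets, histories

# Toy sequence: 20 frames with 3 dimensions each.
frames = np.arange(20 * 3, dtype=float).reshape(20, 3)
targets, histories = stack_delay_taps(frames, n_taps=12)

window_seconds = 12 / 60.0  # duration covered by 12 taps at 60 fps
```

The first 12 frames have no complete history, so they serve only as context and never as targets.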


Model with 2 hidden layers

A 2-layer CDBN with 600 hidden units per layer can capture the same styles as the 1-layer CRBM and, in addition, generate "old man" style walking. Which style is generated depends purely on initialization. Note that there is more variety in the movements than with the 1-layer CRBM (this is apparent in the "cat" sequence), but there is still a spurious (and non-smooth) transition, this time in the "gangly" sequence.


4.5.5 Filling in missing data

The CRBM can also fill in missing data online. Here, we have deleted data (left leg, upper body) halfway through the sequence and the model fills in the missing joint angles.
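One way such filling-in could work is sketched below: the observed dimensions of the current frame stay clamped while Gibbs sampling updates only the missing ones. This is an assumption about the procedure, with unit-variance Gaussian visible units, random toy weights standing in for a trained model, and hypothetical names such as fill_in_frame and missing.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fill_in_frame(v_obs, missing, history, params, n_gibbs=30, rng=None):
    """Fill the missing dimensions of one frame.

    v_obs: (n_vis,) frame whose entries at `missing` are unknown.
    missing: boolean mask (n_vis,). history: (n_vis * order,), most recent
    frame first. params = (W, A, B, a, b).
    """
    W, A, B, a, b = params
    if rng is None:
        rng = np.random.default_rng()
    a_dyn = a + history @ A          # dynamic visible bias
    b_dyn = b + history @ B          # dynamic hidden bias
    v = v_obs.copy()
    # Start the missing dimensions at their values in the previous frame.
    v[missing] = history[:v.size][missing]
    for _ in range(n_gibbs):
        p_h = sigmoid(b_dyn + v @ W)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        v_mean = a_dyn + W @ h
        v[missing] = v_mean[missing]  # observed dimensions stay clamped
    return v

rng = np.random.default_rng(0)
n_vis, n_hid, order = 6, 8, 3        # toy sizes
params = (
    0.1 * rng.standard_normal((n_vis, n_hid)),
    0.1 * rng.standard_normal((n_vis * order, n_vis)),
    0.1 * rng.standard_normal((n_vis * order, n_hid)),
    np.zeros(n_vis),
    np.zeros(n_hid),
)
history = rng.standard_normal(n_vis * order)
v_obs = rng.standard_normal(n_vis)
missing = np.array([True, True, False, False, False, False])
v_obs[missing] = 0.0                 # placeholder values for the deleted joints
filled = fill_in_frame(v_obs, missing, history, params, rng=rng)
```

To run online, the filled frame would be appended to the history before the next frame is processed, so earlier estimates inform later ones.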


4.5.6 Video texture

We demonstrate that the CRBM is applicable to high-dimensional data beyond mocap.
