This page shows video predictions using the method presented in the paper Deep multi-scale video prediction beyond mean square error. The two frames with a red border are the predictions; the others are the real input frames. The predictions are made with the Adversarial+GDL network, which takes four frames as input and predicts a single one. The second prediction is obtained by feeding the first prediction back in as an input.
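This recursive two-step procedure can be sketched as follows. This is a minimal illustration, not the paper's code: the `predict_recursively` helper and the toy averaging model are assumptions standing in for the trained Adversarial+GDL network.

```python
import numpy as np

def predict_recursively(model, frames, n_predictions=2):
    """Predict future frames one at a time, feeding each prediction
    back in as the newest frame of a sliding 4-frame input window."""
    window = list(frames[-4:])          # the network conditions on 4 frames
    predictions = []
    for _ in range(n_predictions):
        pred = model(np.stack(window))  # predict a single next frame
        predictions.append(pred)
        window = window[1:] + [pred]    # drop oldest frame, append prediction
    return predictions

# Toy stand-in model (assumption): the mean of the 4 input frames.
toy_model = lambda stack: stack.mean(axis=0)

frames = [np.full((32, 32), float(i)) for i in range(4)]  # dummy frames
preds = predict_recursively(toy_model, frames, n_predictions=2)
print(len(preds), preds[0].shape)  # → 2 (32, 32)
```

The second predicted frame depends on the first, so errors compound: this is why recursive predictions typically degrade with each step.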
A comparison of all models presented in the paper can be found on this page.
A comparison of this model with optical flow extrapolations and results from Ranzato et al. can be found on this page.
The code for training and testing is available on GitHub.
The next four predictions use the sequences from the paper Video (language) modeling: a baseline for generative models of natural videos.