- Predict the visual features of one future frame [1]
- Predict the optical flow of one future frame [2]
- Predict one future frame [4] (a special case of video prediction)
- Predict future trajectories [5]
- Predict the optical flows of future frames, then synthesize the future frames from them [3]
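
The last item above (predict flow first, then obtain frames) relies on a warping step that turns a flow field into a frame. The snippet below is a minimal sketch of that step only, not the method of [3]: it assumes a backward flow convention (each output pixel stores the offset to its source pixel), nearest-neighbour sampling, and border clamping. The function name `warp_frame` and the toy data are hypothetical, chosen for illustration.

```python
def warp_frame(frame, flow):
    """Backward-warp a frame with a dense flow field (nearest-neighbour sampling).

    frame: H x W nested list of pixel values (the current frame).
    flow:  H x W nested list of (dx, dy) pairs; flow[y][x] points from pixel
           (x, y) of the predicted frame back to its source in `frame`
           (assumed convention, not taken from any cited paper).
    """
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = flow[y][x]
            # Round to the nearest source pixel and clamp to the image border.
            sx = min(max(int(round(x + dx)), 0), w - 1)
            sy = min(max(int(round(y + dy)), 0), h - 1)
            row.append(frame[sy][sx])
        out.append(row)
    return out


# Toy example: every output pixel samples its left neighbour,
# so the image content shifts one pixel to the right.
frame = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
flow = [[(-1.0, 0.0)] * 4 for _ in range(2)]
next_frame = warp_frame(frame, flow)
```

In practice the flow would come from a learned predictor and the sampling would usually be bilinear so that the warp is differentiable; the nearest-neighbour version here only illustrates the geometry.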

References

- [1] Vondrick, Carl, Hamed Pirsiavash, and Antonio Torralba. “Anticipating visual representations from unlabeled video.” CVPR, 2016.
- [2] Gao, Ruohan, Bo Xiong, and Kristen Grauman. “Im2flow: Motion hallucination from static images for action recognition.” CVPR, 2018.
- [3] Li, Yijun, et al. “Flow-grounded spatial-temporal video prediction from still images.” ECCV, 2018.
- [4] Xue, Tianfan, et al. “Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks.” NIPS, 2016.
- [5] Walker, Jacob, et al. “An uncertain future: Forecasting from static images using variational autoencoders.” ECCV, 2016.