(SP22-CS 598) Efficient & Predictive Vision: Schedule

Schedule (Tentative)

We will typically cover two papers in each class. We may update the following tentative schedule as the course progresses.

Date Presenter Topic Papers Slides
Jan 18 Liangyan Gui Introduction
Jan 20 Liangyan Gui Teaser: Towards Human-Like Motion Prediction
Jan 25 Shengcao Cao Teaser: Efficient Neural Networks
Part 1.1: Predictive Vision - Basics
Jan 27 Yunze Man Classical Methods J. M. Wang, D. J. Fleet, and A. Hertzmann. Gaussian Process Dynamical Models. NeurIPS, 2005.

K. M. Kitani, B. D. Ziebart, J. A. Bagnell, and M. Hebert. Activity Forecasting. ECCV, 2012.
Feb 1 Deep Learning Methods J. Martinez, M. J. Black, and J. Romero. On human motion prediction using recurrent neural networks. CVPR, 2017.

M. Liang, B. Yang, R. Hu, Y. Chen, R. Liao, S. Feng, and R. Urtasun. Learning Lane Graph Representations for Motion Forecasting. ECCV, 2020.
Feb 3 Short-term vs. Long-term R. Villegas, J. Yang, Y. Zou, S. Sohn, X. Lin, and H. Lee. Learning to Generate Long-term Future via Hierarchical Prediction. PMLR, 2017.

K. Pertsch, O. Rybkin, F. Ebert, C. Finn, D. Jayaraman, and S. Levine. Long-horizon visual planning with goal-conditioned hierarchical predictors. NeurIPS, 2020.
Feb 8 Deterministic vs. Probabilistic N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. S. Torr, M. Chandraker. DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents. CVPR, 2017.

Y. Chai, B. Sapp, M. Bansal, D. Anguelov. MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction. CoRL, 2019.
Feb 10 Single-agent vs. Multi-agent A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, S. Savarese. Social LSTM: Human Trajectory Prediction in Crowded Spaces. CVPR, 2016.

J. Gao, C. Sun, H. Zhao, Y. Shen, D. Anguelov, C. Li, C. Schmid. VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation. CVPR, 2020.
Feb 15 Single-modality vs. Multi-modality R. Li, S. Yang, D. A. Ross, A. Kanazawa. AI Choreographer: Music Conditioned 3D Dance Generation with AIST++. ICCV, 2021.

M. Shah, Z. Huang, A. Laddha, M. Langford, B. Barber, S. Zhang, C. Vallespi-Gonzalez, R. Urtasun. LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar Fusion. CoRL, 2020.
Feb 17 Functions of prediction T. Han, W. Xie, A. Zisserman. Memory-augmented Dense Predictive Coding for Video Representation Learning. ECCV, 2020.

A. van den Oord, Y. Li, O. Vinyals. Representation Learning with Contrastive Predictive Coding. arXiv, 2018.
Part 1.2: Predictive Vision - Different types of prediction
Feb 22 Human pose prediction Z. Cao, H. Gao, K. Mangalam, Q.-Z. Cai, M. Vo, and J. Malik. Long-term Human Motion Prediction with Scene Context. ECCV, 2020.

M. Hassan, D. Ceylan, R. Villegas, J. Saito, J. Yang, Y. Zhou, M. Black. Stochastic Scene-Aware Motion Prediction. ICCV, 2021.
Feb 24 Trajectory prediction T. Salzmann, B. Ivanovic, P. Chakravarty, M. Pavone. Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data. ECCV, 2020.

K. Mangalam, H. Girase, S. Agarwal, K.-H. Lee, E. Adeli, J. Malik, A. Gaidon. It Is Not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction. ECCV, 2020.
March 1 Intent prediction H. Girase, H. Gang, S. Malla, J. Li, A. Kanehara, K. Mangalam, C. Choi. LOKI: Long Term and Key Intentions for Trajectory Prediction. ICCV, 2021.

B. Liu, E. Adeli, Z. Cao, K.-H. Lee, A. Shenoi, A. Gaidon, and J. C. Niebles. Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction. ICRA, 2020.
March 3 Video prediction J. Walker, A. Gupta, and M. Hebert. Patch to the Future: Unsupervised Visual Prediction. CVPR, 2014.

N. Bodla, G. Shrivastava, R. Chellappa, and A. Shrivastava. Hierarchical Video Prediction using Relational Layouts for Human-Object Interactions. CVPR, 2021.
March 8 Action prediction C. Sun, A. Shrivastava, C. Vondrick, R. Sukthankar, K. Murphy, C. Schmid. Relational Action Forecasting. CVPR, 2019.

C. Vondrick, H. Pirsiavash, A. Torralba. Anticipating Visual Representations from Unlabeled Video. CVPR, 2016.
March 10 Segmentation prediction C. Graber, G. Tsai, M. Firman, G. Brostow, A. Schwing. Panoptic Segmentation Forecasting. CVPR, 2021.

P. Luc, C. Couprie, Y. LeCun, J. Verbeek. Predicting Future Instance Segmentation by Forecasting Convolutional Features. ECCV, 2018.
March 15 No Class Break
March 17 No Class Break
Part 1.3: Predictive Vision - Applications
March 22 VR/AR, Robotics R. Henrikson, T. Grossman, S. Trowbridge, D. Wigdor, and H. Benko. Head-Coupled Kinematic Template Matching: A Prediction Model for Ray Pointing in VR. CHI, 2020.

J. Bütepage, H. Kjellström, D. Kragic. Anticipating Many Futures: Online Human Motion Prediction and Synthesis for Human-Robot Collaboration. ICRA, 2018.
March 24 Autonomous driving, Graphics K. Chitta, A. Prakash, A. Geiger. NEAT: Neural Attention Fields for End-to-End Autonomous Driving. ICCV, 2021.

S. Starke, H. Zhang, T. Komura, J. Saito. Neural State Machine for Character-Scene Interactions. SIGGRAPH Asia, 2019.
March 29 Healthcare, NLP G. Lee, K. Nho, B. Kang, K.-A. Sohn, and D. Kim. Predicting Alzheimer’s Disease Progression Using Multi-Modal Deep Learning Approach. Scientific Reports, 2019.

J. Devlin, M.-W. Chang, K. Lee, K. Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL, 2019.
Part 2.1: Efficient Vision - Network Compression
March 31 Pruning S. Han, J. Pool, J. Tran, W. J. Dally. Learning both Weights and Connections for Efficient Neural Networks. NeurIPS, 2015.

J. Frankle, M. Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. ICLR, 2019.
April 5 Knowledge Distillation G. Hinton, O. Vinyals, J. Dean. Distilling the Knowledge in a Neural Network. NeurIPS Workshop, 2014.

T. Furlanello, Z. C. Lipton, M. Tschannen, L. Itti, A. Anandkumar. Born Again Neural Networks. ICML, 2018.
April 7 Invited Talk: Yuandong Tian
April 12 Quantization S. Gupta, A. Agrawal, K. Gopalakrishnan, P. Narayanan. Deep Learning with Limited Numerical Precision. PMLR, 2015.

M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. ECCV, 2016.
April 14 Invited Talk: Jean Mercat Deep learning to forecast road agent trajectories
April 19 Combined Methods S. Han, H. Mao, W. J. Dally. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR, 2016.

Y. He, J. Lin, Z. Liu, H. Wang, L. Li, S. Han. AMC: AutoML for Model Compression and Acceleration on Mobile Devices. ECCV, 2018.
Part 2.2: Efficient Vision - Efficient Architectures
April 21 A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv, 2017.

X. Zhang, X. Zhou, M. Lin, J. Sun. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. CVPR, 2018.
April 26 M. Tan, R. Pang, Q. V. Le. EfficientDet: Scalable and Efficient Object Detection. CVPR, 2020.

M. Zaheer, G. Guruganesh, A. Dubey, J. Ainslie, C. Alberti, S. Ontanon, P. Pham, A. Ravula, Q. Wang, L. Yang, A. Ahmed. Big Bird: Transformers for Longer Sequences. NeurIPS, 2020.
Part 2.3: Efficient Vision - Neural Architecture Search
April 28 B. Zoph, Q. V. Le. Neural Architecture Search with Reinforcement Learning. ICLR, 2017.

H. Liu, K. Simonyan, Y. Yang. DARTS: Differentiable Architecture Search. ICLR, 2019.
May 3 A. Zela, T. Elsken, T. Saikia, Y. Marrakchi, T. Brox, F. Hutter. Understanding and Robustifying Differentiable Architecture Search. ICLR, 2020.

J. Mellor, J. Turner, A. Storkey, E. J. Crowley. Neural Architecture Search without Training. arXiv, 2020.
May 5 No Class Break
May 11 7-10pm Final Project Presentations