(SP22-CS 598) Efficient & Predictive Vision: Schedule

Schedule (Tentative)

We will typically cover two papers in each class. We may update the following tentative schedule as the course progresses.

Date	Presenter	Topic	Papers	Slides
Jan 18	Liangyan Gui	Introduction
Jan 20	Liangyan Gui	Teaser: Towards Human-Like Motion Prediction
Jan 25	Shengcao Cao	Teaser: Efficient Neural Networks
Part 1.1: Predictive Vision - Basics
Jan 27	Yunze Man	Classical Methods	J. M. Wang, D. J. Fleet, and A. Hertzmann. Gaussian Process Dynamical Models. NeurIPS, 2005. K. M. Kitani, B. D. Ziebart, J. A. Bagnell, and M. Hebert. Activity Forecasting. ECCV, 2012.
Feb 1		Deep Learning Methods	J. Martinez, M. J. Black, and J. Romero. On human motion prediction using recurrent neural networks. CVPR, 2017. M. Liang, B. Yang, R. Hu, Y. Chen, R. Liao, S. Feng, and R. Urtasun. Learning Lane Graph Representations for Motion Forecasting. ECCV, 2020.
Feb 3		Short-term vs. Long-term	R. Villegas, J. Yang, Y. Zou, S. Sohn, X. Lin, and H. Lee. Learning to Generate Long-term Future via Hierarchical Prediction. ICML, 2017. K. Pertsch, O. Rybkin, F. Ebert, C. Finn, D. Jayaraman, and S. Levine. Long-horizon visual planning with goal-conditioned hierarchical predictors. NeurIPS, 2020.
Feb 8		Deterministic vs. Probabilistic	N. Lee, W. Choi, P. Vernaza, C. B. Choy, P. H. S. Torr, M. Chandraker. DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents. CVPR, 2017. Y. Chai, B. Sapp, M. Bansal, D. Anguelov. MultiPath: Multiple Probabilistic Anchor Trajectory Hypotheses for Behavior Prediction. CoRL, 2019.
Feb 10		Single-agent vs. Multi-agent	A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, S. Savarese. Social LSTM: Human Trajectory Prediction in Crowded Spaces. CVPR, 2016. J. Gao, C. Sun, H. Zhao, Y. Shen, D. Anguelov, C. Li, C. Schmid. VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation. CVPR, 2020.
Feb 15		Single-modality vs. Multi-modality	R. Li, S. Yang, D. A. Ross, A. Kanazawa. AI Choreographer: Music Conditioned 3D Dance Generation with AIST++. ICCV, 2021. M. Shah, Z. Huang, A. Laddha, M. Langford, B. Barber, S. Zhang, C. Vallespi-Gonzalez, R. Urtasun. LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar Fusion. CoRL, 2020.
Feb 17		Functions of prediction	T. Han, W. Xie, A. Zisserman. Memory-augmented Dense Predictive Coding for Video Representation Learning. ECCV, 2020. A. van den Oord, Y. Li, O. Vinyals. Representation Learning with Contrastive Predictive Coding. arXiv, 2018.
Part 1.2: Predictive Vision - Different types of prediction
Feb 22		Human pose prediction	Z. Cao, H. Gao, K. Mangalam, Q.-Z. Cai, M. Vo, and J. Malik. Long-term Human Motion Prediction with Scene Context. ECCV, 2020. M. Hassan, D. Ceylan, R. Villegas, J. Saito, J. Yang, Y. Zhou, M. Black. Stochastic Scene-Aware Motion Prediction. ICCV, 2021.
Feb 24		Trajectory prediction	T. Salzmann, B. Ivanovic, P. Chakravarty, M. Pavone. Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data. ECCV, 2020. K. Mangalam, H. Girase, S. Agarwal, K.-H. Lee, E. Adeli, J. Malik, A. Gaidon. It Is Not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction. ECCV, 2020.
March 1		Intent prediction	H. Girase, H. Gang, S. Malla, J. Li, A. Kanehara, K. Mangalam, C. Choi. LOKI: Long Term and Key Intentions for Trajectory Prediction. ICCV, 2021. B. Liu, E. Adeli, Z. Cao, K.-H. Lee, A. Shenoi, A. Gaidon, and J. C. Niebles. Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction. ICRA, 2020.
March 3		Video prediction	J. Walker, A. Gupta, and M. Hebert. Patch to the Future: Unsupervised Visual Prediction. CVPR, 2014. N. Bodla, G. Shrivastava, R. Chellappa, and A. Shrivastava. Hierarchical Video Prediction using Relational Layouts for Human-Object Interactions. CVPR, 2021.
March 8		Action prediction	C. Sun, A. Shrivastava, C. Vondrick, R. Sukthankar, K. Murphy, C. Schmid. Relational Action Forecasting. CVPR, 2019. C. Vondrick, H. Pirsiavash, A. Torralba. Anticipating Visual Representations from Unlabeled Video. CVPR, 2016.
March 10		Segmentation prediction	C. Graber, G. Tsai, M. Firman, G. Brostow, A. Schwing. Panoptic Segmentation Forecasting. CVPR, 2021. P. Luc, C. Couprie, Y. LeCun, J. Verbeek. Predicting Future Instance Segmentation by Forecasting Convolutional Features. ECCV, 2018.
March 15	No Class	Break
March 17	No Class	Break
Part 1.3: Predictive Vision - Applications
March 22		VR/AR, Robotics	R. Henrikson, T. Grossman, S. Trowbridge, D. Wigdor, and H. Benko. Head-Coupled Kinematic Template Matching: A Prediction Model for Ray Pointing in VR. CHI, 2020. J. Bütepage, H. Kjellström, D. Kragic. Anticipating Many Futures: Online Human Motion Prediction and Synthesis for Human-Robot Collaboration. ICRA, 2018.
March 24		Autonomous driving, Graphics	K. Chitta, A. Prakash, A. Geiger. NEAT: Neural Attention Fields for End-to-End Autonomous Driving. ICCV, 2021. S. Starke, H. Zhang, T. Komura, J. Saito. Neural State Machine for Character-Scene Interactions. SIGGRAPH Asia, 2019.
March 29		Healthcare, NLP	G. Lee, K. Nho, B. Kang, K.-A. Sohn, and D. Kim. Predicting Alzheimer’s Disease Progression Using Multi-Modal Deep Learning Approach. Scientific Reports, 2019. J. Devlin, M.-W. Chang, K. Lee, K. Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL, 2019.
Part 2.1: Efficient Vision - Network Compression
March 31		Pruning	S. Han, J. Pool, J. Tran, W. J. Dally. Learning both Weights and Connections for Efficient Neural Networks. NeurIPS, 2015. J. Frankle, M. Carbin. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. ICLR, 2019.
April 5		Knowledge Distillation	G. Hinton, O. Vinyals, J. Dean. Distilling the Knowledge in a Neural Network. NeurIPS Workshop, 2014. T. Furlanello, Z. C. Lipton, M. Tschannen, L. Itti, A. Anandkumar. Born Again Neural Networks. ICML, 2018.
April 7		Invited Talk: Yuandong Tian
April 12		Quantization	S. Gupta, A. Agrawal, K. Gopalakrishnan, P. Narayanan. Deep Learning with Limited Numerical Precision. ICML, 2015. M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. ECCV, 2016.
April 14		Invited Talk: Jean Mercat	Deep learning to forecast road agent trajectories
April 19		Combined Methods	S. Han, H. Mao, W. J. Dally. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR, 2016. Y. He, J. Lin, Z. Liu, H. Wang, L.-J. Li, S. Han. AMC: AutoML for Model Compression and Acceleration on Mobile Devices. ECCV, 2018.
Part 2.2: Efficient Vision - Efficient Architectures
April 21			A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv, 2017. X. Zhang, X. Zhou, M. Lin, J. Sun. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. CVPR, 2018.
April 26			M. Tan, R. Pang, Q. V. Le. EfficientDet: Scalable and Efficient Object Detection. CVPR, 2020. M. Zaheer, G. Guruganesh, A. Dubey, J. Ainslie, C. Alberti, S. Ontanon, P. Pham, A. Ravula, Q. Wang, L. Yang, A. Ahmed. Big Bird: Transformers for Longer Sequences. NeurIPS, 2020.
Part 2.3: Efficient Vision - Neural Architecture Search
April 28			B. Zoph, Q. V. Le. Neural Architecture Search with Reinforcement Learning. ICLR, 2017. H. Liu, K. Simonyan, Y. Yang. DARTS: Differentiable Architecture Search. ICLR, 2019.
May 3			A. Zela, T. Elsken, T. Saikia, Y. Marrakchi, T. Brox, F. Hutter. Understanding and Robustifying Differentiable Architecture Search. ICLR, 2020. J. Mellor, J. Turner, A. Storkey, E. J. Crowley. Neural Architecture Search without Training. arXiv, 2020.
May 5	No Class	Break
May 11	7-10pm	Final Project Presentations