04833510 Computer Vision and Deep Learning

Instructor:

Prof. Yadong Mu : myd@pku.edu.cn

Teaching Assistants:

Mr. LI Jinghan : li.jh@stu.pku.edu.cn
Mr. MU Yanchen : muyanchen@stu.pku.edu.cn

Location: Room 207, Teaching Building 1, Peking University

Time: Friday 15:10 - 17:00 (every week)

Office hours: Email or send a WeChat message to the instructor or a TA to arrange a face-to-face Q&A session on course-related questions.

Schedule of Lectures

Date	Topics
Feb 21, 2025	Introduction - I: Course logistics; Introduction to computer vision: illustrative applications and demos
Feb 28, 2025	Introduction - II: Basics of machine learning; Deep Learning: history, key concepts, back-propagation, neural layers etc.
March 7, 2025	Visual Recognition - I: Visual Recognition: Task Definition and Challenges; Visual Features: Harris Corner, SIFT, HOG etc.; Bag-of-words Models; Spatial Pyramid Matching; Pyramid Match Kernel; Vocabulary Tree; Sparse Coding
March 14, 2025	Visual Recognition - II: Deep Learning for Visual Recognition: LeNet-5, AlexNet, VGG-16, GoogLeNet, ResNet, DenseNet; Network Visualization
March 21, 2025	Object Detection - I: V-J Face Detector (Integral Image, AdaBoost, Cascade); HOG+SVM with NMS; Deformable Part Model (DPM) for Pedestrian Detection
March 28, 2025	Object Detection - II: ZFNet (i.e., Zeiler and Fergus (2013)); R-CNN; Fast R-CNN; Faster R-CNN; YOLO, SSD; Feature Pyramid Network
April 4, 2025	Holiday - no class
April 11, 2025	Pixel Computing - I: Pixel Labeling: Segmentation, Matting, Parsing; Unsupervised Image Segmentation: K-means, Mean-Shift, Normalized Cut; Interactive Object Cutout: GraphCut, GrabCut, LazySnapping; Image Matting: Poisson Matting; Image Co-segmentation; Image Inpainting / Image Completion
April 18, 2025	Pixel Computing - II: Deep Pixel Labeling: FCN, DeepLab, SegNet, CNN-as-RNN, HRNet; Human Pose Estimation: Bottom-Up and Top-Down
April 25, 2025	Sequential Models: Unrolling Computational Graph; RNN variants (recurrent through output, sequence-input-single-output, teacher forcing, encoder-decoder, bi/quad-directional RNN etc.); Long short-term memory (LSTM); Transformer and its variants; DETR, MLP-Mixer; Applications
May 2, 2025	Holiday - no class
May 9, 2025	Video Computing: Introduction to Video Computing Tasks; Video Features (STIP, Deep Video, C3D, Trajectory Feature); Optical flow; Deep Learning for Video Classification (multi-stream fusion techniques); Video Event Detection; An Illustrative System for Video Classification; Video Moment Localization; Vision-Language Grounding in Videos
May 16, 2025	3D Computer Vision - I: Epipolar geometry; Camera calibration; Image rectification; Stereo; Structure from motion
May 23, 2025	3D Computer Vision - II: Image-based rendering; Neural rendering models
	Visual Tracking: Mean-shift; KLT; Kalman filter; More visual tracking methods
May 30, 2025	Image / Video Generation: Generative adversarial networks (GAN); Variational autoencoder (VAE); Autoregressive models and flows; Vision-Language Foundation Models
June 6, 2025	Vision-Robot Learning: Basics of reinforcement learning; Autonomous driving; Robot control

Textbook:

There is no textbook for this course.

References:

Jean Ponce, David Forsyth, Computer Vision: A Modern Approach, Prentice Hall, 2011 (main reference)
Richard Szeliski, Computer Vision: Algorithms and Applications, Springer-Verlag, 2011 (main reference)
Simon Prince, Computer Vision: Models, Learning, and Inference, Cambridge University Press, 2012 (main reference)
Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, MIT Press, 2016
Christopher Bishop, Pattern Recognition and Machine Learning, Springer, 2006
Richard Hartley, Andrew Zisserman, Multiple View Geometry in Computer Vision
Gang Xu, Zhengyou Zhang, Epipolar Geometry in Stereo, Motion and Object Recognition: A Unified Approach
Yi Ma, Stefano Soatto, Jana Kosecka, S. Shankar Sastry, An Invitation to 3-D Vision
Zhi-hua Zhou, Machine Learning (in Chinese), 2016
CVPR / ICCV / ECCV / NIPS / ICML / ICLR proceedings
Domestic conferences: VALSE, PRCV

Course Work

Final Grade	Grading is based on homework (35%), a mid-term exam (20%), and a final exam (45%). The end-of-term grade is curved, so your overall grade depends on your performance relative to your classmates.