Slides at https://ubc-cs.github.io/cpsc340/lectures/L22.pdf?raw=1
Camera operated by Tanner Johnson. Content based on original course materials created by Mark Schmidt.
0:00 kernel trick recap
1:45 stochastic gradient motivation
3:45 SGD vs. GD
6:20 minimizing averages
7:50 per-example gradients
17:00 visualizing SGD: 1 parameter
23:50 decreasing step sizes
31:20 stochastic average gradient
32:55 visualizing SGD: 2 parameters
44:30 summary