Gradient Descent: The Algorithm That Trains Every AI Model
Gradient descent is the engine behind every AI model ever trained. From ChatGPT to Stable Diffusion to self-driving cars — they all learn using the same simple rule: follow the slope downhill. In this episode, we break down exactly how it works: partial derivatives, the update rule, the learning rate, and modern tricks like momentum, Adam, and stochastic gradient descent.

CHAPTERS
00:00 Why gradient descent is the engine of AI
0:45 The blindfolded hiker intuition
1:45 Partial derivatives and the gradient
3:15 The update rule: w ← w − α∇L
4:30 The learning rate — Goldilocks territory
6:00 Watching gradient descent learn a line
7:30 When the surface isn't a nice bowl
9:00 Momentum and Adam: modern tricks
11:00 Stochastic gradient descent (SGD)
12:30 From 2 parameters to GPT-4

KEY CONCEPTS
• Partial derivatives and the gradient vector
• The update rule: new w = old w − α × ∂L/∂w
• Learning rate: too small, too big, just right
• Local minima, saddle points, plateaus
• Momentum and the Adam optimizer
• Stochastic gradient descent (mini-batches)

PREREQUISITE
• Episode 1 — Linear Regression: youtu.be/...

THE SERIES
Episode 2 of 10. Next: Neural Networks — how stacking linear functions with one simple trick lets us learn anything.

WANT TO PRACTICE THE MATH?
NovaMaths — SAT & ACT Math prep with 749+ exercises: https://www.novamaths.app

French channel: @MathsAcademy27

—————————————————————
Channel hosted by Julien, certified math teacher with 30 years of classroom experience.

#GradientDescent #Adam #SGD #MachineLearning #AIMath
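If you want to play with the update rule w ← w − α∇L before watching, here is a minimal Python sketch of the "learning a line" demo from the chapter list. The toy data, learning rate, and step count are illustrative choices of mine, not taken from the video:

```python
# Gradient descent fitting a line y = w*x + b to toy data,
# illustrating the update rule: new w = old w - alpha * dL/dw.

# Toy data generated from y = 2x + 1 (noise-free, so the fit can converge exactly).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = 0.0, 0.0   # start from arbitrary parameters
alpha = 0.05      # learning rate -- a "just right" value for this data
n = len(xs)

for step in range(2000):
    # Partial derivatives of the mean squared error
    # L = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
    dw = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
    db = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
    # The update rule: step downhill along the negative gradient.
    w -= alpha * dw
    b -= alpha * db

print(round(w, 2), round(b, 2))  # approaches w = 2, b = 1
```

Too large an alpha (try 0.2 here) makes the steps overshoot and diverge; too small an alpha crawls — the Goldilocks effect covered at 4:30.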