1:46:51 · Session 4 Diffusion Denoising Probabilistic Model DDPM · Mainak's PMRF Tutorials · 32 views · 18 hours ago
1:44:50 · Session 2 Variational Inference, Autoencoders, Variational Autoencoders · Mainak's PMRF Tutorials · 68 views · 5 days ago
1:48:51 · Session 21 Actor Critic based Policy Gradient, Safe RL, Planning, DYNA, Curriculum Learning · Mainak's PMRF Tutorials · 268 views · 11 months ago
1:50:03 · Session 20 Deep Neural Networks, MLP, Backpropagation, Policy Gradient, REINFORCE · Mainak's PMRF Tutorials · 99 views · 11 months ago
1:54:17 · Session 19 Asynchronous Q learning, Classification in ML, MLE, Logistic and Softmax Regression · Mainak's PMRF Tutorials · 266 views · 11 months ago
1:57:08 · Session 18 Synchronous Q-learning, Model-free, based, tabular, with Linear Fn. Approx., Convergence · Mainak's PMRF Tutorials · 54 views · 11 months ago
1:39:10 · Session 17 Off-Policy Evaluation of TD0 with linear function Approximation, Emphatic TD0 · Mainak's PMRF Tutorials · 42 views · 11 months ago
1:42:49 · Session 16 γ contraction, Banach's Fixed Point Theorem, How far is it far from the intended optimal · Mainak's PMRF Tutorials · 56 views · 11 months ago
1:52:56 · Session 15 TD(0) convergence proof (contd), Point of Convergence of TD(0) (linear function approx.) · Mainak's PMRF Tutorials · 58 views · 11 months ago
1:54:39 · Session 14 TD0 with linear function approximation, Glimpse at Stochastic Approximation Algorithm(1) · Mainak's PMRF Tutorials · 87 views · 1 year ago
1:45:21 · Session 13 Function Approximation in RL, Policy Evaluation, SGD Monte Carlo, TD(0) Implementation · Mainak's PMRF Tutorials · 121 views · 1 year ago
1:49:15 · Session 12 On Policy vs Off Policy Algorithms, Importance Sampling, Model-free Q learning, SARSA · Mainak's PMRF Tutorials · 131 views · 1 year ago
1:44:33 · Session 11 Model Free Methods, Monte Carlo, Temporal Difference Algorithm, TD(λ) Algorithm · Mainak's PMRF Tutorials · 111 views · 1 year ago
1:51:52 · Session 10 Stochastic Shortest Path, Bellman Operators, Proof of convergence of Policy Evaluation · Mainak's PMRF Tutorials · 134 views · 1 year ago
1:55:36 · Session 9 Policy Iteration & Q learning code, Finite Horizon MDPs, Dynamic Program, Theory and Exmp · Mainak's PMRF Tutorials · 141 views · 1 year ago
1:48:24 · Session 8 Bellman Equation, Optimal Policy, Iterative Policy Evaluation, Policy & Value Iteration · Mainak's PMRF Tutorials · 169 views · 1 year ago
1:51:33 · Session 7 MDPs, Action, Value, Reward functions, Bellman Equations 1, Examples · Mainak's PMRF Tutorials · 183 views · 1 year ago
1:53:14 · Session 6 Random Processes, Markov Chains and Stationary Distribution · Mainak's PMRF Tutorials · 165 views · 1 year ago
1:50:48 · Session 5 ODE Interpretation in Bandits, UCB, Gradient-Based Algorithms, UCB in Python · Mainak's PMRF Tutorials · 164 views · 1 year ago
1:42:57 · Session 4 Introduction to Reinforcement Learning, Multi-armed Bandits Algorithm and Implementation · Mainak's PMRF Tutorials · 387 views · 1 year ago
1:54:27 · Session 3 Recap on Joint Distributions, Conditional Distributions, and Conditional Expectations · Mainak's PMRF Tutorials · 146 views · 1 year ago
1:48:30 · Session 2 Recap - Continuous Distributions, Transformation of random variables · Mainak's PMRF Tutorials · 215 views · 1 year ago
1:56:11 · Session 1 Recap on Random Variables, Exemplar Discrete Distributions, Expectations · Mainak's PMRF Tutorials · 480 views · 1 year ago
2:20:38 · Session 24 Mixture Models, Expectation Maximization, GMMs, K-means is a specialized GMM · Mainak's PMRF Tutorials · 444 views · 1 year ago
2:07:14 · Session 23 Dimensionality Reduction - Principal Component Analysis, Linear Discriminant Analysis · Mainak's PMRF Tutorials · 373 views · 1 year ago
2:02:29 · Session 22 Unsupervised Learning, Clustering algorithms, K-means, K-medoids, and Hierarchical · Mainak's PMRF Tutorials · 268 views · 1 year ago
1:41:05 · Session 21 Backpropagation, Dropout, Bias-variance tradeoff, Prevent overfitting or underfitting · Mainak's PMRF Tutorials · 375 views · 2 years ago
1:59:31 · Session 20 Perceptron, Perceptron Learning Algorithm, Convergence Proof, MLPs, Forward Propagation · Mainak's PMRF Tutorials · 257 views · 2 years ago