Reinforcement Learning 2: Markov Decision Processes

14.0K views
Feb 22, 2021
54:04

This lecture uses the excellent MDP example from David Silver.

Slides: https://cwkx.github.io/data/teaching/dl-and-rl/rl-lecture2.pdf
Colab: https://colab.research.google.com/gist/cwkx/ba6c44031137575d2445901ee90454da/mrp.ipynb
Twitter: https://twitter.com/cwkx
Next video: https://www.youtube.com/playlist?list=PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE

Content:

Markov Chains
- Markov property
- state transition matrix
- definition and example

Markov Reward Process
- definition and example
- the return
- state value function
- the Bellman equation

Markov Decision Process
- definition and example
- policies
- state and action value functions
- the Bellman equation for MDPs
- optimal state and action value functions
- the Bellman optimality equations

#MDPs #MRPs #markovchains #reinforcementlearning #BellmanEquations #BellmanOptimality
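As a rough illustration of the "state value function" and "Bellman equation" items for a Markov Reward Process, here is a minimal sketch that solves the Bellman equation in matrix form, v = R + gamma * P v, directly as a linear system. The three-state chain, its rewards, and the discount factor are hypothetical values chosen for illustration; they are not the example used in the lecture or in the linked Colab.

```python
import numpy as np

# Hypothetical 3-state MRP (illustrative only, not the lecture's example).
# Row i of P holds the transition probabilities out of state i.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],  # state 2 is absorbing
])
R = np.array([-1.0, -2.0, 0.0])  # expected immediate reward in each state
gamma = 0.9                      # discount factor (assumed value)

# Bellman equation for an MRP in matrix form: v = R + gamma * P v.
# Rearranged: (I - gamma * P) v = R, which is a plain linear system.
v = np.linalg.solve(np.eye(len(R)) - gamma * P, R)

print(v)
```

Solving the system directly is only practical for small state spaces; larger problems are usually handled with iterative methods.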

