Reinforcement Learning (Adaptive Dynamic Programming (ADP), Temporal Difference Learning)

Name: Reinforcement Learning (Adaptive Dynamic Programming (ADP), Temporal Difference Learning)
Uploaded: May 1, 2021
Duration: 3428 s

Free Lectures of CSE Courses (763 subscribers)

2.0K views

ADP is a smarter method than Direct Utility Estimation: it runs trials to learn a model of the environment, then estimates the utility of a state as the sum of the reward for being in that state and the expected discounted utility of the next state. Instead of calculating the exact utility of a state, can we approximate it and possibly make it less computationally expensive? Yes, we can, using Temporal Difference (TD) learning.
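The contrast between the two updates can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function names `adp_backup` and `td_update`, the dictionary-based utility table `U`, the learning rate `alpha`, and the discount factor `gamma` are all assumptions made for the example. The ADP step needs estimated transition probabilities for the current state, while the TD step uses only the single next state that was actually observed.

```python
def adp_backup(U, s, reward, transitions, gamma=0.9):
    """One ADP-style Bellman backup using a learned model (illustrative sketch).

    `transitions` maps each next state to the estimated probability of
    reaching it from `s`; unseen states default to utility 0.0.
    """
    expected_next = sum(p * U.get(s2, 0.0) for s2, p in transitions.items())
    U[s] = reward + gamma * expected_next
    return U[s]


def td_update(U, s, reward, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) step: move U[s] toward the sampled target (illustrative sketch).

    No transition model is needed, only the observed successor `s_next`.
    """
    u_s = U.get(s, 0.0)
    target = reward + gamma * U.get(s_next, 0.0)
    U[s] = u_s + alpha * (target - u_s)
    return U[s]


if __name__ == "__main__":
    # Hypothetical two-successor example for the ADP backup.
    U_adp = {"B": 2.0, "C": 4.0}
    print(adp_backup(U_adp, "A", 1.0, {"B": 0.5, "C": 0.5}, gamma=0.5))

    # Hypothetical single observed transition A -> B for the TD update.
    U_td = {}
    print(td_update(U_td, "A", 1.0, "B", alpha=0.5, gamma=0.9))
```

The TD step is cheaper per update because it skips the sum over all successors, at the cost of needing many sampled transitions before the estimates settle.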
