
Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Apr 20, 2018
19:50

Deep reinforcement learning (RL) has shown promising results for learning complex sequential decision-making behaviors in various environments. However, most successes have been exclusively in simulation, and results in real-world applications such as robotics are limited, largely due to the poor sample efficiency of typical deep RL algorithms. I will introduce temporal difference models (TDMs), an extension of goal-conditioned value functions that enables model-based planning at multiple time resolutions. TDMs generalize traditional predictive models, bridge the gap between model-based and off-policy model-free RL, and show substantial improvements in sample efficiency without introducing asymptotic performance loss. See more at https://www.microsoft.com/en-us/research/video/temporal-difference-models-deep-model-free-rl-model-based/
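The goal- and horizon-conditioned value function described above can be sketched as follows. This is a minimal illustration, not the talk's implementation: it assumes the common TDM formulation Q(s, a, g, tau), where the target at horizon tau = 0 is the negative distance to the goal and otherwise bootstraps from the value at horizon tau - 1 (the function name and arguments here are hypothetical).

```python
import numpy as np

def tdm_target(q_next_max, next_state, goal, tau):
    """Regression target for one transition under the assumed TDM recursion.

    q_next_max : max over a' of Q(s', a', g, tau - 1), supplied by the learner
    next_state, goal : state vectors as NumPy arrays
    tau : remaining horizon, a non-negative integer
    """
    if tau == 0:
        # Terminal step: reward is the negative goal distance, no bootstrapping.
        return -np.linalg.norm(next_state - goal)
    # Non-terminal step: bootstrap from the decremented-horizon value.
    return q_next_max

# Toy usage: at tau = 0 the target reduces to -||s' - g||.
s_next = np.array([1.0, 2.0])
g = np.array([1.0, 0.0])
print(tdm_target(q_next_max=-0.5, next_state=s_next, goal=g, tau=0))  # -2.0
print(tdm_target(q_next_max=-0.5, next_state=s_next, goal=g, tau=3))  # -0.5
```

Conditioning on the horizon tau is what allows planning at multiple time resolutions: the same learned function answers "how close can I get to g in tau steps?" for any tau.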

