Reinforcement Learning (Passive Learning- Direct Utility Estimation)

Name: Reinforcement Learning (Passive Learning- Direct Utility Estimation)
Uploaded: May 1, 2021
Duration: 3797 s

Channel: Free Lectures of CSE Courses

Direct Utility Estimation: In this method, the agent executes a sequence of trials, or runs (sequences of state-action transitions that continue until the agent reaches a terminal state). Each trial provides a sample value (the observed reward-to-go) for every state it visits, and the agent estimates each state's utility from these samples, typically as a running average. The main drawback is that the method wrongly treats state utilities as independent, whereas under the Markov assumption the utility of a state depends on the utilities of its successor states. As a result, it is slow to converge.
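The running-average idea can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `direct_utility_estimation`, the trial format (a list of `(state, reward)` pairs in visit order), the discount parameter `gamma`, and the choice to count every visit to a state as a separate sample are all assumptions made for the example.

```python
from collections import defaultdict


def direct_utility_estimation(trials, gamma=1.0):
    """Estimate state utilities by averaging observed reward-to-go.

    trials: list of trials; each trial is a list of (state, reward)
            pairs in the order visited, ending at a terminal state
            (hypothetical format chosen for this sketch).
    gamma:  discount factor (assumed; 1.0 means undiscounted).
    """
    utility = defaultdict(float)  # running average per state
    visits = defaultdict(int)     # number of samples per state

    for trial in trials:
        reward_to_go = 0.0
        # Walk the trial backwards so the reward-to-go of each state
        # is built up from the rewards that followed it.
        for state, reward in reversed(trial):
            reward_to_go = reward + gamma * reward_to_go
            visits[state] += 1
            # Incremental running-average update with the new sample.
            utility[state] += (reward_to_go - utility[state]) / visits[state]

    return dict(utility)


if __name__ == "__main__":
    # Two made-up trials: step reward -1, terminal reward +10.
    trials = [
        [("A", -1.0), ("B", -1.0), ("goal", 10.0)],
        [("A", -1.0), ("goal", 10.0)],
    ]
    print(direct_utility_estimation(trials))
```

Note that each state's estimate is updated only from its own samples; nothing ties the estimate for "A" to the estimate for its successor "B", which is the independence assumption described above as the method's main weakness.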
