ADP is a smarter method than Direct Utility Estimation. It runs trials to learn a model of the environment, then estimates the utility of a state as the sum of the reward for being in that state and the expected discounted utility of the next state.
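To make the idea concrete, here is a minimal sketch of passive ADP in Python. It is an illustration under stated assumptions, not a reference implementation: the function name `adp_estimate`, the trial format (a list of `(state, reward)` pairs recorded while following a fixed policy), and the defaults for `gamma` and `sweeps` are all hypothetical choices made for this example.

```python
from collections import defaultdict


def adp_estimate(trials, gamma=0.9, sweeps=100):
    """Estimate state utilities from trials gathered under a fixed policy.

    trials: list of trials, each a list of (state, reward) pairs
            (an assumed format, chosen for this sketch).
    """
    # Learn the model: count observed transitions and record rewards.
    counts = defaultdict(lambda: defaultdict(int))  # counts[s][s_next]
    rewards = {}
    for trial in trials:
        for s, r in trial:
            rewards[s] = r
        for (s, _), (s_next, _) in zip(trial, trial[1:]):
            counts[s][s_next] += 1

    # Evaluate the policy on the learned model by repeated sweeps of
    # U(s) = R(s) + gamma * sum over s' of P(s' | s) * U(s').
    utilities = {s: 0.0 for s in rewards}
    for _ in range(sweeps):
        for s in rewards:
            total = sum(counts[s].values())
            expected = 0.0
            if total:  # states with no observed successor act as terminal
                expected = sum(
                    n / total * utilities[s_next]
                    for s_next, n in counts[s].items()
                )
            utilities[s] = rewards[s] + gamma * expected
    return utilities
```

Every sweep touches every state and every observed successor, which is where the computational cost of ADP comes from as the state space grows.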
Instead of calculating the exact utility of a state, can we approximate it and possibly make it less computationally expensive? Yes, we can, using Temporal Difference (TD) learning.
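As a rough sketch of what a TD update can look like, the snippet below nudges the utility estimate of a state toward the reward for that state plus the discounted utility of the state actually observed next. The function name `td_update`, the dictionary of utilities, and the default values of `alpha` (learning rate) and `gamma` (discount factor) are assumptions made for illustration.

```python
def td_update(utilities, s, reward, s_next, alpha=0.1, gamma=0.9):
    """Apply one TD update after observing the transition s -> s_next.

    reward is the reward for being in state s. Unseen states are
    assumed to start with utility 0.0.
    """
    u_s = utilities.get(s, 0.0)
    u_next = utilities.get(s_next, 0.0)
    # Move U(s) a fraction alpha toward the one-step target
    # reward + gamma * U(s_next).
    utilities[s] = u_s + alpha * (reward + gamma * u_next - u_s)
    return utilities[s]
```

For example, with `utilities = {"B": 10.0}`, calling `td_update(utilities, "A", -1.0, "B", alpha=0.5, gamma=0.5)` sets the utility of `"A"` to `2.0`. Each update uses only the single observed transition, so no transition model has to be learned or solved.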