PPO model, loaded for testing after training on the visual fox task. When the fox spawns, it has to reach the closest target. It receives a reward of +1 if it runs into the correct target, -1 if it runs into the incorrect target, and -0.01 for every action step.
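The reward scheme described above can be sketched in a few lines of Python. This is only an illustration, not the environment's actual code: the names `step_reward` and `episode_return` are hypothetical, and the sketch assumes that the -0.01 step penalty also applies on the step where a target is hit and that the episode ends on contact with a target.

```python
from typing import Optional

# Reward values taken from the task description.
CORRECT_REWARD = 1.0
INCORRECT_REWARD = -1.0
STEP_PENALTY = -0.01


def step_reward(target_hit: Optional[str]) -> float:
    """Reward for one action step.

    target_hit is "correct", "incorrect", or None when no target was touched.
    Assumption: the step penalty is charged on every step, including the
    step on which a target is hit.
    """
    reward = STEP_PENALTY
    if target_hit == "correct":
        reward += CORRECT_REWARD
    elif target_hit == "incorrect":
        reward += INCORRECT_REWARD
    return reward


def episode_return(num_steps: int, target_hit: Optional[str]) -> float:
    """Undiscounted return of an episode lasting num_steps steps.

    Assumption: a target can only be hit on the final step, which ends
    the episode; target_hit=None models an episode that timed out.
    """
    if num_steps < 1:
        raise ValueError("an episode needs at least one step")
    return (num_steps - 1) * step_reward(None) + step_reward(target_hit)
```

Under these assumptions, reaching the correct target sooner always gives a higher return, which is why the small per-step penalty pushes the fox toward the closest target rather than wandering.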