Proper Value Equivalence: Simplifying Model-based RL

Premiered Apr 19, 2022
19:51

Which aspects of an environment must be modeled in order to plan optimally? This video reviews and discusses the paper Proper Value Equivalence: Simplifying Model-based Reinforcement Learning, joint work by researchers at the University of Michigan and DeepMind. The paper investigates which properties of an environment are essential for optimal decision-making and which details can be safely ignored without affecting performance.

The discussion begins by generalizing the classical notion of value equivalence (VE) to an order-𝑘 formulation, defined in terms of 𝑘 repeated applications of the Bellman operator. This yields a hierarchy of VE classes that grow as 𝑘 increases; in the limit 𝑘 → ∞, the construction gives proper value equivalence (PVE), a broader equivalence class in which many distinct environment models support optimal planning. A central insight of the paper is that, unlike standard value equivalence, the PVE class may contain many models that are sufficient for planning, even when considering all policies and value functions. Such models can omit large portions of the environment's dynamics while still guaranteeing optimal behavior, offering a principled way to simplify model-based reinforcement learning. The video walks through the theoretical framework, key definitions, and implications of PVE, and discusses how this perspective reshapes our understanding of environment modeling and abstraction in reinforcement learning.

Referenced paper:
* Proper Value Equivalence: Simplifying Model-based Reinforcement Learning https://arxiv.org/abs/2106.10316

Supplementary material:
* Slides used in this episode: https://dry-peak.cloudvent.net/Additional-Post Files/Youtube/WhenShouoldAgentsExplore.pdf

Questions, critiques, and alternative perspectives are welcome in the comments.
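The order-𝑘 construction described in the video can be sketched numerically for a small tabular MDP. The following is a minimal NumPy illustration, not code from the paper: the function names and the order-𝑘 gap check are my own framing of the definition (a model is order-𝑘 value equivalent to another, with respect to a set of policies and value functions, when their 𝑘-fold Bellman backups agree on all of them).

```python
import numpy as np

def policy_matrices(r, P, pi):
    """Collapse a tabular model (r: S x A rewards, P: S x A x S transitions)
    under a stochastic policy pi (S x A) into per-state quantities."""
    r_pi = (pi * r).sum(axis=1)            # expected one-step reward per state
    P_pi = np.einsum('sa,sat->st', pi, P)  # state-to-state transition matrix
    return r_pi, P_pi

def k_step_backup(r, P, pi, gamma, v, k):
    """Apply the Bellman operator T_pi v = r_pi + gamma * P_pi v, k times."""
    r_pi, P_pi = policy_matrices(r, P, pi)
    for _ in range(k):
        v = r_pi + gamma * P_pi @ v
    return v

def order_k_gap(model_a, model_b, gamma, k, policies, value_fns):
    """Largest disagreement between two models' k-step backups over the given
    policy and value-function sets; order-k VE means this gap is zero."""
    (r_a, P_a), (r_b, P_b) = model_a, model_b
    return max(
        np.abs(k_step_backup(r_a, P_a, pi, gamma, v, k)
               - k_step_backup(r_b, P_b, pi, gamma, v, k)).max()
        for pi in policies for v in value_fns
    )

def policy_value(r, P, pi, gamma):
    """Exact v^pi = (I - gamma * P_pi)^-1 r_pi: for gamma < 1, the limit of
    the k-step backups above as k -> infinity, for any starting v."""
    r_pi, P_pi = policy_matrices(r, P, pi)
    return np.linalg.solve(np.eye(len(r_pi)) - gamma * P_pi, r_pi)
```

Because T_pi is a gamma-contraction, the 𝑘-step backup converges to v^pi regardless of the starting value function, which is why the 𝑘 → ∞ (PVE) class constrains only the policy values a model induces, not its one-step dynamics.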

