Value-driven Hindsight Modelling
Arthur Guez Fabio Viola Théophane Weber Lars Buesing
Steven Kapturowski Doina Precup David Silver Nicolas Heess
DeepMind
aguez@google.com
Abstract
Value estimation is a critical component of the reinforcement learning (RL)
paradigm. The question of how to effectively learn value predictors from data is
one of the major problems studied ...

