Policy Continuation with Hindsight Inverse Dynamics
Hao Sun¹, Zhizhong Li¹, Xiaotong Liu², Dahua Lin¹, Bolei Zhou¹
¹The Chinese University of Hong Kong, ²Peking University
Abstract
Solving goal-oriented tasks is an important but challenging problem in reinforcement
learning (RL). For such tasks, the rewards are often sparse, making it difficult
to learn a policy effectively. To tackle this difficulty, we propose ...









