PsiPhi-Learning: Reinforcement Learning with Demonstrations using
Successor Features and Inverse Temporal Difference Learning
Angelos Filos 1, Clare Lyle 1, Yarin Gal 1, Sergey Levine 2, Natasha Jaques * 2 3, Gregory Farquhar * 4
Abstract
We study reinforcement learning (RL) with no-reward
demonstrations, a setting in which an RL agent has access to
additional data from the interaction of other agents with the
same environment. However, it has no access to ...

