Data-efficient Hindsight Off-policy Option Learning
Markus Wulfmeier 1 Dushyant Rao 1 Roland Hafner 1 Thomas Lampe 1 Abbas Abdolmaleki 1 Tim Hertweck 1
Michael Neunert 1 Dhruva Tirumala 1 Noah Siegel 1 Nicolas Heess 1 Martin Riedmiller 1
Abstract
We introduce Hindsight Off-policy Op-
...

tice, hierarchical control schemes often introduce technical
challenges, including a tendency to learn degenerate solu-

