Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning
Yue Wu 1 2, Shuangfei Zhai 1, Nitish Srivastava 1, Joshua Susskind 1, Jian Zhang 1, Ruslan Salakhutdinov 2, Hanlin Goh 1
Abstract
Offline Reinforcement Learning promises to learn ...
... leveraging prior experience (Lange et al., 2012). However, most prior off-policy RL algorithms (Haarnoja et al., 2018; Munos et al., 2016; Kalashnikov et al ...













