Offline Reinforcement Learning with Fisher Divergence Critic Regularization
Ilya Kostrikov 1 2 Jonathan Tompson 2 Rob Fergus 1 3 Ofir Nachum 2
Abstract

Many modern approaches to offline Reinforcement Learning (RL) utilize b ...

... where deploying a new policy to interact with the live environment is expensive or associated with risks or safety concerns (Thomas, 2015), it is more common to have only ...

