The LoCA Regret: A Consistent Metric to Evaluate
Model-Based Behavior in Reinforcement Learning
Harm van Seijen¹, Hadi Nekoei², Evan Racah², Sarath Chandar²,³,⁴
¹Microsoft Research Montréal, ²Mila - Quebec AI Institute,
³École Polytechnique de Montréal, ⁴Canada CIFAR AI Chair
Abstract
Deep model-based Reinforcement Learning (RL) has the potential to substantially
improve the sample-efficiency of ...









