Regret Minimization for Reinforcement Learning by
Evaluating the Optimal Bias Function
Zihan Zhang, Tsinghua University, zihan-zh17@mails.tsinghua.edu.cn
Xiangyang Ji, Tsinghua University, xyji@tsinghua.edu.cn
Abstract
We present an algorithm based on the Optimism in the Face of Uncertainty (OFU)
principle that can learn Reinforcement Learning (RL) problems modeled by Markov
deci ...









