Batch Value-function Approximation with Only Realizability
Tengyang Xie 1 Nan Jiang 1
Abstract
We make progress in a long-standing problem of batch reinforcement learning (RL): learning $Q^\star$ from an exploratory and polynomial-sized ...

... this subproblem, we create a piecewise constant function class of statistical complexity $O(1/\epsilon^2)$ that can express both candidate functions up to small discretization errors, and use t ...
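
The idea of a piecewise constant class that expresses two candidate functions up to a small discretization error can be illustrated with a short sketch. It is only one plausible reading of the fragment above, not the paper's algorithm. The names `joint_partition`, `piecewise_constant`, `eps`, and the toy functions `f` and `g` are hypothetical. The sketch groups state-action pairs whose values under both candidates fall into the same bins of width `eps`, then replaces each function by its average on every cell.

```python
import math
from collections import defaultdict


def joint_partition(points, f, g, eps):
    """Group points whose f-values and g-values share the same eps-wide bins."""
    cells = defaultdict(list)
    for x in points:
        key = (math.floor(f(x) / eps), math.floor(g(x) / eps))
        cells[key].append(x)
    return cells


def piecewise_constant(cells, h):
    """Approximate h by its mean on each cell of the partition."""
    table = {}
    for members in cells.values():
        avg = sum(h(x) for x in members) / len(members)
        for x in members:
            table[x] = avg
    return table


# Toy state-action pairs and two hypothetical candidate functions.
points = [(s, a) for s in range(20) for a in range(3)]
f = lambda x: math.sin(x[0]) + 0.1 * x[1]
g = lambda x: math.cos(0.5 * x[0]) - 0.2 * x[1]
eps = 0.25

cells = joint_partition(points, f, g, eps)
f_hat = piecewise_constant(cells, f)
g_hat = piecewise_constant(cells, g)

# Worst-case discretization error of each approximation over the toy points.
err_f = max(abs(f_hat[x] - f(x)) for x in points)
err_g = max(abs(g_hat[x] - g(x)) for x in points)
print(len(cells), err_f, err_g)
```

Within a cell, every value of `f` (and of `g`) lies in one interval of width `eps`, so the cell average differs from any member's value by less than `eps`. For bounded functions the number of cells is at most the product of the two bin counts, so it grows at most like `1/eps**2` no matter how many points are partitioned.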









