Exponential Lower Bounds for Batch Reinforcement Learning:
Batch RL can be Exponentially Harder than Online RL
Andrea Zanette
Abstract
Several practical applications of reinforcement learning involve an agent learning from past data without th ...

... we consider two classical batch RL problems: 1) the off-policy evaluation (OPE) problem, where the batch algorithm needs to predict the performance of a target policy









