Towards Tight Bounds on the Sample Complexity of Average-reward MDPs
Yujia Jin¹  Aaron Sidford¹
Abstract

We prove new upper and lower bounds for sample complexity of finding an ε-optimal poli ...

... making under uncertainty and reinforcement learning (Puterman, 2014; Sutton & Barto, 2018). It is a prominent theoretical test-bed for learning algorithms and has been studied ex-









