Characterizing the Gap Between Actor-Critic and Policy Gradient
Junfeng Wen¹, Saurabh Kumar², Ramki Gummadi³, Dale Schuurmans¹,³
Abstract

Actor-critic (AC) methods are ubiquitous in reinforcement learning. Although it is understoo ...

... on a range of challenging tasks. Despite the success of AC methods, AC and PG have subtle differences that are only partially characterized in the literature (Konda and Tsitsiklis, ...













