Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
Botao Hao 1, Xiang Ji 2, Yaqi Duan 2, Hao Lu 2, Csaba Szepesvari 1,3, Mengdi Wang 1,2
Abstract
Bootstrapping provides a flexible and effective approach for assessing the quality of batch reinforcement learnin ...

et al., 2013; Munos & Szepesvari, 2008; Le et al., 2019). In practice, FQE has demonstrated robust and satisfying performances on many classical RL tasks under different









