Variance-Reduced Off-Policy TDC Learning:
Non-Asymptotic Convergence Analysis
Shaocong Ma, Department of ECE, University of Utah, Salt Lake City, UT 84112, s.ma@utah.edu
Yi Zhou, Department of ECE, University of Utah, Salt Lake City, UT 84112, yi.zhou@utah.edu
Shaofeng Zou, Department of EE, University at Buffalo, Buffalo, NY 14260, szou3@buffalo.edu
Abstract
Variance reduction ...









