Convex Regularization in Monte-Carlo Tree Search
Tuan Dam¹, Carlo D’Eramo¹, Jan Peters¹, Joni Pajarinen¹,²
Abstract
Monte-Carlo planning and Reinforcement Learning (RL) are essential to sequential decision making. The recent AlphaGo and AlphaZero algo- ...

structure (Coulom, 2006). MCTS provides a principled approach for trading off between exploration and exploitation in sequential decision making. Moreover, recent advances













