Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning
Erwan Lecarpentier
Université de Toulouse
ONERA - The French Aerospace Lab
erwan.lecarpentier@isae-supaero.fr

Emmanuel Rachelson
Université de Toulouse
ISAE-SUPAERO
emmanuel.rachelson@isae-supaero.fr

Abstract
This work tackles the problem of robust planning in non-stationary stochas ...









