Sample Efficient Reinforcement Learning
via Low-Rank Matrix Estimation
Devavrat Shah (EECS, MIT), devavrat@mit.edu
Dogyoon Song (EECS, MIT), dgsong@mit.edu
Zhi Xu (EECS, MIT), zhixu@mit.edu
Yuzhe Yang (EECS, MIT), yuzhe@mit.edu
Abstract
We consider the question of learning the Q-function in a sample-efficient manner for
reinforcement learning with continuous state and action spaces under ...









