On Reinforcement Learning with Adversarial Corruption
and Its Application to Block MDP
Tianhao Wu*1,2, Yunchang Yang*3, Simon S. Du4, Liwei Wang3,5
Abstract

We study reinforcement learning (RL) in episodic tabular M ...

... is vulnerable to corrupted data stemming from malicious entities (Huang et al., 2017; Ma et al., 2019), non-malicious yet non-stationary behavior, or simply errors in the system.









