Provably Efficient Algorithms for Multi-Objective Competitive RL
Tiancheng Yu¹, Yi Tian¹, Jingzhao Zhang¹, Suvrit Sra¹
Abstract
We study multi-objective reinforcement learning (RL) where an agent’s reward is represented as a ...

... average return to a target set small as long as this set satisfies a condition called approachability (Blackwell, 1956). The approachability theorem applies to multi-objective ...









