Optimal Thompson Sampling strategies for support-aware CVaR bandits
Dorian Baudry 1 Romain Gautron 2 3 Emilie Kaufmann 1 Odalric-Ambrym Maillard 1
Abstract

In this paper we study a multi-arm bandit problem ...

... Conditional Value at Risk (CVaR) as well as more generic coherent spectral risk measures (Acerbi and Tasche, 2002) have received specific attention from the bandit community (Galichet et al.









