Adapting to Misspecification in Contextual Bandits with Offline Regression Oracles
Sanath Kumar Krishnamurthy¹, Vitor Hadad², Susan Athey²
Abstract
Computationally efficient contextual bandits are often based on estimating a predictive model of ...

... whose distribution may depend on the context and action. The objective of the algorithm is to interactively learn a mapping from contexts to actions so as to maximize the r ...













