Finding the Stochastic Shortest Path with Low Regret:
The Adversarial Cost and Unknown Transition Case
Liyu Chen 1 Haipeng Luo 1
Abstract
We make significant progress toward the stochastic shortest path problem with adversarial costs ...

... end within a fixed number of steps is extensively studied in recent years (often known as episodic finite-horizon reinforcement learning or loop-free SSP). The general (and
...

