Title 1: On the convergence of policy iteration in stationary dynamic programming
[size=-1]Author: ML Puterman…
[size=-1]Journal: Mathematics of Operations …, 1979
[size=-1]Link: http://mathor.highwire.org/cgi/content/abstract/4/1/60
[size=-1]Title 2: Modified policy iteration algorithms for discounted Markov decision problems
[size=-1]Author: ML Puterman
[size=-1]Journal: Management Science, 1978
[size=-1]Link: http://www.jstor.org/stable/2630487