Approximate dynamic programming - 计量经济学与统计软件

1关注
2粉丝

博士生

8%

还不是VIP/贵宾

-

0%

威望: 0 级
论坛币: 58985 个
通用积分: 56.9848
学术水平: 2 点
热心指数: 1 点
信用等级: 1 点
经验: 9071 点
帖子: 136
精华: 0
在线时间: 250 小时
注册时间: 2009-1-15
最后登录: 2023-6-8

楼主

tmdzhu 发表于 2009-7-14 22:02:27 |只看作者 |坛友微信交流群|倒序 |AI写论文

相似文件

换一批

是否 +2 论坛币

k人参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群

赵安豆老师微信：zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

立即领取

感谢您参与论坛问题回答

经管之家送您两个论坛币！

+2 论坛币

Approximate dynamic programming : solving the curses of dimensionality (Wiley Series in Probability and Statistics) / Warren B. Powell.

CONTENTS
Preface xi
Acknowledgments xv
1 The challenges of dynamic programming 1
1.1 A dynamic programming example: a shortest path problem 2
1.2 The three curses of dimensionality 3
1.3 Some real applications 6
1.4 Problem classes 9
1.5 The many dialects of dynamic programming 12
1.6 What is new in this book? 14
1.7 Bibliographic notes 15
2 Some illustrative models 17
2.1 Deterministic problems 18
2.2 Stochastic problems 23
2.3 Information acquisition problems 36
2.4 A simple modeling framework for dynamic programs 39
2.5 Bibliographic notes 42
Problems 43
3 Introduction to Markov decision processes 47
3.1 The optimality equations 48
3.2 Finite horizon problems 53
3.3 Infinite horizon problems 55
3.4 Value iteration 56
3.5 Policy iteration 61
3.6 Hybrid valuepolicy
iteration 62
3.7 The linear programming method for dynamic programs 63
3.8 Monotone policies* 64
3.9 Why does it work?** 70
3.10 Bibliographic notes 85
Problems 85
4 Introduction to approximate dynamic programming 91
4.1 The three curses of dimensionality (revisited) 92
4.2 The basic idea 93
4.3 Sampling random variables 100
4.4 ADP using the postdecision
state variable 101
4.5 Lowdimensional
representations of value functions 107
4.6 So just what is approximate dynamic programming? 110
4.7 Experimental issues 112
4.8 Dynamic programming with missing or incomplete models 117
4.9 Relationship to reinforcement learning 117
4.10 But does it work? 119
4.11 Bibliographic notes 120
Problems 121
5 Modeling dynamic programs 127
5.1 Notational style 129
5.2 Modeling time 130
5.3 Modeling resources 133
5.4 The states of our system 137
5.5 Modeling decisions 144
5.6 The exogenous information process 149
5.7 The transition function 157
5.8 The contribution function 164
5.9 The objective function 166
5.10 A measuretheoretic
view of information** 168
5.11 Bibliographic notes 170
Problems 171
6 Stochastic approximation methods 177
6.1 A stochastic gradient algorithm 179
6.2 Some stepsize recipes 181
6.3 Stochastic stepsizes 188
6.4 Computing bias and variance 193
6.5 Optimal stepsizes 195
6.6 Some experimental comparisons of stepsize formulas 202
6.7 Convergence 207
6.8 Why does it work?** 209
6.9 Bibliographic notes 218
Problems 219
7 Approximating value functions 225
7.1 Approximation using aggregation 226
7.2 Approximation methods using regression models 235
7.3 Recursive methods for regression models 246
7.4 Neural networks 252
7.5 Batch processes 257
7.6 Why does it work?** 261
7.7 Bibliographic notes 264
Problems 266
8 ADP for finite horizon problems 269
8.1 Strategies for finite horizon problems 270
8.2 Qlearning
8.3 Temporal difference learning 277
8.4 Policy iteration 280
8.5 Monte Carlo value and policy iteration 282
8.6 The actorcritic
paradigm 283
8.7 Bias in value function estimation 284
8.8 State sampling strategies 288
8.9 Starting and stopping 291
8.10 A taxonomy of approximate dynamic programming strategies 294
8.11 Why does it work** 296
8.12 Bibliographic notes 296
Problems 297
9 Infinite horizon problems 301
9.1 From finite to infinite horizon 302
9.2 Algorithmic strategies 302
9.3 Stepsizes for infinite horizon problems 311
9.4 Error measures 313
9.5 Direct ADP for online
applications 315
9.6 Finite horizon models for steady state applications 315
9.7 Why does it work?** 317
9.8 Bibliographic notes 317
Problems 317
10 Exploration vs. exploitation 321
10.1 A learning exercise: the nomadic trucker 321
10.2 Learning strategies 324
10.3 A simple information acquisition problem 328
10.4 Gittins indices and the information acquisition problem 330
10.5 Variations 335
10.6 The knowledge gradient algorithm 337
10.7 Information acquisition in dynamic programming 340
10.8 Bibliographic notes 343
Problems 344
11 Value function approximations for special functions 349
11.1 Value functions versus gradients 350
11.2 Linear approximations 351
11.3 Piecewise linear approximations 353
11.4 The SHAPE algorithm 357
11.5 Regression methods 360
11.6 Cutting planes* 363
11.7 Why does it work?** 375
11.8 Bibliographic notes 381
Problems 382
12 Dynamic resource allocation 385
12.1 An asset acquisition problem 386
12.2 The blood management problem 390
12.3 A portfolio optimization problem 399
12.4 A general resource allocation problem 402
12.5 A fleet management problem 414
12.6 A driver management problem 420
12.7 Bibliographic references 424
Problems 425
13 Implementation challenges 431
13.1 Will ADP work for your problem? 431
13.2 Designing an ADP algorithm for complex problems 432
13.3 Debugging an ADP algorithm 434
13.4 Convergence issues 435
13.5 Modeling your problem 436
13.6 Online

Approximate_Dynamic_Programming_Solving_the_Curses_of_Dimensionality.rar (22.73 MB, 需要: 100 个论坛币) 本附件包括：
Approximate Dynamic Programming Solving the Curses of Dimensionality.pdf

扫码加我拉你入群

请注明：姓名-公司-职位

以便审核进群资格，未注明则拒绝

分享0 收藏3 回帖

关键词：Programming Approximate Dynamic Program Approx Programming Dynamic Approximate

Approximate dynamic programming [推广有奖]

经管之家送您一份

经管之家联合CDA

感谢您参与论坛问题回答

扫码加我拉你入群

相关帖子

本版微信群

Approximate dynamic programming [推广有奖]

经管之家送您一份

经管之家联合CDA

感谢您参与论坛问题回答

扫码加我 拉你入群

相关帖子

本版微信群

扫码加我拉你入群