OP: wz151400

[Other] [English econometrics material] Machine Learning for Econometrics [promoted, with rewards]


OP: wz151400 (employment verified), posted 2026-1-25 14:06:16

Machine Learning for Econometrics.pdf (14.18 MB, price: RMB 19)
Machine learning for econometrics — latest 2026 material, 353 pages. Table of contents:
1. Introduction 1
1.1 Econometrics versus machine learning 1
1.2 What is this book about? 6
1.2.1 High dimension and variable selection 6
1.2.2 Estimation of heterogeneous effects 7
1.2.3 Aggregate data and macroeconomic forecasting 8
1.2.4 Text data 9
1.3 Framework and notations 10
1.4 Additional resources 12
PART I. STATISTICS AND ECONOMETRICS PREREQUISITES
2. Statistical tools 15
2.1 Linear regression 15
2.2 Singular value decomposition 18
2.3 High dimension and penalized regressions 20
2.3.1 When OLS fail 20
2.3.2 Ridge regression 20
2.3.3 Lasso regression 21
2.3.4 Ridge or Lasso? 22
2.3.5 Choosing λ by cross-validation 24
2.4 Maximum likelihood 25
2.4.1 General principle 25
2.4.2 Examples and penalized versions 27
2.5 Generalized method of moments 28
2.6 Factor models 30
2.7 Random forests 33
2.7.1 Single sample trees 33
2.7.2 Details on the segmentation test 34
2.7.3 Random forests 36
2.8 Neural networks 36
2.8.1 Architecture 37
2.8.2 Loss functions 39
2.8.3 Training through backpropagation 39
2.8.4 Training tips 42
2.9 Summary 44
2.10 Proofs and additional results 45
3. Causal inference 46
3.1 Definitions 46
3.2 Randomized controlled trials 47
3.3 Conditional independence and the propensity score 49
3.3.1 Baseline assumptions 49
3.3.2 Two characterizations of the ATE 50
3.3.3 Efficient estimation of treatment effect 52
3.4 Instrumental variables 52
3.4.1 Endogeneity and instrumental variables 52
3.4.2 The problem of optimal instruments 54
3.5 Summary 57
PART II. HIGH-DIMENSION AND VARIABLE SELECTION
4. Post-selection inference 61
4.1 The post-selection inference problem 61
4.1.1 The model 62
4.1.2 Consistent model selection 63
4.1.3 Distribution of the post-selection estimator 64
4.2 High dimension, sparsity, and the Lasso 66
4.3 Theoretical elements on the Lasso 68
4.4 Regularization bias 69
4.4.1 Selection and estimation cannot be optimally done at the same time 69
4.4.2 The bias of the naive “plug-in” estimator 70
4.5 The double selection method 72
4.6 Empirical application: the effect of education on wage 75
4.7 Summary 77
4.8 Proofs and additional results 78
4.8.1 Proof of the main results 78
4.8.2 Additional results 84
5. Generalization and methodology 87
5.1 Theory: immunization 87
5.1.1 Intuition 87
5.1.2 Asymptotic normality 88
5.2 Orthogonal scores for treatment effect estimation 91
5.3 Sample-splitting 92
5.4 Simulations: regularization bias 94
5.4.1 Data-generating process 94
5.4.2 Estimators 95
5.5 Empirical application: job training program 95
5.6 Summary 97
5.7 Proofs and additional results 98
6. High dimension and endogeneity 101
6.1 Specific model for instrumental variables 103
6.2 Immunization for instrumental variables 104
6.3 Simulations 107
6.4 Applications 108
6.4.1 Logistic demand model 108
6.4.2 Instrument selection for estimating returns to education 111
6.5 Summary 112
6.6 Additional remark 113
7. Going further 115
7.1 Estimation with non-Gaussian errors 115
7.2 Sample splitting 119
7.3 Joint inference on a group of coefficients 120
7.3.1 Double selection and Lasso desparsification 120
7.3.2 Asymptotic normality of the bias-corrected estimator 121
7.4 Regularization and instrument selection for panel data 123
7.4.1 The cluster-Lasso: intuition 124
7.4.2 Application to the economics of crime 126
7.5 Summary 127
7.6 Proofs and additional results 128
PART III. TREATMENT EFFECT HETEROGENEITY
8. Inference on heterogeneous effects 135
8.1 Heterogeneous treatment effects 136
8.2 Direct estimation 139
8.3 Inference with causal random forests 141
8.3.1 Double sample trees 141
8.3.2 Two-sample random forests 142
8.3.3 Bias and honesty of the random forest regression 143
8.3.4 Double sample causal trees 144
8.3.5 Applications 148
8.3.6 The problem of estimating treatment heterogeneity with endogeneity 149
8.3.7 The gradient tree algorithm 151
8.3.8 Central limit theorem for generalized random forests 153
8.3.9 Application to the heterogeneity of the effect of subsidized training on trainees' income 154
8.4 Inference on characteristics of heterogeneous effects 155
8.4.1 Estimation of key characteristics of CATE 156
8.4.2 Inference for key features of the CATE 158
8.4.3 Algorithm: inference on the main features of CATE 161
8.4.4 Simulations 162
8.5 Summary 164
8.6 Proofs and additional results 166
9. Optimal policy learning 173
9.1 Problem: optimal policy learning 173
9.1.1 Optimal policy in a simplified framework 173
9.1.2 The minimax regret criterion 175
9.2 Empirical welfare maximization 176
9.2.1 Empirical welfare maximization with known propensity score 176
9.2.2 Maximization of empirical welfare with estimated propensity score 180
9.3 Application: optimization of a training program 181
9.4 Summary 183
PART IV. AGGREGATED DATA AND MACROECONOMIC FORECASTING
10. The synthetic control method 189
10.1 Framework and estimation 190
10.2 A result on the bias 194
10.3 When and why should synthetic controls be used 196
10.4 Inference using permutation tests 197
10.4.1 Permutation tests in a simple framework 198
10.4.2 The confidence interval-test duality 201
10.5 Empirical application: tobacco control program 202
10.6 Multiple treated units 207
10.7 Summary 208
10.8 Proofs and additional results 209
10.8.1 Proofs of the main results 209
10.8.2 Additional results 212
11. Forecasting in high-dimension 213
11.1 Regression in high-dimension for forecasting 214
11.1.1 Time series in high-dimension 214
11.1.2 Model and estimator 214
11.1.3 Mixed data sampling regression models (MIDAS) 216
11.1.4 Asymptotic properties 218
11.1.5 Another dependence assumption: mixing 221
11.2 Limitations and other methods 221
11.2.1 Critical approach of the sparsity hypothesis 221
11.2.2 A mixed approach: FARM 223
11.2.3 Nonlinearity 225
11.2.4 Application: nowcasting of the US GDP 225
11.3 Testing Granger causality 227
11.3.1 Joint inference on a group of coefficients with time series 227
11.3.2 Granger causality tests in high-dimension 228
11.3.3 Application: text and GDP prediction 230
11.4 Summary 231
PART V. TEXTUAL DATA
12. Working with text data 235
12.1 Basic concepts and roadmap 236
12.1.1 Definitions 237
12.1.2 Road-map for leveraging text data 237
12.2 NLP 1.0: text-processing tools to build tabular data 239
12.2.1 Pre-processing 239
12.2.2 Selecting n-grams with mutual information 240
12.2.3 The document-term matrix 241
12.2.4 How to measure similarity? 242
12.2.5 Textual regression 243
12.3 Empirical applications based on word frequency 243
12.3.1 Impact of racism on American elections 243
12.3.2 Definition of business sectors using company descriptions 244
12.4 Language modeling with latent variables 245
12.4.1 The unigram model 246
12.4.2 Unigram modeling with topic mixture 246
12.4.3 Latent Dirichlet allocation 249
12.5 Empirical applications 251
12.5.1 Monetary policy transparency 251
12.5.2 Political division 252
12.6 Summary 253
13. Word embeddings 255
13.1 Limitations of the one-hot representation 255
13.2 Factorization of the co-occurrence matrix 256
13.2.1 Representation using the co-occurrence matrix 256
13.2.2 Dimension reduction through singular value decomposition 257
13.3 word2vec and self-supervised learning 257
13.3.1 Vector arithmetic 257
13.3.2 Self-supervised learning 259
13.3.3 Skip-gram 260
13.3.4 Continuous bag of words 261
13.3.5 Computational considerations 262
13.3.6 Choice of hyperparameters 263
13.3.7 Empirical applications 264
13.4 Classification using text embeddings 266
13.4.1 Potential applications 266
13.4.2 Bag-of-word architecture 267
13.4.3 Other applications 269
13.5 Going further: representation of unstructured data 270
13.5.1 Encoding textual or visual information 270
13.5.2 Embeddings for consumer goods 271
13.6 Summary 272
14. Modern language models 274
14.1 Tokenizers 275
14.1.1 Character-level tokenization 275
14.1.2 Word-level tokenization 276
14.1.3 Sub-word tokenization 277
14.1.4 Practical considerations when training a tokenizer 278
14.2 Building BERT 279
14.2.1 Context matters 279
14.2.2 Self-attention 279
14.2.3 Transformer layer 281
14.2.4 The anatomy of BERT 283
14.3 Training BERT 284
14.3.1 Pre-training 284
14.3.2 Fine-tuning 286
14.3.3 Zero-shot learning 287
14.4 Application: matching via Siamese neural networks 289
14.4.1 Description of the problem 289
14.4.2 General strategy 290
14.4.3 Loss functions 291
14.4.4 Model evaluation 293
14.4.5 Training tips 293
14.5 Summary 295
14.6 Appendix: Siamese networks beyond text data 296
14.6.1 Vector representation of job offers 296
14.6.2 Differentiation in the font market 296
PART VI. EXERCISES
15. Exercises 301
15.1 Regression as a weighting estimator 301
15.2 Orthogonal score for treatment effect on treated 302
15.3 Voting model 303
15.4 Gender wage gap 305
15.5 Drought and incentives for water conservation 309
15.6 Synthetic control and regularization 315
Bibliography 317
Index 331
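To give a flavor of the kind of material the book covers — for example section 2.3.5, choosing the penalty λ by cross-validation — here is a minimal, self-contained sketch of my own (not code from the book). It fits a one-predictor ridge regression, which has the closed form β = Σxᵢyᵢ / (Σxᵢ² + λ), and picks λ from a grid by 5-fold cross-validation:

```python
# Minimal sketch of choosing the ridge penalty lambda by k-fold
# cross-validation (illustrative only; not code from the book).
import random

random.seed(0)
n = 100
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * xi + random.gauss(0, 1) for xi in x]   # true slope = 2

def ridge_slope(xs, ys, lam):
    """One-predictor ridge estimator: sum(x*y) / (sum(x^2) + lambda)."""
    return sum(a * b for a, b in zip(xs, ys)) / (sum(a * a for a in xs) + lam)

def cv_error(lam, k=5):
    """k-fold cross-validated mean squared prediction error."""
    fold = n // k
    err = 0.0
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold
        xtr, ytr = x[:lo] + x[hi:], y[:lo] + y[hi:]   # training folds
        b = ridge_slope(xtr, ytr, lam)
        err += sum((yj - b * xj) ** 2 for xj, yj in zip(x[lo:hi], y[lo:hi]))
    return err / n

grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best = min(grid, key=cv_error)
print("lambda chosen by 5-fold CV:", best)
```

With a strong true signal and little shrinkage needed, small λ values tend to win; the same grid-search-over-folds pattern carries over directly to the Lasso and other penalized regressions discussed in chapter 2.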
Keywords: econometrics materials, economics materials, econometrics, quantitative economics, economics
