Causal Inference: What If
by Miguel A. Hernán, James M. Robins
December 31, 2020 (revised January 2023)
Contents
Introduction: Towards less casual causal inferences vii
I Causal inference without models 1
1 A definition of causal effect 3
1.1 Individual causal effects . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Average causal effects . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Measures of causal effect . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Random variability . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 Causation versus association . . . . . . . . . . . . . . . . . . . . 10
2 Randomized experiments 13
2.1 Randomization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Conditional randomization . . . . . . . . . . . . . . . . . . . . . 17
2.3 Standardization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4 Inverse probability weighting . . . . . . . . . . . . . . . . . . . . 20
3 Observational studies 27
3.1 Identifiability conditions . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 Exchangeability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Positivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 Consistency: First, define the counterfactual outcome . . . . . . 33
3.5 Consistency: Second, link counterfactuals to the observed data . 37
3.6 The target trial . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4 Effect modification 43
4.1 Heterogeneity of treatment effects . . . . . . . . . . . . . . . . . 43
4.2 Stratification to identify effect modification . . . . . . . . . . . . 45
4.3 Why care about effect modification . . . . . . . . . . . . . . . . . 47
4.4 Stratification as a form of adjustment . . . . . . . . . . . . . . . 49
4.5 Matching as another form of adjustment . . . . . . . . . . . . . . 51
4.6 Effect modification and adjustment methods . . . . . . . . . . . 52
5 Interaction 57
5.1 Interaction requires a joint intervention . . . . . . . . . . . . . . 57
5.2 Identifying interaction . . . . . . . . . . . . . . . . . . . . . . . . 58
5.3 Counterfactual response types and interaction . . . . . . . . . . . 60
5.4 Sufficient causes . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.5 Sufficient cause interaction . . . . . . . . . . . . . . . . . . . . . 65
5.6 Counterfactuals or sufficient-component causes? . . . . . . . . . . 67
6 Graphical representation of causal effects 71
6.1 Causal diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.2 Causal diagrams and marginal independence . . . . . . . . . . . 73
6.3 Causal diagrams and conditional independence . . . . . . . . . . 76
6.4 Positivity and consistency in causal diagrams . . . . . . . . . . . 77
6.5 A structural classification of bias . . . . . . . . . . . . . . . . . . 80
6.6 The structure of effect modification . . . . . . . . . . . . . . . . . 83
7 Confounding 85
7.1 The structure of confounding . . . . . . . . . . . . . . . . . . . . 85
7.2 Confounding and exchangeability . . . . . . . . . . . . . . . . . . 87
7.3 Confounding and the backdoor criterion . . . . . . . . . . . . . . 89
7.4 Confounding and confounders . . . . . . . . . . . . . . . . . . . . 92
7.5 Single-world intervention graphs . . . . . . . . . . . . . . . . . . 95
7.6 Confounding adjustment . . . . . . . . . . . . . . . . . . . . . . . 96
8 Selection bias 101
8.1 The structure of selection bias . . . . . . . . . . . . . . . . . . . 101
8.2 Examples of selection bias . . . . . . . . . . . . . . . . . . . . . . 103
8.3 Selection bias and confounding . . . . . . . . . . . . . . . . . . . 105
8.4 Selection bias and censoring . . . . . . . . . . . . . . . . . . . . . 107
8.5 How to adjust for selection bias . . . . . . . . . . . . . . . . . . . 109
8.6 Selection without bias . . . . . . . . . . . . . . . . . . . . . . . . 113
9 Measurement bias 117
9.1 Measurement error . . . . . . . . . . . . . . . . . . . . . . . . . . 117
9.2 The structure of measurement error . . . . . . . . . . . . . . . . 118
9.3 Mismeasured confounders and colliders . . . . . . . . . . . . . . . 120
9.4 Causal diagrams without measured variables? . . . . . . . . . . . 122
9.5 Many proposed causal diagrams are actually noncausal . . . . . 123
9.6 Does it matter that many proposed diagrams are noncausal? . . 125
10 Random variability 129
10.1 Identification versus estimation . . . . . . . . . . . . . . . . . . 129
10.2 Estimation of causal effects . . . . . . . . . . . . . . . . . . . . 132
10.3 The myth of the super-population . . . . . . . . . . . . . . . . . 134
10.4 The conditionality “principle” . . . . . . . . . . . . . . . . . . . 136
10.5 The curse of dimensionality . . . . . . . . . . . . . . . . . . . . 140
II Causal inference with models 143
11 Why model? 145
11.1 Data cannot speak for themselves . . . . . . . . . . . . . . . . . 145
11.2 Parametric estimators of the conditional mean . . . . . . . . . . 147
11.3 Nonparametric estimators of the conditional mean . . . . . . . 148
11.4 Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
11.5 The bias-variance trade-off . . . . . . . . . . . . . . . . . . . . . 15
12 IP weighting and marginal structural models 155
12.1 The causal question . . . . . . . . . . . . . . . . . . . . . . . . . 155
12.2 Estimating IP weights via modeling . . . . . . . . . . . . . . . . 156
12.3 Stabilized IP weights . . . . . . . . . . . . . . . . . . . . . . . . 159
12.4 Marginal structural models . . . . . . . . . . . . . . . . . . . . . 161
12.5 Effect modification and marginal structural models . . . . . . . 163
12.6 Censoring and missing data . . . . . . . . . . . . . . . . . . . . 164
13 Standardization and the parametric g-formula 167
13.1 Standardization as an alternative to IP weighting . . . . . . . . 167
13.2 Estimating the mean outcome via modeling . . . . . . . . . . . 169
13.3 Standardizing the mean outcome to the confounder distribution 170
13.4 IP weighting or standardization? . . . . . . . . . . . . . . . . . 171
13.5 How seriously do we take our estimates? . . . . . . . . . . . . . 173
14 G-estimation of structural nested models 179
14.1 The causal question revisited . . . . . . . . . . . . . . . . . . . 179
14.2 Exchangeability revisited . . . . . . . . . . . . . . . . . . . . . . 180
14.3 Structural nested mean models . . . . . . . . . . . . . . . . . . 181
14.4 Rank preservation . . . . . . . . . . . . . . . . . . . . . . . . . . 183
14.5 G-estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
14.6 Structural nested models with two or more parameters . . . . . 188
15 Outcome regression and propensity scores 191
15.1 Outcome regression . . . . . . . . . . . . . . . . . . . . . . . . . 191
15.2 Propensity scores . . . . . . . . . . . . . . . . . . . . . . . . . . 193
15.3 Propensity stratification and standardization . . . . . . . . . . . 194
15.4 Propensity matching . . . . . . . . . . . . . . . . . . . . . . . . 196
15.5 Propensity models, structural models, predictive models . . . . 197
16 Instrumental variable estimation 201
16.1 The three instrumental conditions . . . . . . . . . . . . . . . . . 201
16.2 The usual IV estimand . . . . . . . . . . . . . . . . . . . . . . . 204
16.3 A fourth identifying condition: homogeneity . . . . . . . . . . . 206
16.4 An alternative fourth condition: monotonicity . . . . . . . . . . 209
16.5 The three instrumental conditions revisited . . . . . . . . . . . 212
16.6 Instrumental variable estimation versus other methods . . . . . 215
17 Causal survival analysis 219
17.1 Hazards and risks . . . . . . . . . . . . . . . . . . . . . . . . . . 219
17.2 From hazards to risks . . . . . . . . . . . . . . . . . . . . . . . . 221
17.3 Why censoring matters . . . . . . . . . . . . . . . . . . . . . . . 224
17.4 IP weighting of marginal structural models . . . . . . . . . . . . 226
17.5 The parametric g-formula . . . . . . . . . . . . . . . . . . . . . 228
17.6 G-estimation of structural nested models . . . . . . . . . . . . . 229
18 Variable selection for causal inference 233
18.1 The different goals of variable selection . . . . . . . . . . . . . . 233
18.2 Variables that induce or amplify bias . . . . . . . . . . . . . . . 234
18.3 Causal inference and machine learning . . . . . . . . . . . . . . 238
18.4 Doubly robust machine learning estimators . . . . . . . . . . . . 239
18.5 Variable selection is a difficult problem . . . . . . . . . . . . . . 242
III Causal inference from complex longitudinal data 245
19 Time-varying treatments 247
19.1 The causal effect of time-varying treatments . . . . . . . . . . . 247
19.2 Treatment strategies . . . . . . . . . . . . . . . . . . . . . . . . 248
19.3 Sequentially randomized experiments . . . . . . . . . . . . . . . 249
19.4 Sequential exchangeability . . . . . . . . . . . . . . . . . . . . . 251
19.5 Identifiability under some but not all treatment strategies . . . 253
19.6 Time-varying confounding and time-varying confounders . . . . 257
20 Treatment-confounder feedback 259
20.1 The elements of treatment-confounder feedback . . . . . . . . . 259
20.2 The bias of traditional methods . . . . . . . . . . . . . . . . . . 261
20.3 Why traditional methods fail . . . . . . . . . . . . . . . . . . . 263
20.4 Why traditional methods cannot be fixed . . . . . . . . . . . . . 265
20.5 Adjusting for past treatment . . . . . . . . . . . . . . . . . . . . 266
21 G-methods for time-varying treatments 269
21.1 The g-formula for time-varying treatments . . . . . . . . . . . . 269
21.2 IP weighting for time-varying treatments . . . . . . . . . . . . . 274
21.3 A doubly robust estimator for time-varying treatments . . . . . 278
21.4 G-estimation for time-varying treatments . . . . . . . . . . . . . 281
21.5 Censoring is a time-varying treatment . . . . . . . . . . . . . . 289
21.6 The big g-formula . . . . . . . . . . . . . . . . . . . . . . . . . . 292
22 Target trial emulation 297
22.1 Intention-to-treat effect and per-protocol effect . . . . . . . . . 297
22.2 A target trial with sustained treatment strategies . . . . . . . . 301
22.3 Emulating a target trial with sustained strategies . . . . . . . . 305
22.4 Time zero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
22.5 A unified approach to causal inference . . . . . . . . . . . . . . 309
23 Causal mediation 313
23.1 Mediation analysis under attack . . . . . . . . . . . . . . . . . . 313
23.2 A defense of mediation analysis . . . . . . . . . . . . . . . . . . 315
23.3 Empirically verifiable mediation . . . . . . . . . . . . . . . . . . 317
23.4 An interventionist theory of mediation . . . . . . . . . . . . . . 319
References 321
Index 341