Tutorials: Causal Models for Mediation Analyses

Lisrelchen posted on 2016-5-29 04:01:07

Zheng, C., Atkins, D. C., & Zhou, A. (2012). Causal models for mediation analyses: An introduction to structural mean models. Manuscript submitted for publication.

Lisrelchen posted on 2016-5-29 04:01:31

### R code to accompany (Updated, 29 August 2012):
#
# Zheng, C., Atkins, D. C., & Zhou, A. (2012). Causal models for mediation analyses: An introduction to structural mean models. Manuscript submitted for publication.
#
### Analyses demonstrating how to fit the RPM model described in
### the paper are shown below. Data originally from:
#
# Whiteside, U., Atkins, D. C., Kleiber, B. V., Neighbors, C., Witkiewitz, K., & Larimer, M. E. (2011). DBT skills plus personalized normative feedback: Results of a randomized clinical trial. Manuscript submitted for publication.
### Import data
newdata <- read.csv("http://depts.washington.edu/cshrb/newweb/stats%20documents/Jan2012Mediation.csv",
header = TRUE)
head(newdata)
summary(newdata)
### NOTE: variables with prefix "b" were assessed at baseline,
### and variables with prefix "o" were assessed at one month
### post intervention
#
### NOTE: original study had 3 treatment groups, but our mediation
### analyses focus only on DBT-BASICS and Control
#
### NOTE: some individuals had missing data at one month and are not
### included here
# Original variables will be re-named to generic names using the
# following conventions:
#
# m = mediator
# y = outcome
# r = intervention indicator
# x.y = covariate for outcome
# x.m = covariate for mediator
#
### Hypothesized mediator: One month DERS
m <- newdata$o_ders
### Convert treatment to binary indicator
r <- 1 - as.numeric(newdata$condition == "Control")
### Outcome: One month BDI
y <- newdata$o_bdi
### Baseline covariate (centered around mean)
x.m <- newdata$b_bai - mean(newdata$b_bai)
x.y <- newdata$b_bai - mean(newdata$b_bai)
### NOTE: In the current usage, x.m and x.y (covariates for mediator model
### and outcome model) are identical; however, they do not have to be,
### and so we show them as two separate terms here
#
### Pull data together into data.frame
df <- data.frame(y = y, m = m, r = r, x.m = x.m, x.y = x.y)
head(df)
### The rank preserving model (RPM), which is a particular way of
### estimating a structural mean model, is fit using estimating
### equations. The following R code gives an example of how to
### fit the RPM to the college student drinking data and directly
### follows the equations in the technical appendix of the ms.
#
### Fit model for m to obtain weight W_i(X) in w1, w2
#
### First, fit linear model of mediator predicted from treatment by
### baseline interaction
modelm <- lm(m ~ r*x.m, data = df)
summary(modelm)
### NOTE: Interaction of treatment and baseline covariate (here, BAI),
### is not very strong (p = 0.19), which will inflate the variance
### of the RPM estimate.
### The following code generates the predicted value of the mediator
### given baseline covariates for each level of the treatment, based
### on Equation 8 in the Technical Appendix
tmp.data <- expand.grid(x.m = df$x.m,
r = c(0,1))
tmp.data$pred <- predict(modelm, tmp.data, type = "response")
head(tmp.data)
### Now, get treatment difference on mediator (conditional on x.m)
w <- tmp.data[tmp.data$r == 1, "pred"] -
tmp.data[tmp.data$r == 0, "pred"]
### "weight 1" is the difference between treatment and average treatment,
### that is, r^tilde - r^bar in equation 7:
w1 <- df$r - mean(df$r)
### "weight 2" is this difference multiplied by w (i.e., the treatment
### difference in the mediator conditional on x.m; see equation 8)
w2 <- (df$r - mean(df$r))*w
### For Equation 7 in the Appendix, we do not include additional
### baseline covariates. See below for estimation using Equation 9.1
### that does include baseline covariates.
#
### To more clearly connect the R code with the Technical Appendix
### we include a representation of Equation 7 here (as best we can
### given unformatted text); plugging in Y^00(b, c') = Y - c'r - bm:
#
### Sigma_i (r-Er) W(x) (Y-c'r-bm) = 0 [7]
### We have already estimated the first two parts, which we combine:
WW <- cbind(w1,w2)
head(WW)
### Relating this to Equation 7 in Technical Appendix:
#
### WW = (r-Er) W(x)
### Re-arranging Eq. 7 above:
#
### (r-Er) W(x) Y = (r-Er) W(x) (r,m) %*% (c',b)
#
### Combine observed treatment and mediator, where (r,m) = MM
MM <- cbind(df$r, df$m)
### To solve for (c',b), pre-multiply both sides by t(WW) and take the final steps:
#
### YY = XX %*% (c',b), where XX = t(WW) %*% MM and YY = t(WW) %*% y, so
#
### theta = (c',b) = XX^{-1} YY.
XX <- crossprod(WW, MM)
YY <- crossprod(WW, df$y)
theta <- tcrossprod(solve(XX), t(YY))
theta
### Calculate y00 and then get sandwich variance estimation
y0 <- df$y - tcrossprod(MM, t(theta))
V <- tcrossprod(tcrossprod(solve(XX), WW * as.vector(y0)))
B3 <- solve(solve(crossprod(WW * as.vector(y0)))[1:2,1:2])
### "Sandwich" variance estimates
V3 <- solve(XX) %*% B3 %*% solve(t(XX))
V3
### Calculate 95% CI
#
### Extract SE from variance-covariance matrix
se <- sqrt(diag(V3))
res3 <- cbind(b = theta[1:2],
lo = theta[1:2] + qnorm(0.025)*se,
hi = theta[1:2] + qnorm(0.975)*se)
round(res3, 2)
### Note: The results reported in the ms use baseline covariates; thus,
### see the following section.
############# Augmented with Baseline Covariates #########
#
### As noted in the Technical Appendix, it is possible to reduce
### the variance of the estimated parameters (i.e., increase
### efficiency), by including baseline covariates as shown in
### Equation 9.1
#
### Based on equations 9.1 and 10.1, we augment the estimating
### equations with an intercept and baseline covariate:
w4 <- 1
w5 <- df$x.y
WW.2 <- cbind(w1,w2,w4,w5)
head(WW.2)
### Note that WW.2 is the matrix shown in the explicit form of the solution,
### directly following presentation of equation 10.1 in the ms. Moreover,
### MM.2 (below) is the middle matrix:
MM.2 <- cbind(df$r, df$m, 1, df$x.y)
### We can then estimate coefficients via:
XX.2 <- crossprod(WW.2, MM.2)
YY.2 <- crossprod(WW.2, df$y)
theta.2 <- tcrossprod(solve(XX.2), t(YY.2))
theta.2
### Calculate y00 and then get sandwich variance estimation
y0.2 <- df$y - tcrossprod(MM.2, t(theta.2))
V.2 <- tcrossprod(tcrossprod(solve(XX.2), WW.2 * as.vector(y0.2)))
A.2 <- XX.2[1:2, 1:2]
B3.2 <- solve(solve(crossprod(WW.2 * as.vector(y0.2)))[1:2,1:2])
### "Sandwich" variance estimates
V3.2 <- solve(A.2) %*% B3.2 %*% solve(t(A.2))
V3.2
### Calculate 95% CI
#
### Extract SE from variance-covariance matrix
se2 <- sqrt(diag(V3.2))
res3.2 <- cbind(b = theta.2[1:2],
lo = theta.2[1:2] + qnorm(0.025)*se2,
hi = theta.2[1:2] + qnorm(0.975)*se2)
round(res3.2, 2)
round(res3, 2)
### Note: comparing the two sets of results shows that including the baseline covariate increased the efficiency of the estimates.
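
As a quick sanity check on the closed-form solution used above, the following sketch refits the unaugmented estimating equations on simulated data and confirms that they are numerically zero at the solution. The data-generating values and the names `sim`, `fit.m`, and `ee` are assumptions made up for illustration; they do not come from the manuscript or the original data set.

```r
### Hedged sketch: simulated data (all numeric values are illustrative assumptions)
set.seed(1)
n <- 500
r <- rbinom(n, 1, 0.5)                          # randomized treatment indicator
x <- rnorm(n)                                   # centered baseline covariate
m <- -1 * r + 0.5 * x + 0.8 * r * x + rnorm(n)  # mediator with r-by-x interaction
y <- -0.5 * r + 0.4 * m + 0.3 * x + rnorm(n)    # outcome generated with c' = -0.5, b = 0.4
sim <- data.frame(y = y, m = m, r = r, x = x)

### Treatment difference on the mediator, conditional on x
fit.m <- lm(m ~ r * x, data = sim)
w <- predict(fit.m, transform(sim, r = 1)) -
  predict(fit.m, transform(sim, r = 0))

### Weights and design matrix, as in the unaugmented fit
WW <- cbind(w1 = sim$r - mean(sim$r), w2 = (sim$r - mean(sim$r)) * w)
MM <- cbind(r = sim$r, m = sim$m)

### Closed-form solution: theta = (t(WW) %*% MM)^{-1} %*% t(WW) %*% y
theta <- solve(crossprod(WW, MM), crossprod(WW, sim$y))

### Estimating equations evaluated at theta; these should be numerically zero
ee <- crossprod(WW, sim$y - MM %*% theta)
print(theta)
print(max(abs(ee)))
```

Here `solve(A, b)` is used instead of `tcrossprod(solve(XX), t(YY))`; both compute the same product, and the two-argument form avoids forming the explicit inverse.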

Lisrelchen posted on 2016-5-29 04:02:45

### Import original data and use the treatment as well as
### baseline BAI as z and x
newdata <- read.csv("Mediation.csv", header = TRUE)
### Generate simulation data, alphau and betau measure
### violation of sequential ignorability and gamma measures
### interaction
gendata <- function(thetaz, thetam, betax, betau, beta0,
alphaz, alphax, alpha0, gamma, alphau) {
z <- 1 - as.numeric(newdata$condition=="Control")
x.m <- newdata$b_bai
x.y <- newdata$b_bai
n <- length(z)
u <- rnorm(n)
m <- alphaz*z + alphax*x.m + gamma*x.m*z + alphau*u +
alpha0 + rnorm(n, sd = 0.2)
y <- thetaz*z + thetam*m + betax*x.y + betau*u + beta0 +
rnorm(n, sd = 0.5)
list(z, m, y, x.m, x.y)
}
library(sandwich) # for robust SE
### RPM Estimation function, return with estimates and CI
### from RPM and OLS
mymed <- function(z, m, y, x.m, x.y){
data <- na.omit(data.frame(z = z, m = m, y = y,
x.m = x.m, x.y = x.y))
modelm <- lm(m ~ z*x.m, data = data)
data1 <- data0 <- data
data1$z <- 1
data0$z <- 0
w <- as.matrix(predict(modelm, data1, type = "response")-predict(modelm, data0, type = "response"))
w1 <- (data$z-mean(data$z))
w2 <- (data$z-mean(data$z))*w
# w4 and w5 are for the augmented estimating equations
w4 <- 1
w5 <- data$x.y
WW <- cbind(w1, w2, w4, w5)
# solve the equation by its explicit expression
MM <- cbind(data$z, data$m, 1, data$x.y)
XX <- crossprod(WW, MM)
YY <- crossprod(WW, data$y)
theta <- tcrossprod(solve(XX), t(YY))
# calculate y00 and then get sandwich variance estimation
y0 <- data$y-tcrossprod(MM, t(theta))
V <- tcrossprod(tcrossprod(solve(XX), WW*as.vector(y0)))
B3 <- solve(solve(crossprod(WW*as.vector(y0))))
A <- XX
V3 <- (solve(A)%*%B3%*%solve(t(A)))[1:2, 1:2]
# Calculate 95% CI
res3 <- cbind(theta[1:2], theta[1:2] + qnorm(0.025)*sqrt(diag(V3)), theta[1:2] + qnorm(0.975)*sqrt(diag(V3)))
# Direct effect (z) and mediator effect (m) via OLS
res <- lm(y~z + m + x.y, data = data)
# Compute 95% CIs for the OLS z and m coefficients using robust (sandwich) SEs
rbind(res3, cbind(coef(res), coef(res)-1.96*sqrt(diag(vcovHC(res))), coef(res) + 1.96*sqrt(diag(vcovHC(res))))[2:3, ])
}
# one time simulation
onesim <- function(betau, gamma){
thetaz <- (-1.55)
thetam <- 0.37
betax <- 0.1
beta0 <- 0
alphaz <- -2.2
alphax <- 0.9
alpha0 <- 0
alphau <- 1
simdata <- gendata(thetaz, thetam, betax, betau, beta0, alphaz, alphax, alpha0, gamma, alphau)
z <- simdata[[1]]
m <- simdata[[2]]
y <- simdata[[3]]
x.m <- simdata[[4]]
x.y <- simdata[[5]]
result <- mymed(z, m, y, x.m, x.y)
bias <- result[,1]-rep(c(thetaz, thetam), 2)
cr <- as.numeric(((result[,2]-rep(c(thetaz, thetam), 2))*(result[,3]-rep(c(thetaz, thetam), 2)))<0)
c(bias, cr)
}
# Simulation
M <- 1000
myfun <- function(betau, gamma) {
res <- replicate(M, onesim(betau, gamma), simplify = TRUE)
# calculate bias and coverage rate
a1 <- apply(res, 1, mean)
# calculate sd
a2 <- apply(res, 1, sd)
out <- cbind(a1[1:4], a2[1:4], a1[5:8])
colnames(out) <- c("Bias","SD","CR")
rownames(out) <- c("RPM.Z","RPM.M","OLS.Z","OLS.M")
round(out, 2)
}
# Sequential ignorability fails and weak interaction
set.seed(111)
myfun(2, -0.1)
### NOTE: ~17 sec for 1,000 simulations on a current MacBook Pro
# Sequential ignorability fails and strong interaction
set.seed(111)
myfun(2, -0.5)
# Sequential ignorability holds and weak interaction
set.seed(111)
myfun(0, -0.1)
# Sequential ignorability holds and strong interaction
set.seed(111)
myfun(0, -0.5)
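
The coverage indicator in `onesim` relies on the fact that a confidence interval covers the true value exactly when (lo - truth) and (hi - truth) have opposite signs. A minimal sketch with made-up interval bounds follows; the helper name `covers` and the numbers are hypothetical and serve only to illustrate the trick.

```r
### Hypothetical helper: 1 if the interval (lo, hi) contains truth, 0 otherwise
covers <- function(lo, hi, truth) as.numeric((lo - truth) * (hi - truth) < 0)

truth <- 0.37                 # illustrative true value
lo <- c(0.10, 0.40, -0.20)    # made-up lower bounds
hi <- c(0.50, 0.90,  0.30)    # made-up upper bounds

### Same answer as the direct comparison lo < truth & truth < hi
cr <- covers(lo, hi, truth)
direct <- as.numeric(lo < truth & truth < hi)
print(cbind(lo, hi, cr, direct))
```

Averaging such indicators over the replications is what produces the "CR" column reported by `myfun`.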

jnjnjn100 posted on 2016-5-29 04:28:15

I have read the forum's new-user guidelines.

solomen313 posted on 2016-5-29 11:00:30

Thanks to the original poster for sharing this great material!

hliu62 posted on 2016-6-7 14:00:59

thanks
