楼主: Lisrelchen
1386 5

Tutorials: Causal Models for Mediation Analyses [推广有奖]

  • 0关注
  • 62粉丝

VIP

已卖:4194份资源

院士

67%

还不是VIP/贵宾

-

TA的文库  其他...

Bayesian NewOccidental

Spatial Data Analysis

东西方数据挖掘

威望
0
论坛币
50288 个
通用积分
83.6306
学术水平
253 点
热心指数
300 点
信用等级
208 点
经验
41518 点
帖子
3256
精华
14
在线时间
766 小时
注册时间
2006-5-4
最后登录
2022-11-6

楼主
Lisrelchen 发表于 2016-5-29 04:01:07 |AI写论文

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币

Zheng, C., Atkins, D. C., & Zhou, A. (2012). Causal models for mediation analyses: An introduction to structural mean models. Manuscript submitted for publication.





二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:mediation Tutorials Tutorial Analyses analyse

本帖被以下文库推荐

沙发
Lisrelchen 发表于 2016-5-29 04:01:31
  1. ### R code to accompany (Updated, 29 August 2012):
  2. #
  3. # Zheng, C., Atkins, D. C., & Zhou, A. (2012). Causal models for mediation analyses: An introduction to structural mean models.  Manuscript submitted for publication.
  4. #
  5. ### Analyses demonstrating how to fit the RPM model described in
  6. ### the paper are shown below.  Data originally from:
  7. #
  8. # Whiteside, U., Atkins, D. C., Kleiber, B. V., Neighbors, C., Witkiewitz, K., & Larimer, M. E. (2011).  DBT skills plus personalized normative feedback: Results of a randomized clinical trial.  Manuscript submitted for publication.

  9. ### Import data
  10. newdata <- read.csv("http://depts.washington.edu/cshrb/newweb/stats%20documents/Jan2012Mediation.csv",
  11.                     header = TRUE)
  12. head(newdata)
  13. summary(newdata)

  14. ### NOTE: variables with prefix "b" were assessed at baseline,
  15. ###                  and variables with prefix "o" were assessed at one month
  16. ###                  post intervention
  17. #
  18. ### NOTE: original study had 3 treatment groups, but our mediation
  19. ###                  analyses focus only on DBT-BASICS and Control
  20. #
  21. ### NOTE: some individuals had missing data at one month and are not
  22. ###                  included here

  23. # Original variables will be re-named to generic names using the
  24. # following conventions:
  25. #   
  26. # m = mediator
  27. # y = outcome
  28. # r = intervention indicator
  29. # x.y = covariate for outcome
  30. # x.m = covariate for mediator
  31. #
  32. ### Hypothesized mediator: One month DERS
  33. m <- newdata$o_ders

  34. ### Convert treatment to binary indicator
  35. r <- 1 - as.numeric(newdata$condition == "Control")

  36. ### Outcome: One month BDI
  37. y <- newdata$o_bdi  

  38. ### Baseline covariate (centered around mean)
  39. x.m <- newdata$b_bai - mean(newdata$b_bai)
  40. x.y <- newdata$b_bai - mean(newdata$b_bai)

  41. ### NOTE: In the current usage, x.m and x.y (covariates for mediator model
  42. ###             and outcome model) are identical; however, they do not have to be,
  43. ###                    and so we show them as two separate terms here
  44. #
  45. ### Pull data together into data.frame
  46. df <- data.frame(y = y, m = m, r = r, x.m = x.m, x.y = x.y)
  47. head(df)

  48. ### The rank preserving model (RPM), which is a particular way of
  49. ### estimating a structural mean model, is fit using estimating
  50. ### equations.  The following R code gives an example of how to
  51. ### fit the RPM to the college student drinking data and directly
  52. ### follows the equations in the technical appendix of the ms.
  53. #
  54. ### Fit model for m to obtain weight W_i(X) in w1, w2
  55. #
  56. ### First, fit linear model of mediator predicted from treatment by
  57. ### baseline interaction
  58. modelm <- lm(m ~ r*x.m, data = df)
  59. summary(modelm)

  60. ### NOTE: Interaction of treatment and baseline covariate (here, BAI),
  61. ###       is not very strong (p = 0.19), which will impact the variance
  62. ###       of the RPM estimate.

  63. ### The following code generates the predicted value of the mediator
  64. ### given baseline covariates for each level of the treatment, based
  65. ### on Equation 8 in the Technical Appendix
  66. tmp.data <- expand.grid(x.m = df$x.m,
  67.                         r = c(0,1))
  68. tmp.data$pred <- predict(modelm, tmp.data, type = "response")
  69. head(tmp.data)

  70. ### Now, get treatment difference on mediator (conditional on x.m)
  71. w <- tmp.data[tmp.data$r == 1, "pred"] -
  72.      tmp.data[tmp.data$r == 0, "pred"]

  73. ### "weight 1" is the difference between treatment and average treatment,
  74. ### that is, r^tilde - r^bar in equation 7:
  75. w1 <- df$r - mean(df$r)

  76. ### "weight 2" is this difference multiplied by w (ie, treatment difference in
  77. ### mediator conditional on x.m; see equation 8)
  78. w2 <- (df$r - mean(df$r))*w

  79. ### For Equation 7 in the Appendix, we do not include additional
  80. ### baseline covariates.  See below for estimation using Equation 9.1
  81. ### that does include baseline covariates.
  82. #
  83. ### To more clearly connect the R code with the Technical Appendix
  84. ### we include a representation of Equation 7 here (as best we can
  85. ### given unformatted text); plugging in Y^00(b, c') = Y - c'r - bm:
  86. #
  87. ### Sigma_i (r-Er) W(x) (Y-c'r-bm) = 0      [7]

  88. ### We have already estimated the first two parts, which we combine:
  89. WW <- cbind(w1,w2)
  90. head(WW)

  91. ### Relating this to Equation 7 in Technical Appendix:
  92. #
  93. ### WW = (r-Er) W(x)

  94. ### Re-arranging Eq. 7 above:
  95. #
  96. ### (r-Er) W(x) Y = (r-Er) W(x) (r,m) %*% (c',b)
  97. #
  98. ### Combine observed treatment and mediator, where (r,m) = MM
  99. MM <- cbind(df$r, df$m)

  100. ### To solve for (c',b), we take final steps:
  101. #
  102. ### YY = MM %*% (c',b) and
  103. #
  104. ### theta = (c',b) = MM^{-1} YY.
  105. XX <- crossprod(WW, MM)
  106. YY <- crossprod(WW, df$y)
  107. theta <- tcrossprod(solve(XX), t(YY))
  108. theta

  109. ### Calculate y00 and then get sandwich variance estimation
  110. y0 <- df$y - tcrossprod(MM, t(theta))
  111. V <- tcrossprod(tcrossprod(solve(XX), WW * as.vector(y0)))
  112. B3 <- solve(solve(crossprod(WW * as.vector(y0)))[1:2,1:2])

  113. ### "Sandwich" variance estimates
  114. V3 <- solve(XX) %*% B3 %*% solve(t(XX))
  115. V3

  116. ### Calculate 95% CI
  117. #
  118. ### Extract SE from variance-covariance matrix
  119. se <- sqrt(diag(V3))
  120. res3 <- cbind(b = theta[1:2],
  121.                           lo = theta[1:2] + qnorm(0.025)*se,
  122.                           hi = theta[1:2] + qnorm(0.975)*se)
  123. round(res3, 2)

  124. ### Note: The results reported in the ms use baseline covariates; thus,
  125. ### see the following section.

  126. ############# Augmented with Baseline Covariates #########
  127. #
  128. ### As noted in the Technical Appendix, it is possible to reduce
  129. ### the variance of the estimated parameters (i.e., increase
  130. ### efficiency), by including baseline covariates as shown in
  131. ### Equation 9.1
  132. #
  133. ### Based on equations 9.1 and 10.1, we augment the estimating
  134. ### equations with an intercept and baseline covariate:
  135. w4 <- 1
  136. w5 <- df$x.y
  137. WW.2 <- cbind(w1,w2,w4,w5)
  138. head(WW.2)

  139. ### Note that WW.2 is the matrix shown in the explicit form of the solution,
  140. ### directly following presentation of equation 10.1 in the ms.  Moreover,
  141. ### MM.2 (below) is the middle matrix:
  142. MM.2 <- cbind(df$r, df$m, 1, df$x.y)

  143. ### We can then estimate coefficients via:
  144. XX.2 <- crossprod(WW.2, MM.2)
  145. YY.2 <- crossprod(WW.2, df$y)
  146. theta.2 <- tcrossprod(solve(XX.2), t(YY.2))
  147. theta.2

  148. ### Calculate y00 and then get sandwich variance estimation
  149. y0.2 <- df$y - tcrossprod(MM.2, t(theta.2))
  150. V.2 <- tcrossprod(tcrossprod(solve(XX.2), WW.2 * as.vector(y0.2)))
  151. A.2 <- XX.2[1:2, 1:2]
  152. B3.2 <- solve(solve(crossprod(WW.2 * as.vector(y0.2)))[1:2,1:2])

  153. ### "Sandwich" variance estimates
  154. V3.2 <- solve(A.2) %*% B3.2 %*% solve(t(A.2))
  155. V3.2

  156. ### Calculate 95% CI
  157. #
  158. ### Extract SE from variance-covariance matrix
  159. se2 <- sqrt(diag(V3.2))
  160. res3.2 <- cbind(b = theta.2[1:2],
  161.               lo = theta.2[1:2] + qnorm(0.025)*se2,
  162.               hi = theta.2[1:2] + qnorm(0.975)*se2)
  163. round(res3.2, 2)
  164. round(res3, 2)

  165. ### Note that we also see that including the baseline covariate increased the efficiency of the estimates.
复制代码

藤椅
Lisrelchen 发表于 2016-5-29 04:02:45
  1. ### Import original data and use the treatment as well as
  2. ### baseline BAI as z and x

  3. newdata <- read.csv("Mediation.csv", header = TRUE)

  4. ### Generate simulation data, alphau and betau measure
  5. ### violation of sequential ignorability and gamma measures
  6. ### interaction

  7. gendata <- function(thetaz, thetam, betax, betau, beta0,
  8.                                         alphaz, alphax, alpha0, gamma, alphau) {

  9. z <- 1 - as.numeric(newdata$condition=="Control")
  10. x.m <- newdata$b_bai
  11. x.y <- newdata$b_bai
  12. n <- length(z)
  13. u <- rnorm(n)
  14. m <- alphaz*z + alphax*x.m + gamma*x.m*z + alphau*u +
  15.           alpha0 + rnorm(n, sd = 0.2)

  16. y <- thetaz*z + thetam*m + betax*x.y + betau*u + beta0 +
  17.                 rnorm(n, sd = 0.5)

  18. list(z, m, y, x.m, x.y)
  19. }

  20. library(sandwich) # for robust SE

  21. ### RPM Estimation function, return with estimates and CI
  22. ### from RPM and OLS

  23. mymed <- function(z, m, y, x.m, x.y){

  24. data <- na.omit(data.frame(z = z, m = m, y = y,
  25.                                 x.m = x.m, x.y = x.y))

  26. modelm <- lm(m ~ z*x.m, data = data)
  27. data1 <- data0 <- data
  28. data1$z <- 1
  29. data0$z <- 0

  30. w <- as.matrix(predict(modelm, data1, type = "response")-predict(modelm, data0, type = "response"))
  31. w1 <- (data$z-mean(data$z))
  32. w2 <- (data$z-mean(data$z))*w

  33. # w4,  w5 is for the augmented estimating equations

  34. w4 <- 1
  35. w5 <- data$x.y
  36. WW <- cbind(w1, w2, w4, w5)

  37. # solve the equation by its explicit expression

  38. MM <- cbind(data$z, data$m, 1, data$x.y)
  39. XX <- crossprod(WW, MM)
  40. YY <- crossprod(WW, data$y)
  41. theta <- tcrossprod(solve(XX), t(YY))

  42. # calculate y00 and then get sandwich variance estimation

  43. y0 <- data$y-tcrossprod(MM, t(theta))
  44. V <- tcrossprod(tcrossprod(solve(XX), WW*as.vector(y0)))
  45. B3 <- solve(solve(crossprod(WW*as.vector(y0))))
  46. A <- XX
  47. V3 <- (solve(A)%*%B3%*%solve(t(A)))[1:2, 1:2]

  48. # Calculate 95% CI
  49. res3 <- cbind(theta[1:2], theta[1:2] + qnorm(0.025)*sqrt(diag(V3)), theta[1:2] + qnorm(0.975)*sqrt(diag(V3)))

  50. #Indirect effect and Mediated effect via OLS
  51. res <- lm(y~z + m + x.y, data = data)

  52. #Compute 95% CI for indirect and mediated effect
  53. rbind(res3, cbind(coef(res), coef(res)-1.96*sqrt(diag(vcovHC(res))), coef(res) + 1.96*sqrt(diag(vcovHC(res))))[2:3, ])

  54. }

  55. # one time simulation
  56. onesim <- function(betau = betau, gamma = gamma){

  57. thetaz <- (-1.55)
  58. thetam <- 0.37
  59. betax <- 0.1
  60. beta0 <- 0
  61. alphaz <- -2.2
  62. alphax <- 0.9
  63. alpha0 <- 0
  64. alphau <- 1

  65. simdata <- gendata(thetaz, thetam, betax, betau, beta0, alphaz, alphax, alpha0, gamma, alphau)

  66. z <- simdata[[1]]
  67. m <- simdata[[2]]
  68. y <- simdata[[3]]
  69. x.m <- simdata[[4]]
  70. x.y <- simdata[[5]]

  71. result <- mymed(z, m, y, x.m, x.y)

  72. bias <- result[,1]-rep(c(thetaz, thetam), 2)

  73. cr <- as.numeric(((result[,2]-rep(c(thetaz, thetam), 2))*(result[,3]-rep(c(thetaz, thetam), 2)))<0)

  74. c(bias, cr)

  75. }

  76. # Simulation
  77. M <- 1000

  78. myfun <- function(betau = betau, gamma = gamma) {

  79.         res <- replicate(M, onesim(betau, gamma), simplify = TRUE)

  80.         # calculate bias and coverage rate
  81.         a1 <- apply(res, 1, mean)

  82.         # calculate sd
  83.         a2 <- apply(res, 1, sd)

  84.         out <- cbind(a1[1:4], a2[1:4], a1[5:8])
  85.         colnames(out) <- c("Bias","SD","CR")
  86.         rownames(out) <- c("RPM.Z","RPM.M","OLS.Z","OLS.M")
  87.         round(out, 2)
  88. }

  89. # Sequential ignorability fails and weak interaction
  90. set.seed(111)
  91. myfun(2, -0.1)
  92. ### NOTE: ~17 sec for 1,000 simulations on a current MacBook Pro

  93. # Sequential ignorability fail and strong interaction
  94. set.seed(111)
  95. myfun(2, -0.5)

  96. # Sequential ignorability hold and weak interaction
  97. set.seed(111)
  98. myfun(0, -0.1)

  99. # Sequential ignorability hold and strong interaction
  100. set.seed(111)
  101. myfun(0, -0.5)
复制代码
已有 1 人评分经验 收起 理由
dxystata + 100 精彩帖子

总评分: 经验 + 100   查看全部评分

板凳
jnjnjn100 发表于 2016-5-29 04:28:15
已经阅读论坛新手操作内容已经阅读论坛新手操作内容已经阅读论坛新手操作内容已经阅读论坛新手操作内容

报纸
solomen313 发表于 2016-5-29 11:00:30
感谢楼主提供好资料!

地板
hliu62 发表于 2016-6-7 14:00:59
thanks

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注jltj
拉您入交流群
GMT+8, 2026-1-2 22:57