Original poster: Lisrelchen

[Case Study] Penalized Semiparametric Additive Hazards Model using R


OP
Lisrelchen, posted 2015-03-21 07:19:06


Authors:

Anders Gorst-Rasmussen, Thomas H. Scheike

Title:

Coordinate Descent Methods for the Penalized Semiparametric Additive Hazards Model

Reference:

Journal of Statistical Software, Vol. 47, Issue 9, April 2012 (submitted 2011-07-05, accepted 2012-03-06)

Type:

Article

Abstract:

For survival data with a large number of explanatory variables, lasso penalized Cox regression is a popular regularization strategy. However, a penalized Cox model may not always provide the best fit to data and can be difficult to estimate in high dimension because of its intrinsic nonlinearity. The semiparametric additive hazards model is a flexible alternative which is a natural survival analogue of the standard linear regression model. Building on this analogy, we develop a cyclic coordinate descent algorithm for fitting the lasso and elastic net penalized additive hazards model. The algorithm requires no nonlinear optimization steps and offers excellent performance and stability. An implementation is available in the R package ahaz. We demonstrate this implementation in a small timing study and in an application to real data.
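The workflow the abstract describes can be sketched in a few lines. The toy data below are hypothetical; the ahazpen(), tune.ahazpen(), plot(), and coef() calls are the same ones used in the replication code posted further down in this thread.

library("survival")   # Surv()
library("ahaz")       # ahazpen(), tune.ahazpen()

set.seed(1)
n <- 100; p <- 500
X <- matrix(rnorm(n * p), ncol = p)
time <- rexp(n, rate = exp(0.5 * X[, 1]))   # toy event times driven by the first covariate
status <- rbinom(n, 1, 0.7)                 # toy censoring indicator
surv <- Surv(time, status)

# Lasso penalized semiparametric additive hazards model, fitted by cyclic coordinate descent
fit <- ahazpen(surv, X)
plot(fit)                                   # coefficient paths along the lambda grid

# Tune lambda by cross-validation and extract the selected coefficients
set.seed(1)
fit.cv <- tune.ahazpen(surv, X, tune = "cv")
beta <- coef(fit.cv)
which(as.numeric(beta) != 0)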




Reply #1
Lisrelchen, posted 2015-03-21 07:20:25
###########
# PACKAGES
###########

library("timereg")
library("lars")
library("ahaz")
library("xtable")

############################
# FUNCTION DEFINITIONS ETC.
############################

# Generate data from an AFT model
# Adapted from the R code accompanying Simon et al. (2011); see http://www.jstatsoft.org/v39/i05/
gendata <- function(n, p, rho, snr = 3)
{
  # Design matrix with pairwise correlation rho between columns
  if (abs(rho) < 1) {
    beta <- sqrt(rho / (1 - rho))
    X0 <- matrix(rnorm(n * p), ncol = p)
    z <- rnorm(n)
    X <- beta * matrix(z, nrow = n, ncol = p, byrow = FALSE) + X0
  }
  if (abs(rho) == 1) {
    z <- rnorm(n)   # single common factor: perfectly correlated columns
    X <- matrix(z, nrow = n, ncol = p, byrow = FALSE)
  }

  # True survival times
  b <- ((-1)^(1:p)) * exp(-2 * ((1:p) - 1) / 20)
  f <- drop(X %*% b)   # drop() so var(f) below is a scalar rather than a 1x1 matrix
  e <- rnorm(n)
  k <- sqrt(var(f) / (snr * var(e)))
  T <- exp(f + k * e)

  # Censoring times
  b <- ((-1)^(1:p)) * exp(-2 * ((1:p) - 1) / 20)
  f <- drop(X %*% b)
  e <- rnorm(n)
  k <- sqrt(var(f) / (snr * var(e)))
  C <- exp(k * e)

  return(list("X" = X, "surv" = Surv(apply(cbind(C, T), 1, min), T < C)))
}

# Local ahaz - get D, d but *not* B
local.ahaz <- function(surv, X)
{
  a <- .C("aha",
          "X" = X,
          "time1" = as.double(rep(0, nrow(X))),
          "time2" = as.double(surv[, 1]),
          "event" = as.integer(surv[, 2]),
          "weights" = as.double(rep(1, nrow(X))),
          "n" = nrow(X),
          "p" = ncol(X),
          "d" = numeric(ncol(X)),
          "D" = numeric(ncol(X)^2),
          "B" = numeric(0),
          "getB" = as.integer(0),
          "univariate" = as.integer(0),
          "usethis" = as.integer(rep(1, ncol(X))),
          "rightcens" = as.integer(1))
  D <- matrix(a$D, ncol = ncol(X))
  D <- D + t(D) - diag(diag(D))   # symmetrize: only one triangle of D is returned
  d <- a$d
  return(list("D" = D, "d" = d, "n" = nrow(X)))
}
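A quick smoke test of these helpers (the parameter values below are hypothetical; local.ahaz() assumes the compiled "aha" routine from the ahaz package is accessible, as in the original replication script):

set.seed(1)
sim <- gendata(n = 100, p = 20, rho = 0.5, snr = 3)
dim(sim$X)        # 100 x 20 design matrix
head(sim$surv)    # right-censored Surv object: min(T, C) with event indicator T < C

# Pre-compute the (D, d) system consumed by the LARS-type solver in the timing code below
prep <- local.ahaz(sim$surv, sim$X)
str(prep)         # p x p matrix D, length-p vector d, sample size n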

Reply #2
Lisrelchen, posted 2015-03-21 07:21:17
##############
# TABLE 1 + 2
##############

# NOTE: Calculations for p=10,000 are memory-intensive!
times <- array(0, c(3, 3, 5, 12))   # repetition x method x rho x (n,p) case
n <- c(rep(200, 4), rep(500, 4), rep(1000, 4))
ptmp <- c(100, 500, 5000, 10000)
p <- rep(ptmp, 3)
rho <- c(0, 0.25, 0.5, 0.9, 0.95)
max.steps <- 100
snr <- 3

# surv.lars(): LARS-type solver on the pre-computed (D, d); it is part of the
# paper's replication materials but is not reproduced in this thread.
for (k in 1:3) {                     # Repetitions
  for (i in 1:12) {                  # Loop over n/p combinations
    cat(paste("Case", i, "- repetition", k, "\n"))
    for (j in 1:5) {                 # Loop over rho
      data <- gendata(n[i], p[i], rho[j], snr)
      times[k, 1, j, i] <- system.time(aha <- local.ahaz(data$surv, data$X))[1]
      times[k, 2, j, i] <- system.time(lar <- surv.lars(aha$D, aha$d, aha$n, max.steps = max.steps))[1] + times[k, 1, j, i]
      lminf <- min(lar$lambda) / max(lar$lambda)
      times[k, 3, j, i] <- system.time(ccd <- ahazpen(data$surv, data$X, lambda.minf = lminf, standardize = FALSE))[1]
    }
  }
}

# LaTeX table (CCD versus LARS+pre-calculation)
avg.tab1 <- apply(times, c(2, 3, 4), mean)
i <- 1
tab <- c()
for (nn in 1:length(unique(n))) {
  tmp <- c()
  for (q in 1:length(ptmp)) {
    tmp <- cbind(tmp, t(avg.tab1[c(3, 2), , i]))
    i <- i + 1
  }
  tab <- rbind(tab, tmp)
}
xtable(tab)

# LaTeX table (pre-calculation only)
avg.tab2 <- apply(times, c(2, 4), mean)[1, ]
i <- 1
tab <- c()
for (nn in 1:length(unique(n))) {
  tmp <- c()
  for (q in 1:length(ptmp)) {
    tmp <- c(tmp, avg.tab2[i])
    i <- i + 1
  }
  tab <- rbind(tab, tmp)
}
xtable(tab)
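Since surv.lars() is not reproduced in this thread, the full Table 1/2 loop cannot be run as posted. A self-contained single-case timing of the coordinate descent solver alone, with hypothetical n, p, and rho, looks like this:

set.seed(2)
data <- gendata(n = 200, p = 500, rho = 0.5, snr = 3)
system.time(ccd <- ahazpen(data$surv, data$X, standardize = FALSE))
length(ccd$lambda)   # number of lambda values on the regularization path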

Reply #3
Lisrelchen, posted 2015-03-21 07:21:55
##########
# TABLE 3
##########

times <- array(0, c(3, 5, 6))
n <- c(200, 200, 200, 40000, 100000, 250000)
p <- c(40000, 100000, 250000, 200, 200, 200)
snr <- 3
rho <- c(0, 0.25, 0.5, 0.9, 0.95)
for (k in 1:1) {                     # Repetitions
  for (i in 1:6) {                   # Loop over n/p combinations
    cat(paste("Case", i, "- repetition", k, "\n"))
    for (j in 1:5) {                 # Loop over rho
      data <- gendata(n[i], p[i], rho[j], snr)
      ccd1 <- ahazpen(data$surv, data$X, dfmax = 120, pmax = 1000, standardize = FALSE)  # Large pmax: many variables considered for rho=.95
      lminf <- ccd1$lambda[which.min(abs(ccd1$df - 100))] / ccd1$lambda[1]
      times[k, j, i] <- system.time(ccd2 <- ahazpen(data$surv, data$X, lambda.minf = lminf, standardize = FALSE))[1]
    }
  }
}

Reply #4
Lisrelchen, posted 2015-03-21 07:23:20
#################
# SORLIE EXAMPLE
#################

# Load data
data("sorlie")
set.seed(10101)
surv <- Surv(sorlie$time + runif(nrow(sorlie)) * 1e-2, sorlie$status)   # small jitter to break ties
Z <- sorlie[, 3:ncol(sorlie)]
p <- ncol(Z)
pw.comb <- combn(1:p, 2)                                # all pairwise combinations of columns
X <- cbind(Z, Z[, pw.comb[1, ]] * Z[, pw.comb[2, ]])    # main effects plus pairwise interactions

# Initial fit
fit.init <- ahazpen(surv, X, dfmax = 50)

# Expand lambda grid
l <- range(fit.init$lambda)
fit <- ahazpen(surv, X, lambda.minf = l[1] / l[2])
plot(fit)

# Cross-validation
set.seed(10101)
fit.tune <- tune.ahazpen(surv, X, lambda.minf = l[1] / l[2], tune = "cv")
plot(fit.tune)

# Final estimate
beta <- coef(fit.tune)
which(as.numeric(beta) != 0)
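As a follow-up not in the original post, the selected column indices can be mapped back to main effects versus pairwise interactions via the pw.comb matrix defined above. This labelling helper is only a sketch, assuming the gene columns of Z keep their original names:

sel <- which(as.numeric(beta) != 0)
sapply(sel, function(j) {
  if (j <= p) {
    colnames(Z)[j]                                 # main effect
  } else {
    paste(colnames(Z)[pw.comb[1, j - p]],          # pairwise interaction
          colnames(Z)[pw.comb[2, j - p]], sep = ":")
  }
})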

