求如何做岭回归 - SPSS论坛

0关注
0粉丝

硕士生

82%

还不是VIP/贵宾

-

0%

威望: 0 级
论坛币: 2 个
通用积分: 0
学术水平: 0 点
热心指数: 0 点
信用等级: 0 点
经验: 830 点
帖子: 214
精华: 0
在线时间: 6 小时
注册时间: 2007-6-2
最后登录: 2023-1-16

楼主

hooyan 发表于 2007-10-14 08:47:00 |AI写论文

是否 +2 论坛币

k人参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群

赵安豆老师微信：zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

立即领取

感谢您参与论坛问题回答

经管之家送您两个论坛币！

+2 论坛币

求如何做岭回归，希望能以这个例子得出各个参数的回归系数，和相关的统计量最好能告诉一下操作方法，谢谢

164005.rar (2.07 KB) 本附件包括：
求岭回归.xls

扫码加我拉你入群

请注明：姓名-公司-职位

以便审核进群资格，未注明则拒绝

分享0 收藏0 回帖

关键词：岭回归回归系数统计量如何统计最好

回帖推荐

hanszhu 发表于3楼查看完整内容

Ridge Regression in SPSS SPSS does lots of things with its GUI interface, but not everything. There are many procedures that can only be used through syntax files. Thus goes back to the old days when SPSS was command driven. The syntax is also useful when you want to repeat a command often. Further, some people write little mini-programs to do specific things in SPSS.& ...

斜阳残雪发表于7楼查看完整内容

我只会SPSS的做法：新建一个语句窗口，然后输入：include'<你的SPSS目录>\ridge regression.sps'.ridgereg dep=y/enter x1 x2 x3.其中y是因变量，x1x2x3是自变量，形式根据自己需要更改

本帖被以下文库推荐

· SPSS精彩问答|主题: 2176, 订阅: 25

沙发

hanszhu 发表于 2007-10-16 08:08:00

For SAS:

THIS RUN DEMONSTRATES ONE APPROACH TO RIDGE REGRESSION.

THIS RUN MAY EASILY BE EXPANDED TO PLOT THE RIDGE TRACES OF

ALL THE B-VALUES AS WELL AS B'B;

DATA;

START=51735717;

DO N=1 TO 50;

     U=RANUNI(START)*5;

     X1=U+RANNOR(START)*.5;

     X2=U+RANNOR(START)*.5;

     X3=U+RANNOR(START)*.5;

     X4=U+RANNOR(START)*.5;

     Y=1+X1+X2+X3+X4+RANNOR(START);

     KEEP X1-X4 Y;

     OUTPUT;

END;

RUN;

PROC IML;

RESET AUTONAME ;

START MAIN;

*------------------ RIDGE REGRESSION ------------------------*;

 N= J({1});

 USE _LAST_ ;

 READ ALL INTO XY ;

 J= NCOL(XY)-{1};

 N= NROW(XY);

 IJ={1}:J;

 XY=XY- J(N,{1})*( J({1},N)*XY* RECIP(N));

 C=XY`*XY;

 S= DIAG( RECIP( SQRT( VECDIAG(C))));

 R=S*C*S;

 PRINT " CORRELATION MATRIX";

 PRINT R;

 SX= S[IJ,IJ];

 SY= RECIP( S[J+{1},J+{1}]);

 RX= R[IJ,IJ];

 RY= R[IJ,J+{1}];

 SKIP 2;

 *-------------- OBTAIN O L S ESTIMATES -------------*;

 CALL EIGEN( M, E, RX);

 GRX=E* DIAG( RECIP( FUZZ(M)))*E`;

 B_OLS=GRX*RY;

 SSE={1}-RY`*GRX*RY;

 MSE=SSE* RECIP(N-J-{1});

 PRINT B_OLS;

 TB_OLS=SX*B_OLS*SY;

 PRINT " OLS ESTIMATES";

 PRINT TB_OLS;

 Q= SSQ(B_OLS)-MSE* TRACE(GRX);

 PRINT Q;

 IF ( Q<={0}) THEN DO;

 PRINT 'Q<=0, K NOT DETERMINED';

 STOP;

 END;

 SKIP 2;

 *---- SOLVE FOR K SUCH THAT SSQ(BK)=Q, BY NEWTONS METHOD ----;

 K={0.5};

 L=E`*RY;

 IT={0};

LOOP: KJ=K* J(J,{1});

 IT=IT+{1};

 IF ( IT>{25}) THEN GOTO GOTK;

 F= SSQ(L# RECIP(M+KJ))-Q;

 IF ( ABS(F)<{1E-6}) THEN GOTO GOTK;

 RMK=(M+KJ)#(M+KJ)#(M+KJ);

 DF={2}* SUM(L#L# RECIP(RMK));

 CF=F* RECIP(DF);

 K=K+CF;

 GOTO LOOP;

GOTK: BK=E* DIAG( RECIP(M+KJ))*E`*RY;

   BKB= SSQ(BK);

   PRINT K, BK, BKB, IT;

   TBK=SX*BK*SY;

   PRINT " RIDGE ESTIMATES";

   PRINT TBK;

*--------- PLOT THE RIDGE TRACE ---------*;

RT: OK= SSQ(B_OLS)||{ 0};

K={0};

LL: K=K+{.1};

 KJ=K* J(J,{1});

 SBK= SSQ(E* DIAG( RECIP(M+KJ))*E`*RY);

 OK=OK//(SBK||K);

 IF ( K<{2}) THEN GOTO LL;

 OK=OK|| J( NROW(OK),{1},Q);

 _TMP_ROW = 'ROW1 ' : compress('ROW'+char(nrow(OK)));

 CREATE KK ( RENAME=(_TMP_ROW=ROW )) FROM OK [ROWNAME=_TMP_ROW ];

 APPEND FROM OK [ROWNAME=_TMP_ROW];

FINISH MAIN;

RUN MAIN;

QUIT;

OPTIONS NOOVP;

PROC PLOT;

PLOT COL2*COL1='*' COL2*COL3='-' / OVERLAY;

TITLE 'RIDGE TRACE'; LABEL COL1='B(K)''B(K)' COL2='K VALUE';

RUN;

藤椅

hanszhu 发表于 2007-10-16 08:36:00

Ridge Regression in SPSS

SPSS does lots of things with its GUI interface, but not everything. There are many procedures that can only be used through syntax files. Thus goes back to the old days when SPSS was command driven. The syntax is also useful when you want to repeat a command often. Further, some people write little mini-programs to do specific things in SPSS. Many of these can be downloaded from http://www.spsstools.net/. The people at SPSS felt ridge regression was so useful that they they include the syntax with the main program. This suggests that it will be in the GUI soon.

To run a ridge regression, create a new syntax file, and type:

INCLUDE 'Ridge regression.sps'.

(The full stop is important!!! SPSS is usually not case sensitive.)

Click the little triangle or go to RUN and either run the whole syntax file or just highlight the part you want to run. If this does not work it is likely because SPSS did not find the file. If so, go to SEARCH FILES in Windows (or whatever it is for a MAC) and find a file called "Ridge Regression.sps". It should be somewhere in the SPSS folder. Write the whole address out. So, for my computer I would write:

INCLUDE 'C:\Program Files\SPSS\Ridge regression.sps'.

Now the program has in its active memory the ridge regression procedure. To run a ridge regression stay in the syntax. Suppose you are trying to predict exam score (EXAM) from marks on 5 assignments (ASSIGN1, ASSIGN2, ...). Write

RIDGEREG DEP=exam /ENTER = assign1 assign2 assign3 assign4 assign5/.

(Again, full stop important).

Highlight this and run. This will give you the two graphs to help you decide how much shrinkage to have. Shrinkage is measured by K. If you choose a good K (like where the coefficients have leveled out, pretend it is .5 here), then type

RIDGEREG DEP=exam /ENTER = assign1 assign2 assign3 assign4 assign5/k=.5.

This will produce the coefficients for this value K. There are more advanced ways to choose k. See Hastie et al. (2001).

NOTE1: It is worth playing around with SPSS syntax. Many people find it better than the windows interface. It can certainly be quicker.

NOTE2: If you have lots of predictor variables, the procedure will not be able to print out all of your output if the screen width is set to the default of most university computers. To change the default, go to EDIT in the menus, then options. You get a screen like this:

Tick the viewer thingee at the top and you get a screen like this:

Your default for text output page size width is probably 80. Just tick the Wide option or the custom one so that it prints wide enough for your output.

It is worth playing around with the various options just to see what SPSS can do.

已有 1 人评分	经验	论坛币	收起理由
bakoll	+ 2	+ 10	精彩帖子

总评分: 经验 + 2 论坛币 + 10 查看全部评分

板凳

hanszhu 发表于 2007-10-16 08:47:00

/* SAS Example of Ridge Regression */ options nocenter pagesize=80 linesize=120; data farmprod; infile '/users/n/newman/public_html/Classes/S510/DATA/farmprod.dat' firstobs=2; input year y x1 x2 x3 x4 x5 x6 x7 x8; /*----- fitting raw data w/ OLS ------------------------------------ */ /* NOTE: SAS scales "automatically" for calculation of eigenvalues so that X'X has 1's on the diagonal */ proc reg; model y = x1 x2 x3 x4 x5 x6 x7 x8/collin VIF; run; /* --- preparation for plotting ridge trace ---------------------- */ title 'Ridge Trace of Farm Production Data'; symbol1 value=x color=black; symbol2 v=circle c=red; symbol3 v=square c=green; symbol4 v=triangle c=blue; symbol5 v=plus c=orange; symbol6 v=6 c=purple; symbol7 v=7 c=brown; symbol8 v=8 c=magenta; /* note valid colors include the above and cyan,gray,pink,white,yellow */ legend2 position=(top right inside) across=3 cborder=black offset=(0,0) label=(color=blue position=(top center) 'Predictors'); /*----- fitting ridge regression -------------------------------- */ proc reg graphics outest=temp ridge=0 to 0.02 by 0.005 outvif; model y = x1 x2 x3 x4 x5 x6 x7 x8/collin VIF; plot/ridgeplot nomodel legend=legend2 nostat vref=0 lvref=1 cvref=blue; run; /* --- looking at coefficients for different shrinkage par. values */ proc print data=temp; run;

报纸

hanszhu 发表于 2007-10-16 08:48:00

For S-Plus

ridge.lm <- function(y,Xmat,k.vec,plot.it=T,crossvalid=F,original.scale=T,mycex=0.7) { #if crossvalid=T, exact PRESS statistic calculated #if original.scale=T, the ridge trace will plot coefficients based on unctr'd, unscaled Xmat n <- length(y) xmeans <- apply(Xmat,2,mean) xscale <- sqrt(apply(Xmat,2,var)*(n-1)) X.std <- cbind(1,scale(Xmat)/sqrt(n-1)) p <- dim(X.std)[[2]] if(sum(k.vec==0)==0) k.vec <- c(0,k.vec) #add 0 shrinkage if not included r <- length(k.vec) betas <- matrix(NA,r,p) VIF <- matrix(NA,r,p-1) C.k <- rep(NA,r) df.k <- rep(NA,r) PR.ridge.k <- rep(NA,r) GCV.k <- rep(NA,r) real.cv <- NA if(crossvalid) real.cv <- rep(NA,r) #--OLS model ols <- lm(y ~ X.std-1) sigma.2 <- (summary(ols)$sigma)^2 #---looping over the different shrinkage parameters #---don't shrink the intercept since centered data for(i in 1:r) { cat("iter=",i,"\n") k <- c(0,rep(k.vec,p-1)) X.k <- t(X.std) %*% X.std + k * diag(p) betas[i,] <- solve(qr(X.k), t(X.std) %*% cbind(y)) errs <- y-X.std%*%cbind(betas[i,]) SSres.k <- sum(errs^2) temp <- solve(t(X.k)) VIF[i,] <- diag(temp%*%t(X.std)%*%X.std%*%temp)[-1] H.k.ii <- diag(X.std[,-1]%*%solve(t(X.std[,-1])%*%X.std[,-1]+k[-1]*diag(p-1))%*%t(X.std[,-1])) C.k <- SSres.k/sigma.2-n+2+2*sum(H.k.ii) df.k <- sum(H.k.ii) PR.ridge.k <- sum((errs/(1-1/n-H.k.ii))^2) GCV.k <- SSres.k/(n-1-df.k)^2 if(crossvalid) { pred.err <- 0 for(j in 1:n) { X <- Xmat[-j,] X1.std <- cbind(1,scale(X)/sqrt(n-2)) X.k <- t(X1.std) %*% X1.std + k * diag(p) b <- solve(qr(X.k), t(X1.std) %*% cbind(y[-j])) xm <- apply(X,2,mean) xs <- sqrt(apply(X,2,var)*(n-2)) omit.x <- c(1,(Xmat[j,]-xm)/xs) errs <- y[j]-sum(omit.x*b) pred.err <- pred.err + errs^2 } real.cv <- pred.err } } #---backtransforming betas to original location and scale of Xmat beta.orig <- t(t(betas[,2:p])/xscale) intercept <- betas[,1]-apply(t(t(beta.orig)*xmeans),1,sum) beta.orig <- cbind(intercept,beta.orig) #---ridge trace plot if(plot.it) { orig.par <- par() par(mfrow=c(3,2),oma=c(0,0,3,0)) if(original.scale) { betas.ts <- as.ts(beta.orig[,-1]) } else { betas.ts <- as.ts(betas[,-1]) } ts.plot(betas.ts,xlab="k",ylab="Betas",main="Ridge trace", axes=F,type='b',lty=1:(p-1),pch=1:(p-1),cex=mycex) axis(side=2) # axis(side=1,at=1:r,labels=as.character(k.vec)) ???doesn't work?? mylab <- as.character(k.vec) axis(side=1,at=1:r,labels=mylab,cex=mycex) box() legend(round(r*.75),max(as.vector(betas.ts)),legend=paste("x",as.character(1:(p-1))), lty=1:(p-1),marks=1:(p-1),cex=mycex) plot(k.vec,C.k,type='l',xlab="k",ylab="C_k",main="C_k vs k",cex=mycex) plot(k.vec,PR.ridge.k,type='l',xlab="k",lty=1, ylab="PR(Ridge)_k",main="Pseudo-Press vs k",cex=mycex) if(crossvalid) { lines(k.vec,real.cv,type='l',lty=2) legend(k.vec[1],max(real.cv,PR.ridge.k),legend=c("PRESS","PR(Ridge)"),lty=2:1,cex=mycex) } plot(k.vec,GCV.k,type='l',xlab="k",ylab="GCV(k)",main="GCV(k) vs k",cex=mycex) plot(k.vec,df.k,type='l',xlab="k",ylab="df_k",main="df vs k",cex=mycex) par(mgp=orig.par$mgp) } dimnames(VIF) <- list(as.character(k.vec),paste("beta",as.character(1:(p-1)))) dimnames(beta.orig) <- list(as.character(k.vec),paste("beta",as.character(0:(p-1)))) return(betas,beta.orig,VIF,C.k,df.k,PR.ridge.k,real.cv,GCV.k) } #---farm data # temp <- read.table("Data/farmprod.dat",header=T,row.names="year") # # k.vec <- seq(0,0.02,by=0.01) # # out <- ridge.lm(temp[,1],temp[,-1],k.vec=k.vec,plot.it=T,crossvalid=T) # #---hospital data temp <- as.matrix(read.table("Data/hospital.dat",header=T,row.names=NULL)) k.vec <- seq(0,0.25,by=0.01) out <- ridge.lm(temp[,6],temp[,-6],k.vec=k.vec,plot.it=T,original.scale=F,crossvalid=T)

地板

大笨蛋 发表于 2007-12-22 16:04:00

都什么乱七八糟的呀

连我都看不懂

7楼

斜阳残雪 发表于 2007-12-22 22:20:00

我只会SPSS的做法：

新建一个语句窗口，然后输入：

include'<你的SPSS目录>\ridge regression.sps'.

ridgereg dep=y/enter x1 x2 x3.

其中y是因变量，x1x2x3是自变量，形式根据自己需要更改

已有 1 人评分	经验	论坛币	收起理由
bakoll	+ 3	+ 10	精彩帖子

总评分: 经验 + 3 论坛币 + 10 查看全部评分

8楼

unicornliu 发表于 2007-12-22 23:25:00

那怎么判定选定的k回归的结果的多重共线性消除情况呢？
如何求vif或dw值啊？
多谢指教~

9楼

hanszhu 发表于 2007-12-23 01:47:00

http://www.ats.ucla.edu/stat/sas/examples/alsm/alsmsasch10.htm

10楼

阳光下的星星 发表于 2008-12-20 18:47:00

用R怎么做啊？最好给出例题~~谢谢啦

[学习资料] 求如何做岭回归 [推广有奖]

经管之家送您一份

经管之家联合CDA

感谢您参与论坛问题回答

扫码加我拉你入群

相关帖子

回帖推荐

本帖被以下文库推荐

浏览过的帖子

浏览过的版块

本版微信群

[学习资料] 求如何做岭回归 [推广有奖]

经管之家送您一份

经管之家联合CDA

感谢您参与论坛问题回答

扫码加我 拉你入群

相关帖子

回帖推荐

本帖被以下文库推荐

浏览过的帖子

浏览过的版块

本版微信群

扫码加我拉你入群