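For example, after fitting a logistic regression, SAS reports a whole series of test statistics, and some of them are not especially intuitive.
The SMC (simple matching coefficient) and the Jaccard coefficient below are two measures described on p. 44 of Pang-Ning Tan et al., Introduction to Data Mining (数据挖掘导论), 人民邮电出版社, 2010. Both are easier to interpret.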
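SMC = number of attributes whose values match / total number of attributes
Jaccard = number of 1-1 matches / number of attributes not involved in 0-0 matches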
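Written with the four pair counts that the macro below accumulates, the two definitions look like this. Here f_{xy} denotes the number of observations with a = x and b = y (so f_{10} counts rows predicted as 1 whose actual value is 0). The third quantity, Event, is not one of the book's two coefficients; it is the extra ratio the macro reports, and it is the same thing as recall.

```latex
\mathrm{SMC} = \frac{f_{11} + f_{00}}{f_{11} + f_{10} + f_{01} + f_{00}},
\qquad
J = \frac{f_{11}}{f_{11} + f_{10} + f_{01}},
\qquad
\mathrm{Event} = \frac{f_{11}}{f_{11} + f_{01}}
```

Because 0-0 pairs appear only in SMC, a data set dominated by 0-0 pairs can have a high SMC even when few of the 1s are predicted correctly, which is why J is worth reporting next to it.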
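Macro parameters:
ins = input data set
a = predicted result
b = actual result
* Here a and b are both binary (boolean) variables taking the values 0 and 1.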
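%macro cpair(ins,a,b);
/* Step 1: count the four kinds of (predicted, actual) pairs. */
data dtem1;
set &ins end=last;
/* Sum statements (f11+1 etc.) start at 0 and are retained across rows. */
/* Note: "lt 1" also matches missing values, so a missing a or b is counted as 0. */
if &a eq 1 and &b eq 1 then f11+1; /* predicted 1, actual 1 */
if &a lt 1 and &b eq 1 then f01+1; /* predicted 0, actual 1 */
if &a lt 1 and &b lt 1 then f00+1; /* predicted 0, actual 0 */
if &a eq 1 and &b lt 1 then f10+1; /* predicted 1, actual 0 */
/* Keep only the last row, which holds the final totals. */
if last;
keep f11 f10 f00 f01;
run;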
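/* Step 2: compute the coefficients and write them to the log. */
data dtem2;
set dtem1;
format smc j event percent7.1;
SMC=(f11+f00)/(f11+f00+f10+f01);
/* Guard against a zero denominator; J and Event stay missing in that case. */
if (f11+f10+f01) gt 0 then J=f11/(f11+f10+f01);
if (f11+f01) gt 0 then Event=f11/(f11+f01);
put "SMC = Simple Matching Coefficient: (1-1 pairs + 0-0 pairs) / all pairs";
put SMC=;
put "********";
put "J = Jaccard coefficient: 1-1 pairs / all pairs excluding 0-0 pairs";
put J=;
put "********";
put "Event = correctly predicted 1s / all actual 1s";
put Event=;
put "********";
run;
%mend cpair;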
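A minimal sketch of how the macro might be called. The data set name scored and the variable names pred and actual are made up for illustration; in practice they would be the scored output of the logistic regression.

```sas
/* Hypothetical test data: 10 observations with          */
/* f11 = 3, f10 = 1, f01 = 2, f00 = 4.                   */
data scored;
  input pred actual;
  datalines;
1 1
1 1
1 1
1 0
0 1
0 1
0 0
0 0
0 0
0 0
;
run;

/* With these counts the log should report                */
/* SMC   = (3+4)/10  = 0.7,                               */
/* J     = 3/(3+1+2) = 0.5,                               */
/* Event = 3/(3+2)   = 0.6, each shown as a percentage.   */
%cpair(scored,pred,actual);
```

The results are written to the SAS log by the put statements, and the same values remain available in the work data set dtem2.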