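1. First, split by the four levels of var6
Does the OP want to run grouped calculations over the four levels of var6, or split the data into four separate data sets?
If it is the latter, you can do it like this: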
data Level_A Level_B Level_C Level_D;
set test;
if var6 = "Level A" then output Level_A;
else if var6 = "Level B" then output Level_B;
else if var6 = "Level C" then output Level_C;
else if var6 = "Level D" then output Level_D;
run;
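If SAS does not have enough memory, you can first cut the data into a few smaller data sets, then split each one by level, and then combine the pieces.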
data subset_1;
set test (firstobs=1 obs=100);
run;
data subset_2;
set test(firstobs=101 obs=200);
run;
data subset_3;
set test(firstobs=201);
run;
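The code above only covers the cutting step. A minimal sketch of the remaining "split by level, then combine" steps could look like the following. The macro name split_chunk and the intermediate data set names A_1 ... D_3 are made up for illustration, and it assumes subset_1 to subset_3 already exist as created above.

```sas
/* Hypothetical helper: split one chunk (subset_&i) by the level of var6. */
%macro split_chunk(i);
  data A_&i B_&i C_&i D_&i;
    set subset_&i;
    if var6 = "Level A" then output A_&i;
    else if var6 = "Level B" then output B_&i;
    else if var6 = "Level C" then output C_&i;
    else if var6 = "Level D" then output D_&i;
  run;
%mend split_chunk;

%split_chunk(1)
%split_chunk(2)
%split_chunk(3)

/* Concatenate the per-chunk pieces back into one data set per level. */
data Level_A; set A_1 A_2 A_3; run;
data Level_B; set B_1 B_2 B_3; run;
data Level_C; set C_1 C_2 C_3; run;
data Level_D; set D_1 D_2 D_3; run;
```

The result should match what the single DATA step with four output data sets produces, just built in smaller pieces.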
2. Compute the percentages grouped by var6
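proc sql;
/* Selecting var1-var5 together with GROUP BY var6 makes SAS remerge the group totals onto every row. */
select var1, var2, var3, var4, var5, var6,
sum(case when var5 = 0 then 1 else 0 end) as count_v5_0,
/* In SAS, a missing value satisfies "ne 0", so exclude missing values explicitly. */
sum(case when var5 ne 0 and var5 is not missing then 1 else 0 end) as count_v5_,
count(var5) as nv5, /* number of non-missing var5 values in the group */
calculated count_v5_0 / calculated nv5 as percent_zero,
calculated count_v5_ / calculated nv5 as percent_nonzero
from test
group by var6;
quit;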
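If only one summary row per level is needed, rather than the group totals repeated on every observation, a variant like the sketch below may be easier to read. The output table name pct_by_level and the column name count_v5_nonzero are made up for illustration. It relies on PROC SQL evaluating a comparison such as var5 = 0 to 1 or 0, and it drops rows where var5 is missing before counting.

```sas
proc sql;
  /* One row per level of var6; pct_by_level is a hypothetical table name. */
  create table pct_by_level as
  select var6,
         count(var5)     as nv5,               /* non-missing var5 per level */
         sum(var5 = 0)   as count_v5_0,        /* comparison yields 1 or 0 */
         sum(var5 ne 0)  as count_v5_nonzero,
         calculated count_v5_0 / calculated nv5       as percent_zero,
         calculated count_v5_nonzero / calculated nv5 as percent_nonzero
  from test
  where var5 is not missing
  group by var6;
quit;
```

Because only var6 and summary expressions appear in the select list, SAS does not need to remerge anything, and the two percentages add up to 1 within each level.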