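- /* Draw 2 observations within each subj stratum by systematic sampling. */
- /* STRATA requires rvic to be sorted by subj; SEED=0 seeds from the system clock. */
- PROC SURVEYSELECT DATA=rvic OUT=a(keep=subj rp rv ic)
-   METHOD=sys
-   SEED=0
-   SAMPSIZE=2;
-   STRATA subj;
- RUN;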
- proc sort data=a out=a1;
- by subj rp;
- run;
- /* Renumber the sampled rows within each subj: first row rp=1, last row rp=2. */
- data wanted;
- set a1;
- by subj rp;
- if first.subj then rp=1;
- if last.subj then rp=2;
- run;
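The difference from problem (1) is that one takes two pairs and the other takes two draws. With two draws, the result may contain repeated values, so I use the following approach: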
- /* First draw: 1 observation per subj. */
- PROC SURVEYSELECT DATA=rvic(drop=ic) OUT=a(keep=subj rp rv)
-   METHOD=sys
-   SEED=0
-   SAMPSIZE=1;
-   STRATA subj;
- RUN;
- /* Second, separate draw: 1 observation per subj (may repeat the first). */
- PROC SURVEYSELECT DATA=rvic(drop=ic) OUT=a1(keep=subj rp rv)
-   METHOD=sys
-   SEED=0
-   SAMPSIZE=1;
-   STRATA subj;
- RUN;
- proc sql;
- create table a2 as select * from a union all select * from a1;
- quit;
- proc sort data=a2 out=a3;
- by subj rp;
- run;
- data a4;
- set a3;
- by subj rp;
- if first.subj then rp=1;
- if last.subj then rp=2;
- run;
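Problem (3):
For the third problem, this is how I understand what you mean: first take a random sample from the population, drawing 100 subj observations; then, for each rv within those 100 subj, sample using method (1), which draws two pairs of rv; and for each ic, sample using method (2), which draws rv twice.
Based on that understanding, problem (3) can be done like this: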
- /* Generate 100 random integers r, each between 1 and 11. */
- data temp;
- do i=1 to 100;
- r=int(1+11*ranuni(0));
- output;
- end;
- drop i;
- run;
- /* Count how many times each value of r was drawn. */
- proc sql;
- create table temp1 as select r, count(r) as count from temp group by r;
- quit;