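I tried this with a DATA step; a macro version could be written when there is time. The idea is to use MERGE with the IN= option to flag where each observation comes from, then combine those flags according to the original poster's rules.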
- /* Read in data set a */
- data a;
- input ida $ idb $;
- cards;
- a b
- a c
- a d
- b c
- d f
- j f
- g h
- ;
- run;
- /* Read in data set b */
- data b;
- input idm $;
- cards;
- a
- b
- c
- d
- e
- f
- ;
- run;
- /* Approach: merge a with b twice (first by ida, then by idb) and use the IN= flags to tell whether each value of ida or idb also appears in b */
- proc sort data=a out=aa;
- by ida;
- run;
- proc sort data=b out=bb;
- by idm;
- run;
- /* First merge aa with bb by ida and flag the source: group_a=1 if the ida value from aa also exists in bb, otherwise 0 */
- data both1;
- merge aa (in=ina) bb(rename=(idm=ida) in=inb);
- by ida;
- if ina=0 & inb=1 then delete;
- if ina & inb then group_a=1;
- else group_a=0;
- run;
- proc print data=both1;
- run;
- /* Sort the merged table by idb */
- proc sort data=both1;
- by idb;
- run;
- /* Then merge both1 with bb by idb (idm renamed to idb): group_b=1 if the idb value exists in bb, otherwise 0 */
- data both;
- merge both1(in=ina) bb(rename=(idm=idb) in=inb);
- by idb;
- if ina=0 & inb=1 then delete;
- if ina & inb then group_b=1;
- else group_b=0;
- run;
- proc print data=both;
- run;
- /* Sort the final table and derive groupa and groupb from the two flags according to the original poster's rules */
- proc sort data=both;
- by ida idb;
- run;
- data result;
- set both;
- by ida idb;
- if group_a+group_b=2 then groupa=1;
- else groupa=0;
- if group_a+group_b ge 1 then groupb=1;
- else groupb=0;
- keep ida idb groupa groupb;
- run;
- proc print data=result;
- run;
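For comparison, the same two flags can probably be produced in a single PROC SQL step without the intermediate sorts. This is only a sketch, not the method used above: it assumes the data sets a and b from the program already exist in WORK and that idm has no duplicate values in b (a duplicate would multiply rows in the join). The names result_sql, b1 and b2 are made up for illustration.

```sas
/* Sketch: look up ida and idb in b with two left joins.            */
/* Assumes idm is unique in b; result_sql, b1, b2 are illustrative. */
proc sql;
  create table result_sql as
  select a.ida,
         a.idb,
         /* 1 when both ida and idb are found in b, otherwise 0 */
         (b1.idm is not null and b2.idm is not null) as groupa,
         /* 1 when at least one of ida, idb is found in b, otherwise 0 */
         (b1.idm is not null or b2.idm is not null) as groupb
  from a
       left join b as b1 on a.ida = b1.idm
       left join b as b2 on a.idb = b2.idm
  order by a.ida, a.idb;
quit;

proc print data=result_sql;
run;
```

With the sample data, the pair j f should come out as groupa=0, groupb=1 (only f is in b), and g h should get 0 for both, which matches what the DATA step version is meant to produce.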