A表有一列关于地址的变量address,B表有一列关于小区的变量village。
现在我想创建一张表C,包含全部A表的内容,且满足:
如果A.address中包含B.village,则把B.village添加到A表后面,
如果A.address中不包含B.village,则添加为缺失。
我的代码如下:
proc sql;
create table c as
select a.*,d.village as village
from a
left join
(select a.address,b.village
from a,b
where find(address,compress(village))>=1) as d
on a.address=d.address;
quit;
结果:C表的观测数大于A表,猜测是因为A.address中某些观测同时包含B.village中多条观测引起的。
请问有什么方法可以解决?