Original poster: Lisrelchen

[Q&A] How do I perform a left outer join using SPSS commands?

Can SPSS commands (e.g., MERGE FILES) be used to perform a left outer join between 2 SPSS datasets? Assume that the join field is not unique in either dataset.

Example: Let the left dataset, Dataset1, contain 2 fields (ClassNbr and Fact1) and these 4 records:

1 A
1 D
2 A
3 B

Let Dataset2 contain 2 fields (ClassNbr and Fact2) and these 3 records:

1 XX
1 XY
3 ZZ

I want to join Dataset1 and Dataset2 on ClassNbr. The desired result is a 6-record dataset as follows:

1 A XX
1 A XY
1 D XX
1 D XY
2 A (NULL)
3 B ZZ

I would prefer a solution that uses SPSS commands (as opposed to SQL/Python/etc.).



Reply #1
Lisrelchen, posted 2014-5-6 01:46:30
As far as I'm aware, you cannot do this directly. One workaround is to reshape both datasets from long format to wide format (using CASESTOVARS), do the merge, and then reshape the result back into long format (using VARSTOCASES). Below is an example (just ask if any of the code needs clarification).

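* Build the left dataset and name it so that it stays open.
data list free / ClassNbr (F1) Fact1 (A1).
begin data
1 A
1 D
2 A
3 B
end data.
dataset name data1.

* Long to wide: one row per ClassNbr (cases must be sorted by ClassNbr).
casestovars
  /id = ClassNbr.

* Build the right dataset the same way.
data list free / ClassNbr (F1) Fact2 (A2).
begin data
1 XX
1 XY
3 ZZ
end data.
dataset name data2.

casestovars
  /id = ClassNbr.

* Merge the two wide datasets on ClassNbr.
match files file = 'data1'
  /file = 'data2'
  /by ClassNbr.
execute.

* Wide back to long, first for Fact1 and then for Fact2, keeping blank values.
varstocases
  /make Fact1 FROM Fact1.1 to Fact1.2
  /null = KEEP.
varstocases
  /make Fact2 FROM Fact2.1 to Fact2.2
  /null = KEEP.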
This creates some cases that you do not want. Here I have simply defined a set of commands that identify those cases and remove them (I'm sure this could be made more efficient).

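* Now cleaning up the extra records.
* Pass 1: drop exact duplicates of the previous record.
compute flag = 0.
if ClassNbr = lag(ClassNbr) and Fact1 = lag(Fact1) and Fact2 = lag(Fact2) flag = 1.
select if flag = 0.
execute.
* Pass 2: drop records where both Fact1 and Fact2 are blank.
if Fact1 = " " and Fact2 = " " flag = 1.
select if flag = 0.
execute.
* Pass 3: drop a blank Fact2 record that follows a record with the same ClassNbr and Fact1.
if ClassNbr = lag(ClassNbr) and Fact1 = lag(Fact1) and Fact2 = " " flag = 1.
select if flag = 0.
execute.
* Pass 4: drop a blank Fact1 record that follows a record with the same ClassNbr and Fact2.
if ClassNbr = lag(ClassNbr) and Fact2 = lag(Fact2) and Fact1 = " " flag = 1.
select if flag = 0.
execute.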
I'm sure it would be possible to make this more robust (probably by writing some custom Python functions), but hopefully this helps get you started.
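
If it helps to check the result of the syntax above, here is a minimal Python sketch of the join described in the question. It is not SPSS syntax and is not meant to replace the commands above; it only spells out which records a many-to-many left outer join should produce. The names left_outer_join, dataset1 and dataset2 are made up for this illustration, and Python's None stands in for (NULL).

```python
from collections import defaultdict


def left_outer_join(left, right, fill=None):
    """Many-to-many left outer join of (key, value) pairs on the key.

    Hypothetical helper for illustration only; not part of SPSS.
    """
    # Index the right-hand rows by key so each left row can look up its matches.
    matches = defaultdict(list)
    for key, value in right:
        matches[key].append(value)

    joined = []
    for key, value in left:
        # A left row with no match is kept once, with `fill` on the right side.
        for other in matches.get(key, [fill]):
            joined.append((key, value, other))
    return joined


# The example data from the question: (ClassNbr, Fact1) and (ClassNbr, Fact2).
dataset1 = [(1, "A"), (1, "D"), (2, "A"), (3, "B")]
dataset2 = [(1, "XX"), (1, "XY"), (3, "ZZ")]

expected = left_outer_join(dataset1, dataset2)
for row in expected:
    print(row)
```

Comparing these rows with the dataset left after the cleanup passes is a quick way to confirm that no wanted record was dropped and no extra one survived.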
