Thread starter: zhou.wen

[Original post] Hashing, the best tool for searching

1 (original poster)
zhou.wen posted 2013-5-23 17:16:34

Searching is one of the most widely used techniques. It includes merges, joins, formats, indexes and other special functions.
Merge, SQL and conditional logic are all based on comparison. Because each key has to be compared with one or more keys in another table, these methods are slow and can use a lot of memory. What is more, if you use MERGE, you have to sort the data first. If the data set happens to be large, the MERGE method would be the worst.
Hashing is a better method, based on direct addressing, and it is much more convenient.
Here is a demo problem to solve with a hash. You can also try it another way, such as SQL or MERGE, and compare their efficiency.

  /* Table A: main data set; TYPE holds a comma-separated list of codes. */
  data a;
    input id type $;
    cards;
  1  a,b
  2  a
  3  a,b,c
  ;
  run;

  /* Table B: lookup table mapping each code (KEY) to its value (DATA). */
  data b;
    input key $ data $;
    cards;
  a         xy
  b         er
  c         abc
  ;
  run;
Table A is the main data set. The task is to replace the codes in the variable 'type' using table B, for example replacing 'a' with 'xy'. As a result, the first line would be:
  1 xy,er
I will post the solution later. Discussion is welcome.
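Until the author's own solution appears, here is a minimal sketch of how a hash object could handle the problem above. It is not the original poster's code. The output data set name want, the helper variables old_type and i, and the decision to leave unmatched codes unchanged are all assumptions made for illustration.

```sas
/* Sketch only: load table B into a hash object, then translate each code in A. */
data want;
  length key $8 data $8 type $100;
  /* Load B into memory once, on the first iteration of the DATA step. */
  if _n_ = 1 then do;
    declare hash h(dataset: 'b');
    h.defineKey('key');
    h.defineData('data');
    h.defineDone();
    call missing(key, data);   /* avoids "uninitialized variable" notes */
  end;
  set a(rename=(type=old_type));
  call missing(type);
  do i = 1 to countw(old_type, ',');
    key = scan(old_type, i, ',');
    /* find() returns 0 on a match and copies DATA from the hash. */
    if h.find() = 0 then type = catx(',', type, data);
    else type = catx(',', type, key);   /* assumption: keep unmatched codes as they are */
  end;
  keep id type;
run;
```

No sorting is needed here: A is read once in its original order, and each code is resolved by a direct key lookup rather than by comparing against the rows of B.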




Practice Is The Best Teacher!

2
zkymath (employment verified) posted 2013-5-23 23:21:02
Very good.

3
zhou.wen posted 2013-5-24 10:17:03
The 'merge' version:
  /* Sample data: A holds comma-separated codes, B maps each code to a value. */
  data a;
    input id type $;
    cards;
  1 a,b
  2 a
  3 a,b,c
  ;
  run;
  data b;
    input key $ data $;
    cards;
  a xy
  b er
  c abc
  ;
  run;

  /* Step 1: split TYPE into one row per code. */
  data out;
    set a;
    i = 1;
    do while (scan(type, i) ^= '');
      key = scan(type, i);
      i = i + 1;   /* I is one past the code's position when the row is written */
      output;
    end;
  run;

  /* Step 2: sort by KEY and merge with B (B is already in key order). */
  proc sort data=out; by key; run;
  data out;
    merge out b;
    by key;
    keep id data i;
    i = i - 1;     /* restore the 1-based position within the list */
  run;

  /* Step 3: restore the original order and concatenate the values per ID. */
  proc sort data=out; by id i; run;
  data out;
    length id 8 type $100;
    retain type;
    set out;
    by id i;
    if first.id then type = compress(data);
    else type = compress(type) || ',' || compress(data);
    if last.id then output;
    keep id type;
  run;

4
邓贵大 posted 2013-5-24 11:09:08
Professor, I'd pick the FORMAT method for your sample question because of the single key variable.
Be still, my soul: the hour is hastening on
When we shall be forever with the Lord.
When disappointment, grief and fear are gone,
Sorrow forgot, love's purest joys restored.
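The FORMAT method suggested in this reply was not posted in the thread. One way it could look is sketched below, assuming tables A and B from the first post. The format name $typefmt and the data set names fmt_ctl and want_fmt are hypothetical, and the sketch assumes every code in A exists in B.

```sas
/* Sketch only: turn table B into a character format via a CNTLIN data set. */
data fmt_ctl;
  set b(rename=(key=start data=label));
  retain fmtname '$typefmt' type 'C';   /* hypothetical format name */
run;

proc format cntlin=fmt_ctl;
run;

/* Apply the format to each comma-separated code and rebuild the list. */
data want_fmt;
  set a(rename=(type=old_type));
  length type $100;
  do i = 1 to countw(old_type, ',');
    type = catx(',', type, put(scan(old_type, i, ','), $typefmt.));
  end;
  keep id type;
run;
```

Like the hash approach, this needs no sorting of A; the lookup table is simply held in a format instead of a hash object.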

5
zhou.wen posted 2013-5-24 11:34:45
Quoting 邓贵大 (2013-5-24 11:09): Professor, I'd pick the FORMAT method for your sample question because of the single key variable.
Can you share the code for the FORMAT method?

6
iavjssssmqee posted 2013-5-24 11:49:47
The original poster is skilled enough to run a column of their own.

Once you have decided and your heart is steadfast, you will not be afraid!!!

7
iavjssssmqee posted 2013-5-24 11:52:42
I hope the moderators can invite the original poster to start a small SAS study column and share something with everyone every few days.
Thanks.

8
邓贵大 posted 2013-5-24 11:59:32
Quoting zhou.wen (2013-5-24 11:34): Can you share the code for the FORMAT method?
No, you keep lecturing! I was just trying to bump up your post!
http://www.theprogrammerscabin.com/NS9PO13P.pdf
http://support.sas.com/resources/papers/proceedings09/071-2009.pdf

9
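ntsean posted 2013-5-24 13:52:13
Let me try too.

/* Store each KEY -> DATA pair from B in a macro variable named after the key. */
data _null_;
  set b;
  call symputx(key, data);
run;

/* Rebuild the list by looking up each comma-separated code with SYMGET. */
data c;
  set a;
  length str $100;
  k = 1;
  str = "";
  do while (scan(type, k, ',') ne "");
    if k = 1 then str = symget(scan(type, k, ','));
    else str = strip(str) || "," || symget(scan(type, k, ','));
    k = k + 1;
  end;
  drop k;
run;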

10
zhou.wen posted 2013-5-24 13:58:36
Quoting ntsean (2013-5-24 13:52): Let me try too. data _null_;
Good try, using macro variables.
