Indexes?

Thread starter: ltjzzyz

Original post

ltjzzyz posted on 2007-3-19 19:29:00

A newbie question: in the MODIFY statement, what does the index named in the KEY= option mean? Also, what is the purpose of an index on a data set?

Reply #1

sumc posted on 2007-3-20 09:03:00

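A data set index exists to make lookups faster. It works much like the lookup tables in a dictionary, which list every character by pinyin or by stroke count so that you can find it quickly.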

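An index does not help in every situation, though. For example, when you need to read the whole data set sequentially, there is no point in building one. Indexes are usually created on the variables that appear in WHERE conditions. Once such an index exists, queries and updates that touch only a small share of the observations can become considerably faster.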
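To make the point about WHERE variables concrete, here is a minimal sketch. The data set SALES and its variables REGION and AMOUNT are made up for illustration and do not come from this thread. It builds a simple index with PROC DATASETS and then runs a WHERE subset on the indexed variable. MSGLEVEL=I asks SAS to write INFO notes about index usage to the log. On a data set this small SAS may well decide that a sequential read is cheaper, so treat the log note as something to check rather than a guarantee.

```sas
/* Hypothetical example data: WORK.SALES with a REGION variable */
data sales;
   input region $ amount @@;
   cards;
east 10 west 20 east 30 north 40 west 50
;
run;

/* Create a simple index on REGION for the existing data set */
proc datasets library=work nolist;
   modify sales;
   index create region;
quit;

/* Ask SAS to report index usage decisions in the log */
options msglevel=i;

/* A WHERE subset on the indexed variable is a candidate for index use */
data east_only;
   set sales;
   where region = 'east';
run;
```

The same index could also be requested when the data set is created, with the INDEX= data set option, which is what the example later in this thread does.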

Reply #2

ltjzzyz posted on 2007-3-20 11:36:00

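Many thanks to the poster above. Below is a MODIFY step that uses an index. I don't really understand how it runs or why it produces the result it does. Any guidance would be appreciated.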

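/* Master data set with a simple (non-unique) index on LOCATE */
data master (index=(locate));
   input locate $ code @@;
   cards;
a 200 a 201 b 100 a 202 a 203 b 101 c 600 d 700 d 701
;
/* Transaction data set: NEWCODE values keyed by LOCATE */
data keyvals;
   input locate $ newcode @@;
   cards;
b 1 a 2 a 3 a 4 b 11 a 12 c 6 d 16 d 7
;
/* For each transaction, find a MASTER observation through the LOCATE index and overwrite CODE */
data master;
   set keyvals;
   modify master key=locate;
   code=newcode;
run;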

Reply #3

sumc posted on 2007-3-20 16:38:00

The SAS help describes using an index with the MODIFY statement as follows:

If there are duplicate values of the indexed variable in the master data set, only the first occurrence is retrieved, modified, or replaced. Use a DO LOOP to execute a SET statement with the KEY= option multiple times to update all duplicates with the transaction value.

If there are duplicate, nonconsecutive values in the like-named variable in the data source, MODIFY applies each transaction cumulatively to the first observation in the master data set whose index value matches the values from the data source. Therefore, only the value in the last duplicate transaction is the result in the master observation unless you write an accumulation statement to accumulate each duplicate transaction value in the master observation.

If there are duplicate, consecutive values in the variable in the data source, the values from the first observation in the data source are applied to the master data set, but the DATA step terminates with an error when it tries to locate an observation in the master data set for the second duplicate from the data source. To avoid this error, use the UNIQUE option in the MODIFY statement. The UNIQUE option causes SAS to return to the top of the master data set before retrieving a match for the index value. You must write an accumulation statement to accumulate the values from all the duplicates. If you do not, only the last one applied is the result in the master observation.

If there are duplicate index values in both data sets, you can use SQL to apply the duplicates in the transaction data set to the duplicates in the master data set in a one-to-one correspondence.
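As a rough illustration of the UNIQUE option and the accumulation statement mentioned in the help text, the sketch below reruns the example from this thread with two changes. Both changes are my own choices and not part of the original step: `/ unique` sends every lookup back to the top of the index, and `code = code + newcode` accumulates instead of overwriting. The `_IORC_` check with the `%SYSRC` mnemonics `_SOK` and `_DSENOM` is the usual way to tell a successful lookup from "no match found". Under these assumptions, every transaction for a given LOCATE value should be folded into the first matching master observation, and the other duplicates in MASTER stay untouched.

```sas
/* Rebuild the two data sets from the thread so the sketch runs on its own */
data master (index=(locate));
   input locate $ code @@;
   cards;
a 200 a 201 b 100 a 202 a 203 b 101 c 600 d 700 d 701
;
run;

data keyvals;
   input locate $ newcode @@;
   cards;
b 1 a 2 a 3 a 4 b 11 a 12 c 6 d 16 d 7
;
run;

data master;
   set keyvals;
   /* UNIQUE: always start the search at the top of the index */
   modify master key=locate / unique;
   select (_iorc_);
      when (%sysrc(_sok)) do;
         /* Match found: accumulate rather than overwrite (illustrative choice) */
         code = code + newcode;
         replace;
      end;
      when (%sysrc(_dsenom)) do;
         /* No matching LOCATE value in MASTER: skip it and clear the error flag */
         _error_ = 0;
      end;
      otherwise do;
         put 'Unexpected I/O condition: ' _iorc_=;
         stop;
      end;
   end;
run;

/* Inspect which master observations were changed */
proc print data=master;
run;
```

Comparing this PROC PRINT output with the output of the original step (without UNIQUE and with plain `code=newcode`) is a quick way to see how consecutive duplicates in KEYVALS are matched against duplicates in MASTER.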
