Original poster: rockfido

Question about SAS data processing

1 (original post)
rockfido, posted 2010-2-17 10:03:05
20 forum coins
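The data look roughly like this:

Name   Test1 ..... Test1000
Billy   1              2
.....
....

Every Test variable takes only three values: 1, 2, 3.

Is there a simple way to compute, for each column (TEST1, TEST2, ..., TEST1000), the percentage of 1s, the percentage of 2s, and the percentage of 3s, and also store the result in the last row of this data set?

Also, the data contain missing values, and missing values must be excluded when computing the percentages.

Thank you very much!!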


Recommended reply

rockfido, post 14

I totally agree with you. Actually, either way works for my project. I kept thinking about adding 3 lines without getting anywhere, so I'm just curious how to do it. And you're right, adding 3 lines doesn't make much sense; it would only be a small convenience for my project. Thank you so much! 13# jingju11


2
jingju11, posted 2010-2-17 10:03:06
*generating a test data set;
data a;
   array test{1000};
   do j = 1 to 2000;
      do i = 1 to dim(test);
         r = ceil(ranuni(0)*12);
         r2 = ceil(ranuni(0)*11);
         test{i} = 1 + mod(r, 3);
         if r2 = 11 then test{i} = .;
      end;
      output;
   end;
   drop r:;
run;

data b;
   set a end = Eof;
   array test{1000}; *load vars;
   array n_{1000}; *count # of non-missing for each var;
   array n1_{1000}; *count # of 1s for each var;
   array n2_{1000}; *count # of 2s for each var;
   array n3_{1000}; *count # of 3s for each var;
   array p1_{1000}; *percentage of 1s for each var;
   array p2_{1000}; *percentage of 2s for each var;
   array p3_{1000}; *percentage of 3s for each var;
   do i = 1 to dim(test);
      if ^missing(test{i}) then
         do;
            n_{i} + 1;
            n1_{i} + (test{i} = 1);
            n2_{i} + (test{i} = 2);
            n3_{i} + (test{i} = 3);
         end;
   end;
   if Eof then
      do i = 1 to dim(test);
         p1_{i} = n1_{i} / n_{i};
         p2_{i} = n2_{i} / n_{i};
         p3_{i} = n3_{i} / n_{i};
      end;
   drop i j;
run;
If I understand the question correctly, each of those percentages should converge to about 0.33, because the test data above consist of randomly generated 1s, 2s, 3s, and missing values.
Something is wrong with the website: many of the []'s were dropped...??
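One way to cross-check a few columns of the result is PROC FREQ, which by default leaves missing values out of the percentage calculation and reports them separately as "Frequency Missing". The snippet below is only a sketch for spot-checking; it assumes the test data set `a` generated above and looks at the first three columns. Note that PROC FREQ prints percentages on a 0 to 100 scale, whereas `p1_`, `p2_`, and `p3_` above are proportions between 0 and 1.

```sas
/* Spot check, assuming data set a from the code above. */
/* Missing values are excluded from the Percent column  */
/* by default and listed as "Frequency Missing".        */
proc freq data = a;
   tables test1 test2 test3;
run;
```

For each listed variable, the Percent values for 1, 2, and 3 should be close to 33 with this test data.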

3
markai, posted 2010-2-19 19:52:45
AMAZING!!!!

4
醉_清风, posted 2010-2-19 21:46:58
Came in to learn.

5
rockfido, posted 2010-2-20 11:31:46
I really hope to reach your level of SAS someday... thank you so much!!!


2# jingju11

6
lyfyb99, posted 2010-2-20 17:29:57
Learned something, thanks!

7
rockfido, posted 2010-3-2 03:12:31
I have a question: it seems that I can't use n_ on its own in the DO loop and have to change it to n_{i}. Is there any trick to it?

Also, what would the final data set look like? Like the following?

test1 ... test1000  n_1 ... n_1000  n1_1 ... n1_1000 ...


And does each one have 2000 obs?

Thanks a lot

2# jingju11

8
jingju11, posted 2010-3-2 04:20:02
Nothing tricky here. I originally used square brackets, but the site has a problem displaying the [] sign.
Yes, it should be like that.
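To illustrate the point about subscripts: in a DATA step, an array element can be referenced with braces, square brackets, or parentheses, and all three forms mean the same thing. The array name alone, without a subscript, is not a valid reference in an expression. The snippet below is a small sketch with made-up values; the array `n_` here is a three-element toy, not the 1000-element array from post 2.

```sas
data _null_;
   array n_{3} (10 20 30);   /* toy array: creates n_1-n_3 with initial values */
   do i = 1 to dim(n_);
      a = n_{i};             /* braces          */
      b = n_[i];             /* square brackets */
      c = n_(i);             /* parentheses     */
      put i= a= b= c=;       /* a, b, and c are identical on every pass */
   end;
run;
```

Braces or brackets are usually preferred over parentheses because they cannot be confused with a function call.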

9
zespri, posted 2010-3-2 06:22:51
Great programming skill, fantastic

10
rockfido, posted 2010-3-2 07:25:42
So in that case, the first 1999 obs are not valid and only the last line is needed? Because I guess that for n_, the values over the 2000 obs would run 1, 2, 3, ... up to 2000?

Do you have any idea how I could add p1_, p2_, p3_ as the last three observations under each variable (instead of adding them as 3 new variables)?

Thanks a lot!

8# jingju11
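One possible way to append the three summary rows asked about here is sketched below. It is not taken from the thread. It assumes the test data set `a` from post 2, and the names `c`, `cnt`, `row_type`, and `k` are hypothetical. The counts live in a temporary array, so no extra variables are added; on the last observation the step writes three more rows in which each test variable holds the proportion of 1s, 2s, and 3s among its non-missing values.

```sas
data c;
   set a end = Eof;
   array test{1000};
   /* cnt{i,0} = non-missing count, cnt{i,k} = count of value k (k = 1, 2, 3). */
   /* Temporary arrays are retained across observations and not written out.   */
   array cnt{1000, 0:3} _temporary_ (4000*0);
   length row_type $4;
   row_type = 'data';                        /* flags the original rows */
   do i = 1 to dim(test);
      if test{i} in (1, 2, 3) then do;       /* missing values are skipped */
         cnt{i, 0} = cnt{i, 0} + 1;
         cnt{i, test{i}} = cnt{i, test{i}} + 1;
      end;
   end;
   output;                                   /* write the original row */
   if Eof then
      do k = 1 to 3;                         /* one extra row per value */
         row_type = cats('pct', k);          /* pct1, pct2, pct3 */
         do i = 1 to dim(test);
            if cnt{i, 0} > 0 then test{i} = cnt{i, k} / cnt{i, 0};
            else test{i} = .;                /* column was entirely missing */
         end;
         output;
      end;
   drop i j k;
run;
```

Any other variables in the data set, such as Name, keep the values of the last original observation in the three added rows, so a flag like `row_type` helps tell summary rows from data rows.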
