[Original post] 20 forum coins reward, urgent help needed! Probable disk full condition.

Reply #11

soporaeternus2 posted on 2010-1-24 22:16:02

A FAT32-formatted disk cannot hold a single file of 4 GB or more; I had forgotten about that......
I assume the original poster's data file is not on the C: drive.

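Reply #12

redaring posted on 2010-1-25 01:19:12

Thanks to everyone above!
After many attempts, it turned out that the disk format really was the problem. My C: drive was FAT32; after I converted it to NTFS, the job ran.
Many people may never have handled data this large, so I hope my problem helps others later on and spares them the trouble I went through.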

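Reply #13

bobguy posted on 2010-1-25 02:56:50

redaring posted on 2010-1-25 01:19
Thanks to everyone above!
After many attempts, it turned out that the disk format really was the problem. My C: drive was FAT32; after I converted it to NTFS, the job ran.
Many people may never have handled data this large, so I hope my problem helps others later on and spares them the trouble I went through.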

I am glad that you solved the problem. Let me suggest some other approaches. We don't have to hang ourselves on one tree; SAS offers many other options. Here are a couple:

1) Index: building an index is 'cheaper' than a sort.
2) Extract only the keys from the big file, together with the observation number (_N_), and save them as a keys-only file. This file will be much smaller. Sort the smaller file by key + _N_. Then create the new file by reading the smaller file in its sorted order and using point access into the bigger file.
According to the log below, the second approach is faster than a plain sort by a factor of 2+ in real time (about 28 seconds versus 1:11.86). That is a surprise. Your results may vary.

284  data t1;
285 retain x1-x2000 '222222222222';
286 do i=1 to 10000 ;
287    key=ceil(ranuni(99)*10000);
288    a= ranuni(99); b=ranuni(99);
289    output;
290 end;
291 drop i;
292  run;

NOTE: The data set WORK.T1 has 10000 observations and 2003 variables.
NOTE: DATA statement used (Total process time):
   real time          22.89 seconds
   cpu time          1.29 seconds

293
294  data tmp/view=tmp;
295    set t1(keep=key);
296    original_ord=_n_;
297  run;

NOTE: DATA STEP view saved on file WORK.TMP.
NOTE: A stored DATA STEP view cannot run under a different operating system.
NOTE: DATA statement used (Total process time):
   real time          0.03 seconds
   cpu time          0.00 seconds

298
299  proc sort data=tmp out=tmp_srt; by key original_ord; run;

NOTE: There were 10000 observations read from the data set WORK.TMP.
NOTE: View WORK.TMP.VIEW used (Total process time):
   real time          0.17 seconds
   cpu time          0.17 seconds

NOTE: There were 10000 observations read from the data set WORK.T1.
NOTE: The data set WORK.TMP_SRT has 10000 observations and 2 variables.
NOTE: PROCEDURE SORT used (Total process time):
   real time          0.21 seconds
   cpu time          0.21 seconds

300
301  data t2;
302 set  tmp_srt;
303 set t1 point=original_ord;
304  run;

NOTE: The variable original_ord exists on an input data set, but was also specified in an I/O
   statement option.  The variable will not be included on any output data set.
NOTE: There were 10000 observations read from the data set WORK.TMP_SRT.
NOTE: The data set WORK.T2 has 10000 observations and 2003 variables.
NOTE: DATA statement used (Total process time):
   real time          27.89 seconds
   cpu time          1.82 seconds

305
306
307  proc sort data=t1 out=t3; by key;run;

NOTE: There were 10000 observations read from the data set WORK.T1.
NOTE: The data set WORK.T3 has 10000 observations and 2003 variables.
NOTE: PROCEDURE SORT used (Total process time):
   real time          1:11.86
   cpu time          3.95 seconds
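
The log above only covers option 2. For option 1 (the index), here is a minimal sketch, assuming the same WORK.T1 data set created in the log. The output data set name T4 is made up for illustration, and I have not timed this, so whether it beats either sort on your data is something to measure rather than assume.

```sas
/* Build a simple index on KEY.                                    */
/* Only the index file is written; T1 itself is not rewritten.     */
proc datasets library=work nolist;
  modify t1;
  index create key;
quit;

/* With the index in place, BY-group processing can read T1 in     */
/* KEY order without a prior PROC SORT.                            */
/* T4 is a hypothetical output name used only for this sketch.     */
data t4;
  set t1;
  by key;
run;
```

Reading through an index is random access, much like the POINT= step in option 2, so it trades the sort's temporary disk space for more scattered I/O. On a disk that is nearly full, that trade may be worth it.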