sum() total() pc() rowtotal()

The example below comes from the Stata 11 manual. It shows that sum() and total() are different functions: sum() accumulates a running sum down the observations, whereas total() returns the overall sum. Note that the comparison here is between Stata's sum() (used with generate) and egen's total(). When sum() is called through egen, it behaves the same as total().

The manual puts it this way:

    Distinguish carefully between Stata's sum() function and egen's total() function. Stata's sum() function creates the running sum, whereas egen's total() function creates a constant equal to the overall sum.

For example:

    clear
    set obs 5
    gen a=_n
    gen sum1=sum(a)
    egen sum2=total(a)
    list

Running it gives:

    . do "C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\STD00000000.tmp"
    . clear
    . set obs 5
    obs was 0, now 5
    . gen a=_n
    . gen sum1=sum(a)
    . egen sum2=total(a)
    end of do-file
    . list

         +-----------------+
         | a   sum1   sum2 |
         |-----------------|
      1. | 1      1     15 |
      2. | 2      3     15 |
      3. | 3      6     15 |
      4. | 4     10     15 |
      5. | 5     15     15 |
         +-----------------+

Now modify the example slightly, changing gen to egen in the fourth line of code:

    clear
    set obs 5
    gen a=_n
    egen sum1=sum(a)
    egen sum2=total(a)
    list

Running it gives:

    . clear
    . set obs 5
    obs was 0, now 5
    . gen a=_n
    . egen sum1=sum(a)
    . egen sum2=total(a)
    end of do-file
    . list

         +-----------------+
         | a   sum1   sum2 |
         |-----------------|
      1. | 1     15     15 |
      2. | 2     15     15 |
      3. | 3     15     15 |
      4. | 4     15     15 |
      5. | 5     15     15 |
         +-----------------+

Applying sum() and total():

Question: how can Stata compute each company's share of sales within its own industry for the four groups below? For industry 1, for example, first sum the sales of the four companies in the industry, then compute the share of A1, A2, ... in that industry total.

    Company   Sales   Industry
    A1        27.72   1
    A2        26.37   1
    A3        24.79   1
    A4        18.69   1
    B1        17.48   2
    B2        17.04   2
    B3        10.87   2
    B4         6.68   2
    C1         9.06   3
    C2         6.8    3
    C3         8.85   3
    C4         9.43   3
    D1        11.48   4
    D2        13.96   4
    D3        14.19   4
    D4        17.93   4

egen's total() function (or egen's sum(), which gives the same result) handles this. To avoid problems with Chinese variable names, the industry variable is named industry. The code is:

    by industry, sort: egen sale_s=total(Sales)
    gen ratio=Sales/sale_s

Stata also has a dedicated function for computing shares, pc(), which is likewise one of egen's functions. The improved command is:

    by industry, sort: egen ratio=pc(Sales), prop

About pc():

    pc(exp) (allows by varlist:) returns exp (within varlist) scaled to be a percentage of the total, between 0 and 100. The prop option returns exp scaled to be a proportion of the total, between 0 and 1.
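As a rough check of the pc() description quoted above, the sketch below compares pc() with and without prop against a share built by hand from total(). The toy data and the names firm, sales, pct, share, ind_total, and share_manual are made up for illustration and are not from the original example.

```stata
* Hypothetical toy data: two industries with two firms each (made-up values)
clear
input str2 firm sales industry
"A1" 30 1
"A2" 10 1
"B1"  5 2
"B2" 15 2
end

* Percentage of the industry total, between 0 and 100
by industry, sort: egen pct = pc(sales)

* Proportion of the industry total, between 0 and 1
by industry: egen share = pc(sales), prop

* Manual equivalent: industry total from total(), then divide
by industry: egen ind_total = total(sales)
generate share_manual = sales / ind_total

list, sepby(industry)
```

If pc() works as described, share and share_manual should agree in every row, and pct should be 100 times share.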
The passage quoted above is the Stata 11 description of pc() under egen. According to it, pc() returns each value of a variable as a share of the total, and adding the prop option changes the result from a percentage to a proportion between 0 and 1. The function can also be combined with by, which makes it considerably more useful.

About rowtotal():

    generate's sum() function creates the vertical, running sum of its argument, whereas egen's total() function creates a constant equal to the overall sum. egen's rowtotal() function, however, creates the horizontal sum of its arguments. They all treat missing as zero. However, if the missing option is specified with total() or rowtotal(), then newvar will contain missing values if all values of exp or varlist are missing.

In other words, sum() accumulates vertically down the observations, total() returns the final overall sum, and rowtotal() sums horizontally across variables within each observation. All three functions treat missing values as zero.

Example:

    . webuse egenxmpl4,clear
    . egen hsum=rowtotal(a b c)
    . generate vsum=sum(hsum)
    . egen sum=total(hsum)
    . list

         +----------------------------------+
         |  a    b    c   hsum   vsum   sum |
         |----------------------------------|
      1. |  .    2    3      5      5    63 |
      2. |  4    .    6     10     15    63 |
      3. |  7    8    .     15     30    63 |
      4. | 10   11   12     33     63    63 |
         +----------------------------------+
    end of do-file

This example shows clearly how sum(), total(), and rowtotal() differ.
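The manual passage on rowtotal() also mentions a missing option for total() and rowtotal(). The sketch below illustrates its effect on made-up data. The variables x, y, z and the new variable names are hypothetical and are not part of the egenxmpl4 example.

```stata
* Hypothetical toy data: row 2 is missing in both x and y
clear
input x y
1 2
. .
3 .
end

* z is missing in every observation
generate z = .

* Default: missing values count as zero, so row 2 gets 0
egen row_default = rowtotal(x y)

* With the missing option, a row whose arguments are all missing stays missing
egen row_missing = rowtotal(x y), missing

* Same contrast for total() on the all-missing variable z
egen tot_default = total(z)
egen tot_missing = total(z), missing

list
```

Row 3 is missing in only one of the two arguments, so it should still get a sum of 3 under both variants. The option only matters when every argument is missing.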