Original poster: t818

How to drop duplicated values before running further tests

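A question: in an event study, after generating the cumulative abnormal return (CAR) with the commands below, every row of a given id holds the same value (the CAR is repeated on each row). How can I keep just one CAR per id so that I can run a t-test and similar tests? Thanks!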
gen abnormal_return=ret-predicted_return if event_window==1
by id: egen car = sum(abnormal_return)  if dif>=-365 & dif<0

Reply 1
t818, posted 2010-7-23 03:06:39
Is there a way to have egen output the cumulative abnormal return only once per id? Thanks!

Reply 2
houquan, posted 2010-7-23 08:53:52
help collapse                                                                                                                           dialog:  collapse
---------------------------------------------------------------------------------------------------------------------------------------------------------

Title

    [D] collapse -- Make dataset of summary statistics


Syntax

        collapse clist [if] [in] [weight] [, options]

    where clist is either

        [(stat)] varlist [ [(stat)] ... ]

        [(stat)] target_var=varname [target_var=varname ...] [ [(stat)] ...]

    or any combination of the varlist or target_var forms, and stat is one of

        mean         means (default)
        median       medians
        p1           1st percentile
        p2           2nd percentile
        ...          3rd-49th percentiles
        p50          50th percentile (same as median)
        ...          51st-97th percentiles
        p98          98th percentile
        p99          99th percentile
        sd           standard deviations
        semean       standard error of the mean (sd/sqrt(n))
        sebinomial   standard error of the mean, binomial (sqrt(p(1-p)/n))
        sepoisson    standard error of the mean, Poisson (sqrt(mean))
        sum          sums
        rawsum       sums, ignoring optionally specified weight
        count        number of nonmissing observations
        max          maximums
        min          minimums
        iqr          interquartile range
        first        first value
        last         last value
        firstnm      first nonmissing value
        lastnm       last nonmissing value

    If stat is not specified, mean is assumed.

    options          description
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    Options
      by(varlist)    groups over which stat is to be calculated
      cw             casewise deletion instead of all possible observations

    + fast           do not restore the original dataset should the user press Break; programmer's command
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    + fast is not shown in the dialog box.
    varlist and varname in clist may contain time-series operators; see tsvarlist.
    aweights, fweights, iweights, and pweights are allowed; see weight, and see Weights below.  pweights may not be used with sd, semean, sebinomial,
      or sepoisson.  iweights may not be used with semean, sebinomial, or sepoisson.  aweights may not be used with sebinomial or sepoisson.


Menu

    Data > Create or change data > Other variable-transformation commands > Make dataset of means, medians, etc.


Description

    collapse converts the dataset in memory into a dataset of means, sums, medians, etc.  clist must refer to numeric variables exclusively.

    Note: See [D] contract if you want to collapse to a dataset of frequencies.


Options

        +---------+
    ----+ Options +------------------------------------------------------------------------------------------------------------------------------------

    by(varlist) specifies the groups over which the means, etc., are to be calculated.  If this option is not specified, the resulting dataset will
        contain 1 observation.  If it is specified, varlist may refer to either string or numeric variables.

    cw specifies casewise deletion.  If cw is not specified, all possible observations are used for each calculated statistic.

    The following option is available with collapse but is not shown in the dialog box:

    fast specifies that collapse not restore the original dataset should the user press Break.  fast is intended for use by programmers.


Weights

    collapse allows all four weight types; the default is aweights.  Weight normalization impacts only the sum, count, sd, semean, and sebinomial
    statistics.

    Here are the definitions for count and sum with weights:

     count:                           
        unweighted                    _N, the number of physical observations
        aweight:                      _N, the number of physical observations
        fweight, iweight, pweight:    sum(w_j), the sum of user-specified weights
     sum:                             
        unweighted                    sum(x_j), the sum of the variable
        aweight:                      sum(v_j*x_j); v_j = weights normalized to sum to _N
        fweight, iweight, pweight:    sum(w_j*x_j); w_j = user supplied weights.

    The sd statistic with weights returns the bias-corrected standard deviation, which is based on the factor sqrt(N/(N-1)), where N is the number of
    observations. Statistics sd, semean, sebinomial, and sepoisson are not allowed with pweighted data.  Otherwise, the statistic is changed by the
    weights through the computation of the count (N), as outlined above.

    For instance, consider a case in which there are 25 physical observations in the dataset and a weighting variable that sums to 57.  In the
    unweighted case, the weight is not specified, and N = 25.  In the analytically weighted case, N is still 25; the scale of the weight is irrelevant.
    In the frequency-weighted case, however, N = 57, the sum of the weights.

    The rawsum statistic with aweights ignores the weight, with one exception:  observations with zero weight will not be included in the sum.


Examples

    -----------------------------------------------------------------------------------------------------------------------------------------------------
    Setup
        . webuse college
        . describe
        . list

    Create dataset containing the 25th percentile of gpa for each year
        . collapse (p25) gpa [fw=number], by(year)

    List the result
        . list

    -----------------------------------------------------------------------------------------------------------------------------------------------------
    Setup
        . webuse college, clear

    Create dataset containing the mean and median of gpa and hour for each year, and store median of gpa and hour in medgpa and medhour, respectively
        . collapse (mean) gpa hour (median) medgpa=gpa medhour=hour [fw=number], by(year)

    List the result
        . list

    -----------------------------------------------------------------------------------------------------------------------------------------------------
    Setup
        . webuse college, clear

    Create dataset containing the count of gpa and hour and the minimums of gpa and hour, and store the minimums in mingpa and minhour, respectively
        . collapse (count) gpa hour (min) mingpa=gpa minhour=hour [fw=number], by(year)

    List the result
        . list

    -----------------------------------------------------------------------------------------------------------------------------------------------------
    Setup
        . webuse college, clear
        . replace gpa = . in 2/4

    Create dataset containing the mean of gpa and hour for each year, but ignore all observations that have missing values when calculating the means
        . collapse (mean) gpa hour [fw=number], by(year) cw

    List the result
        . list
    -----------------------------------------------------------------------------------------------------------------------------------------------------


Also see

    Manual:  [D] collapse

      Help:  [D] contract, [D] egen, [D] statsby, [R] summarize
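
Applied to the question in this thread, the help text above suggests reducing the data to one row per id. A minimal sketch under stated assumptions: the variable names id, dif and abnormal_return come from the question, while the toy data, the seed and the preserve/restore wrapper are made up for illustration only.

```stata
* Toy panel: 3 firms with 5 pre-event days each (illustrative data only)
clear
set seed 12345
set obs 15
gen id  = ceil(_n/5)
bysort id: gen dif = _n - 6          // event-relative day: -5 ... -1
gen abnormal_return = rnormal(0, 0.02)

* One row per id: CAR = sum of abnormal returns inside the window
preserve
collapse (sum) car = abnormal_return if dif >= -365 & dif < 0, by(id)
list
ttest car == 0                       // H0: mean CAR is zero
restore                              // bring back the daily data
```

Because collapse replaces the dataset in memory, wrapping it in preserve and restore keeps the daily rows available for other windows afterwards.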

Reply 3
ajun685, posted 2010-7-23 11:00:30
by id, sort: drop if car == car[_n+1]
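
One caveat with comparing against car[_n+1]: the result depends on the row order within each id, and two missing values also compare as equal, so rows outside the window can be affected as well. A more explicit variant could look like the sketch below. It assumes car is constant within id wherever it is non-missing, as it is after the egen call in the question. The toy data are made up for illustration, and total() is the documented name of the egen function that the question calls sum().

```stata
* Toy panel: 3 firms, 4 days each (illustrative data only)
clear
set obs 12
gen id  = ceil(_n/4)
bysort id: gen dif = _n - 3          // event-relative day: -2, -1, 0, 1
gen abnormal_return = 0.01 * id
bysort id: egen car = total(abnormal_return) if dif >= -365 & dif < 0

* Keep exactly one non-missing CAR per id, whatever the row order
keep if !missing(car)
bysort id: keep if _n == 1
list id car
ttest car == 0
```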

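Reply 4
t818, posted 2010-7-23 13:37:32
Thanks!
I found that the following avoids dropping any observations, which makes it easier to go on and compute returns for other window lengths: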
by car,sort: replace car =. if car== car[_n+1]
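
Note that this command groups by the value of car rather than by id, so if two ids happened to share exactly the same CAR, only one of them would survive. A possible alternative that also keeps every row is to flag one observation per id with egen's tag() function and restrict the test to the flagged rows. A minimal sketch, assuming hypothetical window bounds and the made-up names car_pre, car_post, pick_pre and pick_post:

```stata
* Toy panel: 3 firms, 4 days each (illustrative data only)
clear
set obs 12
gen id  = ceil(_n/4)
bysort id: gen dif = _n - 3          // event-relative day: -2, -1, 0, 1
gen abnormal_return = 0.01 * id

* CARs for two illustrative windows; no rows are dropped
bysort id: egen car_pre  = total(abnormal_return) if dif >= -2 & dif < 0
bysort id: egen car_post = total(abnormal_return) if dif >= 0 & dif <= 1

* Flag one non-missing row per id for each window
egen pick_pre  = tag(id) if !missing(car_pre)
egen pick_post = tag(id) if !missing(car_post)

* Each test uses one observation per id
ttest car_pre  == 0 if pick_pre  == 1
ttest car_post == 0 if pick_post == 1
```

Since the CAR variables are left intact on every row, further windows can be added the same way without reloading the data.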

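Reply 5
dandan36956, posted 2011-6-2 20:44:44
Hello. I also need to use the event study method soon, but I have not found any systematic literature or books on it, and this is my first contact with the method. Could you give me some pointers, for example by recommending a few articles or other materials? Many thanks!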
