Original poster: 南方郎

[Other] How to merge panel data on two variables, stock code and year

Original post
南方郎 posted on 2012-3-7 09:46:43

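How do I merge two panel datasets on two variables, the stock code and the year?

For example, dataset 1:
stock   year    a
1       03      2
1       04      4

and dataset 2:
stock   year    b
1       03      4
1       04      6

merged into:
stock   year    a    b
1       03      2    4
1       04      4    6

I urgently need help. Thanks.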

2
南方郎 posted on 2012-3-7 10:12:03
  merge m:1 varlist using filename [, options]

What does the m mean here?

3
sungmoo posted on 2012-3-7 10:27:06
m stands for "many": m:1 is a many-to-one merge.

4
蓝色 posted on 2012-3-7 11:57:13
Here is the help file for merge; the English should be understandable.
Title

    [D] merge -- Merge datasets


Syntax

    One-to-one merge on specified key variables

        merge 1:1 varlist using filename [, options]


    Many-to-one merge on specified key variables

        merge m:1 varlist using filename [, options]


    One-to-many merge on specified key variables

        merge 1:m varlist using filename [, options]


    Many-to-many merge on specified key variables

        merge m:m varlist using filename [, options]


    One-to-one merge by observation

        merge 1:1 _n using filename [, options]


5
南方郎 posted on 2012-3-7 12:54:23
What do 1:1, 1:m, and m:m mean? I don't quite understand them.

6
swufe2012 posted on 2012-3-7 13:08:38
merge stock year using "path to the file" is all you need. See help merge, which explains it.
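
For the two example datasets in the original post, a minimal sketch of this merge might look like the following. It assumes a Stata version that supports the `1:1` syntax quoted from the help file in this thread. The tempfile name `b_data` is made up for illustration, and the years are entered as the numbers 3 and 4 instead of "03" and "04". Run it as a do-file so the temporary file macro stays in scope.

```stata
* Second dataset (stock, year, b): build it and save it to a temporary file.
* b_data is a hypothetical name, not something from the thread.
clear
input stock year b
1 3 4
1 4 6
end
tempfile b_data
save `b_data'

* First dataset (stock, year, a) stays in memory as the master.
clear
input stock year a
1 3 2
1 4 4
end

* stock and year together identify each row in both files, so this is 1:1.
merge 1:1 stock year using `b_data'
list stock year a b _merge
```

With real data, the `using` argument would be the path to the saved .dta file instead of a tempfile.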

7
蓝色 posted on 2012-3-7 13:29:56
Basic description

    Think of merge as being master + using = merged result.

    Call the dataset in memory the master dataset, and the dataset on disk the using dataset.  This way
    we have general names that are not dependent on individual datasets.

    Suppose we have two datasets,

       master in memory       on disk in file filename
          +-------+                  +--------+
          |id  age|                  |id   wgt|        
          |-------|                  |--------|     
          | 1   22|                  | 1   130|        
          | 2   56|                  | 2   180|        
          | 5   17|                  | 4   110|        
          +-------+                  +--------+

    We would like to join together the age and weight information.  We notice that the id variable
    identifies unique observations in both datasets: if you tell me the id number, then I can tell you
    the one observation that contains information about that id.  This is true for both the master and
    the using datasets.

    Because id uniquely identifies observations in both datasets, this is a 1:1 merge. We can bring in
    the dataset from disk by typing

        . merge 1:1 id using filename

        in memory      in filename.dta
         master     +       using      =    merged result
        +-------+        +--------+         +------------+
        |id  age|        |id   wgt|         |id  age  wgt|
        |-------|        |--------|         |------------|
        | 1   22|        | 1   130|         | 1   22  130|  (matched)
        | 2   56|        | 2   180|         | 2   56  180|  (matched)     
        | 5   17|        | 4   110|         | 5   17    .|  (master only)
        +-------+        +--------+         | 4    .  110|  (using only)
                                            +------------+
                                          

    The original data in memory are called the master data.  The data in filename.dta are called the
    using data.  After merge, the merged result is left in memory.  The id variable is called the key
    variable.  Stata jargon is that the datasets were merged on id.

    Observations for id==1 existed in both the master and using datasets and so were combined in the
    merged result.  The same occurred for id==2.  For id==5 and id==4, however, no matches were found
    and thus each became a separate observation in the merged result.  Thus each observation in the
    merged result came from one of three possible sources:

           numeric    equivalent
            code      word           description
           ------------------------------------------------------------
              1       master         originally appeared in master only
              2       using          originally appeared in using only
              3       match          originally appeared in both
           ------------------------------------------------------------

    merge encodes this information into new variable _merge, which merge adds to the merged result:

        in memory        in filename.dta
         master      +       using       =        merged result
        +-------+          +--------+         +--------------------+
        |id  age|          |id   wgt|         |id  age  wgt  _merge|
        |-------|          |--------|         |--------------------|
        | 1   22|          | 1   130|         | 1   22  130       3|
        | 2   56|          | 2   180|         | 2   56  180       3|
        | 5   17|          | 4   110|         | 5   17    .       1|
        +-------+          +--------+         | 4    .  110       2|
                                              +--------------------+

    Note: Above we show the master and using data sorted by id before merging; this was for
    illustrative purposes.  The dataset resulting from a 1:1 merge will have the same data, regardless
    of the sort order of the master and using datasets.

    The formal definition for merge behavior is the following:  Start with the first observation of the
    master.  Find the corresponding observation in the using data, if there is one.  Record the matched
    or unmatched result.  Proceed to the next observation in the master dataset.  When you finish
    working through the master dataset, work through unused observations from the using data.  By
    default, unmatched observations are kept in the merged data, whether they come from the master
    dataset or the using dataset.

    Remember this formal definition.  It will serve you well.
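
The id/age/wgt example above can be reproduced with a short do-file. This is only a sketch: the tempfile name `wgt_data` is hypothetical, and the last two lines show one common way to keep only the matched rows, which the help text itself does not prescribe.

```stata
* Using dataset (id, wgt), saved to a temporary file.
clear
input id wgt
1 130
2 180
4 110
end
tempfile wgt_data
save `wgt_data'

* Master dataset (id, age) in memory.
clear
input id age
1 22
2 56
5 17
end

merge 1:1 id using `wgt_data'

* _merge is 1 (master only), 2 (using only), or 3 (matched).
tabulate _merge

* Optional: keep only the matched observations, then drop the marker.
keep if _merge == 3
drop _merge
```

The options `keep(match) nogenerate` on the merge command should give the same matched-only result in one step.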


1:1 merges

    The example shown above is called a 1:1 merge, because the key variable uniquely identified each
    observation in each of the datasets.

    A variable or variable list uniquely identifies the observations if each distinct value of the
    variable(s) corresponds to one observation in the dataset.

    In some datasets, multiple variables are required to identify the observations.  Imagine data
    obtained by observing patients at specific points in time so that variables pid and time, taken
    together, identify the observations.  Below we have two such datasets and run a 1:1 merge on pid
    and time,

        . merge 1:1 pid time using filename

            master      +       using        =        merged result

        +-------------+     +-------------+     +-------------------------+
        |pid  time  x1|     |pid  time  x2|     |pid  time  x1  x2  _merge|
        |-------------|     |-------------|     |-------------------------|
        | 14     1   0|     | 14     1   7|     | 14     1   0   7       3|
        | 14     2   0|     | 14     2   9|     | 14     2   0   9       3|
        | 14     4   0|     | 16     1   2|     | 14     4   0   .       1|
        | 16     1   1|     | 16     2   3|     | 16     1   1   2       3|
        | 16     2   1|     | 17     1   5|     | 16     2   1   3       3|
        | 17     1   0|     | 17     2   2|     | 17     1   0   5       3|
        +-------------+     +-------------+     | 17     2   .   2       2|
                                                +-------------------------+

    This is a 1:1 merge because the combination of the values of pid and time uniquely identifies
    observations in both datasets.

    By default, there is nothing about a 1:1 merge that implies that all, or even any of, the
    observations match.  Above five observations matched, one observation was only in the master
    (subject 14 at time 4), and another was only in the using (subject 17 at time 2).
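
Before a 1:1 merge on several keys, it can help to confirm that the keys really do identify the observations. The sketch below uses `isid`, which stops with an error if the listed variables do not uniquely identify the rows. The data are the pid/time example above, and the tempfile name `x2_data` is hypothetical.

```stata
* Using dataset (pid, time, x2).
clear
input pid time x2
14 1 7
14 2 9
16 1 2
16 2 3
17 1 5
17 2 2
end
isid pid time          // errors out if pid and time are not a unique key
tempfile x2_data
save `x2_data'

* Master dataset (pid, time, x1).
clear
input pid time x1
14 1 0
14 2 0
14 4 0
16 1 1
16 2 1
17 1 0
end
isid pid time

merge 1:1 pid time using `x2_data'
count if _merge == 3   // number of matched observations
```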


m:1 merges

    In an m:1 merge, the key variable or variables uniquely identify the observations in the using
    data, but not necessarily in the master data.  Suppose you had person-level data within regions and
    you wished to bring in regional data.  Here is an example:

       . merge m:1 region using filename

             master     +     using      =           merged result

       +--------------+    +----------+       +--------------------------+
       |id  region   a|    |region   x|       |id  region   a   x  _merge|
       |--------------|    |----------|       |--------------------------|
       | 1       2  26|    |     1  15|       | 1       2  26  13       3|
       | 2       1  29|    |     2  13|       | 2       1  29  15       3|
       | 3       2  22|    |     3  12|       | 3       2  22  13       3|
       | 4       3  21|    |     4  11|       | 4       3  21  12       3|
       | 5       1  24|    +----------+       | 5       1  24  15       3|
       | 6       5  20|                       | 6       5  20   .       1|
       +--------------+                       | .       4   .  11       2|
                                              +--------------------------+

    To bring in the regional information, we need to merge on region.  The values of region identify
    individual observations in the using data, but it is not an identifier in the master data.

    We show the merged dataset sorted by id because this makes it easier to see how the merged dataset
    was constructed.  For each observation in the master data, merge finds the corresponding
    observation in the using data.  merge combines the values of the variables in the using dataset to
    the observations in the master dataset.
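
As a rough sketch, the person-level and regional example above can be run like this. The tempfile name `region_data` is hypothetical. The commented-out line shows the `keep()` option as one way to drop regions that appear only in the using data.

```stata
* Using dataset: one row per region.
clear
input region x
1 15
2 13
3 12
4 11
end
tempfile region_data
save `region_data'

* Master dataset: several people can share a region.
clear
input id region a
1 2 26
2 1 29
3 2 22
4 3 21
5 1 24
6 5 20
end

* region is unique in the using data but not in the master, hence m:1.
merge m:1 region using `region_data'
sort id
list id region a x _merge

* Variant: discard using-only rows such as region 4.
* merge m:1 region using `region_data', keep(master match)
```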


1:m merges

    1:m merges are similar to m:1, except that now the key variables identify unique observations in
    the master dataset.  Any datasets that can be merged using an m:1 merge may be merged using a 1:m
    merge by reversing the roles of the master and using datasets.  Here is the same example as used
    previously, with the master and using datasets reversed:

       . merge 1:m region using filename

          master     +       using        =         merged result

       +----------+     +--------------+    +--------------------------+
       |region   x|     |id  region   a|    |region   x  id   a  _merge|
       |----------|     |--------------|    |--------------------------|
       |     1  15|     | 1       2  26|    |     1  15   2  29       3|
       |     2  13|     | 2       1  29|    |     1  15   5  24       3|
       |     3  12|     | 3       2  22|    |     2  13   1  26       3|
       |     4  11|     | 4       3  21|    |     2  13   3  22       3|
       +----------+     | 5       1  24|    |     3  12   4  21       3|
                        | 6       5  20|    |     4  11   .   .       1|
                        +--------------+    |     5   .   6  20       2|
                                            +--------------------------+

    This merged result is identical to the merged result in the previous section, except for the sort
    order and the contents of _merge.  This time, we show the merged result sorted by region rather
    than id.  Reversing the roles of the files causes a reversal in the 1s and 2s for _merge: where
    _merge was previously 1, it is now 2, and vice versa.  These exchanged _merge values reflect the
    reversed roles of the master and using data.

    For each observation in the master data, merge found the corresponding observation(s) in the using
    data and then wrote down the matched or unmatched result.  Once the master observations were
    exhausted, merge wrote down any observations from the using data that were never used.


m:m merges

    m:m specifies a many-to-many merge and is a bad idea.  In an m:m merge, observations are matched
    within equal values of the key variable(s), with the first observation being matched to the first;
    the second, to the second; and so on.  If the master and using have an unequal number of
    observations within the group, then the last observation of the shorter group is used repeatedly to
    match with subsequent observations of the longer group.  Thus m:m merges are dependent on the
    current sort order -- something which should never happen.

    Because m:m merges are such a bad idea, we are not going to show you an example.  If you think that
    you need an m:m merge, then you probably need to work with your data so that you can use a 1:m or
    m:1 merge. Tips for this are given in Troubleshooting m:m merges below.
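
One way to start working with the data, sketched below under the assumption that the problem is unexpected repeated keys, is to look for duplicates in the key variable before merging. The data reuse the region example, and `dup` is a hypothetical variable name.

```stata
clear
input id region a
1 2 26
2 1 29
3 2 22
end

* How many observations share each value of the key?
duplicates report region

* Flag rows whose key value is repeated; dup is 0 for unique keys.
duplicates tag region, generate(dup)
list if dup > 0
```

If the key is repeated in only one of the two files, an m:1 or 1:m merge applies. If every pairing within a group is genuinely wanted, `joinby` may be closer to the intent than m:m.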


Sequential merges

    In a sequential merge, there are no key variables.  Observations are matched solely on their
    observation number:

        . merge 1:1 _n using filename

           master   +     using   =       merged result

            +--+          +--+         +----------------+
            |x1|          |x2|         |x1   x2   _merge|
            |--|          |--|         |----------------|
            |10|          | 7|         |10    7        3|
            |30|          | 2|         |30    2        3|
            |20|          | 1|         |20    1        3|
            | 5|          | 9|         | 5    9        3|
            +--+          | 3|         | .    3        2|
                          +--+         +----------------+

    In the example above, the using data are longer than the master, but that could be reversed.  In
    most cases where sequential merges are appropriate, the datasets are expected to be of equal
    length, and you should type

        . merge 1:1 _n using filename, assert(match) nogen

    Sequential merges, like m:m merges, are dangerous. Both depend on the current sort order of the
    data.
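
A small sketch of the equal-length case follows, with a hypothetical tempfile name `x2_data`. Rows are paired purely by position, so both files must already be in the intended order before the merge.

```stata
* Using dataset: four rows of x2.
clear
input x2
7
2
1
9
end
tempfile x2_data
save `x2_data'

* Master dataset: four rows of x1.
clear
input x1
10
30
20
5
end

* assert(match) stops with an error unless every row is matched;
* nogenerate suppresses the _merge variable.
merge 1:1 _n using `x2_data', assert(match) nogenerate
list
```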


8
蓝色 posted on 2012-3-7 13:31:39
Can't you follow even the examples in the help file?

9
蓝色 posted on 2012-3-7 13:33:21
The manual should have more detail.

10
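南方郎 posted on 2012-3-7 14:28:06
Quoting 蓝色 (posted on 2012-3-7 13:33):
The manual should have more detail.
OK, thank you very much. My computer had a problem just now and did not display the detailed description above. Thanks again.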
