Original poster: carol119

[Other] Help: what do the values of the automatically generated _merge variable mean?

    insheet using "1.txt", tab names
    sort v1
    save "1.dta", replace
    clear
    insheet using "2.txt", tab names
    sort v1
    merge v1 using "1.dta"

After running these commands, a new variable _merge appears whose values are all numbers such as 1, 2, and 3. What do these numbers mean?

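2
蓝色 posted on 2009-3-28 20:07:00

You need to learn to read the help files. Stata's help is very good and explains all of this.

You need to learn to solve problems on your own. Below is Stata's help for merge.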


help merge                                                                                  dialogs:  merge        
                                                                                                      merge multiple
--------------------------------------------------------------------------------------------------------------------

Title

    [D] merge -- Merge datasets


Syntax

        merge [varlist] using filename [filename ...] [, options]

    options             description
    --------------------------------------------------------------------------------------------------------------
    Options
      keep(varlist)     keep only the specified variables from data in filename
      _merge(newvar)    newvar marks source of resulting observation; default is _merge
      nolabel           do not copy value label definitions from filename
      nonotes           do not copy notes from filename
      update            replace missing data in memory with data from filename
      replace           replace nonmissing data in memory with data from filename
      nokeep            drop observations in using dataset that do not match
      nosummary         drop summary variables when multiple filenames are specified
    * unique            match variables uniquely identify observations in both data in memory and in filename
    * uniqmaster        match variables uniquely identify observations in memory
    * uniqusing         match variables uniquely identify observations in filename
    * sort              sort master and using datasets by match variables before merge; sort implies unique if
                          uniqmaster or uniqusing is not specified
    --------------------------------------------------------------------------------------------------------------
    * unique, uniqmaster, uniqusing, and sort require varlist (the match variables) be specified.


Description

    merge joins corresponding observations from the dataset currently in memory (called the master dataset) with
    those from Stata-format datasets stored as filename (called the using datasets) into single observations.  If
    filename is specified without an extension, .dta is assumed.

    merge can perform both one-to-one and match merges.


Options

        +---------+
    ----+ Options +-----------------------------------------------------------------------------------------------

    keep(varlist) specifies the variables to be kept from the using data.  If keep() is not specified, all
        variables are kept.

        The varlist in keep(varlist) differs from standard Stata varlists in two ways: variable names in varlist
        may not be abbreviated, except by the use of wildcard characters; and you may not refer to a range of
        variables, such as price-weight.

    _merge(newvar) specifies the name of the variable to be created that will mark the source of the resulting
        observation.  The default is _merge(_merge); that is, if you do not specify this option, the new variable
        will be named _merge.

    nolabel prevents Stata from copying the value label definitions from the using dataset into the result.  Even
        if you do not specify this option, label definitions from the using dataset do not replace label
        definitions in the master dataset.

    nonotes prevents notes in the using data from being incorporated into the result.  The default is to
        incorporate notes from the using data that do not already appear in the master dataset.

    update specifies that the values from the using dataset be retained in cases where the master dataset contains
        missing.  By default, the master dataset is held inviolate -- values from the master dataset are retained
        when the variables are found in both datasets.

    replace, allowed with update only, specifies that even when the master dataset contains nonmissing values,
        they are to be replaced with corresponding values from the using dataset when the corresponding values are
        not equal.  A nonmissing value, however, will never be replaced with a missing value.

    nokeep causes merge to ignore observations in the using dataset that have no corresponding observation in the
        master.  The default is to add these observations to the merged result and mark such observations with
        _merge==2.

    nosummary causes merge to drop the summary variables created when multiple using datasets are specified. The
        default is to create _merge1 recording results from merging the first disk dataset, _merge2 recording
        results from merging the second disk dataset, and so on.  _merge1, _merge2, ..., contain 1 if an
        observation was found in the respective disk dataset and 0 otherwise.

        Whether or not nosummary is specified, overall status variable _merge is created.

    unique, uniqmaster, and uniqusing specify that the match variables in a match-merge uniquely identify the
        observations.  Match variables are required with unique, uniqmaster, and uniqusing.

        unique specifies that the match variables uniquely identify the observations in the master dataset and in
        the using dataset.  For most match-merges, you should specify unique.  merge does nothing differently when
        you specify the option, unless the assumption you are making is false, in which case an error message is
        issued and the data are not merged.

        uniqmaster specifies that the match variables uniquely identify the observations in memory, the master
        data, but not necessarily the ones in the using dataset.

        uniqusing specifies that the match variables uniquely identify the observations in the using dataset, but
        not necessarily the ones in the master dataset.

        unique is thus equivalent to specifying uniqmaster and uniqusing.

        Things are more complicated when multiple using datasets are specified.  unique still means unique in all
        datasets, and uniqusing still means unique in each of the using datasets, just as you would expect, but
        uniqmaster takes on a whole new meaning:  uniqmaster means unique in the master and in all using datasets
        except the last!  It asserts that the match variables uniquely identify observations in the master at each
        step, meaning that when the master is merged with the first using dataset, then when the (new) master
        (equal to original plus first using) is merged with the second using dataset, and so on.  In summary,
        uniqmaster is simply not useful when multiple using datasets are specified.

        If none of the three unique options are specified, observations in neither the master nor the using
        dataset are required to be unique, although they could be.  If they are not unique, records that have the
        same values of the match variables are joined by observation until all the records on one side or the
        other are matched; after that, the final record on the shorter side is duplicated over and over again to
        match with the remaining records needing to be matched on the longer side.

    sort specifies that the master and using datasets be sorted by the match variables, before the datasets are
        merged, if they are not already sorted by them.  Match variables are required with sort.  sort implies
        unique if uniqmaster or uniqusing is not specified.


Remarks

    merge can perform both one-to-one and match merges.  In either case, the variable _merge (or the variable
    specified in _merge() if provided) is added to the data containing

                           _merge==1    obs. from master data                           
                           _merge==2    obs. from only one using dataset                
                           _merge==3    obs. from at least two datasets, master or using

    update can be used only when there is one using file.  When update is specified, the codes for _merge are

                           _merge==1    obs. from master data                           
                           _merge==2    obs. from using data                            
                           _merge==3    obs. from both, master agrees with using        
                           _merge==4    obs. from both, missing in master updated       
                           _merge==5    obs. from both, master disagrees with using
     

    When multiple using files are specified, a set of summary variables is created, as long as nosummary is not
    used.  These summary variables are named _merge1 (related to the first using dataset), _merge2 (related to the
    second using dataset), etc. (or, once again, the variable specified in _merge() if provided, followed by the
    number of the using file).  These variables will contain

                           _mergek==0   obs. not present in corresponding using dataset 
                           _mergek==1   obs. present in corresponding using dataset     

    Variable labels identifying the dataset associated with each summary variable are attached to these summary
    variables.


Examples

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse odd
        . list
        . webuse even1
        . list

    Perform one-to-one merge
        . merge using http://www.stata-press.com/data/r10/odd
        . list

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse even1, clear

    Perform match-merge
        . merge number using http://www.stata-press.com/data/r10/odd, sort
        . list
        . sort number
        . list

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse autotech, clear
        . describe
        . describe using http://www.stata-press.com/data/r10/autocost

    Perform match-merge
        . merge make using http://www.stata-press.com/data/r10/autocost
        . tabulate _merge

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse dollars, clear
        . list
        . webuse sforce
        . list

    Perform match-merge with spreading
        . merge region using http://www.stata-press.com/data/r10/dollars

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse odd3, clear
        . list
        . webuse letter
        . list
        . webuse even
        . list

    Perform match-merge with multiple datasets
        . merge number using http://www.stata-press.com/data/r10/odd3 http://www.stata-press.com/data/r10/letter

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse original, clear
        . list
        . webuse updates
        . list
        . webuse original

    Update data with match-merge
        . merge make using http://www.stata-press.com/data/r10/updates, update
        . list
    ----------------------------------------------------------------------------------------------------------------


Also see

    Manual:  [D] merge

    Online:  [D] append, [D] cross, [D] joinby, [D] save, [D] sort
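
To make the codes in the Remarks above concrete for the original question, here is a minimal sketch. It assumes Stata 11 or later, where a match-merge is written merge 1:1 (the help quoted above documents the older syntax). The toy values and the tempfile name one are made up and merely stand in for 1.txt and 2.txt.

```stata
* Sketch only: hypothetical toy data standing in for 1.txt (saved as the using file)
clear
input v1 x
1 10
2 20
3 30
end
tempfile one
save `one'

* Hypothetical toy data standing in for 2.txt (this becomes the master)
clear
input v1 y
2 200
3 300
4 400
end

* Stata 11+ syntax; the older form in the help above is: merge v1 using filename
merge 1:1 v1 using `one'

* v1 == 4 exists only in the master       -> _merge == 1
* v1 == 1 exists only in the using data   -> _merge == 2
* v1 == 2 and v1 == 3 exist in both       -> _merge == 3
tabulate _merge
list
```

In short, 1 marks observations found only in the data in memory, 2 marks observations found only in the file named after using, and 3 marks observations found in both.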


3
carol119 posted on 2009-3-28 21:36:00
Oh, thanks a lot~~


4
止戈为武 posted on 2010-6-20 12:25:08
Thanks so much!


5
xiaoli_yuan posted on 2015-7-15 08:48:04
If _merge takes the value 5 after a merge, meaning the two datasets disagree somewhere, how do I find where they differ?


6
pingguzh posted on 2015-7-16 14:36:37
The differences are identified by the value of _merge; you can see this very clearly in Stata's output.

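
7
zabbyy posted on 2016-11-4 11:06:35
Quote: xiaoli_yuan posted on 2015-7-15 08:48
If _merge takes the value 5 after a merge, meaning the two datasets disagree somewhere, how do I find where they differ?

    keep if _merge==5
Then inspect the remaining observations yourself.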
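
A possible refinement, offered as a sketch rather than the poster's method: keep if discards every other observation, and after merge with update the master value is the only one left in a conflicting variable. One way to see both sides is to carry a copy of the using value under a new name and list the conflicts instead of dropping data. The example assumes Stata 11 or later; the toy data and the names price, price_using, and upd are hypothetical.

```stata
* Sketch only: hypothetical using data; price stands for any shared variable
clear
input id price
1 100
2 250
3 999
end
* Copy the using value under a new (hypothetical) name so it survives the merge
generate price_using = price
tempfile upd
save `upd'

* Hypothetical master data
clear
input id price
1 100
2 .
3 300
end

merge 1:1 id using `upd', update
tabulate _merge

* For _merge == 5, price still holds the master value and
* price_using holds the conflicting value from the using data
list id price price_using if _merge == 5
```

In this toy data only id 3 is a conflict: the master holds 300 and the using data holds 999. With several shared variables you would copy each one in the same way.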


8
xxbxxb789456 (verified student) posted on 2019-3-10 13:28:55
Quote: zabbyy posted on 2016-11-4 11:06
keep if _merge==5
Then inspect the remaining observations yourself.

What should I write if I want to keep only _merge==1 and _merge==3?
keep(1 3) nogen??
drop(2) nogen??

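
9
xxbxxb789456 (verified student) posted on 2019-3-10 13:29:23
Quote: 蓝色 posted on 2009-3-28 20:07
You need to learn to read the help files. Stata's help is very good and explains all of this. You need to learn to solve problems on your own. Below is Stata's help for merge: help merge ...

What should I write if I want to keep only _merge==1 and _merge==3?
keep(1 3) nogen??
drop(2) nogen??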


10
黃河泉 (verified professional) posted on 2019-3-11 08:51:17
Quote: xxbxxb789456 posted on 2019-3-10 13:29
What should I write if I want to keep only _merge==1 and _merge==3?
keep(1 3) nogen??
drop(2) nogen??

Try:
    keep if _merge == 1 | _merge == 3
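
For comparison, the option form the question was reaching for can be sketched as follows. This assumes the newer merge syntax of Stata 11 or later, where keep() selects by match result; in the older syntax quoted in the help above, keep() takes a variable list instead, so the keep if line is the way to go there. The toy data and the tempfile name extra are hypothetical.

```stata
* Sketch only: hypothetical using data
clear
input id y
2 200
3 300
4 400
end
tempfile extra
save `extra'

* Hypothetical master data
clear
input id x
1 10
2 20
3 30
end

* Keep master-only (1) and matched (3) observations and do not create _merge;
* the numeric form keep(1 3) should be equivalent to keep(master match)
merge 1:1 id using `extra', keep(master match) nogenerate
list
```

Here id 4, which exists only in the using data, is not added to the result. As far as I know the merge help lists no drop() option, so excluding _merge==2 is expressed through keep().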
