[Other] Help: what do the values of the automatically generated _merge variable mean?

Original post: carol119, posted 2009-3-28 19:27:00

insheet using "1.txt", tab names
sort v1
save "1.dta", replace
clear
insheet using "2.txt", tab names
sort v1
merge v1 using "1.dta"

After running these commands, a variable named _merge appears, and its values are all numbers such as 1, 2, and 3. What do these numbers mean?

Reply #1: 蓝色, posted 2009-3-28 20:07:00

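You need to learn to read the help files. Stata's help is very good and explains all of this.

You should learn to solve problems like this on your own. Below is the Stata help for merge.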

help merge                                                                        dialogs: merge, merge multiple
--------------------------------------------------------------------------------------------------------------------

Title

[D] merge -- Merge datasets

Syntax

merge [varlist] using filename [filename ...] [, options]

    options             description
    --------------------------------------------------------------------------------------------------------------
    Options
      keep(varlist)     keep only the specified variables from data in filename
      _merge(newvar)    newvar marks source of resulting observation; default is _merge
      nolabel           do not copy value label definitions from filename
      nonotes           do not copy notes from filename
      update            replace missing data in memory with data from filename
      replace           replace nonmissing data in memory with data from filename
      nokeep            drop observations in using dataset that do not match
      nosummary         drop summary variables when multiple filenames are specified
    * unique            match variables uniquely identify observations in both data in memory and in filename
    * uniqmaster        match variables uniquely identify observations in memory
    * uniqusing         match variables uniquely identify observations in filename
    * sort              sort master and using datasets by match variables before merge; sort implies unique if
                          uniqmaster or uniqusing is not specified
    --------------------------------------------------------------------------------------------------------------
    * unique, uniqmaster, uniqusing, and sort require varlist (the match variables) be specified.

Description

    merge joins corresponding observations from the dataset currently in memory (called the master dataset) with
    those from Stata-format datasets stored as filename (called the using datasets) into single observations. If
    filename is specified without an extension, .dta is assumed.

    merge can perform both one-to-one and match merges.

Options

        +---------+
    ----+ Options +-----------------------------------------------------------------------------------------------

    keep(varlist) specifies the variables to be kept from the using data. If keep() is not specified, all
        variables are kept.

        The varlist in keep(varlist) differs from standard Stata varlists in two ways: variable names in varlist
        may not be abbreviated, except by the use of wildcard characters; and you may not refer to a range of
        variables, such as price-weight.

    _merge(newvar) specifies the name of the variable to be created that will mark the source of the resulting
        observation. The default is _merge(_merge); that is, if you do not specify this option, the new variable
        will be named _merge.

    nolabel prevents Stata from copying the value label definitions from the using dataset into the result. Even
        if you do not specify this option, label definitions from the using dataset do not replace label
        definitions in the master dataset.

    nonotes prevents notes in the using data from being incorporated into the result. The default is to
        incorporate notes from the using data that do not already appear in the master dataset.

    update specifies that the values from the using dataset be retained in cases where the master dataset contains
        missing. By default, the master dataset is held inviolate -- values from the master dataset are retained
        when the variables are found in both datasets.

    replace, allowed with update only, specifies that even when the master dataset contains nonmissing values,
        they are to be replaced with corresponding values from the using dataset when the corresponding values are
        not equal. A nonmissing value, however, will never be replaced with a missing value.

    nokeep causes merge to ignore observations in the using dataset that have no corresponding observation in the
        master. The default is to add these observations to the merged result and mark such observations with
        _merge==2.

    nosummary causes merge to drop the summary variables created when multiple using datasets are specified. The
        default is to create _merge1 recording results from merging the first disk dataset, _merge2 recording
        results from merging the second disk dataset, and so on. _merge1, _merge2, ..., contain 1 if an
        observation was found in the respective disk dataset and 0 otherwise.

        Whether or not nosummary is specified, overall status variable _merge is created.

    unique, uniqmaster, and uniqusing specify that the match variables in a match-merge uniquely identify the
        observations. Match variables are required with unique, uniqmaster, and uniqusing.

        unique specifies that the match variables uniquely identify the observations in the master dataset and in
        the using dataset. For most match-merges, you should specify unique. merge does nothing differently when
        you specify the option, unless the assumption you are making is false, in which case an error message is
        issued and the data are not merged.

        uniqmaster specifies that the match variables uniquely identify the observations in memory, the master
        data, but not necessarily the ones in the using dataset.

        uniqusing specifies that the match variables uniquely identify the observations in the using dataset, but
        not necessarily the ones in the master dataset.

        unique is thus equivalent to specifying uniqmaster and uniqusing.

        Things are more complicated when multiple using datasets are specified. unique still means unique in all
        datasets, and uniqusing still means unique in each of the using datasets, just as you would expect, but
        uniqmaster takes on a whole new meaning: uniqmaster means unique in the master and in all using datasets
        except the last! It asserts that the match variables uniquely identify observations in the master at each
        step, meaning that when the master is merged with the first using dataset, then when the (new) master
        (equal to original plus first using) is merged with the second using dataset, and so on. In summary,
        uniqmaster is simply not useful when multiple using datasets are specified.

        If none of the three unique options are specified, observations in neither the master nor the using
        dataset are required to be unique, although they could be. If they are not unique, records that have the
        same values of the match variables are joined by observation until all the records on one side or the
        other are matched; after that, the final record on the shorter side is duplicated over and over again to
        match with the remaining records needing to be matched on the longer side.

    sort specifies that the master and using datasets be sorted by the match variables, before the datasets are
        merged, if they are not already sorted by them. Match variables are required with sort. sort implies
        unique if uniqmaster or uniqusing is not specified.

Remarks

    merge can perform both one-to-one and match merges. In either case, the variable _merge (or the variable
    specified in _merge() if provided) is added to the data containing

                           _merge==1    obs. from master data
                           _merge==2    obs. from only one using dataset
                           _merge==3    obs. from at least two datasets, master or using
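
The three codes above can be seen in a small made-up example. This is only a sketch using the same old-style merge syntax as this help text. The file name using_demo.dta, the variables x and y, and all values are invented for illustration. Newer Stata releases use a different syntax (merge 1:1 ...), so treat it as a sketch for the version documented here.

```stata
* Build the using dataset: keys 1, 2, 3 (hypothetical data)
clear
input v1 x
1 10
2 20
3 30
end
sort v1
save "using_demo.dta", replace

* Build the master dataset in memory: keys 2, 3, 4 (hypothetical data)
clear
input v1 y
2 200
3 300
4 400
end
sort v1

* Match-merge on v1; both datasets are already sorted by v1
merge v1 using "using_demo.dta"

* v1==4 exists only in the master       -> _merge==1
* v1==1 exists only in the using data   -> _merge==2
* v1==2 and v1==3 exist in both         -> _merge==3
tabulate _merge
list v1 x y _merge
```

Applied to the commands in the question, the data read from 2.txt is the master and 1.dta is the using dataset, so _merge==1 marks values of v1 found only in 2.txt, _merge==2 marks values found only in 1.dta, and _merge==3 marks values found in both.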

    update can be used only when there is one using file. When update is specified, the codes for _merge are

                           _merge==1    obs. from master data
                           _merge==2    obs. from using data
                           _merge==3    obs. from both, master agrees with using
                           _merge==4    obs. from both, missing in master updated
                           _merge==5    obs. from both, master disagrees with using
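
The five update codes can be illustrated the same way. Again, this is a sketch under assumptions: the file name updates_demo.dta, the variables id and price, and the values are all made up, and the old-style merge syntax from this help text is used.

```stata
* Using dataset holding the "updates" (hypothetical data)
clear
input id price
1 100
2 250
3 350
5 500
end
sort id
save "updates_demo.dta", replace

* Master dataset in memory; price is missing for id==2 (hypothetical data)
clear
input id price
1 100
2 .
3 300
4 400
end
sort id

* update fills in missing master values only; without replace,
* nonmissing master values are kept even when the using data disagree
merge id using "updates_demo.dta", update

* id==1: in both, values agree                 -> _merge==3
* id==2: in both, missing price filled (250)   -> _merge==4
* id==3: in both, 300 vs 350, master kept      -> _merge==5
* id==4: master only                           -> _merge==1
* id==5: using only                            -> _merge==2
list id price _merge
```

Adding the replace option to update would instead overwrite the nonmissing master value for id==3 with the value from the using data, as described under Options above.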

    When multiple using files are specified, a set of summary variables is created, as long as nosummary is not
    used. These summary variables are named _merge1 (related to the first using dataset), _merge2 (related to the
    second using dataset), etc. (or, once again, the variable specified in _merge() if provided, followed by the
    number of the using file). These variables will contain

                           _mergek==0   obs. not present in corresponding using dataset
                           _mergek==1   obs. present in corresponding using dataset

    Variable labels identifying the dataset associated with each summary variable are attached to these summary
    variables.

Examples

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse odd
        . list
        . webuse even1
        . list

    Perform one-to-one merge
        . merge using http://www.stata-press.com/data/r10/odd
        . list

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse even1, clear

    Perform match-merge
        . merge number using http://www.stata-press.com/data/r10/odd, sort
        . list
        . sort number
        . list

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse autotech, clear
        . describe
        . describe using http://www.stata-press.com/data/r10/autocost

    Perform match-merge
        . merge make using http://www.stata-press.com/data/r10/autocost
        . tabulate _merge

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse dollars, clear
        . list
        . webuse sforce
        . list

    Perform match-merge with spreading
        . merge region using http://www.stata-press.com/data/r10/dollars

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse odd3, clear
        . list
        . webuse letter
        . list
        . webuse even
        . list

    Perform match-merge with multiple datasets
        . merge number using http://www.stata-press.com/data/r10/odd3 http://www.stata-press.com/data/r10/letter

    ----------------------------------------------------------------------------------------------------------------
    Setup
        . webuse original, clear
        . list
        . webuse updates
        . list
        . webuse original

    Update data with match-merge
        . merge make using http://www.stata-press.com/data/r10/updates, update
        . list
    ----------------------------------------------------------------------------------------------------------------

Also see

    Manual:  [D] merge

    Online:  [D] append, [D] cross, [D] joinby, [D] save, [D] sort

Stata FAQ: https://bbs.pinggu.org/thread-272681-1-1.html
