[Data management help] How can I use Stata to combine several datasets into one?

Original post

瑞拉加油 posted on 2013-4-27 16:50:41

Could anyone tell me how to use Stata to combine several datasets into a single dataset? Thanks, everyone.

Reply 1

ygqalone posted on 2013-4-27 20:34:33

In Stata, type the command h merge and follow the instructions it shows.

Reply 2

ygqalone posted on 2013-4-27 20:35:48

Syntax

One-to-one merge on specified key variables

      merge 1:1 varlist using filename [, options]

Many-to-one merge on specified key variables

      merge m:1 varlist using filename [, options]

One-to-many merge on specified key variables

      merge 1:m varlist using filename [, options]

Many-to-many merge on specified key variables

      merge m:m varlist using filename [, options]

One-to-one merge by observation

      merge 1:1 _n using filename [, options]

options               Description
----------------------------------------------------------------------------
Options
  keepusing(varlist)  variables to keep from using data; default is all
  generate(newvar)    name of new variable to mark merge results; default is
                        _merge
  nogenerate          do not create _merge variable
  nolabel             do not copy value-label definitions from using
  nonotes             do not copy notes from using
  update              update missing values of same-named variables in
                        master with values from using
  replace             replace all values of same-named variables in master
                        with nonmissing values from using (requires update)
  noreport            do not display match result summary table
  force               allow string/numeric variable type mismatch without
                        error

Results
  assert(results)     specify required match results
  keep(results)       specify which match results to keep

  sorted              do not sort; datasets already sorted
----------------------------------------------------------------------------
sorted does not appear in the dialog box.

Menu

Data > Combine datasets > Merge two datasets

Description

merge joins corresponding observations from the dataset currently in memory
(called the master dataset) with those from filename.dta (called the using
dataset), matching on one or more key variables.  merge can perform match
merges (one-to-one, one-to-many, many-to-one, and many-to-many), which are
often called 'joins' by database people. merge can also perform sequential
merges, which have no equivalent in the relational database world.

merge is for adding new variables from a second dataset to existing
observations.  You use merge, for instance, when combining hospital patient
and discharge datasets. If you wish to add new observations to existing
variables, then see [D] append.  You use append, for instance, when adding
current discharges to past discharges.
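
The difference between merge and append can be sketched on toy data. Everything below (variable names, values, and the temporary files people and wave1) is made up for illustration; run it as one do-file so the temporary files stay defined.

```stata
* Toy data: all names and values are hypothetical.
clear
input id age
1 30
2 41
end
tempfile people
save "`people'"

clear
input id income
1 500
2 620
end

* merge: add the variable age to the two existing observations
merge 1:1 id using "`people'"
drop _merge
tempfile wave1
save "`wave1'"

* append: add a new observation to the existing variables
clear
input id income age
3 710 25
end
append using "`wave1'"
list
```

After the merge the dataset is wider (one more variable); after the append it is longer (one more observation).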

By default, merge creates a new variable, _merge, containing numeric codes
concerning the source and the contents of each observation in the merged
dataset. These codes are explained below in the match results table.
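
A minimal sketch of inspecting _merge after a merge, assuming two made-up datasets whose id values only partly overlap (the names x, y, and other are hypothetical):

```stata
* Using dataset: ids 1 and 2 (hypothetical values)
clear
input id x
1 10
2 20
end
tempfile other
save "`other'"

* Master dataset: ids 2 and 3
clear
input id y
2 200
3 300
end

merge 1:1 id using "`other'"

* One row per code: master only, using only, and matched
tabulate _merge
list id y x _merge
```

Tabulating _merge right after every merge is a cheap way to notice keys that failed to match.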

If filename is specified without an extension, then .dta is assumed.

Options

      +---------+
----+ Options +-------------------------------------------------------------

keepusing(varlist) specifies the variables from the using dataset that are
      kept in the merged dataset. By default, all variables are kept.  For
      example, if your using dataset contains 2,000 demographic
      characteristics but you want only sex and age, then type merge ...,
      keepusing(sex age) ....
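
As a small illustration of keepusing(), assume a made-up using dataset with three variables of which only sex and age are wanted (height and score are hypothetical names):

```stata
* Using dataset with one variable we do not want (height)
clear
input id sex age height
1 0 30 170
2 1 41 165
end
tempfile demo
save "`demo'"

* Master dataset
clear
input id score
1 88
2 92
end

* Bring in only sex and age; height is left behind
merge 1:1 id using "`demo'", keepusing(sex age)
describe
```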

generate(newvar) specifies that the variable containing match results
      information should be named newvar rather than _merge.

nogenerate specifies that _merge not be created.  This would be useful if
      you also specified keep(match), because keep(match) ensures that all
      values of _merge would be 3.

nolabel specifies that value-label definitions from the using file be
      ignored.  This option should be rare, because definitions from the
      master are already used.

nonotes specifies that notes in the using dataset not be added to the merged
      dataset; see [D] notes.

update and replace both perform an update merge rather than a standard
      merge.  In a standard merge, the data in the master are the authority
      and inviolable.  For example, if the master and using datasets both
      contain a variable age, then matched observations will contain values
      from the master dataset, while unmatched observations will contain
      values from their respective datasets.

      If update is specified, then matched observations will update missing
      values from the master dataset with values from the using dataset.
      Nonmissing values in the master dataset will be unchanged.

      If replace is specified, then matched observations will contain values
      from the using dataset, unless the value in the using dataset is
      missing.

      Specifying either update or replace affects the meanings of the match
      codes. See Treatment of overlapping variables in [D] merge for details.
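
The two kinds of update merge can be compared on a toy example. The datasets below are invented: the master has one missing age and one age that conflicts with the using dataset.

```stata
* Using dataset (hypothetical newer values)
clear
input id age
1 30
2 52
end
tempfile newer
save "`newer'"

* Master dataset: age is missing for id 1 and differs for id 2
clear
input id age
1 .
2 40
end
tempfile older
save "`older'"

* update: only the missing master value (id 1) is filled in
merge 1:1 id using "`newer'", update
list

* update replace: the conflicting nonmissing value (id 2) is overwritten as well
use "`older'", clear
merge 1:1 id using "`newer'", update replace
list
```

In both runs _merge uses codes 4 and 5 rather than 3 for these observations, which is the change in meaning mentioned above.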

noreport specifies that merge not present its summary table of match
      results.

force allows string/numeric variable type mismatches, resulting in missing
      values from the using dataset.  If omitted, merge issues an error; if
      specified, merge issues a warning.

      +---------+
----+ Results +-------------------------------------------------------------

assert(results) specifies the required match results.  The possible results
      are

         numeric    equivalent
          code      word (results)     description
         -------------------------------------------------------------------
            1       master             observation appeared in master only
            2       using              observation appeared in using only
            3       match              observation appeared in both

            4       match_update       observation appeared in both,
                                         missing values updated
            5       match_conflict     observation appeared in both,
                                         conflicting nonmissing values
         -------------------------------------------------------------------
         Codes 4 and 5 can arise only if the update option is specified.
         If codes of both 4 and 5 could pertain to an observation, then 5 is
         used.

      Numeric codes and words are equivalent when used in the assert() or
      keep() options.

      The following synonyms are allowed:  masters for master, usings for
      using, matches and matched for match, match_updates for match_update,
      and match_conflicts for match_conflict.

      Using assert(match master) specifies that the merged file is required to
      include only matched master or using observations and unmatched master
      observations, and may not include unmatched using observations.
      Specifying assert() results in merge issuing an error if any
      observation has a match result other than those you allowed.
      The order of the words or codes is not important, so all the following
      assert() specifications would be the same:

         assert(match master)

         assert(master matches)

         assert(1 3)

      When the match results contain codes other than those allowed, return
      code 9 is returned, and the merged dataset with the unanticipated
      results is left in memory to allow you to investigate.
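
A sketch of what a failed assert() looks like, assuming made-up data in which the using dataset has an id that the master lacks. capture is used here only so that the do-file keeps running after the error; the names x, y, and extra are hypothetical.

```stata
* Using dataset contains id 3, which is not in the master
clear
input id x
1 10
2 20
3 30
end
tempfile extra
save "`extra'"

* Master dataset
clear
input id y
1 100
2 200
end

* assert(match master) forbids using-only observations, so this merge errors out
capture noisily merge 1:1 id using "`extra'", assert(match master)
local rc = _rc
display "return code: `rc'"

* The merged data stay in memory, so the offending observation can be inspected
tabulate _merge
list if _merge == 2
```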

keep(results) specifies which observations are to be kept from the merged
      dataset.  Using keep(match master) specifies keeping only matched
      observations and unmatched master observations after merging.

      keep() differs from assert() because it selects observations from the
      merged dataset rather than enforcing requirements.  keep() is used to
      pare the merged dataset to a given set of observations when you do not
      care if there are other observations in the merged dataset.  assert() is
      used to verify that only a given set of observations is in the merged
      dataset.

      You can specify both assert() and keep().  If you require matched
      observations and unmatched master observations but you want only the
      matched observations, then you could specify assert(match master)
      keep(match).

      assert() and keep() are convenience options whose functionality can be
      duplicated using _merge directly.

         . merge ..., assert(match master) keep(match)

      is identical to

         . merge ...
         . assert _merge==1 | _merge==3
         . keep if _merge==3

The following option is available with merge but is not shown in the dialog
box:

sorted specifies that the master and using datasets are already sorted by
      varlist.  If the datasets are already sorted, then merge runs a little
      more quickly; the difference is hardly detectable, so this option is of
      interest only where speed is of the utmost importance.

Prior syntax

Prior to Stata 11, merge had a more primitive syntax.  Code using the old
syntax will run unmodified.  To assist those attempting to understand or
debug out-of-date code, the original help file for merge can be found here.

Examples

------------------------------------------------------------------------------
Setup
      . webuse autosize
      . list
      . webuse autoexpense
      . list

Perform 1:1 match merge
      . webuse autosize
      . merge 1:1 make using http://www.stata-press.com/data/r12/autoexpense
      . list

------------------------------------------------------------------------------
Perform 1:1 match merge, requiring there to be only matches
(The merge command intentionally causes an error message.)
      . webuse autosize, clear
      . merge 1:1 make using http://www.stata-press.com/data/r12/autoexpense, assert(match)
      . tab _merge
      . list

------------------------------------------------------------------------------
Perform 1:1 match merge, keeping only matches and squelching the _merge
variable
      . webuse autosize, clear
      . merge 1:1 make using http://www.stata-press.com/data/r12/autoexpense, keep(match) nogen
      . list

------------------------------------------------------------------------------
Setup
      . webuse dollars, clear
      . list
      . webuse sforce
      . list

Perform m:1 match merge with sforce in memory
      . merge m:1 region using http://www.stata-press.com/data/r12/dollars
      . list

------------------------------------------------------------------------------
Setup
      . webuse overlap1, clear
      . list, sepby(id)
      . webuse overlap2
      . list

Perform m:1 match merge, illustrating update option
      . webuse overlap1
      . merge m:1 id using http://www.stata-press.com/data/r12/overlap2, update
      . list

------------------------------------------------------------------------------
Perform m:1 match merge, illustrating update replace option
      . webuse overlap1, clear
      . merge m:1 id using http://www.stata-press.com/data/r12/overlap2, update replace
      . list

------------------------------------------------------------------------------
Perform 1:m match merge, illustrating update replace option
      . webuse overlap2, clear
      . merge 1:m id using http://www.stata-press.com/data/r12/overlap1, update replace
      . list

------------------------------------------------------------------------------
Perform sequential merge
      . webuse sforce, clear
      . merge 1:1 _n using http://www.stata-press.com/data/r12/dollars
      . list
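
Coming back to the original question: merge takes one using dataset at a time, so one way to combine several datasets is to repeat it in a loop. The sketch below assumes every dataset shares a key variable (here a hypothetical id) that uniquely identifies observations; the three toy datasets and their variables a, b, and c are invented for illustration.

```stata
* Build three toy datasets that share the key variable id
clear
input id a
1 10
2 20
end
tempfile f1
save "`f1'"

clear
input id b
1 5
2 6
end
tempfile f2
save "`f2'"

clear
input id c
1 7
2 8
end
tempfile f3
save "`f3'"

* Start from the first dataset and merge the others in, one at a time.
* ``f'' expands first to the macro name (f2, f3) and then to the file path.
use "`f1'", clear
foreach f in f2 f3 {
    merge 1:1 id using "``f''", nogenerate
}
list
```

With real files on disk, the loop list would hold the file names themselves and the merge line would read using "`f'". If the datasets instead contain the same variables for different observations, append is the command to loop over rather than merge.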