[Data management help] A problem with merging data

#11

rurusoso posted on 2014-3-16 22:51:29

Maybe I got something wrong just now = =
I'm still not very familiar with Stata, sorry for the trouble.

So should I use the duplicates drop command next?

#12

蓝色 posted on 2014-3-16 22:54:00

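First take a look at duplicates and its options.

It helps you check whether the data contain duplicate observations.
Only if there are none can the two datasets be matched one to one and merged.
I can't download that cv data, so I can't verify whether it has a problem.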
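
To make the check concrete, here is a minimal sketch of testing the key before a one-to-one merge. The variable names (id, x, y) and the toy values are made up for illustration, since the actual dataset could not be downloaded; isid is used as one possible way to test uniqueness.

```stata
* Hypothetical "using" dataset, saved to a temporary file
clear
input id y
1 100
2 200
3 300
end
tempfile other
save `other'

* Hypothetical master dataset in which id 2 appears twice
clear
input id x
1 10
2 20
2 21
3 30
end

* Show how many observations share the same id
duplicates report id

* isid returns a nonzero code if id does not uniquely identify observations
capture isid id
if _rc == 0 {
    merge 1:1 id using `other'
}
else {
    display as error "id is not unique in the master data; fix this before merge 1:1"
}
```

With this toy data the else branch runs, because id 2 occurs twice in the master data.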

Title

[D] duplicates -- Report, tag, or drop duplicate observations

Syntax

Report duplicates

      duplicates report [varlist] [if] [in]

List one example for each group of duplicates

      duplicates examples [varlist] [if] [in] [, options]

List all duplicates

      duplicates list [varlist] [if] [in] [, options]

Tag duplicates

      duplicates tag [varlist] [if] [in] , generate(newvar)

Drop duplicates

      duplicates drop [if] [in]

      duplicates drop varlist [if] [in] , force

options                  Description
--------------------------------------------------------------------------------------------------
Main
   compress              compress width of columns in both table and display formats
   nocompress            use display format of each variable
   fast                  synonym for nocompress; no delay in output of large datasets
   abbreviate(#)         abbreviate variable names to # characters; default is ab(8)
   string(#)             truncate string variables to # characters; default is string(10)

Options
   table                 force table format
   display               force display format
   header                display variable header once; default is table mode
   noheader              suppress variable header
   header(#)             display variable header every # lines
   clean                 force table format with no divider or separator lines
   divider               draw divider lines between columns
   separator(#)          draw a separator line every # lines; default is separator(5)
   sepby(varlist)        draw a separator line whenever varlist values change
   nolabel               display numeric codes rather than label values

Summary
   mean[(varlist)]       add line reporting the mean for each of the (specified) variables
   sum[(varlist)]        add line reporting the sum for each of the (specified) variables
   N[(varlist)]          add line reporting the number of nonmissing values for each of the
                         (specified) variables
   labvar(varname)       substitute Mean, Sum, or N for varname in last row of table

Advanced
   constant[(varlist)]   separate and list variables that are constant only once
   notrim                suppress string trimming
   absolute              display overall observation numbers when using by varlist:
   nodotz                display numerical values equal to .z as field of blanks
   subvarname            substitute characteristic for variable name in header
   linesize(#)           columns per line; default is linesize(79)
--------------------------------------------------------------------------------------------------

Menu

Data > Data utilities > Manage duplicate observations

Description

duplicates reports, displays, lists, tags, or drops duplicate observations, depending on the
subcommand specified.  Duplicates are observations with identical values either on all variables
if no varlist is specified or on a specified varlist.

duplicates report produces a table showing observations that occur as one or more copies and
indicating how many observations are "surplus" in the sense that they are the second (third, ...)
copy of the first of each group of duplicates.
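
As an illustration of what "surplus" means, the sketch below runs duplicates report on a made-up variable id (not from the original thread) in which one value occurs once, one twice, and one three times.

```stata
* Hypothetical data: id 1 occurs once, id 2 twice, id 3 three times
clear
input id
1
2
2
3
3
3
end

* Copies = 1: 1 observation,  0 surplus
* Copies = 2: 2 observations, 1 surplus
* Copies = 3: 3 observations, 2 surplus
duplicates report id

* Stored results: total observations and number of distinct groups
display "observations: " r(N) "   distinct ids: " r(unique_value)
```

The surplus column adds up to 3, which is the number of observations that duplicates drop id, force would remove from this data.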

duplicates examples lists one example for each group of duplicated observations.  Each example
represents the first occurrence of each group in the dataset.

duplicates list lists all duplicated observations.
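
A short sketch of the difference between the two listing subcommands, using made-up variables id and x. The sepby() option is one of the list options from the table above.

```stata
* Hypothetical data with two groups of duplicated ids
clear
input id x
1 10
2 20
2 21
3 30
3 31
3 32
end

* One representative row per duplicated id (here: id 2 and id 3)
duplicates examples id

* Every duplicated row, with a separator line between ids
duplicates list id, sepby(id)
```

id 1 appears in neither listing because it is unique.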

duplicates tag generates a variable representing the number of duplicates for each observation.
This will be 0 for all unique observations.
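
For example, tagging lets you inspect the problem rows before deciding what to delete. This is only a sketch: id, x, and the new variable name dup are hypothetical.

```stata
* Hypothetical data: id 2 occurs twice, id 3 three times
clear
input id x
1 10
2 20
2 21
3 30
3 31
3 32
end

* dup = number of OTHER observations with the same id
duplicates tag id, generate(dup)

* Expected tags: 0 for id 1, 1 for id 2, 2 for id 3
assert dup == 0 if id == 1
assert dup == 1 if id == 2
assert dup == 2 if id == 3

* Look at the duplicated rows side by side
list id x dup if dup > 0, sepby(id)
```

Because the tag is an ordinary variable, it can also be used later in if conditions, for instance to keep the duplicated rows in a separate file for review.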

duplicates drop drops all but the first occurrence of each group of duplicated observations.  The
word drop may not be abbreviated.
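
The two forms of duplicates drop behave differently, which matters for the question in this thread. A minimal sketch with made-up variables id and x:

```stata
* Hypothetical data: rows 2 and 3 are identical on ALL variables;
* rows 4 and 5 share an id but differ on x
clear
input id x
1 10
2 20
2 20
3 30
3 31
end

* Without a varlist: drops only exact copies (one of the two "2 20" rows)
duplicates drop
assert _N == 4

* With a varlist: keeps the first row per id and discards the rest.
* force is required because the dropped rows may differ on other variables.
duplicates drop id, force
assert _N == 3
```

Note that the second command silently discards the row with x = 31. If the rows within an id differ, it is usually worth inspecting them with duplicates list before dropping anything.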

Any observations that do not satisfy specified if and/or in conditions are ignored when you use
report, examples, list, or drop.  The variable created by tag will have missing values for such
observations.
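
A small sketch of the if behaviour with tag, using hypothetical variables id and year and a hypothetical new variable dup14:

```stata
* Hypothetical data: id 1 appears in 2013 once and in 2014 twice
clear
input id year
1 2013
1 2014
1 2014
2 2014
end

* Only 2014 observations are compared; the 2013 row is ignored
duplicates tag id if year == 2014, generate(dup14)

assert missing(dup14) if year == 2013
assert dup14 == 1 if id == 1 & year == 2014
assert dup14 == 0 if id == 2

* Missing values count as larger than any number in Stata,
* so exclude them explicitly when selecting duplicates
list if dup14 > 0 & !missing(dup14)
```

Without the !missing() check, the condition dup14 > 0 would also select the ignored 2013 row.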

Stata FAQ: https://bbs.pinggu.org/thread-272681-1-1.html