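First take a look at the duplicates command and its options.
It checks whether your data contain duplicate observations.
Only if there are none can the two files be merged one-to-one.
I couldn't download that cv data, so I can't verify for you whether it has a problem.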
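
To make the check-then-merge idea concrete, here is a minimal sketch. The file names main_data.dta and cv_data.dta and the key variable id are placeholders I made up, since I have not seen your files; substitute your own names.

```stata
* Sketch only: main_data.dta, cv_data.dta, and id are hypothetical names.

* Check the file you want to merge in.
use cv_data.dta, clear
duplicates report id
* isid exits with an error if id does not uniquely identify the observations.
isid id

* Check the main file the same way.
use main_data.dta, clear
duplicates report id
isid id

* Only when both checks pass is a one-to-one merge on id appropriate.
merge 1:1 id using cv_data.dta
```

If duplicates report shows any group with more than one copy, look at those observations with duplicates list id before deciding what to do with them.
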
Title
    [D] duplicates -- Report, tag, or drop duplicate observations

Syntax
    Report duplicates
        duplicates report [varlist] [if] [in]

    List one example for each group of duplicates
        duplicates examples [varlist] [if] [in] [, options]

    List all duplicates
        duplicates list [varlist] [if] [in] [, options]

    Tag duplicates
        duplicates tag [varlist] [if] [in] , generate(newvar)

    Drop duplicates
        duplicates drop [if] [in]
        duplicates drop varlist [if] [in] , force

options                 Description
--------------------------------------------------------------------------------------------------
Main
  compress              compress width of columns in both table and display formats
  nocompress            use display format of each variable
  fast                  synonym for nocompress; no delay in output of large datasets
  abbreviate(#)         abbreviate variable names to # characters; default is ab(8)
  string(#)             truncate string variables to # characters; default is string(10)

Options
  table                 force table format
  display               force display format
  header                display variable header once; default is table mode
  noheader              suppress variable header
  header(#)             display variable header every # lines
  clean                 force table format with no divider or separator lines
  divider               draw divider lines between columns
  separator(#)          draw a separator line every # lines; default is separator(5)
  sepby(varlist)        draw a separator line whenever varlist values change
  nolabel               display numeric codes rather than label values

Summary
  mean[(varlist)]       add line reporting the mean for each of the (specified) variables
  sum[(varlist)]        add line reporting the sum for each of the (specified) variables
  N[(varlist)]          add line reporting the number of nonmissing values for each of the
                          (specified) variables
  labvar(varname)       substitute Mean, Sum, or N for varname in last row of table

Advanced
  constant[(varlist)]   separate and list variables that are constant only once
  notrim                suppress string trimming
  absolute              display overall observation numbers when using by varlist:
  nodotz                display numerical values equal to .z as field of blanks
  subvarname            substitute characteristic for variable name in header
  linesize(#)           columns per line; default is linesize(79)
--------------------------------------------------------------------------------------------------

Menu
    Data > Data utilities > Manage duplicate observations

Description
duplicates reports, displays, lists, tags, or drops duplicate observations, depending on the
subcommand specified. Duplicates are observations with identical values either on all variables
if no varlist is specified or on a specified varlist.

duplicates report produces a table showing observations that occur as one or more copies and
indicating how many observations are "surplus" in the sense that they are the second (third, ...)
copy of the first of each group of duplicates.
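
As a small illustration of the report subcommand, the sketch below builds a six-observation toy dataset and runs duplicates report on it. The variables id and x and their values are invented for this example only.

```stata
* Toy data, invented for illustration: id and x are hypothetical variables.
clear
input id x
1 10
1 10
2 20
3 30
3 30
3 30
end

* No varlist: observations are duplicates only if they match on every variable.
duplicates report

* With a varlist: only id is compared.
duplicates report id
```

With these data the table should show one observation with 1 copy (0 surplus), two observations with 2 copies (1 surplus), and three observations with 3 copies (2 surplus).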

duplicates examples lists one example for each group of duplicated observations. Each example
represents the first occurrence of each group in the dataset.

duplicates list lists all duplicated observations.

duplicates tag generates a variable representing the number of duplicates for each observation.
This will be 0 for all unique observations.
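
A minimal sketch of the tag subcommand on the same kind of toy data follows. The variable names id, x, and dup are made up for the example.

```stata
* Toy data, invented for illustration: id, x, and dup are hypothetical names.
clear
input id x
1 10
1 10
2 20
3 30
3 30
3 30
end

* dup holds the number of other observations that match on id and x.
duplicates tag id x, generate(dup)

* Show everything, with a separator line between id groups.
list, sepby(id)

* Show only the observations that have at least one duplicate.
list if dup > 0
```

Here dup should be 1 for the two id 1 rows, 0 for the single id 2 row, and 2 for the three id 3 rows, so the tag is the number of copies minus one.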

duplicates drop drops all but the first occurrence of each group of duplicated observations. The
word drop may not be abbreviated.

Any observations that do not satisfy specified if and/or in conditions are ignored when you use
report, examples, list, or drop. The variable created by tag will have missing values for such
observations.
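
The two forms of duplicates drop described above can be sketched as follows. The data are again invented, with id and x as hypothetical variables. Note that with a varlist the force option is required, because the dropped observations may differ on the variables not named.

```stata
* Toy data, invented for illustration: id and x are hypothetical variables.
clear
input id x
1 10
1 10
1 11
2 20
end

* Drop observations identical on all variables, keeping the first of each group.
duplicates drop
count

* Drop on id alone; the row with x == 11 is discarded even though x differs.
duplicates drop id, force
count
```

The first count should report 3 observations and the second 2. Because the varlist form can silently discard information, it is worth inspecting the groups with duplicates list id before running it.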