Title
[D] joinby -- Form all pairwise combinations within groups
Syntax
joinby [varlist] using filename [, options]
    options                Description
    ------------------------------------------------------------------------------------------
    Options
    When observations match:
      update               replace missing data in memory with values from filename
      replace              replace all data in memory with values from filename

    When observations do not match:
      unmatched(none)      ignore all; the default
      unmatched(both)      include from both datasets
      unmatched(master)    include from data in memory
      unmatched(using)     include from data in filename

      _merge(varname)      varname marks source of resulting observation; default is _merge
      nolabel              do not copy value-label definitions from filename
    ------------------------------------------------------------------------------------------
varlist may not contain strLs.
Menu
Data > Combine datasets > Form all pairwise combinations within groups
Description
joinby joins, within groups formed by varlist, observations of the dataset in memory with filename, a Stata-format dataset. By join we mean to form all pairwise combinations.
filename is required to be sorted by varlist. If filename is specified without an extension, .dta is assumed.
If varlist is not specified, joinby takes as varlist the set of variables common to the dataset in memory and in filename.
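As a minimal sketch of what "all pairwise combinations" means, the following do-file builds two small datasets by hand and joins them. The variable names (parent, child) and values are invented for illustration and are not the manual's child and parent datasets; family_id is deliberately the only variable the two datasets share, so omitting varlist should select it.

```stata
* Toy data (hypothetical names and values)
clear
input family_id str6 child
1 "Ann"
1 "Bob"
1 "Cy"
2 "Dee"
end
sort family_id
tempfile kids
save `kids'

clear
input family_id str6 parent
1 "Pat"
1 "Quinn"
2 "Rae"
end
sort family_id
tempfile parents
save `parents'

* Explicit varlist: family 1 gives 2 x 3 = 6 pairs, family 2 gives 1 x 1 = 1
joinby family_id using `kids'
assert _N == 7
list, sepby(family_id)

* Omitted varlist: family_id is the only variable common to both datasets
use `parents', clear
joinby using `kids'
assert _N == 7
```

The assert lines stop the do-file if the observation count differs from the count worked out by hand.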
Observations unique to one or the other dataset are ignored unless unmatched() specifies differently. Whether you load one dataset and join the other or vice versa makes no difference in the number of resulting observations.
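The claim about the order of the two datasets can be checked with a small do-file. This is only a sketch on made-up data (kid_no and par_no are hypothetical variables); families 2 and 3 appear in just one dataset and so should be dropped under the default unmatched(none).

```stata
* Toy data (hypothetical): family 1 has 3 children and 2 parents;
* family 3 has a child only, family 2 has a parent only
clear
input family_id kid_no
1 1
1 2
1 3
3 1
end
sort family_id
tempfile kids
save `kids'

clear
input family_id par_no
1 1
1 2
2 1
end
sort family_id
tempfile parents
save `parents'

* Parents in memory, children on disk
use `parents', clear
joinby family_id using `kids'
local n_pk = _N

* Children in memory, parents on disk
use `kids', clear
joinby family_id using `parents'
local n_kp = _N

display "parents in memory: `n_pk'   children in memory: `n_kp'"
assert `n_pk' == `n_kp'
assert `n_pk' == 6
```

Only the number of observations is compared here; the order of observations and variables may differ between the two runs.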
If the two datasets have variables in common other than those in varlist, however, the combined dataset will contain the values from the master data for those variables in matched observations. This behavior can be modified with the update and replace options.
Options
update varies the action that joinby takes when an observation is matched. By default, values from the master data are retained when the same variables are found in both datasets. If update is specified, however, the values from the using dataset are retained where the master dataset contains missing values.
replace, allowed with update only, specifies that nonmissing values in the master dataset be replaced with corresponding values from the using dataset. A nonmissing value, however, will never be replaced with a missing value.
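To compare the default, update, and update replace side by side, one might run the same join three times on a toy dataset in which a non-key variable appears in both files. The sketch below uses invented data (id and income are hypothetical). Going by the descriptions above, income for id 1 should be filled in from the using data once update is specified, income for id 2 should change only when replace is added, and income for id 3 should keep its master value throughout because the using value is missing.

```stata
* Using data (hypothetical): income is missing for id 3
clear
input id income
1 100
2 200
3 .
end
sort id
tempfile u
save `u'

* Master data (hypothetical): income is missing for id 1
clear
input id income
1 .
2 50
3 70
end
sort id
tempfile m
save `m'

* Default: master values of income are kept
use `m', clear
joinby id using `u'
sort id
list

* update: only missing master values are filled from the using data
use `m', clear
joinby id using `u', update
sort id
list

* update replace: nonmissing using values also overwrite master values
use `m', clear
joinby id using `u', update replace
sort id
list
```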
unmatched(none|both|master|using) specifies whether observations unique to one of the datasets are to be kept, with the variables from the other dataset set to missing. Valid values are

    none      ignore all unmatched observations (default)
    both      include unmatched observations from the master and using data
    master    include unmatched observations from the master data
    using     include unmatched observations from the using data
_merge(varname) specifies the name of the variable that will mark the source of the resulting observation. The default is _merge(_merge); that is, the variable is named _merge. To preserve compatibility with earlier versions of joinby, this variable is generated only if unmatched() is specified.
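A short sketch of unmatched() together with _merge() follows, again on invented data. The variable name source passed to _merge() and the variables kid_no and par_no are hypothetical. Family 1 is in both datasets, family 2 only in the master data, and family 3 only in the using data.

```stata
* Using data (hypothetical): children in families 1 and 3
clear
input family_id kid_no
1 1
3 1
end
sort family_id
tempfile kids
save `kids'

* Master data (hypothetical): parents in families 1 and 2
clear
input family_id par_no
1 1
1 2
2 1
end
sort family_id

* Keep unmatched observations from both sides; record the source in "source"
joinby family_id using `kids', unmatched(both) _merge(source)

* 2 matched pairs (family 1) + 1 master-only (family 2) + 1 using-only (family 3)
assert _N == 4
tabulate source
sort family_id
list, sepby(family_id)
```

In the listing, kid_no should be missing for the family 2 observation and par_no should be missing for the family 3 observation, because those variables come from the dataset in which the family does not appear.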
nolabel prevents Stata from copying the value-label definitions from the dataset on disk into the dataset in memory. Even if you do not specify this option, label definitions from the disk dataset do not replace label definitions already in memory.
Example
Setup
. webuse child
. describe
. list
. webuse parent
. describe
. list, sep(0)
. sort family_id
Join information on parents from data in memory with information on children from data at http://www.stata-press.com
. joinby family_id using http://www.stata-press.com/data/r13/child
Describe the resulting dataset
. describe
List the resulting data
. list, sepby(family_id) abbrev(12)