Title
[U] 11.4.3 Factor variables
Description
Factor variables are extensions of varlists of existing variables. When a command allows factor
variables, in addition to typing variable names from your data, you can type factor variables, which
might look like
    i.varname
    i.varname#i.varname
    i.varname#i.varname#i.varname
    i.varname##i.varname
    i.varname##i.varname##i.varname
Factor variables create indicator variables from categorical variables, interactions of indicators of
categorical variables, interactions of categorical and continuous variables, and interactions of
continuous variables (polynomials). They are allowed with most estimation and postestimation commands,
along with a few other commands.
There are four factor-variable operators:
Operator    Description
--------------------------------------------------------------------------------------------------
i.          unary operator to specify indicators
c.          unary operator to treat as continuous
#           binary operator to specify interactions
##          binary operator to specify factorial interactions
--------------------------------------------------------------------------------------------------
The indicators and interactions created by factor-variable operators are referred to as virtual
variables. They act like variables in varlists but do not exist in the dataset.
Categorical variables to which factor-variable operators are applied must contain nonnegative integers
with values in the range 0 to 32,740, inclusive.
Factor variables may be combined with the L. and F. time-series operators.
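As a minimal sketch of how virtual variables behave, the commands below use the auto dataset shipped with Stata (an assumption for illustration; it is not part of the text above), in which rep78 is a categorical variable with nonnegative integer levels. The indicators can be listed and summarized like ordinary variables, yet the number of variables in the dataset does not change:

    . sysuse auto, clear
    . * indicators for each level of rep78, shown next to the original variable
    . list rep78 i.rep78 in 1/5
    . * summarize accepts the same virtual variables
    . summarize i.rep78
    . * the number of variables is unchanged; no indicator variables were created
    . describe, short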
Remarks
Remarks are presented under the following headings:
    Basic examples
    Base levels
    Selecting levels
    Applying operators to a group of variables
Basic examples
Here are some examples of the operators in use:
Factor
specification      Result
--------------------------------------------------------------------------------------------------
i.group            indicators for levels of group
i.group#i.sex      indicators for each combination of levels of group and sex, a two-way
                   interaction
group#sex          same as i.group#i.sex
group#sex#arm      indicators for each combination of levels of group, sex, and arm, a
                   three-way interaction
group##sex         same as i.group i.sex group#sex
group##sex##arm    same as i.group i.sex i.arm group#sex group#arm sex#arm group#sex#arm
sex#c.age          two variables -- age for males and 0 elsewhere, and age for females and
                   0 elsewhere; if age is also in the model, one of the two virtual
                   variables will be treated as a base
sex##c.age         same as i.sex age sex#c.age
c.age              same as age
c.age#c.age        age squared
c.age#c.age#c.age  age cubed
--------------------------------------------------------------------------------------------------
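The equivalences in the table can be checked on real data. The sketch below assumes the auto dataset shipped with Stata, with price as the outcome, foreign as a categorical variable, and mpg as a continuous variable; these choices are illustrative and are not part of the table above. The first two regressions should fit the same model:

    . sysuse auto, clear
    . * factorial interaction of a categorical and a continuous variable
    . regress price i.foreign##c.mpg
    . * the same model with main effects and interaction written out
    . regress price i.foreign mpg i.foreign#c.mpg
    . * quadratic in mpg through a continuous-by-continuous interaction
    . regress price mpg c.mpg#c.mpg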
Base levels
You can specify the base level of a factor variable by using the ib. operator. The syntax is
Base
operator(*)    Description
------------------------------------------------------------------------------------------------
ib#.           use # as base, #=value of variable
ib(##).        use the #th ordered value as base (**)
ib(first).     use smallest value as base (the default)
ib(last).      use largest value as base
ib(freq).      use most frequent value as base
ibn.           no base level
------------------------------------------------------------------------------------------------
(*)  The i may be omitted. For instance, you may type ib2.group or b2.group.
(**) For example, ib(#2). means to use the second value as the base.
If you want to use group==3 as the base in a regression, you can type
    . regress y i.sex ib3.group
You can also permanently set the base levels of categorical variables by using the fvset command.
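A short sketch of both approaches, assuming the auto dataset shipped with Stata, where rep78 takes the values 1 through 5 (the dataset and variable choices are illustrative assumptions). The ib. operator changes the base for one command, whereas fvset changes what a plain i.rep78 means until the setting is cleared:

    . sysuse auto, clear
    . * default: the smallest value of rep78 is the base
    . regress price i.rep78
    . * use rep78==3 as the base for this command only
    . regress price ib3.rep78
    . * use the most frequent value of rep78 as the base
    . regress price ib(freq).rep78
    . * set the base with fvset; i.rep78 now uses rep78==3 as the base
    . fvset base 3 rep78
    . regress price i.rep78
    . * restore the default setting
    . fvset clear rep78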
Selecting levels
You can select a range of levels -- a range of virtual variables -- by using the i(numlist). operator.
Examples         Description
--------------------------------------------------------------------------------------------------
i2.cat           a single indicator for cat==2
2.cat            same as i2.cat
i(2 3 4).cat     three indicators, cat==2, cat==3, and cat==4;
                 same as i2.cat i3.cat i4.cat
i(2/4).cat       same as i(2 3 4).cat
2.cat#1.sex      a single indicator that is 1 when cat==2 and sex==1, and is 0 otherwise
i2.cat#i1.sex    same as 2.cat#1.sex
--------------------------------------------------------------------------------------------------
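The level selections above can be tried directly. This sketch again assumes the auto dataset shipped with Stata, with rep78 standing in for cat and foreign standing in for sex; both substitutions are illustrative:

    . sysuse auto, clear
    . * indicators for rep78==2, rep78==3, and rep78==4 only
    . list rep78 i(2/4).rep78 in 1/10
    . * a single indicator that is 1 when rep78==3 and foreign==1
    . regress price mpg 3.rep78#1.foreign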
Applying operators to a group of variables
Factor-variable operators may be applied to groups of variables by using parentheses.
In the examples that follow, variables group, sex, arm, and cat are categorical, and variables age, wt,
and bp are continuous:
Examples                 Expansion
--------------------------------------------------------------------------------------------------
i.(group sex arm)        i.group i.sex i.arm
group#(sex arm cat)      group#sex group#arm group#cat
group##(sex arm cat)     i.group i.sex i.arm i.cat group#sex group#arm group#cat
group#(c.age c.wt c.bp)  group#c.age group#c.wt group#c.bp
group#c.(age wt bp)      same as group#(c.age c.wt c.bp)
--------------------------------------------------------------------------------------------------
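As a sketch of the parenthesized forms, the commands below assume the auto dataset shipped with Stata, with foreign and rep78 as categorical variables and mpg and weight as continuous variables (illustrative choices, not part of the table above). The last two regressions should fit the same model:

    . sysuse auto, clear
    . * i. applied to two categorical variables at once
    . regress price i.(foreign rep78)
    . * c. applied to a group of continuous variables
    . regress price foreign#c.(mpg weight)
    . * the same model with each interaction written out
    . regress price foreign#c.mpg foreign#c.weight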