Options
+---------+
----+ Model 2 +----------------------------------------------------------
components(#) and mineigen(#) specify the maximum number of components
(eigenvectors or factors) to be retained. components() specifies the
number directly, and mineigen() specifies it indirectly, keeping all
components with eigenvalues greater than the indicated value. The
options can be specified individually, together, or not at all.
factors() is a synonym for components().
components(#) sets the maximum number of components (factors) to be
retained. pca and pcamat always display the full set of eigenvalues
but display eigenvectors only for retained components. The default
is the number of variables in varlist; specifying a larger number is
equivalent to specifying the number of variables in varlist.
mineigen(#) sets the minimum value of eigenvalues to be retained.
The default is 1e-5 or the value of tol() if specified.
Specifying components() and mineigen() affects only the number of
components to be displayed and stored in e(); it does not enforce the
assumption that the other eigenvalues are 0. In particular, the
standard errors reported when vce(normal) is specified do not depend
on the number of retained components.
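As a minimal sketch of how the two options interact, the commands below use the auto dataset that also appears in the Examples section. The particular limits, components(3) and mineigen(1), are illustrative choices only; e(f) is the stored number of retained components.

```stata
sysuse auto, clear

* Keep at most 2 components
pca trunk weight length headroom, components(2)
display e(f)

* Keep only components whose eigenvalue exceeds 1
pca trunk weight length headroom, mineigen(1)
display e(f)

* Both limits together: at most 3 components, each with eigenvalue > 1
pca trunk weight length headroom, components(3) mineigen(1)
display e(f)
```

In each case the full set of eigenvalues is still displayed; only the eigenvectors shown and the results stored in e() are restricted.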
correlation and covariance specify that principal components be
calculated for the correlation matrix and covariance matrix,
respectively. The default is correlation. Unlike factor analysis,
PCA is not scale invariant; the eigenvalues and eigenvectors of a
covariance matrix differ from those of the associated correlation
matrix. Usually, a PCA of a covariance matrix is meaningful only if
the variables are expressed in the same units.
For pcamat, do not confuse the type of the matrix to be analyzed with
the type of matname. If matname is a correlation matrix and the
option sds() is not specified, a PCA of the covariance matrix is not
possible because the variances cannot be recovered.
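The distinction between the type of matname and the type of matrix analyzed can be sketched as follows. This example starts from the covariance matrix S of the Examples section and derives the implied correlation matrix and standard deviations; the matrix names R and sd are illustrative, not part of pcamat.

```stata
* Covariance matrix from the Examples section
matrix S = (10.167, 22.690, 2.040 \ ///
            22.690, 56.949, 3.788 \ ///
             2.040,  3.788, 0.688)
matrix rownames S = visual hearing taste
matrix colnames S = visual hearing taste

* Correlation matrix and standard deviations implied by S
matrix R = corr(S)
matrix rownames R = visual hearing taste
matrix colnames R = visual hearing taste
matrix sd = J(1, 3, .)
forvalues i = 1/3 {
    matrix sd[1, `i'] = sqrt(S[`i', `i'])
}
matrix colnames sd = visual hearing taste

* matname is a correlation matrix, but sds() supplies the scale
* needed to analyze the covariance matrix
pcamat R, n(979) sds(sd) covariance
```

Omitting sds() here while keeping the covariance option would leave pcamat without the information it needs, which is the situation the paragraph above warns about.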
vce(none|normal) specifies whether standard errors are to be computed for
the eigenvalues, the eigenvectors, and the (cumulative) percentage of
explained variance (confirmatory PCA). These standard errors are
obtained assuming multivariate normality of the data and are valid
only for a PCA of a covariance matrix. Be cautious if applying these
to correlation matrices.
+-----------+
----+ Reporting +--------------------------------------------------------
level(#) specifies the confidence level, as a percentage, for confidence
intervals. The default is level(95) or as set by set level. level()
is allowed only with vce(normal).
blanks(#) shows blanks for loadings with absolute value smaller than #.
This option is ignored when specified with vce(normal).
novce suppresses the display of standard errors, even though they are
computed, and displays the PCA results in a matrix/table style. You
can specify novce during estimation in combination with vce(normal).
More likely, you will want to use novce during replay.
means displays summary statistics of the variables over the estimation
sample. This option is not available with pcamat.
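The reporting options might be combined as in the following sketch, which reuses the bg2 dataset from the Examples section. The confidence level of 90 and the blanks() threshold of 0.3 are arbitrary illustrative values.

```stata
webuse bg2, clear

* Estimate with standard errors and 90% confidence intervals
pca bg2cost*, vce(normal) level(90)

* Replay the same results in the compact matrix/table style
pca, novce

* Descriptive PCA: blank out small loadings and show summary statistics
pca bg2cost*, blanks(.3) means
```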
+----------+
----+ Advanced +---------------------------------------------------------
tol(#) is an advanced, rarely used option and is available only with
vce(normal). An eigenvalue, ev_i, is classified as being close to
zero if ev_i < tol*max(ev). Two eigenvalues, ev_1 and ev_2, are
"close" if abs(ev_1-ev_2) < tol*max(ev). The default is tol(1e-5).
See option ignore and Remarks below.
ignore is an advanced, rarely used option and is available only with
vce(normal). It continues the computation of standard errors and
tests, even if some eigenvalues are suspiciously close to zero or
suspiciously close to other eigenvalues, violating crucial
assumptions of the asymptotic theory used to estimate standard errors
and tests. See Remarks below.
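The classification rule given for tol() can be illustrated by hand. The sketch below applies the stated inequalities to the eigenvalues of the covariance matrix S from the Examples section, using matrix symeigen; it mimics the documented rule and is not a reproduction of the internal code of pca or pcamat. The names X, ev, tol, and cutoff are illustrative.

```stata
matrix S = (10.167, 22.690, 2.040 \ ///
            22.690, 56.949, 3.788 \ ///
             2.040,  3.788, 0.688)

* Eigenvalues are returned in ev, largest first
matrix symeigen X ev = S

scalar tol = 1e-5
* tol*max(ev)
scalar cutoff = tol*ev[1,1]

* Rule 1: is an eigenvalue close to zero?
forvalues i = 1/3 {
    display "ev_`i' = " ev[1,`i'] "   close to zero: " (ev[1,`i'] < cutoff)
}

* Rule 2: are two adjacent eigenvalues close to each other?
forvalues i = 1/2 {
    local j = `i' + 1
    display "ev_`i' and ev_`j' close: " (abs(ev[1,`i'] - ev[1,`j']) < cutoff)
}
```

A displayed value of 1 marks a case in which vce(normal) would need the ignore option to continue.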
The following option is available with pca and pcamat but is not shown in
the dialog box:
norotated displays the unrotated principal components, even if rotated
components are available. This option may be specified only when
replaying results.
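A short sketch of norotated on replay, assuming a varimax rotation has been applied with rotate after estimation (the rotation method is an arbitrary choice for illustration):

```stata
sysuse auto, clear
pca trunk weight length headroom, components(2)

* Rotate the retained components
rotate, varimax

* Replay: rotated components are available at this point
pca

* Replay the original, unrotated principal components
pca, norotated
```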
Options unique to pcamat
+-------+
----+ Model +------------------------------------------------------------
shape(shape_arg) specifies the shape (storage mode) for the covariance or
correlation matrix matname. The following shapes are supported:
full specifies that the correlation or covariance structure of k
variables is stored as a symmetric k x k matrix. Specifying
shape(full) is optional in this case.
lower specifies that the correlation or covariance structure of k
variables is stored as a vector with k(k+1)/2 elements in rowwise
lower-triangular order:
C(11) C(21) C(22) C(31) C(32) C(33) ... C(k1) C(k2) ... C(kk)
upper specifies that the correlation or covariance structure of k
variables is stored as a vector with k(k+1)/2 elements in rowwise
upper-triangular order:
C(11) C(12) C(13) ... C(1k) C(22) C(23) ... C(2k) ...
C(k-1 k-1) C(k-1 k) C(kk)
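For comparison with the shape(upper) case, the covariance matrix used in the Examples section can also be entered in rowwise lower-triangular order. This is a sketch; the vector name L is illustrative.

```stata
* Order: C(11) C(21) C(22) C(31) C(32) C(33)
matrix L = (10.167, 22.690, 56.949, 2.040, 3.788, 0.688)

* names() is needed because a vector carries no variable names
pcamat L, n(979) shape(lower) comp(2) names(visual hearing taste)
```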
names(namelist) specifies a list of k different names, which are used to
document output and to label estimation results and are used as
variable names by predict. By default, pcamat verifies that the row
and column names of matname and the column or row names of matname2
and matname3 from the sds() and means() options are in agreement.
Using the names() option turns off this check.
forcepsd modifies the matrix matname to be positive semidefinite (psd)
and so to be a proper covariance matrix. If matname is not positive
semidefinite, it will have negative eigenvalues. By setting negative
eigenvalues to 0 and reconstructing, we obtain the least-squares
positive-semidefinite approximation to matname. This approximation
is a singular covariance matrix.
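The adjustment described for forcepsd can be sketched by hand with matrix symeigen. The matrix A below is a made-up example that is symmetric but not positive semidefinite, and n(100) and the names x1 x2 x3 are hypothetical; the manual steps follow the description above and are not the internal code of pcamat.

```stata
* Hypothetical symmetric matrix with one negative eigenvalue
matrix A = (1, .9, -.9 \ .9, 1, .9 \ -.9, .9, 1)

matrix symeigen X ev = A
matrix list ev

* Set negative eigenvalues to 0
forvalues i = 1/3 {
    if ev[1,`i'] < 0 {
        matrix ev[1,`i'] = 0
    }
}

* Reconstruct; the averaging step removes rounding asymmetry
matrix B = X*diag(ev)*X'
matrix B = (B + B')/2
matrix list B

* pcamat performs the adjustment itself when forcepsd is specified
pcamat A, n(100) forcepsd names(x1 x2 x3)
```

Because one eigenvalue has been set to 0, B is singular, as the description of the option states.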
n(#) is required and specifies the number of observations.
sds(matname2) specifies a k x 1 or 1 x k matrix with the standard
deviations of the variables. The row or column names should match
the variable names, unless the names() option is specified. sds()
may be specified only if matname is a correlation matrix.
means(matname3) specifies a k x 1 or 1 x k matrix with the means of the
variables. The row or column names should match the variable names,
unless the names() option is specified. Specify means() if you have
variables in your dataset and want to use predict after pcamat.
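A possible workflow for predict after pcamat is sketched below with the auto dataset. The correlation matrix, means, and standard deviations are computed from the data only to make the example self-contained; the matrix names Cmat, Stats, Means, and Sds are illustrative.

```stata
sysuse auto, clear

* Correlation matrix and number of observations
quietly correlate trunk weight length headroom
matrix Cmat = r(C)
local n = r(N)

* Means (row 1) and standard deviations (row 2), one column per variable
quietly tabstat trunk weight length headroom, statistics(mean sd) save
matrix Stats = r(StatTotal)
matrix Means = Stats[1, 1...]
matrix Sds   = Stats[2, 1...]

* The column names of Means and Sds match the variable names in Cmat
pcamat Cmat, n(`n') sds(Sds) means(Means) components(2)

* Scores for the variables currently in memory
predict pc1 pc2
summarize pc1 pc2
```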
Remarks
Technical note:
pca and pcamat with the vce(normal) option assume that
(A1) the variables are multivariate normally distributed and
(A2) the variance-covariance matrix of the observations has all
distinct and strictly positive eigenvalues.
Under assumptions A1 and A2, the eigenvalues and eigenvectors of the
sample covariance matrix can be seen as maximum likelihood estimates of
the population analogues, and they are asymptotically (multivariate)
normally distributed. Interpret the results with caution because the
asymptotic variances are rather sensitive to violations of assumption A1
(and A2). Wald tests of hypotheses that conflict with assumption A2 (for
example, testing that the first and second eigenvalues are equal) produce
incorrect p-values.
Because the statistical theory for a PCA of a correlation matrix is much
more complicated, pca and pcamat compute standard errors and tests for a
PCA of a correlation matrix as if it were a covariance matrix. This will
usually lead to some underestimation of standard errors, but we believe
that this problem is smaller than the consequences of deviations from
normality.
We suggest that you conduct tests for marginal normality of the variables
(see [R] sktest and [R] swilk), but recall that marginal normality does
not imply multivariate normality.
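The suggested checks might look as follows for the bg2 data used in the Examples section. The mvtest normality command is not mentioned in the text above; it is included here only as one possible joint check.

```stata
webuse bg2, clear

* Marginal (univariate) normality tests
sktest bg2cost*
swilk bg2cost*

* Joint test, because marginal normality does not imply
* multivariate normality
mvtest normality bg2cost*
```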
Examples
Standard PCA for descriptive use
. sysuse auto
. pca trunk weight length headroom
. pca trunk weight length headroom, comp(2) covariance
PCA assuming multivariate normality of the data
. webuse bg2
. pca bg2cost*, vce(normal)
PCA of a covariance or correlation matrix
. matrix S = ( 10.167, 22.690, 2.040 \ ///
               22.690, 56.949, 3.788 \ ///
                2.040,  3.788, 0.688 )
. matrix rownames S = visual hearing taste
. matrix colnames S = visual hearing taste
. pcamat S, n(979) comp(2)
Same as above
. matrix S = ( 10.167, 22.690, 2.040, ///
                       56.949, 3.788, ///
                               0.688 )
. pcamat S, n(979) shape(upper) comp(2) names(visual hearing taste)