How do I extract common factors with the principal component method? - Stata forum

#1 (original poster)

马远航 posted on 2010-12-29 16:25:42

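How do I extract common factors with the principal component method? I am a beginner asking for help, many thanks!
Please explain in detail, ideally with tables to illustrate. Thank you!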


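#2

lifemg posted on 2010-12-29 16:33:21

Which software does the original poster want to use?

#3

江上輕舟 posted on 2010-12-29 16:40:45

The whole procedure is easy to carry out in SPSS.

#4

lifemg posted on 2010-12-29 16:57:51

If you use SAS, this should help you; I also downloaded it from the forum.

#5

lifemg posted on 2010-12-29 17:00:02

The upload failed, so have a look at this URL instead: http://www.pinggu.org/bbs/thread-94932-1-1.html

#6

马远航 posted on 2011-1-2 14:19:21

How would I do it in Stata?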

#7

houquan posted on 2011-1-2 23:14:00

6# 马远航
The pca command in Stata performs principal component analysis (I am using Stata/SE 11).
The help file is copied below.

help pca                                        dialogs:  pca  pcamat
                                                also see:  pca postestimation
-------------------------------------------------------------------------------

Title

[MV] pca -- Principal component analysis

Syntax

Principal component analysis of data

      pca varlist [if] [in] [weight] [, options]

Principal component analysis of a correlation or covariance matrix

      pcamat matname , n(#) [options pcamat_options]

options              description
-------------------------------------------------------------------------
Model 2
  components(#)      retain maximum of # principal components; factors()
                       is a synonym
  mineigen(#)        retain eigenvalues larger than #; default is 1e-5
  correlation        perform PCA of the correlation matrix; the default
  covariance         perform PCA of the covariance matrix
  vce(none)          do not compute VCE of the eigenvalues and vectors;
                       the default
  vce(normal)        compute VCE of the eigenvalues and vectors assuming
                       multivariate normality

Reporting
  level(#)           set confidence level; default is level(95)
  blanks(#)          display loadings as blank when |loadings| < #
  novce              suppress display of SEs even though calculated
# means              display summary statistics of variables

Advanced
  tol(#)             advanced option; see Options for details
  ignore             advanced option; see Options for details

+ norotated          display unrotated results, even if rotated results
                       are available (replay only)
-------------------------------------------------------------------------
# means is not allowed with pcamat.
+ norotated is not available in the dialog box.

pcamat_options       description
-------------------------------------------------------------------------
Model
  shape(full)        matname is a square symmetric matrix; the default
  shape(lower)       matname is a vector with the rowwise lower triangle
                       (with diagonal)
  shape(upper)       matname is a vector with the rowwise upper triangle
                       (with diagonal)
  names(namelist)    variable names; required if matname is triangular
  forcepsd           modifies matname to be positive semidefinite
* n(#)               number of observations
  sds(matname2)      vector with standard deviations of variables
  means(matname3)    vector with means of variables
-------------------------------------------------------------------------
* n() is required for pcamat.

bootstrap, by, jackknife, rolling, statsby, and xi are allowed with pca;
   see prefix.  However, bootstrap and jackknife results should be
   interpreted with caution; identification of the pca parameters involves
   data-dependent restrictions, possibly leading to badly biased and
   overdispersed estimates.
Weights are not allowed with the bootstrap prefix.
aweights are not allowed with the jackknife prefix.
aweights and fweights are allowed with pca; see weight.
See [MV] pca postestimation for features available after estimation.

Menu

pca

      Statistics > Multivariate analysis > Factor and principal component
         analysis > Principal component analysis (PCA)

pcamat

      Statistics > Multivariate analysis > Factor and principal component
         analysis > PCA of a correlation or covariance matrix

Description

Principal component analysis (PCA) is a statistical technique used for
data reduction.  The leading eigenvectors from the eigen decomposition of
the correlation or covariance matrix of the variables describe a series
of uncorrelated linear combinations of the variables that contain most of
the variance.  In addition to data reduction, the eigenvectors from a PCA
are often inspected to learn more about the underlying structure of the
data.

pca and pcamat display the eigenvalues and eigenvectors from the PCA
eigen decomposition.  The eigenvectors are returned in orthonormal form,
i.e., orthogonal (uncorrelated) and normalized (with unit length, L'L =
I).  pcamat provides the correlation or covariance matrix directly.  For
pca the correlation or covariance matrix is computed from the variables
in varlist.

pcamat allows the correlation or covariance matrix to be specified as a k
x k symmetric matrix with row and column names set to the variable names
or as a k(k+1)/2 long row or column vector containing the lower or upper
triangle C along with the names() option providing the variable names.
See the shape() option for details.

The vce(normal) option of pca and pcamat provides standard errors of the
eigenvalues and eigenvectors and aids in interpreting the eigenvectors.
See Remarks for a discussion of the underlying assumptions.

Scores, residuals, rotations, scree plots, score plots, loading plots,
and more are available after pca and pcamat; see [MV] pca postestimation.
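
As a rough illustration of the postestimation tools mentioned in the Description, the sketch below fits a PCA on the auto dataset that ships with Stata and then calls a few of them. The names pc1 and pc2 are made up for this example, and keeping two components is only an assumption for illustration. It is meant to be run from a do-file.

```stata
* Sketch only: fit a PCA, then try some postestimation commands
sysuse auto, clear
pca trunk weight length headroom, components(2)

screeplot                   // plot eigenvalues against component number
loadingplot                 // plot loadings of the first two components
predict pc1 pc2, score      // pc1, pc2: hypothetical names for the component scores
rotate, varimax             // varimax rotation of the two retained components
```

After the rotation, typing pca, norotated should replay the unrotated solution, as described under the norotated option.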

We all love to instruct, though we can teach only what is not worth knowing. -- J. Austen

#8

houquan posted on 2011-1-2 23:14:34

7# houquan
Options

      +---------+
----+ Model 2 +----------------------------------------------------------

components(#) and mineigen(#) specify the maximum number of components
      (eigenvectors or factors) to be retained.  components() specifies the
      number directly, and mineigen() specifies it indirectly, keeping all
      components with eigenvalues greater than the indicated value.  The
      options can be specified individually, together, or not at all.
      factors() is a synonym for components().

      components(#) sets the maximum number of components (factors) to be
      retained.  pca and pcamat always display the full set of eigenvalues
      but display eigenvectors only for retained components.  Specifying a
      number larger than the number of variables in varlist is equivalent
      to specifying the number of variables in varlist, and is the default.

      mineigen(#) sets the minimum value of eigenvalues to be retained.
      The default is 1e-5 or the value of tol() if specified.

      Specifying components() and mineigen() affects only the number of
      components to be displayed and stored in e(); it does not enforce the
      assumption that the other eigenvalues are 0.  In particular, the
      standard errors reported when vce(normal) is specified do not depend
      on the number of retained components.

correlation and covariance specify that principal components be
      calculated for the correlation matrix and covariance matrix,
      respectively.  The default is correlation.  Unlike factor analysis,
      PCA is not scale invariant; the eigenvalues and eigenvectors of a
      covariance matrix differ from those of the associated correlation
      matrix.  Usually, a PCA of a covariance matrix is meaningful only if
      the variables are expressed in the same units.

      For pcamat, do not confuse the type of the matrix to be analyzed with
      the type of matname.  Obviously, if matname is a correlation matrix
      and the option sds() is not specified, it is not possible to perform
      a PCA of the covariance matrix.

vce(none|normal) specifies whether standard errors are to be computed for
      the eigenvalues, the eigenvectors, and the (cumulative) percentage of
      explained variance (confirmatory PCA). These standard errors are
      obtained assuming multivariate normality of the data and are valid
      only for a PCA of a covariance matrix.  Be cautious if applying these
      to correlation matrices.
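
The Model 2 options above can be tried side by side. This is a minimal sketch on the auto dataset, not part of the help file; the covariance call is shown for syntax only, because these four variables are not measured in the same units.

```stata
sysuse auto, clear

* Keep at most two components
pca trunk weight length headroom, components(2)
display e(f)                // number of retained components

* Keep only components whose eigenvalue exceeds 1
pca trunk weight length headroom, mineigen(1)
display e(f)

* PCA of the covariance matrix instead of the default correlation matrix
* (syntax illustration only: the variables are in different units)
pca trunk weight length headroom, covariance
```

In each case the eigenvalue table still lists all four eigenvalues; only the eigenvector table and the contents of e() are limited to the retained components.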

      +-----------+
----+ Reporting +--------------------------------------------------------

level(#) specifies the confidence level, as a percentage, for confidence
      intervals.  The default is level(95) or as set by set level.  level()
      is allowed only with vce(normal).

blanks(#) shows blanks for loadings with absolute value smaller than #.
      This option is ignored when specified with vce(normal).

novce suppresses the display of standard errors, even though they are
      computed, and displays the PCA results in a matrix/table style.  You
      can specify novce during estimation in combination with vce(normal).
      More likely, you will want to use novce during replay.

means displays summary statistics of the variables over the estimation
      sample.  This option is not available with pcamat.
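
A short sketch of the reporting options, again on the auto dataset. The thresholds 0.3 and 90 are arbitrary values chosen for illustration.

```stata
sysuse auto, clear

* Summary statistics plus loadings, blanking loadings below 0.3 in absolute value
pca trunk weight length headroom, components(2) means blanks(0.3)

* Standard errors under multivariate normality, with 90% confidence intervals
pca trunk weight length headroom, vce(normal) level(90)

* Replay the same results as a plain table, without standard errors
pca, novce
```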

      +----------+
----+ Advanced +---------------------------------------------------------

tol(#) is an advanced, rarely used option and is available only with
      vce(normal).  An eigenvalue, ev_i, is classified as being close to
      zero if ev_i < tol * max(ev).  Two eigenvalues, ev_1 and ev_2, are
      "close" if abs(ev_1-ev_2) < tol*max(ev).  The default is tol(1e-5).
      See option ignore and Remarks below.

ignore is an advanced, rarely used option and is available only with
      vce(normal).  It continues the computation of standard errors and
      tests, even if some eigenvalues are suspiciously close to zero or
      suspiciously close to other eigenvalues, violating crucial
      assumptions of the asymptotic theory used to estimate standard errors
      and tests.  See Remarks below.

The following option is available with pca and pcamat but is not shown in
the dialog box:

norotated displays the unrotated principal components, even if rotated
      components are available.  This option may be specified only when
      replaying results.

Options unique to pcamat

      +-------+
----+ Model +------------------------------------------------------------

shape(shape_arg) specifies the shape (storage mode) for the covariance or
      correlation matrix matname.  The following shapes are supported:

      full specifies that the correlation or covariance structure of k
         variables is stored as a symmetric k x k matrix.  Specifying
         shape(full) is optional in this case.

      lower specifies that the correlation or covariance structure of k
         variables is stored as a vector with k(k+1)/2 elements in rowwise
         lower-triangular order:

            C(11) C(21) C(22) C(31) C(32) C(33) ... C(k1) C(k2) ... C(kk)

      upper specifies that the correlation or covariance structure of k
         variables is stored as a vector with k(k+1)/2 elements in rowwise
         upper-triangular order:

            C(11) C(12) C(13) ... C(1k) C(22) C(23) ... C(2k) ...  C(k-1
                  k-1) C(k-1 k) C(kk)

names(namelist) specifies a list of k different names, which are used to
      document output and to label estimation results and are used as
      variable names by predict.  By default, pcamat verifies that the row
      and column names of matname and the column or row names of matname2
      and matname3 from the sds() and means() options are in agreement.
      Using the names() option turns off this check.

forcepsd modifies the matrix matname to be positive semidefinite (psd)
      and so to be a proper covariance matrix.  If matname is not positive
      semidefinite, it will have negative eigenvalues.  By setting negative
      eigenvalues to 0 and reconstructing, we obtain the least-squares
      positive-semidefinite approximation to matname.  This approximation
      is a singular covariance matrix.

n(#) is required and specifies the number of observations.

sds(matname2) specifies a k x 1 or 1 x k matrix with the standard
      deviations of the variables.  The row or column names should match
      the variable names, unless the names() option is specified.  sds()
      may be specified only if matname is a correlation matrix.

means(matname3) specifies a k x 1 or 1 x k matrix with the means of the
      variables.  The row or column names should match the variable names,
      unless the names() option is specified.  Specify means() if you have
      variables in your dataset and want to use predict after pcamat.
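
The sds() option can be sketched as follows: starting from a correlation matrix and a vector of standard deviations, pcamat is asked for a PCA of the implied covariance matrix. The numbers are taken from the covariance matrix S used in the Examples section of the help file; building R with corr() and typing the standard deviations as square roots of the variances are assumptions of this sketch, not something the help file shows.

```stata
* Covariance matrix from the help file's own example
matrix S = ( 10.167, 22.690,  2.040  \ ///
             22.690, 56.949,  3.788  \ ///
              2.040,  3.788,  0.688  )

* Correlation matrix implied by S, with explicit names
matrix R = corr(S)
matrix rownames R = visual hearing taste
matrix colnames R = visual hearing taste

* Standard deviations = square roots of the diagonal of S
matrix sd = (sqrt(10.167), sqrt(56.949), sqrt(0.688))
matrix colnames sd = visual hearing taste

* PCA of the covariance matrix, supplied as correlations plus sds()
pcamat R, n(979) sds(sd) covariance components(2)
```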

Remarks

Technical note:

pca and pcamat with the vce(normal) option assume that

      (A1) the variables are multivariate normal distributed and

      (A2) the variance-covariance matrix of the observations has all
         distinct and strictly positive eigenvalues.

Under assumptions A1 and A2, the eigenvalues and eigenvectors of the
sample covariance matrix can be seen as maximum likelihood estimates for
the population analogues and they are asymptotically (multivariate)
normal distributed.  Be cautious in interpreting because the asymptotic
variances are rather sensitive to violations of assumption A1 (and A2).
Wald tests of hypotheses that are in conflict with assumption A2 (e.g.,
testing that the first and second eigenvalue are the same) produce
incorrect p-values.

Because the statistical theory for a PCA of a correlation matrix is much
more complicated, pca and pcamat compute standard errors and tests of a
correlation matrix as if it were a covariance matrix.  This will usually
lead to some underestimation of standard errors, but we believe that this
problem is smaller than the consequences of deviations from normality.

We suggest that you conduct tests for marginal normality of the variables
(see [R] sktest and [R] swilk), but recall that marginal normality does
not imply multivariate normality.
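
The checks suggested in the Remarks can be run before relying on vce(normal). This is a sketch using the bg2 dataset from the help file's own example (webuse needs an internet connection). The mvtest normality line is an addition not mentioned in the help text above and assumes a Stata version that includes mvtest.

```stata
webuse bg2, clear

* Marginal normality checks suggested in the Remarks
sktest bg2cost*
swilk bg2cost*

* A joint check, since marginal normality does not imply multivariate normality
mvtest normality bg2cost*

* PCA with standard errors that assume multivariate normality
pca bg2cost*, vce(normal)
```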

Examples

Standard PCA for descriptive use
      . sysuse auto
      . pca trunk weight length headroom
      . pca trunk weight length headroom, comp(2) covariance

PCA assuming multivariate normality of the data
      . webuse bg2
      . pca bg2cost*, vce(normal)

PCA of a covariance or correlation matrix
      . matrix S = ( 10.167, 22.690,  2.040  \ ///
                     22.690, 56.949,  3.788  \ ///
                     2.040,  3.788,  0.688  )
      . matrix rownames S = visual hearing taste
      . matrix colnames S = visual hearing taste
      . pcamat S, n(979) comp(2)

Same as above
      . matrix S = ( 10.167, 22.690, 2.040, ///
                           56.949, 3.788, ///
                                    0.688 )
      . pcamat S, n(979) shape(upper) comp(2) names(visual hearing taste)
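
The help file shows the full and upper-triangle forms; for completeness, here is a sketch of the same matrix entered as a rowwise lower triangle, following the ordering given under shape(lower). It should give the same results as the two examples above.

```stata
* Rowwise lower triangle: C(11) C(21) C(22) C(31) C(32) C(33)
matrix S = (10.167, 22.690, 56.949, 2.040, 3.788, 0.688)
pcamat S, n(979) shape(lower) comp(2) names(visual hearing taste)
```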


#9

houquan posted on 2011-1-2 23:14:52

8# houquan
Saved results

pca and pcamat without the vce(normal) option save the following in e():

Scalars
   e(N)               number of observations
   e(f)               number of retained components
   e(rho)             fraction of explained variance
   e(trace)           trace of e(C)
   e(lndet)           ln of the determinant of e(C)
   e(cond)            condition number of e(C)

Macros
   e(cmd)             pca (even in the case of pcamat)
   e(cmdline)         command as typed
   e(Ctype)           correlation or covariance
   e(wtype)           weight type
   e(wexp)            weight expression
   e(title)           title in output
   e(properties)      nov noV eigen
   e(rotate_cmd)      program used to implement rotate
   e(estat_cmd)       program used to implement estat
   e(predict)         program used to implement predict
   e(marginsnotok)    predictions disallowed by margins

Matrices
   e(C)               p x p correlation or covariance matrix
   e(means)           1 x p matrix of means
   e(sds)             1 x p matrix of standard deviations
   e(Ev)              1 x p matrix of eigenvalues (sorted)
   e(L)               p x f matrix of eigenvectors = components
   e(Psi)             1 x p matrix of unexplained variance

Functions
   e(sample)          marks estimation sample

pca and pcamat with the vce(normal) option save the above, as well as the
following:

Scalars
   e(v_rho)           variance of e(rho)
   e(chi2_i)          chi-squared statistic for test of independence
   e(df_i)            degrees of freedom for test of independence
   e(p_i)             significance of test of independence
   e(chi2_s)          chi-squared statistic for test of sphericity
   e(df_s)            degrees of freedom for test of sphericity
   e(p_s)             significance of test of sphericity
   e(rank)            rank of e(V)

Macros
   e(vce)             multivariate normality
   e(properties)      b V

Matrices
   e(b)               1 x p+fp coefficient vector (all eigenvalues and
                        retained eigenvectors)
   e(Ev_bias)         1 x p matrix: bias of eigenvalues
   e(Ev_stats)        p x 5 matrix with statistics on explained variance
   e(V)               variance-covariance matrix of the estimates e(b)

Also see

Manual:  [MV] pca

   Help:  [MV] pca postestimation;
         [R] tetrachoric, [MV] biplot, [MV] canon, [MV] factor, [D]
         corr2data, [R] alpha
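
The saved results listed above can be inspected right after estimation. A minimal sketch on the auto dataset follows; the matrix names L and LtL are made up for this example. The last step checks the orthonormality property L'L = I stated in the Description.

```stata
sysuse auto, clear
pca trunk weight length headroom, components(2)

ereturn list                // everything stored in e()
display e(rho)              // fraction of variance explained by retained components
matrix list e(Ev)           // all eigenvalues, sorted

matrix L = e(L)             // p x f matrix of retained eigenvectors
matrix LtL = L' * L         // should be close to the 2 x 2 identity matrix
matrix list LtL
```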

I tried it:
      . sysuse auto
      . pca trunk weight length headroom

The output is:
Principal components/correlation                 Number of obs    =        74
                                                 Number of comp.  =         4
                                                 Trace            =         4
    Rotation: (unrotated = principal)            Rho              =    1.0000

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |      3.02027      2.36822             0.7551       0.7551
           Comp2 |      .652053       .37494             0.1630       0.9181
           Comp3 |      .277113      .226551             0.0693       0.9874
           Comp4 |     .0505616            .             0.0126       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors)

    --------------------------------------------------------------------
        Variable |    Comp1     Comp2     Comp3     Comp4 | Unexplained
    -------------+----------------------------------------+-------------
           trunk |   0.5068    0.2327   -0.8249    0.0921 |           0
          weight |   0.5221   -0.4536    0.2677    0.6708 |           0
          length |   0.5361   -0.3903    0.1370   -0.7358 |           0
        headroom |   0.4280    0.7667    0.4786   -0.0057 |           0
    --------------------------------------------------------------------

Comp1, Comp2, ... are the principal components.
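
Two follow-ups may help in reading this output. First, with a correlation matrix each Proportion entry is the eigenvalue divided by the trace (4 here), which the display lines below reproduce as roughly 0.7551 and 0.9181. Second, the thread title asks about extracting common factors by the principal-component method; in Stata that is usually done with factor and its pcf option rather than with pca. This is only a sketch, and f1 is a made-up variable name.

```stata
sysuse auto, clear

* Share of variance explained = eigenvalue / trace (trace is 4 here)
display 3.02027/4                  // first component
display (3.02027 + .652053)/4      // first two components together

* Principal-component factor method on the same variables
factor trunk weight length headroom, pcf
predict f1                         // f1: hypothetical name for the first factor score
```

The loadings reported by factor with pcf are the eigenvectors scaled by the square roots of their eigenvalues, so they will not match the pca eigenvector table number for number.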


#10

马远航 posted on 2011-1-4 23:05:55

Thank you very much, and best wishes for the new year!
