
[Study materials] Urgent: factor analysis reports that the matrix is not positive definite

Original poster
rosson123 posted on 2010-5-11 20:41:03

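When running a factor analysis, are the results still valid if the matrix is not positive definite? Is there any way to fix it? Any guidance from the experts would be appreciated!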

2
rosson123 posted on 2010-5-11 20:45:04
Could someone please advise? I would be very grateful!

3
人地天道 posted on 2010-5-14 03:43:05
This is a Heywood case.

4
rosson123 posted on 2010-5-14 23:27:33
Could you explain in more detail?

5
lixin-lucky posted on 2011-8-31 17:04:47
Could you explain in more detail?

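6
llllmnmn posted on 2011-9-6 10:33:24
I ran into the same problem when writing my thesis in 2007. Check whether any of the factors are highly correlated with each other; if they are, SPSS cannot run the factor analysis and reports that the matrix is not positive definite. It has nothing to do with the number of variables.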

7
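深流 posted on 2014-3-19 20:09:31
llllmnmn posted on 2011-9-6 10:33
I ran into the same problem when writing my thesis in 2007. Check whether any of the factors are highly correlated with each other; if they are, SPSS cannot run the factor analysis ...
Is it the factors that are highly correlated, or the items (variables)?
Also, how high does the correlation have to be? Does 0.5 count?
My thesis deadline is pressing hard, so please advise.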

8
ReneeBK posted on 2014-3-20 05:17:13

Not Positive Definite Matrices--Causes and Cures


Ed Rigdon



The seminal work on dealing with not positive definite matrices is Wothke (1993). The chapter is both readable and comprehensive. This page uses ideas from Wothke, from SEMNET messages, and from my own experience.

The Problem

There are four situations in which a researcher may get a message about a matrix being "not positive definite." The four situations can be very different in terms of their causes and cures.
First, the researcher may get a message saying that the input covariance or correlation matrix being analyzed is "not positive definite." Generalized least squares (GLS) estimation requires that the covariance or correlation matrix analyzed must be positive definite, and maximum likelihood (ML) estimation will also perform poorly in such situations. If the matrix to be analyzed is found to be not positive definite, many programs will simply issue an error message and quit.

Second, the message may refer to the asymptotic covariance matrix. This is not the covariance matrix being analyzed, but rather a weight matrix to be used with asymptotically distribution-free / weighted least squares (ADF/WLS) estimation.

Third, the researcher may get a message saying that the program's estimate of Sigma (Σ), the model-implied covariance matrix, is not positive definite. LISREL, for example, will simply quit if it issues this message.

Fourth, the program may indicate that some parameter matrix within the model is not positive definite. This attribute is only relevant to parameter matrices that are variance/covariance matrices. In the language of the LISREL program, these include the matrices Theta-delta, Theta-epsilon, Phi (Φ) and Psi. Here, however, this "error message" can result from correct specification of the model, so the only problem is convincing the program to stop worrying about it.

"Not Positive Definite"--What Does It Mean?

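Strictly speaking, a symmetric matrix is "positive definite" if all of its eigenvalues are positive. Eigenvalues are the elements of a vector, e, which results from the decomposition of a square symmetric matrix S as:
S = M diag(e) M'
where the columns of M are the eigenvectors of S and diag(e) is the diagonal matrix that holds the eigenvalues.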

To an extent, however, we can discuss positive definiteness in terms of the sign of the "determinant" of the matrix. The determinant is a scalar function of the matrix. In the case of symmetric matrices, such as covariance or correlation matrices, positive definiteness will only hold if the matrix and every "principal submatrix" has a positive determinant. ("Principal submatrices" are formed by removing row-column pairs from the original symmetric matrix.) A matrix which fails this test is "not positive definite." If the determinant of the matrix is exactly zero, then the matrix is "singular." (Thanks to Mike Neale, Werner Wothke and Mike Miller for refining the details here.)
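As a rough illustration of the two criteria just described, here is a minimal Python sketch. NumPy is an assumption (the original page names no software for this step), and the function names and the two example matrices are hypothetical. It checks a symmetric matrix once through its eigenvalues and once through the determinants of all principal submatrices.

```python
from itertools import combinations

import numpy as np


def is_pd_by_eigenvalues(S, tol=1e-10):
    """True if every eigenvalue of the symmetric matrix S exceeds tol."""
    return bool(np.all(np.linalg.eigvalsh(S) > tol))


def is_pd_by_principal_minors(S, tol=1e-10):
    """True if S and every principal submatrix have a determinant above tol."""
    p = S.shape[0]
    for size in range(1, p + 1):
        # Keep the same rows and columns, i.e. drop row-column pairs.
        for keep in combinations(range(p), size):
            sub = S[np.ix_(keep, keep)]
            if np.linalg.det(sub) <= tol:
                return False
    return True


# Hypothetical correlation matrices, for illustration only.
good = np.array([[1.0, 0.5, 0.3],
                 [0.5, 1.0, 0.4],
                 [0.3, 0.4, 1.0]])
bad = np.array([[1.0, 0.9, 0.9],
                [0.9, 1.0, -0.9],
                [0.9, -0.9, 1.0]])

for name, S in [("good", good), ("bad", bad)]:
    print(name, is_pd_by_eigenvalues(S), is_pd_by_principal_minors(S))
```

The principal-minor check grows combinatorially with the number of variables, so for anything beyond a small matrix the eigenvalue check is the more practical of the two.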

Why does this matter? Well, for one thing, using GLS estimation methods involves inverting the input matrix. Any text on matrix algebra will show that inverting a matrix involves dividing by the matrix determinant. So if the matrix is singular, then inverting the matrix involves dividing by zero, which is undefined. Using ML estimation involves inverting Sigma, but since the aim is to maximize the similarity between the input matrix and Sigma, the prognosis is not good if the input matrix is not positive definite. Now, some programs include the option of proceeding with analysis even if the input matrix is not positive definite--with Amos, for example, this is done by invoking the $nonpositive command--but it is unwise to proceed without an understanding of the reason why the matrix is not positive definite. If the problem relates to the asymptotic weight matrix, the researcher may not be able to proceed with ADF/WLS estimation, unless the problem can be resolved.

In addition, one interpretation of the determinant of a covariance or correlation matrix is as a measure of "generalized variance." Since negative variances are undefined, and since zero variances apply only to constants, it is troubling when a covariance or correlation matrix fails to have a positive determinant.

Another reason to care comes from mathematical statistics. Sample covariance matrices are supposed to be positive definite. For that matter, so should Pearson and polychoric correlation matrices. That is because the population matrices they are supposedly approximating *are* positive definite, except under certain conditions. So the failure of a matrix to be positive definite may indicate a problem with the input matrix.

Why is My Matrix Not Positive Definite, and What Can I Do About It?

Properly, the question is why the matrix contains zero or negative eigenvalues. However, it may be easier for many researchers to think about why the determinant is zero or negative. Either way, there are many possibilities, and there are different possible solutions that go with each possible cause.
Further, there are other solutions which sidestep the problem without really addressing its cause. These options carry a potentially steep cost. They are discussed separately, below.

Linear Dependency

A not positive definite input covariance matrix may signal a perfect linear dependency of one variable on another. For example, if a plant researcher had data on corn (maize) stalks, and two of the variables in the covariance matrix were "plant height" and "plant weight," the linear correlation between the two would be nearly perfect, and the covariance matrix would be not positive definite within sampling error. It may be easier to detect such relationships by sight in a correlation matrix rather than a covariance matrix, but often these relationships are logically obvious. Multivariate dependencies, where several variables together perfectly predict another variable, may not be visually obvious. In those cases, sequential analysis of the covariance matrix, adding one variable at a time and computing the determinant, should help to isolate the problem. (I would use a spreadsheet program for this, like Microsoft (TM) Excel (TM), for convenience.)
Dealing with this kind of problem involves changing the set of variables included in the covariance matrix. If two variables are perfectly correlated with each other, then one may be deleted. Alternatively, principal components may be used to replace a set of collinear variables with one or more orthogonal components.
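The sequential check described above (add one variable at a time and recompute the determinant) can also be sketched in Python instead of a spreadsheet. This is only an illustration: NumPy, the function name, and the simulated data, in which x3 is built as an exact combination of x1 and x2, are all assumptions.

```python
import numpy as np


def sequential_determinants(S, names):
    """Determinant of the leading block of S as variables are added one by one."""
    results = []
    for k in range(1, len(names) + 1):
        results.append((names[k - 1], float(np.linalg.det(S[:k, :k]))))
    return results


# Hypothetical data: x3 is an exact linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = 2.0 * x1 - x2
x4 = rng.normal(size=200)
data = np.column_stack([x1, x2, x3, x4])
S = np.cov(data, rowvar=False)

names = ["x1", "x2", "x3", "x4"]
for name, det in sequential_determinants(S, names):
    flag = "  <-- determinant is numerically zero" if abs(det) < 1e-8 else ""
    print(f"{name}: {det: .6g}{flag}")
```

The first variable at which the determinant drops to (numerical) zero is the one that completes a dependency. Every later determinant stays at zero, so the order in which variables are added matters when reading the output.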

In regard to the asymptotic weight matrix, the linear dependency exists not between variables, but between elements of the moments (the means and variances and covariances or the correlations) which are being analyzed. This can occur in connection with modeling multiplicative interaction relationships between latent variables. Jöreskog and Yang (1996) show how moments of the interaction construct are linear functions of moments of the "main effect" constructs. Their article explores alternative approaches for estimating these models.

Error Reading the Data

If the problem is with your input matrix in particular, first make sure that the program has read your data correctly. Remember, an empty covariance matrix (with no variables in it) is always not positive definite. Try reading the data using another program, which will allow you to validate the covariance matrix estimated by the SEM program. If you generated the covariance matrix with one program, and are analyzing it with another, make sure that the covariance matrix was read correctly. This can be particularly problematic when the asymptotic weight matrix is the focus of the problem.

Typographical Error

Whenever a covariance matrix is transcribed, there is a chance of error. So if you just have the matrix (say, from a published article) but not the data itself, double-check for transcription errors. Also remember that journals are not perfect, so a covariance matrix in an article may also contain an error. In a recent case, for example, it appeared that the sign of a single (relatively large) coefficient was reversed at some point, and this reversal made the matrix not positive definite. In that case, changing the sign of that one coefficient eliminated the problem.
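To see how a single reversed sign can have this effect, consider the following small Python sketch. The matrix is hypothetical (it is not the one from the case mentioned above), and NumPy is an assumption.

```python
import numpy as np

# Hypothetical 3 x 3 correlation matrix as it might appear in print.
published = np.array([[1.0, 0.8, 0.7],
                      [0.8, 1.0, 0.7],
                      [0.7, 0.7, 1.0]])

# Simulate a transcription error: the sign of one coefficient is reversed.
mistyped = published.copy()
mistyped[1, 2] = mistyped[2, 1] = -0.7

for label, R in [("published", published), ("mistyped", mistyped)]:
    smallest = np.linalg.eigvalsh(R).min()
    print(f"{label}: smallest eigenvalue = {smallest:.3f}")
```

The reversed sign makes the three correlations mutually inconsistent: two variables that both correlate strongly and positively with a third cannot also correlate strongly and negatively with each other.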

Starting Values

The model-implied matrix Sigma is computed from the model's parameter estimates. Especially before iterations begin, those estimates may be such that Sigma is not positive definite. So if the problem relates to Sigma, first make sure that the model has been specified correctly, with no syntax errors. If the proposed model is "unusual," then the starting value routines that are incorporated into most SEM programs may fail. Then it is up to the researcher to supply likely starting values.

Sampling Variation

When sample size is small, a sample covariance or correlation matrix may be not positive definite due to mere sampling fluctuation. As most matrices rapidly converge on the population matrix, however, this in itself is unlikely to be a problem. Anderson and Gerbing (1984) documented how parameter matrices (Theta-Delta, Theta-Epsilon, Psi and possibly Phi) may be not positive definite through mere sampling fluctuation. Most often, such cases involve "improper solutions," where some variance parameters are estimated as negative. In such cases, Gerbing and Anderson (1987) suggested that the offending estimates could be fixed to zero with minimal harm to the program.
Estimators of the asymptotic weight matrix converge much more slowly, so problems due to sampling variation can occur at much larger sample sizes (Muthén & Kaplan, 1985, 1992). Using an asymptotic weight matrix with polychoric correlations appears to compound the problem. Where sampling variation is the issue, Yung and Bentler (1994) have proposed a bootstrapping approach to estimating the asymptotic weight matrix, which may avoid the problem.

Missing Data

Large amounts of missing data can lead to a covariance or correlation matrix that is not positive definite. With simple replacement schemes, the replacement value may be at fault. With pairwise deletion, the problem may arise precisely because each element of the covariance matrix is computed from a different subset of the cases (Arbuckle, 1996). To check whether this is the cause, use a different missing data technique, such as a different replacement value, listwise deletion or (perhaps ideally) a maximum likelihood/EMCOV simultaneous estimation method.
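The pairwise-deletion mechanism can be shown with a deliberately extreme toy data set. The sketch below is illustrative only: the data, the helper `pairwise_corr`, and the use of NumPy are assumptions. Each pair of variables is observed on a different block of cases, so each correlation is valid on its own but the three together are impossible.

```python
import numpy as np


def pairwise_corr(data):
    """Correlation matrix where each entry uses only cases observed on both variables."""
    p = data.shape[1]
    R = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            both = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
            r = np.corrcoef(data[both, i], data[both, j])[0, 1]
            R[i, j] = R[j, i] = r
    return R


nan = np.nan
# Hypothetical data: columns are x, y, z; nan marks a missing value.
data = np.array([
    [1.0, 1.0, nan],
    [2.0, 2.0, nan],
    [3.0, 3.0, nan],   # x and y move together
    [1.0, nan, 1.0],
    [2.0, nan, 2.0],
    [3.0, nan, 3.0],   # x and z move together
    [nan, 1.0, 3.0],
    [nan, 2.0, 2.0],
    [nan, 3.0, 1.0],   # y and z move in opposite directions
])

R = pairwise_corr(data)
print(R)
print("eigenvalues:", np.linalg.eigvalsh(R))
```

Here the smallest eigenvalue is negative, so the matrix is not positive definite. Listwise deletion would not produce such a matrix, although in this toy example it would leave no complete cases at all.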

My Variable is a Constant!

Sometimes, either through an error reading data or through the process of deleting cases that include missing data, it happens that some variable in a data set takes on only a single value. In other words, one of the variables is actually a constant. This variable will then have zero variance, and the covariance matrix will be not positive definite. Simple tabulation of the data will provide a forewarning of this. If this is the problem, either the researcher must choose a different missing-data strategy, or else the variable must be deleted.

Polychoric Correlations

Programs that estimate polychoric correlations on a pairwise basis--one correlation at a time--may yield input correlation matrices that are not positive definite. Here the problem occurs because the whole correlation matrix is not estimated simultaneously. It appears that this is most likely to be a problem when the correlation matrix contains large numbers of variables. Try computing a matrix of Pearson correlations and see whether the problem persists.
If the problem lies with the polychoric correlations, there may not be a good solution. One approach is to use a program, like EQS, that includes the option of deriving all polychoric correlations simultaneously, rather than one at a time (cf., Lee, Poon & Bentler, 1992). But be warned--Joop Hox reports that the computational burden is enormous, and it increases exponentially with the number of variables.

Ed Cook has experimented with an eigenvalue/eigenvector decomposition approach. If a covariance or correlation matrix is not positive definite, then one or more of its eigenvalues will be zero or negative. After decomposing the correlation matrix into eigenvalues and eigenvectors, Ed Cook replaced the negative eigenvalues with small (.05) positive values, used the new values to compute a covariance matrix, then standardized the resulting matrix (dividing each element by the square roots of the corresponding diagonal values) so that the result was again a correlation matrix. Ed reported that the bias resulting from this process appeared to be small.
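A minimal sketch of this kind of eigenvalue repair follows. It is a reading of the description above rather than Ed Cook's actual code: NumPy, the function name `repair_correlation`, the example matrix, and the choice to replace zero as well as negative eigenvalues are all assumptions.

```python
import numpy as np


def repair_correlation(R, floor=0.05):
    """Replace non-positive eigenvalues with `floor`, rebuild, rescale to unit diagonal."""
    values, vectors = np.linalg.eigh(R)
    # Assumption: eigenvalues that are exactly zero are replaced too.
    values = np.where(values <= 0, floor, values)
    rebuilt = vectors @ np.diag(values) @ vectors.T   # a covariance-like matrix
    scale = np.sqrt(np.diag(rebuilt))
    fixed = rebuilt / np.outer(scale, scale)          # standardize to a correlation matrix
    return (fixed + fixed.T) / 2                      # remove tiny rounding asymmetries


# Hypothetical not positive definite correlation matrix.
R = np.array([[1.0, 0.9, 0.9],
              [0.9, 1.0, -0.9],
              [0.9, -0.9, 1.0]])

fixed = repair_correlation(R)
print(np.round(fixed, 4))
print("eigenvalues after repair:", np.round(np.linalg.eigvalsh(fixed), 4))
```

Comparing `fixed` with `R` element by element gives a direct view of how much the correlations were altered, which is the bias the text refers to. In this extreme example the change is large; whether it is small for a given real matrix has to be checked case by case.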

No Error Variance

Sometimes researchers specify zero elements on the diagonals of Theta-delta or Theta-epsilon. A zero here implies no measurement error. While it may seem unlikely, on reflection, that any latent variable could be measured without error, nevertheless the practice is common, when a construct has only a single measure. Single measures often lead to identification problems, and analysts may leave the parameter fixed at zero by default. If a diagonal element is fixed to zero, then the matrix will be not positive definite. However, since this is precisely what the researcher intended to do, there is no cause for alarm. The only problem is that these values may cause the solution to fail an "admissibility check," which may lead to premature termination of the iterative estimation process. In such cases, it is merely a matter of disabling the admissibility check. In LISREL, for example, this is done by adding AD=OFF to the OUtput line.

Negative Error Variance

Negative values on the diagonal are another matter. Since the diagonal elements of these matrices are variance terms, negative values are unacceptable. Further, since these error variances represent the "left-over" part of some variable, a negative error variance suggests that the regression has somehow explained more than 100 percent of the variance. In my own experience, these values are symptoms of a serious fit problem. Comprehensive fit assessment will help the researcher to isolate the specific problem.

Sidestepping the Problem

As with many problems, there are ways to sidestep this problem without actually trying to discern its cause. Besides simply compelling the program to proceed with its analysis, researchers can make a ridge adjustment to the covariance or correlation matrix. This involves adding some quantity to the diagonal elements of the matrix. This addition has the effect of attenuating the estimated relations between variables. A large enough addition is sure to result in a positive definite matrix. The price of this adjustment, however, is bias in the parameter estimates, standard errors, and fit indices. Partial least squares methods may also proceed with no regard for the determinant of the matrix, but this involves an entirely different methodology.
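The ridge adjustment and its attenuating effect can be illustrated with a short Python sketch. NumPy, the function name `ridge_adjust`, the example matrix, and the ridge values are assumptions chosen for illustration; SEM programs may implement the adjustment differently.

```python
import numpy as np


def ridge_adjust(S, ridge):
    """Add the constant `ridge` to every diagonal element of S."""
    return S + ridge * np.eye(S.shape[0])


# Hypothetical singular covariance matrix: variable 3 is the sum of variables 1 and 2.
S = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

for ridge in (0.0, 0.1, 0.5):
    adjusted = ridge_adjust(S, ridge)
    d = np.sqrt(np.diag(adjusted))
    implied_r13 = adjusted[0, 2] / (d[0] * d[2])   # correlation implied for variables 1 and 3
    smallest = np.linalg.eigvalsh(adjusted).min()
    print(f"ridge={ridge}: smallest eigenvalue={smallest:.4f}, r13={implied_r13:.4f}")
```

Adding a constant to the diagonal shifts every eigenvalue up by that constant, which is why a large enough value always works. The implied correlation shrinks as the ridge grows, which is the attenuation, and hence the bias, described above.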

References

Anderson, J. C., & Gerbing, D. W. (1984). The effect of sampling error on convergence, improper solutions, and goodness-of-fit indices for maximum likelihood confirmatory factor analysis. Psychometrika, 49(2--June), 155-73.
Arbuckle, J. L. (1996). Full information estimation in the presence of incomplete data. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 243-78). Mahwah, NJ: Lawrence Erlbaum.

Gerbing, D. W., & Anderson, J. C. (1987). Improper solutions in the analysis of covariance structures: Their interpretability and a comparison of alternate respecifications. Psychometrika, 52(1--March), 99-111.

Jöreskog, K. G., & Yang, F. [now Fan Yang Jonsson] (1996). Nonlinear structural equation models: The Kenny-Judd model with interaction effects. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 57-88). Mahwah, NJ: Lawrence Erlbaum.

Lee, S.-Y., Poon, W.-Y., & Bentler, P. M. (1992). Structural equation models with continuous and polytomous variables. Psychometrika, 57(1--March), 89-105.

Muthén, B. & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical and Statistical Psychology, 38, 171-89.

Muthén, B. & Kaplan, D. (1992). A comparison of some methodologies for the factor analysis of non-normal Likert variables: A note on the size of the model. British Journal of Mathematical and Statistical Psychology, 45, 19-30.

Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 256-93). Newbury Park, CA: Sage.

Yung, Y.-F., & Bentler, P. M. (1994). Bootstrap-corrected ADF test statistics in covariance structure analysis. British Journal of Mathematical and Statistical Psychology, 47, 63-84.

9
ReneeBK posted on 2014-3-20 05:22:44
I am running a factor analysis in SPSS and get a "matrix is not positive definite" error from my correlation matrix. I've tried removing correlated variables, but I have to remove all variables down to those with correlations of r = 0.8 before the "positive definite" issue is resolved. This seems weird, as I have read that this problem normally arises where two or more variables are nearly perfectly collinear (to me, r = 0.8 is a high correlation, but not near-perfect). Can anyone suggest a way of dealing with this issue that doesn't just involve arbitrary removal of variables until the problem goes away? Or perhaps explain why I am getting the issue even when I seem to be removing all the really highly correlated variables?

http://stats.stackexchange.com/q ... ed-variables-are-re

10
ReneeBK posted on 2014-3-20 05:24:50

IBM SPSS Technote (troubleshooting)



Problem(Abstract)
I want to run a factor analysis in SPSS for Windows. I select the variables and the model that I wish to run, but when I run the procedure, I get a message saying:

"This matrix is not positive definite."

I do not get any meaningful output either, just this message and another message saying:

"Extraction could not be done. The extraction is skipped."

Why is this happening?

Resolving the problem
The error indicates that your correlation matrix is nonpositive definite (NPD), i.e., that some of the eigenvalues of your correlation matrix are not positive numbers. If you request a factor extraction method other than principal components (PC) or unweighted least squares (ULS), an NPD matrix will cause the procedure to stop without extracting factors. If one or more of the eigenvalues are negative, then PC and ULS extraction will also terminate.

Matrices can be NPD as a result of various other properties. A correlation matrix will be NPD if there are linear dependencies among the variables, as reflected by one or more eigenvalues of 0. For example, if variable X12 can be reproduced by a weighted sum of variables X5, X7, and X10, then there is a linear dependency among those variables and the correlation matrix that includes them will be NPD. If there are more variables in the analysis than there are cases, then the correlation matrix will have linear dependencies and be NPD. Remember that FACTOR uses listwise deletion of cases with missing data by default. If you had more cases in the file than variables in the analysis, listwise deletion could leave you with more variables than retained cases. Pairwise deletion of missing data can also lead to NPD matrices. Negative eigenvalues may be present in these situations. See the following chapter for a helpful discussion and illustration of how this can happen.

  • Wothke, W. (1993). Nonpositive definite matrices in structural modeling. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models. Newbury Park, CA: Sage. (Chap. 11, pp. 256-293).
  • Wothke's chapter also provides some suggestions for diagnosing NPD matrices, including the use of principal components analysis to detect linear dependencies.
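The "more variables than cases" situation mentioned in the technote is easy to reproduce. The following Python sketch uses simulated data; NumPy, the sample sizes, and the random seed are assumptions, and SPSS itself is not involved. It shows that the correlation matrix of 8 variables measured on only 5 cases cannot have full rank.

```python
import numpy as np

# Hypothetical data: 5 cases measured on 8 variables (more variables than cases).
rng = np.random.default_rng(1)
n_cases, n_vars = 5, 8
data = rng.normal(size=(n_cases, n_vars))

R = np.corrcoef(data, rowvar=False)
eigenvalues = np.linalg.eigvalsh(R)

# After centering, the data span at most n_cases - 1 independent directions,
# so at least n_vars - (n_cases - 1) eigenvalues are zero up to rounding error.
print("rank of R:", np.linalg.matrix_rank(R))
print("eigenvalues:", np.round(eigenvalues, 6))
```

The same thing can happen without the researcher noticing when listwise deletion removes most of the cases, so it is worth checking the number of cases actually retained in the analysis.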
