Original poster: SPSSCHEN

[Academic Frontiers] [Discussion] Cluster Analysis

Hi everyone,

I am looking for different ways to test the stability of clusters after they have been generated (e.g., using holdout cases or Monte Carlo simulations). I appreciate that the type of test will vary across clustering algorithms, but I would welcome some generic reference material (preferably accessible on the internet) on different ways to assess cluster stability, along with a list of criteria that could be applied in the assessment.

Regards, Paul
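One generic approach to the holdout idea is to re-cluster random subsamples and measure how often pairs of cases keep the same co-membership. The Python sketch below is only an illustration under assumptions that are not from the post: it uses a plain k-means, 80% subsamples, and the Rand index as the agreement criterion, and the names `kmeans`, `rand_index`, and `subsample_stability` are made up for this example.

```python
import random
from itertools import combinations

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means on coordinate tuples; returns one label per point."""
    centers = rng.sample(points, k)  # random starting centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def rand_index(a, b):
    """Share of case pairs on which two labelings agree (together vs. apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def subsample_stability(points, k, n_rounds=20, frac=0.8, seed=0):
    """Mean Rand index between the full-sample solution and subsample solutions."""
    rng = random.Random(seed)
    base = kmeans(points, k, rng)
    size = max(k, int(frac * len(points)))
    scores = []
    for _ in range(n_rounds):
        idx = sorted(rng.sample(range(len(points)), size))  # cases kept this round
        sub = kmeans([points[i] for i in idx], k, rng)
        scores.append(rand_index([base[i] for i in idx], sub))
    return sum(scores) / len(scores)
```

Values near 1 suggest the partition is reproduced on subsamples; values much lower suggest the solution depends on which cases happen to be included. Other agreement measures (for example the adjusted Rand index or Jaccard coefficient) could be swapped in for `rand_index`.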

2
SPSSCHEN posted on 2005-12-22 10:58:00
Hello Everybody,

A) I was wondering if anybody knows how to get the following statistics (or similar ones), which SAS reports when doing cluster analysis:
1. RMSSTD (root-mean-square standard deviation), for measuring the homogeneity of new clusters
2. SPR (semipartial R-squared), for measuring the homogeneity of merged clusters
3. RS (R-squared), for measuring the heterogeneity of clusters
4. CD (distance between two clusters), for measuring the homogeneity of merged clusters

B) Is it appropriate to use categorical independent variables in Discriminant Analysis?

Thanks in advance.

Best regards, Sanjay
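The four statistics in question A can also be computed by hand from their usual sums-of-squares definitions. The Python sketch below is an illustration, not SAS or SPSS output: the function names are invented for this example, and CD is taken to be the Euclidean distance between cluster centroids, which is an assumption since the appropriate distance depends on the linkage method.

```python
import math

def centroid(rows):
    """Mean of each variable over the rows of one cluster."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sum_sq(rows):
    """Sum of squared deviations from the centroid, pooled over all variables."""
    c = centroid(rows)
    return sum((x - m) ** 2 for r in rows for x, m in zip(r, c))

def rmsstd(rows):
    """Pooled standard deviation of all variables within one cluster."""
    p = len(rows[0])
    return math.sqrt(sum_sq(rows) / (p * (len(rows) - 1)))

def r_squared(clusters):
    """1 - (pooled within-cluster SS / total SS) for a given partition."""
    total = sum_sq([r for c in clusters for r in c])
    within = sum(sum_sq(c) for c in clusters)
    return 1 - within / total

def semipartial_r_squared(a, b, all_rows):
    """Loss of R-squared caused by merging clusters a and b."""
    return (sum_sq(a + b) - sum_sq(a) - sum_sq(b)) / sum_sq(all_rows)

def centroid_distance(a, b):
    """Euclidean distance between the centroids of two clusters."""
    return math.dist(centroid(a), centroid(b))

# Two tight clusters that are far apart on the first variable.
a = [[0.0, 0.0], [2.0, 0.0]]
b = [[10.0, 0.0], [12.0, 0.0]]
print(rmsstd(a), r_squared([a, b]), semipartial_r_squared(a, b, a + b), centroid_distance(a, b))
```

A small RMSSTD and SPR together with a large RS and CD at a given step point to compact, well-separated clusters.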


3
SPSSCHEN posted on 2005-12-22 11:06:00

From: Steven R Brown
Date: Mon Nov 07 15:52:59 CST 2005
To: Q-METHOD@LISTSERV.KENT.EDU
Subject: Re.: Use of SPSS

I must disagree with some of the presumed advantages of SPSS advanced by David Goldstein.

At 05:32 PM 11/6/2005 -0500, David M. Goldstein wrote:

Goldstein: Some of the pluses of using SPSS: It provides some help with the issue of how many factors to extract.

Goldstein: Steve, do you decide how many factors are present only after rotation?

Brown: Particularly in connection with judgmental rotations, the nature of the factors and which to retain is a judgment that emerges as the factor analyst interacts with the data. A priori criteria, such as Cattell's scree test or eigenvalues greater than 1.00, are only associated with the statistical properties of the data.

Goldstein: It provides some different ways of rotating. I have found that the Equimax method keeps on yielding results which are consistent with judgmental rotation.

Brown: This has to be mainly accidental.

Goldstein: I understand your point. However, I keep finding that the Equimax method of rotation keeps on producing the same results as when I use judgmental rotation. The Lipset data is usually used to illustrate the advantages of judgmental rotation. The Equimax rotation produces the same result. Is it possible that when a person is doing judgmental rotation, the person is doing something similar to what Equimax is doing? In a sense, Equimax rotation could be a model of what is happening with judgmental rotation.

Brown: That is, there can be no systematic connection between equimax or any other automatic rotation procedure (which responds only to the statistical topography of the data) and judgmental rotation, which mainly responds to content, or to theoretical considerations. The statistical configuration can of course influence the analyst's decision making, but there are too many other considerations (of which equimax, varimax, and other rotations are oblivious) that will also play a role.

Goldstein: One can obtain factor scores just like with the other programs.

Brown: True enough, but PCQ and PQMethod provide analyses of factor scores (based on standard error formulas) that are missing in SPSS, and probably SAS (with which I am not familiar).

Goldstein: Steve, I don't understand this statement. Can you explain it a little more?

Goldstein: One can analyze the items for patterns over time. I use the k-means cluster analysis program. This is probably similar to what the repertory grid people do.

Brown: I'm sure that SPSS can do this, but looking at individual items outside of the context of the overall response that gives them meaning (if this is in fact what is being done) is to violate the gestalt principle.

Goldstein: Steve, the k-means cluster analysis results in clusters of items. In the way that I am doing it, each cluster consists of items which have a similar pattern over the therapy sessions. So there is a gestalt involved.

Goldstein: One can create new variables from the factors which result. For example, I created a Q-sort which reflected the degree of conflict between the different self-images which resulted. The Conflict Q-Sort can then be entered into the analysis with the other Q-sorts.

Brown: I'm not sure what this implies. How exactly was the new Q sort created?

Goldstein: Steve, this analysis can be seen in the case study which I recently published in Clinical Case Studies. For each item, I looked at the largest difference in factor scores (expressed as a z-score) among the three factors. The factors were interpreted to reflect self-images of the person. For each item, if the largest difference was zero, this shows that the different self-images are the same. If the largest difference was maximal, this would show conflict between the self-images. The z-score difference scores were rank ordered from smallest to largest, and this became a Q-sort that reflected conflict among the self-images.

Brown: Incidentally, I'm not saying that any of the above statistical strategies (e.g., analyzing items over time) isn't useful for some purpose or another, but it's equally important to be mindful of the principles from which these practices depart.
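The conflict score described above (the largest difference among the three factor z-scores for each item, then ranked from smallest to largest) can be sketched in a few lines of Python. The item names and z-scores below are made-up illustrative values, `conflict_ranking` is a hypothetical helper, and the sketch stops at the rank order because the message does not say how the ranks were mapped onto a Q-sort distribution.

```python
def conflict_ranking(factor_scores):
    """Rank items by the spread of their factor z-scores, smallest spread first.

    factor_scores maps an item label to its z-scores on each factor.
    The largest pairwise difference among the scores equals max minus min.
    """
    spread = {item: max(z) - min(z) for item, z in factor_scores.items()}
    order = sorted(spread, key=spread.get)
    return order, spread

# Hypothetical z-scores for three items on three factors (self-images).
scores = {
    "item1": (0.5, 0.5, 0.5),   # identical across self-images: no conflict
    "item2": (-1.0, 0.0, 2.0),  # large spread: strong conflict
    "item3": (0.2, 0.4, 0.1),   # small spread: little conflict
}
order, spread = conflict_ranking(scores)
print(order)
```

Items at the start of `order` are those on which the self-images agree; items at the end are those on which they conflict most.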

[This post was edited by the author on 2005-12-22 11:07:40]


4
SPSSCHEN posted on 2005-12-22 11:10:00
Hi to all,

Can anybody offer some help with the following problem? I want to run a cluster analysis. I know how to do it when the data are in the form of variables and subjects, but my data are already in the form of similarity matrices. If I have one similarity matrix for each subject, can I input these matrices as data and perform the analysis from there? How do I do it? What if I have a single similarity matrix that summarizes all subjects?

Thanks,

Sergio Chaigneau.
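Clustering directly from a similarity matrix is possible in principle, because agglomerative methods only need pairwise proximities. The Python sketch below illustrates the idea outside SPSS, under assumptions that are not from the post: similarities lie between 0 and 1, dissimilarity is taken as 1 minus similarity, per-subject matrices are combined by simple averaging, and average linkage is used. The names `mean_matrix` and `agglomerate` are made up for this example.

```python
def mean_matrix(matrices):
    """Element-wise mean of several equally sized similarity matrices."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]

def agglomerate(sim, n_clusters):
    """Average-linkage clustering on dissimilarity = 1 - similarity."""
    clusters = [[i] for i in range(len(sim))]  # start with one cluster per item

    def dist(a, b):
        # Mean dissimilarity over all cross-cluster item pairs.
        return sum(1 - sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        x, y = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: dist(clusters[pq[0]], clusters[pq[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return sorted(sorted(c) for c in clusters)

# Hypothetical similarity matrix for four items.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
print(agglomerate(sim, 2))
```

With one matrix per subject, `mean_matrix` gives a summary matrix that can be passed to `agglomerate`; whether averaging is a sensible summary depends on how comparable the subjects' similarity judgments are.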


5
SPSSCHEN posted on 2005-12-22 11:11:00
I'd like to know how I can access the distance matrix used in clustering in SPSS. How can I read it (what format is it in)? Are the rows in the same order as the data at the time the syntax was executed (I guess so)? What about filtering?

Best regards


6
SPSSCHEN posted on 2005-12-22 11:15:00

Hi SPSS listers,

I would like some help with the output I get when I run a simple hierarchical cluster analysis (given below). I have tried it on both version 10 and version 8 for Windows.

Proximities

Warnings: An error was encountered when attempting to open the input file named in the MATRIX subcommand. Check the existence and contents of the matrix input file. This command is not executed.

Cluster

Error # 5260 in column 16. Text: C:\TEMP\spssclus.tmp
The file named above is empty or is not an SPSS data file. This command not executed.

Didier Gerard Rodney Soopramanien


7
SPSSCHEN posted on 2005-12-22 11:16:00

Didier, I couldn't reproduce your problem in my version (9), but here are some thoughts: In the HIERARCHICAL CLUSTER dialog box, if you choose the option under METHOD that says something like "standardize...", SPSS uses the PROXIMITIES command to make a distance matrix (into a file called SPSSCLUS.TMP) before running your CLUSTER command. From what I can tell this is where your program went wrong. It seems to be looking for an input file for PROXIMITIES instead of making an output file. That's my guess without seeing the output "notes" or "spss.jnl".

As a workaround, I would use syntax to make sure it's doing what you want at each step:

* First, the PROXIMITIES command with an explicit OUT qualifier.
PROXIMITIES variable
  /MATRIX=OUT("filename.ext")
  /PRINT=NONE
  /STANDARDIZE=SD.

* The CLUSTER command using the file you made.
CLUSTER
  /MATRIX=IN("filename.ext").

Write back if you want help with the exact syntax. Good luck,


