Original poster: 西风瘦

[Help] Experts in principal component analysis, please come in! 2000 forum coins as thanks!


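I'm getting really anxious. I have worked through several books and still can't figure out how to use principal components to reduce the number of indicators!

Here is the situation: I selected 18 indicators and ran a principal component analysis, which produced three principal components, but the indicators that load on each component are quite scattered. I would like to drop some indicators but don't know how to go about it. The moderator recommended SAS and pointed out that the PRINCOMP procedure in SAS can compute the principal components, after which indicators can be dropped. I have never learned SAS and don't know whether there is a concrete worked example I could follow. I would appreciate specific guidance. Many thanks!

Maybe I really am too eager for quick results, but the deadline is close and I don't want to hand in something sloppy. Please help. Thank you!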


Reply 1
shelby posted on 2006-5-4 12:40:00

Here is what I would do:

1) Look at the measurement scale of every variable. If the scales differ (e.g., height vs. age), standardize all the variables, or equivalently run the analysis on the correlation matrix.

2) Check whether each variable is close to normally distributed; if not, apply a transformation. This brings the data cloud closer to an elliptical shape.

3) Here is an example in SAS (using the correlation matrix):

A) The data:


/* State is read from columns 1-15, so the numeric values must start in column 16 or later */
data Crime;
   title 'Crime Rates per 100,000 Population by State';
   input State $1-15 Murder Rape Robbery Assault
         Burglary Larceny Auto_Theft;
   datalines;
Alabama        14.2 25.2 96.8 278.3 1135.5 1881.9 280.7
Alaska         10.8 51.6 96.8 284.0 1331.7 3369.8 753.3
Arizona        9.5 34.2 138.2 312.3 2346.1 4467.4 439.5
Arkansas       8.8 27.6 83.2 203.4 972.6 1862.1 183.4
California     11.5 49.4 287.0 358.0 2139.4 3499.8 663.5
Colorado       6.3 42.0 170.7 292.9 1935.2 3903.2 477.1
Connecticut    4.2 16.8 129.5 131.8 1346.0 2620.7 593.2
Delaware       6.0 24.9 157.0 194.2 1682.6 3678.4 467.0
Florida        10.2 39.6 187.9 449.1 1859.9 3840.5 351.4
Georgia        11.7 31.1 140.5 256.5 1351.1 2170.2 297.9
Hawaii         7.2 25.5 128.0 64.1 1911.5 3920.4 489.4
Idaho          5.5 19.4 39.6 172.5 1050.8 2599.6 237.6
Illinois       9.9 21.8 211.3 209.0 1085.0 2828.5 528.6
Indiana        7.4 26.5 123.2 153.5 1086.2 2498.7 377.4
Iowa           2.3 10.6 41.2 89.8 812.5 2685.1 219.9
Kansas         6.6 22.0 100.7 180.5 1270.4 2739.3 244.3
Kentucky       10.1 19.1 81.1 123.3 872.2 1662.1 245.4
Louisiana      15.5 30.9 142.9 335.5 1165.5 2469.9 337.7
Maine          2.4 13.5 38.7 170.0 1253.1 2350.7 246.9
Maryland       8.0 34.8 292.1 358.9 1400.0 3177.7 428.5
...
;


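B) Run the PRINCOMP procedure:

/* Analyzes the correlation matrix by default; OUT= saves the component scores */
proc princomp data=Crime out=Crime_Components;
run;

C) Look at the output (results). Pay attention to the correlation matrix: if the correlation between two variables is close to 1 or -1, they carry nearly the same information and you can omit one of them.
Also pay attention to the eigenvalues of the correlation matrix. As a rule of thumb, do not use a component whose eigenvalue is less than 1, because it explains less variance than a single standardized variable. In the output below, only the first two components pass this test.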


Hope this helps.
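
Step 1 above treats standardizing the variables and using the correlation matrix as interchangeable. A minimal Python sketch (Python rather than SAS, standard library only) can check that on two columns taken from the first five rows of the Crime data: the covariance of the z-scores equals the correlation of the raw values. The helper names `zscores`, `covariance` and `correlation` are made up for this illustration.

```python
import statistics

# Two columns from the first five rows of the Crime data listed in the post
# (Alabama, Alaska, Arizona, Arkansas, California).
murder = [14.2, 10.8, 9.5, 8.8, 11.5]
burglary = [1135.5, 1331.7, 2346.1, 972.6, 2139.4]

def zscores(values):
    """Center on the mean and divide by the sample standard deviation (n - 1)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def covariance(x, y):
    """Sample covariance with an n - 1 denominator."""
    mx, my = statistics.mean(x), statistics.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Pearson correlation of the raw values."""
    return covariance(x, y) / (statistics.stdev(x) * statistics.stdev(y))

z_murder, z_burglary = zscores(murder), zscores(burglary)
raw_corr = correlation(murder, burglary)
z_cov = covariance(z_murder, z_burglary)

print(f"correlation of raw values : {raw_corr:.6f}")
print(f"covariance of z-scores    : {z_cov:.6f}")
```

Both printed numbers agree, which is why a PCA on standardized data and a PCA on the correlation matrix give the same components.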



Crime Rates per 100,000 Population by State

The PRINCOMP Procedure

Observations 50
Variables 7

Simple Statistics

             Murder         Rape      Robbery      Assault     Burglary      Larceny   Auto_Theft
Mean    7.444000000  25.73400000  124.0920000  211.3000000  1291.904000  2671.288000  377.5260000
StD     3.866768941  10.75962995   88.3485672  100.2530492   432.455711   725.908707  193.3944175

Correlation Matrix

              Murder    Rape  Robbery  Assault  Burglary  Larceny  Auto_Theft
Murder        1.0000  0.6012   0.4837   0.6486    0.3858   0.1019      0.0688
Rape          0.6012  1.0000   0.5919   0.7403    0.7121   0.6140      0.3489
Robbery       0.4837  0.5919   1.0000   0.5571    0.6372   0.4467      0.5907
Assault       0.6486  0.7403   0.5571   1.0000    0.6229   0.4044      0.2758
Burglary      0.3858  0.7121   0.6372   0.6229    1.0000   0.7921      0.5580
Larceny       0.1019  0.6140   0.4467   0.4044    0.7921   1.0000      0.4442
Auto_Theft    0.0688  0.3489   0.5907   0.2758    0.5580   0.4442      1.0000
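
The advice to omit one of two variables whose correlation is close to 1 or -1 can be applied mechanically to the matrix above. The Python sketch below scans the printed correlations for pairs above a cutoff. The cutoffs 0.90 and 0.75 are assumptions chosen for illustration; the thread does not say how close to 1 is close enough, and `correlated_pairs` is a made-up helper name.

```python
names = ["Murder", "Rape", "Robbery", "Assault", "Burglary", "Larceny", "Auto_Theft"]

# Correlation matrix copied from the PROC PRINCOMP output above.
corr = [
    [1.0000, 0.6012, 0.4837, 0.6486, 0.3858, 0.1019, 0.0688],
    [0.6012, 1.0000, 0.5919, 0.7403, 0.7121, 0.6140, 0.3489],
    [0.4837, 0.5919, 1.0000, 0.5571, 0.6372, 0.4467, 0.5907],
    [0.6486, 0.7403, 0.5571, 1.0000, 0.6229, 0.4044, 0.2758],
    [0.3858, 0.7121, 0.6372, 0.6229, 1.0000, 0.7921, 0.5580],
    [0.1019, 0.6140, 0.4467, 0.4044, 0.7921, 1.0000, 0.4442],
    [0.0688, 0.3489, 0.5907, 0.2758, 0.5580, 0.4442, 1.0000],
]

def correlated_pairs(names, corr, threshold):
    """Return (var_a, var_b, r) for every pair with |r| >= threshold, strongest first."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if abs(corr[i][j]) >= threshold:
                pairs.append((names[i], names[j], corr[i][j]))
    return sorted(pairs, key=lambda p: -abs(p[2]))

strict = correlated_pairs(names, corr, 0.90)  # illustrative "close to 1" cutoff
loose = correlated_pairs(names, corr, 0.75)   # looser illustrative cutoff

print("pairs with |r| >= 0.90:", strict)
print("pairs with |r| >= 0.75:", loose)
```

For the crime data no pair reaches 0.90, so this check alone would not remove any variable. With 18 indicators the same scan may well turn up near-duplicates.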

Eigenvalues of the Correlation Matrix

     Eigenvalue   Difference   Proportion   Cumulative
1    4.11495951   2.87623768       0.5879       0.5879
2    1.23872183   0.51290521       0.1770       0.7648
3    0.72581663   0.40938458       0.1037       0.8685
4    0.31643205   0.05845759       0.0452       0.9137
5    0.25797446   0.03593499       0.0369       0.9506
6    0.22203947   0.09798342       0.0317       0.9823
7    0.12405606                    0.0177       1.0000
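
The Proportion and Cumulative columns, and the rule of keeping only components with an eigenvalue of at least 1, follow directly from the eigenvalues. A short Python sketch that recomputes them from the printed values (variable names are illustrative):

```python
# Eigenvalues copied from the PROC PRINCOMP output above.
eigenvalues = [4.11495951, 1.23872183, 0.72581663, 0.31643205,
               0.25797446, 0.22203947, 0.12405606]

# For a correlation matrix the eigenvalues add up to the number of variables.
total = sum(eigenvalues)

# Proportion = eigenvalue / total; Cumulative = running sum of proportions.
proportions = [ev / total for ev in eigenvalues]
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

# Components whose eigenvalue is at least 1 (numbered from 1).
kept = [i + 1 for i, ev in enumerate(eigenvalues) if ev >= 1.0]

for i, (ev, p, c) in enumerate(zip(eigenvalues, proportions, cumulative), start=1):
    print(f"Prin{i}  eigenvalue={ev:.4f}  proportion={p:.4f}  cumulative={c:.4f}")
print("components kept:", kept)
```

On these numbers the rule keeps Prin1 and Prin2, which together account for about 76% of the total variance.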

Eigenvectors

                 Prin1      Prin2      Prin3      Prin4      Prin5      Prin6      Prin7
Murder        0.300279   -.629174   0.178245   -.232114   0.538123   0.259117   0.267593
Rape          0.431759   -.169435   -.244198   0.062216   0.188471   -.773271   -.296485
Robbery       0.396875   0.042247   0.495861   -.557989   -.519977   -.114385   -.003903
Assault       0.396652   -.343528   -.069510   0.629804   -.506651   0.172363   0.191745
Burglary      0.440157   0.203341   -.209895   -.057555   0.101033   0.535987   -.648117
Larceny       0.357360   0.402319   -.539231   -.234890   0.030099   0.039406   0.601690
Auto_Theft    0.295177   0.502421   0.568384   0.419238   0.369753   -.057298   0.147046
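
The original question was which indicators to drop once the components are known. One common heuristic, which is not part of shelby's recipe and is offered here only as a sketch, is to turn the eigenvectors into loadings (eigenvector entry times the square root of the eigenvalue) and then compute each variable's communality, the share of its variance captured by the retained components. Variables with low communality are candidates for removal. The Python below does this for Prin1 and Prin2 using the printed values; keeping two components is an assumption based on the eigenvalue rule.

```python
import math

# Eigenvalues of Prin1 and Prin2, copied from the output above.
eigenvalues = [4.11495951, 1.23872183]

# (Prin1, Prin2) eigenvector entries per variable, copied from the output above.
eigenvectors = {
    "Murder":     (0.300279, -0.629174),
    "Rape":       (0.431759, -0.169435),
    "Robbery":    (0.396875,  0.042247),
    "Assault":    (0.396652, -0.343528),
    "Burglary":   (0.440157,  0.203341),
    "Larceny":    (0.357360,  0.402319),
    "Auto_Theft": (0.295177,  0.502421),
}

# Loading = eigenvector entry * sqrt(eigenvalue). For a PCA on the correlation
# matrix this is the correlation between the variable and the component.
loadings = {
    name: tuple(v * math.sqrt(ev) for v, ev in zip(vec, eigenvalues))
    for name, vec in eigenvectors.items()
}

# Communality = sum of squared loadings over the retained components.
communality = {name: sum(l * l for l in row) for name, row in loadings.items()}

# The variable least well represented by the two retained components.
weakest = min(communality, key=communality.get)

for name in sorted(communality, key=communality.get):
    l1, l2 = loadings[name]
    print(f"{name:<11} loading1={l1:+.3f} loading2={l2:+.3f} communality={communality[name]:.3f}")
print("lowest communality:", weakest)
```

A low communality only says that the retained components describe that variable poorly. Whether to drop it still depends on what the indicator means in the study.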



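Reply 2
rongchao posted on 2006-5-4 17:12:00
The principal components produced by PCA are only linear combinations of the original indicators, as given by the loading matrix. To tie the components more clearly to the original indicators, you also need to apply a factor rotation to the loading matrix to obtain common factors, and even then you will not always reach a clear conclusion. See 《多元统计分析》 (Multivariate Statistical Analysis), edited by 何晓群 (He Xiaoqun) and published by China Renmin University Press, which walks through PCA and factor analysis in SPSS in detail. In SPSS, PCA and factor analysis are combined in one procedure.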
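
To illustrate what rotating the loading matrix does, here is a rough Python sketch using the first two components from shelby's crime example. It is not the SPSS procedure: it assumes two retained components, uses the raw varimax criterion without Kaiser normalization, and finds the rotation angle by a brute-force grid search instead of the usual iterative algorithm. The function names are made up for this illustration.

```python
import math

names = ["Murder", "Rape", "Robbery", "Assault", "Burglary", "Larceny", "Auto_Theft"]
eigvals = (4.11495951, 1.23872183)  # Prin1, Prin2 from the SAS output above
eigvecs = [                          # (Prin1, Prin2) entries, same row order as names
    (0.300279, -0.629174), (0.431759, -0.169435), (0.396875, 0.042247),
    (0.396652, -0.343528), (0.440157, 0.203341), (0.357360, 0.402319),
    (0.295177, 0.502421),
]

# Unrotated loadings: eigenvector entry * sqrt(eigenvalue).
loadings = [(a * math.sqrt(eigvals[0]), b * math.sqrt(eigvals[1])) for a, b in eigvecs]

def rotate(rows, theta):
    """Apply an orthogonal rotation by angle theta to two-column loadings."""
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in rows]

def varimax_criterion(rows):
    """Sum over the two factors of the variance of the squared loadings."""
    total = 0.0
    for k in range(2):
        sq = [row[k] ** 2 for row in rows]
        mean = sum(sq) / len(sq)
        total += sum((x - mean) ** 2 for x in sq) / len(sq)
    return total

# Grid search over 0 to 90 degrees in 0.1 degree steps (angle 0 = no rotation).
angles = [math.radians(d / 10) for d in range(900)]
best_theta = max(angles, key=lambda t: varimax_criterion(rotate(loadings, best_t := t)) if False else varimax_criterion(rotate(loadings, t)))
rotated = rotate(loadings, best_theta)

print(f"rotation angle: {math.degrees(best_theta):.1f} degrees")
for name, (f1, f2) in zip(names, rotated):
    print(f"{name:<11} factor1={f1:+.3f} factor2={f2:+.3f}")
```

The rotation leaves each variable's communality unchanged and only redistributes the loadings so that each variable tends to load strongly on one factor, which is what makes the factors easier to name.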
