Original poster: juventume

[Q&A] Asking for help with Fisher's Z test

1
juventume posted on 2010-10-21 12:01:28

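In regression analysis, to compare the sizes of the effects of two independent variables on the dependent variable, I believe you need to compute a Z value from the two standardized regression coefficients, i.e. Fisher's Z test. Does anyone know how to do this, or can you recommend any references I could study? Thanks!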

2
juventume posted on 2010-10-22 09:24:10
Bumping my own thread first.

3
juventume posted on 2010-10-25 10:03:54
What a tragedy. Could some kind expert please give me some guidance!

4
tangjingtian posted on 2010-10-25 18:18:42
Bump. I don't know how to do this either.

5
tyaer posted on 2011-2-11 16:13:54
Fisher's exact test should work like this:


Fisher's exact test for 2x2 tables and some measures of association based on chi-square.

Fisher's Exact Test
Fisher's exact test is another test of association between the row and column variables. This test assumes that the row and column totals are fixed, and then uses the hypergeometric distribution to compute probabilities of possible tables with these observed row and column totals. Fisher's exact test does not depend on any large-sample distribution assumptions, and so it is appropriate even for small sample sizes and for sparse tables.
2 × 2 Tables
For 2 × 2 tables, PROC FREQ gives the following information for Fisher's exact test: table probability, two-sided p-value, left-sided p-value, and right-sided p-value. The table probability equals the hypergeometric probability of the observed table, and is in fact the value of the test statistic for Fisher's exact test.
Where $p$ is the hypergeometric probability of a specific table with the observed row and column totals, Fisher's exact p-values are computed by summing probabilities $p$ over defined sets of tables $A$:

$$\mathrm{PROB} = \sum_{A} p$$

The two-sided p-value is the sum of all possible table probabilities (for tables having the observed row and column totals) that are less than or equal to the observed table probability. So, for the two-sided p-value, the set $A$ includes all possible tables with hypergeometric probabilities less than or equal to the probability of the observed table. A small two-sided p-value supports the alternative hypothesis of association between the row and column variables.
One-sided tests are defined in terms of the frequency of the cell in the first row and first column of the table, the (1,1) cell. Denoting the observed (1,1) cell frequency by F, the left-sided p-value for Fisher's exact test is the probability that the (1,1) cell frequency is less than or equal to F. So, for the left-sided p-value, the set $A$ includes those tables with a (1,1) cell frequency less than or equal to F. A small left-sided p-value supports the alternative hypothesis that the probability of an observation being in the first cell is less than expected under the null hypothesis of independent row and column variables.
Similarly, for a right-sided alternative hypothesis, $A$ is the set of tables where the frequency of the (1,1) cell is greater than or equal to that in the observed table. A small right-sided p-value supports the alternative that the probability of the first cell is greater than that expected under the null hypothesis.
Because the (1,1) cell frequency completely determines the 2 × 2 table when the marginal row and column sums are fixed, these one-sided alternatives can be equivalently stated in terms of other cell probabilities or ratios of cell probabilities. The left-sided alternative is equivalent to an odds ratio less than 1, where the odds ratio equals $(n_{11} n_{22}) / (n_{12} n_{21})$. Additionally, the left-sided alternative is equivalent to the column 1 risk for row 1 being less than the column 1 risk for row 2, $p_{1|1} < p_{1|2}$. Similarly, the right-sided alternative is equivalent to the column 1 risk for row 1 being greater than the column 1 risk for row 2, $p_{1|1} > p_{1|2}$. Refer to Agresti (1996).
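The 2 × 2 definitions above can be checked by direct enumeration. The following is only a minimal Python sketch, not what PROC FREQ does internally; the function name `fisher_exact_2x2` and the example table are made up for illustration, and it assumes the four cell counts are non-negative integers.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact p-values for the 2x2 table [[a, b], [c, d]].

    Enumerates every table with the observed margins, indexed by its
    (1,1) cell x, and sums hypergeometric probabilities over the sets
    described in the text. Hypothetical helper for illustration only.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo = max(0, col1 - row2)      # smallest feasible (1,1) cell
    hi = min(row1, col1)          # largest feasible (1,1) cell

    def prob(x):
        # Hypergeometric probability of the table whose (1,1) cell is x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(row1 + row2, col1)

    probs = {x: prob(x) for x in range(lo, hi + 1)}
    p_obs = probs[a]
    return {
        "table": p_obs,
        "left": sum(p for x, p in probs.items() if x <= a),
        "right": sum(p for x, p in probs.items() if x >= a),
        # Small relative tolerance so floating-point ties count as equal.
        "two_sided": sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)),
    }

result = fisher_exact_2x2(3, 1, 1, 3)
print(result)
```

For larger tables, direct enumeration like this becomes slow, which is why a faster algorithm is preferable in practice.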
R × C Tables
Fisher's exact test was extended to general R × C tables by Freeman and Halton (1951), and this test is also known as the Freeman-Halton test. For R × C tables, the two-sided p-value is defined the same as it is for 2 × 2 tables. The set of tables summed over contains all tables with hypergeometric probability less than or equal to the probability of the observed table. A small p-value supports the alternative hypothesis of association between the row and column variables. For R × C tables, Fisher's exact test is inherently two-sided. The alternative hypothesis is defined only in terms of general, and not linear, association. Therefore, PROC FREQ does not provide right-sided or left-sided p-values for general R × C tables.
For R × C tables, PROC FREQ computes Fisher's exact test using the network algorithm of Mehta and Patel (1983), which provides a faster and more efficient solution than direct enumeration.

The old oil seller said: "It is nothing special, just practice!"

6
tyaer posted on 2011-2-11 16:35:29
I think what you are describing is the standardization transformation.

7
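tyaer posted on 2011-2-11 16:35:45
Data standardization

Only when you need knowledge do you regret having read too little!
Today I was working on a data clustering example and could not get a satisfactory result no matter what I tried. I was quite frustrated, took a short nap, and then realized that I had not standardized the data! I am writing this post as a reminder to myself.
While I am at it, here is someone else's blog post, copied in to fill out the page:

Data standardization is a common technique in statistics. What effect does it have on the data? Here I only look at the kind of standardization that subtracts the mean and divides by the standard deviation. That the result has mean 0 and variance 1 goes without saying, so here are two other properties.
  1. It does not change ranks.
  2. It does not change the correlation coefficients between variables: Pearson, Spearman, Kendall, and partial correlation coefficients all stay the same. (The derivation is simple and rather interesting: first derive the Pearson case; property 1 then gives Spearman and Kendall; and the partial correlation follows from Pearson.)
  Because of these properties, standardization is used frequently in the following situations:
  1. In regression analysis it is often used to remove the intercept term.
  2. In composite-index rankings that require a weighted average, standardized data are often used to remove the effect of measurement units (this is really the essential purpose of standardization).
  3. To see the relationships among several variables more clearly in a plot, you can standardize them (again, this is like removing the effect of units) so that the data are on similar orders of magnitude, which makes the plot easier to read.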
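The two properties above (ranks and Pearson correlation are unchanged) can be checked numerically. This is a minimal Python sketch on made-up data; the helper names `standardize`, `pearson`, and `ranks` are hypothetical, the sample standard deviation (n - 1 divisor) is an assumption, and `ranks` assumes there are no ties.

```python
from math import sqrt

def standardize(values):
    """Subtract the mean and divide by the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank of each value (0 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result

# Made-up illustrative data, not taken from the thread.
x = [2.0, 4.0, 4.5, 7.0, 9.0]
y = [10.0, 30.0, 25.0, 60.0, 55.0]
zx, zy = standardize(x), standardize(y)

print(ranks(x) == ranks(zx))            # property 1: ranks unchanged
print(pearson(x, y), pearson(zx, zy))   # property 2: same Pearson r
```

Since Spearman's coefficient is the Pearson correlation of the ranks, unchanged ranks imply an unchanged Spearman coefficient as well.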

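  How do you standardize data?
Answer: in SPSS, the Descriptives analysis has a "Save standardized values as variables" option; I take the lazy route and use that every time. To do it properly by hand, use Compute and apply the formula. Excel works too: write the formula once in a cell using the two functions Average and Stdev, drag it down, and you are done.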

In SAS:
Example 1: Standardizing to a Given Mean and Standard Deviation
This example
•        standardizes two variables to a mean of 75 and a standard deviation of 5
•        specifies the output data set
•        combines standardized variables with original variables
•        prints the output data set
data score;
   length Student $ 9;
   input Student $ StudentNumber Section $
         Test1 Test2 Final @@;
   format studentnumber z4.;
   datalines;
Capalleti 0545 1 94 91 87  Dubose    1252 2 51 65 91
Engles    1167 1 95 97 97  Grant     1230 2 63 75 80
Krupski   2527 2 80 69 71  Lundsford 4860 1 92 40 86
McBane    0674 1 75 78 72  Mullen    6445 2 89 82 93
Nguyen    0886 1 79 76 80  Patel     9164 2 71 77 83
Si        4915 1 75 71 73  Tanaka    8534 2 87 73 76
;

proc standard data=score mean=75 std=5 out=stndtest;
   var test1 test2;
run;

proc sql;
create table combined as
select old.student, old.studentnumber,
      old.section,
      old.test1, new.test1 as StdTest1,
      old.test2, new.test2 as StdTest2,
      old.final
from score as old, stndtest as new
where old.student=new.student;

proc print data=combined noobs round;
   title 'Standardized Test Scores for a College Course';
run;
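For readers without SAS, the rescaling in Example 1 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a reimplementation of PROC STANDARD; the function name `rescale` is hypothetical, and using the sample standard deviation (n - 1 divisor) is an assumption about the procedure's default.

```python
from statistics import mean, stdev

def rescale(values, target_mean, target_sd):
    """Shift and scale values so they have the requested mean and sample SD."""
    m, s = mean(values), stdev(values)   # stdev uses the n - 1 divisor
    return [target_mean + target_sd * (v - m) / s for v in values]

# Test1 scores from the data step above, in input order.
test1 = [94, 51, 95, 63, 80, 92, 75, 89, 79, 71, 75, 87]
std_test1 = rescale(test1, target_mean=75, target_sd=5)

print([round(v, 2) for v in std_test1])
print(round(mean(std_test1), 6), round(stdev(std_test1), 6))
```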


Example 2: Standardizing BY Groups and Replacing Missing Values
This example
•        calculates Z scores separately for each BY group using a mean of 0 and standard deviation of 1
•        replaces missing values with the given mean
•        prints the mean and standard deviation for the variables to standardize
•        prints the output data set.
proc format;
   value popfmt 1='Stable'
                2='Rapid';
run;

data lifexp;
  input PopulationRate Country $char14. Life50 Life93 @@;
  label life50='1950 life expectancy'
     life93='1993 life expectancy';
  datalines;
2 Bangladesh     .  53 2 Brazil         51 67
2 China          41 70 2 Egypt          42 60
2 Ethiopia       33 46 1 France         67 77
1 Germany        68 75 2 India          39 59
2 Indonesia      38 59 1 Japan          64 79
2 Mozambique      . 47 2 Philippines    48 64
1 Russia          . 65 2 Turkey         44 66
1 United Kingdom 69 76 1 United States  69 75
;

proc sort data=lifexp;
   by populationrate;
run;

proc standard data=lifexp mean=0 std=1 replace print out=zscore;
   by populationrate;
   format populationrate popfmt.;
   title1 'Life Expectancies by Birth Rate';
run;

proc print data=zscore noobs;
   title 'Standardized Life Expectancies at Birth';
   title2 'by a Country''s Birth Rate';
run;
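The same idea as Example 2 (Z scores within each group, with missing values replaced by the target mean) can be sketched in Python. This is a rough illustration under stated assumptions, not the procedure's actual implementation: the function name `zscore_by_group` is hypothetical, missing values are represented as `None`, and every group is assumed to have at least two non-missing values.

```python
from statistics import mean, stdev

def zscore_by_group(rows, target_mean=0.0, target_sd=1.0):
    """Standardize values within each group; missing values get target_mean.

    rows is a list of (group, value) pairs where value may be None.
    Results are returned in input order.
    """
    observed = {}
    for group, value in rows:
        if value is not None:
            observed.setdefault(group, []).append(value)
    # Per-group mean and sample standard deviation of the non-missing values.
    stats = {g: (mean(vs), stdev(vs)) for g, vs in observed.items()}

    result = []
    for group, value in rows:
        if value is None:
            result.append(target_mean)
        else:
            m, s = stats[group]
            result.append(target_mean + target_sd * (value - m) / s)
    return result

# A few 1950 life expectancy rows from the data step above (1 = Stable, 2 = Rapid).
life50 = [(1, 67), (1, 68), (1, 64), (1, None), (1, 69),
          (2, None), (2, 51), (2, 41), (2, 42), (2, 33)]
print([round(z, 3) for z in zscore_by_group(life50)])
```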

8
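--墨子-- posted on 2012-12-27 11:29:17
Well, I am two years late, but I will answer anyway for the benefit of future readers:

Fisher r to z transformation:

$$z = \tfrac{1}{2}\left[\ln(1+r) - \ln(1-r)\right] = \tfrac{1}{2}\ln\frac{1+r}{1-r}$$, with standard error $SE_z = 1/\sqrt{N-3}$

To test whether the correlation between X and Y differs significantly between two populations:
$$Z = \frac{z_1 - z_2}{\sqrt{\dfrac{1}{N_1-3} + \dfrac{1}{N_2-3}}}$$

To test whether the correlation between X and Z differs significantly from the correlation between Y and Z (same sample):
$$t = (r_{xz} - r_{yz})\sqrt{\frac{(N-3)(1+r_{xy})}{2\,(1 - r_{xy}^2 - r_{xz}^2 - r_{yz}^2 + 2\,r_{xy} r_{xz} r_{yz})}}, \quad df = N-3$$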
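The formulas in this post translate almost directly into code. Below is a minimal Python sketch using only the standard library; the function names and the example correlations are made up for illustration. The two-sided p-value for Z assumes the standard normal approximation, and for the t statistic only the value and its degrees of freedom are returned, since the standard library has no t distribution.

```python
from math import erfc, log, sqrt

def fisher_z(r):
    """Fisher r-to-z transformation: 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * log((1 + r) / (1 - r))

def compare_independent_correlations(r1, n1, r2, n2):
    """Z test for the same correlation measured in two independent samples."""
    z = (fisher_z(r1) - fisher_z(r2)) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    p_two_sided = erfc(abs(z) / sqrt(2))   # 2 * (1 - Phi(|z|))
    return z, p_two_sided

def dependent_correlation_t(r_xz, r_yz, r_xy, n):
    """t statistic for r_xz versus r_yz in one sample, with df = n - 3."""
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    t = (r_xz - r_yz) * sqrt((n - 3) * (1 + r_xy) / (2 * det))
    return t, n - 3

# Hypothetical numbers, purely to show the calls.
print(compare_independent_correlations(0.5, 103, 0.3, 103))
print(dependent_correlation_t(r_xz=0.5, r_yz=0.3, r_xy=0.4, n=103))
```

The t value would then be compared against a t distribution with the returned degrees of freedom, for example from a table or a statistics package.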

Follow me on Weibo: http://weibo.com/weizhangmozi

9
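ruqurulai posted on 2013-3-12 18:45:18
--墨子-- posted on 2012-12-27 11:29
Well, I am two years late, but I will answer anyway for the benefit of future readers:

Fisher r to z transformation:
The XZ versus YZ comparison you gave at the end: is that Steiger's z-test? How is it computed? Is it available in SPSS? Can it test more than three?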

10
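--墨子-- posted on 2013-3-12 23:19:55
ruqurulai posted on 2013-3-12 18:45
The XZ versus YZ comparison you gave at the end: is that Steiger's z-test? How is it computed? Is it available in SPSS? Can it test more than three?
What I wrote is Hotelling's t-test.

Steiger's z-test uses a similar calculation and controls the Type I error better than the one above (details here: http://psych.unl.edu/psycrs/statpage/biv_corr_comp_eg.pdf).

I do not know of a method for more than three either, but you can simply run multiple pairwise comparisons with Hotelling or Steiger and apply a Bonferroni correction to α. The drawback is low power.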
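The Bonferroni step suggested here amounts to comparing each pairwise p-value against α divided by the number of comparisons. A minimal Python sketch, with a hypothetical function name and made-up p-values:

```python
def bonferroni(p_values, alpha=0.05):
    """Flag each p-value as significant at the Bonferroni-adjusted level.

    With m comparisons, each test is run at alpha / m so that the
    family-wise Type I error rate stays at or below alpha.
    """
    threshold = alpha / len(p_values)
    return [(p, p <= threshold) for p in p_values]

# Made-up p-values from three pairwise correlation comparisons.
for p, significant in bonferroni([0.01, 0.02, 0.04]):
    print(p, significant)
```

With three comparisons the per-test threshold drops from 0.05 to about 0.0167, which illustrates the loss of power mentioned above.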
