楼主: oliyiyi
1074 1

After 150 Years, the ASA Says No to p-values [推广有奖]

版主

已卖:2997份资源

泰斗

1%

还不是VIP/贵宾

-

TA的文库  其他...

计量文库

威望
7
论坛币
27300 个
通用积分
31674.3271
学术水平
1454 点
热心指数
1573 点
信用等级
1364 点
经验
384134 点
帖子
9629
精华
66
在线时间
5508 小时
注册时间
2007-5-21
最后登录
2025-7-8

初级学术勋章 初级热心勋章 初级信用勋章 中级信用勋章 中级学术勋章 中级热心勋章 高级热心勋章 高级学术勋章 高级信用勋章 特级热心勋章 特级学术勋章 特级信用勋章

楼主
oliyiyi 发表于 2016-3-14 14:23:14 |AI写论文

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币
March 7, 2016
By matloff









(This article was first published on Mad (Data) Scientist, and kindly contributed to R-bloggers)

Sadly, the concept of p-values and significance testing forms the very core of statistics. A number of us have been pointing out for decades that p-values are at best underinformative and often misleading. Almost all statisticians agree on this, yet they all continue to use it and, worse, teach it. I recall a few years ago, when Frank Harrell and I suggested that R place less emphasis on p-values in its output, there was solid pushback. One can’t blame the pusherbackers, though, as the use of p-values is so completely entrenched that R would not be serving its users well with such a radical move.
And yet, wonder of wonders, the American Statistical Association has finally taken a position against p-values. I never thought this would happen in my lifetime, or in anyone else’s, for that matter, but I say, Hooray for the ASA!
To illustrate the problem, consider the  one of the MovieLens data sets, consisting of user ratings of movies. There are 949 users. Here is an analysis in which I regress average rating per user against user age and gender:
> head(uu)  userid age gender      occup   zip  avg_rat1      1  24      0 technician 85711 3.6102942      2  53      0      other 94043 3.7096773      3  23      0     writer 32067 2.7962964      4  24      0 technician 43537 4.3333335      5  33      0      other 15213 2.8742866      6  42      0  executive 98101 3.635071> q  summary(q)...Coefficients:             Estimate Std. Error t value Pr(>|t|)    (Intercept) 3.4725821  0.0482655  71.947  < 2e-16 ***age         0.0033891  0.0011860   2.858  0.00436 ** gender      0.0002862  0.0318670   0.009  0.99284    ...Multiple R-squared:  0.008615,  Adjusted R-squared:  0.006505 Woohoo! Double-star significance on age! P-value of only 0.004! Age is a highly-significant predictor of movie ratings! Older people give higher ratings!
Well, no. A 10-year age difference corresponds to only a 0.03 difference in ratings — quite minuscule in light of the fact that ratings take values between 1 and 5.
The problem is that with large samples, significance tests pounce on tiny, unimportant departures from the null hypothesis, in this case H0: βage = 0, and ironically declare this unimportant result “significant.” We have the opposite problem with small samples: The power of the test is low, and we will announce that there is “no significant effect” when in fact we may have too little data to know whether the effect is important.
In addition, there is the hypocrisy aspect. Almost no null hypotheses are true in the real world, so performing a significance test on them is absurd and bizarre.
Speaking of hypocrisy: As noted above, instructors of statistics courses all know of the above problems, and yet teach testing anyway, with little or (likely) no warning about this dangerous method. Those instructors also do testing in their own work.
My hat is off to ASA for finally taking some action.


二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:p-value values value After years published continue article testing number

缺少币币的网友请访问有奖回帖集合
https://bbs.pinggu.org/thread-3990750-1-1.html

沙发
austen06 发表于 2016-3-15 07:48:37
But there is no consensus on how big or how small a dataset is.

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注jltj
拉您入交流群
GMT+8, 2026-1-27 08:57