楼主: oliyiyi
1522 2

Standard deviation vs Standard error [推广有奖]

版主

泰斗

0%

还不是VIP/贵宾

-

TA的文库  其他...

计量文库

威望
7
论坛币
271996 个
通用积分
31269.4471
学术水平
1435 点
热心指数
1554 点
信用等级
1345 点
经验
383775 点
帖子
9598
精华
66
在线时间
5470 小时
注册时间
2007-5-21
最后登录
2024-4-30

初级学术勋章 初级热心勋章 初级信用勋章 中级信用勋章 中级学术勋章 中级热心勋章 高级热心勋章 高级学术勋章 高级信用勋章 特级热心勋章 特级学术勋章 特级信用勋章

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币
(This article was first published on DataScience+, and kindly contributed to R-bloggers)

I got often asked (i.e. more than two times) by colleagues if they should plot/use the standard deviation or the standard error, here is a small post trying to clarify the meaning of these two metrics and when to use them with some R code example.

Standard deviation

Standard deviation is a measure of dispersion of the data from the mean.

set.seed(20151204)#generate some random datax<-rnorm(10)#compute the standard deviationsd(x)1.144105

For normally distributed data the standard deviation has some extra information, namely the 68-95-99.7 rule which tells us the percentage of data lying within 1, 2 or 3 standard deviation from the mean.

plot(seq(-3.2,3.2,length=50),dnorm(seq(-3,3,length=50),0,1),type="l",xlab="",ylab="",ylim=c(0,0.5))segments(x0 = c(-3,3),y0 = c(-1,-1),x1 = c(-3,3),y1=c(1,1))text(x=0,y=0.45,labels = expression("99.7% of the data within 3" ~ sigma))arrows(x0=c(-2,2),y0=c(0.45,0.45),x1=c(-3,3),y1=c(0.45,0.45))segments(x0 = c(-2,2),y0 = c(-1,-1),x1 = c(-2,2),y1=c(0.4,0.4))text(x=0,y=0.3,labels = expression("95% of the data within 2" ~ sigma))arrows(x0=c(-1.5,1.5),y0=c(0.3,0.3),x1=c(-2,2),y1=c(0.3,0.3))segments(x0 = c(-1,1),y0 = c(-1,-1),x1 = c(-1,1),y1=c(0.25,0.25))text(x=0,y=0.15,labels = expression("68% of the data within 1" * sigma),cex=0.9)

Here is the plot we made:

Of course if the data are not normally distributed such interpretation is not valid. It remains that standard deviation can still be used as a measure of dispersion even for non-normally distributed data.

Standard error of the mean

It is a measure of how precise is our estimate of the mean.

#computation of the standard error of the meansem<-sd(x)/sqrt(length(x))#95% confidence intervals of the meanc(mean(x)-2*sem,mean(x)+2*sem)-1.1337038  0.3134877

The main use of the standard error of the mean is to give confidence intervals around the estimated means where it follows the same 68-95-99.7 rule BUT this time not for the data itself but for the mean. This can also be extended to test (in terms of null hypothesis testing) differences between means. For example if the 95% confidence intervals around the estimated fish sizes under Treatment A do not cross the estimated mean fish size under Treatment B then fish sizes are significantly different from one another between the two Treatments. Note that the standard error of the mean depends on the sample size, the standard error of the mean shrink to 0 as sample size increases to infinity.

When to use standard deviation? When to use standard error?

It depends. If the message you want to carry is about the spread and variability of the data, then standard deviation is the metric to use. If you are interested in the precision of the means or in comparing and testing differences between means then standard error is your metric. Of course deriving confidence intervals around your data (using standard deviation) or the mean (using standard error) requires your data to be normally distributed.Bootstrapping is an option to derive confidence intervals in cases when you are doubting the normality of your data.


二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:Deviation Standard stand Error ATION error

缺少币币的网友请访问有奖回帖集合
https://bbs.pinggu.org/thread-3990750-1-1.html
沙发
loyerss 发表于 2015-12-6 16:15:44 |只看作者 |坛友微信交流群
受教了。bbs.paidai.com/topic/458335

使用道具

藤椅
seahhj 发表于 2015-12-6 18:43:09 |只看作者 |坛友微信交流群
THANKS for sharing

使用道具

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注jltj
拉您入交流群

京ICP备16021002-2号 京B2-20170662号 京公网安备 11010802022788号 论坛法律顾问:王进律师 知识产权保护声明   免责及隐私声明

GMT+8, 2024-5-2 11:43