Thread starter: Lisrelchen

[WinBUGS Q&A] Thinning chains in BUGS/JAGS


Hi, I have a quick question about the details of running a model in JAGS and BUGS. Say I run a model with n.burnin=5000, n.iter=5000 and thin=2. Does this mean that the program will run 5,000 burn-in iterations and discard those results, and then run another 10,000 iterations, keeping only every second result?


If I save these simulations as a CODA object, are all 10,000 saved, or only the thinned 5,000? I'm just trying to understand which set of iterations is used to make the ACF plot.

Thank you


Reply #1: Lisrelchen, posted 2014-6-16 12:36:27
Thinning to reduce autocorrelation: Rarely useful!


Borys Paulewicz, commenting on a previous post, brought to my attention a very recent article about thinning of MCMC chains: Link, W. A. & Eaton, M. J. (2011) On thinning of chains in MCMC. Methods in Ecology and Evolution. First published online 17 June 2011. doi: 10.1111/j.2041-210X.2011.00131.x The basic conclusion of the article is that thinning of chains is not usually appropriate when the goal is precision of estimates from an MCMC sample. (Thinning can be useful for other reasons, such as memory or time constraints in post-chain processing, but those are very different motivations than precision of estimation of the posterior distribution.)

Here's the idea: Consider an MCMC chain that is strongly autocorrelated. Autocorrelation produces clumpy samples that are unrepresentative, in the short run, of the true underlying posterior distribution. Therefore, if possible, we would like to get rid of autocorrelation so that the MCMC sample provides a more precise estimate of the posterior distribution. One way to decrease autocorrelation is to thin the sample, using only every nth step. If we keep 50,000 thinned steps with small autocorrelation, then we very probably have a more precise estimate of the posterior than 50,000 unthinned steps with high autocorrelation. But to get 50,000 kept steps in a thinned chain, we needed to generate n*50,000 steps. With such a long chain, the clumpy autocorrelation has probably all been averaged out anyway! In fact, Link and Eaton show that the longer (unthinned) chain usually yields better estimates of the true posterior than the shorter thinned chain, even for percentiles in the tail of the distribution, at least for the particular cases they consider.

The tricky part is knowing how long of a chain is long enough to smooth out short-run autocorrelation. The more severe the autocorrelation is, the longer the chain needs to be. I have not seen rules of thumb for how to translate an autocorrelation function into a recommended chain length. Link and Eaton suggest monitoring different independent chains and assaying whether the estimates produced by the different chains are suitably similar to each other.
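As a rough sketch of that kind of check in R (mine, not from the article): coda's gelman.diag() and per-chain summaries are one common way to assay whether independent chains agree. The toy chains below are placeholders standing in for real MCMC output.

library(coda)
set.seed(1)
# Three toy "independent chains" standing in for real MCMC output:
chains <- mcmc.list(replicate(3, mcmc(rnorm(5000)), simplify = FALSE))
gelman.diag(chains)       # potential scale reduction factor near 1 suggests the chains agree
lapply(chains, summary)   # or compare the per-chain estimates directly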

For extreme autocorrelation, it's best not to rely on long-run averaging out, but instead to use other techniques that actually get rid of the autocorrelation. This usually involves reparameterization, as appropriate for the particular model.

Link and Eaton point out that the inefficiency of thinning has been known for years, but many practitioners have gone on using it anyway. My book followed those practitioners. It should be pointed out that thinning does not yield incorrect results (in the sense of being biased). Thinning merely produces correct results less efficiently (on average) than using the full chain from which the thinned chain was extracted. There are at least a couple more mathematically advanced textbooks that you might turn to for additional advice. For example, Jackman says in his 2009 book, Bayesian Analysis for the Social Sciences, "High levels of autocorrelation in a MCMC algorithm are not fatal in and of themselves, but they will indicate that a very long run of the sampler may be required. Thinning is not a strategy for avoiding these long runs, but it is a strategy for dealing with the otherwise overwhelming amount of MCMC output. (p. 263)" Christensen et al. say in their 2011 book, Bayesian Ideas and Data Analysis, "Unless there is severe autocorrelation, e.g., high correlation with, say [lag]=30, we don't believe that thinning is worthwhile. (p.146)" Unfortunately, there is no sure advice about how long a chain is needed. Longer is better. Perhaps if you're tempted to thin by n to reduce autocorrelation, just use a chain n times as long without thinning.
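To make the point about long unthinned chains concrete, here is a minimal sketch of my own using coda (not code from the post); the AR(1) series is only a stand-in for a strongly autocorrelated MCMC chain.

library(coda)
set.seed(1)
n <- 50000
x <- numeric(n)
for (i in 2:n) x[i] <- 0.9 * x[i - 1] + rnorm(1)   # AR(1) stand-in for an autocorrelated chain
full.chain    <- mcmc(x)
thinned.chain <- window(full.chain, thin = 10)      # keep every 10th draw
effectiveSize(full.chain)     # effective sample size of all 50,000 correlated draws
effectiveSize(thinned.chain)  # effective sample size of the 5,000 kept draws, typically smaller

The full chain carries at least as much information as the thinned subset extracted from it, which is exactly the inefficiency Link and Eaton describe.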


Reply #2: Lisrelchen, posted 2014-6-16 12:37:36
Short answer: The number of iterations incorporates the burn in and does not incorporate thinning.

Less short answer: If you were to run a BUGS model through R2WinBUGS or R2OpenBUGS (or view a summary of WinBUGS output) with the arguments you stated:

n.iter=5000, n.burnin=5000, n.thin=2
you would get an error message or no output. n.iter refers to the total number of iterations including the burn-in, hence all of your iterations are burn-in and are thrown away (i.e., not included in the CODA output or any ACF plot in WinBUGS).

Thinning is treated differently (in relation to n.iter). For example if you set your MCMC up with any of the following arguments:

n.iter=6000, n.burnin=5000, n.thin=1
n.iter=6000, n.burnin=5000, n.thin=5
n.iter=6000, n.burnin=5000, n.thin=10
in each case only the 1,000 post-burn-in iterations are eligible to be saved, and the simulations removed by thinning are discarded (from the CODA output and any ACF plot in WinBUGS).
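For concreteness, here is a hedged sketch of such a call through R2WinBUGS; the model file "model.txt", the data list dat and the monitored node "theta" are placeholders, not taken from this thread.

library(R2WinBUGS)
fit <- bugs(data = dat, inits = NULL, parameters.to.save = "theta",
            model.file = "model.txt", n.chains = 1,
            n.iter = 6000, n.burnin = 5000, n.thin = 5)
fit$n.keep            # draws kept per chain after burn-in and thinning
dim(fit$sims.matrix)  # the saved (post-burn-in, thinned) simulations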

Not sure if this is the same for JAGS?


Reply #3: Lisrelchen, posted 2014-6-16 12:38:26
The other answer is correct about BUGS, but it does not apply to JAGS (at least not to rjags; R2jags might be different). I haven't used JAGS directly, but the author of rjags is the creator of JAGS, so I would guess they use the same convention.

In rjags, the jags.model object keeps track of the number of iterations that the chain(s) have been run.

Here is a small model in a file "tmpJags":

model {
    X ~ dnorm(Y, 1)   # likelihood: X is normal with mean Y and precision 1
    Y ~ dt(0, 1, 1)   # prior: t with location 0, precision 1 and 1 df (a Cauchy)
}
Then I run

X <- 1                                           # the single observed datum
jm <- jags.model(file = "tmpJags", data = list(X = X),
                 n.chains = 4, n.adapt = 1000)   # 1000 adaptation iterations per chain
jm consists of 4 chains, each of which has been run 1000 total iterations. Then I do

samps <- coda.samples(jm, "Y", n.iter = 1000, thin = 10)   # monitor Y; keep every 10th of 1000 further draws
Now all 4 of my chains have been run for a total of 2000 iterations each, and I have collected 100 samples from each chain, for 400 total samples saved.
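A quick way to check those counts on the samps object above, using coda accessors (my sketch, not part of the original answer):

library(coda)                 # already attached by rjags, shown for completeness
niter(samps)                  # 100 kept samples per chain (1000 iterations / thin of 10)
nchain(samps)                 # 4 chains
thin(samps)                   # thinning interval recorded in the mcmc.list: 10
niter(samps) * nchain(samps)  # 400 saved samples in total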

To me, it makes sense to do it this way, because for the purposes of monitoring chain convergence you would rather think in terms of total iterations than in terms of the iterations kept after thinning.


Reply #4: oliyiyi, posted 2015-4-28 14:53:23
nice post, thanks!

