Monday, 3 March 2014

Issue with thinning in R2OpenBUGS vs R2jags

While preparing the practicals for our course at the University of Alberta, I've discovered something kind of interesting. I'm sure this is nothing new and actually people who normally use both OpenBUGS and JAGS have already figured this out. 

But since I normally just use JAGS, it took me a bit to see this, so I thought I should post about it...

So: the issue is that when running a Bayesian model based on MCMC (eg Gibbs sampling), often it helps to improve convergence if the chains are "thinned" $-$ basically, instead of saving all the successive iterations of the process, only 1 on $s$ are stored and this generally reduces autocorrelation.

R2jags (which is the R library to interface R and JAGS) lets you select the total number of iterations you want to run, the number of "burn-in" iterations (which will be discarded) and the thinning factor. So a command
m <- jags(..., n.iter=20000, n.burnin=9500, n.thin=21, n.chains=2)
will generate two chains, each with 20000 iterations, discard the first 9500, which of course leaves 10500 iterations, and then save only one every 21 of these $-$ a total of 500 iterations per chain.

Conversely, R2OpenBUGS thinks in slightly different terms: as the help (which to be fair I've never bothered reading...) says "The thinning is implemented in the OpenBUGS update phase, so thinned samples are never stored, and they are not counted in n.burnin or n.iter. Setting n.thin=2, doubles the number of iterations OpenBUGS performs, but does not change n.iter or n.burnin". 

So the command 

m <- bugs(...,  n.iter=20000, n.burnin=9500, n.thin=21, n.chains=2)
will actually save 21000 iterations altogether (10500 per chain).

I realised this because the same relatively simple model was taking for ever when run using OpenBUGS and was really quick in JAGS $-$ no kidding!...

5 comments:

  1. Right, and the thinning argument behaves differently yet again in runjags (which I highly recommend).

    ReplyDelete
  2. Thanks John for pointing this out. I think in the end it's not a bad thing per se that these libraries have different ways of working $-$ after all I suppose they were written according to their developers' need and preferences... But I think these differences should be really advertised and clarified!

    ReplyDelete
  3. I think you may have a misconception. Thinning does not "improve convergence", nor does it "reduce autocorrelation" of the chain itself, only of the saved states. Thinning will in fact reduce the accuracy of your estimates.

    You might still want to thin, for one or both of two reasons: You might not have enough memory or disk space to store all the un-thinned iterations, and you might not want to spend the computation time needed to actually look at them all when using the states from the chain to estimate things. If you thin a chain at an interval that is small compared to the autocorrelation time (of the unthinned chain), you only lose a small amount of accuracy, so it may be desirable.

    ReplyDelete
  4. Thanks for this clarification, Radford!

    ReplyDelete
  5. Blog is very helpfull! Thank you for the blog. Thinning is the removal of a plant or part of a plant or tree in agriculture to help them grow better. Thinning involves the removal of some parts of a plant or tree, such as branches, buds and roots, but this does not mean cutting down the whole tree, a process commonly referred to as pruning. There are five methods of thinning that taper from below, taper at the top, diameter, geometrically, and freely. Thinning is also necessary to provide the resources needed for forest trees. Thinning helps trees get the resources they need to grow, such as access to water, soil nutrients and light. Thinning also helps prevent trees from insects and diseases. Light and space also help trees to grow, improve access to wildlife habitats and food, and develop biodiversity in general. Thinning also reduces the risk of fires on the "dry side".

    ReplyDelete