Running OpenBUGS through R
This is just a quick note on using OpenBUGS in R. This is particularly handy if you need to run a series of models on a data set, or if you need to manipulate the output of Bayesian analyses in R in order to produce pretty pictures, compare models etc.
You have to get R running first. If you’re in Windows, just open it by double-clicking as usual. If you’re in Linux you are probably not running as an administrator (at least, I hope not). However, you will have to invoke your superuser powers to install packages etc. To do this, just open a terminal and type:
sudo R
You will be prompted to enter your password. Just remember that you’re running as an admin when using R, so be a bit sensible.
As usual, we start by setting the working directory:
setwd("/home/Monkey/Documents/Statistical methods")
The package that allows R to interact with OpenBUGS is BRugs, so you need to install that and open the library
install.packages("BRugs")
library("BRugs")
The detailed instructions for the BRugs package are here. There’s even example code, so you probably don’t really need this!
As OpenBUGS works by compiling code describing the statistical model with code containing the data to be analysed, you need to read in these files and then compile them. Now, the main point of using OpenBUGS through R is to simplify (or automate) the data manipulation and analysis steps. However, to keep things simple let’s start by assuming that you have written the model code and have the data already prepared.
First, we check the model. This also tells BRugs where the model code is so that it can compile it with the data later. In this example, the file is in the working directory and is called “Random effects binomial likelihood logit link”:
modelCheck("Random effects binomial likelihood logit link.txt")
You should get some text confirming that the file is okay (“model is syntactically correct”). If not, open it up and take a look.
That really isn’t the neatest name for a file, but I like to store models using descriptive names with a consistent structure. I find this makes it much easier to re-use them.
Then, we read in the data. This is one of the things people often get wrong when they first use WinBUGS or OpenBUGS. If you have two sets of data in two formats, as you often do, it is easiest to load them as separate files. My data are in two files – one gives the basic parameters for the analysis (number of studies, number of treatments etc.) and the other the data from the studies I am meta-analysing:
modelData("DATA1.txt")
modelData("DATA2.txt")
Again, you should get text confirming that the files have been read in successfully (“data loaded”). You can see that the commands are very intuitive once you know how OpenBUGS works. This was quite a relief, I have to admit.
The next step is to compile the model and data. To do this, we have to tell BRugs how many chains we want to run. I usually use at least 3 so I can see the chains mixing (mixing statistics are no supplement for visual inspection in my opinion):
modelCompile(numChains=3)
Again, you should get some reassuring text (“model compiled”). Incidentally, if you don’t know what I mean by chains, mixing etc. shout and I’ll put something together explaining how OpenBUGS works.
The model is now compiled, but we can’t run it yet because there are no initial values set. You have two options for this. The first, and most responsible, is to load a file containing initial values:
modelInits("Initial conditions.txt")
You will probably have to do this if you have a particularly complex model as it may otherwise take a while for chains to mix or just not work. Having said that, I tend to think that a model that is highly sensitive to initial conditions is a bit of a worry anyway! Meta-analytic models are rarely complicated enough for this to be a genuine issue.
An alternative approach, and the one I usually use when I’m playing around, is to just get OpenBUGS to generate some initial conditions:
modelGenInits()
If you do it this way, you get some text “initial values generated, model initialized”.
The model can now be run. We usually use a burn-in to get the chains to converge. One of the challenges of Bayesian analysis using MCMC (Markov chain Monte-Carlo) methods (as used in OpenBUGS) can be ensuring that the simulation chains have converged and that it is safe to sample from the chains. The way you handle this may depend on the sort of models you produce. In this case, I’m going to sample my main parameters of interest and use a long burn-in to ensure convergence because the model will run very quickly. I can then look at the results later on and check for convergence.
I set the parameters that I want to monitor. We often refer to these as ‘nodes’. That may be a bit confusing, but it reflects the fact that these models are often described using Directed Acyclic Graphs (DAGs) and graph theory refers to ‘nodes’ and ‘edges’:
samplesSet("d")
samplesSet("delta")
modelUpdate(100000)
Once the update has run, you will get some text saying “x updates took y s”. To give you an idea of how quick this is, 100,000 updates of a standard meta-analytic model typically takes less than a minute on my computer.
I would usually recommend doing these stages as we have just done, at least initially, because something is bound to go wrong and it is much easier to fix things if you can work out where the error is. However, once you’ve got your model code and data working, BRugs is particularly cool in that the whole process can be reduced to a single line of code, such as:
BRugsFit(modelFile = "Random effects binomial likelihood logit link.txt", data = c("DATA1.txt", "DATA2.txt"), numChains = 3, parametersToSave = c("d", "delta", "mu"), nBurnin = 10000, nIter = 100000, nThin=1, DIC=TRUE, working.directory="/home/Monkey/Documents/Statistical methods", digits = 5)
If you are running the single line of code in order to compare models, make sure you plot the posterior distributions of your parameters to check that the mean is a reasonable measure of central tendency (i.e. that the distribution only has one peak and is not too skewed). Otherwise, the Deviance Information Criterion may be unreliable.
Plotting data directly in BRugs can, I find, be rather slow. Plus, I like to play with the data and, sometimes, use it. Therefore, I usually export it in a format the CODA package can read. I also save it for posterity:
samplesCoda(node="d", stem="/home/Monkey/Documents/Statistical methods/Analysis1")
I’ll leave details on the use of CODA data for another post.
That was a quick run-through. If you’ve got any questions, do let me know.
Leave a Reply
You must be logged in to post a comment.