Monday, 10 September 2012

INLA functions (continued)

I have polished up one of the two functions I've thought of implementing for INLA and it's now available in the development version of the R package. So if you've got INLA installed in your R version (see how you can do it here, if you don't), you can update it to the development version by typing

inla.upgrade(testing=TRUE)

This will add the new function inla.contrib.sd which can be used to express the uncertainty in the structured ("random") effects of your INLA model in terms of the standard deviation, instead of the precision (which INLA gives by default).

In the help, I consider a simple example: I simulate $N=12$ Binomial trials. For each, first I simulate the sample size to be a number between 80 and 100 (subjects)

n=12
Ntrials = sample(c(80:100), size=n, replace=TRUE)

Then I simulate the actual observations $y$ as Binomial variables depending on the sample size Ntrials and a probability prob which in turn depends on a variable $\eta\sim\mbox{Normal}(0,0.5)$, defined as $$\mbox{prob} = \frac{\exp(\eta)}{1+\exp(\eta)}$$
which I do with the code

eta = rnorm(n,0,0.5)
prob = exp(eta)/(1 + exp(eta))
y = rbinom(n, size=Ntrials, prob = prob)

At this point, I define a very simple hierarchical model where each trial represents a clustering unit (ie I assume a trial-specific effect). You do this in INLA by using the command

data=data.frame(y=y,z=1:n)
formula=y~f(z,model="iid")
m=inla(formula,data=data,family="binomial",Ntrials=Ntrials)
summary(m)

This produces an INLA object m and the standard summary is
Call:
"inla(formula = formula, family = \"binomial\", data = data, Ntrials = Ntrials)"

Time used:
 Pre-processing    Running inla Post-processing           Total 
     0.20580840      0.02721024      0.78909874      1.02211738 

Fixed effects:
                 mean        sd 0.025quant  0.5quant 0.975quant          kld
(Intercept) 0.1815634 0.1606374 -0.1371095 0.1815601  0.5003695 5.379434e-05

Random effects:
Name  Model   Max KLD 
z   IID model 

Model hyperparameters:
                mean   sd     0.025quant 0.5quant 0.975quant
Precision for z  4.506  2.300  1.508      4.032   10.290    

Expected number of effective parameters(std dev): 10.16(0.5718)
Number of equivalent replicates : 1.181 

Marginal Likelihood:  -56.18 

Now, while the summary of the posterior distribution for the precision of the structured effect $z$ is of course just as informative, it is in general (more) difficult to interpret, because it is on the scale of 1/standard deviation. On the other hand, you can't just take take the reciprocal of the (square-rooted) summaries to obtain info about the posterior distribution of the standard deviation, because the transformation is not linear and thus it does not directly apply to the moments and quantiles.

One easy way out is to sample from the posterior marginal of the precision, transform the draws from this distribution to draws of the resulting distribution for $\sigma = \frac{1}{\sqrt{\mbox{precision}}}$ and then summarise that posterior marginal distributions. That's what inla.contrib.sd does

s <- inla.contrib.sd(m)

[You need to specify the INLA model, m in this case, and you can also input the number of draws you require from the posteriors $-$ the default is nsamples=1000].
The resulting object s contains two elements: 

s$hyper

returns a table

              mean       sd      2.5%     97.5%
sd for z 0.5096883 0.125651 0.3132221 0.7948447

with the summary on the standard deviation scale. The other element s$samples contains the simulated values from the posterior, which can be used for example to draw a histogram of the distribution

hist(s$samples,xlab="standard deviation for z",main="")

which in this case produces the following graph.


No comments:

Post a Comment