Licensed under the Creative Commons Attribution-NonCommercial license. Please share & remix noncommercially, mentioning its origin.

library(MCMCglmm)
library(lme4)
library(brms)
library(coda)
library(lattice)
library(reshape2) ## for melt()
## devtools::install_github("mjskay/tidybayes")
## library(tidybayes) ## optional
library(ggplot2); theme_set(theme_bw())
load("../data/starling.RData")   ## loads "dataf"

As preliminary context for the difficulties we will experience below, run a standard lmer model (stmass~mnth*roostsitu+(1|subject)) and plot the likelihood profile for the variance components. Neither of the MLEs is exactly at zero, but the 95% confidence interval for the among-subject standard deviation definitely includes zero (cutoff shown in red). Even the 50% CI includes zero (dashed line).
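The profiling code isn't shown here; a minimal sketch of that preliminary step might look like this (using the lme4 and lattice packages loaded above):

lmer1 <- lmer(stmass~mnth*roostsitu+(1|subject), data=dataf)
pp <- profile(lmer1, which="theta_")  ## profile only the variance components
confint(pp)                           ## .sig01 is the among-subject SD; lower limit reaches zero
confint(pp, level=0.5)                ## the 50% interval also reaches zero
xyplot(pp)                            ## profile plots for the variance components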

Analyze starling data with MCMCglmm

MCMCglmm does a Bayesian analysis, which can be useful for more flexible inference, and offers more flexible covariance structures, among other things. For basic models the syntax is very similar to lme4/nlme. (The random effect is specified separately, as in lme.)

mcmcglmm1 <- MCMCglmm(stmass~mnth*roostsitu,
                      random=~subject,
                      data=dataf,
                      verbose=FALSE)

We use verbose=FALSE to turn off the progress messages, which would be ugly in this document but are generally useful.

For MCMC approaches, it is your responsibility to check that the chain(s) are well-behaved.
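Besides the graphical checks below, coda (loaded above) provides numerical diagnostics; a quick sketch:

effectiveSize(mcmcglmm1$Sol)  ## effective sample size for each fixed-effect parameter
geweke.diag(mcmcglmm1$Sol)    ## Geweke Z-scores; |z| > 2 suggests lack of convergence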

There is a plot() method for MCMCglmm objects that shows the trace plots and density plots for all parameters, but it's not very pretty or very flexible. I prefer xyplot() for trace plots and densityplot() for density plots; you have to load the lattice package first, and you have to extract the relevant component of the MCMCglmm object and convert it to an mcmc object.

xyplot(
    as.mcmc(mcmcglmm1$Sol),  ## convert matrix to `mcmc` object
    layout=c(2,4)            ## customize panel arrangement
)

These look the way they should (featureless “white noise”: no trends or slow variation).

You can plot the distributions:

densityplot(mcmcglmm1$Sol)

but I prefer violin plots in this case; first we need to reshape the samples into long format:

md <- reshape2::melt(as.matrix(mcmcglmm1$Sol)) ## change matrix to long form
ggplot(subset(md,Var2!="(Intercept)"),
       aes(Var2,value))+geom_violin(fill="grey")+
    geom_hline(yintercept=0,lty=2)+
    coord_flip()

But the trace plots for the variance components are definitely problematic:

xyplot(
    as.mcmc(mcmcglmm1$VCV),  ## convert matrix to `mcmc` object
    layout=c(2,1)            ## customize panel arrangement
)

Let’s try the simplest brute-force approach, running the model for longer by increasing the nitt parameter from its default of 13,000:

mcmcglmm1L <- MCMCglmm(stmass~mnth*roostsitu,
                       random=~subject,
                       data=dataf,
                       nitt=2e5,
                       verbose=FALSE)

Still bad:

xyplot(
    as.mcmc(mcmcglmm1L$VCV),  ## convert matrix to `mcmc` object
    layout=c(2,1)            ## customize panel arrangement
)
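We can quantify the poor mixing with coda's numerical diagnostics, e.g.:

autocorr.diag(as.mcmc(mcmcglmm1L$VCV))  ## autocorrelation at increasing lags
effectiveSize(as.mcmc(mcmcglmm1L$VCV))  ## an effective size far below nitt confirms poor mixing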

We probably have to change the prior, specifically choosing a prior that doesn't have a big spike at zero. The prior is specified as a nested list: the G element holds the priors for the random-effect variances, with one sub-list per random-effect term, hence the triply nested list here:

mcmcglmm2 <- MCMCglmm(stmass~mnth*roostsitu,
                      random=~subject,
                      data=dataf,
                      prior=list(G=list(list(V=1,nu=5))),
                      verbose=FALSE)
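For reference, here is a sketch of the same prior with all three possible components written out (B for fixed effects, G for random-effect variances, R for the residual variance); the B and R values are my guesses at reasonable weak settings, not values from the original:

prior_full <- list(B=list(mu=rep(0,8), V=diag(8)*1e10),  ## flat-ish prior on the 8 fixed effects
                   G=list(G1=list(V=1, nu=5)),           ## among-subject variance (as above)
                   R=list(V=1, nu=0.002))                ## weak prior on the residual variance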

This works to give us a well-behaved trace plot:

xyplot(
    as.mcmc(mcmcglmm2$VCV),  ## convert matrix to `mcmc` object
    layout=c(2,1)            ## customize panel arrangement
)

densityplot(
    as.mcmc(mcmcglmm2$VCV),  ## convert matrix to `mcmc` object
    layout=c(2,1)            ## customize panel arrangement
)

Parameter-expanded priors

The pathologies of the mathematically convenient conjugate (inverse-gamma/inverse-Wishart) priors can be overcome in MCMCglmm by changing to what are called parameter-expanded priors (see Section 8.0.2 of the “CourseNotes” vignette of the MCMCglmm package: vignette("CourseNotes",package="MCMCglmm"), or here). This is mathematically a bit complicated, but practically it boils down to specifying four parameters (V, nu, alpha.mu, and alpha.V) instead of two. The good news is that you can generally set V=1 and alpha.mu=0, which leaves only two to choose. alpha.V gives the expected variance (so sqrt(alpha.V) is the expected standard deviation), and nu is now equivalent to the degrees of freedom of a \(t\)-distribution. Setting V=1, nu=nu, alpha.mu = 0, alpha.V=s^2 gives a half-\(t\) prior on the standard deviation with scale s and df nu, e.g.:

curve(dt(x/25,df=1),from=0,to=500,ylim=c(0,0.35)) ## shape of a half-t with scale 25, df 1
abline(h=0,lty=2)                                 ## dashed reference line at zero

Set up the prior:

prior2 <- list(G = list(G1 = list(V = 1, nu = 1,
                                  alpha.mu = 0, alpha.V = 25^2)))

Run the model:

mcmcglmm3 <- MCMCglmm(stmass~mnth*roostsitu,
                      random=~subject,
                      data=dataf,
                      prior=prior2,
                      verbose=FALSE)

This looks good, and the posterior is less shifted away from zero than with the inverse-gamma prior:

mm <- as.mcmc(mcmcglmm3$VCV)
xyplot(mm)

densityplot(mm)
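If we want inference on the standard-deviation scale we can just transform the samples; a quick sketch (the VCV columns here are named subject and units, following the model terms):

summary(sqrt(mm[,"subject"]))               ## posterior of the among-subject SD
HPDinterval(as.mcmc(sqrt(mm[,"subject"])))  ## 95% highest posterior density interval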

with brms

brms fits the same model with Stan (Hamiltonian Monte Carlo), using lme4-style formula syntax:

brms1 <- brm(stmass~mnth*roostsitu+(1|subject),
             family=gaussian,
             data=dataf)
## Compiling the C++ model
## Start sampling
## 
## SAMPLING FOR MODEL 'gaussian brms-model' NOW (CHAIN 1).
## 
## Gradient evaluation took 5.7e-05 seconds
## 1000 transitions using 10 leapfrog steps per transition would take 0.57 seconds.
## Adjust your expectations accordingly!
## 
## 
## Iteration:    1 / 2000 [  0%]  (Warmup)
## Iteration:  200 / 2000 [ 10%]  (Warmup)
## Iteration:  400 / 2000 [ 20%]  (Warmup)
## Iteration:  600 / 2000 [ 30%]  (Warmup)
## Iteration:  800 / 2000 [ 40%]  (Warmup)
## Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Iteration: 2000 / 2000 [100%]  (Sampling)
## 
##  Elapsed Time: 0.537984 seconds (Warm-up)
##                0.301486 seconds (Sampling)
##                0.83947 seconds (Total)
## 
## 
## (near-identical sampling output for chains 2, 3, and 4 omitted)
print(brms1)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: stmass ~ mnth * roostsitu + (1 | subject) 
##    Data: dataf (Number of observations: 80) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~subject (Number of levels: 40) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     1.12      0.73     0.05     2.68        898 1.00
## 
## Population-Level Effects: 
##                           Estimate Est.Error l-95% CI u-95% CI Eff.Sample
## Intercept                    83.61      1.39    80.87    86.30       1779
## mnthJan                       7.19      1.83     3.54    10.68       1555
## roostsitunestMbox            -4.18      1.98    -8.01    -0.19       2199
## roostsituinside              -4.99      1.93    -8.77    -1.20       1867
## roostsituother               -8.21      1.94   -12.05    -4.46       2003
## mnthJan:roostsitunestMbox     3.56      2.60    -1.66     8.82       1970
## mnthJan:roostsituinside       2.36      2.60    -2.80     7.42       1879
## mnthJan:roostsituother        1.61      2.63    -3.53     6.75       1697
##                           Rhat
## Intercept                 1.00
## mnthJan                   1.00
## roostsitunestMbox         1.00
## roostsituinside           1.00
## roostsituother            1.00
## mnthJan:roostsitunestMbox 1.00
## mnthJan:roostsituinside   1.00
## mnthJan:roostsituother    1.00
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma     4.13      0.39     3.41     4.95       4000 1.00
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).
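The same responsibility to check the chains applies here (the Rhat and Eff.Sample columns above already look fine); brms has a built-in plot method for trace and density plots:

plot(brms1, ask=FALSE)  ## trace + density plots for all parameters

brms also makes it easy to put an explicit prior on the among-subject standard deviation, analogous to the parameter-expanded half-\(t\) prior above. A sketch (priors on class "sd" are automatically truncated at zero, so student_t acts as a half-\(t\)):

brms2 <- brm(stmass~mnth*roostsitu+(1|subject),
             family=gaussian,
             data=dataf,
             prior=set_prior("student_t(3, 0, 25)", class="sd"))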

Histograms vs densities of posterior distributions

One drawback of density plots is that they don't work very precisely near boundaries, so it can be hard to see whether the mode of the posterior distribution is at zero or not. (This isn't terribly important in a practical sense, but I was curious whether the various priors we set actually shifted the posterior mode away from zero or not.) Histograms can help in this case:

source("../R/histogram.mcmc.R")
histogram(
    as.mcmc(mcmcglmm2$VCV),  ## convert matrix to `mcmc` object
    layout=c(2,1),           ## customize panel arrangement
    nint=30
)

or with ggplot:

mmr <- reshape2::melt(as.matrix(mcmcglmm2$VCV))
ggplot(mmr,aes(value))+
    geom_histogram(binwidth=0.5,colour="black")+
    facet_wrap(~Var2,scales="free")