Licensed under the Creative Commons Attribution-NonCommercial license. Please share & remix noncommercially, mentioning its origin.

library(lme4)
library(glmmTMB)
library(aods3)
library(broom.mixed)
library(dotwhisker)

Model simplification examples

Barr et al. (2013); Bates et al. (2015); Matuschek et al. (2017)

CBPP data

Starting again with the “basic” CBPP model:

data("cbpp",package="lme4")
g1 <- glmer(incidence/size~period+(1|herd),
      data=cbpp,
      weights=size,
      family=binomial)

What happens if we try to fit the maximal model?

g1max <- update(g1,.~.-(1|herd) + (period|herd))
## Error: number of observations (=56) < number of random effects (=60) for term (period | herd); the random-effects parameters are probably unidentifiable

Use glmerControl() to downgrade this check from an error to a warning …

g1max <- glmer(incidence/size~period+(period|herd),
      data=cbpp,
      weights=size,
      family=binomial,
      control=glmerControl(check.nobs.vs.nRE="warning"))
## Warning: number of observations (=56) < number of random effects (=60)
## for term (period | herd); the random-effects parameters are probably
## unidentifiable
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.0312476 (tol =
## 0.001, component 1)
VarCorr(g1max)
##  Groups Name        Std.Dev. Corr                
##  herd   (Intercept) 0.96037                      
##         period2     1.38887  -0.892              
##         period3     1.96166  -0.496  0.835       
##         period4     0.90495  -0.811  0.460 -0.105
eigen(VarCorr(g1max)[[1]])$values
## [1] 5.813944e+00 1.704308e+00 4.005021e-05 1.296521e-07
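
The two near-zero eigenvalues show that the estimated variance-covariance matrix is effectively singular: the data can't support a full 4×4 covariance structure. A principal components analysis of the random-effects covariance tells the same story (a minimal sketch, assuming a recent enough lme4 that provides rePCA()):

summary(rePCA(g1max))  # components with (near-)zero std. dev. indicate overfitting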

Look at ?convergence

source("../R/allFit.R")
af <- allFit(g1max)
## bobyqa : [OK]
## Nelder_Mead : [OK]
## nlminbw : [OK]
## nmkbw : [OK]
## optimx.L-BFGS-B : [OK]
## nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
## nloptwrap.NLOPT_LN_BOBYQA : [OK]

(warnings/messages suppressed)

We can use summary(af) to compare all of these fits.
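
For example (a sketch, assuming the sourced allFit() provides a summary() method like the one now shipped with lme4):

ss <- summary(af)
ss$llik   # log-likelihoods across optimizers
ss$fixef  # fixed-effect estimates
ss$sdcor  # random-effect standard deviations and correlations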

  • (period|herd) vs. (1|herd/period) ([positive] compound symmetry)

\[ (\textrm{intercept}, \textrm{slopes}) \sim \textrm{MVN}\left(\boldsymbol 0, \left[ \begin{array}{cccc} \sigma^2_{\{h|1\}} & . & . & . \\ \sigma_{\{h|1\},\{h|p_{21}\}} & \sigma^2_{\{h|p_{21}\}} & . & . \\ \sigma_{\{h|1\}, \{h|p_{31}\}} & \sigma_{\{h|p_{21}\},\{h|p_{31}\}} & \sigma^2_{\{h|p_{31}\}} & . \\ \sigma_{\{h|1\} ,\{h|p_{41}\}} & \sigma_{\{h|p_{21}\},\{h|p_{41}\}} & \sigma_{\{h|p_{31}\},\{h|p_{41}\}} & \sigma^2_{\{h|p_{41}\}} \end{array} \right] \right) \] (= \(n(n+1)/2 = (4\times 5)/2 = 10\) parameters) vs. \[ \left[ \begin{array}{cccc} \sigma^2 & . & . & . \\ \rho \sigma^2 & \sigma^2 & . & . \\ \rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . \\ \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2 \\ \end{array} \right] \] where \(\sigma^2 = \sigma^2_{\{h|1\}}+\sigma^2_{\{h:p|1\}}\), \(\rho = \sigma^2_{\{h|1\}}/\sigma^2\) (= 2 parameters; \(\rho\) must be \(\geq 0\))

g1cs <- update(g1max,
               . ~ . - (period|herd) + (1|herd/period))

The latter model is called a compound symmetry (CS) model: the variances are all equal and the covariances/correlations between all pairs of periods are equal. It is a slightly restricted version of compound symmetry because, the way we have set it up, only non-negative correlations are possible. In general this is a good way to simplify variation of factor effects across groups when the factor has many levels and it is plausible to treat the levels as exchangeable. The simplified (CS) model works fine in this example, but (because there is only one observation per herd per period here) it is equivalent to an observation-level random effect!
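
To convince yourself of this equivalence (a minimal sketch; obs is a new observation-level factor, not part of the original data), compare the log-likelihoods:

cbpp$obs <- factor(seq(nrow(cbpp)))
g1olre <- update(g1, . ~ . + (1 | obs))
logLik(g1cs)    # should (nearly) match, up to optimizer tolerance
logLik(g1olre)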

gopher tortoise data

load("../data/gopherdat2.RData")
Gdat$obs <- factor(seq(nrow(Gdat)))

Our desired maximal model would be something like

glmer(shells~prev+offset(log(Area))+(1|year)+(1|Site)+(1|obs),
      family=poisson,data=Gdat)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: poisson  ( log )
## Formula: shells ~ prev + offset(log(Area)) + (1 | year) + (1 | Site) +  
##     (1 | obs)
##    Data: Gdat
##      AIC      BIC   logLik deviance df.resid 
##  90.0359  97.0419 -40.0179  80.0359       25 
## Random effects:
##  Groups Name        Std.Dev. 
##  obs    (Intercept) 0.000e+00
##  Site   (Intercept) 1.089e-05
##  year   (Intercept) 1.163e-01
## Number of obs: 30, groups:  obs, 30; Site, 10; year, 3
## Fixed Effects:
## (Intercept)         prev  
##    -3.52835      0.02107

or

glmmTMB(shells~prev+offset(log(Area))+(1|year)+(1|Site),
      family=nbinom2,data=Gdat)
## Warning in fitTMB(TMBStruc): Model convergence problem; extreme or very
## small eigen values detected. See vignette('troubleshooting')
## Warning in fitTMB(TMBStruc): Model convergence problem; false convergence
## (8). See vignette('troubleshooting')
## Formula:          
## shells ~ prev + offset(log(Area)) + (1 | year) + (1 | Site)
## Data: Gdat
##       AIC       BIC    logLik  df.resid 
##  90.03588  97.04186 -40.01794        25 
## Random-effects (co)variances:
## 
## Conditional model:
##  Groups Name        Std.Dev. 
##  Site   (Intercept) 1.593e-08
##  year   (Intercept) 1.164e-01
## 
## Number of obs: 30 / Conditional model: Site, 10; year, 3
## 
## Overdispersion parameter for nbinom2 family (): 2.57e+07 
## 
## Fixed Effects:
## 
## Conditional model:
## (Intercept)         prev  
##    -3.52839      0.02107

Problems: in both fits the observation-level and Site variances are estimated as essentially zero, the nbinom2 overdispersion parameter is enormous (i.e. no evidence of overdispersion), and glmmTMB warns about convergence.

We ended up with this …

gmod_lme4_L <- glmer(shells~prev+offset(log(Area))+factor(year)+(1|Site),
      family=poisson,data=Gdat,
      control=glmerControl(optimizer="bobyqa",
                           check.conv.grad=.makeCC("warning",0.05)))

(see ecostats example)
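
Before moving on it is worth re-checking the final Poisson fit for overdispersion (the same checks used in the next example; output not shown):

deviance(gmod_lme4_L)/df.residual(gmod_lme4_L)
aods3::gof(gmod_lme4_L)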

Arabidopsis data

Try the full model (except that we use only a simple fixed effect for the top-level region, reg, which has only 3 levels):

load("../data/Banta.RData")
t0 <- system.time(
    mp1 <- glmer(total.fruits ~ nutrient*amd +
                 reg + rack + status +
                 (amd*nutrient|popu)+
                 (amd*nutrient|gen),
             data=dat.tf,
             family="poisson")
)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.00929584 (tol =
## 0.001, component 1)

Inspect for singularity/overdispersion:

eigen(VarCorr(mp1)$gen)$values
## [1] 2.16692409 0.45254513 0.12343780 0.02322956
eigen(VarCorr(mp1)$popu)$values
## [1] 1.662864e+00 1.070736e-01 1.131348e-07 3.371084e-08
deviance(mp1)/df.residual(mp1) ## !!
## [1] 23.33933
aods3::gof(mp1)
## D  = 13910.24, df = 596, P(>D) = 0
## X2 = 15079.57, df = 596, P(>X2) = 0

The residual deviance is far larger than the residual degrees of freedom, i.e. the model is badly overdispersed, so add an observation-level random effect (and switch the optimizer to BOBYQA):

dat.tf$obs <- 1:nrow(dat.tf)
t1 <- system.time(
    mp1X <- update(mp1,
                   . ~. + (1|obs),
                   control=glmerControl(optimizer="bobyqa"))
)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : unable to evaluate scaled gradient
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge: degenerate Hessian with 1 negative
## eigenvalues
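
As before, the degenerate-Hessian warning suggests checking the random-effects covariance matrices for singularity (output not shown):

eigen(VarCorr(mp1X)$popu)$values
eigen(VarCorr(mp1X)$gen)$values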

This model has 28 parameters and takes 26 seconds … Is glmmTMB faster?

t2 <- system.time(
    mp1g <- glmmTMB(total.fruits ~ nutrient*amd +
                        reg + rack + status +
                        (amd*nutrient|popu)+
                        (amd*nutrient|gen)+
                        (1|obs),
                    data=dat.tf,
                    family="poisson")
)
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-
## definite Hessian matrix. See vignette('troubleshooting')
## Warning in fitTMB(TMBStruc): Model convergence problem; false convergence
## (8). See vignette('troubleshooting')

Better (about 5 seconds), although glmmTMB doesn’t handle singular fits quite as gracefully …
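
We can still inspect the glmmTMB fit’s variance components directly; near-singular terms show up as tiny standard deviations and/or correlations of ±1 (output not shown):

VarCorr(mp1g)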

mp1X2 <- update(mp1X,
                . ~ . - (amd*nutrient|popu)
                - (amd*nutrient|gen)
                + (amd+nutrient|popu)
                + (amd+nutrient|gen))

Ugh, this looks even worse: all the remaining correlations are now ±1 …

VarCorr(mp1X2)
##  Groups Name        Std.Dev. Corr         
##  obs    (Intercept) 1.419975              
##  gen    (Intercept) 0.482349              
##         amdclipped  0.036182 -1.000       
##         nutrient8   0.312491 -1.000  1.000
##  popu   (Intercept) 0.434531              
##         amdclipped  0.075766 1.000        
##         nutrient8   0.138279 1.000  1.000

Use afex::lmer_alt to split the factor variables into separate numeric dummy terms, since lme4’s double-bar (||) syntax does not suppress correlations among the components of a factor …

mp1X3 <- afex::lmer_alt(total.fruits ~ nutrient*amd +
                            reg + rack + status +
                            (amd*nutrient||popu)+
                            (amd*nutrient||gen)+
                            (1|obs),
                        data=dat.tf,
                        family="poisson")
## Registered S3 method overwritten by 'car':
##   method           from
##   influence.merMod lme4
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.137199 (tol =
## 0.001, component 1)
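
What lmer_alt does is roughly equivalent to building numeric dummy variables by hand and using them in a double-bar term (a sketch with hypothetical column names, based on the dummy names amdclipped and nutrient8 in the output above):

# hypothetical manual expansion of the factors into numeric dummies
dat.tf$amd_clipped <- as.numeric(dat.tf$amd == "clipped")
dat.tf$nutrient_8  <- as.numeric(dat.tf$nutrient == "8")
# (amd_clipped*nutrient_8 || gen) would then expand to independent terms:
# (1|gen) + (0+amd_clipped|gen) + (0+nutrient_8|gen) + (0+amd_clipped:nutrient_8|gen)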

Strip down further:

mp1X4 <- update(mp1X2,
                . ~ . - (amd+nutrient|popu)
                - (amd+nutrient|gen)
                + (nutrient|popu)
                + (1|gen))
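
Check the variance components again (output not shown):

VarCorr(mp1X4)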

There are still correlations of +1; reduce further:

mp1X5 <- afex::lmer_alt(total.fruits ~ nutrient*amd +
                            reg + rack + status +
                            (1|gen)+
                            (1+nutrient||popu)+
                            (1|obs),
                        data=dat.tf,
                        family="poisson")
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge with max|grad| = 0.0201128 (tol =
## 0.001, component 1)
dwplot(list(mp1,mp1X3,mp1X4,mp1X5),effect="fixed")+
    geom_vline(xintercept=0,lty=2)

References

Barr, Dale J., Roger Levy, Christoph Scheepers, and Harry J. Tily. 2013. “Random Effects Structure for Confirmatory Hypothesis Testing: Keep It Maximal.” Journal of Memory and Language 68 (3): 255–78. doi:10.1016/j.jml.2012.11.001.

Bates, Douglas, Reinhold Kliegl, Shravan Vasishth, and Harald Baayen. 2015. “Parsimonious Mixed Models.” ArXiv:1506.04967 [Stat], June. http://arxiv.org/abs/1506.04967.

Matuschek, Hannes, Reinhold Kliegl, Shravan Vasishth, Harald Baayen, and Douglas Bates. 2017. “Balancing Type I Error and Power in Linear Mixed Models.” Journal of Memory and Language 94: 305–15. doi:10.1016/j.jml.2017.01.001.