cc Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.

## Warning: S3 methods 'ggplot2::autoplot.zoo', 'ggplot2::fortify.zoo' were
## declared in NAMESPACE but not found

Model specification

Distribution/family

  • going to assume Gaussian (Normal) here
  • may need to transform
  • we care about the conditional distribution, not the marginal
  • Box-Cox not implemented: could try MASS::boxcox() on residuals
  • brute-force: if it makes sense or if residuals look bad, try log/sqrt?
  • bias \(\gg\) heteroscedasticity/outliers \(\gg\) Normality

Look at your data!

  • use table() to confirm how many observations per level of factor
    • which factors covary?
    • balanced?
  • plot the data

Random effects: reminder (Bolker 2015)

  • don’t want to test differences across specific groups
  • lots of groups (>5), few observations per group, unbalanced
  • exchangeable groups

Formulas

  • random effects specified with | as (a|g1) + (b|g2) + ...
  • right-hand side is the grouping variable (always categorical, usually a factor)
  • left-hand side is the varying term (most often 1)s
  • terms separated by + are independent
equation formula
\(β_0 + β_{1}X_{i} + e_{si}\) n/a (Not a mixed-effects model)
\((β_0 + b_{S,0s}) + β_{1}X_i + e_{si}\) ∼ X + (1∣Subject)
\((β_0 + b_{S,0s}) + (β_{1} + b_{S,1s}) X_i + e_{si}\) ~ X + (1 + X∣Subject)
\((β_0 + b_{S,0s} + b_{I,0i}) + (β_{1} + b_{S,1s}) X_i + e_{si}\) ∼ X + (1 + X∣Subject) + (1∣Item)
As above, but \(S_{0s}\), \(S_{1s}\) independent ∼ X + (1∣Subject) + (0 + X∣ Subject) + (1∣Item)
\((β_0 + b_{S,0s} + b_{I,0i}) + β_{1}X_i + e_{si}\) ∼ X + (1∣Subject) + (1∣Item)
\((β_0 + b_{I,0i}) + (β_{1} + b_{S,1s})X_i + e_{si}\) ∼ X + (0 + X∣Subject) + (1∣Item)

Modified from: http://stats.stackexchange.com/questions/13166/rs-lmer-cheat-sheet?lq=1 (Livius)

Formulas, interactions, nesting etc.

a*b = main effects plus interaction, a:b = interaction only, a/b = a + a:b

  • Nested: (1|f/g) equivalent to (1|f) + (1|f:g). e.g. subplots vary within plots (but “subplot 1 of every plot” isn’t meaningful)
  • Crossed: (1|f) + (1|g). e.g. years vary, and plots vary independently
  • Crossed+: (1|f) + (1|g) + (1|f:g). e.g. years vary, and plots vary independently, and plots also vary within years (for LMMs, assumes >1 observation per plot/year combination). ((1|f*g) should be allowed but …)

Don’t need explicit nesting if your sub-groups are uniquely labeled (i.e. A1, A2, …, B1, B2, …)

What is the maximal model?

  • Which effects vary within which groups?
  • If effects don’t vary within groups, then we can’t estimate among-group variation in the effect
    • convenient
    • maybe less powerful (among-group variation is lumped into residual variation)
  • e.g. female rats exposed to different doses of radiation, multiple pups per mother, multiple measurements per pup (labeled by time). Maximal model … ?

Maximal model often won’t work

Random-slopes models: what does (x|g) really do?

  • equivalent to (1+x|g)
  • both intercept (baseline) and slope vary across groups
  • estimates bivariate zero-centered distribution:

\[ (\textrm{intercept}, \textrm{slope}) = \textrm{MVN}\left(\boldsymbol 0, \left[ \begin{array}{cc} \sigma^2_{\textrm{int}} & \sigma_{\textrm{int},\textrm{slope}} \\ \sigma_{\textrm{int},\textrm{slope}} & \sigma^2_{\textrm{slope}} \end{array} \right] \right) \]


  • maximal model can get very complicated (e.g. (1+x+y+z+...|g): \(n\) effects require \(n(n+1)/2\) variances + covariances

Other FAQs

  • Can you have continuous variables as REs?
    A: yes, as varying terms, but not as grouping variables
  • Can a variable be in the model as both FE and RE?
    A: only in the special case where \(x\) is numeric but discrete (e.g. year) and >1 observation per \(x\) value;
    FE describes overall trend, RE describes variation around the trend

What is a practical model?

  • Fits aren’t singular
  • singular = zero variances, +/- 1 correlations
  • More subtle for larger models:
    all(abs(getME(x,"theta"))>1e-4)

Why are fits singular?

Essentially, because the observed among-group variation is less than the expected among-group variation (= \(\sigma^2_{\mbox{among}} + \sigma^2_{\mbox{within}}/n\)). More generally, because some dimension of the variance-covariance matrix has zero extent …

Simplified versions of models

  • (1|b/a) ([positive] compound symmetry) vs. (a|b): \[ (\textrm{intercept}, \textrm{slope}) = \textrm{MVN}\left(\boldsymbol 0, \left[ \begin{array}{ccccc} \sigma^2_{\{b|1\}} & . & . & . & . \\ \sigma_{\{b|1\},\{b|a_{21}\}} & \sigma^2_{\{b|a_{21}\}} & . & . & . \\ \sigma_{\{b|1\}, \{b|a_{31}\}} & \sigma_{\{b|a_{21}\},\{b|a_{31}\}} & \sigma^2_{\{b|a_{31}\}} & . & . \\ \sigma_{\{b|1\} ,\{b|a_{41}\}} & \sigma_{\{b|a_{21}\},\{b|a_{41}\}} & \sigma_{\{b|a_{31}\},\{b|a_{41}\}} & \sigma^2_{\{b|a_{41}\}} & . \\ \sigma_{\{b|1\},\{b|a_{51}\}} & \sigma_{\{b|a_{21}\},\{b|a_{51}\}} & \sigma_{\{b|a_{31}\},\{b|a_{51}\}} & \sigma_{\{b|a_{41}\},\{b|a_{51}\}} & \sigma^2_{\{b|a_{51}\}} \end{array} \right] \right) \] (=\((n(n+1))/2 = (4\times 5)/2 = 10\) parameters) vs. \[ \left[ \begin{array}{ccccc} \sigma^2 & . & . & . & . \\ \rho \sigma^2 & \sigma^2 & . & . & . \\ \rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . & . \\ \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . \\ \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2 \end{array} \right] \] where \(\sigma^2 = \sigma^2_{\{b|1\}}+\sigma^2_{\{a:b|1\}}\), \(\rho = \sigma^2_{\{b|1\}}/\sigma^2\) (=2 parameters; \(\rho\) must be >0)

  • (1+x+y+z||b)
    • independent terms
    • expands to (1|b) + (0+x|b) + ...
    • lme4 version only works properly for continuous predictors
    • afex::mixed can do this
    • independent model is no longer invariant to shifts/reparameterization
    • \(n\) instead of \(n(n+1)/2\) parameters
  • RE “nested within” FE, e.g. if not enough groups at the top level; y ~ g1 + (1|g1:g2); more parameters but fixed rather than random

Restricted maximum likelihood (REML)

  • reduce bias in estimated variances and covariances
  • similar to dividing by \(n-1\) instead of \(n\) when computing sample variances
  • example: paired \(t\)-test
  • works by

Fit model

m1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

Diagnostics: residual scale

Similar to generalized linear models: fitted vs. residual, scale-location, Q-Q

aa <- augment(m1)
## fitted vs resid
plot(m1,type=c("p","smooth"))
## scale-location
plot(m1,sqrt(abs(resid(.)))~fitted(.),type=c("p","smooth"))
## q-q
lattice::qqmath(m1,id=0.05)

Or with augment and ggplot …

aa <- augment(m1)
## fitted vs resid
ggplot(aa,aes(.fitted,.resid))+geom_point()+geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## scale-location
ggplot(aa,aes(.fitted,sqrt(abs(.resid))))+geom_point()+geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## q-q
ggplot(aa)+stat_qq(aes(sample=.resid))

The broom/broom.mixed packages

Inference

Likelihood

  • probability of data given model
  • for mixed models, includes the integral over the random effects

Profiles and intervals

Wald approximation

most important:

  • small sample sizes
  • values near boundaries
    • changing scales may help

more important for variance parameters than fixed-effect parameters

Finite-size corrections

e.g. Normal vs. \(t\)

Posterior predictive simulation

  • if there is some value you’re interested in that you can compute from your data (e.g. number of zero observations, or total NPP), you can simulate() it fro the fitted model:

Inference: fixed effects, Wald

  • parameter estimates: summary()
  • termwise tests: car::Anova, afex::anova
  • contrasts/effects/post-hoc tests: emmeans, effects packages
car::Anova(m1)
## Warning: S3 method 'xts::as.xts.data.table' was declared in NAMESPACE but
## not found

Degrees of freedom

  • level-counting: R/calcDenDF.R
  • lmerTest/afex; Satterthwaite, Kenward-Roger
library(lmerTest)
## 
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
## 
##     lmer
## The following object is masked from 'package:stats':
## 
##     step
summary(as_lmerModLmerTest(m1))
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Reaction ~ Days + (Days | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1743.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9536 -0.4634  0.0231  0.4634  5.1793 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Subject  (Intercept) 612.09   24.740       
##           Days         35.07    5.922   0.07
##  Residual             654.94   25.592       
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  251.405      6.825  17.000  36.838  < 2e-16 ***
## Days          10.467      1.546  17.000   6.771 3.26e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.138
summary(as_lmerModLmerTest(m1),ddf="Kenward-Roger")
## Linear mixed model fit by REML. t-tests use Kenward-Roger's method [
## lmerModLmerTest]
## Formula: Reaction ~ Days + (Days | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1743.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9536 -0.4634  0.0231  0.4634  5.1793 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Subject  (Intercept) 612.09   24.740       
##           Days         35.07    5.922   0.07
##  Residual             654.94   25.592       
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  251.405      6.825  17.000  36.838  < 2e-16 ***
## Days          10.467      1.546  17.000   6.771 3.26e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.138

Q: Why not use lmerTest all the time?
A: it can make it a little harder to diagnose fitting problems

Inference: fixed effects, likelihood ratio test

  • individual parameters: profile(), confint()
  • fit pairwise models and use anova() to compare
  • drop1, afex::anova()
confint(m1)
## Computing profile confidence intervals ...
##                   2.5 %     97.5 %
## .sig01       14.3815048  37.715996
## .sig02       -0.4815007   0.684986
## .sig03        3.8011641   8.753383
## .sigma       22.8982669  28.857997
## (Intercept) 237.6806955 265.129515
## Days          7.3586533  13.575919

What took so long?

Inference: parametric bootstrap

Very slow!

  • pbkrtest
  • confint(.,method="boot")

Inference: random effects

  • Wald is probably very bad
  • profile, anova()
  • parametric bootstrap
  • boundary problems

Inference: CIs on predictions etc.

mm <- model.matrix(terms(fitted_model),newdat)
se_pred <- sqrt(diag(mm %*% tcrossprod(vcov(fitted_model),mm)))

emmeans etc.

data("Oats",package="nlme")
m_oats <- lmer(yield~nitro*Variety+(1|Block),data=Oats)
(e1 <- emmeans(m_oats,~nitro|Variety))
## Variety = Golden Rain:
##  nitro   emmean       SE   df lower.CL upper.CL
##    0.3 104.5000 7.124279 6.61 87.45087 121.5491
## 
## Variety = Marvellous:
##  nitro   emmean       SE   df lower.CL upper.CL
##    0.3 109.7917 7.124279 6.61 92.74254 126.8408
## 
## Variety = Victory:
##  nitro   emmean       SE   df lower.CL upper.CL
##    0.3  97.6250 7.124279 6.61 80.57587 114.6741
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95
plot(e1)

e2 <- emmeans(m_oats,~Variety)
## NOTE: Results may be misleading due to involvement in interactions
##
contrast(e2,"pairwise")
##  contrast                  estimate       SE df t.ratio p.value
##  Golden Rain - Marvellous -5.291667 4.469012 61  -1.184  0.4671
##  Golden Rain - Victory     6.875000 4.469012 61   1.538  0.2804
##  Marvellous - Victory     12.166667 4.469012 61   2.722  0.0226
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

also see the effects package.

car::Anova(m_oats,test="F")
## Analysis of Deviance Table (Type II Wald F tests with Kenward-Roger df)
## 
## Response: yield
##                     F Df Df.res    Pr(>F)    
## nitro         81.5155  1     61 7.648e-13 ***
## Variety        3.7268  2     61   0.02972 *  
## nitro:Variety  0.3512  2     61   0.70524    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Bolker, Benjamin M. 2015. “Linear and Generalized Linear Mixed Models.” In Ecological Statistics: Contemporary Theory and Application, edited by Gordon A. Fox, Simoneta Negrete-Yankelevich, and Vinicio J. Sosa. Oxford University Press.