Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.

## Warning: S3 methods 'ggplot2::autoplot.zoo', 'ggplot2::fortify.zoo' were
## declared in NAMESPACE but not found

Model specification

Distribution/family

going to assume Gaussian (Normal) here
may need to transform
we care about the conditional distribution, not the marginal
Box-Cox not implemented: could try MASS::boxcox() on residuals
brute-force: if it makes sense or if residuals look bad, try log/sqrt?
bias \(\gg\) heteroscedasticity/outliers \(\gg\) Normality

Look at your data!

use table() to confirm how many observations per level of factor
- which factors covary?
- balanced?
plot the data

Random effects: reminder (Bolker 2015)

don’t want to test differences across specific groups
lots of groups (>5), few observations per group, unbalanced
exchangeable groups

Formulas

random effects specified with | as (a|g1) + (b|g2) + ...
right-hand side is the grouping variable (always categorical, usually a factor)
left-hand side is the varying term (most often 1)s
terms separated by + are independent

equation	formula
\(β_0 + β_{1}X_{i} + e_{si}\)	n/a (Not a mixed-effects model)
\((β_0 + b_{S,0s}) + β_{1}X_i + e_{si}\)	`∼ X + (1∣Subject)`
\((β_0 + b_{S,0s}) + (β_{1} + b_{S,1s}) X_i + e_{si}\)	`~ X + (1 + X∣Subject)`
\((β_0 + b_{S,0s} + b_{I,0i}) + (β_{1} + b_{S,1s}) X_i + e_{si}\)	`∼ X + (1 + X∣Subject) + (1∣Item)`
As above, but \(S_{0s}\), \(S_{1s}\) independent	`∼ X + (1∣Subject) + (0 + X∣ Subject) + (1∣Item)`
\((β_0 + b_{S,0s} + b_{I,0i}) + β_{1}X_i + e_{si}\)	`∼ X + (1∣Subject) + (1∣Item)`
\((β_0 + b_{I,0i}) + (β_{1} + b_{S,1s})X_i + e_{si}\)	`∼ X + (0 + X∣Subject) + (1∣Item)`

Modified from: http://stats.stackexchange.com/questions/13166/rs-lmer-cheat-sheet?lq=1 (Livius)

Formulas, interactions, nesting etc.

a*b = main effects plus interaction, a:b = interaction only, a/b = a + a:b

Nested: (1|f/g) equivalent to (1|f) + (1|f:g). e.g. subplots vary within plots (but “subplot 1 of every plot” isn’t meaningful)
Crossed: (1|f) + (1|g). e.g. years vary, and plots vary independently
Crossed+: (1|f) + (1|g) + (1|f:g). e.g. years vary, and plots vary independently, and plots also vary within years (for LMMs, assumes >1 observation per plot/year combination). ((1|f*g) should be allowed but …)

Don’t need explicit nesting if your sub-groups are uniquely labeled (i.e. A1, A2, …, B1, B2, …)

What is the maximal model?

Which effects vary within which groups?
If effects don’t vary within groups, then we can’t estimate among-group variation in the effect
- convenient
- maybe less powerful (among-group variation is lumped into residual variation)
e.g. female rats exposed to different doses of radiation, multiple pups per mother, multiple measurements per pup (labeled by time). Maximal model … ?

Maximal model often won’t work

Random-slopes models: what does `(x|g)` really do?

equivalent to (1+x|g)
both intercept (baseline) and slope vary across groups
estimates bivariate zero-centered distribution:

\[ (\textrm{intercept}, \textrm{slope}) = \textrm{MVN}\left(\boldsymbol 0, \left[ \begin{array}{cc} \sigma^2_{\textrm{int}} & \sigma_{\textrm{int},\textrm{slope}} \\ \sigma_{\textrm{int},\textrm{slope}} & \sigma^2_{\textrm{slope}} \end{array} \right] \right) \]

maximal model can get very complicated (e.g. (1+x+y+z+...|g): \(n\) effects require \(n(n+1)/2\) variances + covariances

Other FAQs

Can you have continuous variables as REs?
A: yes, as varying terms, but not as grouping variables
Can a variable be in the model as both FE and RE?
A: only in the special case where \(x\) is numeric but discrete (e.g. year) and >1 observation per \(x\) value;
FE describes overall trend, RE describes variation around the trend

What is a practical model?

Fits aren’t singular
singular = zero variances, +/- 1 correlations
More subtle for larger models:
all(abs(getME(x,"theta"))>1e-4)

Why are fits singular?

Essentially, because the observed among-group variation is less than the expected among-group variation (= \(\sigma^2_{\mbox{among}} + \sigma^2_{\mbox{within}}/n\)). More generally, because some dimension of the variance-covariance matrix has zero extent …

Simplified versions of models

(1|b/a) ([positive] compound symmetry) vs. (a|b): \[ (\textrm{intercept}, \textrm{slope}) = \textrm{MVN}\left(\boldsymbol 0, \left[ \begin{array}{ccccc} \sigma^2_{\{b|1\}} & . & . & . & . \\ \sigma_{\{b|1\},\{b|a_{21}\}} & \sigma^2_{\{b|a_{21}\}} & . & . & . \\ \sigma_{\{b|1\}, \{b|a_{31}\}} & \sigma_{\{b|a_{21}\},\{b|a_{31}\}} & \sigma^2_{\{b|a_{31}\}} & . & . \\ \sigma_{\{b|1\} ,\{b|a_{41}\}} & \sigma_{\{b|a_{21}\},\{b|a_{41}\}} & \sigma_{\{b|a_{31}\},\{b|a_{41}\}} & \sigma^2_{\{b|a_{41}\}} & . \\ \sigma_{\{b|1\},\{b|a_{51}\}} & \sigma_{\{b|a_{21}\},\{b|a_{51}\}} & \sigma_{\{b|a_{31}\},\{b|a_{51}\}} & \sigma_{\{b|a_{41}\},\{b|a_{51}\}} & \sigma^2_{\{b|a_{51}\}} \end{array} \right] \right) \] (=\((n(n+1))/2 = (4\times 5)/2 = 10\) parameters) vs. \[ \left[ \begin{array}{ccccc} \sigma^2 & . & . & . & . \\ \rho \sigma^2 & \sigma^2 & . & . & . \\ \rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . & . \\ \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . \\ \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2 \end{array} \right] \] where \(\sigma^2 = \sigma^2_{\{b|1\}}+\sigma^2_{\{a:b|1\}}\), \(\rho = \sigma^2_{\{b|1\}}/\sigma^2\) (=2 parameters; \(\rho\) must be >0)
(1+x+y+z||b)
- independent terms
- expands to (1|b) + (0+x|b) + ...
- lme4 version only works properly for continuous predictors
- afex::mixed can do this
- independent model is no longer invariant to shifts/reparameterization
- \(n\) instead of \(n(n+1)/2\) parameters
RE “nested within” FE, e.g. if not enough groups at the top level; y ~ g1 + (1|g1:g2); more parameters but fixed rather than random

Restricted maximum likelihood (REML)

reduce bias in estimated variances and covariances
similar to dividing by \(n-1\) instead of \(n\) when computing sample variances
example: paired \(t\)-test
works by

Fit model

m1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)

Diagnostics: residual scale

Similar to generalized linear models: fitted vs. residual, scale-location, Q-Q

aa <- augment(m1)
## fitted vs resid
plot(m1,type=c("p","smooth"))
## scale-location
plot(m1,sqrt(abs(resid(.)))~fitted(.),type=c("p","smooth"))
## q-q
lattice::qqmath(m1,id=0.05)

Or with augment and ggplot …

aa <- augment(m1)
## fitted vs resid
ggplot(aa,aes(.fitted,.resid))+geom_point()+geom_smooth()

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

## scale-location
ggplot(aa,aes(.fitted,sqrt(abs(.resid))))+geom_point()+geom_smooth()

## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

## q-q
ggplot(aa)+stat_qq(aes(sample=.resid))

The broom/broom.mixed packages

tidy: get coefficient values
- effects="fixed"
- effects="ran_vars"
- effects="ran_vals"
- effects="ran_coefs"
glance: get model summaries
augment: get fitted, residuals, possibly predictions

Inference

Likelihood

probability of data given model
for mixed models, includes the integral over the random effects

Profiles and intervals

Wald approximation

most important:

small sample sizes
values near boundaries
- changing scales may help

more important for variance parameters than fixed-effect parameters

Finite-size corrections

e.g. Normal vs. \(t\)

Posterior predictive simulation

if there is some value you’re interested in that you can compute from your data (e.g. number of zero observations, or total NPP), you can simulate() it fro the fitted model:

Inference: fixed effects, Wald

parameter estimates: summary()
termwise tests: car::Anova, afex::anova
contrasts/effects/post-hoc tests: emmeans, effects packages

car::Anova(m1)

## Warning: S3 method 'xts::as.xts.data.table' was declared in NAMESPACE but
## not found

Degrees of freedom

level-counting: R/calcDenDF.R
lmerTest/afex; Satterthwaite, Kenward-Roger

library(lmerTest)

## 
## Attaching package: 'lmerTest'

## The following object is masked from 'package:lme4':
## 
##     lmer

## The following object is masked from 'package:stats':
## 
##     step

summary(as_lmerModLmerTest(m1))

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Reaction ~ Days + (Days | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1743.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9536 -0.4634  0.0231  0.4634  5.1793 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Subject  (Intercept) 612.09   24.740       
##           Days         35.07    5.922   0.07
##  Residual             654.94   25.592       
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  251.405      6.825  17.000  36.838  < 2e-16 ***
## Days          10.467      1.546  17.000   6.771 3.26e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.138

summary(as_lmerModLmerTest(m1),ddf="Kenward-Roger")

## Linear mixed model fit by REML. t-tests use Kenward-Roger's method [
## lmerModLmerTest]
## Formula: Reaction ~ Days + (Days | Subject)
##    Data: sleepstudy
## 
## REML criterion at convergence: 1743.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9536 -0.4634  0.0231  0.4634  5.1793 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Subject  (Intercept) 612.09   24.740       
##           Days         35.07    5.922   0.07
##  Residual             654.94   25.592       
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  251.405      6.825  17.000  36.838  < 2e-16 ***
## Days          10.467      1.546  17.000   6.771 3.26e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.138

Q: Why not use lmerTest all the time?
A: it can make it a little harder to diagnose fitting problems

Inference: fixed effects, likelihood ratio test

individual parameters: profile(), confint()
fit pairwise models and use anova() to compare
drop1, afex::anova()

confint(m1)

## Computing profile confidence intervals ...

##                   2.5 %     97.5 %
## .sig01       14.3815048  37.715996
## .sig02       -0.4815007   0.684986
## .sig03        3.8011641   8.753383
## .sigma       22.8982669  28.857997
## (Intercept) 237.6806955 265.129515
## Days          7.3586533  13.575919

What took so long?

Inference: parametric bootstrap

Very slow!

pbkrtest
confint(.,method="boot")

Inference: random effects

Wald is probably very bad
profile, anova()
parametric bootstrap
boundary problems

Inference: CIs on predictions etc.

mm <- model.matrix(terms(fitted_model),newdat)
se_pred <- sqrt(diag(mm %*% tcrossprod(vcov(fitted_model),mm)))

emmeans etc.

data("Oats",package="nlme")
m_oats <- lmer(yield~nitro*Variety+(1|Block),data=Oats)
(e1 <- emmeans(m_oats,~nitro|Variety))

## Variety = Golden Rain:
##  nitro   emmean       SE   df lower.CL upper.CL
##    0.3 104.5000 7.124279 6.61 87.45087 121.5491
## 
## Variety = Marvellous:
##  nitro   emmean       SE   df lower.CL upper.CL
##    0.3 109.7917 7.124279 6.61 92.74254 126.8408
## 
## Variety = Victory:
##  nitro   emmean       SE   df lower.CL upper.CL
##    0.3  97.6250 7.124279 6.61 80.57587 114.6741
## 
## Degrees-of-freedom method: kenward-roger 
## Confidence level used: 0.95

plot(e1)

e2 <- emmeans(m_oats,~Variety)

## NOTE: Results may be misleading due to involvement in interactions

##
contrast(e2,"pairwise")

##  contrast                  estimate       SE df t.ratio p.value
##  Golden Rain - Marvellous -5.291667 4.469012 61  -1.184  0.4671
##  Golden Rain - Victory     6.875000 4.469012 61   1.538  0.2804
##  Marvellous - Victory     12.166667 4.469012 61   2.722  0.0226
## 
## P value adjustment: tukey method for comparing a family of 3 estimates

also see the effects package.

car::Anova(m_oats,test="F")

## Analysis of Deviance Table (Type II Wald F tests with Kenward-Roger df)
## 
## Response: yield
##                     F Df Df.res    Pr(>F)    
## nitro         81.5155  1     61 7.648e-13 ***
## Variety        3.7268  2     61   0.02972 *  
## nitro:Variety  0.3512  2     61   0.70524    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Bolker, Benjamin M. 2015. “Linear and Generalized Linear Mixed Models.” In Ecological Statistics: Contemporary Theory and Application, edited by Gordon A. Fox, Simoneta Negrete-Yankelevich, and Vinicio J. Sosa. Oxford University Press.

Linear mixed models in R: nitty-gritty

Ben Bolker

12 November 2018

Model specification

Distribution/family

Look at your data!

Random effects: reminder (Bolker 2015)

Formulas

Formulas, interactions, nesting etc.

What is the maximal model?

Random-slopes models: what does `(x|g)` really do?

Other FAQs

What is a practical model?

Why are fits singular?

Simplified versions of models

Restricted maximum likelihood (REML)

Fit model

Diagnostics: residual scale

The broom/broom.mixed packages

Inference

Likelihood

Profiles and intervals

Wald approximation

Finite-size corrections

Posterior predictive simulation

Inference: fixed effects, Wald

Degrees of freedom

Inference: fixed effects, likelihood ratio test

Inference: parametric bootstrap

Inference: random effects

Inference: CIs on predictions etc.

emmeans etc.

Linear mixed models in R: nitty-gritty

Ben Bolker

12 November 2018

Model specification

Distribution/family

Look at your data!

Random effects: reminder (Bolker 2015)

Formulas

Formulas, interactions, nesting etc.

What is the maximal model?

Random-slopes models: what does (x|g) really do?

Other FAQs

What is a practical model?

Why are fits singular?

Simplified versions of models

Restricted maximum likelihood (REML)

Fit model

Diagnostics: residual scale

The broom/broom.mixed packages

Inference

Likelihood

Profiles and intervals

Wald approximation

Finite-size corrections

Posterior predictive simulation

Inference: fixed effects, Wald

Degrees of freedom

Inference: fixed effects, likelihood ratio test

Inference: parametric bootstrap

Inference: random effects

Inference: CIs on predictions etc.

emmeans etc.

Random-slopes models: what does `(x|g)` really do?