Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.
Load packages:
## modeling packages
library("lme4") ## basic (G)LMMs
library("nlme") ## more LMMs
library("glmmTMB")
library("afex") ## helper functions
library("emmeans")
## graphics
library("ggplot2"); theme_set(theme_bw())
## squash panels together
zero_margin <- theme(panel.spacing=grid::unit(0,"lines"))
library("lattice")
library("dotwhisker") ## coefficient plots
## data manipulation
library("broom.mixed")
Data from Toby Marthews, r-sig-mixed-models mailing list: you can find it here.
subject
: individual birdroostsitu
(tree, nest-box, inside, other): location of
nest boxmnth
(Nov, Jan): monthstmass
: mass of bird## cosmetic tweaks
load("data/starling.RData") ## loads "dataf"
ggplot(dataf,aes(x=mnth,y=stmass))+
geom_point()+
geom_line(aes(group=subject))+ ## connect subjects with a line
facet_grid(.~roostsitu)+ ## 1 row of panels by roost
zero_margin ## squash together
You could also try
ggplot(dataf,aes(mnth,stmass,colour=roostsitu))+
geom_point()+
geom_line(aes(group=subject))
Overkill for this data set, but sometimes it can be useful to put every individual in its own facet:
## reorder individuals by magnitude of difference
dataf <- transform(dataf, subject=reorder(subject,stmass,FUN=diff))
ggplot(dataf,aes(mnth,stmass,colour=roostsitu,group=subject))+
geom_point()+geom_line()+facet_wrap(~subject)+
zero_margin
It’s pretty obvious that the starting (November) mass varies among
roost situations (tree/nest-box/etc.), and that mass increases from
November to January, but we might like to know if the slopes differ
among situations. That means our fixed effects would be
~roostsitu*mnth
, with our attention focused on the
roostsitu:mnth
(interaction) term. For random effects, we
can allow both intercept (obviously) and slope (maybe) to vary among
individuals, via (1+mnth|subject)
or equivalent … in this
case, because measurements are only taken in two months, we can also
write the random term as (1|subject/mnth)
.
However, it turns out that we can’t actually estimate random slopes for this model, because every individual is only measured twice. That means that the variance in the slopes would be completely confounded with the residual variance.
If we fit this with lme4
:
lmer1 <- lmer(stmass~mnth*roostsitu+(1|subject/mnth),data=dataf)
## Error: number of levels of each grouping factor must be < number of observations (problems: mnth:subject)
We get an error. We can actually do this
(lmerControl(check.nobs.vs.nlev="ignore")
; see
?lmerControl
for details), and it will actually work, but
it will pick an arbitrary division between the residual variance and the
month:subject
variance.
In principle we can do this model with glmmTMB
, by
dropping the residual variance term (for technical reasons, this isn’t
possible in lme4
):
glmmTMB1 <- glmmTMB(stmass~mnth*roostsitu+(1|subject/mnth),
data = dataf,
## REML = TRUE, ## figure out why this gives a npd Hessian ...
dispformula = ~0)
But now let’s forget about including the (unidentifiable) random slopes.
lmer2 <- lmer(stmass~mnth*roostsitu+(1|subject),data=dataf)
We can now get estimates, although the subject-level random effects
are very uncertain: see confint(lmer2)
Walking through the output:
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: stmass ~ mnth * roostsitu + (1 | subject)
## Data: dataf
## REML criterion at convergence: 429.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
This is a quick reminder of the scaled residuals; these should be approximately unskewed (median \(\approx 0\)) and the extremes should be somewhere around \(\pm 2\) (bigger in a big data set )
##
## Random effects:
## Groups Name Variance Std.Dev.
## subject (Intercept) 0.3444 0.5869
## Residual 17.3500 4.1653
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 83.600 1.330 71.973 62.847 < 2e-16 ***
## mnthJan 7.200 1.863 36.000 3.865 0.000446 ***
## roostsitunest-box -4.200 1.881 71.973 -2.233 0.028685 *
## roostsituinside -5.000 1.881 71.973 -2.658 0.009682 **
## roostsituother -8.200 1.881 71.973 -4.359 4.27e-05 ***
## mnthJan:roostsitunest-box 3.600 2.634 36.000 1.367 0.180244
lmer
does
not tell us the denominator degrees of freedom for the test
(although we can get a rough idea of importance/significance fro the
\(t\) statistics; e.g. \(t>2.5\) will be significant at \(p<0.05\) for 6 or more degrees of
freedom). We’ll come back to this in the inference section.performance::check_model()
gives a relatively complete
set of model diagnostics.
performance::check_model(lmer2)
DHARMa
(diagnostics via simulated residuals: usually
unnecessary for LMMs):
sr2 <- DHARMa::simulateResiduals(lmer2)
plot(sr2)
lme4
also has some built-in plotting capabilities
(skipped during live-coding session)
Basic diagnostic plots: fitted vs. residuals, coloured according to
the roostsitu
variable (col=dataf$roostsitu
),
point type according to month (pch=
).
plot(lmer2,col=dataf$roostsitu,pch=dataf$mnth,id=0.05)
The id
argument specifies to mark points that are beyond
the specified Normal confidence limits (i.e. \(\alpha\)-values) (these are not outliers in
any formal statistical sense but might be points you want to look at if
they are otherwise interesting). By default they are labeled by
random-effect group; use idLabels
to change the default
labels, and idLabels=~.obs
to label by observation
number.
You can get a scale-location plot by specifying
## FIXME: why doesn't this match performance scale-resid plot?
ss <- residuals(lmer2)/(1-hatvalues(lmer2))
xyplot(sqrt(abs(ss)) ~ fitted(lmer2),type=c("p","smooth"))
Adding the smoothed line is helpful because uneven sampling can influence your perception of the pattern.
Q-Q plot (a straight line indicates normality)
qqmath(lmer2,col=dataf$roostsitu,pch=dataf$mnth)
We can also do this with broom.mixed::augment()
and
stat_qq()
(although there are also some some limitations -
colour
argument doesn’t actually work…)
ggplot(augment(lmer2),
aes(sample=.resid/sd(.resid)))+ ## scale to variance=1
stat_qq(aes(group=1,colour=roostsitu))+
geom_abline(intercept=0,slope=1)
## Warning: The following aesthetics were dropped during statistical
## transformation: colour.
## ℹ This can happen when ggplot fails to infer the correct grouping
## structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a
## numerical variable into a factor?
There are some deviations here, but not enough that I would worry very much. In particular, the distribution is slightly thin-tailed (the smallest residuals are largest than expected, and the largest residuals are smaller than expected), which would make the results slightly conservative (a fat-tailed distribution would make them anticonservative).
Boxplots of residuals subdivided by roostsitu
(you have
to put the grouping variable on the left side of the formula
here):
plot(lmer2,roostsitu~resid(.))
Might be easier with augment
:
aa <- augment(lmer2)
ggplot(aa,aes(roostsitu,.resid))+
geom_boxplot()+coord_flip()
Plot random effects to look for outliers:
dotplot(ranef(lmer2))
## $subject
Or with tidy
+ ggplot
:
tt <- tidy(lmer2,effects="ran_vals")
tt <- transform(tt,level=reorder(level,estimate))
ggplot(tt,aes(level,estimate))+
geom_pointrange(aes(ymin=estimate-1.96*std.error,
ymax=estimate+1.96*std.error))+
coord_flip()
In general, try plotting residuals and \(\sqrt{\textrm{abs}(r)}\) both as a function
of the fitted values and as a function of particularly important
predictors.
hatvalues()
function can also be useful (although
all hat values are identical here)influence.ME
package. Try
plot(influence(lmer2,group="subject"))
and interpret the
results …tundra_agg.rda
(from the data
directory given above). The GS.NEE
records the net
ecosystem exchange (in grams of carbon/m^2/year) in a given year and
site; cYear
is a centered version of the year;
n
records the number of observations taken in a given
site/year (the NEE value given is an average of these
observations).ggplot2
if you can, otherwise
with base R graphics) to inspect the datalmer
fit (note this data set
does not have the issue of the starling data set), draw some preliminary
conclusions, and inspect the plot diagnostics.Never start doing inference until you’re finished with diagnostics/satisfied with model fit
The coefficient plot is a good first look:
dotwhisker::dwplot()
does this. Here we use
by_2sd=TRUE
to scale coefficients by 2x the SD of the
predictor variable …
dotwhisker::dwplot(lmer2, by_2sd=TRUE, effects = "fixed")+geom_vline(xintercept=0,lty=2)
(Note that the intercept is automatically dropped; this is usually
the right choice. Use show_intercept=TRUE
to have the
intercept included.) For more complicated coefficient plots, you can use
broom.mixed::tidy()
to get the parameters in a useful form
and dwplot
, or ggplot2
with
geom_pointrange()
, to draw the plot.
Stop and explain to yourself what these parameters mean. If you’re
not sure, try resetting the base level of the roostsitu
factor:
dataf2 <- transform(dataf,roostsitu=relevel(roostsitu,ref="other"))
,
predict what will happen to the results, and re-run the analysis.)
Exercise: This is actually a slightly trivial
example, because there are only two measurements for each individual.
Thus we can actually analyze the slopes and get the same answers by
using a paired analysis, i.e. by computing the mass difference
in subjects and then analyzing with a single-level linear model. (In the
simplest case this would reduce to a paired \(t\)-test, but in this case it is a 1-way
ANOVA on roostsitu
.
Rearrange the data to get differences by subject:
dataf2 <- aggregate(stmass~roostsitu+subject,
data=dataf,
FUN=function(x) x[2]-x[1])
This says to aggregate the data frame dataf
grouped by
roostsitu
and subject
; for each group, compute
the difference of the masses.
Draw a picture (boxplot+beeswarm plot):
ggplot(dataf2,aes(x=roostsitu,y=stmass))+geom_boxplot()+
geom_dotplot(binaxis="y",stackdir="center",fill="red",alpha=0.5,
binwidth=0.5)
As you can see, geom_dotplot()
adds a horizontal dot-plot
for each group (see the documentation [?geom_dotplot
] for
more details).
lm
and convince yourself that the
estimates (fixed-effect coefficients, \(t\) statistics, etc.) are equivalent to
those found from the previous analysis.What about \(p\)-values or more
sophisticated confidence intervals? The default confidence intervals
shown by dwplot()
are just Normal confidence intervals (not
taking into account any of the effects that make these approximate for
mixed models).
We can get profile confidence intervals via
confint(lmer2)
or
tidy(lmer2, conf.method = "profile")
; in this example the
profile confidence intervals are almost identical to the Wald intervals
(dwplot()
’s default).
(Insert long discussion of degrees of freedom and \(p\)-values here.)
We can use lmerTest
or afex::mixed
(which
both wrap functions from the pbkrtest
package) to get
“denominator degrees of freedom”/\(p\)-values.
lmerTest
wraps the lmer
function to
compute some additional information, and modifies the
summary()
function:library("lmerTest")
## or: as(lmer2, "lmerModLmerTest")
lmer2R <- lmer(stmass~mnth*roostsitu+(1|subject),data=dataf)
Capturing just the coefficient table from
summary(lmer2R)
:
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 83.600 1.330 71.973 62.847 < 2e-16 ***
## mnthJan 7.200 1.863 36.000 3.865 0.000446 ***
## roostsitunest-box -4.200 1.881 71.973 -2.233 0.028685 *
## roostsituinside -5.000 1.881 71.973 -2.658 0.009682 **
## roostsituother -8.200 1.881 71.973 -4.359 4.27e-05 ***
## mnthJan:roostsitunest-box 3.600 2.634 36.000 1.367 0.180244
## mnthJan:roostsituinside 2.400 2.634 36.000 0.911 0.368341
## mnthJan:roostsituother 1.600 2.634 36.000 0.607 0.547430
By default lmerTest
uses the Satterthwaite approximation
to the degrees of freedom. You can also use
summary(lmer2R,ddf="Kenward-Roger")
to get a slightly more
accurate (and slower) estimate, but in this case the answers are
basically identical.
For simple cases we can use (a slightly improved version of) the
level-counting algorithm from lme
.
source("R/calcDenDF.R")
calcDenDF(~mnth*roostsitu,data=dataf,random=~1|subject)
## (Intercept) mnthJan roostsitunest-box
## 36 36 36
## roostsituinside roostsituother mnthJan:roostsitunest-box
## 36 36 36
## mnthJan:roostsituinside mnthJan:roostsituother
## 36 36
We conclude that the interactions are not doing much, but there’s definitely an effect of the roosts and months.
However, we probably want to test the overall effect of the interactions, not the individual levels. Here are the type II (sequential) ANOVA results:
anova(lmer2)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## mnth 1656.20 1656.20 1 36 95.4582 1.152e-11 ***
## roostsitu 552.46 184.15 3 36 10.6141 3.833e-05 ***
## mnth:roostsitu 34.20 11.40 3 36 0.6571 0.5838
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If we want to evaluate the marginal sums of squares, i.e. dropping one term at a time from the model, when there are interactions, we usually need to change the model to use sum-to-zero contrasts:
lmer2B <- update(lmer2,
contrasts=list(mnth="contr.sum",roostsitu="contr.sum"))
The alternative approach is to use
options(contrasts=c("contr.sum","contr.poly"))
, then refit
the model, but I prefer to use the contrasts
argument
because it is more explicit.
afex::mixed
to do “type 3” tests of all effects in
a model (using Kenward-Roger to get df by default).afex::mixed(stmass~mnth*roostsitu+(1|subject),data=dataf)
## Mixed Model Anova Table (Type 3 tests, S-method)
##
## Model: stmass ~ mnth * roostsitu + (1 | subject)
## Data: dataf
## Effect df F p.value
## 1 mnth 1, 36.00 95.46 *** <.001
## 2 roostsitu 3, 36.00 10.61 *** <.001
## 3 mnth:roostsitu 3, 36.00 0.66 .584
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
anova()
to do a likelihood ratio test on individual
pairs of models.drop1()
to drop terms from the model one at a
timebe careful when doing “marginal”/“type 3” tests in the presence of interactions!
In this case the results (\(F\) values) are identical because the original design is balanced (hence, orthogonal). Not true if the data are (1) unbalanced (which is often true of ANOVA [categorical-predictor] designs, and almost always true of regression designs) or (2) GLMM or nonlinear.
The explicit model-comparison approach uses a likelihood ratio test rather than an \(F\) test (i.e., it does not correct for the fact that the denominator sums of squares is estimated with error). In this case it hardly matters.
lmer2C <- update(lmer2B,REML=FALSE)
lmer2D <- update(lmer2C,. ~ . - mnth:roostsitu)
anova(lmer2C,lmer2D)
## Data: dataf
## Models:
## lmer2D: stmass ~ mnth + roostsitu + (1 | subject)
## lmer2C: stmass ~ mnth * roostsitu + (1 | subject)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer2D 7 464.58 481.25 -225.29 450.58
## lmer2C 10 468.45 492.27 -224.22 448.45 2.1344 3 0.545
If we now want to use the model-comparison approach on the reduced
(no-interaction) model to test the significance of
roostsitu
, we can use update
to drop the
roostsitu
effect, but we also have to make sure to update
the contrasts
argument so that it only refers to predictors
that remain in the reduced model (otherwise, we get an error).
drop1(lmer2D,test="Chisq") ## ignore warnings
## Single term deletions using Satterthwaite's method:
##
## Model:
## stmass ~ mnth + roostsitu + (1 | subject)
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## mnth 1656.2 1656.20 1 80 101.281 7.259e-16 ***
## roostsitu 574.4 191.47 3 80 11.709 1.941e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If we want to test the random effect, we would in principle remove
the random effect and test with anova
, but this is a bit
problematic here: lmer
can’t fit a model without any random
effects.
Let’s try this with lm
:
lm1 <- lm(stmass~mnth*roostsitu,data=dataf)
(a1 <- anova(lmer2C,lm1))
## Data: dataf
## Models:
## lm1: stmass ~ mnth * roostsitu
## lmer2C: stmass ~ mnth * roostsitu + (1 | subject)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lm1 9 466.46 487.90 -224.23 448.46
## lmer2C 10 468.45 492.27 -224.22 448.45 0.0152 1 0.902
We can also deduce the number of degrees of freedom from standard rules:
with(dataf,table(mnth,roostsitu))
## roostsitu
## mnth tree nest-box inside other
## Nov 10 10 10 10
## Jan 10 10 10 10
If we think about this in terms of the paired \(t\)-test, there are 40 comparisons and 4 parameters (one for each roost site)= 36 df.
If you wanted to compute the \(p\)-values by hand, you could:
## here we use lme4:::anova.merMod to force R to use the anova
## method from lme4 rather than the one from lmerTest
a2 <- lme4:::anova.merMod(lmer2B)
fvals <- a2[,"F value"]
numdf <- a2[,"npar"]
dendf <- 36
pf(fvals,numdf,dendf,lower.tail=FALSE)
## [1] 1.151629e-11 3.833216e-05 5.838288e-01
Alternatively, we can try a parametric bootstrap: the
pbkrtest
package can do this, or we can just set it up by
hand:
## 'log Lik.' 10.01564 (df=10)
## [1] 0.567
library(pbkrtest)
t1 <- system.time(PBmodcomp(lmer3,lmer4,nsim=500))
(takes 7 seconds)
In this case, the Kenward-Roger correction appropriately does nothing different – we have a classical balanced design and no correction is actually necessary. But it does give a denominator df and a \(p\)-value for this lmer model, which is handy …
Exercise: repeat for the tundra data
In lmer
, predict
has a re.form
argument that specifies which random effects should be included
(NA
or ~0
=none, population level;
NULL
(=all) or ~subject
=prediction at the
subject level; more complex models, might have additional nested
levels).
dataf$pred <- predict(lmer2,re.form=NA) ## population level
dataf$pred1 <- predict(lmer2) ## individual level
g0 <- ggplot(dataf,aes(mnth,stmass))+
geom_point()+
geom_line(aes(group=subject))+
facet_grid(.~roostsitu)+
zero_margin
g0 + geom_line(colour="gray",aes(y=pred1,group=subject)) +
geom_line(colour="red",aes(y=pred,group=subject))
There is so much shrinkage (the among-individual variance is very small) that we can barely see the individual-level predictions (gray lines) behind the population-level predictions (red lines).
VarCorr(lmer2)
## Groups Name Std.Dev.
## subject (Intercept) 0.58689
## Residual 4.16533
Unfortunately computing confidence intervals for the predictions is a little tricky: again, there is some code on the GLMM FAQ for this.
ggplot(dataf,aes(mnth,pred1))+
geom_line(aes(group=subject,x=as.numeric(mnth)),colour="gray")+
facet_wrap(~roostsitu,scale="free_y",nrow=1)+
geom_line(aes(y=pred,x=as.numeric(mnth)),colour="red")
For most cases you will want to set up a new data frame to do
prediction rather than just using the covariates from the original data
(e.g. if the data are sampled irregularly, or sparsely), and use the
newdata
argument of predict
. The
expand.grid
function is handy in this context too.