Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.
Load packages:
## modeling packages
library("lme4") ## basic (G)LMMs
library("nlme") ## more LMMs
library("afex") ## helper functions
library("emmeans")
## graphics
library("ggplot2"); theme_set(theme_bw())
## squash panels together
zero_margin <- theme(panel.spacing=grid::unit(0,"lines"))
library("lattice")
library("dotwhisker") ## coefficient plots
## data manipulation
library("broom")
library("broom.mixed")
Data from Toby Marthews, r-sig-mixed-models mailing list: you can find it here.
subject
: individual birdroostsitu
(tree, nest-box, inside, other): location of nest boxmnth
(Nov, Jan): monthstmass
: mass of bird## cosmetic tweaks
load("../../data/starling.RData") ## loads "dataf"
ggplot(dataf,aes(x=mnth,y=stmass))+
geom_point()+
geom_line(aes(group=subject))+ ## connect subjects with a line
facet_grid(.~roostsitu)+ ## 1 row of panels by roost
zero_margin ## squash together
You should also try
ggplot(dataf,aes(mnth,stmass,colour=roostsitu))+
geom_point()+
geom_line(aes(group=subject))
for comparison: what are the tradeoffs?
Overkill for this data set, but sometimes it can be useful to put every individual in its own facet:
## reorder individuals by magnitude of difference
dataf <- transform(dataf,subject=reorder(subject,stmass,FUN=diff))
ggplot(dataf,aes(mnth,stmass,colour=roostsitu,group=subject))+
geom_point()+geom_line()+facet_wrap(~subject)+
zero_margin
It’s pretty obvious that the starting (November) mass varies among roost situations (tree/nest-box/etc.), and that mass increases from November to January, but we might like to know if the slopes differ among situations. That means our fixed effects would be ~roostsitu*mnth
, with our attention focused on the roostsitu:mnth
(interaction) term. For random effects, we can allow both intercept (obviously) and slope (maybe) to vary among individuals, via (1+mnth|subject)
or equivalent … in this case, because measurements are only taken in two months, we can also write the random term as (1|subject/mnth)
.
However, it turns out that we can’t actually estimate random slopes for this model, because every individual is only measured twice. That means that the variance in the slopes would be completely confounded with the residual variance.
If we fit this with lme4
lme1 <- lmer(stmass~mnth*roostsitu+(1|subject/mnth),data=dataf)
## Error: number of levels of each grouping factor must be < number of observations (problems: mnth:subject)
We get an error. We can actually do this (lmerControl(check.nobs.vs.nlev="ignore")
; see ?lmerControl
for details), and it will actually work, but it will pick an arbitrary division between the residual variance and the month:subject
variance.
But now let’s forget about including the (unidentifiable) random slopes.
lmer2 <- lmer(stmass~mnth*roostsitu+(1|subject),data=dataf)
We can now get estimates, although the subject-level random effects are very uncertain: see confint(lmer2)
Walking through the output:
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: stmass ~ mnth * roostsitu + (1 | subject)
## Data: dataf
## REML criterion at convergence: 429.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
This is a quick reminder of the scaled residuals; these should be approximately unskewed (median \(\approx 0\)) and the extremes should be somewhere around \(\pm 2\) (bigger in a big data set )
##
## Random effects:
## Groups Name Variance Std.Dev.
## subject (Intercept) 0.3444 0.5869
## Residual 17.3500 4.1653
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 83.600 1.330 71.973 62.847 < 2e-16 ***
## mnthJan 7.200 1.863 36.000 3.865 0.000446 ***
## roostsitunest-box -4.200 1.881 71.973 -2.233 0.028685 *
## roostsituinside -5.000 1.881 71.973 -2.658 0.009682 **
## roostsituother -8.200 1.881 71.973 -4.359 4.27e-05 ***
## mnthJan:roostsitunest-box 3.600 2.634 36.000 1.367 0.180244
lmer
does not tell us the denominator degrees of freedom for the test (although we can get a rough idea of importance/significance fro the \(t\) statistics; e.g. \(t>2.5\) will be significant at \(p<0.05\) for 6 or more degrees of freedom). We’ll come back to this in the inference section.Basic diagnostic plots: fitted vs. residuals, coloured according to the roostsitu
variable (col=dataf$roostsitu
), point type according to month (pch=
).
plot(lmer2,col=dataf$roostsitu,pch=dataf$mnth,id=0.05)
The id
argument specifies to mark points that are beyond the specified Normal confidence limits (i.e. \(\alpha\)-values) (these are not outliers in any formal statistical sense but might be points you want to look at if they are otherwise interesting). By default they are labeled by random-effect group; use idLabels
to change the default labels, and idLabels=~.obs
to label by observation number.
You can get a scale-location plot by specifying
plot(lmer2,sqrt(abs(residuals(.))) ~ fitted(.),type=c("p","smooth"))
Adding the smoothed line is helpful because uneven sampling can influence your perception of the pattern.
Q-Q plot (a straight line indicates normality)
qqmath(lmer2,col=dataf$roostsitu,pch=dataf$mnth)
We can also do this with broom.mixed::augment()
and stat_qq()
(although there are also some some limitations - colour
argument doesn’t actually work…)
ggplot(augment(lmer2),
aes(sample=.resid/sd(.resid)))+ ## scale to variance=1
stat_qq(aes(group=1,colour=roostsitu))+
geom_abline(intercept=0,slope=1)
## Warning: The following aesthetics were dropped during statistical transformation: colour
## ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?
There are some deviations here, but not enough that I would worry very much. In particular, the distribution is slightly thin-tailed (the smallest residuals are largest than expected, and the largest residuals are smaller than expected), which would make the results slightly conservative (a fat-tailed distribution would make them anticonservative).
Boxplots of residuals subdivided by roostsitu
(you have to put the grouping variable on the left side of the formula here):
plot(lmer2,roostsitu~resid(.))
Might be easier with augment
:
aa <- augment(lmer2)
ggplot(aa,aes(roostsitu,.resid))+
geom_boxplot()+coord_flip()
Plot random effects to look for outliers:
dotplot(ranef(lmer2,condVar=TRUE))
## $subject
Or with tidy
+ ggplot
:
tt <- tidy(lmer2,effects="ran_vals")
tt <- transform(tt,level=reorder(level,estimate))
ggplot(tt,aes(level,estimate))+
geom_pointrange(aes(ymin=estimate-1.96*std.error,
ymax=estimate+1.96*std.error))+
coord_flip()
In general, try plotting residuals and \(\sqrt{\textrm{abs}(r)}\) both as a function of the fitted values and as a function of particularly important predictors.
hatvalues()
function can also be useful (although all hat values are identical here)influence.ME
package. Try plot(influence(lmer2,group="subject"))
and interpret the results …tundra_agg.rda
(from the data directory given above). The GS.NEE
records the net ecosystem exchange (in grams of carbon/m^2/year) in a given year and site; cYear
is a centered version of the year; n
records the number of observations taken in a given site/year (the NEE value given is an average of these observations).ggplot2
if you can, otherwise with base R graphics) to inspect the datalmer
fit (note this data set does not have the issue of the starling data set), draw some preliminary conclusions, and inspect the plot diagnostics.One good way to assess the results of a model fit is to look at a coefficient plot: dotwhisker::dwplot()
does this. Note that it automatically scales all coefficients by 2SD; use by_2sd=FALSE
to leave coefficients unscaled
dotwhisker::dwplot(lmer2)+geom_vline(xintercept=0,lty=2)
(Note that the intercept is automatically dropped; this is usually the right choice. Use show_intercept=TRUE
to have the intercept included.) For more complicated coefficient plots, you can use the combination of the broom.mixed
package to get the parameters in a useful form and dwplot
to draw the plot.
Stop and explain to yourself what these parameters mean. If you’re not sure, try resetting the base level of the roostsitu
factor: dataf2 <- transform(dataf,roostsitu=relevel(roostsitu,ref="other"))
, predict what will happen to the results, and re-run the analysis.)
Exercise: This is actually a slightly trivial example, because there are only two measurements for each individual. Thus we can actually analyze the slopes and get the same answers by using a paired analysis, i.e. by computing the mass difference in subjects and then analyzing with a single-level linear model. (In the simplest case this would reduce to a paired \(t\)-test, but in this case it is a 1-way ANOVA on roostsitu
.
Rearrange the data to get differences by subject:
dataf2 <- aggregate(stmass~roostsitu+subject,
data=dataf,
FUN=function(x) x[2]-x[1])
This says to aggregate the data frame dataf
grouped by roostsitu
and subject
; for each group, compute the difference of the masses.
Draw a picture (boxplot+beeswarm plot):
ggplot(dataf2,aes(x=roostsitu,y=stmass))+geom_boxplot()+
geom_dotplot(binaxis="y",stackdir="center",fill="red",alpha=0.5,
binwidth=0.5)
As you can see, geom_dotplot()
adds a horizontal dot-plot for each group (see the documentation [?geom_dotplot
] for more details).
lm
and convince yourself that the estimates (fixed-effect coefficients, \(t\) statistics, etc.) are equivalent to those found from the previous analysis.What about \(p\)-values or more sophisticated confidence intervals? The default confidence intervals shown by dwplot()
are just Normal confidence intervals (not taking into account any of the effects that make these approximate for mixed models).
We can get profile confidence intervals via confint(lmer2)
, but the confidence intervals are almost identical to those from confint(lmer2,method="Wald")
(the default shown by dwplot()
.
(Insert long discussion of degrees of freedom and \(p\)-values here.)
We can use lmerTest
or afex::mixed
(which both wrap functions from the pbkrtest
package) to get “denominator degrees of freedom”/\(p\)-values.
lmerTest
wraps the lmer
function to compute some additional information, and modifies the summary()
function:library("lmerTest")
lmer2R <- lmer(stmass~mnth*roostsitu+(1|subject),data=dataf)
detach("package:lmerTest")
Capturing just the coefficient table from summary(lmer2R)
:
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 83.600 1.330 71.973 62.847 < 2e-16 ***
## mnthJan 7.200 1.863 36.000 3.865 0.000446 ***
## roostsitunest-box -4.200 1.881 71.973 -2.233 0.028685 *
## roostsituinside -5.000 1.881 71.973 -2.658 0.009682 **
## roostsituother -8.200 1.881 71.973 -4.359 4.27e-05 ***
## mnthJan:roostsitunest-box 3.600 2.634 36.000 1.367 0.180244
## mnthJan:roostsituinside 2.400 2.634 36.000 0.911 0.368341
## mnthJan:roostsituother 1.600 2.634 36.000 0.607 0.547430
By default lmerTest
uses the Satterthwaite approximation to the degrees of freedom. You can also use summary(lmer2R,ddf="Kenward-Roger")
to get a slightly more accurate (and slower) estimate, but in this case the answers are basically identical.
Can use the algorithm that lme
uses:
source("../../R/calcDenDF.R")
calcDenDF(~mnth*roostsitu,data=dataf,random=~1|subject)
## (Intercept) mnthJan roostsitunest-box
## 36 36 36
## roostsituinside roostsituother mnthJan:roostsitunest-box
## 36 36 36
## mnthJan:roostsituinside mnthJan:roostsituother
## 36 36
We conclude that the interactions are not doing much, but there’s definitely an effect of the roosts and months. This agrees with the picture from dwplot
(above).
However, we probably want to test the overall effect of the interactions, not the individual levels. Here are the type II (sequential) ANOVA results:
anova(lmer2)
## Type III Analysis of Variance Table with Satterthwaite's method
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## mnth 1656.20 1656.20 1 36 95.4582 1.152e-11 ***
## roostsitu 552.46 184.15 3 36 10.6141 3.833e-05 ***
## mnth:roostsitu 34.20 11.40 3 36 0.6571 0.5838
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If we want to evaluate the marginal sums of squares, i.e. dropping one term at a time from the model, when there are interactions, we usually want to change the model to use sum-to-zero contrasts:
lmer2B <- update(lmer2,
contrasts=list(mnth="contr.sum",roostsitu="contr.sum"))
The alternative approach is to use options(contrasts=c("contr.sum","contr.poly"))
, then refit the model, but I prefer to use the contrasts
argument because it is more explicit.
afex::mixed
to do “type 3” tests of all effects in a model (using Kenward-Roger to get df by default).afex::mixed(stmass~mnth*roostsitu+(1|subject),data=dataf)
## Mixed Model Anova Table (Type 3 tests, S-method)
##
## Model: stmass ~ mnth * roostsitu + (1 | subject)
## Data: dataf
## Effect df F p.value
## 1 mnth 1, 36.00 95.46 *** <.001
## 2 roostsitu 3, 36.00 10.61 *** <.001
## 3 mnth:roostsitu 3, 36.00 0.66 .584
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
anova()
to do a likelihood ratio test on individual pairs of models.drop1()
to drop terms from the model one at a timebe careful when doing “marginal”/“type 3” tests in the presence of interactions!
In this case the results (\(F\) values) are identical because the original design is balanced (hence, orthogonal). Not true if the data are (1) unbalanced (which is often true of ANOVA [categorical-predictor] designs, and almost always true of regression designs) or (2) GLMM or nonlinear.
The explicit model-comparison approach uses a likelihood ratio test rather than an \(F\) test (i.e., it does not correct for the fact that the denominator sums of squares is estimated with error). In this case it hardly matters.
lmer2C <- update(lmer2B,REML=FALSE)
lmer2D <- update(lmer2C,. ~ . - mnth:roostsitu)
anova(lmer2C,lmer2D)
## Data: dataf
## Models:
## lmer2D: stmass ~ mnth + roostsitu + (1 | subject)
## lmer2C: stmass ~ mnth * roostsitu + (1 | subject)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lmer2D 7 464.58 481.25 -225.29 450.58
## lmer2C 10 468.45 492.27 -224.22 448.45 2.1344 3 0.545
If we now want to use the model-comparison approach on the reduced (no-interaction) model to test the significance of roostsitu
, we can use update
to drop the roostsitu
effect, but we also have to make sure to update the contrasts
argument so that it only refers to predictors that remain in the reduced model (otherwise, we get an error).
drop1(lmer2D,test="Chisq") ## ignore warnings
## Single term deletions using Satterthwaite's method:
##
## Model:
## stmass ~ mnth + roostsitu + (1 | subject)
## Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
## mnth 1656.2 1656.20 1 80 101.281 7.259e-16 ***
## roostsitu 574.4 191.47 3 80 11.709 1.941e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
If we want to test the random effect, we would in principle remove the random effect and test with anova
, but this is a bit problematic here: lmer
can’t fit a model without any random effects.
Let’s try this with lm
:
lm1 <- lm(stmass~mnth*roostsitu,data=dataf)
(a1 <- anova(lmer2C,lm1))
## Data: dataf
## Models:
## lm1: stmass ~ mnth * roostsitu
## lmer2C: stmass ~ mnth * roostsitu + (1 | subject)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## lm1 9 466.46 487.90 -224.23 448.46
## lmer2C 10 468.45 492.27 -224.22 448.45 0.0152 1 0.902
We can also deduce the number of degrees of freedom from standard rules:
with(dataf,table(mnth,roostsitu))
## roostsitu
## mnth tree nest-box inside other
## Nov 10 10 10 10
## Jan 10 10 10 10
If we think about this in terms of the paired \(t\)-test, there are 40 comparisons and 4 parameters (one for each roost site)= 36 df.
If you wanted to compute the \(p\)-values by hand, you could:
## here we use lme4:::anova.merMod to force R to use the anova
## method from lme4 rather than the one from lmerTest
a2 <- lme4:::anova.merMod(lmer2B)
fvals <- a2[,"F value"]
numdf <- a2[,"npar"]
dendf <- 36
pf(fvals,numdf,dendf,lower.tail=FALSE)
## [1] 1.151629e-11 3.833216e-05 5.838288e-01
Alternatively, we can try a parametric bootstrap: the pbkrtest
package can do this, or we can just set it up by hand:
## 'log Lik.' 13.06582 (df=10)
## [1] 0.557
library(pbkrtest)
t1 <- system.time(PBmodcomp(lmer3,lmer4,nsim=500))
(takes 7 seconds)
In this case, the Kenward-Roger correction appropriately does nothing different – we have a classical balanced design and no correction is actually necessary. But it does give a denominator df and a \(p\)-value for this lmer model, which is handy …
Exercise: repeat for the tundra data
In lmer
, predict
has a re.form
argument that specifies which random effects should be included (NA
or ~0
=none, population level; NULL
(=all) or ~subject
=prediction at the subject level; more complex models, might have additional nested levels).
dataf$pred <- predict(lmer2,re.form=NA) ## population level
dataf$pred1 <- predict(lmer2) ## individual level
g0 <- ggplot(dataf,aes(mnth,stmass))+
geom_point()+
geom_line(aes(group=subject))+
facet_grid(.~roostsitu)+
zero_margin
g0 + geom_line(colour="gray",aes(y=pred1,group=subject)) +
geom_line(colour="red",aes(y=pred,group=subject))
There is so much shrinkage (the among-individual variance is very small) that we can barely see the individual-level predictions (gray lines) behind the population-level predictions (red lines).
VarCorr(lmer2)
## Groups Name Std.Dev.
## subject (Intercept) 0.58689
## Residual 4.16533
Unfortunately computing confidence intervals for the predictions is a little tricky: again, there is some code on the GLMM FAQ for this.
ggplot(dataf,aes(mnth,pred1))+
geom_line(aes(group=subject,x=as.numeric(mnth)),colour="gray")+
facet_wrap(~roostsitu,scale="free_y",nrow=1)+
geom_line(aes(y=pred,x=as.numeric(mnth)),colour="red")
For most cases you will want to set up a new data frame to do prediction rather than just using the covariates from the original data (e.g. if the data are sampled irregularly, or sparsely), and use the newdata
argument of predict
. The expand.grid
function is handy in this context too.