Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.
## Warning: S3 methods 'ggplot2::autoplot.zoo', 'ggplot2::fortify.zoo' were
## declared in NAMESPACE but not found
MASS::boxcox()
on residualstable()
to confirm how many observations per level of factor
|
as (a|g1) + (b|g2) + ...
+
are independentequation | formula |
---|---|
\(β_0 + β_{1}X_{i} + e_{si}\) | n/a (Not a mixed-effects model) |
\((β_0 + b_{S,0s}) + β_{1}X_i + e_{si}\) | ∼ X + (1∣Subject) |
\((β_0 + b_{S,0s}) + (β_{1} + b_{S,1s}) X_i + e_{si}\) | ~ X + (1 + X∣Subject) |
\((β_0 + b_{S,0s} + b_{I,0i}) + (β_{1} + b_{S,1s}) X_i + e_{si}\) | ∼ X + (1 + X∣Subject) + (1∣Item) |
As above, but \(S_{0s}\), \(S_{1s}\) independent | ∼ X + (1∣Subject) + (0 + X∣ Subject) + (1∣Item) |
\((β_0 + b_{S,0s} + b_{I,0i}) + β_{1}X_i + e_{si}\) | ∼ X + (1∣Subject) + (1∣Item) |
\((β_0 + b_{I,0i}) + (β_{1} + b_{S,1s})X_i + e_{si}\) | ∼ X + (0 + X∣Subject) + (1∣Item) |
Modified from: http://stats.stackexchange.com/questions/13166/rs-lmer-cheat-sheet?lq=1 (Livius)
a*b
= main effects plus interaction, a:b
= interaction only, a/b
= a
+ a:b
(1|f/g)
equivalent to (1|f) + (1|f:g)
. e.g. subplots vary within plots (but “subplot 1 of every plot” isn’t meaningful)(1|f) + (1|g)
. e.g. years vary, and plots vary independently(1|f) + (1|g) + (1|f:g)
. e.g. years vary, and plots vary independently, and plots also vary within years (for LMMs, assumes >1 observation per plot/year combination). ((1|f*g)
should be allowed but …)Don’t need explicit nesting if your sub-groups are uniquely labeled (i.e. A1
, A2
, …, B1
, B2
, …)
Maximal model often won’t work
(x|g)
really do?(1+x|g)
\[ (\textrm{intercept}, \textrm{slope}) = \textrm{MVN}\left(\boldsymbol 0, \left[ \begin{array}{cc} \sigma^2_{\textrm{int}} & \sigma_{\textrm{int},\textrm{slope}} \\ \sigma_{\textrm{int},\textrm{slope}} & \sigma^2_{\textrm{slope}} \end{array} \right] \right) \]
(1+x+y+z+...|g)
: \(n\) effects require \(n(n+1)/2\) variances + covariancesall(abs(getME(x,"theta"))>1e-4)
Essentially, because the observed among-group variation is less than the expected among-group variation (= \(\sigma^2_{\mbox{among}} + \sigma^2_{\mbox{within}}/n\)). More generally, because some dimension of the variance-covariance matrix has zero extent …
(1|b/a)
([positive] compound symmetry) vs. (a|b)
: \[
(\textrm{intercept}, \textrm{slope}) =
\textrm{MVN}\left(\boldsymbol 0,
\left[
\begin{array}{ccccc}
\sigma^2_{\{b|1\}} & . & . & . & . \\
\sigma_{\{b|1\},\{b|a_{21}\}} &
\sigma^2_{\{b|a_{21}\}} & . & . & . \\
\sigma_{\{b|1\}, \{b|a_{31}\}} &
\sigma_{\{b|a_{21}\},\{b|a_{31}\}} &
\sigma^2_{\{b|a_{31}\}} & . & . \\
\sigma_{\{b|1\} ,\{b|a_{41}\}} &
\sigma_{\{b|a_{21}\},\{b|a_{41}\}} &
\sigma_{\{b|a_{31}\},\{b|a_{41}\}} &
\sigma^2_{\{b|a_{41}\}} & . \\
\sigma_{\{b|1\},\{b|a_{51}\}} &
\sigma_{\{b|a_{21}\},\{b|a_{51}\}} &
\sigma_{\{b|a_{31}\},\{b|a_{51}\}} &
\sigma_{\{b|a_{41}\},\{b|a_{51}\}} &
\sigma^2_{\{b|a_{51}\}}
\end{array}
\right]
\right)
\] (=\((n(n+1))/2 = (4\times 5)/2 = 10\) parameters) vs. \[
\left[
\begin{array}{ccccc}
\sigma^2 & . & . & . & . \\
\rho \sigma^2 & \sigma^2 & . & . & . \\
\rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . & . \\
\rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2 & . \\
\rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \rho \sigma^2 & \sigma^2
\end{array}
\right]
\] where \(\sigma^2 = \sigma^2_{\{b|1\}}+\sigma^2_{\{a:b|1\}}\), \(\rho = \sigma^2_{\{b|1\}}/\sigma^2\) (=2 parameters; \(\rho\) must be >0)
(1+x+y+z||b)
(1|b) + (0+x|b) + ...
lme4
version only works properly for continuous predictorsafex::mixed
can do thisRE “nested within” FE, e.g. if not enough groups at the top level; y ~ g1 + (1|g1:g2)
; more parameters but fixed rather than random
m1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
Similar to generalized linear models: fitted vs. residual, scale-location, Q-Q
aa <- augment(m1)
## fitted vs resid
plot(m1,type=c("p","smooth"))
## scale-location
plot(m1,sqrt(abs(resid(.)))~fitted(.),type=c("p","smooth"))
## q-q
lattice::qqmath(m1,id=0.05)
Or with augment
and ggplot …
aa <- augment(m1)
## fitted vs resid
ggplot(aa,aes(.fitted,.resid))+geom_point()+geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## scale-location
ggplot(aa,aes(.fitted,sqrt(abs(.resid))))+geom_point()+geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## q-q
ggplot(aa)+stat_qq(aes(sample=.resid))
tidy
: get coefficient values
effects="fixed"
effects="ran_vars"
effects="ran_vals"
effects="ran_coefs"
glance
: get model summariesaugment
: get fitted, residuals, possibly predictionsmost important:
more important for variance parameters than fixed-effect parameters
e.g. Normal vs. \(t\)
simulate()
it fro the fitted model:summary()
car::Anova
, afex::anova
emmeans
, effects
packagescar::Anova(m1)
## Warning: S3 method 'xts::as.xts.data.table' was declared in NAMESPACE but
## not found
R/calcDenDF.R
lmerTest
/afex
; Satterthwaite, Kenward-Rogerlibrary(lmerTest)
##
## Attaching package: 'lmerTest'
## The following object is masked from 'package:lme4':
##
## lmer
## The following object is masked from 'package:stats':
##
## step
summary(as_lmerModLmerTest(m1))
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Reaction ~ Days + (Days | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1743.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9536 -0.4634 0.0231 0.4634 5.1793
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 612.09 24.740
## Days 35.07 5.922 0.07
## Residual 654.94 25.592
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 251.405 6.825 17.000 36.838 < 2e-16 ***
## Days 10.467 1.546 17.000 6.771 3.26e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## Days -0.138
summary(as_lmerModLmerTest(m1),ddf="Kenward-Roger")
## Linear mixed model fit by REML. t-tests use Kenward-Roger's method [
## lmerModLmerTest]
## Formula: Reaction ~ Days + (Days | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1743.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9536 -0.4634 0.0231 0.4634 5.1793
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 612.09 24.740
## Days 35.07 5.922 0.07
## Residual 654.94 25.592
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 251.405 6.825 17.000 36.838 < 2e-16 ***
## Days 10.467 1.546 17.000 6.771 3.26e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## Days -0.138
Q: Why not use lmerTest
all the time?
A: it can make it a little harder to diagnose fitting problems
profile()
, confint()
anova()
to comparedrop1
, afex::anova()
confint(m1)
## Computing profile confidence intervals ...
## 2.5 % 97.5 %
## .sig01 14.3815048 37.715996
## .sig02 -0.4815007 0.684986
## .sig03 3.8011641 8.753383
## .sigma 22.8982669 28.857997
## (Intercept) 237.6806955 265.129515
## Days 7.3586533 13.575919
What took so long?
Very slow!
pbkrtest
confint(.,method="boot")
anova()
mm <- model.matrix(terms(fitted_model),newdat)
se_pred <- sqrt(diag(mm %*% tcrossprod(vcov(fitted_model),mm)))
data("Oats",package="nlme")
m_oats <- lmer(yield~nitro*Variety+(1|Block),data=Oats)
(e1 <- emmeans(m_oats,~nitro|Variety))
## Variety = Golden Rain:
## nitro emmean SE df lower.CL upper.CL
## 0.3 104.5000 7.124279 6.61 87.45087 121.5491
##
## Variety = Marvellous:
## nitro emmean SE df lower.CL upper.CL
## 0.3 109.7917 7.124279 6.61 92.74254 126.8408
##
## Variety = Victory:
## nitro emmean SE df lower.CL upper.CL
## 0.3 97.6250 7.124279 6.61 80.57587 114.6741
##
## Degrees-of-freedom method: kenward-roger
## Confidence level used: 0.95
plot(e1)
e2 <- emmeans(m_oats,~Variety)
## NOTE: Results may be misleading due to involvement in interactions
##
contrast(e2,"pairwise")
## contrast estimate SE df t.ratio p.value
## Golden Rain - Marvellous -5.291667 4.469012 61 -1.184 0.4671
## Golden Rain - Victory 6.875000 4.469012 61 1.538 0.2804
## Marvellous - Victory 12.166667 4.469012 61 2.722 0.0226
##
## P value adjustment: tukey method for comparing a family of 3 estimates
also see the effects
package.
car::Anova(m_oats,test="F")
## Analysis of Deviance Table (Type II Wald F tests with Kenward-Roger df)
##
## Response: yield
## F Df Df.res Pr(>F)
## nitro 81.5155 1 61 7.648e-13 ***
## Variety 3.7268 2 61 0.02972 *
## nitro:Variety 0.3512 2 61 0.70524
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Bolker, Benjamin M. 2015. “Linear and Generalized Linear Mixed Models.” In Ecological Statistics: Contemporary Theory and Application, edited by Gordon A. Fox, Simoneta Negrete-Yankelevich, and Vinicio J. Sosa. Oxford University Press.