license

cc Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.

(Generalized) linear mixed models

(G)LMMs: a statistical modeling framework incorporating:


image

Coral protection from seastars (Culcita) by symbionts (McKeon et al., 2012)

Environmental stress: Glycera cell survival (D. Julian unpubl.)

Arabidopsis response to fertilization & herbivory (Banta et al., 2010)

Coral demography (J.-S. White unpubl.)

Simple mixed models
(scalar, intercept-only)

\[ \begin{split} y_{ij} & = \beta_0 + \beta_1 x_{ij} + \epsilon_{0,ij} + \epsilon_{1,j} \\ & = (\beta_0 + \epsilon_{1,j}) + \beta_1 x_{ij} + \epsilon_{1,j} \\ \epsilon_{0,ij} & \sim \textrm{Normal}(0,\sigma_0^2) \\ \epsilon_{1,j} & \sim \textrm{Normal}(0,\sigma_1^2) \end{split} \]

Random-slopes model

\[ \begin{split} y_{ij} & = \beta_0 + \beta_1 x_{ij} + \epsilon_{0,ij} + \epsilon_{1,j} + \epsilon_{2,j} x_{ij} \\ & = (\beta_0 + \epsilon_{1,j}) + (\beta_1 + \epsilon_{2,j}) x_{ij}) \\ \epsilon_{0,ij} & \sim \textrm{Normal}(0,\sigma_0^2) \\ \{\epsilon_{1,j}, \epsilon_{2,j}\} & \sim \textrm{MVN}(0,\Sigma) \end{split} \]

General definition

\[ \begin{split} \underbrace{Y_i}_{\text{response}} & \sim \overbrace{\text{Distr}}^{\substack{\text{conditional} \\ \text{distribution}}}(\underbrace{g^{-1}(\eta_i)}_{\substack{\text{inverse} \\ \text{link} \\ \text{function}}},\underbrace{\phi}_{\substack{\text{scale} \\ \text{parameter}}}) \\ \underbrace{\boldsymbol \eta}_{\substack{\text{linear} \\ \text{predictor}}} & = \underbrace{\boldsymbol X \boldsymbol \beta}_{\substack{\text{fixed} \\ \text{effects}}} + \underbrace{\boldsymbol Z \boldsymbol b}_{\substack{\text{random} \\ \text{effects}}} \\ \underbrace{\boldsymbol b}_{\substack{\text{conditional} \\ \text{modes}}} & \sim \text{MVN}(\boldsymbol 0, \underbrace{\Sigma(\boldsymbol \theta)}_{\substack{\text{variance-} \\ \text{covariance} \\ \text{matrix}}}) \end{split} \]

What are random effects?

A method for …

Random-effect myths

Reasons for random effects (inferential/philosophical)

Reasons for random effects (practical)

See also Crawley (2002); Gelman (2005)

Avoiding MM

Estimation

Overview

Maximum likelihood estimation

  • Best fit is a compromise between two components
    (consistency of data with fixed effects and conditional modes; consistency of random effect with RE distribution)
    • Goodness-of-fit integrates over conditional modes

## Warning: The `fun.y` argument of `stat_summary()` is deprecated as of ggplot2 3.3.0.
## ℹ Please use the `fun` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.

Shrinkage: Arabidopsis conditional modes

Shrinkage in a random-slopes model

From Christophe Lalanne, see here:

Methods

Estimation methods

  • deterministic
    • various approximate integrals (Breslow, 2004)
    • penalized quasi-likelihood, Laplace, Gauss-Hermite quadrature, … (Biswas, 2015);
      best methods needed for large variance, small clusters
    • flexibility and speed vs. accuracy
  • stochastic
  • stochastic (Monte Carlo): frequentist and Bayesian
    • (Booth & Hobert, 1999; Ponciano et al., 2009; Sung & Geyer, 2007)
    • usually slower but flexible and accurate

Estimation: Culcita (McKeon et al., 2012)

Inference

Wald tests

  • typical results of summary()
  • exact for ANOVA, regression:
    approximation for GLM(M)s
  • fast
  • approximation is sometimes awful (Hauck-Donner effect)

Likelihood ratio tests

  • better than Wald, but still have two problems:
  • “denominator degrees of freedom”
    (when estimating scale)
    • finite-size issues for GLM(M)s: Bartlett corrections
    • Kenward-Roger correction? (Stroup, 2014)
  • Profile confidence intervals: expensive/fragile

finite-size p-values??

  • hope that min grps > 42
  • guess denominator DF from classic design (R code)
  • conservative: take minimum number of groups - 1
  • Satterthwaite/Kenward-Roger (lmerTest, LMMs only)
  • parametric bootstrap (pbkrtest)

Parametric bootstrapping

  • fit null model to data
  • simulate “data” from null model
  • fit null and working model, compute likelihood difference
  • repeat to estimate null distribution
  • should be OK but ??? not well tested
    (assumes estimated parameters are “sufficiently” good)

Bayesian inference

  • If we have a good sample from the posterior distribution (Markov chains have converged etc. etc.) we get most of the inferences we want for free by summarizing the marginal posteriors
  • post hoc Bayesian methods: use deterministic/frequentist methods to find the maximum, then sample around it

Challenges & open questions

On beyond lme4

  • glmmTMB: zero-inflated and other distributions
  • brms,rstanarm: interfaces to Stan
  • INLA: spatial and temporal correlations
  • rethinking package

Toolboxes

  • JAGS (R: rjags, r2jags)
  • TensorFlow (R: greta)
  • NIMBLE (R: nimble package)
  • Stan (R: rstan)
  • TMB (R: TMB)

On beyond R

  • Julia: MixedModels.jl package
  • SAS: PROC MIXED, NLMIXED
  • AS-REML
  • Stata (GLLAMM, xtmelogit)
  • HLM, MLWiN

image

Challenges

  • Small clusters: need AGQ/MCMC
  • Small numbers of clusters: need finite-size corrections (KR/PB/MCMC)
  • Small data sets: issues with singular fits
    (Barr et al., 2013) vs. (Bates et al., 2015)
  • Big data: speed!
  • Model diagnosis
  • Confidence intervals accounting for uncertainty in variances

The need for speed (Brooks et al., 2017)

more speed

Spatial and temporal correlations

  • Sometimes blocking takes care of non-independence …
  • but sometimes there is temporal or spatial correlation within blocks
  • … also phylogenetic … (Ives & Zhu, 2006)
  • “G-side” vs. “R-side” effects
  • tricky to implement for GLMMs, but new possibilities on the horizon (Rousset & Ferdy, 2014; Rue et al., 2009); https://github.com/stevencarlislewalker/lme4ord

Next steps

  • Complex random effects:
    regularization, model selection, penalized methods (lasso/fence)
  • Flexible correlation and variance structures
  • Flexible/nonparametric random effects distributions
  • hybrid & improved MCMC methods
  • Reliable assessment of out-of-sample performance

references

Banta, J. A., Stevens, M. H. H., & Pigliucci, M. (2010). A comprehensive test of the ’limiting resources’ framework applied to plant tolerance to apical meristem damage. Oikos, 119(2), 359–369. http://doi.org/10.1111/j.1600-0706.2009.17726.x

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. http://doi.org/10.1016/j.jml.2012.11.001

Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious Mixed Models. arXiv:1506.04967 [Stat]. Retrieved from http://arxiv.org/abs/1506.04967

Biswas, K. (2015). Performances of different estimation methods for generalized linear mixed models (Master’s thesis). McMaster University. Retrieved from https://macsphere.mcmaster.ca/bitstream/11375/17272/2/M.Sc_Thesis_final_Keya_Biswas.pdf

Bolker, B. M. (2015). Linear and generalized linear mixed models. In G. A. Fox, S. Negrete-Yankelevich, & V. J. Sosa (Eds.), Ecological statistics: Contemporary theory and application. Oxford University Press.

Booth, J. G., & Hobert, J. P. (1999). Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. Journal of the Royal Statistical Society. Series B, 61(1), 265–285. http://doi.org/10.1111/1467-9868.00176

Breslow, N. E. (2004). Whither PQL? In D. Y. Lin & P. J. Heagerty (Eds.), Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data (pp. 1–22). Springer.

Brooks, M. E., Kristensen, K., van Benthem, K. J., Magnusson, A., Berg, C. W., Nielsen, A., … Bolker, B. M. (2017). glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. R Journal, 9(2), 378–400. Retrieved from https://journal.r-project.org/archive/2017/RJ-2017-066/RJ-2017-066.pdf

Crawley, M. J. (2002). Statistical computing: An introduction to data analysis using S-PLUS. John Wiley & Sons.

Gelman, A. (2005). Analysis of variance: Why it is more important than ever. Annals of Statistics, 33(1), 1–53. http://doi.org/doi:10.1214/009053604000001048

Ives, A. R., & Zhu, J. (2006). Statistics for correlated data: Phylogenies, space, and time. Ecological Applications, 16(1), 20–32. Retrieved from http://www.esajournals.org/doi/pdf/10.1890/04-0702

McKeon, C. S., Stier, A., McIlroy, S., & Bolker, B. (2012). Multiple defender effects: Synergistic coral defense by mutualist crustaceans. Oecologia, 169(4), 1095–1103. http://doi.org/10.1007/s00442-012-2275-2

Murtaugh, P. A. (2007). Simplicity and complexity in ecological data analysis. Ecology, 88(1), 56–62. Retrieved from http://www.esajournals.org/doi/abs/10.1890/0012-9658%282007%2988%5B56%3ASACIED%5D2.0.CO%3B2

Ponciano, J. M., Taper, M. L., Dennis, B., & Lele, S. R. (2009). Hierarchical models in ecology: Confidence intervals, hypothesis testing, and model selection using data cloning. Ecology, 90(2), 356–362. Retrieved from http://www.jstor.org/stable/27650990

Rousset, F., & Ferdy, J.-B. (2014). Testing environmental and genetic effects in the presence of spatial autocorrelation. Ecography, no–no. http://doi.org/10.1111/ecog.00566

Rue, H., Martino, S., & Chopin, N. (2009). Gaussian models using integrated nested Laplace approximations (with discussion). Journal of the Royal Statistical Society, Series B, 71(2), 319–392.

Stroup, W. W. (2014). Rethinking the analysis of non-normal data in plant and soil science. Agronomy Journal, 106, 1–17. http://doi.org/10.2134/agronj2013.0342

Sung, Y. J., & Geyer, C. J. (2007). Monte Carlo likelihood inference for missing data models. The Annals of Statistics, 35(3), 990–1011. http://doi.org/10.1214/009053606000001389