cc Licensed under the Creative Commons attribution-noncommercial license. Please share & remix noncommercially, mentioning its origin.

latent/random variables

generalized linear mixed models

\[ \begin{split} \underbrace{Y_i}_{\text{response}} & \sim \overbrace{\text{Distr}}^{\substack{\text{conditional} \\ \text{distribution}}}(\underbrace{g^{-1}(\eta_i)}_{\substack{\text{inverse} \\ \text{link} \\ \text{function}}},\underbrace{\phi}_{\substack{\text{scale} \\ \text{parameter}}}) \\ \underbrace{\boldsymbol \eta}_{\substack{\text{linear} \\ \text{predictor}}} & = \underbrace{\boldsymbol X \boldsymbol \beta}_{\substack{\text{fixed} \\ \text{effects}}} + \underbrace{\boldsymbol Z \boldsymbol b}_{\substack{\text{random} \\ \text{effects}}} \\ \underbrace{\boldsymbol b}_{\substack{\text{conditional} \\ \text{modes}}} & \sim \text{MVN}(\boldsymbol 0, \underbrace{\Sigma(\boldsymbol \theta)}_{\substack{\text{variance-} \\ \text{covariance} \\ \text{matrix}}}) \end{split} \]

nonlinear mixed models

(by hand)

What are random effects/latent variables?

A method for …

latent-variable likelihoods

\[L(x|\beta,\sigma) = \int L(x|\beta,u) \cdot L(u|\sigma) \, du\] - likelihood is both penalized (\(u\) values can’t be too spread out) and integrated - need: - efficient ways to estimate \(\hat u\) (high-dimensional) - reasonable ways to integrate over distribution of \(u\)

shrinkage

shrinkage: Arabidopsis conditional modes

load("../data/Banta.RData")
library(plotrix)
library(lme4)
z<- subset(dat.tf,amd=="clipped" & nutrient=="1")
m1 <- glm(total.fruits~gen-1,data=z,family="poisson")
m2 <- glmer(total.fruits~1+(1|gen),data=z,family="poisson")
tt <- table(z$gen)
rr <- unlist(ranef(m2)$gen)[order(coef(m1))]+fixef(m2)
m1s <- sort(coef(m1))
m1s[1:2] <- rep(-5,2)
gsd <- attr(VarCorr(m2)$gen,"stddev")
gm <- fixef(m2)
nseq <- seq(-3,6,length.out=50)
sizefun <- function(x,smin=0.5,smax=3,pow=2) {
    smin+(smax-smin)*((x-min(x))/diff(range(x)))^pow
}
nv <- dnorm(nseq,mean=gm,sd=gsd)
##
op <- par(las=1,cex=1.5,bty="l")
plot(exp(m1s),xlab="Genotype",ylab="Mean fruit set",
     axes=FALSE,xlim=c(-0.5,25),log="y",yaxs="i",xpd=NA,
     pch=16,cex=0.5)
axis(side=1)
axis(side=2,at=c(exp(-5),0.1,1,10,20),
     labels=c(0,0.1,1,10,20),cex=0.8)
##     ylim=c(-3,5))
polygon(c(rep(0,50),nv*10),exp(c(rev(nseq),nseq)),col="gray",xpd=NA)
n <- tt[order(coef(m1))]
points(exp(rr),pch=16,col=adjustcolor("red",alpha=0.5),
       cex=sizefun(n),xpd=NA)
## text(seq_along(rr),rr,n,pos=3,xpd=NA,cex=0.6)
box()
axis.break(axis=2,breakpos=exp(-4))
legend("bottomright",
       c("group mean","shrinkage est."),
       pch=16,pt.cex=c(1,2),
       col=c("black",adjustcolor("red",alpha=0.5)),
       bty="n")

par(op)

integration techniques: deterministic

Fast, flexible, inaccurate \(\leftrightarrow\) slow, restrictive, accurate (Biswas 2015)

integration techniques: stochastic

Biswas, Keya. 2015. “Performances of Different Estimation Methods for Generalized Linear Mixed Models.” Master’s thesis, McMaster University. https://macsphere.mcmaster.ca/bitstream/11375/17272/2/M.Sc_Thesis_final_Keya_Biswas.pdf.

Madsen, Henrik, and Poul Thyregod. 2011. Introduction to General and Generalized Linear Models. CRC Press.

Peng, Roger D. 2019. Advanced Statistical Computing. Accessed May 14. https://github.com/rdpeng/advstatcomp.