```r
library(lme4)
```

A long-standing limitation of lme4 is that it cannot fit models with structured covariance matrices for the random effects. However, this is hackable via the modular fitting functions if you know what you're doing …

Simulate data (from a model with an unstructured random-effects covariance matrix):

```r
dd <- expand.grid(id = factor(1:20), rep = factor(1:20))
set.seed(101)
dd <- transform(dd, x = rnorm(nrow(dd)), y = rnorm(nrow(dd)))
```
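Under the hood, `theta` is the column-wise lower triangle of the Cholesky factor of the random-effects covariance matrix, scaled by the residual standard deviation. A quick sketch of the covariance implied by the simulation parameters below (`theta_to_vcov` is a made-up helper, not part of lme4):

```r
## theta_to_vcov() is a hypothetical helper (not part of lme4):
## theta fills the lower triangle of the Cholesky factor L column-wise;
## the implied random-effects covariance is sigma^2 * L %*% t(L)
theta_to_vcov <- function(theta, sigma = 1, p = 3) {
    L <- matrix(0, p, p)
    L[lower.tri(L, diag = TRUE)] <- theta
    sigma^2 * tcrossprod(L)
}
theta_to_vcov(rep(1, 6))
##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    1    2    2
## [3,]    1    2    3
```

So `theta = rep(1, 6)` corresponds to a full (unstructured) covariance matrix, not a diagonal one.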
```r
dd$z <- simulate(~ 1 + (x + y | id), family = gaussian, newdata = dd,
                 newparams = list(beta = 1, theta = rep(1, 6), sigma = 1))[[1]]
```

- The recipes below will work for any GLMM/response distribution, but sticking to a Gaussian response for now makes it easier to compare the results with `lme`.
- In keeping with lme4's general approach, I'm going to fit models on constrained parameter spaces (e.g. keeping correlation parameters between -1 and 1 and standard deviations > 0, rather than transforming them to fit on the logit and log scales, respectively).

## Full (unstructured but pos-def symmetric: pdSymm)

```r
lmod <- lFormula(z ~ 1 + (x + y | id), data = dd)
devfun <- do.call(mkLmerDevfun, lmod)
```

Optimize the deviance function:

```r
opt_Symm <- optimizeLmer(devfun)
VarCorr(m_Symm <- mkMerMod(environment(devfun), opt_Symm, lmod$reTrms,
                           fr = lmod$fr))
##  Groups   Name        Std.Dev. Corr
##  id       (Intercept) 0.73447
##           x           0.66192  0.575
##           y           1.18529  0.187 0.312
##  Residual             1.02588
```

This follows the steps shown in `?modular`: it's equivalent to what we would get from `lmer(...)`.

## multiple of identity matrix

This is equivalent to nlme's `pdIdent` class.

```r
wrap_Ident <- function(theta) {
    tvec <- c(theta, 0, 0, theta, 0, theta)
    devfun(tvec)
}
opt0 <- optim(par = 1, wrap_Ident, method = "Brent", lower = 0, upper = 100)
opt_Ident <- with(opt0, list(par = par, fval = value, conv = convergence,
                             message = message))
VarCorr(m_Ident <- mkMerMod(environment(devfun), opt_Ident, lmod$reTrms,
                            fr = lmod$fr))
##  Groups   Name        Std.Dev. Corr
##  id       (Intercept) 0.89471
##           x           0.89471  0.000
##           y           0.89471  0.000 0.000
##  Residual             1.02545
```

We have to go a little out of our way to do one-dimensional derivative-free (and bounded) optimization. The upper bound for `optim(method = "Brent")` must be finite, but 100 is a reasonable upper bound here (the standard deviation is scaled relative to the residual standard deviation).
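A quick sanity check (a sketch; `ident_vcov` is a made-up name) that the template `c(theta, 0, 0, theta, 0, theta)` really is the column-wise lower triangle of `theta * I`, so the implied covariance is a multiple of the identity:

```r
## ident_vcov() is a hypothetical helper: unpack the pdIdent-style
## theta template (3x3 case only) into the Cholesky factor s * I
## and form the implied covariance s^2 * I
ident_vcov <- function(s) {
    L <- matrix(0, 3, 3)
    L[lower.tri(L, diag = TRUE)] <- c(s, 0, 0, s, 0, s)
    tcrossprod(L)   ## = s^2 * diag(3)
}
ident_vcov(2)
##      [,1] [,2] [,3]
## [1,]    4    0    0
## [2,]    0    4    0
## [3,]    0    0    4
```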
## diagonal, heterogeneous variances

This is equivalent to nlme's `pdDiag` class.

```r
wrap_Diag <- function(theta) {
    tvec <- c(theta[1], 0, 0, theta[2], 0, theta[3])
    devfun(tvec)
}
opt_Diag <- nloptwrap(par = rep(1, 3), wrap_Diag,
                      lower = rep(0, 3), upper = rep(Inf, 3))
VarCorr(m_Diag <- mkMerMod(environment(devfun), opt_Diag, lmod$reTrms,
                           fr = lmod$fr))
##  Groups   Name        Std.Dev. Corr
##  id       (Intercept) 0.74209
##           x           0.66310  0.000
##           y           1.18553  0.000 0.000
##  Residual             1.02529
```

Checking the environment for `pd*` and `cor*` functions from nlme, which may be useful in determining Cholesky factors …

```r
library(nlme)
apropos("^pd[^f]")
##  [1] "pdBlocked"   "pdCompSymm"  "pdConstruct" "pdDiag"      "pdIdent"
##  [6] "pdLogChol"   "pdMat"       "pdMatrix"    "pdNatural"   "pdSymm"
apropos("^cor[A-Z]")
##  [1] "corAR1"      "corARMA"     "corCAR1"     "corCompSymm" "corExp"
##  [6] "corFactor"   "corGaus"     "corIdent"    "corLin"      "corMatrix"
## [11] "corNatural"  "corRatio"    "corSpatial"  "corSpher"    "corSymm"
## ?pdFactor
```

## AR1: homogeneous variances

Stole some nice code from here.

```r
## direct computation of the Cholesky root of an AR(1) matrix
ar1_chol <- function(rho, p) {
    R <- matrix(0, p, p)
    R[1, ] <- rho^(0:(p-1))    ## formula for the 1st row
    cc <- sqrt(1 - rho^2)      ## scaling factor: cc^2 + rho^2 = 1
    R2 <- cc * R[1, ]          ## formula for the 2nd row
    for (j in 2:p) {           ## shift 2nd-row elements for remaining rows
        R[j, j:p] <- R2[1:(p-j+1)]
    }
    return(R)
}
```

We could also use nlme's `corAR1`, get `corFactor()` (the inverse Cholesky factor), and unpack it, but this seems easier. We're going to assume that the three random-effect values within each id have AR1 structure (e.g. as if they represented sequential measurements) …

```r
## keep messing up devfun??
## devfun <- do.call(mkLmerDevfun, lmod)
wrap_AR1 <- function(theta) {
    cc <- ar1_chol(theta[2], p = 3)
    tvec <- t(cc)[lower.tri(cc, diag = TRUE)] * theta[1]
    d <- devfun(tvec)
    ## cat(theta, d, "\n")
    return(d)
}
opt_AR1 <- nloptwrap(par = c(1, 0), wrap_AR1,
                     lower = c(0, -0.49), upper = c(Inf, 1))
VarCorr(m_AR1 <- mkMerMod(environment(devfun), opt_AR1, lmod$reTrms,
                          fr = lmod$fr))
##  Groups   Name        Std.Dev. Corr
##  id       (Intercept) 0.90232
##           x           0.90232  0.399
##           y           0.90232  0.159 0.399
##  Residual             1.02560
```

## AR1: heterogeneous variances

```r
wrap_AR1het <- function(theta) {
    n <- length(theta)
    cc <- ar1_chol(theta[n], p = n - 1)
    tvec <- sweep(t(cc), 1, theta[-n], "*")
    tvec <- tvec[lower.tri(tvec, diag = TRUE)]
    d <- devfun(tvec)
    return(d)
}
opt_AR1het <- nloptwrap(par = c(1, 2, 3, 0.5), wrap_AR1het,
                        lower = c(rep(0, 3), -0.49),
                        upper = c(rep(Inf, 3), 1))
VarCorr(m_AR1het <- mkMerMod(environment(devfun), opt_AR1het, lmod$reTrms,
                             fr = lmod$fr))
##  Groups   Name        Std.Dev. Corr
##  id       (Intercept) 0.72066
##           x           0.65964  0.435
##           y           1.21390  0.189 0.435
##  Residual             1.02558
```

## compound symmetry

This might actually be the hardest case tried so far. We'd like to compute the Cholesky factor of a compound-symmetric matrix directly, but it doesn't seem to be very practical. The short version is that if we can map from a set of parameters to a desired covariance matrix, we can always Cholesky-decompose it numerically by brute force (this is done e.g. in `corCAR1` and the spatial correlation structures (`corExp`, `corGaus`, etc.) in nlme). The other question is whether there are natural bounds on the space of positive-(semi)definite matrices of a particular type. In the compound-symmetric case it's easy: the variances are > 0 and the correlation is in $(-1/(n-1), 1)$, where $n$ is the dimension of the matrix (so $-1/2$ for the $3 \times 3$ case below).

To get direct/simpler versions I tried things like

```
FullSimplify[CholeskyDecomposition[{{1,rho,rho},{rho,1,rho},{rho,rho,1}}],
             Element[rho,Reals]]
```

in Wolfram Alpha.
For $3 \times 3$ the result simplifies to

$$
A = \sqrt{1-\rho^2}, \qquad
\left(\begin{array}{ccc}
1 & \rho & \rho \\
0 & A & \rho(1-\rho)/A \\
0 & 0 & \sqrt{1 - \rho^2 - \rho^2(\rho-1)^2/A^2}
\end{array}\right)
$$

```r
cs3_chol <- function(rho) {
    A <- sqrt(1 - rho^2)
    matrix(c(1, rho, rho,
             0, A,   rho*(1-rho)/A,
             0, 0,   sqrt(-rho^2 - rho^2*((rho-1)^2)/A^2 + 1)),
           byrow = TRUE, nrow = 3)
}
```

but for $4 \times 4$ the corresponding expression is already too unwieldy to be useful.

While nlme provides a matrix square root for `pd*` objects, it never promises that it's triangular …

```r
library(nlme)
x <- c(-1, -1)
attr(x, "ncol") <- 3
m <- matrix(nlme:::pdFactor.pdCompSymm(x), 3, 3)
crossprod(m, m)  ## is indeed compound-symmetric
##             [,1]        [,2]        [,3]
## [1,]  0.13533528 -0.01307175 -0.01307175
## [2,] -0.01307175  0.13533528 -0.01307175
## [3,] -0.01307175 -0.01307175  0.13533528
## but NOT lower-triangular
```

## comp symm, homogeneous variance

```r
wrap_CShom <- function(theta = c(1, 0.1)) {
    rho <- theta[2]
    sd <- theta[1]
    m <- matrix(rho, 3, 3)
    diag(m) <- 1
    cc <- t(chol(m)) * sd
    tvec <- cc[lower.tri(cc, diag = TRUE)]
    devfun(tvec)
}
opt_CShom <- nloptwrap(par = c(1, 0.1), wrap_CShom,
                       lower = c(0, -1/2), upper = c(Inf, 1))
VarCorr(m_CShom <- mkMerMod(environment(devfun), opt_CShom, lmod$reTrms,
                            fr = lmod$fr))
##  Groups   Name        Std.Dev. Corr
##  id       (Intercept) 0.89383
##           x           0.89383  0.280
##           y           0.89383  0.280 0.280
##  Residual             1.02560
```

## comp symm, heterogeneous variances

```r
wrap_CShet <- function(theta = c(rep(1, 3), 0.1)) {
    rho <- theta[length(theta)]
    sd <- theta[-length(theta)]
    m <- matrix(rho, 3, 3)
    diag(m) <- 1
    m <- m * outer(sd, sd)
    cc <- t(chol(m))
    tvec <- cc[lower.tri(cc, diag = TRUE)]
    devfun(tvec)
}
opt_CShet <- nloptwrap(par = c(rep(1, 3), 0.1), wrap_CShet,
                       lower = c(rep(0, 3), -1/2),
                       upper = c(rep(Inf, 3), 1))
VarCorr(m_CShet <- mkMerMod(environment(devfun), opt_CShet, lmod$reTrms,
                            fr = lmod$fr))
##  Groups   Name        Std.Dev. Corr
##  id       (Intercept) 0.73382
##           x           0.64966  0.347
##           y           1.21385  0.347 0.347
##  Residual             1.02549
```
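As a sanity check on the closed-form compound-symmetry factor, `crossprod()` of `cs3_chol(rho)` should reconstruct the $3 \times 3$ compound-symmetric correlation matrix (the function is repeated here so the snippet stands alone):

```r
## closed-form Cholesky root of a 3x3 compound-symmetric correlation
## matrix (same as above), repeated so this check is self-contained
cs3_chol <- function(rho) {
    A <- sqrt(1 - rho^2)
    matrix(c(1, rho, rho,
             0, A,   rho*(1-rho)/A,
             0, 0,   sqrt(-rho^2 - rho^2*((rho-1)^2)/A^2 + 1)),
           byrow = TRUE, nrow = 3)
}
rho <- 0.3
m <- matrix(rho, 3, 3)
diag(m) <- 1
max(abs(crossprod(cs3_chol(rho)) - m))  ## effectively zero
```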

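Similarly, it's worth confirming that `crossprod()` of `ar1_chol(rho, p)` reproduces the AR1 correlation matrix $\rho^{|i-j|}$ (function repeated so the snippet runs on its own):

```r
## Cholesky root of an AR(1) correlation matrix (same as above),
## repeated so this check is self-contained
ar1_chol <- function(rho, p) {
    R <- matrix(0, p, p)
    R[1, ] <- rho^(0:(p-1))
    cc <- sqrt(1 - rho^2)
    R2 <- cc * R[1, ]
    for (j in 2:p) R[j, j:p] <- R2[1:(p-j+1)]
    R
}
p <- 4; rho <- 0.5
ar1_mat <- rho^abs(outer(1:p, 1:p, "-"))  ## target: rho^|i-j|
max(abs(crossprod(ar1_chol(rho, p)) - ar1_mat))  ## effectively zero
```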
## To do

- more intro?
- more complex examples?
- ref lme4ord?
- compare with lme/glmmTMB fits where possible?
- revive/redo flexLambda!
- comment on/compare with Lambda-hacking
- phylo examples
- Toeplitz/Kronecker examples?
- ref Tanaka/Hui project on mixed model syntax?
- ref MCMCglmm possibilities?
- discuss/compare residuals vs random effects (also, Gaussian vs non-Gaussian case)