hacking covariance structures in lme4

library(lme4)

A long-standing failing of lme4 is that it is not possible to fit models with structured covariance terms. However, this is hackable if you know what you’re doing …

Simulate data (unstructured/“pdDiag”).

dd <- expand.grid(id=factor(1:20),rep=factor(1:20))
set.seed(101)
dd <- transform(dd,x=rnorm(nrow(dd)),y=rnorm(nrow(dd)))
dd$z <- simulate(~1+ (x+y|id),
                 family=gaussian,
                 newdata=dd,
                 newparams=list(beta=1,
                                theta=rep(1,6),
                                sigma=1))[[1]]

The recipes below will work for any GLMM/response distribution, but sticking to a Gaussian response for now will make it easier to compare with the results of lme.
In general, in keeping with lme4’s general approach, I’m going to fit models on constrained parameter spaces (e.g. keeping correlation parameters between -1 and 1 and standard deviations >0, rather than transforming to fit them on the logit and log scales, respectively)

Full (unstructured but pos-def symmetric: `pdSymm`)

lmod <- lFormula(z ~ 1 + (x+y|id), data=dd)
devfun <- do.call(mkLmerDevfun, lmod)

Optimize the deviance function:

opt_Symm <- optimizeLmer(devfun)
VarCorr(m_Symm <- mkMerMod(environment(devfun), opt_Symm, lmod$reTrms, fr = lmod$fr))

##  Groups   Name        Std.Dev. Corr       
##  id       (Intercept) 0.73447             
##           x           0.66192  0.575      
##           y           1.18529  0.187 0.312
##  Residual             1.02588

This follows the steps shown in ?modular: it’s equivalent to what we would get from lmer(...).

multiple of identity matrix

This is equivalent to nlme’s pdIdent class.

wrap_Ident <- function(theta) {
  tvec <- c(theta,0,0,theta,0,theta)
  devfun(tvec)
}
opt0 <- optim(par=1,wrap_Ident,method="Brent",lower=0,upper=100)
opt_Ident <- with(opt0,list(par=par,fval=value,conv=convergence,
                            message=message))
VarCorr(m_Ident <- mkMerMod(environment(devfun), opt_Ident, lmod$reTrms, fr = lmod$fr))

##  Groups   Name        Std.Dev. Corr       
##  id       (Intercept) 0.89471             
##           x           0.89471  0.000      
##           y           0.89471  0.000 0.000
##  Residual             1.02545

We have to go a little out of our way to do 1-D derivative-free (and bounded) optimization. The upper bound must be finite, but 100 is a reasonable upper bound (the variance here is scaled relative to the residual std dev).

diagonal, heterogeneous variances

This is equivalent to nlme’s pdDiag class

wrap_Diag <- function(theta) {
  tvec <- c(theta[1],0,0,theta[2],0,theta[3])
  devfun(tvec)
}
opt_Diag <- nloptwrap(par=rep(1,3),wrap_Diag,lower=rep(0,3),upper=rep(Inf,3))
VarCorr(m_Diag <- mkMerMod(environment(devfun), opt_Diag, lmod$reTrms, fr = lmod$fr))

##  Groups   Name        Std.Dev. Corr       
##  id       (Intercept) 0.74209             
##           x           0.66310  0.000      
##           y           1.18553  0.000 0.000
##  Residual             1.02529

Checking the environment for pd* and cor* functions from lme, which may be useful in determining Cholesky factors …

library(nlme)
apropos("^pd[^f]")

##  [1] "pdBlocked"   "pdCompSymm"  "pdConstruct" "pdDiag"      "pdIdent"    
##  [6] "pdLogChol"   "pdMat"       "pdMatrix"    "pdNatural"   "pdSymm"

apropos("^cor[A-Z]")

##  [1] "corAR1"      "corARMA"     "corCAR1"     "corCompSymm" "corExp"     
##  [6] "corFactor"   "corGaus"     "corIdent"    "corLin"      "corMatrix"  
## [11] "corNatural"  "corRatio"    "corSpatial"  "corSpher"    "corSymm"

## ?pdFactor

AR1: homogeneous variances

Stole some nice code from here

## direct computation of Cholesky root for an AR(1) matrix
ar1_chol <- function(rho,p) {
  R <- matrix(0,p,p)
  R[1,] <- rho^(0:(p-1))        ## formula for 1st row
  cc <- sqrt(1 - rho^2);        ## scaling factor: c^2 + rho^2 = 1
  R2 <-  cc * R[1,]             ## formula for 2nd row */
  for (j in 2:p) {              ## shift elements in 2nd row for remaining rows
    R[j, j:p] <- R2[1:(p-j+1)] 
  }
  return(R)
}

Could also use nlme:::corAR1(), get corFactor() (the inverse cholesky factor), and unpack it, but this seems easier.

We’re going to assume here that the id values are AR1 (e.g. id values represent sequential measurements …

## keep messing up devfun??
## devfun <- do.call(mkLmerDevfun, lmod)
wrap_AR1 <- function(theta) {
  cc <- ar1_chol(theta[2],p=3)
  tvec <- t(cc)[lower.tri(cc,diag=TRUE)]*theta[1]
  d <- devfun(tvec)
  ## cat(theta,d,"\n")
  return(d)
}
opt_AR1 <- nloptwrap(par=c(1,0),wrap_AR1,lower=c(0,-0.49),upper=c(Inf,1))
VarCorr(m_AR1 <- mkMerMod(environment(devfun), opt_AR1, lmod$reTrms, fr = lmod$fr))

##  Groups   Name        Std.Dev. Corr       
##  id       (Intercept) 0.90232             
##           x           0.90232  0.399      
##           y           0.90232  0.159 0.399
##  Residual             1.02560

AR1: heterogeneous variances

wrap_AR1het <- function(theta) {
  n <- length(theta)
  cc <- ar1_chol(theta[n],p=n-1)
  tvec <- sweep(t(cc),1,theta[-n],"*")
  tvec <- tvec[lower.tri(tvec,diag=TRUE)]
  d <- devfun(tvec)
  return(d)
}
opt_AR1het <- nloptwrap(par=c(1,2,3,0.5),
                        wrap_AR1het,
                        lower=c(rep(0,3),-0.49),
                        upper=c(rep(Inf,3),1))
VarCorr(m_AR1het <- mkMerMod(environment(devfun),
                             opt_AR1het, lmod$reTrms, fr = lmod$fr))

##  Groups   Name        Std.Dev. Corr       
##  id       (Intercept) 0.72066             
##           x           0.65964  0.435      
##           y           1.21390  0.189 0.435
##  Residual             1.02558

compound symmetry

This might actually be the hardest case tried so far. We’d like to directly compute the Cholesky factor of a compound symmetric matrix, but it doesn’t seem to be very practical.

The short version of this is that if we can map from a set of parameters to a desired covariance matrix, we can always brute-force numerically Cholesky-decompose it (this is done e.g. in the corCAR1 and the spatial correlation structures (corExp, corGaus, etc.) in nlme).

The other question is whether there are natural bounds on the space of positive-(semi)-definite matrices of a particular type. In the compound-symmetric case it’s easy because the variances are >0 and the correlations are in \((-1/n,1)\) where \(n\) is the dimension of the matrix.

To get direct/simpler versions I tried things like:

FullSimplify[CholeskyDecomposition[((1,rho,rho),(rho,1,rho),(rho,rho,1))], Element[rho,Reals]]

(note that the parens `()` in this expression should actually be brackets `{}`: this confused GitHub pages for some reason)

in Wolfram Alpha.

For 3 \(\times\) 3 the result simplifies to:

\[ \begin{split} A & = \sqrt{1-\rho^2} \\ & \left(\begin{array}{ccc} 1 & \rho & \rho \\ 0 & A & \rho(1-\rho)/A \\ 0 & 0 & \sqrt{-\rho^2 - \rho^2((\rho-1)^2)/A^2 + 1} \end{array} \right) \end{split} \]

cs3_chol <- function(rho) {
  A <- sqrt(1-rho^2)
  matrix(c(1,rho,rho,
           0,A,rho*(1-rho)/A,
           0,0,
           sqrt(-rho^2-rho^2*((rho-1)^2)/A^2+1)),
         byrow=TRUE,nrow=3)
}

but for 4 \(\times\) 4 it is:

4x4 compsymm cholesky

While nlme provides a matrix square-root for pd* objects, it never promises that they’re triangular …

library(nlme)
x <- c(-1,-1)
attr(x,"ncol") <- 3
m <- matrix(nlme:::pdFactor.pdCompSymm(x),3,3)
crossprod(m,m) ## is indeed cs

##             [,1]        [,2]        [,3]
## [1,]  0.13533528 -0.01307175 -0.01307175
## [2,] -0.01307175  0.13533528 -0.01307175
## [3,] -0.01307175 -0.01307175  0.13533528

## but NOT lower-triangular

comp symm, homogeneous variance

wrap_CShom <- function(theta=c(1,0.1)) {
  rho <- theta[2]
  sd <- theta[1]
  m <- matrix(rho,3,3)
  diag(m) <- 1
  cc <- t(chol(m))*sd
  tvec <- cc[lower.tri(cc,diag=TRUE)]
  devfun(tvec)
}
opt_CShom <- nloptwrap(par=c(1,0.1),wrap_CShom,lower=c(0,-1/2),upper=c(Inf,1))
VarCorr(m_CShom <- mkMerMod(environment(devfun),
                            opt_CShom, lmod$reTrms, fr = lmod$fr))

##  Groups   Name        Std.Dev. Corr       
##  id       (Intercept) 0.89383             
##           x           0.89383  0.280      
##           y           0.89383  0.280 0.280
##  Residual             1.02560

heterogeneous diagonal

wrap_CShet <- function(theta=c(rep(1,3),0.1)) {
  rho <- theta[length(theta)]
  sd <- theta[-length(theta)]
  m <- matrix(rho,3,3)
  diag(m) <- 1
  m <- m * outer(sd,sd)
  cc <- t(chol(m))
  tvec <- cc[lower.tri(cc,diag=TRUE)]
  devfun(tvec)
}
opt_CShet <- nloptwrap(par=c(rep(1,3),0.1),
                  wrap_CShet,
                  lower=c(rep(0,3),-1/2),
                  upper=c(rep(Inf,3),1))
VarCorr(m_CShet <- mkMerMod(environment(devfun), opt_CShet,
                            lmod$reTrms, fr = lmod$fr))

##  Groups   Name        Std.Dev. Corr       
##  id       (Intercept) 0.73382             
##           x           0.64966  0.347      
##           y           1.21385  0.347 0.347
##  Residual             1.02549

To do

more intro?
more complex examples?
ref lme4ord ?
compare with lme/glmmTMB fits where possible?
revive/redo flexLambda!
comment on/compare with Lambda-hacking
phylo examples
Toeplitz/Kronecker examples?
ref Tanaka/Hui project on mixed model syntax?
ref MCMCglmm possibilities?
discuss/compare residuals vs random effects (also, Gaussian vs non-Gaussian case)

hacking covariance structures in lme4

Ben Bolker

09 Oct 2019

Full (unstructured but pos-def symmetric: `pdSymm`)

multiple of identity matrix

diagonal, heterogeneous variances

AR1: homogeneous variances

AR1: heterogeneous variances

compound symmetry

comp symm, homogeneous variance

heterogeneous diagonal

To do

hacking covariance structures in lme4

Ben Bolker

09 Oct 2019

Full (unstructured but pos-def symmetric: pdSymm)

multiple of identity matrix

diagonal, heterogeneous variances

AR1: homogeneous variances

AR1: heterogeneous variances

compound symmetry

comp symm, homogeneous variance

heterogeneous diagonal

To do

Full (unstructured but pos-def symmetric: `pdSymm`)