Responding to a query from Rolf Turner on `r-sig-mixed-models@r-project.org`


**tl;dr** the error messages are spurious/due to a singular fit. Considerable speed-ups can be achieved by (1) switching optimizers, (2) using `nAGQ=0`, (3) using `glmmTMB` or Julia …

**to do**

- compare variance-covariance matrix estimates (they vary a lot, although it doesn’t seem to have much effect on the fixed effects?)
- play with/fight with Julia more (native mode rather than the `rjulia` package? better retrieval of results? try `fast!` (corresponding to `nAGQ=0`)?)
- worth comparing with `brms` results … ??
- try estimation with treatment and treatment-by-x effects either as real random effects (i.e., with shrinkage) *or* with large fixed variance (for computational efficiency)

**caveats**

- Julia timing is still questionable; I have previously gotten results as fast as 7 seconds for the full model fit (vs. the 36 seconds reported below)

```
## fitting packages
library(lme4)
library(glmmTMB)
library(brms)
## post-analysis pkgs
library(broom) ## need BMB version for glmmTMB
library(ggplot2); theme_set(theme_bw())
library(MASS) ## before dplyr; avoid masking dplyr::select()
library(dplyr)
library(tidyr)
library(ggstance)
```

To run the code in the Julia section below you need to install:

- Julia; tested on versions 0.5.2 and 0.6.0 (I used 0.6.0 below). On Ubuntu you can

```
sudo add-apt-repository ppa:staticfloat/juliareleases
sudo apt-get update
sudo apt-get install julia
```

- Doug Bates’s MixedModels.jl package (in Julia, `Pkg.add("MixedModels")`)
- at present, an updated version of the `GLM` package for Julia that fixes an issue with the cloglog link (in Julia, `Pkg.checkout("GLM"); Pkg.checkout("GLM", "db/clampcloglog")`)

This example no longer uses the `rjulia` package for R. It did in previous versions; I needed to create a modified version (in R, `devtools::install_github("bbolker/rjulia", ref="julia0.5")`).

```
## simulate data, saving a copy to disk for the Julia fits below
set.seed(42)
X <- artSim()
write.csv(X, file="artSim.csv", row.names=FALSE)
```

```
t1 <- system.time(fit1 <- glmer(cbind(Dead,Alive) ~ (Trt + 0)/x + (x | Rep),
                                family=binomial(link="cloglog"),
                                data=X))
```

```
## Warning in (function (fn, par, lower = rep.int(-Inf, n), upper =
## rep.int(Inf, : failure to converge in 10000 evaluations
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : unable to evaluate scaled gradient
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge: degenerate Hessian with 1 negative
## eigenvalues
```

We should deal with the “failure to converge in 10000 evaluations” warning first: this comes directly from the optimizer (`minqa::bobyqa`), and hence means we don’t think we’ve even gotten to the optimum yet. We need to use the `optCtrl` component of `glmerControl()` to increase the maximum number of function evaluations.

```
t2 <- system.time(fit2 <- update(fit1,
                                 control=glmerControl(optCtrl=list(maxfun=3e4))))
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : unable to evaluate scaled gradient
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge: degenerate Hessian with 1 negative
## eigenvalues
```

Now we have just the post-fitting convergence warnings. The first thing we should do is to check for singularity: if the fit is singular, i.e. some of the \(\theta\) (variance-covariance) parameters are at their constrained values, then the warning about negative eigenvalues of the Hessian is largely irrelevant (it is a long-term wish-list item to evaluate the Karush-Kuhn-Tucker conditions properly in this case, i.e. allowing for the constraints).

```
## fit is singular if any theta (RE Cholesky) parameter is at its zero boundary
is.singular <- function(x,tol=1e-6) any(abs(getME(x,"theta"))<tol)
is.singular(fit2)
```

```
## [1] TRUE
```

Furthermore, in this case we can more simply look at the variance-covariance matrix and see that it has a zero variance (and an undefined correlation), which more directly indicates singularity.

```
VarCorr(fit2)
```

```
## Groups Name Std.Dev. Corr
## Rep (Intercept) 0.000000
## x 0.015276 NaN
```

I wanted to see if we could speed things up (without losing accuracy) by using the BOBYQA implementation from the `nloptr` package. It turns out that in order to do this, we have to skip the `nAGQ=0` init step (this is sometimes needed for cloglog models, especially those with offsets):

```
t3 <- system.time(fit3 <- update(fit1,
                                 control=glmerControl(optimizer="nloptwrap",
                                                      nAGQ0initStep=FALSE)))
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : unable to evaluate scaled gradient
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge: degenerate Hessian with 1 negative
## eigenvalues
```

Unfortunately, skipping the init step gives a log-likelihood that is 74 log-likelihood units worse (!).
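To quantify the discrepancy (a minimal sketch; both fits use the default `nAGQ=1`, so their log-likelihoods are directly comparable):

```
## compare log-likelihoods of the converged fit (fit2) and the
## no-init-step fit (fit3); higher is better
c(fit2 = logLik(fit2), fit3 = logLik(fit3))
```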

I tried `nAGQ0initStep=FALSE` with the default optimizer or the BOBYQA optimizer as well; it was very, very slow (\(\approx 600\) seconds with `maxfun=5e4`, which wasn’t even enough …)

```
t4 <- system.time(fit4 <- update(fit1,
                                 control=glmerControl(optimizer="bobyqa",
                                                      optCtrl=list(maxfun=1e5),
                                                      nAGQ0initStep=FALSE)))
```

What about the opposite (`nAGQ=0`), as suggested by Tony Ives?

```
t5 <- system.time(fit5 <- update(fit1, nAGQ=0,
                                 control=glmerControl(optimizer="bobyqa",
                                                      optCtrl=list(maxfun=3e4))))
```

Can’t use `nAGQ > 1` in this case because we have a random-slopes model … too bad. (We have a mean effective sample size (`min(alive, dead)`) of about 3, which implies that the Gauss-Hermite quadrature corrections *should* be important, but …)
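A quick check of that quantity (a minimal sketch, using the `Dead`/`Alive` columns of the simulated data):

```
## mean effective sample size per observation
mean(pmin(X$Alive, X$Dead))
```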

We can’t really compare the likelihoods with `nAGQ=0` vs. `nAGQ>0`, so we’ll just compare fixed-effect estimates (below).
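A minimal sketch of that comparison, putting the estimates side by side:

```
## fixed effects: nAGQ=1 (fit2) vs. nAGQ=0 (fit5)
cbind(nAGQ1 = fixef(fit2), nAGQ0 = fixef(fit5))
```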

```
t6 <- system.time(fit6 <- glmmTMB(Dead/(Alive+Dead) ~ (Trt + 0)/x + (x | Rep),
                                  weights = Alive+Dead,
                                  family=binomial(link="cloglog"), data=X))
```

```
t7 <- system.time(fit7 <- update(fit1,
                                 control=glmerControl(optimizer="bobyqa",
                                                      optCtrl=list(maxfun=3e4))))
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : unable to evaluate scaled gradient
```

```
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control
## $checkConv, : Model failed to converge: degenerate Hessian with 3 negative
## eigenvalues
```

Fitting the model in Julia is actually almost working, with a few caveats:

- I’m having trouble getting Julia to evaluate properly in an Rmarkdown chunk (can probably do this but will have to fight with it a bit)
- the `runr`-based strategy implemented here is flaky
- I could hack together something simpler based on sending text to Julia for evaluation (`rjulia -e`? `-E`?) but I’m not sure how well it would work
- `rjulia` works (if hacked as described above)
- I fell back on simply running the Julia code (below, and here) as a separate process and printing the results to a file
- it requires a fix to the Julia GLM internals so that cloglog doesn’t underflow

```
using DataFrames, GLM, MixedModels
const df = readtable("artSim.csv", makefactors = true);
## compute proportions and totals
df[:Tot] = float.(df[:Dead] + df[:Alive]);
df[:prop] = df[:Dead] ./ df[:Tot];
## for CPU timing:
### Pkg.clone("https://github.com/schmrlng/CPUTime.jl.git")
## fit without timing, for compilation
## (Trt+0)/x formula notation doesn't work; need to expand it ourselves
g3 = fit!(glmm(@formula(prop ~ 0 + Trt + Trt&x + (1 + x|Rep)), df,
               Binomial(), CloglogLink(), wt=Array(df[:Tot])), fast = true);
e1 = @elapsed fit!(glmm(@formula(prop ~ 0 + Trt + Trt&x + (1 + x|Rep)), df,
                        Binomial(), CloglogLink(), wt=Array(df[:Tot])), fast = true)
## non-fast (nAGQ>0)
e2 = @elapsed g4 = fit!(glmm(@formula(prop ~ 0 + Trt + Trt&x + (1 + x|Rep)), df,
                             Binomial(), CloglogLink(), wt=Array(df[:Tot])))
## store timing info
open("julia_cloglog_timings.txt","w") do f
    write(f, string(e1), " ", string(e2), "\n")
end
## store fixed-effect results
ct = coeftable(g4)
outdf = DataFrame(variable = ct.rownms,
                  Estimate = ct.cols[1],
                  StdError = ct.cols[2],
                  z_val = ct.cols[3])
writetable("julia_cloglog_coefs.csv", outdf)
ct = coeftable(g3)
outdf = DataFrame(variable = ct.rownms,
                  Estimate = ct.cols[1],
                  StdError = ct.cols[2],
                  z_val = ct.cols[3])
writetable("julia_cloglog_coefs_nagq0.csv", outdf)
```
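Back in R, the files written by the Julia script can be read back in for comparison (a minimal sketch, using the file names from the script above):

```
## timings (seconds) and fixed-effect estimates from the Julia fits
julia_times <- scan("julia_cloglog_timings.txt")
julia_coefs <- read.csv("julia_cloglog_coefs.csv")
```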

This `brms` fit works (sort of); it runs, but we get the following warnings:

```
1: The model has not converged (some Rhats are > 1.1). Do not analyse the results!
We recommend running more iterations and/or setting stronger priors.
2: There were 4000 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
```

```
t8 <- system.time(
    fit8 <- brm(Dead | trials(Dead+Alive) ~ (Trt + 0)/x + (x | Rep),
                family=binomial(link="cloglog"),
                inits="0",
                data=X))
```

Might work if we added some kind of priors (or group priors, i.e. shrinkage) on the fixed effects?
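For instance, something like this (an untested sketch; the `normal(0, 3)` prior and `adapt_delta=0.95` are arbitrary choices, the latter following the warning’s suggestion):

```
## hypothetical variant: mildly informative priors on the fixed effects,
## plus a higher adapt_delta as the Stan warning suggests
fit8b <- brm(Dead | trials(Dead+Alive) ~ (Trt + 0)/x + (x | Rep),
             family=binomial(link="cloglog"),
             prior=set_prior("normal(0, 3)", class="b"),
             control=list(adapt_delta=0.95),
             inits="0",
             data=X)
```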

Or perhaps more clearly: