Exploring marginal predictions/bias correction for GLMMs. The basic problem is Jensen’s inequality: if we have among-group variation around a mean on the linear-predictor scale, using the inverse-link function to transform the mean back to the response scale will give a biased estimate of the response-scale (marginal) mean.
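A minimal numeric illustration of the problem (made-up values, roughly matching the simulation below: mean -2 and SD 2 on the logit scale):

set.seed(101)
eta <- rnorm(1e5, mean=-2, sd=2)  ## linear-predictor values across groups
plogis(mean(eta))                 ## inverse-link of the mean: plogis(-2) ~ 0.12
mean(plogis(eta))                 ## mean of the inverse-link: ~ 0.2 (noticeably larger)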

GLMMadaptive has marginal_coefs() to deal with this; emmeans has a delta-method approximation.

library(lme4)
## Loading required package: Matrix
library(GLMMadaptive)
## 
## Attaching package: 'GLMMadaptive'
## The following object is masked from 'package:lme4':
## 
##     negative.binomial
library(emmeans)

Simulated example: binomial, low probability, large among-group variance (should lead to large bias).

## 20 groups of 20 observations each
dd <- data.frame(grp=factor(rep(1:20,each=20)))
## simulate Bernoulli responses: intercept (beta) = -2 on the logit scale,
## among-group standard deviation (theta) = 2
dd$y <- simulate(~1+(1|grp),
                 seed=101,
                 weights=rep(1,nrow(dd)),
                 family=binomial,
                 newdata=dd,
                 newparam=list(beta=-2,theta=2))[[1]]

The overall proportion of 1s in the simulated data (mean(dd$y)) is 0.205.

Fit lme4 and GLMMadaptive models (nAGQ>1 is necessary for accuracy; setting nAGQ=11 in glmer matches GLMMadaptive’s default value):

m1 <- glmer(y~1+(1|grp), data=dd, family=binomial, nAGQ=11)
m2 <- mixed_model(y~1, random=~1|grp, data=dd, family=binomial)

Very similar results for the fixed-effect coefficient (intercept only) and the among-group variance:

all.equal(fixef(m1),fixef(m2), tolerance=2e-3)
## [1] TRUE
all.equal(c(VarCorr(m1)[[1]]),c(m2$D), tolerance=0.01)
## [1] TRUE
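
For the record, the estimates themselves can be printed directly (output not shown here; both should be close to the true values beta=-2, theta=2):

fixef(m1)            ## fixed-effect intercept on the logit scale
getME(m1, "theta")   ## among-group SD on the logit scale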

Hack prediction slightly (there are no non-trivial predictors but we still need one row in the data frame):

plogis(predict(m1,newdata=data.frame(x=1),re.form=NA))
##         1 
## 0.1086174
## equivalently
plogis(fixef(m1))
## (Intercept) 
##   0.1086174
## these are biased by ~50% as estimates of the marginal mean (0.108 vs 0.205):
## they describe a typical group (random effect = 0), not the population average

Brute-force simulation of back-transforming Normal variation:

mean(plogis(rnorm(10000,mean=fixef(m1),sd=getME(m1,"theta"))))  ## 0.1994 vs 0.205
## [1] 0.1994262
## or:
sv <- sapply(simulate(m1,1000),mean)
## simulating results gives approx correct means
summary(sv)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0650  0.1644  0.1950  0.1995  0.2350  0.4350
hist(sv,main="",breaks=100)

GLMMadaptive’s marginal_coefs() also does this by Monte Carlo simulation:

marginal_coefs(m2)  ## on link scale
## (Intercept) 
##     -1.3884
plogis(marginal_coefs(m2)$betas)  ## back-transform
## (Intercept) 
##   0.1996687
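
marginal_coefs() can also return standard errors (more slowly); a sketch based on its documented std_errors argument (output omitted):

marginal_coefs(m2, std_errors=TRUE)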

Delta-method approaches:

summary(emmeans(m1, specs=~1), type = "response", bias.adjust = TRUE)
##  1        prob     SE  df asymp.LCL asymp.UCL
##  overall 0.147 0.0553 Inf    0.0661     0.283
## 
## Confidence level used: 0.95 
## Intervals are back-transformed from the logit scale 
## Bias adjustment applied based on sigma = 1
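
Note the message: the adjustment used sigma = 1, which for a binomial glmer is the (fixed) residual sigma, not the among-group SD. A hedged sketch of supplying the among-group SD instead, via the documented sigma argument of summary.emmGrid (output not shown):

sig <- sqrt(c(VarCorr(m1)[[1]]))  ## among-group SD on the link scale
summary(emmeans(m1, specs=~1), type="response", bias.adjust=TRUE, sigma=sig)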

Use Python (sympy) so we can get simplified symbolic derivatives of the inverse-logit (brute-force D()/deriv() from R would probably work fine too??):

from sympy import *
x = symbols('x')
d1 = simplify(diff(1/(1+exp(-x)), x))
d2 = simplify(diff(1/(1+exp(-x)), x, x))
d3 = simplify(diff(1/(1+exp(-x)), x, x, x))
d4 = simplify(diff(1/(1+exp(-x)), x, x, x, x))
print([d1, d2, d3, d4])
## [1/(4*cosh(x/2)**2), (1 - exp(x))*exp(x)/(exp(x) + 1)**3, ((exp(x) + 1)**3 - 6*(exp(x) + 1)**2 + 6*exp(x) + 6)*exp(x)/(exp(x) + 1)**5, (-(exp(x) + 1)**3 + 14*(exp(x) + 1)**2 - 36*exp(x) - 12)*exp(x)/(exp(x) + 1)**5]
Translating the second and fourth derivatives into R:

d2 <- function(x) {
    ## second derivative of the inverse-logit (plogis)
    ex <- exp(x)
    (1 - ex)*ex/(ex + 1)^3
}
d4 <- function(x) {
    ## fourth derivative of the inverse-logit (plogis)
    ex <- exp(x)
    (-1*(ex + 1)^3 + 14*(ex + 1)^2 - 36*ex - 12)*ex/(ex + 1)^5
}

Check:

all.equal(d2(2.5), eval(D(D(expression(1/(1+exp(-x))),"x"),"x"), list(x=2.5)))
## [1] TRUE
## ugh
all.equal(d4(2.5), eval(D(D(D(D(expression(1/(1+exp(-x))),"x"),"x"),"x"),"x"),list(x=2.5)))
## [1] TRUE
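
Where do the corrections below come from? For X ~ Normal(mu, sigma^2), a Taylor expansion of f around mu gives

E[f(X)] ≈ f(mu) + f''(mu)*sigma^2/2 + f''''(mu)*(3*sigma^4)/factorial(4) + ...

(the odd-order terms drop out because the odd central moments of a Normal are zero; 3*sigma^4 is the Normal fourth central moment).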
b <- fixef(m1)
## second-order Taylor exp. (matches emmeans)
(delta_2 <- plogis(b) + sigma(m1)^2*d2(b)/2)
## (Intercept) 
##    0.146511
## attempt to extend the Taylor expansion: the third central moment (skew) is 0,
## so the next term is the 4th derivative times the 4th central moment (3*sigma^4)
delta_2 + (3*sigma(m1)^4)*d4(b)/factorial(4)
## (Intercept) 
##   0.1449778
## doesn't help; not sure what's wrong here (calculation error? thinko?)
## one suspect: sigma(m1) is the residual sigma (fixed at 1 for binomial),
## not the among-group SD (~2) that actually drives the bias
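
For reference, the “exact” marginal mean under the fitted model can be computed by numerical integration rather than Monte Carlo (a sketch using base R’s integrate(); the result should be close to the brute-force value of ~0.2):

b <- fixef(m1)
th <- getME(m1, "theta")
## integrate plogis(x) against the fitted Normal density of the linear predictor
integrate(function(x) plogis(x)*dnorm(x, mean=b, sd=th),
          lower=-Inf, upper=Inf)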