What does the set \(\{\rho_1, \rho_2, \rho_3\}: C \textrm{ is pos. def.}\) look like? This is answered nicely by Rousseeuw and Molenberghs (1994), but I will explore it some here. (Will also consider how this connects to singular model fits in mixed models …)
Based on the determinant characteristic for the edge of the space (\(\det(C) = 0\)), Roousseuw and Molenberghs derive this relationship:
Indeed, if we fix \(r_{YZ} = c\) with \(|c| < 1\), we find \[
r^2_{XY} + r^2_{XZ} - 2 c r_{XY} r_{XZ} = 1 - c^2
\]
which they presumably used to make this picture:
(they also comment that the rounded triangle you get from a projection is “as used in a Wankel engine or a movie projector”; see also here and Shung and Pennock (1994) for more on the shape of the Wankel rotor, which is probably not exactly the same as this projection …).
Brute force: generate a grid, compute determinants, draw the contour where \(\det(C) = 0\) (much faster than computing minimum eigenvalues!)
Code
n <-51cvec <-seq(-1, 1, length.out = n)M <-as.matrix(expand.grid(r1=cvec, r2=cvec, r3=cvec))efun <-function(x) { M <-diag(3)## works for 3x3 but not more generally? M[lower.tri(M)] <- x M[upper.tri(M)] <- x M}efun2 <-function(x) det(efun(x))e.val <-apply(M, 1, efun2)
Code
aa <-array(e.val, c(n, n, n))misc3d::contour3d(aa, cvec, cvec, cvec, level =0,## this renders as a nice clear mesh in X11 but## the rendering seems to be messed up in web3d/## whatever rgl using in the HTML rendering## fill = FALSE,color ="blue",alpha =0.5)## add pos-defcsym <- M |>as.data.frame() |> dplyr::filter(e.val>=0& r1 == r2 & r1 == r3)## min neg value for pos def?axes3d()with(csym, lines3d(r1, r2, r3, color ="red", lwd =3))spheres3d(-1/2, -1/2, -1/2, color ="red", radius =0.05)grid3d(side =c("x-", "y-", "z-"))
The red line shows the compound symmetric case (\(r_1=r_2=r_3\)), the ball is the minimum value for \(K=3\) (\(r=-1/2\)). You can use the mouse to rotate and zoom the figure …
It would be interesting to simulate a bunch of multivariate data sets with large noise/low replication and look at the distribution of \(\{\rho_1, \rho_2, \rho_3\}\) estimates in this space (i.e., how do they cluster on the boundaries?)
simulations
Now I’m interested in seeing the distribution of fitted correlation parameters in a mixed model in the case where we have small/noisy data sets. The model is
\[
\begin{split}
y_{ij} & = \beta_0 + \epsilon_{c, ij} + \epsilon_{r, ijk} \\
\epsilon_{r, ijk} & \sim N(0, \sigma^2_r) \\
\epsilon_{c, i} & \sim \textrm{MVN}(0, C)
\end{split}
\] where \(i\)=group, \(j\)=level, \(k\)=replicate. That is, we have a \(3\times 3\) correlated random effect (with all variances equal to 1).
Code
## shouldn't hard-code K=3 here ...my_eig <- purrr::safely(\(x) eigen(x, only.values =TRUE)$values,otherwise =rep(NA_real_, 3))#' @param ng number of groups#' @param n number per group#' @param K dimension#' @param eta LKJ parameter (1 = flat, >1 = biased toward diagonal)#' @param cor_method how to pick correlation (random or user-specified?)#' @param corsimfun <-function(ng =5, n =5, K =3, eta =0.1,cor_method =c("rlkj", "matspec"),sd_r =1,cor_val =NULL) { cor_method <-match.arg(cor_method)if (cor_method =="rlkj") { ok <-FALSEwhile (!ok) { cormat <- rethinking::rlkjcorr( n =1 , K =K , eta = eta) corchol <-try(t(chol(cormat)), silent =TRUE) ok <-!inherits(corchol, "try-error") } } else { cormat <- cor_val corchol <-t(chol(cormat)) } corcvec <- corchol[lower.tri(corchol, diag =TRUE)] varnm <-paste0("x", 1:(K-1)) dd <-matrix(rnorm((K-1)*n*ng), ncol = K-1) |>as.data.frame() |>setNames(varnm) dd$g <-factor(rep(1:ng, each = n)) reterm <-sprintf("(%s | g)", paste(varnm, collapse ="+")) dd$y <-simulate(reformulate(c("1", reterm)),family = gaussian,newdata = dd,newparams =list(beta =0, theta = corcvec, sigma = sd_r))[[1]]return(dd)}fitfun <-function(dd, method =c("lme4", "glmmTMB_cor")) { method <-match.arg(method) varnm <-setdiff(colnames(dd), c("y", "g")) reterm <-sprintf("(%s | g)", paste(c("1", varnm), collapse ="+")) form <-reformulate(c("1", reterm), response ="y") fit <-suppressWarnings(suppressMessages(switch(method,lme4 =lmer(form, data = dd),glmmTMB_cor =glmmTMB(form,data = dd,map =list(theta =factor(c(rep(NA, 3), 1, 2, 3))),start =list(theta =rep(0, 6))) ) ))return(fit)}sumfun <-function(fit) { vv <-VarCorr(fit) vv <-if (inherits(fit, "glmmTMB")) vv$cond$g else vv$g sdvec_est <-sqrt(diag(vv)) cormat_est <-attr(vv, "corr") corvec_est <- cormat_est[lower.tri(cormat_est)] detv <-det(vv) detc <-det(cormat_est)list(sdvec = sdvec_est, cormat = cormat_est,corvec = corvec_est,cov_det = detv, cor_det = detc,sing = performance::check_singularity(fit))}## testingset.seed(101)sim1 <-simfun(cor_method ="matspec", cor_val =efun(rep(0.8, 3)))sum1 <-sumfun(fit1 <-fitfun(sim1))sum2 <-sumfun(fit2 <-fitfun(sim1, method ="glmmTMB_cor"))## isSingular(fit1, tol = 1e-3)## not rated as singular because default tol is 1e-4 ...mk_sims <-function(true_corvec, nsims =1000,do_pb =interactive(), seed =NULL,sim_args =list(),fit_args =list()) {if (!is.null(seed)) set.seed(seed) sims <-vector("list", length = nsims)if (do_pb) pb <-txtProgressBar(style =3, max = nsims)for (i in1:nsims) {if (interactive()) setTxtProgressBar(pb, i) sim0 <-do.call(simfun,c(list(cor_method ="matspec", cor_val =efun(true_corvec)), sim_args)) fit0 <-do.call(fitfun, c(list(sim0), fit_args)) sims[[i]] <-sumfun(fit0) }names(true_corvec) <-paste0("r", 1:3)attr(sims, "truecor") <- true_corvecreturn(sims)}
Using true correlation parameters \(\{0.5, -0.3, 0.5\}\) (shown by the purple sphere below) with a small data set (5 groups, 5 [trivariate] observations per group) and large noise (\(\sigma^2_r=1\)), we get lots of variation in the estimated parameters, with many being singular/on the edge of the elliptope (red dots have \(\det(C) < 10^{-6}\); black dots are in the interior).