Simulate a 2-level Multilevel/HLM/linear mixed model

I think multilevel models are still “a thing”, right? Like they used to be all hot and sexy a few years ago, but maybe by now they’ve just become a little more commonplace. The HLM software has made them very popular, although eventually I will rant about how much I dislike the whole “Level 1” and “Level 2” parlance that Raudenbush & Bryk helped disseminate. But for better or worse it’s here to stay and I need it if I want to communicate with my colleagues.

In any case, the model from which we will be simulating data in this post looks like this:

y = X\mathbf{\beta} + Z\mathbf{u} + \mathbf{\epsilon}

y is an n \times 1 continuous vector of data (the dependent variable).
X is an n \times p model matrix with a p \times 1 vector of coefficients \mathbf{\beta} (so this is the ‘fixed effects’ part of the model).
Z is another n \times q model matrix with a q \times 1 vector \mathbf{u} (the ‘random effects’).
\mathbf{\epsilon} is the n \times 1 vector of residuals.
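
To make Z concrete: with k groups of n observations each, Z is just a block of ones indicating group membership. For instance, for 2 “schools” with 3 “students” each (the toy numbers here are mine; the function below builds Z the same way):

# toy Z: 2 "schools" with 3 "students" each, a 6 x 2 indicator matrix
kronecker(diag(2), rep(1, 3))
#      [,1] [,2]
# [1,]    1    0
# [2,]    1    0
# [3,]    1    0
# [4,]    0    1
# [5,]    0    1
# [6,]    0    1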

I know people reading this are probably more familiar with the Level 1 and Level 2 way of looking at things, and it’s easy to make its relationship to the way I stated the model explicit. The specification above is the Laird-Ware (1982) one, which is, I think, a lot more common than the Raudenbush & Bryk one. Since I will be relying on matrices and matrix operations to simulate data, I thought it was a better idea to keep everything Laird-Ware.
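
Just to make that correspondence explicit for the simplest case (a random intercept and a single predictor; the subscripts here are mine), the Level 1 and Level 2 equations would be

y_{ij} = \beta_{0j} + \beta_{1}x_{ij} + \epsilon_{ij}

\beta_{0j} = \gamma_{00} + u_{0j}

and substituting the second into the first gives y_{ij} = \gamma_{00} + \beta_{1}x_{ij} + u_{0j} + \epsilon_{ij} , which is exactly y = X\mathbf{\beta} + Z\mathbf{u} + \mathbf{\epsilon} once you stack all the observations and let Z indicate which group j each row belongs to.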

In any case, by assuming that the random effects \mathbf{u} are distributed \mathbf{u} \sim N(0, \sigma^{2}D) (which is what we typically do when we fit a multilevel model), we have Var(y) = Var(Z\mathbf{u}) + Var(\mathbf{\epsilon}) . From regular OLS regression the assumption is that Var(\mathbf{\epsilon}) = \sigma^{2}I , to which we simply need to add Var(Z\mathbf{u}) = \sigma^{2}ZDZ^{t} . After doing some basic matrix algebra we end up with Var(y) = \sigma^{2}I + \sigma^{2}ZDZ^{t} = \sigma^{2}(I + ZDZ^{t}) , so that y \sim N(X\mathbf{\beta}, \sigma^{2}(I + ZDZ^{t}))
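
If you want to double-check that last step numerically, here is a quick sketch (the toy dimensions and object names are mine): simulate a bunch of y = Z\mathbf{u} + \mathbf{\epsilon} vectors for a small random-intercept layout and compare their empirical covariance with \sigma^{2}(I + ZDZ^{t}) .

library(MASS)  # for mvrnorm()

set.seed(123)
k <- 3; n <- 4; nk <- k * n
Z      <- kronecker(diag(k), rep(1, n))   # nk x k group-indicator matrix
sigma2 <- 1.5                             # residual variance
D      <- 2 * diag(k)                     # Var(u) = sigma2 * D, independent random intercepts

# simulate many y = Zu + e vectors (the fixed part doesn't affect the covariance)
ysim <- replicate(20000, {
  u <- mvrnorm(1, rep(0, k), sigma2 * D)
  e <- rnorm(nk, sd = sqrt(sigma2))
  as.vector(Z %*% u + e)
})

# empirical covariance vs. the implied sigma2 * (I + Z D Z^t), first few rows shown
round(cov(t(ysim))[1:5, 1:5], 2)
round((sigma2 * (diag(nk) + Z %*% D %*% t(Z)))[1:5, 1:5], 2)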

Because we’ve worked through the math a little bit here, now we know what we need to do in order to simulate data the way we want it: specify X\mathbf{\beta} for the fixed effects, the matrix D for the covariance of the random effects, and \sigma^{2} for the residual variance.

So here we go, ladies and gents! The magic R function to simulate data for a 2-level multilevel model:


library(MASS)   # for mvrnorm()
library(lme4)   # not needed for the simulation itself, but we will use it to fit the models

Simulate <- function(k, n, icc, beta, sigma=NA, sigmaXre=NA){

  nk <- n*k                             # total number of observations
  nx <- length(beta)-1                  # number of predictors (beta includes the intercept)
  Z  <- kronecker(diag(k), rep(1,n))    # nk x k matrix mapping each observation to its group

  # X matrix (predictors drawn from a multivariate normal)
  if (is.na(sigma[1])) sigma <- diag(nx)
  X <- mvrnorm(nk, rep(0,nx), sigma)

  # random effects of X: one slope multiplier per group and predictor, centred at 1
  if (!is.na(sigmaXre[1])){
    Xre <- mvrnorm(k, rep(1,nx), sigmaXre)
    Xre <- cbind(rep(1,nk), X * (Z %*% Xre))
  } else {
    Xre <- cbind(rep(1,nk), X)
  }
  X <- cbind(rep(1,nk), X)

  # create a factor to keep track of which "students" are in which "school"
  group <- as.factor(rep(1:k, each=n))

  # generate taus (random intercepts) and epsilons (residuals)
  tecov <- diag(c(icc, rep(1-icc,n)))
  te <- mvrnorm(k, rep(0,n+1), tecov)
  epsilons <- as.vector(t(te[,-1]))
  taus <- te[,1]

  # generate Y data: fixed part plus random part
  ran <- Z %*% taus + epsilons
  Y <- Xre %*% beta + ran

  data.frame(Y=as.vector(Y), X1=X[,2], X2=X[,3], X3=X[,4], group=group)
}
The fact of the matter is that the lme4 package is not needed for the simulation (you do need the MASS package, though), but we might as well load it because we’re gonna use it to fit the models below. Pay particular attention to how the covariance matrix of the random effects is used to create Xre, and to the section where you read # generate Y data. It becomes immediately apparent that the Laird-Ware notation is doing its magic here for us.

Now, how to use it? I’ll work through a few examples. Let’s pretend I want to explore a simple model where there are only random effects on the intercept. Maybe I would like to simulate data where I know the Intraclass Correlation Coefficient (ICC) in advance. We can do something like this:

# How many "schools"?
k <- 50
# How many "students" per "school"?
n <- 30
# Which ICC?
real.icc <- 0.6
# Which fixed effects?
beta <- c(1,2,3,4)
# Covariance matrices of the X variables (a positive-definite symmetric matrix)
sigma <- matrix(c(1,0,0,0,1,0,0,0,1), length(beta)-1)
# Covariance matrix of the random effects of X (all zeros means no random slopes)
sigmaXre <- matrix(c(rep(0,9)), length(beta)-1) #intercept-only model

So I’m simulating data with 50 Level 2 units (schools), each with 30 Level 1 units (students). My predictors are uncorrelated (so the sigma matrix is diagonal) and the random intercept (the ICC) is the only random-effects component, so I have to set sigmaXre to all zeros.

Then I only need to do this (I am obviously assuming you ran the previous snippet of code before calling the Simulate() function):

data1 <- Simulate(k, n, real.icc, beta, sigma, sigmaXre)

mod1 <- summary(lmer( Y ~ 1 + X1 + X2 + X3 + (1|group), data1))

Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ 1 + X1 + X2 + X3 + (1 | group)
   Data: data1

REML criterion at convergence: 3009.6

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.5056 -0.6846 -0.0119  0.6455  3.6272

Random effects:
 Groups   Name        Variance Std.Dev.
 group    (Intercept) 0.5825   0.7632
 Residual             0.3784   0.6152
Number of obs: 1500, groups: group, 50

Fixed effects:
            Estimate Std. Error t value
(Intercept)  0.85045    0.10910     7.8
X1           2.00578    0.01642   122.2
X2           3.01568    0.01721   175.2
X3           4.01071    0.01654   242.5

Correlation of Fixed Effects:
   (Intr) X1     X2
X1 -0.002
X2  0.004  0.012
X3  0.001  0.008  0.004

It seems like everything worked OK, right? I mean, my fixed effects were 1, 2, 3, 4 (where 1 is the intercept) and they were estimated at 0.85045, 2.00578, 3.01568 and 4.01071. So pretty darn close. My predictors are uncorrelated, so the correlation of fixed effects is pretty much 0 among all of them. But what about the ICC? Well, let’s extract the relevant parts of the output and calculate it. Let’s use the all-too-common formula:

ICC = \frac{\tau_{00}}{\tau_{00}+\sigma^{2}_{e}}

estimated.icc <- as.numeric(attr(mod1$varcor$group,"stddev"))^2 /
  (as.numeric(attr(mod1$varcor$group,"stddev"))^2 + as.numeric(attr(mod1$varcor,"sc"))^2)

Note that lme4 stores the standard deviations, so they need to be squared to get back the variances \tau_{00} and \sigma^{2}_{e} . Plugging in the variance components reported above, that is 0.5825 / (0.5825 + 0.3784) ≈ 0.61. The population ICC is supposed to be 0.6, so I would say the (large sample) estimate is reasonably close.
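
By the way, if you would rather not fish the pieces out with attr(), coercing the VarCorr object to a data frame gets you, as far as I can tell, the same numbers with less typing:

# the same ICC calculation via lme4's as.data.frame method for VarCorr objects
vc <- as.data.frame(mod1$varcor)           # columns: grp, var1, var2, vcov, sdcor
vc$vcov[vc$grp == "group"] / sum(vc$vcov)  # tau00 / (tau00 + sigma^2_e)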

But wait… maybe we would like all the predictors to have a random-effect component (so each one gets a random slope, not just the intercept). The only thing we need to do is alter the sigmaXre matrix to something like, for example:

sigmaXre <- diag(c(rep(1,length(beta)-1)))

so that a random effect is assigned to each predictor, while keeping the random effects uncorrelated (notice that it is a diagonal matrix). We then regenerate the data with this new sigmaXre and fit a model that also includes a random slope for every predictor, as sketched below.
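
Roughly like this (mod2 is just a name I picked; everything else reuses the objects defined earlier):

data1 <- Simulate(k, n, real.icc, beta, sigma, sigmaXre)
mod2 <- summary(lmer(Y ~ 1 + X1 + X2 + X3 + (1 + X1 + X2 + X3 | group), data1))

What we now get is: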

Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ 1 + X1 + X2 + X3 + (1 + X1 + X2 + X3 | group)
   Data: data1

REML criterion at convergence: 3883.5

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.9220 -0.6330  0.0120  0.6335  3.9198

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 group    (Intercept) 0.6793   0.8242
          X1          1.5114   1.2294   -0.14
          X2          1.9644   1.4016   -0.13  0.01
          X3          1.9936   1.4120   -0.10  0.00 -0.05
 Residual             0.4255   0.6523
Number of obs: 1500, groups: group, 50

Fixed effects:
            Estimate Std. Error t value
(Intercept)   0.9758     0.1179   8.273
X1            1.9439     0.1749  11.116
X2            3.1149     0.1991  15.647
X3            4.2951     0.2005  21.425

Correlation of Fixed Effects:
   (Intr) X1     X2
X1 -0.142
X2 -0.125  0.009
X3 -0.096  0.002 -0.054

So… same fixed effects, but now a variance component for each slope as well as the intercept. In general, playing with sigma and sigmaXre lets you generate all sorts of relationships among the fixed and random effects.
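
For instance, something along these lines (the particular values and the name data3 are just illustrative) would give you correlated predictors plus correlated random slopes for X1 and X2:

# correlated predictors (0.3 correlation among the Xs)
sigma <- matrix(c(1, 0.3, 0.3,
                  0.3, 1, 0.3,
                  0.3, 0.3, 1), length(beta)-1)
# correlated random effects for X1 and X2, uncorrelated random effect for X3
sigmaXre <- matrix(c(1, 0.5, 0,
                     0.5, 1, 0,
                     0, 0, 1), length(beta)-1)
data3 <- Simulate(k, n, real.icc, beta, sigma, sigmaXre)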

I hope this function will be useful in your explorations of multilevel models!