I think multilevel models are still “a thing”, right? Like they used to be all hot and sexy a few years ago but maybe they’ve started to become a little more common. The HLM software has made them very popular, although eventually I will rant about how much I dislike the whole “Level 1” and “Level 2” parlance that Raudenbush & Bryk helped disseminate. But for better or worse it’s here to stay and I need it if I want to communicate with my colleagues.

In any case, the model from which we will be simulating data in this post looks like this:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{b} + \boldsymbol{\epsilon}$$

Where:

- $\mathbf{y}$ is an $n \times 1$ continuous vector of data (the dependent variable).
- $\mathbf{X}$ is an $n \times p$ model matrix with $\boldsymbol{\beta}$ a $p \times 1$ vector of coefficients (so this is the ‘fixed effects’ part of the model).
- $\mathbf{Z}$ is another $n \times q$ model matrix with $\mathbf{b}$ a $q \times 1$ vector (the ‘random effects’).
- $\boldsymbol{\epsilon}$ are the residuals.

I know people reading this are probably more familiar with the Level 1 and Level 2 approach of looking at things, and I’m pretty sure it’s easy to make explicit its relationship with the way I stated the model, which is the Laird-Ware (1982) specification (this is, I think, a lot more common than the Raudenbush & Bryk one). Since I will be relying on matrices and matrix operations to simulate data, I thought it was a better idea to keep everything Laird-Ware.
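For the curious, here is a quick sketch of that relationship for a model with one predictor, with $i$ indexing students and $j$ schools:

$$
\begin{aligned}
\text{Level 1:}\quad & y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \epsilon_{ij}\\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + u_{0j}, \qquad \beta_{1j} = \gamma_{10} + u_{1j}
\end{aligned}
$$

Substituting the Level 2 equations into Level 1 gives $y_{ij} = \gamma_{00} + \gamma_{10}x_{ij} + u_{0j} + u_{1j}x_{ij} + \epsilon_{ij}$: the $\gamma$ terms are the fixed-effects part $\mathbf{X}\boldsymbol{\beta}$ and the $u$ terms are the random-effects part $\mathbf{Z}\mathbf{b}$.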

In any case, by assuming that the random effects are distributed $\mathbf{b} \sim N(\mathbf{0}, \mathbf{G})$ (which is what we typically do when we fit a multilevel model), then $\mathbf{Z}\mathbf{b} \sim N(\mathbf{0}, \mathbf{Z}\mathbf{G}\mathbf{Z}^{\top})$. From regular OLS regression the assumption is that $\boldsymbol{\epsilon} \sim N(\mathbf{0}, \sigma^{2}\mathbf{I})$, to which we simply need to add $\mathbf{Z}\mathbf{b}$. After doing some basic matrix algebra we now have $\mathbf{V} = \mathbf{Z}\mathbf{G}\mathbf{Z}^{\top} + \sigma^{2}\mathbf{I}$ so that $\mathbf{y} \sim N(\mathbf{X}\boldsymbol{\beta}, \mathbf{V})$.

Because we’ve worked through the math a little bit here, now we know what we need to do in order to simulate data the way we want it: specify $\boldsymbol{\beta}$ for the fixed effects, the matrix $\mathbf{G}$ for the covariance of the random effects and $\sigma^{2}$ for the residual.
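As a quick numeric sanity check of that marginal covariance, here is a minimal sketch with made-up values (3 groups of 2 observations each, random intercepts only; the names `k0`, `n0`, etc. are mine):

```
# Tiny random-intercept example of V = Z G Z' + sigma^2 I
k0 <- 3; n0 <- 2                        # 3 "schools", 2 "students" each (made-up)
Z  <- kronecker(diag(k0), rep(1, n0))   # 6 x 3 indicator matrix of group membership
G  <- diag(0.6, k0)                     # random-intercept variance tau^2 = 0.6
sigma2 <- 0.4                           # residual variance
V  <- Z %*% G %*% t(Z) + sigma2 * diag(k0 * n0)

V[1, 1]  # tau^2 + sigma^2 = 1 on the diagonal
V[1, 2]  # tau^2 = 0.6 for two students in the same school
V[1, 3]  # 0 for students in different schools
```

Note that $0.6/(0.6 + 0.4) = 0.6$ is exactly the intraclass correlation, which is how the function below will parameterize things.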

So here we go, ladies and gents! The magic R function to simulate data for a 2-level multilevel model!

```
library(lme4)  # not needed for the simulation itself, but we will fit models with it
library(MASS)  # for mvrnorm()

Simulate <- function(k, n, icc, beta, sigma = NA, sigmaXre = NA) {

  nk <- n * k             # total number of observations
  nx <- length(beta) - 1  # number of predictors (beta includes the intercept)

  # Z matrix: indicator of group membership
  Z <- kronecker(diag(k), rep(1, n))

  # X matrix
  if (is.na(sigma[1])) sigma <- diag(nx)
  X <- mvrnorm(nk, rep(0, nx), sigma)

  # random effects of X: multiplicative slope effects centered at 1,
  # drawn with mvrnorm() so the full covariance matrix sigmaXre is honored
  if (!is.na(sigmaXre[1])) {
    Xre <- mvrnorm(k, rep(1, nx), sigmaXre)  # k x nx matrix of slope multipliers
    Xre <- cbind(rep(1, nk), X * (Z %*% Xre))
  } else {
    Xre <- cbind(rep(1, nk), X)
  }
  X <- cbind(rep(1, nk), X)

  # create a factor to keep track of which "students" are in which "school"
  group <- as.factor(rep(1:k, each = n))

  # generate taus and epsilons: Var(tau) = icc and Var(epsilon) = 1 - icc,
  # so that tau^2 / (tau^2 + sigma^2) equals the requested icc
  tecov <- diag(c(icc, rep(1 - icc, n)))
  te <- mvrnorm(k, rep(0, n + 1), tecov)
  epsilons <- as.vector(t(te[, -1]))
  taus <- te[, 1]

  # generate Y data (Laird-Ware: Y = X*beta + Z*tau + epsilon)
  ran <- Z %*% taus + epsilons
  Y <- Xre %*% beta + ran

  output <- data.frame(Y = as.vector(Y), X[, -1, drop = FALSE], group = group)
  colnames(output) <- c("Y", paste0("X", 1:nx), "group")
  return(output)
}
```

The fact of the matter is that the `lme4` package is not needed for the simulation itself (you *do* need the `MASS` package, though), but we might as well load it because we’re gonna use it in future examples. Pay particular attention to how the matrix of random effects `Xre` is created and to the section where you read `# generate Y data`. It becomes immediately apparent that the Laird-Ware notation is doing its magic for us here.

Now, how to use it? I’ll work through a few examples. Let’s pretend I want to explore a simple model where there are only random effects on the intercept. Maybe I would like to simulate data where I know the Intraclass Correlation Coefficient (ICC) in advance. We can do something like this:

```
# How many "schools"?
k <- 50

# How many "students" per "school"?
n <- 30

# Which ICC?
real.icc <- 0.6

# Which fixed effects?
beta <- c(1, 2, 3, 4)

# Covariance matrix of the X variables (a positive-definite symmetric matrix)
sigma <- matrix(c(1, 0, 0, 0, 1, 0, 0, 0, 1), length(beta) - 1)

# Covariance matrix of the random effects of X
# (or vector of variances where zero means no random effect)
sigmaXre <- matrix(c(rep(0, 9)), length(beta) - 1)  # intercept-only model
```

So I’m simulating data with 50 Level 2 units (schools), each with 30 Level 1 units (students). My predictors are uncorrelated (so the `sigma` matrix is diagonal) and the ICC is the only random-effects component, so I have to set `sigmaXre` to be zero.

I only need to do this, then (I am obviously assuming you ran the previous snippet of code before using the `Simulate()` function):

```
data1 <- Simulate(k, n, real.icc, beta, sigma, sigmaXre)
mod1 <- summary(lmer(Y ~ 1 + X1 + X2 + X3 + (1 | group), data1))
```

```
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ 1 + X1 + X2 + X3 + (1 | group)
   Data: data1

REML criterion at convergence: 3009.6

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.5056 -0.6846 -0.0119  0.6455  3.6272

Random effects:
 Groups   Name        Variance Std.Dev.
 group    (Intercept) 0.5825   0.7632
 Residual             0.3784   0.6152
Number of obs: 1500, groups:  group, 50

Fixed effects:
            Estimate Std. Error t value
(Intercept)  0.85045    0.10910     7.8
X1           2.00578    0.01642   122.2
X2           3.01568    0.01721   175.2
X3           4.01071    0.01654   242.5

Correlation of Fixed Effects:
   (Intr) X1     X2
X1 -0.002
X2  0.004  0.012
X3  0.001  0.008  0.004
```

It seems like everything worked OK, right? I mean, my fixed effects were 1, 2, 3, 4 (where 1 is the intercept) and they were estimated at 0.85045, 2.00578, 3.01568 and 4.01071. So pretty darn close. My predictors are uncorrelated, so the Correlation of Fixed Effects is pretty much 0 among all of them. But what about the ICC? Well, let’s extract the relevant parts of the output and calculate it. Let’s use the all-too-common formula $\tau^{2}/(\tau^{2} + \sigma^{2})$:

Note that the `stddev` and `sc` attributes hold standard deviations, so they have to be squared to get the variances that go into the formula:

```
estimated.icc <- as.numeric(attr(mod1$varcor$group, "stddev"))^2 /
  (as.numeric(attr(mod1$varcor$group, "stddev"))^2 +
   as.numeric(attr(mod1$varcor, "sc"))^2)
```

The population ICC is supposed to be 0.6 and the (large sample) estimate we get is 0.5825 / (0.5825 + 0.3784) ≈ 0.606, so I would say that is reasonably close.
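As an aside, if you’d rather not dig through attributes, `lme4` can hand you the variance components as a data frame via `as.data.frame(VarCorr(...))`, and the same variance-based ICC calculation becomes a one-liner. A minimal sketch (shown on lme4’s bundled `sleepstudy` data so the snippet is self-contained):

```
library(lme4)

# Random-intercept model on the sleepstudy data that ships with lme4
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# One row per variance component: grp, var1, var2, vcov (variance), sdcor (sd)
vc <- as.data.frame(VarCorr(fit))
icc <- vc$vcov[vc$grp == "Subject"] / sum(vc$vcov)
icc
```

The same pattern works on our simulated fit by swapping in the model and grouping-factor name.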

But wait… maybe we would like all the predictors to have a random-effect component (so each one gets a random slope, not just the intercept). The only thing we need to do is alter the `sigmaXre` matrix to something like, for example:

```
sigmaXre <- diag(c(rep(1, length(beta) - 1)))
```

so that a random effect is assigned to each predictor… while keeping the *random effects uncorrelated* (notice that it is a diagonal matrix). If we run this new specification of `sigmaXre` with the previous arguments of the function, what we now get is:

```
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ 1 + X1 + X2 + X3 + (1 + X1 + X2 + X3 | group)
   Data: data1

REML criterion at convergence: 3883.5

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.9220 -0.6330  0.0120  0.6335  3.9198

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 group    (Intercept) 0.6793   0.8242
          X1          1.5114   1.2294   -0.14
          X2          1.9644   1.4016   -0.13  0.01
          X3          1.9936   1.4120   -0.10  0.00 -0.05
 Residual             0.4255   0.6523
Number of obs: 1500, groups:  group, 50

Fixed effects:
            Estimate Std. Error t value
(Intercept)   0.9758     0.1179   8.273
X1            1.9439     0.1749  11.116
X2            3.1149     0.1991  15.647
X3            4.2951     0.2005  21.425

Correlation of Fixed Effects:
   (Intr) X1     X2
X1 -0.142
X2 -0.125  0.009
X3 -0.096  0.002 -0.054
```

So… same fixed effects, but now a variance component for each slope and the intercept. In general, playing with `sigma` and `sigmaXre` helps generate various relationships among fixed and random effects.
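A couple of other `sigmaXre` specifications you might try (the values here are arbitrary; the variable names are mine):

```
# Random slopes for X1 and X2 only, no random slope for X3
sigmaXre.partial <- diag(c(1, 0.5, 0))

# All three slopes random, with the X1 and X2 random slopes correlated
sigmaXre.corr <- diag(3)
sigmaXre.corr[1, 2] <- sigmaXre.corr[2, 1] <- 0.5
```

Either matrix can be passed as the `sigmaXre` argument of `Simulate()` in place of the diagonal one used above; just keep it symmetric and positive semi-definite.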

I hope this function will be useful in your explorations of multilevel models!