Lawley’s test for equality of correlations

So… while I kept on cleaning computer files I found something pretty interesting. A “secret” (as in “I-didn’t-remember-where-it-was” kind of secret) folder where I kept all my attempts at a Master’s thesis. Apparently, I went through 11 different projects, all of which I left at various unfinished stages. But there was one that just drew me back in because it was… well… kind of pointless?

For some reason that I can’t quite remember, I thought it would be a good idea to investigate the properties of Lawley’s test for the equality of correlations in a correlation matrix.  So the null hypothesis of this test kind of looks like this:

$$
H_0:\; \mathbf{P} \;=\;
\begin{pmatrix}
1 & \rho & \rho & \cdots & \rho \\
\rho & 1 & \rho & \cdots & \rho \\
\rho & \rho & 1 & \cdots & \rho \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho & \rho & \rho & \cdots & 1
\end{pmatrix}
$$

That is, all the off-diagonal elements of the population correlation matrix are equal to some common value ρ.

I have absolutely no idea why I thought this was a good idea. I’m thinking it may have been back in the days when I was not as proficient in Structural Equation Modelling (SEM), so I’d fall back on my mathematical training to accomplish things that can be trivially set up as SEMs. Going by my poorly-documented code and meager notes, it seems I thought I could tweak the test to evaluate the assumption of Parallel Tests or Tau-Equivalent Tests (the SEM way of doing this is sketched below). You know, to make sure people would know whether their Cronbach’s alpha was a true estimate of reliability or just a lower bound. Little did I know at that point that people just use alpha whether the assumptions behind it hold or not. The truth is nobody really cares.
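
For what it’s worth, the SEM route really is trivial. Here’s a minimal sketch of how you’d check tau-equivalence in lavaan (the model names and the use of lavaan’s built-in HolzingerSwineford1939 data are mine, just for illustration):


library(lavaan)

# congeneric model: loadings free to vary
congeneric <- 'f1 =~ x1 + x2 + x3 + x4 + x5'

# tau-equivalent model: the shared label "l" constrains all loadings to be equal
tau.equiv <- 'f1 =~ l*x1 + l*x2 + l*x3 + l*x4 + l*x5'

fit.c <- cfa(congeneric, data = HolzingerSwineford1939)
fit.t <- cfa(tau.equiv, data = HolzingerSwineford1939)

# likelihood-ratio test: if the constrained model doesn't fit significantly
# worse, tau-equivalence (and alpha as a point estimate of reliability) is tenable
anova(fit.c, fit.t)
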

The one thing I remember is that I couldn’t find any R package that would run the test, so I ended up coding it myself. I’m not sure if this will be of any use to anyone, but I’m hoping it may save some time for someone out there who needs it for whatever reason.

So the R function you’d have to declare is:


lawley <- function(data) {

R <- cor(data)   # sample correlation matrix
p <- ncol(R)     # number of variables
n <- nrow(data)  # sample size

# average of the off-diagonal elements of each column
mcp <- (colSums(R) - 1) / (p - 1)

# overall average of the off-diagonal elements
mc <- (2 / (p * (p - 1))) * sum(R[upper.tri(R)])

# sum of squared deviations of the off-diagonal elements from the overall average
A <- sum((R[lower.tri(R)] - mc)^2)

# sum of squared deviations of the column averages from the overall average
B <- sum((mcp - mc)^2)

C <- ((p - 1)^2 * (1 - (1 - mc)^2)) / (p - (p - 2) * (1 - mc)^2)
X2 <- ((n - 1) / (1 - mc)^2) * (A - C * B)
v <- (p + 1) * (p - 2) / 2  # degrees of freedom
P <- pchisq(X2, v, lower.tail = FALSE)

result <- rbind(c(X2, v, P))
dimnames(result) <- list("X2", c("statistic", "df", "p.value"))
result
}
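
For the curious, the statistic the function computes (this is just my transcription of the code above into math) is:

$$
X^2 \;=\; \frac{n-1}{(1-\bar{r})^2}\left[\,\sum_{i<j}\left(r_{ij}-\bar{r}\right)^2 \;-\; \hat{c}\sum_{k=1}^{p}\left(\bar{r}_k-\bar{r}\right)^2\right],
\qquad
\hat{c} \;=\; \frac{(p-1)^2\left[1-(1-\bar{r})^2\right]}{p-(p-2)(1-\bar{r})^2}
$$

where $r_{ij}$ are the sample correlations, $\bar{r}_k$ is the average off-diagonal correlation involving variable $k$, $\bar{r}$ is the overall average off-diagonal correlation, and $X^2$ is referred to a chi-square distribution with $(p+1)(p-2)/2$ degrees of freedom.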

And the way you use it is very simple. Let’s try it with a mock dataset we’ll generate with the R package lavaan. Here we’re generating data from a one-factor model with equal loadings of 0.3 and equal error variances:


library(lavaan)
set.seed(123)

pop = 'f1 =~ 0.3*x1 + 0.3*x2 + 0.3*x3 + 0.3*x4 + 0.3*x5'

dat = simulateData(model=pop, sample.nobs=500)

lawley(dat)
statistic df p.value
X2 7.840194 9 0.5503277

So… yeah. The non-significant p-value (as per our glorious alpha of .05) means we fail to reject the null hypothesis, which is consistent with a population correlation matrix that indeed has equal off-diagonal elements all around (failing to reject doesn’t *prove* the null is true, of course, but you know how this goes).
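
If you want to see why the null holds here, you can work out the population correlation matrix by hand. A quick sketch (assuming unit residual variances, which I believe is what simulateData gives you by default and matches the “equal error variances” above):


# covariances implied by a one-factor model: cov(xi, xj) = li * lj,
# var(xi) = li^2 + residual variance (assumed to be 1 here)
l <- rep(0.3, 5)
Sigma <- outer(l, l) + diag(1, 5)
round(cov2cor(Sigma), 3)  # every off-diagonal element is ~0.083
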

Let’s change one loading and see what happens:


library(lavaan)
set.seed(123)

pop = 'f1 =~ 0.4*x1 + 0.3*x2 + 0.3*x3 + 0.3*x4 + 0.3*x5'

dat = simulateData(model=pop, sample.nobs=500)

lawley(dat)
statistic df p.value
X2 18.78748 9 0.02706175

So the null hypothesis is now rejected: because one of the loadings is different, the correlations involving x1 are no longer equal to the other elements of the correlation matrix.
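
The same back-of-the-envelope computation as before makes this obvious (same assumption of unit residual variances):


# correlations implied by the second model: the ones involving x1 (~0.107)
# are larger than the rest (~0.083), so the null is false in the population
l <- c(0.4, 0.3, 0.3, 0.3, 0.3)
Sigma <- outer(l, l) + diag(1, 5)
round(cov2cor(Sigma), 3)
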

I literally have no clue why Lawley thought such a test was a good idea. But then again, I was the one investigating it a few years ago, so maybe *I* thought the test was a good idea in the first place.

Anyhoo, I hope this helps someone if they need it.