One question that curious students ask me every now and then is why the assumption of parallel or tau-equivalent tests (with uncorrelated residuals) is needed for Cronbach’s alpha to be an accurate estimate of reliability. We learn by repeating over and over again that alpha is a lower bound to reliability (assuming unidimensionality, uncorrelated residuals and a whole bunch of other stuff) but we never really learn *why* alpha is a lower bound to the true reliability. Just that it is.

In any case, just as a quick reference, I thought it would be nice to start the ‘proof’ section of the page by showing that alpha underestimates reliability under the congeneric test model. The proof I’m showing appears in Lord & Novick (1968), but the book is hard to find nowadays (and expanding a few steps would make things a lot clearer, to be honest), so here is where we start.

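Assume $X_1, X_2, \dots, X_n$ are measurements, each with distinct true scores $T_1, T_2, \dots, T_n$. Define the composite score $X = X_1 + X_2 + \dots + X_n$, with composite true score $T = T_1 + T_2 + \dots + T_n$. Then it follows that:

$$\rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X} \geq \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{n}\sigma^2_{X_i}}{\sigma^2_X}\right) = \alpha$$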

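Let us start by taking two arbitrary true scores, $T_i$ and $T_j$, with variances $\sigma^2_{T_i}$ and $\sigma^2_{T_j}$. The variance of their difference is still a variance (so it can never be negative), which means the following inequality must hold. I’m just gonna expand it using basic rules of covariance algebra:

$$\sigma^2_{T_i - T_j} \geq 0 \;\Longrightarrow\; \sigma^2_{T_i} + \sigma^2_{T_j} - 2\sigma_{T_iT_j} \geq 0 \;\Longrightarrow\; \sigma^2_{T_i} + \sigma^2_{T_j} \geq 2\sigma_{T_iT_j}$$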

Now, because we’re assuming all the usual assumptions of Classical Test Theory (uncorrelated errors, errors uncorrelated with true scores, … basically, nothing that we dislike correlates with anything. That’s why assumptions are so awesome. You get to go crazy with them!) we can just sum across all the different pairs of true scores. For $i \neq j$ (summing over every ordered pair of distinct indices):

$$\sum_{i \neq j}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right) \geq 2\sum_{i \neq j}\sigma_{T_iT_j}$$

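Here’s where things start getting a tad bit tricky because we’re gonna do some summation magic to cough up the terms that we want where we want them. We start our little sorcery by looking at the following relationships:

$$\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma^2_{T_i} = n\sum_{i=1}^{n}\sigma^2_{T_i}, \qquad \sum_{i=1}^{n}\sum_{j=1}^{n}\sigma^2_{T_j} = n\sum_{i=1}^{n}\sigma^2_{T_i}$$

$$\sum_{i=1}^{n}\sum_{j=1}^{n}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right) = 2n\sum_{i=1}^{n}\sigma^2_{T_i}$$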

Ok, I know that this looks like an incredibly convoluted way to show that summing the same thing $2n$ times is the same as multiplying it by $2n$. But these will become incredibly useful when we start putting everything together.

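Now, the next one is a tad bit trickier to follow:

$$\sum_{i=1}^{n}\sum_{j=1}^{n}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right) = \sum_{i = j}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right) + \sum_{i \neq j}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right) = 2\sum_{i=1}^{n}\sigma^2_{T_i} + \sum_{i \neq j}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right)$$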

In general, all we are doing here is splitting the first term into the sum of everything where the indices are the same plus the sum of everything where the indices are different. If you first sum over all the instances where $i = j$ and then over all the instances where $i \neq j$, you end up summing over ALL instances. And that is what the first term of the expression says (i.e. sum over everything).

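Now we have the final ingredients to put everything together, which is basically just equating the two results we derived before:

$$2n\sum_{i=1}^{n}\sigma^2_{T_i} = 2\sum_{i=1}^{n}\sigma^2_{T_i} + \sum_{i \neq j}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right) \;\Longrightarrow\; \sum_{i \neq j}\left(\sigma^2_{T_i} + \sigma^2_{T_j}\right) = 2(n-1)\sum_{i=1}^{n}\sigma^2_{T_i}$$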
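
If you’d rather not take the summation sorcery on faith, here is a minimal sketch that brute-forces it. It assumes the identity being checked is that the sum of $\sigma^2_{T_i} + \sigma^2_{T_j}$ over all ordered pairs with $i \neq j$ equals $2(n-1)\sum_i \sigma^2_{T_i}$. The function names and the random "variances" are made up purely for illustration.

```python
import random


def off_diagonal_pair_sum(variances):
    """Brute-force sum of (v_i + v_j) over all ordered pairs with i != j."""
    n = len(variances)
    return sum(variances[i] + variances[j]
               for i in range(n) for j in range(n) if i != j)


def closed_form(variances):
    """Closed form 2(n - 1) * sum(v_i) that the double sum should collapse to."""
    n = len(variances)
    return 2 * (n - 1) * sum(variances)


random.seed(1)
# Hypothetical true-score variances, invented for this check only.
variances = [random.uniform(0.5, 3.0) for _ in range(6)]

brute = off_diagonal_pair_sum(variances)
closed = closed_form(variances)
identity_holds = abs(brute - closed) < 1e-9
print(brute, closed, identity_holds)
```

Nothing here depends on the numbers being variances; the identity is pure index bookkeeping, which is why it works for any list you feed it.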

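And we can back-substitute this into the original inequality we were working on!

$$2(n-1)\sum_{i=1}^{n}\sigma^2_{T_i} \geq 2\sum_{i \neq j}\sigma_{T_iT_j} \;\Longrightarrow\; (n-1)\sum_{i=1}^{n}\sigma^2_{T_i} \geq \sum_{i \neq j}\sigma_{T_iT_j}$$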
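
As a sanity check, the sketch below tests the inequality $(n-1)\sum_i \sigma^2_{T_i} \geq \sum_{i \neq j}\sigma_{T_iT_j}$ on randomly generated covariance matrices. Building each matrix as $AA^\top$ is just my assumption for getting a valid (positive semi-definite) covariance matrix; it is not part of the proof, and the helper names are hypothetical.

```python
import random


def random_covariance(n, rng):
    """Build a valid (positive semi-definite) covariance matrix as A A^T."""
    a = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n)]
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inequality_gap(cov):
    """(n - 1) * (sum of variances) minus the sum of off-diagonal covariances."""
    n = len(cov)
    variances = sum(cov[i][i] for i in range(n))
    covariances = sum(cov[i][j]
                      for i in range(n) for j in range(n) if i != j)
    return (n - 1) * variances - covariances


rng = random.Random(42)
gaps = [inequality_gap(random_covariance(5, rng)) for _ in range(1000)]

# Small negative tolerance only to absorb floating-point rounding.
all_non_negative = min(gaps) >= -1e-9
print(all_non_negative)
```

The gap is exactly the sum of the variances of all pairwise differences $T_i - T_j$ (each unordered pair counted once), so it can only hit zero when every pair of true scores differs by a constant.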

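Things are starting to take some shape now, aren’t they? I think it is possible to see now where the ubiquitous Cronbach’s alpha is gonna show up very soon. Because we’re dealing with an inequality, let’s divide through by $n-1$ and then add the same thing (the sum of all the true score covariances) to both sides:

$$\sum_{i=1}^{n}\sigma^2_{T_i} + \sum_{i \neq j}\sigma_{T_iT_j} \geq \frac{1}{n-1}\sum_{i \neq j}\sigma_{T_iT_j} + \sum_{i \neq j}\sigma_{T_iT_j}$$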

By adding all of the variances and all of the covariances among the true scores on the left-hand side we obtain $\sigma^2_T$, the total true score variance of the composite score. The other side just needs to be simplified a bit further (using $\frac{1}{n-1} + 1 = \frac{n}{n-1}$) so things end up looking like:

$$\sigma^2_T \geq \frac{n}{n-1}\sum_{i \neq j}\sigma_{T_iT_j}$$

The next step requires one to remember that, from the assumptions of Classical Test Theory, it follows that the covariances among observed measurements equal the covariances of their true scores (remember that the errors correlate with nothing), so the following holds true:

$$\sigma_{X_iX_j} = \sigma_{T_iT_j} \quad \text{for } i \neq j$$

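And, moreover, if we subtract the sum of the variances of its constituent measurements from the variance of the initially defined composite, we are left with:

$$\sigma^2_X - \sum_{i=1}^{n}\sigma^2_{X_i} = \sum_{i \neq j}\sigma_{X_iX_j} = \sum_{i \neq j}\sigma_{T_iT_j}$$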
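
Here is a small numerical illustration of that last step. It assumes the usual Classical Test Theory setup, so the observed covariance matrix is the true score covariance matrix with the error variances added to the diagonal only. The three-item matrix and the error variances are invented for the example.

```python
def observed_covariance(true_cov, error_variances):
    """Observed-score covariance under CTT: errors only add to the diagonal."""
    n = len(true_cov)
    return [[true_cov[i][j] + (error_variances[i] if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def composite_variance(cov):
    """Variance of the unweighted sum: add every entry of the covariance matrix."""
    return sum(sum(row) for row in cov)


def off_diagonal_sum(cov):
    """Sum of all covariances, i.e. every entry with i != j."""
    n = len(cov)
    return sum(cov[i][j] for i in range(n) for j in range(n) if i != j)


# Hypothetical three-item example (values made up for illustration).
true_cov = [[1.0, 0.6, 0.4],
            [0.6, 2.0, 0.8],
            [0.4, 0.8, 1.5]]
error_variances = [0.5, 0.7, 0.3]

obs = observed_covariance(true_cov, error_variances)
lhs = composite_variance(obs) - sum(obs[i][i] for i in range(len(obs)))
rhs = off_diagonal_sum(true_cov)
decomposition_holds = abs(lhs - rhs) < 1e-9
print(lhs, rhs, decomposition_holds)
```

The error variances drop out entirely: they inflate the composite variance and the sum of the item variances by the same amount, so the difference only ever sees the true score covariances.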

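We are now ready to wrap everything up by substituting this into the inequality we were working on previously and dividing both sides by $\sigma^2_X$:

$$\frac{\sigma^2_T}{\sigma^2_X} \geq \frac{n}{n-1}\left(\frac{\sigma^2_X - \sum_{i=1}^{n}\sigma^2_{X_i}}{\sigma^2_X}\right) = \frac{n}{n-1}\left(1 - \frac{\sum_{i=1}^{n}\sigma^2_{X_i}}{\sigma^2_X}\right)$$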

Now it is very easy to recognize the left-hand side as the definition of reliability and the right-hand side as the definition of Cronbach’s alpha, so that we can finally claim that:

$$\rho_{XX'} \geq \alpha$$

And that completes the proof.
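
To see the result in action, here is a minimal sketch that computes population reliability and alpha side by side. It assumes a single-factor congeneric model with the factor variance fixed at 1, so each true score is a loading times the common factor; that parametrization, the function name and the example loadings are my own choices for illustration, not something taken from Lord & Novick.

```python
def reliability_and_alpha(loadings, error_variances):
    """Population reliability and Cronbach's alpha for a unit-weighted composite.

    Assumes T_i = loading_i * factor, with factor variance 1 and
    uncorrelated errors (a one-factor congeneric model).
    """
    n = len(loadings)
    true_var = sum(loadings) ** 2                 # variance of the composite true score
    total_var = true_var + sum(error_variances)   # variance of the observed composite
    item_vars = sum(l * l + e for l, e in zip(loadings, error_variances))
    reliability = true_var / total_var
    alpha = n / (n - 1) * (1 - item_vars / total_var)
    return reliability, alpha


# Congeneric items: unequal loadings (hypothetical values).
rel_cong, alpha_cong = reliability_and_alpha([0.4, 0.7, 0.9], [0.5, 0.5, 0.5])

# Tau-equivalent items: equal loadings, error variances free to differ.
rel_tau, alpha_tau = reliability_and_alpha([0.7, 0.7, 0.7], [0.3, 0.5, 0.7])

print(rel_cong, alpha_cong)
print(rel_tau, alpha_tau)
```

With unequal loadings alpha falls below the reliability, and with equal loadings the two coincide, which is the tau-equivalence condition the post opened with.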