Technically this is not a refereed publication (yet), but I thought it would be nice to have a little space for the function at the centre of my master’s thesis. It deals with the Bayesian estimation of the polychoric correlation coefficient both under bivariate normal and non-normal data. It is based on this article written by James H Albert. I was a little bit surprised that such a useful article hadn’t really got a lot of citations, so I wanted to highlight his work by implementing it into R. In my code, I figured out how to deal with both right-skewed and left-skewed latent data (Albert’s article only handles right-skewed data) and it has proven to be very useful for handling difficult datasets, such as when one is dealing with low-count or zero-count cells in contingency tables.

The actual code for the function is here. The main function is called `polycorGibbs()`

because it is based of a Gibbs sampler.

And the way it works is super easy:

library(MASS)

`Q <- mvrnorm(100, c(0,0), matrix(c(1,.5,.5,1),2,2))`

`x <- Q[,1]`

y <- Q[,2]

`##Make sure you declare these as "integer", not "numeric" or "float"`

xb <- as.integer(cut(y, breaks=c(-Inf, mean(y)-sd(y),mean(y)+sd(y), Inf), labels=c(1:3)))

yb <- as.integer(cut(y, breaks=c(-Inf, mean(y)-sd(y),mean(y)+sd(y), Inf), labels=c(1:3)))

plyk <- polycorGibbs(xb, yb, iter = 1e4, trace=FALSE, graph=FALSE)

Summary

mean median SD Numeric SD

rho 0.481 0.487 0.00109 0.0282

c1 -1.070 -1.070 0.00885 0.1530

c2 1.020 1.020 0.01190 0.1500

d1 -1.070 -1.060 0.01090 0.1550

d2 1.010 1.010 0.01090 0.1520

Correlation

rho c1 c2 d1 d2

rho NA 0.00163 -0.0384 -0.0153 -0.0323

c1 NA NA 0.2100 0.5080 0.2260

c2 NA NA NA 0.2330 0.5220

d1 NA NA NA NA 0.2220

d2 NA NA NA NA NA

I offer some extra functionalities like plotting the time series plot for the Markov Chain and other ways to check how well everything converged and whatnot.

Overall, the function is a tad bit finicky. But then again this was my first “major” project and I pretty much had to teach myself Bayesian statistics and MCMC from scratch to get it to work, which is why I am very proud of it. But it will get better in time.

It would be really cool if you gave me a shout-out before using it 🙂