A Bayesian Estimator of the Intracluster Correlation Coeﬃcient from Correlated Binary Responses

: Clustered binary samples arise often in biomedical investigations. An important feature of such samples is that the binary responses within clusters tend to be correlated. The Beta-Binomial model is commonly applied to account for the intra-cluster correlation – the correlation between responses within the clusters – among dichotomous outcomes in cluster sampling. The intracluster correlation coeﬃcient (ICC) quantiﬁes this correlation or level of similarity. In this paper, we propose Bayesian point and interval estimators for the ICC under the Beta-Binomial model. Using Laplace’s method, the asymptotic posterior distribution of the ICC is approximated by a normal distribution. The posterior mean of this normal density is used as a central point estimator for the ICC, and 95% credible sets are calculated. A Monte Carlo simulation is used to evaluate the coverage probability and average length of the credible set of the proposed interval estimator. The simulations indicate that for the situation when the number of clusters is above 40, the underlying mean response probability falls in the range of [0.3;0.7], and the underlying ICC values are • 0 . 4, the proposed interval estimator performs quite well and attains the correct coverage level. Even for number of clusters as small as 20, the proposed interval estimator may still be useful in the case of small ICC ( • 0 . 2).


Introduction
The intraclass correlation coefficient (ICC) ρ has wide applications in biology, epidemiology and medical research.In family studies, it is used to measure the degree of intra-family resemblance with respect to characteristics such as blood pressure, weight and height.Moreover, it is used as means of investigating the heritability of certain traits, whether such traits are continuous or dichotomous.An extensive literature, summarized in Shoukri and Ward (1985) and Donner (1986) exists for the statistical analysis of ICC for continuous response variables using the frequentist approach.Recently, Bayesian techniques have been developed; for example Spiegelhalter (2001).For binary traits statistical analysis of the ICC were extensively discussed by Mak (1988), Donner, Klar and Eliasziw (1995), and Gao, Klar and Donner (1997).Less developed is Bayesian statistical analysis of ICC for binary outcomes, which are of considerable practical importance in many toxicological and psychological studies.Within the Bayesian paradigm, the application of multivariate normal theory of continuous outcomes to binary variables is not strictly valid, and appropriate modeling strategy is needed.Turner, Omar and Thompson (2006) obtain the posterior distribution of ρ under the hierarchical logistic model and the beta binomial model using MCMC (Markov Chain Monte Carlo) methods.The posterior median is used as a point estimator and 95% interval estimates are calculated.
In this paper we develop a Bayesian estimator of the ICC under the Beta-Binomial model of correlated binary responses.In particular, we derive the approximate/asymptotic posterior distribution of the ICC, and use it to obtain the point and interval estimators.The remainder of the paper is organized as follows: Section 2 describes the underlying Beta-Binomial model.The proposed Bayesian estimator and its approximate posterior distribution are presented in Section 3. To study the coverage probability and average length of the credible set of the proposed interval estimator, a simulation study is given in Section 4. This is followed by an example in Section 5, and discussion in Section 6.

The Beta-Binomial Model
Consider a random sample of k clusters each of size n i (i = 1, 2, . . ., k).We assume further that all subunits within the ith cluster , Y ij (j = 1, 2, . . ., n i ), are binary taking one of two possible values of Y ij , success or failure (coded as one and zero, respectively).The Y ij 's are conditionally independent with probability ∑ n i j=1 Y ij denote the total number of successes in the ith cluster.Thus, the conditional distribution of Y i• |p i is Binomial(n i , p i ).In addition, it is assumed that the p i 's are identically independently distributed as Beta(a, b) The correlation between any pair of responses within the same cluster corr(Y ij , Y il ), j = l, is given by the intracluster correlation coefficient ρ = 1/(1 + a + b) (Moore, 1987).Prentice (1986) showed that the Beta-Binomial distribution is valid for ρ values falling in the range min where n max is the size of the largest cluster.
A positive correlation means that responses within a cluster are more alike.An example for positive correlation is in toxicological experiments designed to study teratogenic effects of chemical compounds on animals.Fetuses within a litter tend to respond more similarly than fetuses from different litters, a phenomenon known as litter effect.Less commonly, the observations within a cluster are negatively correlated.This occurs for example in family studies where children may be competing for maternal care.The focus of this paper is on applications where the ICC is positive.
Following the logic of Moore (1987), the distribution of the p i 's may be reparameterized in terms of π and ρ, i.e. p i ∼ Beta(π, ρ).

A Bayesian Estimator of the ICC
The Beta-Binomial model in (2.1) is a popular example for hierarchical models.In the first stage, Y i• |p i are independent Binomial(n i , p i ).In the second stage, p i |π, ρ are i.i.d.Beta(π, ρ) .For a full Bayesian analysis, the hyperparameters, π and ρ , are in turn drawn from a hyperprior distribution.In this paper, it is assumed that they are distributed as Uniform(0, 1), and that they are -a prioriindependent of each other.The joint posterior distribution is obtained via Bayes theorem (Carlin and Louis, 2000, p. 19): The marginal posterior distribution of ρ is then derived via integrating p out, that is The evaluation of the previous integrals is intractable and computationally difficult.An alternative is to use asymptotic techniques to obtain approximations of the posterior density.In this paper, when the number of clusters k gets large and the cluster size n remains small, g(π, ρ|y) is approximated using Laplace's method.Following the logic of Kass and Steffey (1989), under mild regularity conditions, g(π, ρ|y) is asymptotically Bivariate Normal with mean and variance given by the posterior mode λ and the inverse of the negative Hessian of the log posterior evaluated at the mode is the prior introduced on λ = (π, ρ).When the Uniform prior on λ is adopted, which is the case here, λ and Σ can be replaced by the MLE's, π and ρ, and the inverse of the observed information matrix.That is asymptotically, g(π, ρ|y) ∼ Bivariate Normal( λ, Σ), where, λ = (π, ρ) and Σ = ( Îobs ) −1 , the inverse of the observed information matrix evaluated at π and ρ.The elements of the observed information matrix are given by Îij = , By properties of the Bivariate Normal distribution, the marginal posterior distribution g(ρ|y) is asymptotically Normal (ρ, σ 2 * ), where σ 2 * is the corresponding diagonal element of Σ. Griffiths (1973) describes an approach for ML estimation of the Beta-Binomial distribution with n = n i (i = 1, 2, . . ., k).The ML estimators of π and θ, where θ = ρ/(1 − ρ), are the solution of the following two equations (Griffiths, 1973): where, S i = ∑ I y=0 f y , (i = 0, 1, 2, . . ., n) and f y is the observed frequency (y = 0, 1, 2, . . ., n) .Equations (3.1) may be solved iteratively using numerical algorithms such as Newton-Raphson method (Press et al., 2007, chap. 9) or the Jenkins-Traub algorithm (Jenkins and Traub, 1970).The second derivatives can be easily obtained to find the information matrix of π and θ, Î(π, θ), and are given by: . By the Delta method (Kendall and Stuart, 1986, p. 324), Thus, the asymptotic posterior distribution of ρ is ) .
The posterior mean, mode or median can be used as a point estimator of ρ.In this paper, a squared error loss function is adopted.Therefore, the Bayes rule (the point estimator that minimizes the posterior risk) is the posterior mean.A 100(1 − α)% credible set (Bayesian confidence interval) for ρ is given by the 2.5 and 97.5 per cent quantiles of this posterior distribution.

Simulation Study
In order to evaluate the accuracy of the Bayesian credible set, we need to conduct a large scale Monte Carlo simulation.Since there are many parameters involved (n, k, π, ρ), a theoretical evaluation is difficult to conduct.Therefore, the simulation approach is adopted, to study the coverage probability and average length of the credible set of the proposed estimator.For the case of cluster size n = 2, the situations were considered where the number of clusters k equals 20, 40, 100 and 200, the underlying mean response probability p equals 0.1, 0.2, 0.3, 0.4, 0.6, and 0.7 and the ICC, ρ, takes one of four values 0.1, 0.2, 0.4, 0.6.A fully factorial combination of these three factors was used, giving a total of 96 combinations.For each combination, 2000 valid samples were generated.In situations where the parameter π was near the edge, i.e., outside the range [0.3; 0.7], the percentage of samples that gave invalid solutions was quite high, sometimes reaching 70%.Whenever a generated sample led to an invalid point estimate of π or ρ (i.e., when either parameter falls out of range), the sample was discarded and replaced by a new one until a total of 2000 valid samples were obtained.As for the limits of the interval estimates, if they exceeded one or fell below the minimum possible value given in equation (2.2), then they were replaced by the appropriate extreme value.For each combination of the simulation factors, the coverage probability and the average length of the credible set were calculated.A SAS program was used to generate the data from the Beta-Binomial distribution 1 .The random numbers were generated in two steps.First, the probability 1 SAS Institute Inc., (2004).Version 8.2.Cary, NC, USA.
of success, p i , were generated from the Beta distribution.In the second step, the y i 's were generated from a Binomial(2, p i ).The NSolve function for numerically solving sets of simultaneous equations in Mathematica software was used to find the roots of the ML equations 2 .
Table 1 shows for 95% nominal credible sets, the estimated coverage probability, the average length of the resulting credible sets, and the percentage of invalid samples.First, note that the estimated coverage probability is above or less than the nominal level by no more than 3% in 67 out of the 96 cases, i.e., 69.79% of the time.This ratio reaches 97.22% for the case where, simultaneously, the number of clusters increases (m ≥ 40) , ICC values are ≤ 0.4 , and π is in the interval [0.3; 0.7].Even for number of clusters as small as 20, the proposed interval estimator may still be useful in the case of small ICC (≤ 0.2).On the other side, the worst situation occurs when ρ = 0.6.Here, the coverage probability does not attain its nominal level.
Examining the average length of the confidence interval, it can be seen that , for all other conditions fixed, the average confidence interval length tends to decrease as either the number of clusters increases, or when π increases.
For π ≤ 0.4, all invalid samples in this simulation study are the result of the case when f 2 = 0.The number of invalid samples increases as π decreases.Recall the following relationship between P rb[y i1 = 1, y i2 = 1] and p i , and consequently π, (Shoukri and Pause,1998, p. 66): A small π/p i will result in a small P rb[y i1 = 1, y i2 = 1], and thus the chances of obtaining f 2 = 0 will increase.For any fixed π, the percentage of invalid samples decreases as either the value of ρ increases or the sampled number of clusters,m, increases.This can be explained also by equation (4.1),where it is clear that there is a proportional relationship between ρ and P rb[y i1 = 1, y i2 = 1].It also makes sense to say that as more clusters are sampled, the chances of obtaining two positive responses within any cluster will increase.Therefore, as m increases, the probability of obtaining f 2 = 0 will decrease.By a similar logic, for π ≥ 0.6, the invalid samples are the result of the case when f 1 = 0.
Based on this simulation study, it is recommended to use the proposed estimator when the sampled number of clusters is at least 40, when it is believed that the ICC values are ≤ 0.4, and the probability of positive response π is in the range [0.3;0.7].A limitation of this simulation study is that it is restricted to the case of cluster size 2.

Example
To illustrate the application of the proposed estimator, an example is considered.In ophthalmologic studies, the eye is the unit for statistical analysis rather than the individual.Typically, an individual contributes two eyes worth of information whose values might be correlated.Each person is a cluster of size two.To obtain valid inference, the statistical analysis must account for the effect of the intracluster correlation.Berson, Rosner and Simonoff (1980) and Rosner (1982) describe an ophthalmologic study conducted at the Massachusetts Eye and Ear Infirmary from 1970 to 1979 of an outpatient population of 216 persons aged 20-39 with retinitis pigmentosa (RP).This population was classified into the genetic types of autosomal recessive RP (AR), autosomal dominant RP (DOM), sex-linked RP (SL) , and isolate RP (ISO) for a study of differences among these four groups on certain measurements.The details of the study design and the procedures for genetic classification are given by Berson, Rosner and Simonoff (1980).In the current paper, the binary outcome of interest is the best corrected Snellen visual acuity (VA).Any eye is considered affected if VA is 20/50 or worse, and normal if VA is 20/40 or better.The number of affected eyes for the given sample is presented in Table 2. Hence, S 0 = f 0 = 92, S 1 = 129, and S 2 = 216.The goal is to estimate the ICC for this outpatient population using the Bayesian estimator described in section 3. The MLE estimators of π and θ are the solution of the following two equations: Equations (5.1) can be easily solved using a rootfinding routine in mathematical or statistical computer software such as Mathematica.Using the Nsolve function of Mathematica, the solution of equations (5.1) is found to be π = 0.488426 and θ = 1.917357 .The covariance matrix is given by: Therefore, var( θ) = 0.190609 , and the MLE of ρ is ρ = θ/(1 + θ) = 0.657224.By the Delta method, var(ρ) = 0.00263139.The asymptotic posterior distribution of ρ/y is N (0.657224, 0.00263139).The proposed point estimate of the ICC is given by the posterior mean, which is equal 0.657224.A 95% credible set of the ICC is [0.55668148, 0.7577665] .Rosner (1982) analyzed the same data set using the classical approach, and similar results were obtained.The effective number of units within a cluster, denoted by e, is used in his analysis.Easy manipulation shows that the ICC is related to e through the the following function: ρ = 2/e−1.The MLE of e obtained by Rosner is ê = 1.207.Therefore, the MLE of the ICC is ρ Rosner = 0.657001.

Discussion
We have presented a Bayesian estimator for the ICC under the Beta-Binomial model.The approach is based on Laplace's method to approximate the posterior distribution and moments of the ICC given the data.The technique is asymptotic, that is for sufficiently large number of clusters.An advantage of adopting a Bayesian perspective is the possibility of incorporating prior beliefs on likely values of the ICC.The approach is flexible to accommodate informative and noninformative priors for π and ρ.In the current study a Uniform prior has been adopted.When a nonuniform prior of π and ρ is used, then the mean and variance of the Normal approximation are given by the posterior mode and the inverse of the negative Hessian of the log posterior evaluated at the mode (Kass and Steffey, 1989).As noted by Turner, Omar and Thompson (2006), ICC values are generally reported unaccompanied by confidence intervals, which makes them of limited value as in estimation of any other parameter.The current paper provides point and interval estimators for a single ICC.The proposed approach is simple, relatively easy to implement, and is not computer intensive, which makes it useful for practitioners from the biomedical fields.This is in contrast to already existing Bayesian approaches depending on simulations and MCMC methods.It is clear that the technique used here to find the MLE's of the Beta-Binomial distribution (Griffiths, 1973) assumes equal cluster sizes.When the clusters are not of equal sizes, two nonlinear equations need to be solved numerically to obtain the posterior estimators of ρ and π.

Table 1 :
Performance of the proposed credible set based on Monte Carlo simulation: the estimated coverage probability for the ICC at nominal level 0.95, average length of the credible set (in parentheses), and the proportion of invalid samples[in brackets]

Table 2 :
Distribution of the number of affected eyes