Alternative Tests of Independence in Two-Way Categorical Tables

The chi-squared test for independence in two-way categorical tables depends on the assumption that the data follow the multinomial distribution. Thus, we suggest alternatives for when the multinomial assumption does not hold. First, we consider the Bayes factor, which is used for hypothesis testing in Bayesian statistics. Unfortunately, the Bayes factor is sensitive to the choice of prior distributions. We note here that the intrinsic Bayes factor is not appropriate because the prior distributions under consideration are all proper. Thus, we propose using Bayesian estimation, which is generally not as sensitive to prior specifications as the Bayes factor. Our approach is to construct a 95% simultaneous credible region (i.e., a hyper-rectangle) for the interactions. A test that all interactions are zero is equivalent to a test of independence in two-way categorical tables. Thus, a 95% simultaneous credible region for the interactions provides a test of independence by inversion.


Introduction
There are many occasions when we need to understand the extent of the association of two attributes. For example, the National Center for Health Statistics has been collecting data on obesity in the U.S. for many years. The data have been used to establish the relation between obesity and socio-demographic variables. These data are typically presented in two-way categorical tables. Scientists routinely use the chi-squared test to analyze such tables. However, in many applications the chi-squared test can be defective; one example is when there is an intra-class correlation violating the assumptions of the multinomial distribution. This paper reviews shortcomings of methods based on the adjusted chi-squared statistic and the Bayes factor, an alternative in Bayesian hypothesis testing to the chi-squared statistic, and we propose a simple method based on estimation rather than hypothesis testing to "test" for independence in two-way categorical tables.
We consider the analysis of data summarized in an r × c categorical table (i.e., there are two attributes, the first with r levels and the second with c levels). The dataset is a sample from a population, and the individuals in the sample are categorized according to the two attributes. Let π_jk denote the probability that an individual falls in the j-th level of the first attribute and the k-th level of the second attribute. Here Σ_{j=1}^r Σ_{k=1}^c π_jk = 1. The two attributes are not associated if π_jk = π_j^(1) π_k^(2), j = 1, ..., r; k = 1, ..., c; otherwise they are associated. When a simple random sample of individuals is taken from the population, the sampled individuals can be allocated to the cells of the r × c categorical table (multinomial sampling) to obtain a chi-squared test of association between the two attributes based on the data collected.
Traditionally, there are two ways to test for association (or no association) in an r × c categorical table. First, we can use the well-known Pearson chi-squared statistic, which works for simple random sampling (i.e., multinomial sampling). The second approach is to use the Bayes factor (Kass and Raftery 1995), an alternative to the chi-squared test. In multinomial sampling the individuals fall in the cells independently, and in this case the standard chi-squared test for association between the two categories of the r × c table is correct asymptotically. (There is a correlation among the cell counts because the sample size is fixed.) However, if the counts in the cells are formed from members in a cluster, the assumptions of multinomial sampling (i.e., independence) no longer hold because there is an additional correlation among the members in the cluster (i.e., the intra-class correlation).
Thus, the standard chi-squared test is inappropriate, especially for borderline cases of significance, when there is an intra-class correlation. For members within the same cluster (e.g., children in the same family, rats in the same litter, students in the same class), one can expect a positive correlation among the members of the cluster because the members will tend to have similar traits or share the same effect. This is the intra-class correlation (i.e., correlation among the members inside the cluster). For data with an intra-class correlation, the information available from all the members within the cluster is effectively less than the information when there is independence among the members within the cluster. Thus, with an intra-class correlation the effective sample size is smaller than the cluster size, so that there is an increase in variability. For a simple nonparametric method to calculate the intra-class correlation see Rao (1965, p. 159).
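To see the effect numerically, the following small simulation (ours, not from the paper; the cluster mechanism, sample sizes and seed are illustrative assumptions) generates 2 × 2 tables from clusters whose members all fall in one cell with probability theta, so the two attributes are independent yet the intra-class correlation is theta, and reports how often the unadjusted chi-squared test falsely rejects independence:

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(1)

def sample_table(n_clusters, t, theta, probs):
    """2 x 2 table from clusters of size t; the two attributes are
    independent, but with probability theta all t members of a cluster
    fall in one cell (intra-class correlation theta)."""
    table = np.zeros(4)
    for _ in range(n_clusters):
        if rng.random() < theta:
            counts = np.zeros(4)
            counts[rng.choice(4, p=probs)] = t   # whole cluster in one cell
        else:
            counts = rng.multinomial(t, probs)   # members independent
        table += counts
    return table.reshape(2, 2)

def rejection_rate(theta, reps=300, alpha=0.05):
    """Fraction of simulated tables for which the unadjusted chi-squared
    test rejects independence (the true hypothesis)."""
    probs = np.outer([0.5, 0.5], [0.5, 0.5]).ravel()
    p_values = [chi2_contingency(sample_table(100, 4, theta, probs),
                                 correction=False)[1] for _ in range(reps)]
    return float(np.mean(np.array(p_values) < alpha))

size_indep = rejection_rate(theta=0.0)   # close to the nominal level
size_intra = rejection_rate(theta=0.5)   # inflated by the intra-class correlation
print(f"type I error, theta = 0.0: {size_indep:.3f}")
print(f"type I error, theta = 0.5: {size_intra:.3f}")
```

With theta = 0.5 and clusters of size four, the nominal 5% test rejects far too often, which is precisely the inflation the adjustments discussed next are meant to repair.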
Several authors have recognized the inaccuracy of the analysis when the usual chi-squared test is applied to correlated data; efforts to correct for spurious inflation in such test statistics have been based mainly on two approaches. The design-based approach provides inference with respect to the asymptotic sampling distribution of estimates over repetitions of the sample design (Fellegi 1980; Holt, Scott and Ewings 1980; Rao and Scott 1981, 1984; Bedrick 1983; Fay 1985). These authors use the design effect, which is the ratio of two variances of an appropriate estimator, one from a design more complex than simple random sampling and the other from simple random sampling. For example, Rao and Scott (1981, 1984) investigate the effects of stratification and clustering on the asymptotic distribution of Pearson's chi-squared statistic for goodness of fit and independence in multiway categorical tables. They propose generalized design effects which are used to adjust the standard chi-squared statistic. The model-based approach postulates a probability distribution to model the sample data (Altham 1976, Cohen 1976, Fienberg 1979, Brier 1980, and Choi and McHugh 1989). For example, Choi and McHugh (1989), applying the probabilistic development in Altham (1976) and Cohen (1976), show how to adjust the standard chi-squared test statistic when there is an intra-class correlation. In the Bayesian approach these adjustments are not necessary because appropriate Bayesian models can be constructed to capture unusual features in the data.
For r × c categorical tables the Bayes factor is used to quantify the difference between a model with association and one without. It is the ratio of the posterior odds of one model against the other to their prior odds, and it is the same as the ratio of the marginal likelihoods of the data under the two models, one without association and the other with association. The Bayes factor has recently been of much scientific interest. However, there are two important difficulties with its use. First, it is sensitive to the prior specifications, especially when there are not enough data to estimate the parameters under test; see Sinharay and Stern (2002) for an interesting discussion on nested models. Second, and less important, we need to be careful with its interpretation; see Lavine and Schervish (1999). We discuss these issues in detail in Appendix A.
It is natural to think about alternative approaches when the cell counts do not follow a multinomial distribution, or some cell counts are small or zero. A Bayesian approach to the problem is desirable, especially when the assumption of a multinomial distribution does not hold; in this case, one can use the Bayes factor for testing association versus no association. The purpose of this paper is to alert practitioners to the improper routine use of the chi-squared test or the Bayes factor, and to suggest alternatives. Thus, the reasons for writing this paper are: (a) to review some of the defects of standard chi-squared testing; (b) to show that the Bayes factor is defective as an alternative to the chi-squared test because it is sensitive to prior specifications; and (c) to construct a Bayesian estimative alternative to the Bayes factor based on the interactions in the r × c table.
This paper has five more sections. Section 2 reviews the chi-squared test and the Bayesian test. In Section 3 we present the Bayesian estimative alternative to the test of independence. Section 4 reviews chi-squared testing with an intra-class correlation and gives a simple implementation of the estimative procedure. Section 5 presents several examples, related to activity limitation status and age, forms of material and heat in fire incidents, and bone mineral density and family income. Section 6 has a discussion.

Chi-squared Test and Bayesian Test: A Review
In this section we give a detailed quantitative review of the chi-squared statistic and the Bayes factor for tests of association between the two categorical variables in an r × c categorical table.

Standard chi-squared test
Let n_jk denote the number of individuals in the j-th row and k-th column of the r × c categorical table. Also let n_j. = Σ_{k=1}^c n_jk, n_.k = Σ_{j=1}^r n_jk, n = Σ_{j=1}^r Σ_{k=1}^c n_jk and e_jk = n_j. n_.k / n. Then, Pearson's chi-squared statistic, under independence of the row and column classifications, is

X_u = Σ_{j=1}^r Σ_{k=1}^c (n_jk − e_jk)² / e_jk.

If the responses from the individual members are independent and identically distributed, then as n goes to infinity, X_u converges in law to a chi-squared random variable with (r − 1)(c − 1) degrees of freedom. In practice, the validity of the chi-squared test depends on (a) the magnitude of the expected values e_jk, and (b) whether the cell counts (n_jk, j = 1, ..., r; k = 1, ..., c) follow a multinomial distribution given the sample size n (i.e., the individual responses are independent and identically distributed). In (a) the test is valid if the e_jk are all larger than 5; see Greenwood and Nikulin (1996, Chapter 1, Section 2). Clearly the only way to achieve this is to increase the sample size, subject to cost. In (b), when there is correlation among the members (e.g., intra-class correlation), the asymptotic distribution of X_u is no longer χ²_{(r−1)(c−1)}, and the estimates of the cell proportions can be inaccurate. The Pearson chi-squared test has received much attention. See Mirkin (2001) for a review of interpretations of the chi-squared statistic as a measure of association or independence.
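As an illustration (the table below is hypothetical, not data from this paper), Pearson's statistic and its p-value can be computed directly from these definitions:

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical 3 x 3 table of counts (not data from this paper).
n = np.array([[20, 30, 25],
              [15, 40, 30],
              [30, 20, 35]])

row = n.sum(axis=1, keepdims=True)   # n_{j.}
col = n.sum(axis=0, keepdims=True)   # n_{.k}
N = n.sum()
e = row @ col / N                    # expected counts e_{jk} = n_{j.} n_{.k} / n

X_u = ((n - e) ** 2 / e).sum()       # Pearson's chi-squared statistic
df = (n.shape[0] - 1) * (n.shape[1] - 1)
p_value = chi2.sf(X_u, df)

print(f"X_u = {X_u:.2f} on {df} degrees of freedom, p = {p_value:.4f}")
```

The same result is returned by `scipy.stats.chi2_contingency` with `correction=False`.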
We describe one solution that has been proposed for the problem with the asymptotic distribution when sampling is not simple random sampling. Let n_t denote the number of members in all families of the same size t = 1, ..., T, and let θ_t denote the intra-class correlation for clusters of size t (θ_1 ≡ 0). First, we describe the method of Rao and Scott (1981). Omitting redundancy, let V and W be the covariance matrices of the estimators of the cell proportions under the assumption of simple random sampling and a "complex" sample design respectively.
The design effects are the eigenvalues of the matrix V⁻¹W, and the adjusted chi-squared statistic is X_u/a, where a is the average of the eigenvalues. Motivated by Rao and Scott (1981), for cluster sampling Choi and McHugh (1989) derive an adjusted chi-squared statistic X_a, obtained by dividing X_u by an estimate of the average design effect based on the maximum likelihood estimators of the θ_t under their model. The statistic X_a ~ χ²_{(r−1)(c−1)} asymptotically, an improvement over X_u (i.e., more accurately χ²_{(r−1)(c−1)}). The p-value corresponding to the adjusted chi-squared statistic will be larger. For weighted data they divide X_a by the average weight. We note that sometimes X_a is difficult to compute because W itself is difficult to compute.
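A sketch of such an adjustment follows. The counts, the estimated intra-class correlations, and the specific weighted-average form of the design effect used here are our assumptions for exposition (a common first-order choice weighting the cluster inflation factor 1 + (t − 1)θ_t by the share of members in clusters of size t), not necessarily the exact Choi-McHugh estimator:

```python
from scipy.stats import chi2

# Hypothetical inputs (not from the paper): an unadjusted statistic X_u,
# members per family size t (n_t), and estimated intra-class
# correlations theta_t, with theta_1 = 0 for one-member families.
X_u = 9.60
df = 4
n_t = {1: 40, 2: 80, 3: 30, 4: 20, 5: 10}            # members in families of size t
theta_hat = {1: 0.0, 2: 0.55, 3: 0.1, 4: 0.1, 5: 0.1}

n = sum(n_t.values())
# Average design effect: weight the usual cluster inflation factor
# 1 + (t - 1)*theta_t by the share of members in clusters of size t.
D_hat = sum(n_t[t] * (1 + (t - 1) * theta_hat[t]) for t in n_t) / n

X_a = X_u / D_hat                                     # adjusted statistic
print(f"D_hat = {D_hat:.3f}, X_a = {X_a:.2f}, p = {chi2.sf(X_a, df):.3f}")
```

Since D_hat exceeds one whenever any intra-class correlation is positive, the adjusted statistic is smaller and its p-value larger, as the text describes.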

Bayes factor
We now discuss the Bayes factor as an alternative to the chi-squared test. If two models, M_0 and M_1, are fit to data y, the Bayes factor for comparing M_1 and M_0 is defined as the ratio of the marginal likelihoods of the data y,

B_10 = p(y | M_1) / p(y | M_0),  where  p(y | M_k) = ∫ p(y | θ_k, M_k) p(θ_k | M_k) dθ_k,

θ_k is the parameter vector under M_k, p(y | θ_k, M_k) is the probability density (or mass) function and p(θ_k | M_k) is the prior density. For example, in our application M_0 is the model of no association and M_1 is the model of association, or vice versa. The Bayes factor summarizes the evidence provided by the data in favor of one scientific hypothesis M_1 relative to another M_0. Kass and Raftery (1995) gave a comprehensive description of Bayes factors, including their interpretation. For example, if 0 ≤ log_e(B_10) < 1, the evidence against M_0 is "not worth more than a bare mention"; if 1 ≤ log_e(B_10) < 3, the evidence against M_0 is "positive"; if 3 ≤ log_e(B_10) < 5, the evidence against M_0 is "strong"; and if log_e(B_10) ≥ 5, the evidence against M_0 is "very strong".
For the r × c categorical table, we can consider two multinomial-Dirichlet models, one with association and the other with no association. The model with association takes n | π ~ Multinomial(n, π) with π ~ Dirichlet(1, ..., 1). For the model with no association, π_jk = π_j^(1) π_k^(2) with π^(1) ~ Dirichlet(1, ..., 1), and independently π^(2) ~ Dirichlet(1, ..., 1), where π^(1) and π^(2) have r and c components respectively. It is easy to show that the marginal likelihood with association (as) is

p_as(n) = (rc − 1)! n! / (n + rc − 1)!

and with no association (nas) is

p_nas(n) = {n! / Π_{j,k} n_jk!} × {(r − 1)! Π_j n_j.! / (n + r − 1)!} × {(c − 1)! Π_k n_.k! / (n + c − 1)!}.

Unfortunately, with this approach the Bayes factor is sensitive to the prior specification; see Nandram, Cox and Choi (2005), Nandram and Choi (2006) and the examples in Section 5. Thus, we consider estimation theory (credible intervals) to form a test (i.e., an inverted interval).
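The two marginal likelihoods under uniform Dirichlet priors are easy to compute on the log scale; in this sketch (the table is hypothetical, not from the paper) we use them to form the log Bayes factor:

```python
import numpy as np
from math import lgamma

def log_p_as(n):
    """Log marginal likelihood under association, pi ~ Dirichlet(1,...,1):
    p_as(n) = (rc - 1)! n! / (n + rc - 1)!, independent of the cell counts."""
    n = np.asarray(n)
    N, rc = n.sum(), n.size
    return lgamma(rc) + lgamma(N + 1) - lgamma(N + rc)

def log_p_nas(n):
    """Log marginal likelihood under no association, with independent
    uniform Dirichlet priors on the row and column probabilities."""
    n = np.asarray(n)
    N, r, c = n.sum(), n.shape[0], n.shape[1]
    row, col = n.sum(axis=1), n.sum(axis=0)
    out = lgamma(N + 1) - sum(lgamma(x + 1) for x in n.ravel())       # multinomial coeff
    out += lgamma(r) + sum(lgamma(x + 1) for x in row) - lgamma(N + r)
    out += lgamma(c) + sum(lgamma(x + 1) for x in col) - lgamma(N + c)
    return out

# Hypothetical 3 x 3 table (not data from the paper).
tab = np.array([[20, 30, 25], [15, 40, 30], [30, 20, 35]])
log_BF = log_p_as(tab) - log_p_nas(tab)   # log Bayes factor, association vs none
print(f"log BF (association vs no association) = {log_BF:.2f}")
```

Working with lgamma avoids overflow of the factorials for realistic sample sizes.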
How sensitive is the Bayes factor to the choice of the prior distributions? First, note that the prior density any reasonable person might use in this problem is the Dirichlet distribution. For the model with association we have selected the prior distribution π ~ Dirichlet(κ), and for the model with no association π^(1) ~ Dirichlet(κ^(1)) and independently π^(2) ~ Dirichlet(κ^(2)), where κ^(1) and κ^(2) have r and c components respectively. Then, it is easy to show that the Bayes factor for a test of association versus no association is the ratio of the corresponding Dirichlet-multinomial marginal likelihoods, where generically we take each of the components of κ, κ^(1) and κ^(2) to be δ [e.g., in p_as(n) and p_nas(n), δ = 1]. Sensitivity to the choice of prior distributions can be studied in terms of δ. Here δ = 1 corresponds to the uniform prior distribution and δ = .50 to Jeffreys' prior; these are "noninformative" priors usually used in the multinomial-Dirichlet model. We will call a study of BF as a function of δ a sensitivity analysis. In a small sensitivity analysis we can study the behavior of BF at δ = 0.1, 0.5, 1, 1.5, 2, 3 with an illustrative example.
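A small sensitivity analysis of this kind can be coded directly; here (with a hypothetical table, not data from the paper) the multinomial coefficient is common to both marginal likelihoods and cancels in the Bayes factor, so only the Dirichlet moment terms are needed:

```python
import numpy as np
from math import lgamma

def log_dir_term(counts, delta):
    """log E[prod_i p_i^{counts_i}] under a symmetric Dirichlet(delta) prior."""
    counts = np.asarray(counts, dtype=float).ravel()
    N, K = counts.sum(), counts.size
    return (lgamma(K * delta) - lgamma(N + K * delta)
            + sum(lgamma(x + delta) - lgamma(delta) for x in counts))

def log_BF(table, delta):
    """Log Bayes factor, association vs no association, with every prior
    Dirichlet parameter set to delta; the shared multinomial coefficient
    cancels between the two marginal likelihoods."""
    table = np.asarray(table)
    return (log_dir_term(table, delta)
            - log_dir_term(table.sum(axis=1), delta)
            - log_dir_term(table.sum(axis=0), delta))

# Hypothetical 3 x 3 table (not data from the paper).
tab = np.asarray([[20, 30, 25], [15, 40, 30], [30, 20, 35]])
for delta in (0.1, 0.5, 1.0, 1.5, 2.0, 3.0):
    print(f"delta = {delta:>4}: log BF = {log_BF(tab, delta):7.2f}")
```

Plotting or tabulating log BF against δ in this way is exactly the sensitivity analysis described above.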

The Bayesian Estimative Alternative to the Test of Independence
Our test for independence is simple, and our discussion consists of two parts. First, we state and prove a theorem which shows that the test of association is equivalent to a test that the (r − 1)(c − 1) two-way interactions are all 0.
Second, rather than using the Bayes factor for hypothesis testing, we obtain a 95% simultaneous credible region, a hyper-rectangle, for the (r − 1)(c − 1) interaction effects in the two-way table, and then we check to see whether each interval contains 0. Thus, we consider the problem of estimating the effects in an r × c categorical table.
Our basic model is n | π ~ Multinomial(n, π), where n = {n_jk, j = 1, ..., r, k = 1, ..., c} is the vector of cell counts. A priori we take π ~ Dirichlet(1), where 1 is a vector of ones. Now, we include the effects in the r × c table by taking

log(π_jk) = A + α_j + β_k + γ_jk, j = 1, ..., r; k = 1, ..., c,

where A is a normalizing constant. Note that in the general log-linear model for r × c tables log(nπ_jk) = µ + α_j + β_k + γ_jk, and the grand mean µ gets absorbed into A; see Agresti (1990, Section 5.1) for a discussion of the saturated model.
We need a 95% simultaneous credible region for γ_jk, j = 1, ..., r − 1; k = 1, ..., c − 1. We can do so using the method of Besag et al. (1995). Note that we do not need to specify the prior distributions of the α_j, β_k and γ_jk. These parameters inherit their prior and posterior densities from the π_jk.
It is easy to draw a random sample from the posterior distribution of π; let π^(h), h = 1, ..., M, denote a large random sample (M = 10,000). Then we have automatically a large random sample from the posterior distribution of γ by taking

γ_jk^(h) = log{π_jk^(h) π_rc^(h) / (π_jc^(h) π_rk^(h))}, j = 1, ..., r − 1; k = 1, ..., c − 1; h = 1, ..., M.

Now, we can find a 95% simultaneous credible region for the {γ_jk}, obtained by using the sample {γ_jk^(h)}, h = 1, ..., M; see our Appendix C for the method of Besag et al. (1995).
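This sampling step is straightforward; the sketch below (hypothetical table, not data from the paper) draws from the Dirichlet posterior under a uniform prior and transforms the draws to interaction contrasts with the (r, c) cell as the reference:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical 3 x 3 table (not data from the paper).
n = np.array([[20, 30, 25], [15, 40, 30], [30, 20, 35]])
r, c = n.shape
M = 10_000

# Posterior under a uniform Dirichlet prior is Dirichlet(n_jk + 1).
pi = rng.dirichlet((n + 1).ravel(), size=M).reshape(M, r, c)

# Interaction contrasts gamma_jk = log(pi_jk pi_rc / (pi_jc pi_rk)),
# computed for all (r-1)(c-1) cells at once by broadcasting.
lp = np.log(pi)
gamma = (lp[:, :r-1, :c-1] - lp[:, :r-1, c-1:]
         - lp[:, r-1:, :c-1] + lp[:, r-1:, c-1:])
print("posterior means of gamma_jk:\n", gamma.mean(axis=0).round(2))
```

Each row of `gamma` is one joint posterior draw of all the interactions, which is what the simultaneous credible region requires.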
Note that for a 2 × 2 table there is just a single γ = log{π_11 π_22 / (π_12 π_21)}, the log of the odds ratio, so that we can have a credible interval for γ. The odds ratio is exp(γ), and we can find a 95% credible interval for it by using exp(γ^(h)), h = 1, ..., M.
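For a 2 × 2 table the whole procedure thus reduces to a credible interval for the odds ratio; a minimal sketch with a hypothetical table (not data from this paper):

```python
import numpy as np

rng = np.random.default_rng(11)

# Hypothetical 2 x 2 table (not data from the paper).
n = np.array([[25, 15], [10, 30]])
M = 10_000

# Posterior draws of (pi_11, pi_12, pi_21, pi_22) under a uniform prior.
pi = rng.dirichlet((n + 1).ravel(), size=M)
odds_ratio = pi[:, 0] * pi[:, 3] / (pi[:, 1] * pi[:, 2])

lo, hi = np.percentile(odds_ratio, [2.5, 97.5])   # equal-tailed 95% interval
print(f"95% credible interval for the odds ratio: ({lo:.2f}, {hi:.2f})")
print("association indicated:", not (lo <= 1.0 <= hi))
```

An interval that excludes 1 for the odds ratio (equivalently, excludes 0 for γ) indicates association.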
Note that the 95% simultaneous credible region formed by this procedure (if r, c > 2) is a hyper-rectangle. We can easily form a test that γ_jk = 0, j = 1, ..., r − 1; k = 1, ..., c − 1. This is done by simply checking whether the hyper-rectangle contains γ = 0. If each of the component intervals contains 0, then the hyper-rectangle does; in that case we do not reject independence (i.e., no association) between the two attributes, and if at least one interval does not contain 0, there is dependence (i.e., association) between the two attributes. This is analogous to the F-test of the regression coefficients in a normal-theory linear regression model.
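A minimal sketch of the hyper-rectangle test (our code, with a hypothetical table), using the order-statistic construction that Besag et al. (1995) propose for simultaneous credible intervals:

```python
import numpy as np

def simultaneous_region(draws, level=0.95):
    """Simultaneous credible intervals from posterior draws (M x d),
    following the order-statistic construction of Besag et al. (1995):
    find the smallest k such that at least level*M draws have all of
    their component ranks inside [M + 1 - k, k]."""
    M, d = draws.shape
    order = np.sort(draws, axis=0)
    ranks = np.argsort(np.argsort(draws, axis=0), axis=0) + 1  # 1..M per column
    t = np.maximum(ranks, M + 1 - ranks).max(axis=1)           # worst rank per draw
    k_star = int(np.sort(t)[int(np.ceil(level * M)) - 1])
    return np.column_stack([order[M - k_star], order[k_star - 1]])

rng = np.random.default_rng(3)
n = np.array([[20, 30, 25], [15, 40, 30], [30, 20, 35]])  # hypothetical counts
r, c = n.shape
M = 10_000
pi = rng.dirichlet((n + 1).ravel(), size=M).reshape(M, r, c)
lp = np.log(pi)
gamma = (lp[:, :r-1, :c-1] - lp[:, :r-1, c-1:] - lp[:, r-1:, :c-1]
         + lp[:, r-1:, c-1:]).reshape(M, -1)

region = simultaneous_region(gamma)
print("95% simultaneous intervals:\n", region.round(2))
print("reject independence:",
      bool(np.any((region[:, 0] > 0) | (region[:, 1] < 0))))
```

By construction at least 95% of the joint draws lie inside the hyper-rectangle, so checking whether it contains the origin gives the test described above.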
In an r × c table there are (r − 1)(c − 1) interactions. The theorem shows that under independence all these interactions must be zero. One should not compute (r − 1)(c − 1) individual 95% credible intervals, because the overall coverage will be much less than 95%. Thus, a 95% simultaneous credible region (a hyper-rectangle in (r − 1)(c − 1)-dimensional Euclidean space) is needed; see Miller (1981) for a general discussion of simultaneous inference. Simultaneous regions of the regression parameters in the standard normal-theory linear model are optimal ellipsoids (Box and Tiao, 1973). However, because r and c are both generally small, the individual intervals which form the hyper-rectangle are not much wider than the individual 95% credible intervals. Nandram and Choi (2006) used results from Altham (1976) to describe a Bayesian methodology to fit "multinomial" data when there is an intra-class correlation. We first review this model, and then we show how to obtain the 95% simultaneous credible region for the interactions. We also show more explicitly the difficulties of using the chi-squared test when there is an intra-class correlation.

Chi-Squared Testing and Intra-Class Correlation
Suppose there are s_i individuals in the i-th cluster, i = 1, ..., ℓ, and s_ijk individuals fall in the j-th row and k-th column of the r × c table, j = 1, ..., r; k = 1, ..., c. Here Σ_{j=1}^r Σ_{k=1}^c s_ijk = s_i, s_ijk ≥ 0. Also, let C denote the set of clusters in which all individuals fall in a single cell of the r × c table. Let s_i = (s_i11, ..., s_irc), i = 1, ..., ℓ, and s = (s_1, ..., s_ℓ), where we take θ_1 = 0 for one-member families. Suppose each cluster has size t, t = 1, ..., T; in applications T is 2 to 5 or so. Also, let g_tjk denote the number of clusters in C of size t with all individuals in cell (j, k), and g̃_t the number of clusters of size t outside C. Then, using formulas in Altham (1976) and assuming independence over the clusters, the likelihood is

p(s | π, θ) = Π_{t=1}^T Π_j Π_k {θ_t π_jk + (1 − θ_t) π_jk^t}^{g_tjk} × Π_{i∉C} (1 − θ_{s_i}) s_i! Π_j Π_k π_jk^{s_ijk} / s_ijk!.   (4.2)

Note that when π_jk = π_j^(1) π_k^(2), we simply replace π_jk in (4.2) by π_j^(1) π_k^(2). Note also that there are two parts in (4.2): one corresponds to the clusters in C and the other to those clusters outside C.
The estimative procedure is easy to implement. The procedure to obtain the 95% simultaneous credible region for γ is similar to the one in Section 3; here we obtain the sample {π^(h), h = 1, ..., M} from the posterior distribution corresponding to the likelihood in (4.2). For the Bayes factor, with π^(1) having r components and π^(2) having c components, the marginal likelihoods with association and with no association are no longer available in simple closed form; Nandram and Choi (2006) construct Monte Carlo consistent estimators of the marginal likelihoods.

Examples
In this section we discuss four examples to compare the methods for testing association in r × c categorical tables, and to show the sensitivity of the Bayes factor to prior specifications. In Example 1 we show that inference from the chi-squared test and the simultaneous credible region is similar, but differs from that of the Bayes factor. In Example 2 we show that when there is a strong association, the three methods give the same inference.
While in Examples 1 and 2 there is mild sensitivity of the Bayes factor to the choice of the prior specifications, in Example 3 we show that there can be strong sensitivity to the prior specifications and to the value of a single cell count. In Example 4 we show how the intra-class correlation affects inference by the three methods.

Example 1: Bone mineral density and family income
We consider the 3 × 3 categorical table of bone mineral density (BMD) and family income (FI) for a probability sample drawn from the U.S. population. FI is a discrete variable with three levels: low, medium and high. While BMD is a continuous variable, the World Health Organization has classified BMD into three levels: normal, osteopenia and osteoporosis. BMD is used to diagnose osteoporosis, a disease of elderly females, and in the National Health and Nutrition Examination Survey (NHANES III) it is measured for individuals at least twenty years old. In Table 1 we summarize the data on white females with chronic conditions. Note: BMD: 1 (> 0.82 g/cm²; normal), 2 (> 0.64, ≤ 0.82 g/cm²; osteopenia), 3 (≤ 0.64 g/cm²; osteoporosis); FI: 1 (< $20,000), 2 (≥ $20,000, < $45,000), 3 (≥ $45,000); BMD is only measured for age 20+.
Under independence (i.e., no association) the observed chi-squared statistic is 12.7 on 4 degrees of freedom with a p-value of .013, and no association is rejected. The log Bayes factor is 3.40 for evidence of no association relative to association. Therefore, while the chi-squared test provides strong evidence against no association, the log Bayes factor provides strong evidence for no association. This is contradictory evidence: the chi-squared test rejects independence and the Bayes factor accepts independence.
We have performed a sensitivity analysis of the Bayes factor for the prior parameter δ = .1, .5, 1, 2, 3. The corresponding values of the log Bayes factor are −7.0, −3.6, −3.4, −4.7, −6.6. The log Bayes factor is sensitive to the choice of δ, but around δ = 1 (i.e., the uniform prior) there is little sensitivity. Here the evidence for independence from both prior distributions (Jeffreys' and uniform) is "strong", and this evidence increases on either side of δ = 1. We note that over these values of δ the 95% simultaneous credible region changes very little (i.e., it is robust to the prior specification), and inference is unaffected.

Example 2: Material and Heat
The U.S. Consumer Product Safety Commission (CPSC) staff analyzes fire incident data to determine patterns and losses (e.g., fire deaths, injuries and property damage) associated with fires that involve household products. The purposes of this analysis are (1) to support or evaluate standards that would make household products less likely to ignite and (2) to identify products that pose new hazards and the patterns of usage associated with such hazards. Attributing fire losses to fire causes is an important part of this task (Greene et al. 2002). In Table 2 we present a 4 × 6 table of forms of materials by forms of heat of 468 fires completely classified, a scaled-down version of a typical problem at CPSC (Greene et al. 2002). Note that the sparseness of the data in Table 2 makes the standard asymptotic chi-squared test untrustworthy. Note: As reported by Greene et al. (2002), the values in the table do not represent actual data; they describe the table as a scaled-down hypothetical CPSC raking problem. Forms of materials are 1 (Not furniture), 2 (Furniture not in scope), 3 (Upholstered furniture), 4 (Unknown furniture), and forms of heat are 1 (Fuel fired in scope), 2 (Fuel fire not in scope), 3 (Fuel fire unknown if in scope), 4 (Smoking materials in scope), 5 (Smoking materials not in scope) and 6 (Smoking materials unknown if in scope).

Example 3: Sensitivity to a single cell count
In this example we vary the count in the (3, 3) cell of Table 2 and recompute the log Bayes factor as a function of δ (albf1 corresponds to n_33 = 1; albf2 to n_33 = 2; albf3 to n_33 = 3; albf4 to n_33 = 4; and albf5 to n_33 = 5). Observe how sensitive the Bayes factor is to the value of the (3, 3) cell, and to the choices of δ. As δ moves away from one on either side, the log Bayes factor decreases rapidly from a positive value to negative values. The chi-squared test of no independence versus independence is no better. At n_33 = 1, 2, 3, 4, 5 the p-values are .006, .015, .031, .061, .106 respectively. Independence is rejected at the 5% significance level for the first three choices of n_33 but not for the others, showing sensitivity to n_33. Inference from the estimative alternative remains unaltered.

Example 4: Activity limitation status and age with intra-class correlation
We use data from the U.S. National Health Interview Survey on activity limitation status (ALS), a measure of long-term disability resulting from chronic conditions. In Table 3 we present a 3 × 3 categorical table of ALS by age for two states. In State 1 there are 164 sampled adults and in State 2 there are 153 sampled adults. Note that in both tables there are very sparse cell counts, especially for the first category of ALS in State 1. We also note that in both tables members come from 1-member to 5-member families. There is significant correlation between the two members of the 2-member families. A 95% confidence interval for θ_2 in State 1 is (.36, .71) and in State 2 it is (.50, .81) (θ_2 is the intra-class correlation between the two members of the 2-member families). The intra-class correlations among the members of the 3-member, 4-member and 5-member families are not substantial. Thus, we need to adjust the chi-squared test for the intra-class correlation. We discuss each state separately.
First, consider State 1. For a test of no association versus association, the log Bayes factor is 5.40 for the model with intra-class correlation and 2.56 for the model with no intra-class correlation. This difference is important because the evidence in the former is "very strong" (i.e., 5.40) whereas in the latter it is "positive" (i.e., 2.56).
The unadjusted chi-squared statistic is 9.60 on 4 degrees of freedom with a p-value of .048, and the adjusted chi-squared statistic is 7.75 on 4 degrees of freedom with a p-value of .101. Thus, the adjusted test does not reject the hypothesis of independence (p = .101) whereas the unadjusted test rejects independence (p = .048) at the 5% significance level, with the unadjusted chi-squared test leading to an erroneous conclusion.
Using the model with intra-class correlation, the 95% simultaneous credible region is formed by the intervals (−2.93, 4.49), (−2.66, 5.59), (−2.56, 2.43), (−2.12, 3.29); all the intervals contain 0 and the estimative procedure does not reject independence (the same conclusion as the adjusted chi-squared test). We note that, using the model with no intra-class correlation, inference from the 95% simultaneous credible region remains unchanged. Thus, unlike the Bayes factor and the chi-squared test, the estimative procedure gives the same result under the two models (intra-class and no intra-class correlation) in this borderline situation.
Second, consider State 2. For a test of association versus no association, the log Bayes factor is 3.09 for the model with intra-class correlation and 0.38 for the model with no intra-class correlation. This difference is important because the evidence for association in the former is "strong" whereas in the latter it is "not worth more than a bare mention".
The unadjusted chi-squared statistic is 13.52 on 4 degrees of freedom with a p-value of .009, and the adjusted chi-squared statistic is 11.18 on 4 degrees of freedom with a p-value of .025. Thus, there is strong evidence against independence with the unadjusted test, but for the test adjusted for correlation the evidence is marginal at the 2.5% significance level.
Using the model with intra-class correlation, the 95% simultaneous credible region is formed by the intervals (−2.39, 5.09), (−1.35, 5.95), (−4.49, −0.26), (−2.39, 1.16); only the interval for γ_21 does not contain 0, and the estimative procedure rejects independence (or no association). Using the model with no intra-class correlation, inference from the 95% simultaneous credible region remains unchanged, but now it is only the credible interval for γ_12, not γ_21, that does not contain 0.

Discussion
We have reviewed two current tools, the chi-squared test and the Bayes factor, for testing association in two-way tables. We have demonstrated the difficulties associated with the standard chi-squared test in two-way categorical tables when the multinomial assumptions are violated. We have also demonstrated difficulties associated with the use of the Bayes factor, especially its sensitivity to prior specifications. To overcome these problems, we propose a new method for "testing" association, capitalizing on the fact that Bayesian hypothesis testing is sensitive to prior specifications whereas estimation is not that sensitive.
Thus, responding to the important issue of the sensitivity of Bayes factors to the choice of the prior distributions in Bayesian hypothesis testing, we introduce a Bayesian estimative procedure for the analysis of two-way categorical tables. In a two-way categorical table, we have utilized a standard relation between the cell probabilities and the main effects and interactions. A 95% simultaneous credible region for the interactions provides a "test" of independence. This is easy to implement, and it requires only samples from the posterior distribution of the interactions.
We also demonstrated that the chi-squared test is inaccurate in nonstandard situations such as when there is an intra-class correlation in the data (e.g., familial data). These situations require a model more elaborate than the simple multinomial-Dirichlet model. Thus, the chi-squared test needs special adjustment, and the Bayes factor is also sensitive to prior specifications. Our new method for assessing independence via the 95% simultaneous credible region is simple and not sensitive to prior specifications. When there is strong association between the two attributes, the three methods give the same conclusions. In borderline cases, adjustment is necessary, and our method is to be preferred.
We recommend the 95% simultaneous credible region for making inference about independence in two-way categorical tables in non-standard situations where both the chi-squared test and the Bayes factor can fail. One can possibly quantify the strength of the evidence by counting the number of credible intervals containing 0; this is a topic for further research. There are other situations where the 95% simultaneous credible region will be useful. For example, when missing data differ from observed data, the chi-squared test will fail. It is possible to extend our Bayesian estimative procedure to higher-dimensional categorical tables. We have not addressed model averaging in this paper because a test is performed using a selected model.

Appendix A: A Discussion of the Defects of the Bayes Factor
We discuss some defects of the Bayes factor in hypothesis testing. The discussion here holds for any Bayesian hypothesis testing problem in which the Bayes factor is used; thus it applies directly to the test of association in an r × c categorical table. Kass and Raftery (1995) have popularized the use of the Bayes factor in scientific problems. However, they have also discussed controversies associated with its use. As is well known, the Bayes factor requires the specification of a prior distribution. They stated, "This may be considered both good and bad. Good, because it is a way of including other information about the values of the parameters. Bad, because these prior densities may be hard to set when there is no such information." Indeed, a serious problem with the Bayes factor is its sensitivity to the prior specification, as Kass and Raftery (1995) wrote: "An important issue is the sensitivity of the Bayes factor to the choices of priors." With respect to this sensitivity, Kass and Raftery (1999) wrote "Also in contrast with Bayesian point estimates such as the posterior mean, the Bayes factor does tend to be sensitive to the choices of priors on the model parameters," and they also stated that "Bayes factor tends to be more sensitive to the choice of prior than the posterior probability of an interval." These statements are now well known (e.g., see Kass 1993). Berger and Pericchi (1996) argue that one should operate in strict accordance with two basic premises, that selection should have a Bayesian basis and that it should be automatic, and they stated that "The reason is that Bayes factors in hypothesis testing and model selection typically depend rather strongly on the prior distributions, much more so than in, say, estimation." We take advantage of these difficulties of the Bayes factor to construct a Bayesian estimative alternative to the test of association in a two-way categorical table.
For a sensible calibration of the Bayes factor, proper priors are needed (i.e., they must integrate to 1). Berger and Pericchi (1996) wrote "For most model selection problems, one cannot use standard improper noninformative priors; such priors are defined only up to a constant multiple, and the Bayes factor is itself a multiple of this arbitrary constant." Of course, this difficulty can be overcome by using proper priors, but then there is the issue of sensitivity to the prior specification. To overcome this problem, Berger and Pericchi (1996) introduced intrinsic Bayes factors. But these have their own problems. Clearly, intrinsic Bayes factors are not Bayes factors: they require part of the data to be treated as a training sample; they require enormous computation even if a minimal training sample or a small random sample of the set of training samples is used; and they are not calibrated with respect to the strength of the evidence as in Kass and Raftery (1995). Moreover, for our problem in the r × c categorical table, the set of minimal training samples is essentially empty; both Jeffreys' prior and the uniform prior are proper. The use of the intrinsic Bayes factor fixes the calibration problem, but not the sensitivity problem, and therefore it is not appropriate in our context.
There is also a difficulty in interpreting the Bayes factor, as pointed out by Lavine and Schervish (1999): "The removal of the prior odds from the posterior odds to produce the Bayes factor has consequences that affect the interpretation of the resulting ratio." As they also noted, "Just because the data increase the support for a hypothesis H relative to its complement does not necessarily make H more likely than its complement, it only makes H more likely than it was a priori." See Lavine and Schervish (1999) for further discussion. We do not address this issue further in this paper.

Table 1 :
Classification of bone mineral density (BMD) and family income (FI) for 1,844 white females, at least 20 years old (20+)

Table 2 :
Classification of fire deaths by form of materials and form of heat for 468 fires

Table 3 :
Classification of activity limitation status (ALS) and Age for two states