Analysis of Unbalanced Microarray Data

This paper investigates statistical procedures for analyzing microarray gene expression data obtained from studies with an unbalanced experimental design. We demonstrate the methods using microarray data from a study of opioid dependence in mice. The experiment was designed to investigate how morphine dependence alters gene expression in spinal cord mRNA. The aim was to identify genes that characterize the tolerance, withdrawal and two abstinence stages of dependence and to describe how gene expression is altered in moving from one stage to the next. The study design was unbalanced in several respects. First, for mice receiving morphine, arrays were made for four dependence stages, while for mice receiving placebo, arrays were made for only three stages. Second, administrative error led to an omitted replication for one treatment combination. Third, some expression readings were missing. Extending the two-stage ANOVA model of Lee et al (2000, 2002a) this paper first uses a chi-square statistic to identify a small set of genes that exhibit differential expression over one or more treatment combinations. This gene set is then examined further using cluster analysis and novel inference methods to uncover specific genes and gene clusters that play a role at different stages of opioid dependence and, in particular, a role in the persistence of effect into the late abstinence stage. The latter effect implies that morphine dependence has a long-term genetic impact. The statistical power of the study to uncover differentially expressed genes is calculated as a prelude to further investigation. The analytical results proved useful to scientists in understanding the link between opioid dependence and gene function.


Introduction
Linear statistical models, such as the methods of analysis of variance (ANOVA) as discussed by Kerr, Martin, et al (2000), Kerr and Churchill (2001), Lee et al (2000Lee et al ( , 2002a)), and mixed-effects statistical models discussed by Wolfinger, Gibson, et al (2001), have been widely used for analyzing microarray gene expression data.These investigations, however, have tended to focus on complete and balanced designs.A complete and balanced design is an ideal that is not always achieved in practice.The cost of conducting a microarray experiment, in terms of labor, time and money, sometimes makes it impractical to use a balanced design.Also, some studies set out with a balanced design but become unbalanced during implementation because of technical and administrative problems.Finally, it is common for scientists to embark on a study prior to consulting a statistician and the chosen design may be unbalanced.In recognition of this reality, this paper presents methods for analyzing microarray data from unbalanced designs.We illustrate the methods using data from a study of opioid dependence in mice.
This article extends the two-stage ANOVA model considered in Lee et al (2000Lee et al ( , 2002a) ) to unbalanced designs.First, we use a chi-square statistic to identify a small set of genes that exhibit differential expression over one or more treatment combinations.This gene set is then examined further using cluster analysis and novel inference methods to uncover specific genes and gene clusters that play a role at different stages of opioid dependence and, in particular, a role in the persistence of effect into the late abstinence stage.The latter effect implies that morphine dependence has a long-term genetic impact.We calculate the statistical power of the study to uncover differentially expressed genes as an aid to planning the scale of future studies in this series of investigations.Finally, we show how the analytical results proved useful to scientists in understanding the link between opioid dependence and gene function.

Experimental method and design
The impact of opioid dependence on the genetic makeup of spinal cord and brain tissue is medically and scientifically important and has attracted the attention of a number of researchers -see Loguinov et al. (2001) and references therein.The experiment reported here was designed to investigate how morphine dependence in mice alters gene expression in spinal cord mRNA.The study design involved two treatments, morphine and placebo, and four time points corresponding to consecutive stages of opioid dependence, classified as tolerance, withdrawal, early abstinence and late abstinence.The aim of the study was to identify genes that characterize the tolerance, withdrawal and two abstinence stages and to describe how gene expression is altered in moving from one stage to the next.
Pellets containing 75 mg of morphine base (treatment) or placebo (control) were implanted subcutaneously in eight-week old male mice.Treatment mice were sacrificed sequentially at four time points corresponding to the four dependence stages.Control mice were sacrificed at the same time points, with the exception of the withdrawal stage which was omitted on the assumption that the tolerance and withdrawal stages are identical in placebo group.Thus, four arrays were made for mice given the morphine treatment, one for each dependence stage, while only three arrays were made for mice given the placebo treatment.
The microarray data we consider resulted from hybridization of the dorsal part of the mouse spinal cord samples to a custom-designed cDNA array.The cDNA array contains 1667 cDNA fragments and 61 control spots.At each time point (i.e., at each stage), in both the treatment and control groups, three mice were sacrificed, for a total of 21 mice.The paucity of spinal column mRNA in any single mouse required that the mRNA of the three mice sacrificed together be combined and blended into a single sample.The treatment and control samples were labeled with Cy-5 dye and read on the red channel.Other control samples, derived from mouse brain tissue, were labeled with the Cy-3 (green) dye.As the purpose of this investigation is to learn how morphine dependence alters gene expression in spinal cord mRNA, the Cy-3 readings from mouse brain tissue ( a different tissue type) are not used in the analysis reported here.See Lee et al (2000) for another illustration of the use of single-channel microarray readings.DNA microarray analysis of gene expression was done essentially as described by Eisen and Brown (1999) with some modifications (Loguinov et al. 2001).
For each gene g, fluorescent expressions were measured for treatment group (1= placebo, 2= morphine), and time stage t (1= tolerance, 2= withdrawal, 3= early abstinence, 4= late abstinence), except for the placebowithdrawal combination where no array was created.The placebo-withdrawal combination was not implemented on the assumption that it would be identical to the placebo-tolerance combination.The original study plan called for each spinal column sample to be placed on two spots of the same slide, yielding a replicated expression reading.This plan was not carried through perfectly, however, as the spot replicate for the morphine-tolerance combination was omitted by mistake.
Finally, as often happens in microarray experiments, it was found that the array for the morphine-late abstinence combination had a large number of defective spots and several dozen spots on other arrays were faulty.The final microarray data set contains readings for only 1722 genes out of the original set of 1728 (=1667+61).Six genes were dropped because they had defective readings for the morphine-tolerance combination, the unreplicated treatment combination in the design.We did not choose to employ imputation to replace missing values.It was felt that any imputation assumption would be too speculative to be justified.
In summary, therefore, we find three elements of imbalance in this design: (1) an omitted treatment combination, (2) an omitted replicate and (3) several missing expression readings.

Analytical framework
As the design is not a complete factorial design (one cell is not complete), we shall initially view the K = 7 cells as seven individual experimental conditions, indexed by k = k 1 , . . ., k 7 .The reading Y gkr represents the natural logarithm of the raw Cy-5 gene expression reading, without background correction, for gene g, treatment combination k and spot replicate r.Table 1 shows the correspondence of the treatment combinations to the index values.
Table 1.The experimental design for the study involves seven treatment combinations of time stage and morphine treatment.The placebo-withdrawal treatment combination was not implemented on the assumption it would be identical to the placebo-tolerance combination.All treatment combinations involved replicated spots except the morphine-tolerance combination which had no replicate because of administrative error.

Dependence Time Stage
As shown by Lee et al (2002a), the following two-stage ANOVA model is appropriate for this kind of data set.
In equation (3.1), µ represents the overall population mean log-expression intensity, τ k is the main effect for the array corresponding to experimental condition k, and W gkr is a normalized (centered) quantity that captures all the gene-specific effects of the experimental condition.In (3.2), γ g is the main effect for gene g, (γτ ) gk is the interaction effect for experimental condition k and gene g, and gkr is an error term.Interaction (γτ ) gk reflects the differential expression of gene g in condition k.We assume that the error terms gkr have mean 0, have a common variance and are mutually independent.We make no assumption about their distribution form.
Recall that the original study design called for two spot replicates per treatment combination.The replication in the design ensures ample degrees of freedom for estimating all parameters.We note for clarity that the notation in models (3.1) and (3.2) does not take explicit account of the omitted spot replicate in treatment combination k 4 or the missing readings for defective spots.
The estimation of the parameters in equations (3.1) and (3.2) proceeds in two stages.The residuals from fitting the first model are estimates of the W gkr .The estimates τk are normalizing constants for the arrays.As there are G = 1722 genes in our microarray data set and the design is unbalanced, fitting the second model involves substantial calculation.In essence, a regression model is fitted for each gene g to provide estimates of γ g and (γτ ) gk , for k = k 1 , . . ., k 7 , g = 1, . . ., G. We impose sum-to-zero estimability constraints on each of the sets of parameters for main effects and interactions.
We shall use the simpler notation I gk for the interaction term (γτ ) gk and denote its estimate by Îgk .The Îgk are the estimated differential effects for the seven treatment combinations.

Analysis and results
We now describe the results of several analyses of differential gene expression in this experiment.

Differentially Expressed Gene Pool
First, we investigate which genes, if any, are differentially expressed in any of the treatment combinations, reflecting effects arising from either the time stage or morphine treatment.This first analysis identifies a set of genes that are qualified for study in subsequent stages of analysis in the sense that they show evidence of some form of differential expression.The sum of the squared interaction effects, shown next, is a summary measure that captures the combined effects across all treatment combinations.
We do not use measure C g directly because of the imbalance in this experimental design.Instead, we use the model sum of squares from the regression analysis of (3.2) for gene g.The model sum of squares is a weighted-sum version of C g .We denote this quantity by SSM g .With a balanced design having r readings for gene g on each of the treatment combinations, we note that C g and SSM g would be connected as follows: SSM g = rC g .
If the ANOVA error term in (3.2) were normally distributed, one might wish to use the regression F statistic F * g = MSM g /MSE g to test the null hypothesis of no differential expression for gene g, i.e., the hypothesis H 0 : I gk = 0 for k = k 1 , . . ., k 7 .Here MSM g = SSM g /6 and MSE g denote the regression mean squares for the model and error ANOVA components, respectively.The test would entail comparing F * with an appropriate F percentile.
We wish to avoid an assumption about the error term distribution, however, and thus we use the following chi-square based rule.This rule requires that the estimated interaction effects Îgk be approximately normally distributed.With replicated observations, the central limit theorem gives some assurance of approximate normality in this context.
Proceeding, we let M denote the median value of the SSM g values for all G = 1722 genes.We then calculate the statistic where K − 1 = 7 − 1 = 6 in this application.Quantity χ 2 K−1 (0.5) denotes the median of a χ 2 K−1 variable.The ratio of the respective medians in (4.2) scales the quantity SSM g appropriately so that S g will have an approximate χ 2 K−1 distribution under the null hypothesis of no differential expression, i.e., under H 0 : I gk = 0 for k = k 1 , . . ., k 7 .The distributional approximation holds if at least 50 percent of the genes are not differentially expressed, which we believe to be a reasonable assumption in this study.Now, we choose the following chi-square percentile as a cutoff for S g that separates differentially expressed genes from those that are not In this application, this cutoff value equals χ 2 6 (1722/1723) = 23.75.Criterion (4.3) has the property that it tends to produce one false positive in each application.

Cluster analysis
Next, we employ cluster analysis to identify subsets of the 43 genes in the differentially expressed gene pool that have similar expression profiles across the seven treatment combinations.For this purpose, we employ the routine for hierarchical cluster analysis from the JMP statistical software system of the SAS Institute.The clustering routine uses the standardized values of the Îgk and the centroid linkage criterion.The results are shown by a gene-wise dendrogram in Figure 1.We see a number of distinct clusters.We cut the tree at a distance of 1.423, creating 22 clusters.Table 3 shows the membership of these clusters.Seven clusters contain multiple genes and 15 are singleton clusters.

Comparing genes at the late abstinence stage
Continuing, we focus our investigation on differential expression for the late abstinence stage, i.e. time stage 4, to see if morphine administration has a lasting impact on gene expression.For this purpose, we are interested  3. Verticle line shows the selected cluster solution.
in the following difference between condition k 3 , placebo at time 4, and condition k 7 , morphine at time 4.
The estimate of this difference is denoted by dg , where dg = Îgk 7 − Îgk 3 .Again, based on the approximate normality of the estimated interaction effects, we have that dg follows an approximate N(0, σ 2 0 ) distribution under the null hypothesis H 0 : d g = 0, where σ 2 0 denotes the null variance parameter.This hypothesis states that there is no differential expression between the morphine and placebo treatments in the late abstinence stage.
We look at both the sample standard deviation of the dg and the standard deviation estimated from the interquartile range of the dg for the 1722 genes as alternative estimates of the null standard deviation σ 0 .The respective values are 0.5564 and 0.5942.They differ little.To be conservative, we shall use the smaller value as our estimate of σ 0 , denoting the estimate by σ0 .We then compute the standardized values z g = dg /σ 0 and compare |z g | for each gene g with the standard normal percentile z[G/(G + 1)] = z(1722/1723) = 3.2483.Values of |z g | that exceed this percentile are judged to be differentially expressed.This cutoff criterion will tend to give one false positive in each tail (two false positives in total) with each application.Table 4 lists the four genes that are up-regulated and the one gene that is down-regulated in the morphine treatment, relative to the placebo, in the late abstinence stage.All of these genes are in the differentially expressed gene pool identified in the first analysis.As Table 4 lists only five genes in total and two are expected to be false positives, three genes are expected to be true positives.

Summarizing differences across all time stages
As a final analysis, differences between morphine and placebo treatment combinations of the type shown in formula (4.4) were calculated for all four time stages for the whole gene pool.For the withdrawal stage, the interaction estimates Îgk 1 from the placebo-tolerance combination were used as estimates for the missing placebo-withdrawal combination in the experimental design.We wish to test the null hypothesis H 0 : d gt = 0 for all t.Here d gt is the difference for gene g in time stage t and t = 1, . . ., T , with T = 4.The standardized difference corresponding to estimate dgt is denoted by z gt and is calculated here using the standard deviation of all differences dgt at time stage t.As an approximate test procedure, we employ the following statistic, which is an analog of the earlier chi-square statistic (4.1).
A gene is judged to be differentially expressed between the morphine and placebo treatments at one or more dependence time stages if its statistic D g exceeds χ 2 T [G/(G + 1)] = χ 2 4 (1722/1723) = 19.67.In this application of the test procedure, D g will be only approximately distributed as χ 2 4 under H 0 .With the approximation, however, we know the procedure is expected to yield about one false positive in testing the whole set of G = 1722 genes.
Table 5 shows the nineteen genes identified as differentially expressed by this criterion.The nineteen genes have been grouped into nine clusters in Table 5 according to the pattern of expression across the four time stages.5. Verticle line shows the selected cluster souution.
The nine clusters were cut from the dendrogram in Figure 2 that was generated using the same hierarchical clustering routine described earlier.The cluster patterns are not internally uniform but some consistency is present.For example, the cluster with the largest membership, cluster B8 as listed in Table 5, shows genes that are fairly consistently up-regulated at all time stages under morphine relative to placebo.There is no guarantee that all genes that appear as differentially expressed by this criterion will be from the differentially expressed gene pool, although there should be a high degree of correspondence.We note that two of the 19 genes (g910, g1296) in cluster B8 are not in the pool of 43 genes listed in Table 2.

Estimated power
Using methods described in Lee et al (2002b), we have estimated the statistical power of the experimental design used in this study as a guide to designing further studies in this series of genetic investigations.To make this estimate, we use S g in (4.2) as the summary measure of differential expression.Under H 0 , this measure is approximately a χ 2 6 variate.As the specified alternative hypothesis H a , we postulate a single treatment condition (say, the morphine-late abstinence condition k 7 ) that is differentially expressed relative to all other treatment conditions which are assumed to have uniform expression levels.Specifically, we assume that the differential expression in this pattern is a four-fold difference.Under this specification of H a , S g follows a non-central chi-square distribution with six degrees of freedom and a non-centrality parameter which is estimated to be 25.0.
We have controlled the family type I error probability so that only one false positive is expected.We also have assumed that the estimation errors in the interaction estimates Îgk are independent for different genes.(We stress that the independence here relates to statistical estimation error and not gene co-regulation.Co-regulation is almost certainly present.) Finally, calculation of the relevant area under the non-central chi-square distribution gives the estimated power of this study under the specified alternative hypothesis as 0.739.This power is moderate to large, suggesting that differential expression levels consistent with a non-centrality parameter of 25.0 had more than seven chances in ten of being uncovered.The sum of the degrees of freedom and non-centrality value assumed under H a equals 6+ 25.0 = 31.Referring to the values of S g reported in Table 2 for differentially expressed genes, we see that a value of 31 is quite typical.If H a were changed to an hypothesis with an eight-fold differential expression then the noncentrality parameter increases to 56.4 and the power jumps to 0.999.

Discussion of results
Genes implicated in long-term genetic changes following morphine administration are expected to have a consistent and major differential impact across all time stages and, in particular, to be clearly significant in the early and late stages of abstinence.With this thought in mind, the results of the preceding statistical analysis were subjected to review by the principal scientist, giving the following findings and those summarized in Table 6.

Sorting out image artifacts
Upon review of the 43 genes in the differentially expressed gene pool in Table 2, 14 were observed to have image artifacts that may have caused them to appear as differentially expressed.Specifically, the artifact was identified as a six-fold or greater difference between the two duplicate spots As noted in Section 2, 61 of the 1728 spots were used as negative controls.Three of these negative-control genes (g1063, g1087 and g1134) appear in the list of 43 differentially expressed genes and are clearly false positives.As these three negative controls are among the 14 spots showing image artifacts, the finding suggests that it was faulty replication of the spot that allowed these negative controls to appear in the differentially expressed gene pool.

Functional classification
Table 6 classifies the 43 genes in the differentially expressed gene pool into functional groups.The 14 genes with image artifacts are segregated in one class of the table.The remaining 29 genes are classified into functional groups related to energy metabolism, regulatory function or structural function or as having unknown function.Recall from section 4.4 that two additional genes (g910, g1296) appeared in Table 5 as having differences in expression between morphine and placebo treatments but are not among the 43 differentially expressed genes in Table 2.These two genes have also been classified in Table 6 into the energy metabolism and structural functions groups, respectively.
The energy metabolism group in Table 6 is characterized by increased expression during the tolerance stage and then a slight decrease during the early abstinence stage.Expression for the regulatory group also increases during the tolerance stage, decreases during withdrawal and then increases during late abstinence.Finally, expression for structural genes increases during tolerance and decreases during late abstinence.
Matching these categories with the findings in Tables 3, 4 and 5, reveals some interesting features.A small subset of genes associated with energy metabolism appears in Cluster A2 of Table 3 (g438, g900, g914, g946).The only gene spot down-regulated in morphine in Table 4 turns out be be a negative control from the image artifact category (g1087).Of the four genes up-regulated in morphine in Table 4, three are associated with energy metabolism (g278, g438, g1199) and the last is found in the regulatory group (g1704).In Table 5, singleton cluster B6={g1105} (a gene in the unknown function category) is noteworthy for its pattern of strong down-regulation for the first three time stages and then moderate up-regulation in the late abstinence stage.Also, oddly, singleton cluster B2={g1024} (having unknown function) is strongly up-regulated in morphine in the withdrawal stage.Whether this occurrence reflects a data anomaly or a real expression response requires further investigation.We observe that cluster B3 in Table 5 consists almost completely of genes with image artifacts (including one negative control gene, g1063).Gene g633, a member of the energymetabolism group, is the only exception.Singleton cluster B7={g1073} is also in the image artifact category.Cluster B8, a large cluster, is a mixed lot consisting of genes associated with energy metabolism and structural function and one spot on the image artifact list (g1698).
Tables 3 and 5 pose a contrast in clustering of differential expression profiles.Table 3 clusters genes that have similar gene expression profiles across the seven treatment combinations as listed in Table 1, irrespective of how these profiles relate to time stage or morphine-placebo treatment.On the other hand, Table 5 clusters genes on the basis of similar differential expression patterns between morphine and placebo across the four time stages.Matching the cluster memberships in Tables 3 and 5, we see that cluster A5 in Table 3 and cluster B3 in Table 5 share three gene spots having image artifacts (g1063, g1257, g1649), the first being a negative control.Several of the singleton clusters also match up.Cluster A2 in Table 3 and cluster B8 in Table 5 share two genes (g438, g870).

Figure 1 .
Figure 1.Dendrogoram corresponding to Table3.Verticle line shows the selected cluster solution.

Table 2 :
Differentially Expressed Gene Pool: 43 genes exhibiting significant differential expression for one or more of the seven treatment combinations according to the criterion that model sum of squares SSM g is significantly large, ordered by chi-square statistic S g .ratebut the study team found this one to be acceptable in this application.Based on the cutoff value of 23.75, we find that the 43 genes listed in Table2are differentially expressed.We refer to this set as the differentially expressed gene pool.The fact that only one false positive is

Table 3 :
Gene clusters sharing similar differential expression profiles across the seven treatment combinations, irrespective of morphine or placebo treatments.The 43 genes are listed here.

Table 4 :
Five genes exhibiting a significant difference in differential expression between the morphine and placebo treatments in the late abstinence stage as assessed by the standard normal statistic z g .

Table 5 :
Differences in differential expression between the morphine and placebo treatments in each of four dependence time stages as assessed by the standard normal statistic z g for each stage.The clusters group genes with similar profiles across the four time stages.

Table 6 :
Functional classification of genes in the differentially expressed gene pool.