General Marginal Regression Models for the Joint Modeling of Event Frequency and Correlated Severities with Applications to Clinical Trials

In many clinical trials, information is collected on both the frequency of event occurrence and the severity of each event. For example, in evaluating a new anti-epileptic medication both the total number of seizures a patient has during the study period as well as the severity (e.g., mild, severe) of each seizure could be measured. In order to arrive at a full picture of drug or treatment performance, one needs to jointly model the number of events and their correlated ordinal severity measures. A separate analysis is not recommended as it is inefficient and can lead to what we define as “zero length bias” in estimates of treatment effect on severity. This paper proposes a general, likelihood based, marginal regression model for jointly modeling the number of events and their correlated ordinal severity measures. We describe parameter estimation issues and derive the Fisher information matrix for the joint model in order to obtain the asymptotic covariance matrix of the parameter estimates. A limited simulation study is conducted to examine the asymptotic properties of the maximum likelihood estimators. Using this joint model, we propose tests that incorporate information from both the number of events and their correlated ordinal severity measures. The methodology is illustrated with two examples from clinical trials: the first concerning a new drug treatment for epilepsy; the second evaluating the effect of a cholesterol lowering medication on coronary artery disease.


Introduction
In many clinical trials, data on both the frequency of a disease event as well as the severity of each event, often measured by an ordinal score, is recorded.Both of these endpoints (frequency and severity) can be important in evaluating treatment performance.Consider two examples.In a clinical trial of an antiepileptic seizure medication both the frequency and the severity of seizures are important endpoints in evaluating treatment performance.In a clinical trial of a cholesterol lowering drug both the number of arterial blockages as well as the degree of occlusion in each lesion are informative on drug performance.Data of this type are actually quite common.More examples include: depression may occur a random number of times with the severity of each depression episode being measured by an ordinal scale (e.g., 1=mildly depressed, 2=moderately depressed, 3=very depressed); patients suffering from migraine headache may have a random number of headaches with differing levels of pain; patients with cancer may have a random number of tumors of varying sizes, etc. Conceptually, we can think of these responses as a random ordinal vector of random length.A naive approach to such data would be to analyze the number of events separately from the multivariate severity measures.However, there are several reasons why this is not an ideal approach.First, even when one is primarily interested in each outcome separately, a separate analysis can result in a substantial decrease in efficiency (Albert et al., 1997) as information contained in one endpoint can be used to estimate and make inference on parameters that describe the distribution of the other endpoint.Second, a separate analysis of severity can also give misleading results.Consider the clinical trial of a cholesterol lowering medication detailed in section 6.2.A separate analysis of severity is restricted to patients with severity measures, i.e., the group of patients with at least one blockage (event).In a randomized clinical trial, the patients with at least one event may consist of a biased sample of the test population.We refer to this phenomenon as "zero length bias".If the treatment under investigation is effective, patients in the treatment group are less likely to experience an event and therefore less likely to have a severity measure.An estimate of the marginal probability of observing a certain severity level conditional on observing an event, will not, in general, be equal to the unconditional marginal probability of observing this severity level.In the clinical setting, the unconditional marginal probabilities seem more relevant, i.e., the probability of having a severe blockage seems more relevant in assessing treatment than the probability of a severe blockage given a blockage has occurred.By jointly modeling the distribution of the number of events and their correlated severity measures, we propose a formal test for questions of this type (see details in section 5).In addition, the investigator may be interested in the relationship between the number of events and their severity.For example, one may want to test whether a drug changes the inhibition process among cancerous tumors (i.e., large tumors suppressing the ability of other tumors to grow near them) or estimate the probability of observing two or more severe events.Questions of this type can only be answered by modeling the joint distribution of the number of events and their correlated severity measures.Barnhart (1992) gave the first probabilistic definition of random length elements.Barnhart and Sampson (1995) proposed a multiple population model to jointly model the number of events and the correlated severities.Their methods are likelihood-based and assume that the event severities are distributed as a multivariate normal distribution.Recently, Barnhart has proposed a probit model for multivariate random length data (Barnhart, 1998).The probit model, however, assumes a discretization of a latent multivariate normal distribution for the multivariate ordinal responses and may not be flexible enough in many applications.Furthermore, these previous methods for multivariate random length data produce estimates of two treatment effects: one for event severity and one for the number of events.One needs to be cautious in interpreting the treatment effect for severity as it conditions on an event occuring.The test proposed in this manuscript provides a test of the overall treatment effect which includes both the number of events and the severity of these events.
This paper proposes a general, likelihood based, marginal regression model for jointly modeling the number of events and the vector of correlated ordinal severity measures.The proposed method is an extension to the joint modeling situation of the marginal models for multivariate categorical data proposed by Molenberghs and Lesaffre (1999).These models re-parameterize the multinomial distribution through a generalized linear model (GLIM) construction (McCullagh and Nelder, 1989), and allow great flexibility in the specification of the margins and associations.
The organization of this paper is as follows: in section 3 we present the proposed joint model; in section 4 we discuss estimation issues; in section 5 we present a limited simulation study; and in section 6 we present two examples from two clinical trials-one concerning an anti-seizure medication, the other a cholesterol lowering drug.

Example 1: Epilepsy clinical trial
This example involves data collected as part of a double-blind placebo controlled clinical trial designed to study the effect of felbamate (FBM), a new antiepileptic drug, on epileptic seizure activity.A group of 40 epileptic patients were randomized to receive either placebo or felbamate.Patients were then followed for eighteen days (the first three being a titration period), and the number of partial seizures as well as a binary variable characterizing the severity of each seizure were recorded.The study was described in detail in Theodore et. al. (1995).For purposes of illustration, we collapsed the data in the following way.The study period (day 4 through day 18) was divided into four time periods.If an individual had any seizure activity in a given time period, we considered an event to have taken place.In addition, we assigned a severity score (mild or se-vere) based on the severity of seizures in that time period.The number of events, therefore, varied between 0 and 4 while the severity of each event was either mild or severe.The data is given in table 3. A quick look at the data suggests that felbamate lowers both the average number and severity of seizures.Let treatment group be denoted by the indicator variable FBM i with FBM i = 1 if individual i is given felbamate and FBM i = 0 if given placebo.Informative dropout was a problem in this dataset.Out of the original 40 individuals (20 randomized to felbamate and 20 randomized to placebo), 21 completed the protocol (13 in the felbamate group and 8 in the placebo group).Most of the dropouts in the placebo group dropped out because of increased seizure frequency and severity during the titration period.Those in the felbamate group tended to drop out due to drug induced side effects unrelated to seizure activity.To illustrate the current method, we ignore the problem of informative dropout and include only individuals that completed the protocol.Estimates based on this data set will tend to underestimate the effect of the drug therapy on seizure activity.

Example 2: Type II coronary intervention study
This example involves data collected during the National Heart, Lung and Blood Institute (NHLBI) Type II Coronary Intervention study (Brensike et al., 1982(Brensike et al., , 1984)).Patients with Type II hyperlipoproteinemia and coronary heart disease were randomly allocated to a daily dose of 24g of cholestyramine and diet (treatment group) or placebo and diet (control group).The goal of the study was to assess the effect of cholestyramine on coronary artery disease progression at the end of five years of study.The number and size of each lesion (blockage) in the lower arterial descending portion of the heart was determined by coronary angiogram at the fifth year of follow-up.We take the size of each lesion to be a severity score with small (=1), medium (=2), or large (=3) degrees of blockage.The observed number of blockages in this part of the heart varied from zero to four.A total of 143 patients were recruited into the study.At the end of five years of follow-up there were only 116 (57-placebo; 59-cholestyramine) patients who had their fifth year coronary angiogram done.For the purpose of illustration, this is the population we consider.Examination of the data did not suggest any problems with informative dropout.Let treatment group be denoted by the indicator variable TRT i with TRT i = 1 if individual i is given cholestyramine and TRT i = 0 if given placebo.We also have two additional covariates.SBP i is an indicator variable for whether systolic blood pressure is greater than 120, and RCA i is an indicator variable of whether a regional contractile abnormality is present.

A Joint Model
Suppose we observe data (k i , y i , x i ) for the i th individual, where the count k i is a realization of a discrete random variable K i (frequency of event in our context), y i is a realization of a k i -dimensional ordinal random variable (severity measures in our context), and x i is a vector of covariates.For simplicity, we will assume that each component of Y i can take on r distinct values (i.e., r ordinal categories), though this assumption can be relaxed.Without loss of generality we will denote these values as 1, 2, ..., r.We specify a joint distribution for K i and Y i by factoring it into two parts: a distribution for K i and a distribution for Y i |K i .Since the event severities and the counts both reflect the severity of the disease outcome, certain covariates that affect disease may affect both K i and Y i |K i .Therefore, we specify a dependence between them by allowing these two pieces to share information through common parameters.We summarize the proposed joint model here: 1.The distribution of the random length K i , following Barnhart et al. (1999), is modeled as a slightly modified generalized linear model (GLIM) (Mc-Cullagh and Nelder, 1989) with three components: (a) The random length variable K i has a discrete distribution: where the φ i 's are the natural parameters for the exponential family, J > 0 is assumed known (possibly infinite), and a (•) and b (•) are known positive functions; (b) The link function h: , where we choose h (•) such that h b (•) is either a strictly increasing or strictly decreasing function; (c) The systematic component: (Molenberghs and Lesaffre, 1999).This model is also comprised of three components: (a) The distribution of Y i |K i = k is multinomial with cell probabilities π i and corresponding orthant probabilities p i ; (b) A vector of link functions h: where h m is a vector of marginal link functions, h a is a vector of bivariate and higher order association link functions, and is a vector of univariate orthant probabilities.For example, a commonly used link function is the logit (h(p) = log(p/1 − p)).Additional examples can be found in section 3.2.For simplicity, we restrict the component functions of h m to be the same scalar function, i.e., h m = (h m , ..., h m ) , where h m : [0, 1] −→ is either a strictly increasing or decreasing function.Therefore, the size of h m is the same as p m i with each component operating on one and only one component of p m i ; (c) The systematic component: where X m i and X a i are design matrices formed by linear combinations of covariates and intercept terms, and e is a vector of all 1's with its size being the same as η m i .Hence, different covariates may be used in the marginal (η m i ) and association (η a i ) models.Here we make the proportional odds like assumption that β is the same for all levels of the severity measure Y i .We discuss the MMMC framework in more detail later in this section.

(K
Note that the β's are common parameters shared in the specifications for both K i and Y i |K i .The scalar parameter γ can be thought of as scaling the marginal effect of the covariates on the multivariate severities to the random length scale.In addition, the sign and magnitude of γ control the association of the random length with the multivariate severities.If γ = 0, there is no relationship between the number of events and their severities.By the model assumptions, we have . Suppose γ > 0 and K i and K i are two random length variables such that x i β > x i β, and therefore and h m are strictly increasing functions, the above model describes the following: for γ > 0, the larger the value of x i β the more likely we are to observe a larger number of events as well as greater severity in any given event; for γ < 0, the larger the value of x i β the more likely we are to observe a smaller number of events as well as greater severity in any given event.Note that if h is a canonical link function, then h b (•) is the identity function (McCullagh and Nelder, 1989), a strictly increasing function.

Marginal models for multivariate categorical data
The MMMC framework was proposed by Molenberghs and Lesaffre (1999).For simplicity, in this section "marginal" is used in the context of the conditional random variable Y i |K i .Note that given K i = k, all the information in Y i is contained in the r k contingency table obtained by cross-classifying Y i .Denote the j th entry in this table by Z i (j) (Z i (j) ∈ {0, 1}) where j = (j 1 , j 2 , ..., j k ) , 1 ≤ j ≤ r (r = (r, r, ..., r) , a k-dimensional vector).Assume this table is sampled from a multinomial distribution with j th cell probability π i (j) .Note that E (Z i (j)) = π i (j) .Define orthant probabilities p i (j) = l≤j π i (l) .This can be concisely written in matrix form as p i = Lπ i (Molenberghs and Lesaffre, 1999), where p i is a vector of all orthant probabilities, π i is a vector of all multinomial cell probabilities, and L is a r k ×r k matrix of 0 and 1's.For example, in the bivariate binary case, the matrix L is given by  Univariate (marginal) orthant probabilities are then given by setting all but one of their indices to their maximal value: p t i (j) = p i (r 1 , ..., r t−1 , j, r t+1 , ..., r k ) , where r 1 = ... = r t−1 = r t+1 = ... = r k = r.The n-variate (marginal, n < k) orthant-probabilities are given in a similar way by setting all but n of the indices to their maximal value.For example, in the bivariate case In the marginal models proposed by Molenberghs and Lesaffre, the cell probabilities in the above contingency table are expressed, via the link functions of orthant probabilities, as a function of covariates.This model is "marginal" since it explicitly models the univariate or-thant probabilities, through link functions, as linear functions of covariates.The bivariate and higher-dimensional orthant probabilities are then used to model the association structure with link functions chosen so as to parameterize a given association structure.

Examples of marginal models for multivariate categorical data
The MMMC framework provides a class of models with the generality and flexibility necessary to model multivariate categorical data.Many previously proposed models are MMMC models.For example, assume K i = 2 and r = 2, giving us two binary responses ) .
The η i 's are usually taken to be linear functions of the covariate vector x i .Note that the association is modeled with an odds ratio, given as exp (η i12 ) .A bivariate probit model can be defined for these data by the mapping π where r t (•) is the tetrachoric correlation.If we define Φ(x, y, ρ) to be the cumulative distribution function of a standard bivariate normal distribution with correlation ρ, the tetrachoric correlation is the value of ρ such that π i (2, 2) = 1 − Φ(η i1 , η i2 , ρ).Thus two popular models for univariate categorical data, the logit and the probit model, are unified in a multivariate setting with the marginal parameters having the same interpretation as in the univariate setting.Many more models are possible, and the bivariate logit and probit models can be generalized to higher dimensions.Molenberghs and Lesaffre (1999) discuss possible association structures for different marginal models.An important class of marginal models are the so called multivariate logistic models of Glonek and Mc-Cullagh (1995).The multivariate logistic transform is defined by the mapping and involves contrasts of log probabilities-given by linear combinations of multinomial cell probabilities.The explicit construction of C and A is given in Glonek and McCullagh (1994).This class of models extends the logit models for categorical data given in Agresti (1990, pp. 306-338) to the multivariate setting.Global and local odds ratios are common measures of association that are included in the multivariate logistic model class.The bivariate Dale model (Dale, 1986) and its n-variate extensions (multivariate Dale model: Molenberghs and Lesaffre, 1994) are also members of the MMMC family.These Dale models specify dependences between the ordinal outcomes in terms of (generalized) global odds ratios while there are no restrictions on the types of marginal distributions that can be used.The above bivariate logistic transform is actually a binary version of the bivariate Dale model.
An important feature of marginal models is that if a set of outcomes satisfy a marginal model, then so does every subvector of outcomes.As a result, given that we observe a severity outcome, the interpretation of the parameters does not depend on the actual number of outcomes.This property is important in the random length data context where the length of the outcome is random.The parameters will have the same interpretation regardless of size of the outcome vectors observed.

Maximum Likelihood Estimation
Collectively denote the model parameters as θ = α , ν , β ,δ, γ .The loglikelihood function for the model proposed in the above section is where N is the total number of individuals and δ (•) is an indicator function such that δ (K i ) = 1 if K i > 0, and δ (K i ) = 0 if K i = 0. Let I (θ) be the total Fisher information matrix for θ contained in the N independent multivariate random length observations Y i with random length K i , i = 1, ..., N. Then by Theorem 3.3.1 of Barnhart (1992), we have where I * (θ) is the total Fisher information matrix for θ contained in the random length data {K i , i = 1, ..., N } , and I (θ|k) is the total Fisher information matrix for θ contained in the conditional severities Y i |K i = k.The appendix gives the form of I * (θ) and I (θ|k) for the proposed model.Intuitively, the information is the sum of the information from the random length and the weighted average of information from severity over all possible lengths.
The maximum likelihood estimator (MLE) for θ can be obtained through a Fisher scoring algorithm or by numerical maximization of the likelihood through a standard optimization algorithm.Both techniques require the inversion of the link functions at each step of the algorithm to obtain the multinomial cell probabilities.A number of techniques are possible: Newton-Raphson (Glonek and McCullagh, 1995), iterative proportional fitting (Molenberghs and Lesaffre, 1999), Lagrange multipliers (Haber and Brown, 1986), etc.In the simulation and examples that follow, we use a multivariate Dale type approach where the orthant probabilities are found sequentially as solutions to polynomial equations (Molenberghs and Lesaffre, 1994).Barnhart (1992) has shown that, under the usual regularity conditions, the MLE is consistent and asymptotically normal.The asymptotic covariance matrix of the MLE, θ, is obtained as the inverse of the above information matrix and is estimated by I −1 θ .

Simulation
We performed a limited simulation study to evaluate the asymptotic behavior of the MLE for the simple case where the number of events is either one or two.For three different sets of parameters, we generate 1000 data sets where each data set consists of ordinal responses (3 levels) from two different treatment groups with 500 subjects in each.As before we specify the model in two parts.The first part, the distribution of Y i |K i = k, is taken to be multinomial with cell probabilities related to the covariates, via the orthant probabilities, by in the case when k = 2 and by when k = 1.Here x i is an indicator variable of treatment group.Hence, a proportional odds cumulative-logit model is being assumed for the marginal distributions of Y i |K i while the association is being described by global odds-ratios.
The distribution of K i is taken to be Bernoulli with logit(Pr ( For each of 1000 data sets, data for 500 subjects in each of two treatment groups (x i = 1, 0) were simulated in the following way: first, the length of the severity vector was determined by drawing from a Bernoulli distribution with probability given by equation ( 5.3); then given the value of k, the severity vector was obtained by drawing from a multinomial distribution with cell probabilities π i = L −1 p i , where p i is determined (k = 2) by equation ( 5.1) or (k = 1) equation ( 5.2) .
The parameter vectors used for the simulations were θ = (α 1 , α 2 , ν, β, δ, γ) = (0.5, 1.25, −1.0, ν, 0.8, 1.2) where ν = 0.5, 1.0, 2.0.For each data set, the above model was fit by numerically maximizing the log-likelihood.Table 1 displays the means, standard errors (s.e.) and ranges of the parameter estimates from the 1000 simulations.We found the empirical distribution of the MLE to be symmetric and approximately normal as evidenced by high degree of linearity found in normal probability plots (not shown).In table 2, we compare the standard errors of the MLE computed from the inverse of the Fisher total information matrix, I −1 (θ) , with the standard errors from the simulation.These results indicate that the two standard errors are quite close.

Examples
In this section, we illustrate the proposed joint model with two examples from clinical trials: the first concerning a new drug treatment for epilepsy; the second evaluating the effect of a cholesterol lowering medication on coronary artery disease.

Example 1: Epilepsy clinical trial
This example involves data collected as part of a double-blind placebo controlled clinical trial designed to study the effect of felbamate (FBM), a new antiepileptic drug, on epileptic seizure activity.A group of 40 epileptic patients were randomized to receive either placebo or felbamate.Patients were then followed for eighteen days (the first three being a titration period), and the number of partial seizures as well as a binary variable characterizing the severity of each seizure were recorded.The study was described in detail in Theodore et. al. (1995).For purposes of illustration, we collapsed the data in the following way.The study period (day 4 through day 18) was divided into four time periods.If an individual had any seizure activity in a given time period, we considered an event to have taken place.In addition, we assigned a severity score (mild or severe) based on the severity of seizures in that time period.The number of events, therefore, varied between 0 and 4 while the severity of each event was either mild or severe.The data is given in table 3. A quick look at the data suggests that felbamate lowers both the average number and severity of seizures.Let treatment group be denoted by the indicator variable FBM i with FBM i = 1 if individual i is given felbamate and FBM i = 0 if given placebo.Informative dropout was a problem in this dataset.Out of the original 40 individuals (20 randomized to felbamate and 20 randomized to placebo), 21 completed the protocol (13 in the felbamate group and 8 in the placebo group).Most of the dropouts in the placebo group dropped out because of increased seizure frequency and severity during the titration period.Those in the felbamate group tended to drop out due to drug induced side effects unrelated to seizure activity.To illustrate the current method, we ignore the problem of informative dropout and include only individuals that completed the protocol.Estimates based on this data set will tend to underestimate the effect of the drug therapy on seizure activity.We applied the proposed joint model to the data in table 3. The distribution of the number of events K i (time periods with seizures) was assumed to be Poisson, where was taken to be multinomial with cell probabilities related to the covariates via the orthant probabilities as before.Here we assumed the univariate orthant probabilities were related to the covariates through the logit link, i.e., α − βFBM i = logit p t i (1) , t = 1, 2, ..., k, and the bivariate associations were described with global odds ratios by,

k})
. The trivariate and higher order associations were also described by global odds ratios and were assumed to be zero on the log scale.The parameters for this model were α, ν, β, δ, and γ.Table 4 gives the maximum likelihood estimates and the corresponding standard errors computed from the inverse of the Fisher total information matrix.The estimate for the last term, γβ, was computed using the MLE invariance principle while the estimate's standard error was calculated using a standard delta-method approach.The negative values for γβ and β suggest that felbamate decreases the number and severity (conditional on k) of seizures, while the positive value for γ confirms that the effect of felbamate is in the same direction for both the number and severity of seizures.We also ran two separate analyses, modeling the length and severity individually (not shown).The model for the random length was assumed Poisson as above, while the multivariate severities were modeled using GEE with a logit link and an exchangeable working covariance matrix.Point estimates and standard errors from this approach were similar to corresponding parameters from the above joint model.However, the joint model is likelihood based and directly models the relationship between the random length and severities allowing more flexibility in the types of questions that can be addressed.For example it would be impossible, via the separate analysis, to address the question: "Does treatment significantly decrease my chances of having more than two severe events?"The joint model also makes explicit what is normally implicit in a separate analysis of severity: that you need to observe an event in order to observe a severity, i.e., that the severity analysis is conditional on the occurrence of at least one event.Ostensibly, a relevant question (perhaps the relevant question) regarding treatment effect with respect to seizure activity is whether the marginal probabilities of no seizure, mild seizure, or severe seizure are different between the two treatment groups.When some subjects have no events this cannot be answered by a separate analysis of severity since the marginal probability of having a mild or severe event is the product of both pieces from the joint model, i. for all i, j > 0. In order to illustrate how the joint model can be used to generate a test for a treatment effect, we compare the two multinomial distributions, π i = (Pr (K i = 0|FBM i ) , Pr (Y i = MILD|FBM i ) , Pr (Y = SEVERE|FBM i )) , for the two treatment groups respectively using contrasts of baseline category logits.Specifically we test for k = SEVERE, MILD.using a Wald chi-square test (details in appendix).
The resulting chi-square test statistic was 16.2.Comparing this to χ 2 2 gives a P-value of 0.0003.This indicates that, when compared to placebo, felbamate significantly increases the chance of no seizure while lowering the chances of having a severe seizure.The test effectively uses information from both endpoints being investigated, seizure frequency and severity, and is made possible by the joint-modeling framework.

Example 2: Type II coronary intervention study
This example involves data collected during the National Heart, Lung and Blood Institute (NHLBI) Type II Coronary Intervention study (Brensike et al., 1982(Brensike et al., , 1984)).Patients with Type II hyperlipoproteinemia and coronary heart disease were randomly allocated to a daily dose of 24g of cholestyramine and diet (treatment group) or placebo and diet (control group).The goal of the study was to assess the effect of cholestyramine on coronary artery disease progression at the end of five years of study.The number and size of each lesion (blockage) in the lower arterial descending portion of the heart was determined by coronary angiogram at the fifth year of follow-up.We take the size of each lesion to be a severity score with small (=1), medium (=2), or large (=3) degrees of blockage.The observed number of blockages in this part of the heart varied from zero to four.A total of 143 patients were recruited into the study.At the end of five years of follow-up there were only 116 (57-placebo; 59-cholestyramine) patients who had their fifth year coronary angiogram done.For the purpose of illustration, this is the population we consider.Examination of the data did not suggest any problems with informative dropout.Let treatment group be denoted by the indicator variable TRT i with TRT i = 1 if individual i is given cholestyramine and TRT i = 0 if given placebo.We also have two additional covariates.SBP i is an indicator variable for whether systolic blood pressure is greater than 120, and RCA i is an indicator variable of whether a regional contractile abnormality is present.
We applied the above joint model to the data.Again, the distribution of K i was assumed to be Poisson, where a is taken to be multinomial with cell probabilities related to the covariates via the orthant probabilities as before.Here we assume the univariate orthant probabilities are related to the covariates through a cumulative logit link, i.e., and the bivariate associations are described with global odds ratios by, , 2, ..., k}) .The trivariate and higher order associations are also described by global odds ratios and are assumed zero on the log scale.The parameters for this model are α 1 , α 2 , ν, β 1 , β 2 , β 3 , δ, and γ.Table 5 gives the maximum likelihood estimates and the corresponding standard errors computed from the inverse of the Fisher total information matrix.The estimates and standard errors for γβ 1 , γβ 2 , and γβ 3 are computed as described before.The negative values for γβ 1 and β 1 suggest that cholestyramine decreases the number and severity (conditional on k) of seizures.The positive value for γ confirms that, on average, the effect of cholestyramine and the other two covariates are in the same direction for both the number and severity of seizures.Again, we ran two separate analyses modeling the length and severity individually (not shown).The model for the random length was again assumed Poisson, while the multivariate severities were modeled using GEE with a cumulative logit link and an exchangeable working covariance matrix.Point estimates and standard errors from this approach differed from the corresponding parameters in the above joint model, though the overall pattern and directions of association were preserved.In particular, both −0.062 0.095 −0.647 β 1 is the parameter for TRT i ; β 2 is the parameter for RCA i ; β 3 is the parameter for SBP i .
approaches failed to find significant effects of cholestyramine on either of the disease outcomes (number or degree of occlusion of blockages).
Again we generated a test for a treatment effect, comparing the two multinomial distributions, Pr(Y = 2|T RT i , RCA i , SBP i ), Pr (Y = 3|TRT i ,RCA i , SBP i )) , for the two treatment groups (holding RCA i and SBP i at their population averages) using contrasts of baseline category logits, i.e., The Wald chi-square test statistic for H 0 was 10.2.Comparing this to χ 2 3 gives a P-value of 0.017.This indicates that cholestyramine is effective in reducing the chance of developing a lesion as well as the chance of having a larger lesion size.

Discussion
We have proposed a general, likelihood based, joint model for multivariate ordinal random length data where both the number of events and the severity of events are outcome measures.A limited simulation showed that the MLE is asymptotically normal.We also showed how one can construct a meaningful test for treatment effect based on both the number and severity of events within the joint modeling framework.We noted that informative dropout was a problem for the epilepsy data example.This was beyond the scope of this paper, though we are currently developing methodology for modeling the informative dropout process within the joint modeling framework.s asymptotically distributed as χ 2 rank(C) .

Table 4 :
MLE estimates of epilepsy data from joint model

Table 5 :
MLE estimates of heart data from joint model