A Correlated Binary Model for Ignorable Missing Data: Application to Rheumatoid Arthritis Clinical Data

Abstract: Incomplete data are a common phenomenon in research that adopts a longitudinal design. If incomplete observations are present in a longitudinal data structure, ignoring them could lead to bias in statistical inference and interpretation. We adopt the disposition model and extend it to the analysis of longitudinal binary outcomes in the presence of monotone incomplete data. The response variable is modeled using a conditional logistic regression model. The nonresponse mechanism is assumed ignorable and is modeled as a combination of a Markov transition model and a logistic regression model. Maximum likelihood estimation (MLE) is used for parameter estimation. An application of our approach to rheumatoid arthritis clinical trial data is presented.


Introduction
Correlated data are very common in clinical and social science research and include nested and clustered data. Data are correlated because of common attributes that are shared among the members of a group, or among several measures of a member over time.
Longitudinal and repeated-measures data are specific cases of correlated data. Longitudinal data are collected by repeatedly observing the same subject over a period of time. Repeated measures, which include longitudinal data, refer to data on subjects measured repeatedly over a period of time, under different conditions, or both. In epidemiology, for example, outcomes observed on members of the same family are typically more alike than outcomes observed on unrelated individuals. This means that the probability that a family member has an outcome is not necessarily the same as that of an individual randomly selected from the population.
Likelihood-based approaches for analyzing correlated binary data are limited. Bonney (1998) introduced the term disposition to represent the conditional probability of the outcome of one member of a cluster given that another member has the attribute. The development of the disposition model starts with a random effects formulation and then introduces a theory for constructing likelihoods utilizing moment series representations. The disposition model was further investigated by Kwagyan (2001) through an alternative formulation from a finite mixture modeling perspective.
The attractive feature of reproducibility of the disposition model (Bonney, 1998, 2003) makes it natural to extend it to capture the types of correlation or dependence that arise in longitudinal data. Incomplete responses are a common occurrence in longitudinal data. They arise when one or more of the intended measurements on a unit are not taken, are lost, or are otherwise unavailable. Missingness could be related to the outcome of interest. When it is unrelated to the outcome of interest, its effect is weak and analysis of the parameters of interest is less complicated. However, when it is related to the outcome of interest, the impact of the missing data is great, and the analysis, which is complicated, should be carried out with care to avoid potential bias in inference on the parameters of interest. This is particularly the case when individuals with missing data differ significantly in important ways from those with complete data (Molenberghs et al., 2015).
When missingness is related to the history of the observed responses, it is known as missing at random (MAR); when it is related to the current unobserved response, it is known as missing not at random (MNAR) (Little and Rubin, 2002). When the missingness is MAR, estimates will be valid and fully efficient provided the likelihood and the missing data model are correctly specified (Molenberghs et al., 2002; Diggle and Kenward, 1994). However, when the missingness is MNAR, estimating the parameters of interest becomes difficult. The pattern of missingness can be monotone or non-monotone. If a subject misses an appointment for an observation and is never observed again, the pattern of missingness is said to be monotone; otherwise it is non-monotone.
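The monotone versus non-monotone distinction can be checked mechanically. The following is a minimal Python sketch, assuming each subject's responses are stored in visit order with `None` marking a missing visit; the function names and this coding of missingness are illustrative assumptions, not part of the original analysis.

```python
from typing import Optional, Sequence


def dropout_time(responses: Sequence[Optional[int]]) -> Optional[int]:
    """Return the 1-based index of the first missing visit, or None if complete."""
    for k, y in enumerate(responses, start=1):
        if y is None:
            return k
    return None


def is_monotone(responses: Sequence[Optional[int]]) -> bool:
    """True if every visit after the first missing one is also missing."""
    d = dropout_time(responses)
    if d is None:
        return True  # a complete sequence is trivially monotone
    return all(y is None for y in responses[d - 1:])


# Hypothetical sequences over five visits (None = not observed).
print(is_monotone([1, 0, None, None, None]))  # subject never returns
print(is_monotone([1, None, 0, 1, None]))     # subject returns after a gap
```

Under this coding, the first sequence is a monotone pattern with dropout at the third visit, while the second is non-monotone because the subject is observed again after a missed visit.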
Since missing data are common and remain a challenging problem in longitudinal study designs, several approaches for dealing with them have been suggested. These include likelihood-based and Bayesian methods, multiple imputation, and weighted estimating equations. For a study with missing data, the validity and efficiency of any method of analysis depend on the tenability of its assumptions about why the missing values occurred; if these assumptions fail, inference about the parameters of interest may be biased and misleading (Molenberghs et al., 2015).
In Section 2, we introduce the joint distribution of the incomplete data by combining the disposition model and the dropout model, and present the corresponding likelihood function. In Section 3, we present and discuss the results of applying our approach to the rheumatoid arthritis clinical trial. Conclusions are presented in Section 4.

Joint Distribution for Incomplete Data
In this section, we introduce the disposition model and adopt it to develop a model in the presence of incomplete data. A joint distribution for the incomplete data will be constructed and models for different dropout mechanisms will be developed.

Consider a sample of $N$ clusters, each of size $n_i$, $i = 1, \dots, N$, and let $Y_i = (Y_{i1}, \dots, Y_{in_i})$ denote the vector of binary outcomes for the $i$th cluster. Let $\delta_{ij}$ denote the conditional probability of $Y_{ij} = 1$ given that $Y_{ij'} = 1$, that is,

$$\delta_{ij} = P(Y_{ij} = 1 \mid Y_{ij'} = 1), \qquad j \neq j'.$$

Let us further assume that a pair of observed responses within the same group satisfies the following relation

$$P(Y_{ij} = 1, Y_{ij'} = 1) = \rho_i\, P(Y_{ij} = 1)\, P(Y_{ij'} = 1),$$

where $\rho_i$, called the relative disposition, is common for all pairs of observations and measures the within-group aggregation (correlation): $\rho_i = 1$ implies independence or no aggregation. With this, Bonney (1998, 2003) has shown that the joint distribution of the $i$th cluster is given as

$$P(Y_i = y_i) = \left(1 - \frac{1}{\rho_i}\right)\prod_{j=1}^{n_i}(1 - y_{ij}) + \frac{1}{\rho_i}\prod_{j=1}^{n_i}\delta_{ij}^{y_{ij}}(1 - \delta_{ij})^{1 - y_{ij}}.$$

Let us temporarily drop the subscript $i$ representing the $i$th unit when discussing a model of a cluster, for simplicity of notation, and let $\alpha = 1/\rho$ and $Y^* = (Y_1^*, \dots, Y_n^*)$ denote the complete vector of the intended sequence of measurements on an experimental unit, and $t = (t_1, \dots, t_n)$ the set of times that correspond to the intended measurements. The joint probability distribution of $Y^*$ is

$$P(Y^* = y^*) = (1 - \alpha)\prod_{j=1}^{n}(1 - y_j^*) + \alpha\prod_{j=1}^{n}\delta_j^{y_j^*}(1 - \delta_j)^{1 - y_j^*}.$$

Let $Y = (Y_1, \dots, Y_n)$ denote the vector of observed binary responses for the $i$th unit.
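To make the structure of the disposition model concrete, the sketch below evaluates a joint distribution of the two-component form $(1-\alpha)\prod_j(1-y_j) + \alpha\prod_j \delta_j^{y_j}(1-\delta_j)^{1-y_j}$ with $\alpha = 1/\rho$. This form is an assumption taken here as the working expression of the disposition model; the function names and the numerical values of `delta` and `alpha` are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import prod


def disposition_joint(y, delta, alpha):
    """P(Y = y) under the assumed two-component form of the disposition model."""
    all_zero = prod(1 - yj for yj in y)  # equals 1 only when every outcome is 0
    independent = prod(d if yj == 1 else 1 - d for yj, d in zip(y, delta))
    return (1 - alpha) * all_zero + alpha * independent


def probability(event, delta, alpha):
    """Sum the joint distribution over all outcome vectors satisfying `event`."""
    n = len(delta)
    return sum(disposition_joint(y, delta, alpha)
               for y in product((0, 1), repeat=n) if event(y))


delta = [0.6, 0.3, 0.8]  # hypothetical individual dispositions
alpha = 0.5              # hypothetical alpha = 1 / rho, so rho = 2

total = probability(lambda y: True, delta, alpha)
mu0 = probability(lambda y: y[0] == 1, delta, alpha)
mu1 = probability(lambda y: y[1] == 1, delta, alpha)
both = probability(lambda y: y[0] == 1 and y[1] == 1, delta, alpha)

# total probability, P(Y_1 = 1 | Y_2 = 1), and the pairwise ratio rho
print(total, both / mu1, both / (mu0 * mu1))
```

Up to floating-point rounding, the three printed quantities should be 1, the first individual disposition (0.6), and the relative disposition (2), which is one way to check that this form agrees with the definitions of $\delta$ and $\rho$ given in the text.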
The assumption for the dropout process is that if an experimental unit is still in the study at time $t_k$ $(2 \le k \le n)$, the sequence of measurements $(Y_j : j = 1, 2, \dots, k)$ associated with it follows the same joint distribution as the corresponding intended sequence $(Y_j^* : j = 1, 2, \dots, k)$.

We define the observed outcome as

$$Y_k = \begin{cases} Y_k^*, & k < D, \\ \text{unobserved}, & k \ge D, \end{cases}$$

where $D$ is a random variable such that $2 \le D \le n$ denotes the dropout time, and $D = n + 1$ indicates no dropout.

For each $k$, let $H_k = (Y_1, \dots, Y_{k-1})$ denote the observed history up to time $t_{k-1}$, and $Y_k^*$ the value that would have been observed at time $t_k$ if there were no dropout in the unit. Similar to the Diggle and Kenward (1994) selection model with non-ignorable dropout, we assume that the probability of dropout at time $t_d$ depends on the history of the measurement process up to, and including, the time of dropout. That is,

$$P(D = d \mid \text{history}) = p_d(H_d, y_d^*; \phi),$$

where $\phi = (\phi_0, \dots, \phi_{2+p})$ is a vector of unknown parameters. With this, the following patterns of dropout process are identified:

Dropout Completely At Random (DCAR). Dropout is completely at random if the dropout process is independent of $H_d$ and $y_d^*$. That is, $p_d(H_d, y_d^*; \phi) = p_d(\phi)$.

Dropout At Random (DAR). Dropout is at random if the dropout process depends on $H_d$ and not on $y_d^*$. That is, $p_d(H_d, y_d^*; \phi) = p_d(H_d; \phi)$.

Informative or Dropout Not At Random (DNAR). This is when the dropout process depends on $y_d^*$.
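We adopt the regressive logistic models of Bonney (1986, 1987, 1998) to model the dropout process and define the logit as

$$\operatorname{logit}\{p_k(H_k, y_k^*; \phi)\} = \phi_0 + \phi_1 y_{k-1} + \phi_2 y_k^* + \sum_{r=1}^{p} \phi_{2+r}\, x_r,$$

where $p_k(H_k, y_k^*; \phi)$ is the conditional probability of dropout at time $t_k$ given the observed history $H_k = (y_1, \dots, y_{k-1})$ and the possibly unobserved current response $y_k^*$, and $X = (x_1, \dots, x_p)$ is the vector of $p$ individual-specific covariates. Our motivation for this choice is that the probability of dropout at time $t_k$ is a direct consequence of the past outcomes, the present outcome, and a possible set of covariates. Interested readers should see Bonney (1986, 1987) for fully parameterized forms with the specification of the dependence structure of the model.

Following Diggle and Kenward (1994), the joint distribution for an incomplete sequence with dropout at the $d$th time point is

$$P(y_1, \dots, y_{d-1}, D = d) = P^*_{d-1}(y_1, \dots, y_{d-1}; \delta, \alpha) \left[\prod_{k=2}^{d-1} \{1 - p_k(H_k, y_k; \phi)\}\right] P(D = d \mid H_d; \delta, \alpha, \phi),$$

where $P^*_{d-1}$ is the joint distribution of the first $d - 1$ intended responses and, for $d \le n$,

$$P(D = d \mid H_d; \delta, \alpha, \phi) = \sum_{y=0}^{1} p_d(H_d, y; \phi)\, P^*(y \mid H_d; \delta, \alpha).$$

For a complete sequence ($d = n + 1$) the last factor is omitted. Thus, the full log-likelihood for $\Phi = (\delta, \alpha, \phi)$ based on the data $(y_i : i = 1, \dots, N)$ is given as

$$\ell(\Phi) = \sum_{i=1}^{N} \log P(y_{i1}, \dots, y_{i,d_i-1}, D_i = d_i),$$

which is partitioned as

$$\ell(\Phi) = \ell_1(\delta, \alpha) + \ell_2(\phi) + \ell_3(\delta, \alpha, \phi),$$

with

$$\ell_1(\delta, \alpha) = \sum_{i=1}^{N} \log P^*_{d_i-1}(y_i; \delta, \alpha), \qquad \ell_2(\phi) = \sum_{i=1}^{N} \sum_{k=2}^{d_i-1} \log\{1 - p_k(H_{ik}, y_{ik}; \phi)\}, \qquad \ell_3(\delta, \alpha, \phi) = \sum_{i:\, d_i \le n} \log P(D = d_i \mid H_{id_i}; \delta, \alpha, \phi).$$

Here $\ell_1$ is the log-likelihood for the observed response, $\ell_2$ and $\ell_3$ together correspond to the log-likelihood function for the dropout process, and $P^*(y \mid H_d; \delta, \alpha)$ denotes the conditional probability distribution function of $Y_d^*$ given $H_d$. For non-ignorable dropout, $\ell_3(\delta, \alpha, \phi)$ contains information on $(\delta, \alpha, \phi)$ and, as such, cannot be ignored. By applying the DAR (ignorability) condition to Eqn. (4), the reduced log-likelihood function is

$$\ell(\Phi) = \ell_1(\delta, \alpha) + \ell_2(\phi), \tag{6}$$

where $\ell_3(\delta, \alpha, \phi) = \ell_3(\phi)$ depends only on $\phi$ and is absorbed by $\ell_2(\phi)$.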
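The dropout part of the likelihood under DAR can be sketched in a few lines of Python. The sketch assumes a logistic dropout model in which the dropout probability at a visit depends on the previous response and on covariates, with the coefficient of the current response set to zero (the DAR constraint). The function names, argument layout, and parameter names (`phi0`, `phi1`, `phi_x`) are illustrative assumptions, not the authors' implementation.

```python
from math import exp, log


def dropout_prob(prev_y, x, phi0, phi1, phi_x):
    """Logistic dropout probability under DAR (no term for the current response)."""
    eta = phi0 + phi1 * prev_y + sum(c * v for c, v in zip(phi_x, x))
    return 1.0 / (1.0 + exp(-eta))


def dropout_loglik(y_obs, n, x, phi0, phi1, phi_x):
    """Dropout log-likelihood contribution of one subject.

    y_obs holds the responses observed before dropout (visits 1..d-1), so the
    dropout time is d = len(y_obs) + 1 and d == n + 1 means a completer.
    """
    d = len(y_obs) + 1
    ll = 0.0
    for k in range(2, d):  # the subject stayed in the study at visits 2..d-1
        ll += log(1.0 - dropout_prob(y_obs[k - 2], x, phi0, phi1, phi_x))
    if d <= n:  # the subject dropped out at visit d
        ll += log(dropout_prob(y_obs[d - 2], x, phi0, phi1, phi_x))
    return ll


# Hypothetical subject: three observed visits out of five, one covariate.
print(dropout_loglik([1, 0, 0], n=5, x=[1.0], phi0=-2.0, phi1=-0.9, phi_x=[0.1]))
```

Because the dropout probability does not involve the unobserved current response, this contribution depends on the dropout parameters only, which is why the response and dropout parts of the likelihood can be handled separately under DAR.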

Applications to Rheumatoid Arthritis Clinical Trial Data
In this section we use data on 200 subjects from the rheumatoid arthritis clinical trial reported in Bombardier et al. (1986) to illustrate different ways of fitting the disposition model when the data are incomplete. Because closed-form solutions to the score equations do not exist, the parameters are estimated by maximum likelihood using MULTIMAX (Kwagyan, 2001; Bonney, 2003). Patients in this study have at most five unequally spaced binary self-assessment measurements of arthritis, where self-assessment equals 0 if "poor" and 1 if "good". An initial self-assessment measurement of all patients was recorded at the first time point (month 1), after which follow-up self-assessment measurements were taken monthly up to 5 months (k = 2, 3, 4, 5).
Patients were randomized to one of two treatments, placebo or auranofin, at the second self-assessment time. After randomization, patients remained in their treatment groups for the entire 5-month study period. The covariates used in this study are age in years, sex (1 = male, 0 = female), and treatment (1 = auranofin, 0 = placebo). Details about eligibility criteria and the results of the study are reported elsewhere (Bombardier et al., 1986).
Of the 200 patients, 33 (16.5%) had some of their responses missing. Missingness is assumed to be monotone in the sense that if an observation is missing at time $t_k$, it is missing at every later time $t_j$, $j > k$. The primary objective of the study was to determine the effect of treatment on positive self-assessment, adjusted for age and gender. Of importance to us is the effect of the dropout process on positive self-assessment. If we work with the assumption that subjects who showed no improvement or positive response to the treatment are likely to drop out of the study, then we cannot rule out a DAR mechanism; as such, we constrain $\phi_2 = 0$.
To test this assumption, three different models are fitted: a complete-case model, in which the analysis is based on those subjects with no missing observations, and two incomplete-data models that incorporate the dropout process. For the latter, we consider the case when the regression parameters in the response model and the dropout process are the same and the case when they are different.
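The dropout probability is modeled by

$$\operatorname{logit}\{p_k(H_k, y_k; \phi)\} = \phi_0 + \phi_1 y_{k-1} + \phi_2 y_k + \phi_3\,\text{sex} + \phi_4\,\text{age} + \phi_5\,\text{treatment},$$

while the logits of the individual disposition $\delta_{ij}$ and of $\alpha_i = 1/\rho_i$, the reciprocal of the relative disposition, are modeled by

$$\operatorname{logit}(\delta_{ij}) = \beta_0 + \beta_1\,\text{sex}_i + \beta_2\,\text{age}_i + \beta_3\,\text{treatment}_i, \qquad \operatorname{logit}(\alpha_i) = \gamma_0,$$

where $\gamma_0$ is the parameter measuring the within-cluster or within-group dependence and $\beta_0$ is the intercept or mean effect.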

Results of Analysis
Three different analyses are carried out to investigate the impact of the dropout process on the estimation of the response model parameters.
Complete Case: This analysis deletes all subjects with missing values from the data set and estimates the parameters using only the subjects without missing values, also called the completers, with the disposition model given by Eqn. (8).
Incomplete Model DAR I: For this model, the parameter for the current response is constrained to $\phi_{2}=0$, and the covariate parameters of the dropout model and of the disposition model are assumed to be the same. This is done because of the need to ascertain the significance or nonsignificance of the missingness.
Incomplete Model DAR II: As in DAR I, the constraint $\phi_{2}=0$ is imposed, but the covariate parameters of the dropout model and of the disposition model are allowed to differ.
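The complete-case step described above amounts to a simple filter on the subject records. The sketch below uses a few made-up subjects with `None` marking a missing visit; the identifiers, the data layout, and the function name are hypothetical and serve only to illustrate the split into completers and dropouts.

```python
def split_completers(subjects):
    """Split subject records into completers and subjects with missing visits."""
    completers = {sid: ys for sid, ys in subjects.items() if None not in ys}
    dropouts = {sid: ys for sid, ys in subjects.items() if None in ys}
    return completers, dropouts


# Made-up records: five intended visits per subject, None = not observed.
subjects = {
    "s01": [1, 1, 0, 1, 1],
    "s02": [0, 0, None, None, None],
    "s03": [1, 0, 1, 1, None],
    "s04": [0, 1, 1, 1, 1],
}

completers, dropouts = split_completers(subjects)
print(sorted(completers), sorted(dropouts))
```

For the trial data described in this section, the same filter would retain 200 - 33 = 167 completers, whereas the incomplete-data models use all 200 subjects.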
Table 1 shows the results of the fitted models.

Complete Case: We observe that the parameter $\gamma_0$ measuring the within-cluster dependence was not statistically significant, whereas the sex ($\beta_1$) of the subjects and the treatment ($\beta_3$) received were statistically significant for the way the subjects assessed their arthritis status. The result suggests that treatment with auranofin tends to increase the odds of a positive self-assessment by a factor of $e^{1.021} \approx 2.78$. There was no age effect on a positive self-assessment, as the age parameter ($\beta_2$) was not statistically significant. This result is similar to that of an earlier analysis of a subset of the data, carried out under the assumption of missing completely at random using generalized estimating equations.

Incomplete Models: The parameter $\gamma_0$ measuring the within-cluster dependence is statistically significant for both DAR I and DAR II. This means that the observations within a cluster are correlated. This was expected, since the observations are repeated measurements on the single subject that makes up each cluster. Also, there was no age effect on a positive self-assessment, as the age parameter ($\beta_2$) was not statistically significant. Further, there were gender and treatment effects, as the gender ($\beta_1$) and treatment ($\beta_3$) parameters in the response model were statistically significant for both the DAR I and DAR II models. The result suggests that treatment with auranofin tends to increase the odds of a positive self-assessment by a factor of $e^{0.4840} \approx 1.62$ and $e^{0.8341} \approx 2.30$ for the DAR I and DAR II models respectively.
The dropout parameter $\phi_1$, which measures the effect of the response status at the previous time point, was statistically significant for model DAR I, in which the covariate parameters of the dropout model and the response model are the same, meaning that we cannot rule out a DAR dropout mechanism. A similar result was obtained for DAR II, in which the covariate parameters of the dropout and response models are different. However, the covariate parameters $\phi_3$, $\phi_4$, and $\phi_5$ for sex, age and treatment were not statistically significant. This means that dropout depends on the outcome of the previous visit (past history) rather than on the treatment, gender or age of the subject.
Negative estimates of the dropout parameter $\phi_1$ imply that dropout is more likely for those subjects who did not show any positive improvement in their self-assessment at the previous visit. The result suggests that, holding other covariates constant, patients who did not show any positive improvement in their self-assessment at the previous visit have $e^{-0.8909} \approx 0.41$ and $e^{-0.9846} \approx 0.37$ times the odds of continuing the study of their counterparts who experienced a positive self-assessment of arthritis, for DAR II and DAR I respectively. The DAR I model was the best fitting according to Akaike's information criterion (AIC) and the likelihood ratio test. We can therefore conclude, for this example, that the dropout process is random. Finally, since the purpose of this study was to determine the effects of treatment and the impact of the dropout process on positive self-assessment, adjusted for age and gender, a complete-case analysis that cannot capture the dropout process cannot be relied upon to produce an unbiased estimate, even though the results for the disposition parameters across the three models are similar. For example, the complete-case analysis showed no dependence or correlation within the clusters, whereas the incomplete DAR models capture a within-cluster correlation.
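The odds ratios quoted in this section are obtained by exponentiating the corresponding logit-scale coefficient estimates. The short Python sketch below reproduces that arithmetic from the estimates given in the text; the dictionary labels and the helper name are illustrative only.

```python
from math import exp

# Coefficient estimates quoted in the text (log-odds scale).
estimates = {
    "treatment, complete case": 1.021,
    "treatment, DAR I": 0.4840,
    "treatment, DAR II": 0.8341,
    "previous response in dropout model, DAR II": -0.8909,
    "previous response in dropout model, DAR I": -0.9846,
}


def odds_ratio(coefficient):
    """Convert a logit-scale coefficient into an odds ratio."""
    return exp(coefficient)


for label, b in estimates.items():
    print(f"{label}: exp({b}) = {odds_ratio(b):.2f}")
```

A coefficient above zero gives an odds ratio above 1 (higher odds), and a negative coefficient gives an odds ratio below 1, which is how the negative dropout estimates translate into lower odds.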
It is not uncommon for the dropout process to depend only on the observed history. If this is the case, then the incomplete DAR I model should be adopted. However, it is possible that the reason for the dropout is related to both the observed history of the patient and other covariates. To analyze data that fall within this framework, the incomplete DAR II model should be used.

Concluding Remarks
In discussing an example to illustrate the application of the disposition model to longitudinal binary responses when the data are incomplete, we have provided two different models that can be fitted when the dropout mechanism is assumed to be at random. We considered the case when the regression parameters in the response model and dropout model are the same and the case when they are different. The choice of a model for any given dataset should be guided by the purpose of the analysis and the assumptions about the dropout process.

Acknowledgment
The authors are grateful to the anonymous reviewer for helpful comments and suggestions on earlier drafts of the paper. This work is supported by NIH/NCAT grant UL1TR000101 and NIH/NIMHHD grant G12MD007597.

Appendix: Estimation and Inference
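In this section we determine the maximum likelihood estimates (MLEs) of the parameters of Eqn. (6). Since we are dealing with binary outcomes, we adopt the logit transformation and model the parameters $(\delta, \alpha)$ in terms of covariates as

$$\operatorname{logit}(\delta_{ij}) = \beta_0 + \sum_{r=1}^{p} \beta_r x_{ijr}, \qquad \operatorname{logit}(\alpha_i) = \gamma_0.$$

With $\Phi = (\beta, \gamma, \phi)$, the log-likelihood is given by

$$\ell(\Phi) = \ell_1(\delta, \alpha) + \ell_2(\phi).$$

The score vector with respect to $\Phi$ is obtained from

$$U(\Phi) = \frac{\partial \ell(\Phi)}{\partial \Phi}.$$

Thus, the score vector with respect to the parameters $\Phi$ is given by

$$U(\Phi) = \left(\frac{\partial \ell}{\partial \beta},\ \frac{\partial \ell}{\partial \gamma},\ \frac{\partial \ell}{\partial \phi}\right),$$

and the MLEs solve $U(\Phi) = 0$.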
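Since closed-form solutions of the score equations are not available, the score can be approximated numerically. The following Python sketch does this by central finite differences for the response part of the likelihood only, assuming complete sequences, a single covariate, and the two-component form of the disposition model with logit links for $\delta$ and $\alpha$. It is a generic illustration and not the MULTIMAX routine; all names, the data layout, and the parameterization `theta = (beta0, beta1, gamma0)` are assumptions.

```python
from math import exp, log, prod


def expit(z):
    """Inverse of the logit transformation."""
    return 1.0 / (1.0 + exp(-z))


def cluster_loglik(y, x, beta0, beta1, gamma0):
    """Log-probability of one complete sequence under the assumed disposition form."""
    alpha = expit(gamma0)                                # alpha = 1 / rho
    delta = [expit(beta0 + beta1 * xj) for xj in x]      # individual dispositions
    all_zero = prod(1 - yj for yj in y)
    independent = prod(d if yj == 1 else 1 - d for yj, d in zip(y, delta))
    return log((1 - alpha) * all_zero + alpha * independent)


def loglik(theta, data):
    """Sum of cluster log-likelihoods; theta = (beta0, beta1, gamma0)."""
    beta0, beta1, gamma0 = theta
    return sum(cluster_loglik(y, x, beta0, beta1, gamma0) for y, x in data)


def numerical_score(theta, data, h=1e-6):
    """Central finite-difference approximation to the score vector."""
    score = []
    for r in range(len(theta)):
        up, down = list(theta), list(theta)
        up[r] += h
        down[r] -= h
        score.append((loglik(up, data) - loglik(down, data)) / (2 * h))
    return score


# One made-up cluster: responses (1, 0) with covariate values (0, 0).
data = [((1, 0), (0.0, 0.0))]
print(numerical_score([0.0, 0.0, 0.0], data))
```

For this toy cluster the likelihood reduces to $\alpha\,\delta(1-\delta)$, so at `theta = (0, 0, 0)` the score with respect to `gamma0` is $1 - \alpha = 0.5$ and the other two components vanish, which gives a simple check of the approximation. A full analysis would add the dropout term of Eqn. (6) and pass the score to an iterative optimizer.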