Abstract: We group approaches to modeling correlated binary data according to whether the data are recorded cross-sectionally or longitudinally; whether the models are population-averaged or subject-specific; and whether the covariates are time-dependent or time-independent. Standard logistic regression models are appropriate for cross-sectional data. For longitudinal data, however, methods such as generalized estimating equations (GEE) and generalized method of moments (GMM) are commonly used to fit population-averaged models, while random-effects models such as generalized linear mixed models (GLMM) are used to fit subject-specific models. Some of these methods account for time-dependence in covariates while others do not. This paper illustrates these approaches using a Medicare dataset on rehospitalization. In particular, we compared results from standard logistic models, GEE models, GMM models, and random-effects models by analyzing a binary outcome for four successive hospitalizations. We found that these procedures differ in how they address the correlation among responses and the feedback from response to covariate. We found marginal GMM logistic regression models to be more appropriate than GEE models when covariates are classified as time-dependent. We also found conditional random-intercept models with time-dependent covariates decomposed into components to be more appropriate than ordinary random-effects models when time-dependent covariates are present. We used the SAS procedures GLIMMIX, NLMIXED, IML, GENMOD, and LOGISTIC to analyze the illustrative dataset, as well as custom programs written in R.
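The abstract does not say which components the time-dependent covariates are decomposed into. As a minimal sketch, assuming the common split into a between-subject part (the subject's mean over occasions) and a within-subject part (each occasion's deviation from that mean), the decomposition could look as follows. The function name `decompose_covariate` and the example values are hypothetical and are not taken from the paper or the Medicare dataset.

```python
from statistics import mean


def decompose_covariate(values_by_subject):
    """Split each subject's repeated covariate values into
    (subject mean, list of deviations from that mean).

    Assumption: a between/within decomposition; the paper may use
    a different set of components.
    """
    result = {}
    for subject, values in values_by_subject.items():
        subject_mean = mean(values)                         # between-subject part
        deviations = [v - subject_mean for v in values]     # within-subject part
        result[subject] = (subject_mean, deviations)
    return result


# Hypothetical covariate observed at four successive hospitalizations.
length_of_stay = {"A": [2, 4, 6, 8], "B": [5, 5, 5, 5]}
parts = decompose_covariate(length_of_stay)
```

In a random-intercept model, the two parts would then enter as separate predictors, so that the between-subject and within-subject effects can differ.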
Abstract: Overdispersion, often referred to as extra variation, is believed to be more prevalent in survey data because of heterogeneity among and between the units. One approach to addressing this phenomenon is to use a generalized Dirichlet-multinomial model. In its application, the generalized Dirichlet-multinomial model assumes that the clusters are of equal size and that the number of clusters remains the same over time. In practice this is rarely the case when clusters are observed over time. In this paper the random variability and the varying response rates are accounted for in the model. This requires modeling another level of variation. In effect, this can be considered a hierarchical model that allows varying response rates in the presence of overdispersed multinomial data. The model and its applicability are demonstrated through an illustrative application to a subset of the well-known High School and Beyond survey data.
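To make the notion of overdispersion concrete, the sketch below uses the ordinary Dirichlet-multinomial distribution, not the generalized or hierarchical model of the paper. For a cluster of size n and Dirichlet parameters with sum A, each category count has variance n p (1 - p) (n + A) / (1 + A), so the multinomial variance is inflated by the factor (n + A) / (1 + A). The function names and parameter values are hypothetical.

```python
def dm_variance_inflation(n, alphas):
    """Factor by which Dirichlet-multinomial variance exceeds
    multinomial variance for a cluster of size n."""
    a0 = sum(alphas)
    return (n + a0) / (1 + a0)


def dm_variances(n, alphas):
    """Variance of each category count under a Dirichlet-multinomial."""
    a0 = sum(alphas)
    factor = dm_variance_inflation(n, alphas)
    return [n * (a / a0) * (1 - a / a0) * factor for a in alphas]


# Hypothetical Dirichlet parameters for three response categories.
alphas = [1.0, 1.0, 2.0]

# The inflation factor depends on cluster size, which is one reason
# unequal cluster sizes matter when modeling overdispersed counts.
factors = {n: dm_variance_inflation(n, alphas) for n in (1, 10, 50)}
```

With a cluster of size 1 the factor equals 1, so there is no extra variation, and it grows as the cluster size increases.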