Joint models can describe the relationship between recurrent and terminal events. Recurrent events are typically modeled on the total time scale, under the assumption that covariate effects are constant across recurrent events. Modeling the gap times between recurrent events instead can allow covariate effects to vary and offer greater flexibility and accuracy. For instance, in HIV-infected patients, the time to the first opportunistic infection (OI) may follow a different distribution from the gap times between later OIs. Despite this, limited research has focused on mediation analysis using joint modeling of gap times and survival time. In this work, we propose a novel joint modeling approach that studies the mediation effect of recurrent events on survival outcomes, with the recurrent events modeled on the gap time scale. This allows us to handle cases where the first occurrence of a recurrent event behaves differently from subsequent events. Additionally, we use a relaxed “sequential ignorability” assumption to address unmeasured confounding. Simulation studies demonstrate that our model performs well in estimating both model parameters and mediation effects. We apply our method to an AIDS study to evaluate the comparative effectiveness of two treatments and the effect of baseline CD4 counts on overall survival, mediated by recurrent opportunistic infections modeled through gap times.
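To make the distinction between the two time scales concrete, the following minimal sketch converts one patient's event times on the total time scale into gap times. The function name `to_gap_times` and the example values are hypothetical illustrations and are not taken from the paper; the paper's joint model itself is not reproduced here.

```python
def to_gap_times(event_times):
    """Convert event times measured from study entry (total time scale)
    into gap times, i.e. the time elapsed since the previous event.

    Hypothetical helper for illustration only.
    """
    gaps = []
    previous = 0.0  # study entry is taken as time zero (an assumption)
    for t in sorted(event_times):
        gaps.append(t - previous)
        previous = t
    return gaps


# Made-up event times (in months) for a single patient.
total_times = [3.0, 7.5, 9.0]
gap_times = to_gap_times(total_times)
print(gap_times)  # first entry is the time to the first event
```

On the gap time scale, the first entry (time to the first event) can be given its own distribution or covariate effects, separate from the later gaps.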
Pub. online: 11 Jun 2025
Type: Statistical Data Science
Open Access
Journal: Journal of Data Science
Volume 23, Issue 3 (2025): Special Issue: 2024 WNAR/IMS/Graybill Annual Meeting, pp. 499–520
Abstract
The rapidly expanding field of metabolomics presents an invaluable resource for understanding the associations between metabolites and various diseases. However, the high dimensionality, missing values, and measurement errors of metabolomics data make it challenging to develop reliable and reproducible approaches for disease association studies. There is therefore a compelling need for robust statistical analyses that can navigate these complexities. In this paper, we construct algorithms to perform variable selection for noisy data and control the False Discovery Rate when selecting mutual metabolomic predictors for multiple disease outcomes. We illustrate the versatility and performance of this procedure in a variety of scenarios involving missing data and measurement errors. As a specific application of this novel methodology, we target two of the most prevalent cancers among US women: breast cancer and colorectal cancer. By applying our method to the Women’s Health Initiative data, we successfully identify metabolites that are associated with either or both of these cancers, demonstrating the practical utility and potential of our method in identifying consistent risk factors and understanding shared mechanisms between diseases.
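The abstract does not specify which algorithm is used to control the False Discovery Rate. As a generic illustration of what FDR control means, the sketch below implements the classical Benjamini-Hochberg step-up procedure on a list of p-values. This is a swapped-in textbook technique, not the paper's method, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected by the
    Benjamini-Hochberg step-up procedure at target FDR level alpha.

    Classical procedure shown for illustration; NOT the paper's algorithm.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank k with p_(k) <= alpha * k / m.
        if p_values[idx] <= alpha * rank / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    return sorted(order[:k_max])


# Made-up p-values, one per candidate predictor.
selected = benjamini_hochberg([0.01, 0.02, 0.03, 0.5], alpha=0.05)
print(selected)
```

The step-up rule rejects every hypothesis up to the largest rank that meets its threshold, even if an intermediate p-value misses its own threshold.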