Joint models can describe the relationship between recurrent and terminal events. Typically, recurrent events are modeled on the total time scale, assuming constant covariate effects across all recurrent events. However, modeling the gap time between recurrent events allows covariate effects to vary and offers greater flexibility and accuracy. For instance, in HIV-infected patients, the gap time to the first opportunistic infection (OI) may follow a different distribution than the gap times between later OIs. However, limited research has focused on mediation analysis using joint modeling of gap times and survival time. In this work, we propose a novel joint modeling approach that studies the mediation effect of recurrent events on survival outcomes by modeling recurrent events on the gap-time scale. This allows us to handle cases where the first occurrence of a recurrent event behaves differently from subsequent events. Additionally, we use a relaxed “sequential ignorability” assumption to address unmeasured confounding. Simulation studies demonstrate that our model performs well in estimating both model parameters and mediation effects. We apply our method to an AIDS study to evaluate the comparative effectiveness of two treatments and the effect of baseline CD4 counts on overall survival, mediated by recurrent opportunistic infections modeled through gap times.
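The core idea in the abstract above, that the first gap time can follow a different distribution than later gap times, with a terminal event censoring further recurrences, can be illustrated with a small simulation. This is a minimal sketch under assumed exponential distributions and illustrative rates; it is not the authors' model.

```python
import random

def simulate_subject(rate_first, rate_later, rate_terminal, seed=None):
    """Simulate one subject's recurrent-event gap times until a terminal event.

    The first gap is drawn from a different exponential distribution than
    later gaps, mirroring the idea that the first occurrence of an
    opportunistic infection may behave differently from subsequent ones.
    All distributions and rates here are illustrative assumptions.
    """
    rng = random.Random(seed)
    terminal_time = rng.expovariate(rate_terminal)  # terminal (survival) time
    gaps, t = [], 0.0
    first = True
    while True:
        gap = rng.expovariate(rate_first if first else rate_later)
        first = False
        if t + gap >= terminal_time:
            break  # the terminal event censors any further recurrences
        t += gap
        gaps.append(gap)
    return gaps, terminal_time

gaps, T = simulate_subject(rate_first=2.0, rate_later=0.5, rate_terminal=0.2, seed=1)
```

By construction, every observed recurrence falls before the terminal time, which is exactly the dependence structure a joint model of gap times and survival must respect.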
Pub. online: 10 Dec 2025 · Type: Data Science Reviews · Open Access
Journal: Journal of Data Science
Volume 24, Issue 1 (2026): Special Issue: Statistical aspects of Trustworthy Machine Learning, pp. 86–105
Abstract
Reinforcement Learning (RL) is a powerful framework for sequential decision-making, enabling agents to optimize actions through interaction with their environment. While widely studied in computer science, statisticians have advanced RL by addressing challenges like uncertainty quantification, sample efficiency, and interpretability. These contributions are particularly impactful in healthcare, where RL complements Dynamic Treatment Regimes (DTRs), optimizing personalized medicine by tailoring treatments to individuals based on evolving characteristics. This paper serves as both a tutorial for statisticians new to RL and a review of its integration with statistical methodologies. It introduces foundational RL concepts, classical algorithms, and Q-learning variants, and highlights how statistical perspectives, especially causal inference, address challenges in DTRs. By bridging RL and statistical perspectives, the paper identifies opportunities to enhance decision-making in high-stakes domains like healthcare.
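Q-learning, one of the classical algorithms the tutorial covers, can be sketched in a few lines. The toy chain MDP below is a hypothetical environment chosen for illustration (not a DTR or healthcare setting); the update rule bootstraps each state-action value toward the reward plus the discounted maximum value of the next state.

```python
import random

def q_learning(n_states=5, n_actions=2, episodes=2000,
               alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a toy chain MDP (illustrative sketch only).

    States 0..n_states-1; action 1 moves right, action 0 moves left.
    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda a_: Q[s][a_])
            s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Q-learning update: bootstrap with the max over next-state actions
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

Q = q_learning()
```

After training, the learned values prefer the rewarding action (moving right) in every state, which is the greedy policy Q-learning is designed to recover.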
Pub. online: 21 Oct 2025 · Type: Statistical Data Science · Open Access
Journal: Journal of Data Science
Volume 24, Issue 1 (2026): Special Issue: Statistical aspects of Trustworthy Machine Learning, pp. 146–166
Abstract
The extraordinary capabilities of large language models (LLMs) such as ChatGPT and GPT-4 are in part unleashed by aligning them with reward models that are trained on human preferences represented as rankings of responses to prompts. In this paper, we document the phenomenon of reward collapse, an empirical observation where the prevailing ranking-based approach results in an identical reward distribution for diverse prompts during the terminal phase of training. This outcome is undesirable, as open-ended prompts like “write a short story about your best friend” should yield a continuous range of rewards for their completions, while specific prompts like “what is the capital city of New Zealand” should generate either high or low rewards. Our theoretical investigation reveals that reward collapse is primarily due to the insufficiency of the ranking-based objective function to incorporate prompt-related information during optimization. We then derive closed-form expressions for the reward distribution associated with a set of utility functions in an asymptotic setting. Based on the reward distributions for different utility functions, we introduce a prompt-aware optimization scheme that provably admits a prompt-dependent reward distribution within the interpolating regime. Our experimental results suggest that our proposed prompt-aware utility functions significantly alleviate reward collapse during the training of reward models.
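The prompt-insensitivity of the standard ranking objective can be made concrete. The sketch below implements a Bradley-Terry style pairwise ranking loss of the kind commonly used to train reward models; it is a generic illustration, not the paper's exact objective or its prompt-aware variant. Because the loss depends only on the reward difference between the preferred and rejected responses, two prompts with the same reward gap incur identical losses regardless of how open-ended they are.

```python
import math

def pairwise_ranking_loss(r_preferred, r_rejected):
    """Bradley-Terry style ranking loss: -log sigmoid(r_pref - r_rej).

    Note the loss is a function of the reward *difference* only; no
    prompt-specific information enters the objective.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(r_preferred - r_rejected))))

# Identical loss for two different prompts with the same reward gap,
# and invariance to shifting both rewards by a constant:
loss_a = pairwise_ranking_loss(2.0, 1.0)
loss_b = pairwise_ranking_loss(5.0, 4.0)
```

This invariance is the structural feature the abstract points to: with no prompt-dependent term in the objective, optimization has no way to produce prompt-dependent reward distributions.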
Statistical learning methods have been growing in popularity in recent years. Many of these procedures have parameters that must be tuned for models to perform well. Tuning has been studied extensively for neural networks, but much less so for many other learning methods. We examined the behavior of tuning parameters for support vector machines, gradient boosting machines, and AdaBoost in both classification and regression settings. We used grid search to identify ranges of tuning parameters where good models can be found across many different datasets. We then explored different optimization algorithms to select a model across the tuning parameter space. Models selected by the optimization algorithms were compared to the best models obtained through grid search to identify well-performing algorithms. This information was used to create an R package, EZtune, that automatically tunes support vector machines and boosted trees.
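The grid-search step described above can be sketched generically: enumerate every combination of tuning-parameter values and keep the one minimizing an error criterion. The toy objective below is a hypothetical stand-in for cross-validated error; EZtune's actual internals are in R and may differ.

```python
from itertools import product

def grid_search(objective, grid):
    """Exhaustive grid search over a dict mapping parameter names to
    candidate value lists; returns the combination minimizing `objective`.
    A generic sketch of the procedure, not EZtune's implementation.
    """
    names = list(grid)
    best_params, best_val = None, float("inf")
    for combo in product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        val = objective(params)
        if val < best_val:
            best_params, best_val = params, val
    return best_params, best_val

def toy_cv_error(p):
    # Hypothetical smooth error surface with minimum at cost=1.0, gamma=0.1
    return (p["cost"] - 1.0) ** 2 + (p["gamma"] - 0.1) ** 2

best, val = grid_search(toy_cv_error,
                        {"cost": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]})
```

Grid search is exhaustive and embarrassingly parallel, which is why it is a natural baseline against which to compare the faster optimization algorithms the abstract mentions.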
Abstract: Observational studies of relatively large data can have potentially hidden heterogeneity with respect to causal effects and propensity scores, that is, patterns of a putative cause being exposed to study subjects. This underlying heterogeneity can be crucial in causal inference for any observational study because it is systematically generated and structured by covariates that influence the cause and/or its related outcomes. To address the causal inference problem in view of data structure, machine learning techniques such as tree analysis are naturally called for. Kang, Su, Hitsman, Liu and Lloyd-Jones (2012) proposed the Marginal Tree (MT) procedure to explore both the confounding and interacting effects of covariates on causal inference. In this paper, we extend the MT method to the case of binary responses, along with a clear exposition of its relationship with the established causal odds ratio. We assess the causal effect of dieting on emotional distress using both a real data set from Lalonde’s National Supported Work Demonstration Analysis (NSW) and a simulated data set from the National Longitudinal Study of Adolescent Health (Add Health).