A Meta-Learner Framework to Estimate Individualized Treatment Effects for Survival Outcomes
Pub. online: 5 February 2024
Type: Statistical Data Science
Open Access
† Contributed equally.
Received: 13 October 2023
Accepted: 17 January 2024
Published: 5 February 2024
Abstract
A central goal of precision medicine is to enable physicians to recommend the most suitable treatment for each patient. This requires understanding treatment heterogeneity from a patient-centric view, which is quantified by estimating the individualized treatment effect (ITE). As large amounts of genetic data and medical factors are collected, a more complete picture of individuals’ characteristics is forming, which provides more opportunities to estimate the ITE accurately. Recent developments in machine learning methods within the counterfactual outcome framework show excellent potential for analyzing such data. In this research, we propose to extend meta-learning approaches to estimate individualized treatment effects with survival outcomes. We consider two meta-learning algorithms, T-learner and X-learner, each combined with three types of machine learning methods: random survival forest, Bayesian accelerated failure time model, and survival neural network. We examine the performance of the proposed methods and provide practical guidelines for their application in randomized clinical trials (RCTs). Moreover, we propose to use the Boruta algorithm to identify risk factors that contribute to treatment heterogeneity based on ITE estimates. The finite-sample performance of these methods is compared through extensive simulations under different randomization designs. The proposed approach is applied to a large RCT of an eye disease, age-related macular degeneration (AMD), to estimate the ITE on delaying time-to-AMD progression and to make individualized treatment recommendations.
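To make the T-learner idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the ITE is taken as the difference in survival probability at a fixed time `t`, and it swaps in a simple k-nearest-neighbor Kaplan-Meier estimator as the base learner in place of the random survival forest, Bayesian accelerated failure time model, or survival neural network named above. All function names, the simulated trial, and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def km_survival(time, event, t):
    """Kaplan-Meier estimate of S(t) from right-censored observations."""
    surv = 1.0
    # Step down at each distinct observed event time up to t.
    for u in np.unique(time[(event == 1) & (time <= t)]):
        at_risk = np.sum(time >= u)
        failures = np.sum((time == u) & (event == 1))
        surv *= 1.0 - failures / at_risk
    return surv


def knn_km_predict(X_train, time, event, X_new, t, k):
    """Stand-in base learner: Kaplan-Meier over the k nearest training patients."""
    preds = np.empty(len(X_new))
    for i, x in enumerate(X_new):
        nearest = np.argsort(np.linalg.norm(X_train - x, axis=1))[:k]
        preds[i] = km_survival(time[nearest], event[nearest], t)
    return preds


def t_learner_ite(X, treat, time, event, t, k=40):
    """T-learner: one survival model per arm, then difference the predictions."""
    trt, ctl = treat == 1, treat == 0
    s1 = knn_km_predict(X[trt], time[trt], event[trt], X, t, k)
    s0 = knn_km_predict(X[ctl], time[ctl], event[ctl], X, t, k)
    return s1 - s0


# Hypothetical simulated RCT: treatment lowers the hazard only when X[:, 0] > 0.
rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(-1.0, 1.0, size=(n, 2))
treat = rng.integers(0, 2, size=n)
rate = 0.5 * np.exp(-1.0 * treat * (X[:, 0] > 0))
event_time = rng.exponential(1.0 / rate)
censor_time = rng.exponential(10.0, size=n)
time = np.minimum(event_time, censor_time)
event = (event_time <= censor_time).astype(int)

ite = t_learner_ite(X, treat, time, event, t=2.0)
print("mean ITE, X0 > 0 :", ite[X[:, 0] > 0].mean())
print("mean ITE, X0 <= 0:", ite[X[:, 0] <= 0].mean())
```

In this simulated setting, the estimated ITE should on average be larger for patients with `X[:, 0] > 0`, who are the only ones who benefit from treatment. An X-learner would add a second stage that models imputed treatment effects within each arm and then combines the two resulting estimates; that stage is omitted here for brevity.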
Supplementary material
1. We provide code and data on GitHub (https://github.com/nab1779321/HTEsurv) that can be used to reproduce the simulation and real data analysis in this paper.
2. We provide an additional PDF file that includes (a) additional simulation results and (b) additional real data analysis results.