A Meta-Learner Framework to Estimate Individualized Treatment Effects for Survival Outcomes

Bo, Na; Wei, Yue; Zeng, Lang; Kang, Chaeryon; Ding, Ying

doi:10.6339/24-JDS1119

Journal of Data Science

Volume 22, Issue 4 (2024), pp. 505–523

Pub. online: 5 February 2024
Type: Statistical Data Science

Open Access

Na Bo and Yue Wei contributed equally.

Received: 13 October 2023
Accepted: 17 January 2024
Published: 5 February 2024

Abstract

One crucial aspect of precision medicine is to allow physicians to recommend the most suitable treatment for their patients. This requires understanding the treatment heterogeneity from a patient-centric view, quantified by estimating the individualized treatment effect (ITE). With a large amount of genetic data and medical factors being collected, a complete picture of individuals’ characteristics is forming, which provides more opportunities to accurately estimate the ITE. Recent developments using machine learning methods within the counterfactual outcome framework show excellent potential in analyzing such data. In this research, we propose to extend meta-learning approaches to estimate individualized treatment effects with survival outcomes. Two meta-learning algorithms are considered, T-learner and X-learner, each combined with three types of machine learning methods: random survival forest, Bayesian accelerated failure time model, and survival neural network. We examine the performance of the proposed methods and provide practical guidelines for their application in randomized clinical trials (RCTs). Moreover, we propose to use the Boruta algorithm to identify risk factors that contribute to treatment heterogeneity based on ITE estimates. The finite sample performances of these methods are compared through extensive simulations under different randomization designs. The proposed approach is applied to a large RCT of an eye disease, namely, age-related macular degeneration (AMD), to estimate the ITE on delaying time-to-AMD progression and to make individualized treatment recommendations.
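
To make the T-learner idea concrete, the following is a minimal sketch and not the authors' implementation (their code is in R in the GitHub repository listed under Supplementary material). It rests on several assumptions that do not come from the abstract: the ITE is taken to be the difference in survival probability at a fixed time horizon, the base learner is a Kaplan-Meier estimator fitted separately per arm within a single discrete covariate stratum (standing in for the random survival forest, Bayesian accelerated failure time model, or survival neural network), and the function names and toy records are hypothetical.

```python
def kaplan_meier(times, events, horizon):
    """Kaplan-Meier estimate of the survival probability S(horizon).

    times  : observed follow-up times (event or censoring)
    events : 1 if the event was observed, 0 if right-censored
    """
    surv = 1.0
    # Only distinct observed event times up to the horizon change the estimate.
    event_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failures = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - failures / at_risk
    return surv


def t_learner_ite(records, horizon):
    """T-learner sketch: fit one survival model per arm, then subtract.

    records : iterable of (stratum, treated, time, event) tuples, where
              `stratum` is a discrete covariate value (an assumption made
              here so that a stratified Kaplan-Meier can act as base learner).
    Returns {stratum: S_treated(horizon) - S_control(horizon)}.
    """
    groups = {}
    for stratum, treated, time, event in records:
        times, events = groups.setdefault((stratum, treated), ([], []))
        times.append(time)
        events.append(event)

    ite = {}
    for stratum in sorted({key[0] for key in groups}):
        # An estimate needs data from both arms in the stratum.
        if (stratum, 1) in groups and (stratum, 0) in groups:
            s1 = kaplan_meier(*groups[(stratum, 1)], horizon)
            s0 = kaplan_meier(*groups[(stratum, 0)], horizon)
            ite[stratum] = s1 - s0
    return ite


# Hypothetical toy data: (stratum, treated, time, event)
toy_records = [
    ("A", 1, 5, 1), ("A", 1, 8, 0), ("A", 1, 10, 1), ("A", 1, 12, 0),
    ("A", 0, 2, 1), ("A", 0, 4, 1), ("A", 0, 6, 0), ("A", 0, 9, 1),
    ("B", 1, 3, 1), ("B", 1, 6, 1), ("B", 1, 9, 0), ("B", 1, 11, 0),
    ("B", 0, 3, 1), ("B", 0, 7, 1), ("B", 0, 8, 0), ("B", 0, 12, 0),
]

ite = t_learner_ite(toy_records, horizon=7)
for stratum, effect in ite.items():
    print(stratum, round(effect, 3))
```

Under this sketch, a positive estimate means a higher probability of remaining progression-free at the horizon under treatment, so a simple recommendation rule would be to treat when the estimate is above zero. With continuous or high-dimensional covariates, the stratified estimator would be replaced by a flexible survival model per arm, which is where the machine learning methods named in the abstract come in.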

Supplementary material

1. We provide code and data on GitHub (https://github.com/nab1779321/HTEsurv) that can be used to reproduce the simulations and real data analysis in this paper.
2. We provide an additional PDF file that includes (a) additional simulation results and (b) additional real data analysis results.

References

Age-Related Eye Disease Study Research Group (1999). The age-related eye disease study (AREDS): design implications. AREDS report no. 1. Controlled Clinical Trials, 20(6): 573–600. https://doi.org/10.1016/S0197-2456(99)00031-8

Belitser SV, Martens EP, Pestman WR, Groenwold RHH, de Boer A, Klungel OH (2011). Measuring balance and model selection in propensity score methods. Pharmacoepidemiology and Drug Safety, 20(11): 1115–1129.

Cascella R, Strafella C, Caputo V, Errichiello V, Zampatti S, Milano F, et al. (2018). Towards the application of precision medicine in age-related macular degeneration. Progress in Retinal and Eye Research, 63: 132–146. https://doi.org/10.1016/j.preteyeres.2017.11.004

Chew EY, Clemons T, SanGiovanni JP, Danis R, Domalpally A, McBee W, et al. (2012). The age-related eye disease study 2 (AREDS2): Study design and baseline characteristics (AREDS2 report number 1). Ophthalmology, 119(11): 2282–2289. https://doi.org/10.1016/j.ophtha.2012.05.027

Chew EY, Klein ML, Clemons TE, Agrón E, Abecasis GR (2015). Genetic testing in persons with age-related macular degeneration and the use of the AREDS supplements: to test or not to test? Ophthalmology, 122(1): 212–215. https://doi.org/10.1016/j.ophtha.2014.10.012

Cui Y, Kosorok MR, Sverdrup E, Wager S, Zhu R (2023). Estimating heterogeneous treatment effects with right-censored data via causal survival forests. Journal of the Royal Statistical Society, Series B, Statistical Methodology, 85(2): 179–211. https://doi.org/10.1093/jrsssb/qkac001

Curth A, van der Schaar M (2021). Nonparametric estimation of heterogeneous treatment effects: From theory to learning algorithms. In: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics (A Banerjee, K Fukumizu, eds.), volume 130 of Proceedings of Machine Learning Research, 1810–1818. PMLR.

Foster JC, Taylor JM, Ruberg SJ (2011). Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30: 2867–2880. https://doi.org/10.1002/sim.4322

Greenwell B, Boehmke B, Cunningham J, Developers G (2020). gbm: Generalized Boosted Regression Models. R package version 2.1.8.

Henderson NC, Louis TA, Rosner GL, Varadhan R (2020). Individualized treatment effects with censored data via fully nonparametric Bayesian accelerated failure time models. Biostatistics, 21(1): 50–68. https://doi.org/10.1093/biostatistics/kxy028

Hill JL (2011). Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics, 20: 217–240. https://doi.org/10.1198/jcgs.2010.08162

Ishwaran H, Kogalur U (2007). Random survival forests for R. R News, 7(2): 25–31.

Johansson FD, Shalit U, Kallus N, Sontag D (2022). Generalization bounds and representation learning for estimation of potential outcomes and causal effects. Journal of Machine Learning Research, 23(166): 1–50.

Kennedy EH (2023). Towards optimal doubly robust estimation of heterogeneous causal effects. Electronic Journal of Statistics, 17(2): 3008–3049.

Klein ML, Francis PJ, Rosner B, Reynolds R, Hamon SC, Schultz DW, et al. (2008). CFH and LOC387715/ARMS2 genotypes and treatment with antioxidants and zinc for age-related macular degeneration. Ophthalmology, 115(6): 1019–1025. https://doi.org/10.1016/j.ophtha.2008.01.036

Koch B, Sainburg T, Geraldo P, Jiang S, Sun Y, Foster JG (2023). A primer on deep learning for causal inference. arXiv preprint: https://arxiv.org/abs/2110.04442v2.

Kunzel SR, Sekhon JS, Bickel PJ, Yu B (2019). Metalearners for estimating heterogeneous treatment effects using machine learning. Proceedings of the National Academy of Sciences of the United States of America, 116(10): 4156–4165. https://doi.org/10.1073/pnas.1804597116

Kursa MB, Rudnicki WR (2010). Feature selection with the Boruta package. Journal of Statistical Software, 36: 1–13. https://doi.org/10.18637/jss.v036.i11

Liaw A, Wiener M (2002). Classification and regression by randomForest. R News, 2(3): 18–22.

Rubin DB (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5): 688–701. https://doi.org/10.1037/h0037350

Seddon J, Silver R, Rosner B (2016). Response to AREDS supplements according to genetic factors: Survival analysis approach using the eye as the unit of analysis. British Journal of Ophthalmology, 100(12): 1731–1737. https://doi.org/10.1136/bjophthalmol-2016-308624

Shalit U, Johansson FD, Sontag D (2017). Estimating individual treatment effect: generalization bounds and algorithms. In: Proceedings of the 34th International Conference on Machine Learning (D Precup, YW Teh, eds.), volume 70 of Proceedings of Machine Learning Research, 3076–3085. PMLR.

Shi C, Blei D, Veitch V (2019). Adapting neural networks for the estimation of treatment effects. In: Advances in Neural Information Processing Systems (H Wallach, H Larochelle, A Beygelzimer, F d’Alché-Buc, E Fox, R Garnett, eds.), volume 32. Curran Associates Inc.

Splawa-Neyman J, Dabrowska D, Speed T (1990). On the application of probability theory to agricultural experiments. Essay on principles. Statistical Science, 5(4): 465–472. https://doi.org/10.1214/ss/1177012031

Sun T, Ding Y (2019). Copula-based semiparametric regression method for bivariate data under general interval censoring. Biostatistics, 22: 315–330. https://doi.org/10.1093/biostatistics/kxz032

Sun T, Wei Y, Chen W, Ding Y (2020). Genome-wide association study-based deep learning for survival prediction. Statistics in Medicine, 39: 4605–4620. https://doi.org/10.1002/sim.8743

Tabib S, Larocque D (2020). Non-parametric individual treatment effect estimation for survival data with random forests. Bioinformatics, 36(2): 629–636. https://doi.org/10.1093/bioinformatics/btz602

Wager S, Athey S (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113: 1228–1242. https://doi.org/10.1080/01621459.2017.1319839

Wei Y, Hsu JC, Chen W, Chew EY, Ding Y (2021). Identification and inference for subgroups with differential treatment efficacy from randomized controlled trials with survival outcomes through multiple testing. Statistics in Medicine, 40(29): 6523–6540. https://doi.org/10.1002/sim.9196

Yan Q, Ding Y, Liu Y, Sun T, Fritsche LG, Clemons T, et al. (2018). Genome-wide analysis of disease progression in age-related macular degeneration. Human Molecular Genetics, 27(5): 929–940. https://doi.org/10.1093/hmg/ddy002

Yoon J, Jordon J, van der Schaar M (2018). GANITE: Estimation of individualized treatment effects using generative adversarial nets. International Conference on Learning Representations. https://openreview.net/forum?id=ByKWUeWA-

Zhao L, Tian L, Cai T, Claggett B, Wei LJ (2013). Effectively selecting a target population for a future comparative study. Journal of the American Statistical Association, 108(502): 527–539. PMID: 24058223. https://doi.org/10.1080/01621459.2013.770705

Zhu J, Gallego B (2020). Targeted estimation of heterogeneous treatment effect in observational survival analysis. Journal of Biomedical Informatics, 107: 103474. https://doi.org/10.1016/j.jbi.2020.103474

© 2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

age-related macular degeneration; individualized treatment effect; precision medicine; randomized clinical trials; survival outcomes
