Precision Medicine: Interaction Survival Tree for Recurrent Event Data

Yang, Yushan; Perera, Chamila; Miller, Philip; Su, Xiaogang; Liu, Lei

doi:10.6339/24-JDS1126

Journal of Data Science

Volume 22, Issue 2 (2024): Special Issue: 2023 Symposium on Data Science and Statistics (SDSS): “Inquire, Investigate, Implement, Innovate”, pp. 298–313

Pub. online: 17 April 2024 Type: Statistical Data Science

Open Access

Received: 1 August 2023

Accepted: 15 March 2024

Published: 17 April 2024

Abstract

In randomized controlled trials, subjects experiencing recurrent events may display heterogeneous treatment effects: some subjects might benefit, while others might see negligible improvement or even be harmed. To identify subgroups with heterogeneous treatment effects, this paper develops an interaction survival tree approach. It adopts the Classification and Regression Tree (CART) methodology (Breiman et al., 1984) to recursively partition the data into subsets that show the greatest interaction with the treatment. The heterogeneity of treatment effects is assessed through Cox’s proportional hazards model, with a frailty term to account for the correlation among recurrent events within each subject. A simulation study is conducted to evaluate the performance of the proposed method. Additionally, the method is applied to identify subgroups in a randomized, double-blind, placebo-controlled study of chronic granulomatous disease. The R implementation is publicly available on GitHub at https://github.com/xgsu/IT-Frailty.
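To make the splitting idea concrete, the following is a minimal Python sketch of a single split search. It is not the authors' implementation (that is the R code linked above), and it swaps in a simpler statistic: instead of fitting a Cox model with a frailty term, it compares Poisson event rates per unit of follow-up and ignores the within-subject correlation that the frailty term is meant to capture. The function names `log_rate_ratio` and `best_split`, the row layout, and the toy data are all assumptions made for illustration.

```python
import math

# Each row is one subject: (treatment 0/1, covariate x, event count, follow-up time).
# Toy data (hypothetical): treatment helps when x <= 2 and harms when x > 2.
rows = [
    (1, 1, 1, 10.0), (0, 1, 5, 10.0),
    (1, 2, 1, 10.0), (0, 2, 5, 10.0),
    (1, 3, 5, 10.0), (0, 3, 1, 10.0),
    (1, 4, 5, 10.0), (0, 4, 1, 10.0),
]


def log_rate_ratio(group):
    """Log event-rate ratio (treated vs. control) and its Poisson variance."""
    events = [0.0, 0.0]
    time_at_risk = [0.0, 0.0]
    for trt, _x, n_events, followup in group:
        events[trt] += n_events
        time_at_risk[trt] += followup
    if min(events) == 0 or min(time_at_risk) == 0:
        return None  # rate ratio is undefined in this group
    lrr = math.log(events[1] / time_at_risk[1]) - math.log(events[0] / time_at_risk[0])
    var = 1.0 / events[1] + 1.0 / events[0]
    return lrr, var


def best_split(data, min_size=2):
    """Return (cutpoint, squared z) maximizing the treatment-by-split interaction."""
    best = None
    cutpoints = sorted({r[1] for r in data})[:-1]  # the largest value cannot split
    for c in cutpoints:
        left = [r for r in data if r[1] <= c]
        right = [r for r in data if r[1] > c]
        if len(left) < min_size or len(right) < min_size:
            continue
        stat_l, stat_r = log_rate_ratio(left), log_rate_ratio(right)
        if stat_l is None or stat_r is None:
            continue
        # Wald-type statistic for the difference in treatment effects between children.
        z = (stat_l[0] - stat_r[0]) / math.sqrt(stat_l[1] + stat_r[1])
        if best is None or z * z > best[1]:
            best = (c, z * z)
    return best


split = best_split(rows)
print(split)
```

On the toy data the search selects the cutpoint x ≤ 2, where the treatment effect changes sign. A full interaction tree would apply this search recursively to each child node and then prune; in the method described in the abstract, the split statistic would instead come from the frailty model.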

References

Akaike H (1973). Information theory and an extension of the maximum likelihood principle. In: Selected Papers of Hirotugu Akaike, 199–213. Springer New York, New York, NY.

Amorim LD, Cai J (2015). Modelling recurrent events: A tutorial for analysis in epidemiology. International Journal of Epidemiology, 44(1): 324–333. https://doi.org/10.1093/ije/dyu222

Andersen PK, Gill RD (1982). Cox’s regression model for counting processes: A large sample study. The Annals of Statistics, 10: 1100–1120.

Breiman L, Friedman J, Olshen R, Stone C (1984). Classification and Regression Trees. Chapman & Hall/CRC.

Clayton D (1978). A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65(1): 141–151. https://doi.org/10.1093/biomet/65.1.141

Cook R, Lawless J (2007). The Statistical Analysis of Recurrent Events. Springer, New York, NY.

Foster JC, Taylor JM, Ruberg SJ (2011). Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30(24): 2867–2880. https://doi.org/10.1002/sim.4322

Hao N, Zhang HH (2014). Interaction screening for ultrahigh-dimensional data. Journal of the American Statistical Association, 109(507): 1285–1301. https://doi.org/10.1080/01621459.2014.881741

Hou J, Seneviratne C, Su X, Taylor J, Johnson B, Wang XQ, et al. (2015). Subgroup identification in personalized treatment of alcohol dependence. Alcoholism, Clinical and Experimental Research, 39(7): 1253–1259. https://doi.org/10.1111/acer.12759

Kelly PJ, Lim LLY (2000). Survival analysis for recurrent event data: An application to childhood infectious diseases. Statistics in Medicine, 19(1): 13–33. https://doi.org/10.1002/(SICI)1097-0258(20000115)19:1<13::AID-SIM279>3.0.CO;2-5

Kennedy BS, Kasl SV, Vaccarino V (2001). Repeated hospitalizations and self-rated health among the elderly: A multivariate failure time analysis. American Journal of Epidemiology, 153(3): 232–241. https://doi.org/10.1093/aje/153.3.232

Kong Y, Li D, Fan Y, Lv J (2017). Interaction pursuit in high-dimensional multi-response regression via distance correlation. The Annals of Statistics, 45(2): 897–922. https://doi.org/10.1214/16-AOS1474

LeBlanc M, Crowley J (1993). Survival trees by goodness of split. Journal of the American Statistical Association, 88(422): 457–467. https://doi.org/10.1080/01621459.1993.10476296

Liu L, Wolfe RA, Huang X (2004). Shared frailty models for recurrent events and a terminal event. Biometrics, 60(3): 747–756. https://doi.org/10.1111/j.0006-341X.2004.00225.x

Liu L, Yu Z (2008). A likelihood reformulation method in non-normal random effects models. Statistics in Medicine, 27(16): 3105–3124. https://doi.org/10.1002/sim.3153

Loh WY, Cao L, Zhou P (2019). Subgroup identification for precision medicine: A comparative review of 13 methods. WIREs Data Mining and Knowledge Discovery, 9(e1326): 1–21.

Martinez-Millana A, Hulst JM, Boon M, Witters P, Fernandez-Llatas C, Asseiceira I, et al. (2018). Optimisation of children z-score calculation based on new statistical techniques. PLoS ONE, 13(12): e0208362. https://doi.org/10.1371/journal.pone.0208362

Must A, Anderson S (2006). Body mass index in children and adolescents: Considerations for population-based applications. International Journal of Obesity, 30(4): 590–594. https://doi.org/10.1038/sj.ijo.0803300

Nelson KP, Lipsitz SR, Fitzmaurice GM, Ibrahim J, Parzen M, Strawderman R (2006). Use of the probability integral transformation to fit nonlinear mixed-effects models with nonnormal random effects. Journal of Computational and Graphical Statistics, 15(1): 39–57. https://doi.org/10.1198/106186006X96854

Pepe MS, Cai J (1993). Some graphical displays and marginal regression analyses for recurrent failure times and time dependent covariates. Journal of the American Statistical Association, 88(423): 811–820. https://doi.org/10.1080/01621459.1993.10476346

Bortoletto P, Lyman K, Camacho A, Fricchione M, Khanolkar A, Katz B (2015). Chronic granulomatous disease. The Pediatric Infectious Disease Journal, 34: 1110–1114. https://doi.org/10.1097/INF.0000000000000840

Prentice RL, Williams BJ, Peterson AV (1981). On the regression analysis of multivariate failure time data. Biometrika, 68(2): 373–379. https://doi.org/10.1093/biomet/68.2.373

R Core Team (2024). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

Ripatti S, Palmgren J (2000). Estimation of multivariate frailty models using penalized partial likelihood. Biometrics, 56(4): 1016–1022. https://doi.org/10.1111/j.0006-341X.2000.01016.x

Schoenfeld D (1982). Partial residuals for the proportional hazards regression model. Biometrika, 69(1): 239–241. https://doi.org/10.1093/biomet/69.1.239

Schwarz G (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2): 461–464. https://doi.org/10.1214/aos/1176344136

Sleight P (2000). Debate: Subgroup analyses in clinical trials: Fun to look at – but don’t believe them! Current Controlled Trials in Cardiovascular Medicine, 1(1): 25–27. https://doi.org/10.1186/CVM-1-1-025

Su X, Meneses K, McNees P, Johnson WO (2011). Interaction trees: Exploring the differential effects of an intervention programme for breast cancer survivors. Journal of the Royal Statistical Society. Series C. Applied Statistics, 60(3): 457–474. https://doi.org/10.1111/j.1467-9876.2010.00754.x

Su X, Tsai CL, Wang H, Nickerson DM, Li B (2009). Subgroup analysis via recursive partitioning. Journal of Machine Learning Research, 10(2).

Su X, Zhou T, Yan X, Fan J, Yang S (2008). Interaction trees with censored survival data. The International Journal of Biostatistics, 4(1), Article 2.

The International Chronic Granulomatous Disease Cooperative Study Group (1991). A controlled trial of interferon gamma to prevent infection in chronic granulomatous disease. The New England Journal of Medicine, 324(8): 509–516. https://doi.org/10.1056/NEJM199102213240801

Therneau TM (2024). coxme: Mixed Effects Cox Models. R package version 2.2-20.

Therneau TM, Grambsch PM (2000). Modeling Survival Data: Extending the Cox Model. Springer.

Wei LJ, Lin DY, Weissfeld L (1989). Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. Journal of the American Statistical Association, 84(408): 1065–1073. https://doi.org/10.1080/01621459.1989.10478873

WHO Multicentre Growth Reference Study Group (2006). WHO child growth standards based on length/height, weight and age. Acta Pædiatrica. Supplement, 450: 76.

Yang W, Jepson C, Xie D, Roy JA, Shou H, Hsu JY, et al. (2017). Statistical methods for recurrent event analysis in cohort studies of CKD. Clinical Journal of the American Society of Nephrology, 12(12): 2066. https://doi.org/10.2215/CJN.0000000000000302

Zeng D, Lin D (2007). Semiparametric transformation models with random effects for recurrent events. Journal of the American Statistical Association, 102(477): 167–180. https://doi.org/10.1198/016214506000001239

© 2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

interaction tree; frailty model; subgroup identification

Funding

This research is partly supported by NIH grants R21 AG084054 and UL1 TR002345.
