Journal of Data Science

Comparing Estimators of Discriminative Performance of Time-to-Event Models
Ying Jin, Andrew Leroux

https://doi.org/10.6339/25-JDS1163
Pub. online: 18 February 2025      Type: Statistical Data Science      Open Access

Received: 5 June 2024
Accepted: 1 January 2025
Published: 18 February 2025

Abstract

Predicting the timing and occurrence of events is a major focus of data science applications, especially in biomedical research. The performance of models for these outcomes, often referred to as time-to-event or survival outcomes, is frequently summarized using measures of discrimination, in particular time-dependent AUC and concordance. Many estimators of these quantities have been proposed, and they can be broadly categorized as either semi-parametric or non-parametric. In this paper, we review the mathematical construction of the two classes of estimators and compare their behavior. Importantly, we identify a previously unknown feature of the class of semi-parametric estimators that can result in vastly overoptimistic out-of-sample estimation of discriminative performance in common applied tasks. Although these semi-parametric estimators are popular in practice, the phenomenon we identify suggests that this class of estimators may be inappropriate for model assessment and selection based on out-of-sample evaluation criteria (e.g., cross-validation), because under such criteria the semi-parametric estimators are biased in favor of overfit models. Non-parametric estimators do not exhibit this behavior, but they are highly variable for local discrimination. We propose to address this high variability through penalized regression spline smoothing. The behavior of various estimators of time-dependent AUC and concordance is illustrated via a simulation study with two different mechanisms under which semi-parametric estimators produce overoptimistic out-of-sample estimates. The estimators are further compared in a case study using data from the National Health and Nutrition Examination Survey (NHANES) 2011–2014.
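To make the notion of concordance concrete, the following is a minimal sketch of Harrell's C-index, a standard non-parametric concordance estimator for right-censored data. It is an illustration only and is not the authors' code (the supplementary scripts are written in R). The function name `harrell_c` and the toy data are assumptions introduced here, and the sketch handles tied event times simply by skipping those pairs.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's C (hypothetical helper): the share of comparable pairs in
    which the subject with the shorter observed time has the higher risk
    score. Tied risk scores count as half-concordant."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # shorter time is an observed event rather than a censoring time.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): the third subject is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.8, 0.1]
print(harrell_c(times, events, risk_scores))  # 0.8 (4 of 5 comparable pairs)
```

Because the estimate depends only on the ranking of observed times and risk scores, it makes no use of a fitted hazard model, which is the sense in which such estimators are called non-parametric.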

Supplementary material

The supplementary material includes additional information that is relevant but not included in the manuscript, including figures, mathematical derivations, and the data file used for the data application section. It also includes a zipped file containing code scripts to reproduce the results presented in the paper. Here is a brief summary of its contents:

  • outlier_exp.R: generates data and produces Figure 1 in the Introduction.
  • Simulation: code scripts used to implement the simulation study.
      – Sim_overfit.R: the first scenario, model overfit (Section 3.2.1).
      – Sim_contamination.R: the second scenario, covariate misalignment (Section 3.2.2).
      – helpers.R: functions to calculate the discussed estimators.
      – trueAUC.R: calculates the true values of incident/dynamic AUC.
      – SimFigs.R: produces Figures 2 and 3.
  • DataAppl: scripts to reproduce the data application section.
      – data_appl.R: reproduces the data application results.
      – helpers_appl.R: functions to calculate the discussed estimators.
      – DataApplFigs.R: produces Figure 4.
  • SuppFigs.R: produces the figures included in the supplement.



Copyright
2025 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
C-index; concordance; proportional hazard model; survival prediction; time-dependent AUC


Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
