Pseudo Partial Likelihood Method for Proportional Hazards Models when Time Origin Is Missing for Control Group with Applications to SARS-CoV-2 Seroprevalence Study
Pub. online: 7 October 2025
Type: Statistical Data Science
Open Access
Received
7 December 2024
Accepted
18 September 2025
Published
7 October 2025
Abstract
Time-to-event data without a well-defined time origin commonly arise in observational studies that collect survival endpoints retrospectively. For instance, after enrolling participants who have or have not received a specific treatment, an event status can be observed for all participants; however, the start date of treatment is observable only for the treatment group. The corresponding time origin does not exist for the control group, resulting in missing survival times. Complete-case analysis is often considered the standard approach, but it disregards information from all participants in the control group and therefore does not allow the survival distributions of the two groups to be compared. To address this challenge, we propose a novel semiparametric proportional hazards model that treats the missing time origins as nuisance parameters. We approximate the risk sets by cumulative normal distributions to handle these nuisance parameters and develop estimation and inference procedures for the proposed estimator. We study the asymptotic properties of the estimator and conduct simulation studies to evaluate its finite-sample performance. Analysis of data from a recent SARS-CoV-2 seroprevalence study illustrates the applicability of our methods. The proposed methods are implemented in the R package coxphm.
Supplementary material
Sections A and B of the Supplementary Material provide the proofs of Theorems 1–2 and additional simulation results, respectively. The SARS-CoV-2 serological prevalence data and corresponding R code used for analysis are also included in the Supplementary Material. The coxphm package (Chung, 2025), which implements the methods developed in this article, is publicly available on CRAN.