We assessed the impact of the coronavirus disease 2019 (COVID-19) pandemic on the statistical analysis of time-to-event outcomes in late-phase oncology trials. Using a simulated case study that mimics an ongoing Phase III trial during the pandemic, we evaluated the impact of COVID-19-related deaths, time off-treatment, and missed clinical visits due to the pandemic on overall survival and/or progression-free survival in terms of test size (also referred to as Type 1 error rate or alpha level), power, and hazard ratio (HR) estimates. We found that COVID-19-related deaths would affect both size and power and lead to biased HR estimates; the impact would be more severe if COVID-19-related deaths were imbalanced between the study arms. Approaches that censor COVID-19-related deaths may mitigate the impact on power and HR estimation, especially if the study data cut-off is extended to recover the events lost to censoring. The impact of COVID-19-related time off-treatment would be modest for power, and moderate for size and HR estimation. Different rules for censoring cancer progression times result in slight differences in power for the analysis of progression-free survival. The simulations provided valuable information for determining whether clinical-trial modifications should be required for ongoing trials during the COVID-19 pandemic.
Abstract: Simulation studies are important statistical tools used to investigate the performance, properties and adequacy of statistical models. The simulation of right-censored time-to-event data involves the generation of two independent survival distributions, where the first distribution represents the uncensored survival times and the second distribution represents the censoring mechanism. In this brief report we discuss how the percentage of censored data can be defined in advance. The described method was used to generate data from a Weibull distribution, but it can be adapted to any other lifetime distribution. We also present an R function for generating random samples using the proposed approach.
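The idea of fixing the censoring percentage in advance can be sketched as follows. This is a minimal illustration, not the report's R function: it assumes exponential censoring times with an unknown rate `theta`, which is tuned by bisection so that P(C < T) matches the target. The function names, the exponential censoring distribution, and the numerical settings are all assumptions introduced for the example.

```python
import math
import random


def censoring_probability(theta, shape, scale, n_grid=2000):
    """Approximate P(C < T) for T ~ Weibull(shape, scale), C ~ Exponential(rate=theta).

    With u = exp(-theta * c), the probability equals the average of the
    Weibull survival function S_T(-ln(u) / theta) over u in (0, 1),
    evaluated here with a midpoint rule.
    """
    total = 0.0
    for i in range(n_grid):
        u = (i + 0.5) / n_grid
        c = -math.log(u) / theta
        log_cum_hazard = shape * math.log(c / scale)
        if log_cum_hazard < 700.0:  # beyond this the survival term is 0 anyway
            total += math.exp(-math.exp(log_cum_hazard))
    return total / n_grid


def solve_censoring_rate(target, shape, scale, n_iter=60):
    """Find the exponential censoring rate giving the target censoring proportion."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    lo, hi = 1e-6 / scale, 1e6 / scale  # assumed search bracket for theta
    for _ in range(n_iter):
        mid = math.sqrt(lo * hi)  # bisection on the log scale
        if censoring_probability(mid, shape, scale) < target:
            lo = mid  # too little censoring: increase the rate
        else:
            hi = mid
    return math.sqrt(lo * hi)


def simulate_censored_weibull(n, shape, scale, target, seed=None):
    """Return a list of (observed_time, event) pairs; event is 1 if uncensored."""
    rng = random.Random(seed)
    theta = solve_censoring_rate(target, shape, scale)
    sample = []
    for _ in range(n):
        t = rng.weibullvariate(scale, shape)  # stdlib order: (scale, shape)
        c = rng.expovariate(theta)
        sample.append((min(t, c), int(t <= c)))
    return sample
```

For example, `simulate_censored_weibull(500, shape=1.5, scale=2.0, target=0.3, seed=1)` should produce a sample in which roughly 30% of the observations are censored, up to sampling variation. Other censoring distributions (for instance, uniform) could be substituted by changing `censoring_probability` and the random draw for `c`.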
Pub. online: 4 Aug 2022
Type: Research Article
Open Access
Journal: Journal of Data Science
Volume 18, Issue 3 (2020): Special issue: Data Science in Action in Response to the Outbreak of COVID-19, pp. 526–535
Abstract
COVID-19 is a disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that was reported to spread in people in December 2019. Understanding epidemiological features of COVID-19 is important for the ongoing global efforts to contain the virus. As a complement to the available work, in this article we analyze the Kaggle novel coronavirus dataset of 3397 patients dated from January 22, 2020 to March 29, 2020. We employ semiparametric and nonparametric survival models as well as text mining and data visualization techniques to examine the clinical manifestations and epidemiological features of COVID-19. Our analysis shows that: (i) the median incubation time is about 5 days and older people tend to have a longer incubation period; (ii) the median time for infected people to recover is about 20 days, and the recovery time is significantly associated with age but not gender; (iii) the fatality rate is higher for older infected patients than for younger patients.
Semi-parametric Cox regression and parametric methods have been used to analyze survival data of cancer; however, no study has focused on the comparison of survival models in genetic association analysis of age at onset (AAO) of cancer. The hepatocyte nuclear factor-1-beta (HNF1B) gene has been associated with risk of endometrial and prostate cancers; however, no study has focused on the effect of the HNF1B gene on the AAO of cancer. This study examined 23 single nucleotide polymorphisms (SNPs) within the HNF1B gene in the Marshfield sample with 716 cancer cases and 2,848 non-cancer controls. Cox proportional hazards models in PROC PHREG and parametric survival models (including exponential, Weibull, log-normal, log-logistic, and gamma models) in PROC LIFEREG in SAS 9.4 were used to detect the genetic association of the HNF1B gene with the AAO. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) were used to compare the Cox models and parametric survival models. Both AIC and BIC values showed that the Weibull distribution is the best model for all 23 SNPs and the gamma distribution is the second best. The top two SNPs are rs4239217 and rs7501939, with time ratio (TR) = 1.08 (p<0.0001 for the AA and AG genotypes, respectively) and 1.07 (p=0.0004 and 0.0002 for the CC and CT genotypes, respectively) based on the Weibull model. This study shows that the parametric Weibull distribution is the best model for the genetic association of AAO of cancer and provides the first evidence of several genetic variants within the HNF1B gene associated with AAO of cancer.
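The AIC/BIC comparison described above can be illustrated with a short sketch using the standard definitions AIC = 2k − 2 log L and BIC = k ln(n) − 2 log L. The log-likelihoods and parameter counts below are hypothetical placeholders, not results from the study, and using the total sample size (716 + 2,848) as n in BIC is an assumption; some analyses use the number of events instead.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: smaller is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: smaller is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical (log-likelihood, number of parameters) pairs for illustration only.
fits = {
    "exponential": (-1250.0, 2),
    "weibull": (-1235.0, 3),
    "log-normal": (-1241.0, 3),
}
n_obs = 716 + 2848  # cases plus controls, as reported in the abstract

ranked_by_aic = sorted(fits, key=lambda name: aic(*fits[name]))
ranked_by_bic = sorted(fits, key=lambda name: bic(*fits[name], n_obs))

for name in ranked_by_aic:
    ll, k = fits[name]
    print(f"{name:12s} AIC={aic(ll, k):8.1f} BIC={bic(ll, k, n_obs):8.1f}")
```

Models are ranked from best to worst by each criterion; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7.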
Survival analysis is a widely used statistical tool for comparing new interventions in the presence of hazards in follow-up studies. However, it is difficult to obtain a suitable survival rate when the hazard is high within a few days of surgery. The patients can be directly stratified into cured and non-cured strata, and mixture models are a natural choice for estimating the cure and non-cure rates. The cure rate is an important measure of the success of any new intervention. The cure rate model is illustrated by comparing surgery in liver cirrhosis patients who consented to participation with HFLPC (Human Fetal Liver Progenitor Cells) infusion against those who consented to participation alone, in a South Indian population. Surgery is the best available technique for liver cirrhosis treatment, and its success is observed through a follow-up study. In this study, the MELD (Model for End-Stage Liver Disease) score is considered the response of interest for the cured and non-cured groups, and the primary efficacy of surgery is considered the covariate of interest. The distributional assumptions of the cure rate are handled with Markov Chain Monte Carlo (MCMC) techniques. We find that the cure model with a parametric approach gives more consistent estimates than standard procedures. The risk of death due to liver transplantation in liver cirrhosis patients, including time-dependent effect terms, has also been explored. The approach helps to model patients of different age and sex in both treatment groups.
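As a rough illustration of the mixture cure idea, the population survival function is commonly written as S(t) = π + (1 − π) S_u(t), where π is the cured proportion and S_u is the survival function of the non-cured stratum. The sketch below assumes a Weibull form for S_u purely for illustration; the abstract does not state which latency distribution the study used, and the function name and parameter values are hypothetical.

```python
import math


def mixture_cure_survival(t, cure_fraction, shape, scale):
    """Population survival under a mixture cure model.

    cure_fraction is the cured proportion (pi); the non-cured stratum is
    assumed here to follow a Weibull(shape, scale) survival distribution.
    """
    if t < 0:
        raise ValueError("t must be non-negative")
    if not 0.0 <= cure_fraction <= 1.0:
        raise ValueError("cure_fraction must lie in [0, 1]")
    uncured_survival = math.exp(-((t / scale) ** shape))
    return cure_fraction + (1.0 - cure_fraction) * uncured_survival


# Hypothetical parameter values: the curve starts at 1 and plateaus at the cure fraction.
for t in (0.0, 1.0, 5.0, 50.0):
    print(t, round(mixture_cure_survival(t, cure_fraction=0.4, shape=1.2, scale=3.0), 4))
```

The plateau of the survival curve at the cure fraction is what distinguishes this model from a standard survival model, whose survival function tends to zero.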