Journal of Data Science


Dynamic Classification of Plasmodium vivax Malaria Recurrence: An Application of Classifying Unknown Cause of Failure in Competing Risks
Volume 20, Issue 1 (2022), pp. 51–78
Yutong Liu, Feng-Chang Lin, Jessica T. Lin, et al. (4 authors)

https://doi.org/10.6339/21-JDS1026
Pub. online: 9 December 2021
Type: Data Science In Action
Open Access

Received: 8 June 2021
Accepted: 2 October 2021
Published: 9 December 2021

Abstract

A standard competing risks set-up requires both time to event and cause of failure to be fully observable for all subjects. However, in application, the cause of failure may not always be observable, thus impeding the risk assessment. In some extreme cases, none of the causes of failure is observable. In the case of a recurrent episode of Plasmodium vivax malaria following treatment, the patient may have suffered a relapse from a previous infection or acquired a new infection from a mosquito bite. In this case, the time to relapse cannot be modeled when a competing risk, a new infection, is present. The efficacy of a treatment for preventing relapse from a previous infection may be underestimated when the true cause of infection cannot be classified. In this paper, we developed a novel method for classifying the latent cause of failure under a competing risks set-up, which uses not only time to event information but also transition likelihoods between covariates at the baseline and at the time of event occurrence. Our classifier shows superior performance under various scenarios in simulation experiments. The method was applied to Plasmodium vivax infection data to classify recurrent infections of malaria.
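The classification idea in the abstract can be illustrated with a toy sketch. This is not the authors' estimator: it assumes the cause-specific hazards and the covariate transition probabilities are already known, and it uses a single binary covariate. Under those assumptions, Bayes' rule gives the posterior probability of each cause as proportional to the cause-specific hazard at the event time multiplied by the likelihood of the observed covariate transition. All names and numbers below (`posterior_cause_probs`, `HAZARDS`, `TRANSITIONS`) are hypothetical and chosen only for illustration.

```python
import math


def posterior_cause_probs(t, x0, x1, hazards, transitions):
    """Posterior probability of each cause for one event observed at time t.

    hazards:     cause -> function giving the cause-specific hazard at time t
    transitions: cause -> matrix P[x0][x1], the probability of moving from
                 baseline covariate state x0 to state x1 at the event
    """
    # Unnormalized weight for each cause: hazard times transition likelihood.
    weights = {
        cause: hazards[cause](t) * transitions[cause][x0][x1]
        for cause in hazards
    }
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("all causes have zero weight for this observation")
    return {cause: w / total for cause, w in weights.items()}


# Hypothetical inputs (assumptions, not values from the paper).
HAZARDS = {
    "relapse": lambda t: 0.03 * math.exp(-0.02 * t),  # risk fades over time
    "new_infection": lambda t: 0.01,                  # constant exposure
}
TRANSITIONS = {
    "relapse": [[0.9, 0.1], [0.2, 0.8]],        # covariate tends to persist
    "new_infection": [[0.5, 0.5], [0.5, 0.5]],  # unrelated to baseline
}


def classify(t, x0, x1):
    """Return the cause with the largest posterior probability."""
    probs = posterior_cause_probs(t, x0, x1, HAZARDS, TRANSITIONS)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    for t, x0, x1 in [(10, 1, 1), (10, 1, 0), (120, 1, 0)]:
        probs = posterior_cause_probs(t, x0, x1, HAZARDS, TRANSITIONS)
        print(t, x0, x1, classify(t, x0, x1), round(probs["relapse"], 3))
```

In this sketch, an early recurrence whose covariate matches the baseline is assigned to relapse, while a late recurrence with a changed covariate is assigned to a new infection. In practice the hazards and transition probabilities would have to be estimated from data in which the cause is never observed, which is the problem the paper addresses.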

Supplementary material

The Supplementary Materials provide additional simulation results for scenarios in which the hazard models are misspecified, a comparison of our classifiers with those proposed in Lin et al. (2020) for binary covariates, and results on parameter estimation performance in low-dimensional settings. Additional details of the P. vivax malaria study, including the data and code, are also provided.

References

 
Baird JK (2013). Evidence and implications of mortality associated with acute Plasmodium vivax malaria. Clinical Microbiology Reviews, 26(1): 36–57.
 
Bureau A, Shiboski S, Hughes JP (2003). Applications of continuous time hidden Markov models to the study of misclassified disease outcomes. Statistics in Medicine, 22(3): 441–462.
 
Chu CS, White NJ (2016). Management of relapsing Plasmodium vivax malaria. Expert Review of Anti-Infective Therapy, 14(10): 885–900.
 
Dini S, Douglas NM, Poespoprodjo JR, Kenangalem E, Sugiarto P, Plumb ID, et al. (2020). The risk of morbidity and mortality following recurrent malaria in Papua, Indonesia: a retrospective cohort study. BMC Medicine, 18(1): 1–12.
 
Dinse GE (1982). Nonparametric estimation for partially-complete time and type of failure data. Biometrics, 38(2): 417–431.
 
Effraimidis G, Dahl CM (2014). Nonparametric estimation of cumulative incidence functions for competing risks data with missing cause of failure. Statistics & Probability Letters, 89: 1–7.
 
Fan J, Lv J (2011). Nonconcave penalized likelihood with NP-dimensionality. IEEE Transactions on Information Theory, 57(8): 5467–5484.
 
Ferreira MU, de Sousa TN, Rangel GW, Johansen IC, Corder RM, Ladeia-Andrade S, et al. (2020). Monitoring Plasmodium vivax resistance to antimalarials: persisting challenges and future directions. International Journal for Parasitology: Drugs and Drug Resistance, 15: 9.
 
Friedrich LR, Popovici J, Kim S, Dysoley L, Zimmerman PA, Menard D, et al. (2016). Complexity of infection and genetic diversity in Cambodian Plasmodium vivax. PLoS Neglected Tropical Diseases, 10(3): e0004526.
 
Goetghebeur E, Ryan L (1995). Analysis of competing risks survival data when some failure types are missing. Biometrika, 82(4): 821–833.
 
Gouskova NA, Lin FC, Fine JP (2017). Nonparametric analysis of competing risks data with event category missing at random. Biometrics, 73(1): 104–113.
 
Hathaway NJ, Parobek CM, Juliano JJ, Bailey JA (2018). SeekDeep: single-base resolution de novo clustering for amplicon deep sequencing. Nucleic Acids Research, 46(4): e21.
 
Howes RE, Battle KE, Mendis KN, Smith DL, Cibulskis RE, Baird JK, et al. (2016). Global epidemiology of Plasmodium vivax. The American Journal of Tropical Medicine and Hygiene, 95(6): 15–34.
 
Juraska M, Gilbert PB (2016). Mark-specific hazard ratio model with missing multivariate marks. Lifetime Data Analysis, 22(4): 606–625.
 
Kalbfleisch JD, Prentice RL (2002). The statistical analysis of failure time data, volume 360. John Wiley & Sons.
 
Lin DY, Wei LJ, Ying Z (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika, 80(3): 557–572.
 
Lin FC, Li Q, Lin JT (2020). Relapse or reinfection: classification of malaria infection using transition likelihoods. Biometrics, 76(4): 1351–1363.
 
Lin JT, Hathaway NJ, Saunders DL, Lon C, Balasubramanian S, Kharabora O, et al. (2015). Using amplicon deep sequencing to detect genetic signatures of Plasmodium vivax relapse. The Journal of Infectious Diseases, 212(6): 999–1008.
 
Lin JT, Patel JC, Kharabora O, Sattabongkot J, Muth S, Ubalee R, et al. (2013). Plasmodium vivax isolates from Cambodia and Thailand show high genetic complexity and distinct patterns of P. vivax multidrug resistance gene 1 (pvmdr1) polymorphisms. The American Journal of Tropical Medicine and Hygiene, 88(6): 1116–1123.
 
Lon C, Manning JE, Vanachayangkul P, So M, Sea D, Se Y, et al. (2014). Efficacy of two versus three-day regimens of dihydroartemisinin-piperaquine for uncomplicated malaria in military personnel in northern Cambodia: an open-label randomized trial. PLoS ONE, 9(3): e93138.
 
Lu K, Tsiatis AA (2001). Multiple imputation methods for estimating regression coefficients in the competing risks model with missing cause of failure. Biometrics, 57(4): 1191–1197.
 
McCullagh P, Nelder J (1989). Generalized linear models. Chapman and Hall.
 
Neafsey DE, Galinsky K, Jiang RH, Young L, Sykes SM, Saif S, et al. (2012). The malaria parasite Plasmodium vivax exhibits greater genetic diversity than Plasmodium falciparum. Nature Genetics, 44(9): 1046–1050.
 
Parikh N, Boyd S (2014). Proximal algorithms. Foundations and Trends in Optimization, 1(3): 127–239.
 
Parobek CM, Bailey JA, Hathaway NJ, Socheat D, Rogers WO, Juliano JJ (2014). Differing patterns of selection and geospatial genetic diversity within two leading Plasmodium vivax candidate vaccine antigens. PLoS Neglected Tropical Diseases, 8(4): e2796.
 
Qin J (1998). Inferences for case-control and semiparametric two-sample density ratio models. Biometrika, 85(3): 619–630.
 
Robinson LJ, Wampfler R, Betuela I, Karl S, White MT, Suen CSLW, et al. (2015). Strategies for understanding and reducing the Plasmodium vivax and Plasmodium ovale hypnozoite reservoir in Papua New Guinean children: a randomised placebo-controlled trial and mathematical model. PLoS Medicine, 12(10): e1001891.
 
Rubin DB (1976). Inference and missing data. Biometrika, 63(3): 581–592.
 
Schwarz G (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2): 461–464.
 
Sun Y, Gilbert PB (2012). Estimation of stratified mark-specific proportional hazards models with missing marks. Scandinavian Journal of Statistics, 39(1): 34–52.
 
Taylor AR, Watson JA, Chu CS, Puaprasert K, Duanguppama J, Day NP, et al. (2019). Resolving the cause of recurrent Plasmodium vivax malaria probabilistically. Nature Communications, 10(1): 1–11.
 
Tibshirani R (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B, 58(1): 267–288.
 
WHO (2019). World malaria report 2019. World Health Organization.
 
Zou H, Hastie T (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B, 67(2): 301–320.


Copyright
2022 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
malaria relapse; Markov transition model; quadratic approximation; two-stage estimation

Funding
Dr. Feng-Chang Lin’s research was partially supported by NIH grant UL1TR002489. Dr. Quefeng Li’s research was partially supported by NIH grant R01AG073259.

