Methods, Challenges, and Practical Issues of COVID-19 Projection: A Data Science Perspective
Volume 19, Issue 2 (2021), pp. 219–242
Pub. online: 27 April 2021
Type: Philosophies Of Data Science
Received
21 April 2021
Accepted
22 April 2021
Published
27 April 2021
Abstract
The coronavirus disease 2019 (COVID-19) pandemic caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has placed epidemic modeling at the center of attention of public policymaking. Predicting the severity and speed of transmission of COVID-19 is crucial to resource management and developing strategies to deal with this epidemic. Based on the available data from current and previous outbreaks, many efforts have been made to develop epidemiological models, including statistical models, computer simulations, mathematical representations of the virus and its impacts, and many more. Despite the usefulness of these models, modeling and forecasting the spread of COVID-19 remains a challenge. In this article, we give an overview of the unique features and issues of COVID-19 data and how they impact epidemic modeling and projection. In addition, we illustrate how various models could be connected to each other. Moreover, we provide new data science perspectives on the challenges of COVID-19 forecasting, from data collection, curation, and validation to the limitations of models, as well as the uncertainty of the forecast. Finally, we discuss some data science practices that are crucial to more robust and accurate epidemic forecasting.