AICov: An Integrative Deep Learning Framework for COVID-19 Forecasting with Population Covariates
Volume 19, Issue 2 (2021), pp. 293–313
Pub. online: 22 February 2021
Type: Computing In Data Science
1 July 2020
23 January 2021
22 February 2021
The COVID-19 (COrona VIrus Disease 2019) pandemic has had profound global consequences for health, the economy, society, behavior, and almost every other major aspect of human life. It is therefore of great importance to model COVID-19 and other pandemics in terms of the broader social contexts in which they take place. We present the architecture of an artificial intelligence enhanced COVID-19 analysis (AICov for short), which provides an integrative deep learning framework for COVID-19 forecasting with population covariates, some of which may serve as putative risk factors. We have integrated multiple strategies into AICov, including deep learning strategies based on Long Short-Term Memory (LSTM) and event modeling. To demonstrate our approach, we have introduced a framework that integrates population covariates from multiple sources. Thus, AICov includes not only data on COVID-19 cases and deaths but, more importantly, the population's socioeconomic, health, and behavioral risk factors at their specific locations. The compiled data are fed into AICov, and integrating these data into our model yields improved prediction compared with a model that uses only case and death data. Because we use deep learning, our models adapt over time as they learn from past data.
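As a rough illustration of how static population covariates might be combined with case and death time series before they reach a sequence model such as an LSTM, the sketch below builds sliding input windows in which every time step carries the location's covariates. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `build_windows`, the window length, the next-day target, and the toy numbers are all hypothetical.

```python
from typing import List, Tuple

Step = Tuple[float, float]  # (daily cases, daily deaths) for one location


def build_windows(
    series: List[Step],
    covariates: List[float],
    window: int,
) -> Tuple[List[List[List[float]]], List[Step]]:
    """Build (inputs, targets) for next-day forecasting.

    Each input is a list of `window` time steps; each time step is the
    observed (cases, deaths) pair followed by the location's static
    covariates, so the sequence model sees both at every step.
    The target is the (cases, deaths) pair of the day after the window.
    """
    inputs: List[List[List[float]]] = []
    targets: List[Step] = []
    for start in range(len(series) - window):
        steps = [
            list(series[t]) + list(covariates)
            for t in range(start, start + window)
        ]
        inputs.append(steps)
        targets.append(series[start + window])
    return inputs, targets


# Hypothetical toy data for a single location (illustrative values only).
series = [(10.0, 0.0), (12.0, 1.0), (15.0, 1.0), (19.0, 2.0), (24.0, 2.0)]
covariates = [0.18, 0.31]  # e.g. two assumed risk-factor rates

X, y = build_windows(series, covariates, window=3)
print(len(X), len(X[0]), len(X[0][0]))
```

Passing an empty covariate list to the same function gives the baseline input that uses only case and death data, which is the comparison the abstract describes.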
Supplementary Material
The code and paper documents used to implement AICov are contained in several repositories:
1. The entire cloudmesh code, on which the cloud-based implementation of the AICov framework is based and which has over 70 contributors, is publicly available at https://github.com/cloudmesh. Cloudmesh contains a number of modules that can be customized depending on the user's access to cloud resources. A detailed manual about the configuration is available at https://cloudmesh.github.io/cloudmesh-manual/.
2. The entire COVID-19 analysis leverages cloudmesh and uses Jupyter notebooks to coordinate its workflow, as discussed in the architecture in Figure 2. The code and data for the results presented in this paper are located in the repository at https://github.com/cloudmesh/cloudmesh-covid. The data were analysed on a variety of supercomputing resources, including an allocation of 20 compute nodes that were used to execute the repeated model creation to assure reproducible results. However, the data are copyrighted and must not be used for other publications without contacting the authors for authorization. The data gathering and analysis is a significant intellectual contribution, and we would like to avoid the data being taken before we have secured a publication.
3. The entire paper is located as LaTeX source in the GitHub repository https://github.com/cyberaide/paper-covid. This repository will be open sourced after acceptance for publication so as not to violate any publisher restrictions. If desired, the authors can grant access to this repository prior to publication; please contact the corresponding author.
A zip file is provided with the publication for archival purposes. However, it will be much more convenient to use our GitHub distribution as discussed in the supplementary section.