<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">JDS</journal-id>
<journal-title-group><journal-title>Journal of Data Science</journal-title></journal-title-group>
<issn pub-type="epub">1683-8602</issn>
<issn pub-type="ppub">1680-743X</issn>
<issn-l>1680-743X</issn-l>
<publisher>
<publisher-name>School of Statistics, Renmin University of China</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">JDS1007</article-id>
<article-id pub-id-type="doi">10.6339/21-JDS1007</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Computing in Data Science</subject></subj-group></article-categories>
<title-group>
<article-title>AICov: An Integrative Deep Learning Framework for COVID-19 Forecasting with Population Covariates</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Fox</surname><given-names>Geoffrey C.</given-names></name><xref ref-type="aff" rid="j_jds1007_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-9558-179X</contrib-id>
<name><surname>von Laszewski</surname><given-names>Gregor</given-names></name><email xlink:href="mailto:laszewski@gmail.com">laszewski@gmail.com</email><xref ref-type="aff" rid="j_jds1007_aff_001">1</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname><given-names>Fugang</given-names></name><xref ref-type="aff" rid="j_jds1007_aff_001">1</xref>
</contrib>
<contrib contrib-type="author">
<contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-3470-2345</contrib-id>
<name><surname>Pyne</surname><given-names>Saumyadipta</given-names></name><xref ref-type="aff" rid="j_jds1007_aff_002">2</xref><xref ref-type="aff" rid="j_jds1007_aff_003">3</xref>
</contrib>
<aff id="j_jds1007_aff_001"><label>1</label>Digital Science Center, <institution>Indiana University</institution>, Bloomington, Indiana, <country>USA</country></aff>
<aff id="j_jds1007_aff_002"><label>2</label>Public Health Dynamics Laboratory, and Department of Biostatistics, <institution>University of Pittsburgh</institution>, Pittsburgh, Pennsylvania, <country>USA</country></aff>
<aff id="j_jds1007_aff_003"><label>3</label><institution>Health Analytics Network</institution>, Pennsylvania, <country>USA</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author. Email: <ext-link ext-link-type="uri" xlink:href="mailto:laszewski@gmail.com">laszewski@gmail.com</ext-link>.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2021</year></pub-date><pub-date pub-type="epub"><day>22</day><month>2</month><year>2021</year></pub-date>
<volume>19</volume><issue>2</issue><fpage>293</fpage><lpage>313</lpage><supplementary-material id="S1" content-type="archive" xlink:href="jds1007_s001.zip" mimetype="application" mime-subtype="x-zip-compressed">
<caption>
<title>Supplementary Material</title>
<p>The code and paper documents used to implement AICov are contained in several repositories: 
<list>
<list-item id="j_jds1007_li_001">
<label>1.</label>
<p>The entire cloudmesh code, on which the cloud-based implementation of the AICov framework is based and which has over 70 contributors, is publicly available at <uri>https://github.com/cloudmesh</uri>. Cloudmesh contains a number of modules that can be customized depending on the user's access to cloud resources. A detailed manual about the configuration is available at <uri>https://cloudmesh.github.io/cloudmesh-manual/</uri>.</p>
</list-item>
<list-item id="j_jds1007_li_002">
<label>2.</label>
<p>The entire COVID-19 analysis leverages cloudmesh and uses Jupyter notebooks to coordinate its workflow, as shown in the architecture in Figure 2. The code and data for the results presented in this paper are located in the repository at <uri>https://github.com/cloudmesh/cloudmesh-covid</uri>.</p>
<p>The data were analyzed on a variety of supercomputing resources, including an allocation of 20 compute nodes that were used to repeat the model creation to ensure reproducible results.</p>
<p>However, the data are copyrighted, and their use in other publications must be authorized by the authors. The data gathering and analysis are a significant intellectual contribution, and we would like to avoid the data being taken before we have secured a publication.</p>
</list-item>
<list-item id="j_jds1007_li_003">
<label>3.</label>
<p>The LaTeX source of the entire paper is located in the GitHub repository <uri>https://github.com/cyberaide/paper-covid</uri>. This repository will be open sourced after acceptance for publication so as not to violate any publisher restrictions. If desired, the authors can grant access to this repository prior to publication. Please contact the corresponding author.</p>
</list-item>
</list> 
A zip file is provided with the publication for archival purposes. However, it is more convenient to use our GitHub distribution, as discussed in the supplementary section.</p>
</caption>
</supplementary-material>
<history>
<date date-type="received"><day>1</day><month>7</month><year>2020</year></date>
<date date-type="accepted"><day>23</day><month>1</month><year>2021</year></date>
</history>
<permissions><copyright-statement>2021 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.</copyright-statement><copyright-year>2021</copyright-year>
<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>The COVID-19 (COrona VIrus Disease 2019) pandemic has had profound global consequences for the health, economic, social, behavioral, and almost every other major aspect of human life. Therefore, it is of great importance to model COVID-19 and other pandemics in terms of the broader social contexts in which they take place. We present the architecture of an artificial intelligence enhanced COVID-19 analysis (in short AICov), which provides an integrative deep learning framework for COVID-19 forecasting with population covariates, some of which may serve as putative risk factors. We have integrated multiple strategies into AICov, including the ability to use deep learning strategies based on Long Short-Term Memory (LSTM) and event modeling. To demonstrate our approach, we have introduced a framework that integrates population covariates from multiple sources. Thus, AICov includes not only data on COVID-19 cases and deaths but, more importantly, the population’s socioeconomic, health, and behavioral risk factors at their specific locations. The compiled data are fed into AICov, and by integrating these data into our model we obtain improved predictions compared with a model that uses only case and death data. Because we use deep learning, our models adapt over time while learning from past data.</p>
</abstract>
<kwd-group>
<label>Keywords</label>
<kwd>Cloudmesh</kwd>
<kwd>comorbidities</kwd>
<kwd>prediction</kwd>
<kwd>risk factors</kwd>
</kwd-group>
<funding-group>
<award-group>
<funding-source xlink:href="https://doi.org/10.13039/100000001">National Science Foundation</funding-source>
<award-id>1443054</award-id>
<award-id>1720625</award-id>
<award-id>1829704</award-id>
<award-id>1835598</award-id>
<award-id>1918626</award-id>
</award-group>
<funding-statement>This work is partially supported by the National Science Foundation (NSF) through awards Cyberinfrastructure Framework for 21st Century Data Infrastructure Building Blocks (1443054), Network for Computational Nanotechnology Engineered nanoBIO Node (1720625), Cybertraining (1829704), CyberInfrastructure for Network Engineering and Science (1835598) and Global Pervasive Computational Epidemiology (1918626). </funding-statement>
</funding-group>
</article-meta>
</front>
<body/>
<back>
<ref-list id="j_jds1007_reflist_001">
<title>References</title>
<ref id="j_jds1007_ref_001">
<mixed-citation publication-type="other"> American Hospital Directory (2020). Information about hospitals from public and private data sources including medpar, opps, hospital cost reports, and other CMS files. Web Page. URL: <uri>https://www.ahd.com/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_002">
<mixed-citation publication-type="journal"> <string-name><surname>Bertozzi</surname> <given-names>A</given-names></string-name>, <string-name><surname>Franco</surname> <given-names>E</given-names></string-name>, <string-name><surname>Mohler</surname> <given-names>G</given-names></string-name>, <string-name><surname>Short</surname> <given-names>M</given-names></string-name>, <string-name><surname>Sledge</surname> <given-names>D</given-names></string-name> (<year>2020</year>). <article-title>The challenges of modeling and forecasting the spread of COVID-19</article-title>. <source>Proceedings of the National Academy of Sciences</source>, <volume>117</volume>(<issue>29</issue>): <fpage>16732</fpage>–<lpage>16738</lpage>. <uri>https://www.pnas.org/content/117/29/16732</uri>, <uri>https://www.pnas.org/content/117/29/16732.full.pdf</uri>, doi: <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1073/pnas.2006520117" xlink:type="simple">https://doi.org/10.1073/pnas.2006520117</ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_003">
<mixed-citation publication-type="other"> CDC (2020a). Behavioral risk factor surveillance system survey. Web Page. URL: <uri>https://www.cdc.gov/brfss/index.html</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_004">
<mixed-citation publication-type="other"> CDC (2020b). Forecasts of total deaths. Web Page. URL: <uri>https://www.cdc.gov/coronavirus/2019-ncov/covid-data/forecasting-us.html#modeling-groups</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_005">
<mixed-citation publication-type="other"> CDC (2020c). NCHS – National Center for Health Statistics. Web Page. URL: <uri>https://www.cdc.gov/nchs/index.htm</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_006">
<mixed-citation publication-type="other"> Centers for Disease Control and Prevention (2020a). Open data for chronic disease and health promotion data and indicators. Web Page. URL: <uri>https://chronicdata.cdc.gov/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_007">
<mixed-citation publication-type="other"> Centers for Disease Control and Prevention (2020b). Social vulnerability index. Web Page. URL: <uri>https://svi.cdc.gov/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_008">
<mixed-citation publication-type="other"> <string-name><surname>Chang</surname> <given-names>WL</given-names></string-name>, <string-name><surname>von Laszewski</surname> <given-names>G</given-names></string-name> (2019). NIST Big Data Interoperability Framework: Volume 8, Reference Architecture Interfaces, <italic>Technical report</italic>, National Institute of Standards and Technology. URL: <uri>https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.1500-9r1.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_009">
<mixed-citation publication-type="chapter"> <string-name><surname>Graves</surname> <given-names>A</given-names></string-name>, <string-name><surname>Schmidhuber</surname> <given-names>J</given-names></string-name> (<year>2009</year>). <chapter-title>Offline handwriting recognition with multidimensional recurrent neural networks</chapter-title>. In: <source>Advances in Neural Information Processing Systems</source> (<string-name><given-names>D</given-names> <surname>Koller</surname></string-name>, <string-name><given-names>D</given-names> <surname>Schuurmans</surname></string-name>, <string-name><given-names>Y</given-names> <surname>Bengio</surname></string-name>, <string-name><given-names>L</given-names> <surname>Bottou</surname></string-name>, eds.), volume <volume>21</volume>, <fpage>545</fpage>–<lpage>552</lpage>. <publisher-name>Curran Associates, Inc.</publisher-name></mixed-citation>
</ref>
<ref id="j_jds1007_ref_010">
<mixed-citation publication-type="journal"> <string-name><surname>Greff</surname> <given-names>K</given-names></string-name>, <string-name><surname>Srivastava</surname> <given-names>RK</given-names></string-name>, <string-name><surname>Koutník</surname> <given-names>J</given-names></string-name>, <string-name><surname>Steunebrink</surname> <given-names>BR</given-names></string-name>, <string-name><surname>Schmidhuber</surname> <given-names>J</given-names></string-name> (<year>2017</year>). <article-title>LSTM: A search space odyssey</article-title>. <source>IEEE Transactions on Neural Networks and Learning Systems</source>, <volume>28</volume>(<issue>10</issue>): <fpage>2222</fpage>–<lpage>2232</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_011">
<mixed-citation publication-type="other"> <string-name><surname>Hochreiter</surname> <given-names>S</given-names></string-name> (1991). Untersuchungen zu dynamischen neuronalen netzen, <italic>Technical Report Diploma thesis</italic>, Technische Univ. Munich, Institut f. Informatik.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_012">
<mixed-citation publication-type="journal"> <string-name><surname>Hochreiter</surname> <given-names>S</given-names></string-name>, <string-name><surname>Schmidhuber</surname> <given-names>J</given-names></string-name> (<year>1997</year>). <article-title>Long short-term memory</article-title>. <source>Neural Computation</source>, <volume>9</volume>(<issue>8</issue>): <fpage>1735</fpage>–<lpage>1780</lpage>. doi: <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1162/neco.1997.9.8.1735" xlink:type="simple">https://doi.org/10.1162/neco.1997.9.8.1735</ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_013">
<mixed-citation publication-type="journal"> <string-name><surname>Jewell</surname> <given-names>N</given-names></string-name>, <string-name><surname>Lewnard</surname> <given-names>J</given-names></string-name>, <string-name><surname>Jewell</surname> <given-names>B</given-names></string-name> (<year>2020</year>). <article-title>Predictive mathematical models of the COVID-19 pandemic</article-title>. <source>JAMA</source>, <volume>323</volume>(<issue>19</issue>): <fpage>1893</fpage>–<lpage>1894</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_014">
<mixed-citation publication-type="other"> Johns Hopkins Coronavirus Resource Center (2020). COVID-19 map. Web Page. URL: <uri>https://coronavirus.jhu.edu/map.html</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_015">
<mixed-citation publication-type="other"> <string-name><surname>Kadupitiya</surname> <given-names>J</given-names></string-name>, <string-name><surname>Fox</surname> <given-names>GC</given-names></string-name>, <string-name><surname>Jadhao</surname> <given-names>V</given-names></string-name> (2020). Simulating molecular dynamics with large timesteps using recurrent neural networks. arXiv preprint: <uri>https://arxiv.org/abs/2004.06493</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_016">
<mixed-citation publication-type="other"> <string-name><surname>Keras</surname></string-name> (2015). Working with RNNs. URL: <uri>https://keras.io/guides/working_with_rnns/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_017">
<mixed-citation publication-type="journal"> <string-name><surname>Maleki</surname> <given-names>M</given-names></string-name>, <string-name><surname>McLachlan</surname> <given-names>G</given-names></string-name>, <string-name><surname>Gurewitsch</surname> <given-names>R</given-names></string-name>, <string-name><surname>Aruru</surname> <given-names>M</given-names></string-name>, <string-name><surname>Pyne</surname> <given-names>S</given-names></string-name> (<year>2020</year>). <article-title>A mixture of regressions model of COVID-19 death rates and population comorbidities</article-title>. <source>Statistics and Applications</source>, <volume>18</volume>(<issue>1</issue>): <fpage>295</fpage>–<lpage>306</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_018">
<mixed-citation publication-type="other"> <string-name><surname>Marsland</surname> <given-names>R</given-names></string-name>, <string-name><surname>Mehta</surname> <given-names>P</given-names></string-name> (2020). Data-driven modeling reveals a universal dynamic underlying the COVID-19 pandemic under social distancing. arXiv preprint: <uri>https://arxiv.org/abs/2004.10666</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_019">
<mixed-citation publication-type="other"> New York Times (2020a). Coronavirus in the U.S.: Latest map and case count – The New York Times. Web Page. URL: <ext-link ext-link-type="uri" xlink:href="https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html">https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html</ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_020">
<mixed-citation publication-type="other"> New York Times (2020b). An ongoing repository of data on coronavirus cases and deaths in the U.S. GitHub. URL: <uri>https://github.com/nytimes/covid-19-data</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_021">
<mixed-citation publication-type="journal"> <string-name><surname>Petropoulos</surname> <given-names>F</given-names></string-name>, <string-name><surname>Makridakis</surname> <given-names>S</given-names></string-name> (<year>2020</year>). <article-title>Forecasting the novel coronavirus COVID-19</article-title>. <source>PLoS One</source>, <volume>15</volume>(<issue>3</issue>): <elocation-id>e0231236</elocation-id>. doi: <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1371/journal.pone.0231236" xlink:type="simple">https://doi.org/10.1371/journal.pone.0231236</ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_022">
<mixed-citation publication-type="chapter"> <string-name><surname>Pyne</surname> <given-names>S</given-names></string-name>, <string-name><surname>Vullikanti</surname> <given-names>AKS</given-names></string-name>, <string-name><surname>Marathe</surname> <given-names>MV</given-names></string-name> (<year>2015</year>). <chapter-title>Chapter 8 – Big data applications in health sciences and epidemiology</chapter-title>. In: <source>Handbook of Statistics</source> (<string-name><given-names>V</given-names> <surname>Govindaraju</surname></string-name>, <string-name><given-names>VV</given-names> <surname>Raghavan</surname></string-name>, <string-name><given-names>CR</given-names> <surname>Rao</surname></string-name>, eds.), volume <volume>33</volume>, <fpage>171</fpage>–<lpage>202</lpage>. <publisher-name>Elsevier</publisher-name>. doi: <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/B978-0-444-63492-4.00008-3" xlink:type="simple">https://doi.org/10.1016/B978-0-444-63492-4.00008-3</ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_023">
<mixed-citation publication-type="journal"> <string-name><surname>Rumelhart</surname> <given-names>DE</given-names></string-name>, <string-name><surname>Hinton</surname> <given-names>GE</given-names></string-name>, <string-name><surname>Williams</surname> <given-names>RJ</given-names></string-name> (<year>1986</year>). <article-title>Learning representations by back-propagating errors</article-title>. <source>Nature</source>, <volume>323</volume>(<issue>6088</issue>): <fpage>533</fpage>–<lpage>536</lpage>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_024">
<mixed-citation publication-type="chapter"> <string-name><surname>Schmidhuber</surname> <given-names>J</given-names></string-name>, <string-name><surname>Wierstra</surname> <given-names>D</given-names></string-name>, <string-name><surname>Gomez</surname> <given-names>FJ</given-names></string-name> (<year>2005</year>). <chapter-title>Evolino: Hybrid neuroevolution/optimal linear search for sequence learning</chapter-title>. In: <source>IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence</source> (<string-name><given-names>LP</given-names> <surname>Kaelbling</surname></string-name>, <string-name><given-names>A</given-names> <surname>Saffiotti</surname></string-name>, eds.), <conf-loc>Edinburgh, Scotland, UK</conf-loc>, <conf-date>July 30–August 5</conf-date>, <fpage>853</fpage>–<lpage>858</lpage>. <publisher-name>Professional Book Center</publisher-name>. URL: <uri>http://ijcai.org/Proceedings/05/Papers/1452.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_025">
<mixed-citation publication-type="journal"> <string-name><surname>Ting</surname> <given-names>DSW</given-names></string-name>, <string-name><surname>Carin</surname> <given-names>L</given-names></string-name>, <string-name><surname>Dzau</surname> <given-names>V</given-names></string-name>, <string-name><surname>Wong</surname> <given-names>TY</given-names></string-name> (<year>2020</year>). <article-title>Digital technology and COVID-19</article-title>. <source>Nature Medicine</source>, <volume>26</volume>(<issue>4</issue>): <fpage>459</fpage>–<lpage>461</lpage>. doi: <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/s41591-020-0824-5" xlink:type="simple">https://doi.org/10.1038/s41591-020-0824-5</ext-link>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_026">
<mixed-citation publication-type="other"> US Census Bureau (2020a). Census.gov. URL: <uri>https://www.census.gov/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_027">
<mixed-citation publication-type="other"> US Census Bureau (2020b). QuickFacts: United States. Web Page. URL: <uri>https://www.census.gov/quickfacts/fact/table/US/PST045219</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_028">
<mixed-citation publication-type="other"> <string-name><surname>von Laszewski</surname> <given-names>G</given-names></string-name> (2020). Cloudmesh manual. Web Page. URL: <uri>https://cloudmesh.github.io/cloudmesh-manual/</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_029">
<mixed-citation publication-type="other"> <string-name><surname>von Laszewski</surname> <given-names>G</given-names></string-name>, <string-name><surname>Orlowski</surname> <given-names>A</given-names></string-name>, <string-name><surname>Otten</surname> <given-names>RH</given-names></string-name>, <string-name><surname>Markowitz</surname> <given-names>R</given-names></string-name>, <string-name><surname>Gandhi</surname> <given-names>S</given-names></string-name>, <string-name><surname>Chai</surname> <given-names>A</given-names></string-name>, et al. (2020a). Using gas for speedy generation of hybrid multi-cloud auto generated AI services, <italic>Technical report</italic>, Indiana University. Submitted for publication. URL: <uri>https://github.com/laszewski/laszewski.github.io/raw/master/papers/vonLaszewski-openapi.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_030">
<mixed-citation publication-type="other"> <string-name><surname>von Laszewski</surname> <given-names>G</given-names></string-name>, et al. (2020b). Cloudmesh OpenAPI installation instructions. Web Page. URL: <uri>https://github.com/cloudmesh/cloudmesh-openapi/blob/main/README.md</uri>.</mixed-citation>
</ref>
<ref id="j_jds1007_ref_031">
<mixed-citation publication-type="other"> World Health Organization (2020). WHO coronavirus disease (COVID-19) dashboard. Web Page. URL: <uri>https://covid19.who.int/</uri>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
